The flashcards below were created by user
anastasia611
on FreezingBlue Flashcards.

Basic Forms of Logic
Modus Tollens (Denying the Consequent): if P, then Q; not Q; therefore, not P
Modus Ponens (Affirming the Antecedent): if P, then Q; P; therefore, Q
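A minimal sketch for checking argument forms by brute-force truth table. The helper names (implies, is_valid) are made up for illustration. It treats "if P, then Q" as the material conditional and shows that modus ponens and modus tollens are valid, while affirming the consequent and denying the antecedent are not.

```python
from itertools import product

def implies(p, q):
    """'if P, then Q' is false only when P is true and Q is false."""
    return (not p) or q

def is_valid(premises, conclusion):
    """Valid means: no truth assignment makes every premise true
    while the conclusion is false."""
    for p, q in product([True, False], repeat=2):
        if all(prem(p, q) for prem in premises) and not conclusion(p, q):
            return False
    return True

# if P then Q; P; therefore Q
modus_ponens = is_valid([implies, lambda p, q: p], lambda p, q: q)
# if P then Q; not Q; therefore not P
modus_tollens = is_valid([implies, lambda p, q: not q], lambda p, q: not p)
# if P then Q; Q; therefore P (a fallacy)
affirming_consequent = is_valid([implies, lambda p, q: q], lambda p, q: p)
# if P then Q; not P; therefore not Q (a fallacy)
denying_antecedent = is_valid([implies, lambda p, q: not p], lambda p, q: not q)

print(modus_ponens, modus_tollens, affirming_consequent, denying_antecedent)
```

The two fallacies fail on the same case: P false and Q true.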

Determinism
 weak: all events have antecedent causes
 probabilistic/stochastic: if all relevant antecedents are known, then the distribution of future events can be known
 strong: if all relevant antecedents are known, then the future event can be known in advance

phenomenological assertion
 descriptive
 what happens when
 when people's stated beliefs and behaviors are in conflict, they are more likely to change their stated beliefs than their behavior

Theoretical assertion
 explanatory
 why what happens when
 conflict between stated beliefs and behaviors causes an unpleasant internal condition (cognitive dissonance), and changing the belief is usually the easiest way to reduce the conflict

data
 sets of values for variables
 needs to be objective: can be verified by others
 needs to be replicable: if someone else collects the same kind of data from similar people under same conditions, they should get same results (statistically)

variable types (3)
 continuous: numerical, can take any value between two extremes ex: response time
 discrete: numerical, can take only certain values ex: # of siblings
 qualitative: categorical, values differ in a non mathematical way ex: race

Univariate Data is summarized by...
 Center: mean
 Spread: standard deviation
 Shape: name
*bivariate data summarized as two sets of univariate plus some measure of their association (almost always the correlation between them)

Bivariate Data
 paired observations; can be any two measures
 can also be two measures of the same thing at different times
 1. do the descriptive stats on each variable alone
 2. calculate a measure of the relationship between the two variables

Correlations
 the simplest and most popular measure of the association between two variables
 can be calculated between any two variables
 max +1.00, min -1.00; both considered perfect
 no correlation = 0
 correlation coefficient (r): provides a measure of the linear relationship, from -1.00 to +1.00
 coefficient of determination (r^2): perfect = 1.00; provides a measure of how much of the variance in one variable is explained by another variable
 1. correlations are unaffected by linear transformations
 2. correlations have no units
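A small sketch of how r and r^2 could be computed by hand, with made-up data. The function name pearson_r is an assumption for illustration. It also checks the point about linear transformations by rescaling x (as if converting Celsius to Fahrenheit).

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation between two paired lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)
r_squared = r ** 2  # share of variance in y explained by x

# Linear transformation of x: new units, same correlation
x_rescaled = [32 + 1.8 * v for v in x]
r_rescaled = pearson_r(x_rescaled, y)

print(round(r, 2), round(r_squared, 2), round(r_rescaled, 2))
```

One caveat: a linear transformation with a negative slope flips the sign of r but leaves its size unchanged.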

Pre Processing Steps
 1. Nothing.. the raw data are what's needed
 2. Condensed score: combining a large number of different measures to get one value (usually done for convergence)
 3. Summary Score: reducing a large number of identical measures to a single value (usually done for noise reduction, averages are much less unreliable)

Descriptive Stats
 summarize a given set of data
 the set of data is usually a sample, not the entire population
 because these are summaries, they can't be wrong

Unreliability
 the standard deviation across scores
 if you measure the same thing many times, identical conditions, you often don't get the same value every time
 good estimate of the amount of noise in the data
 the maximum possible correlation depends on the unreliability of the measure
 unreliability is BAD; to reduce it, use summary scores instead of raw scores

Reliability
 the correlation between scores
 measuring many people twice gets you test/retest pairs
 calculating the correlation between these two is the reliability of the measure
 test/retest reliability must be at least +.70

operational definition
a statement that maps one or more empirical measures on to one or more theoretical constructs

construct validity
 the extent to which the measure provides an exhaustive and selective estimate of the target theoretical construct
 exhaustive: measure should cover all aspects of target construct (the whole truth)
 selective: measure only covers things that are a part of the target (nothing but the truth)

Convergent Validity
 the extent to which the measure is correlated with other measures of the same or similar underlying constructs
 exhaustive part of construct validity
 at least +.70

Discriminant Validity
 the extent to which the measure is NOT correlated with measures of different and dissimilar underlying constructs
 selective part of construct validity
 within .20 of zero

Threats and Counters to Construct Validity
 lack of exhaustiveness: does not cover the whole thing, so add or expand items
 lack of selectivity: measuring other stuff too, so delete or refine items
 systematic error
 sometimes reactivity, evaluation apprehension, demand chars with reactivity

Kinds of Validity (4)
 1. Content Validity: whether the measure makes sense, exclusive and exhaustive measure
 2. Face Validity: whether the measure appears to measure the construct (to a subject), avoid this, subjects may change their behavior
 3. Criterion Validity: whether the measure correlates with known consequences of the construct
 4. Construct Validity: the extent to which the measure provides an exhaustive and selective estimate of the target theoretical construct

Internal Validity
the extent to which a significant (IV-DV) relationship is causal and not spurious
 significant IV-DV relationship= the data from the different conditions are different and it isn't just due to chance
 is causal= the data from different conditions are different because of the planned difference between conditions
 and not spurious= as opposed to the data from different conditions being different for some other reason
 Threats
 confounds
 experimenter / observer bias
 demand chars with good subject behavior

Experiment
 has at least one manipulated variable acting as the potential cause of interest
 has a labile measured variable acting as the potential effect

IV
a manipulated variable being treated as the potential cause of interest

DV
a labile measured variable being treated as the effect of interest

Manipulated Variable
 something that is under the complete control of the experimenter
 3 types
 1. Situational: features of the environment (example lighting)
 2. Task: elements of what subjects are asked to do (easy vs hard tasks)
 3. Instructional: elements of how subjects are asked to do the task (example use imagery vs rote memory)

Measured Variable
 Something that is determined by or built in to the subject
 2 types
 1. stable: built in, permanent, like gender or handedness; difficult to impossible to manipulate, so referred to as subject variables
 2. Labile: situational, temporary, like mood or response time; relatively easy to manipulate; these are called data variables

extraneous variable
a potential cause of the effect that is not of current interest

Confound
 an extraneous variable that covaries with the potential cause of interest
 in order for it to be a confound, the EV must change in parallel with the IV

Confounding
when at least one extraneous variable changes in parallel with the IV

Experimental Control
the ability of experimenters to hold everything constant

Control Experiment
an ancillary experiment designed to test whether a potential confound in the main experiment could have been responsible for the results observed in the main experiment

Experimenter Bias
 when beliefs and/or expectancies of the experimenter influence the results
 acts as confound and reduces internal validity
 to reduce..
 1.reduce the involvement of human experimenters by using computers
 2.standardize the behavior of human experimenters by using strict protocols like interaction scripts
 3.remove the human experimenter's knowledge by making the experimenter unaware of predictions or double blind
 Checking for Exptr Bias..
 run an experiment that follows advice
 run a null manipulation experiment which is a type of control experiment

Participant Bias
 when beliefs of participant concerning how they should behave influence the results
 subtypes are demand characteristics and evaluation apprehension
 to reduce..
 1. "good subject" type: reduce the demand characteristics or bury them in a load of filler
 2.evaluation apprehension type: make the experiment less "social" and/or convince the subjects that the data are anonymous

Demand Characteristic
 any aspect of the experiment that indicates the purpose of the experiment
 acts like a confound

Evaluation Apprehension
 an internal state that causes subjects to alter their behavior so that they will be viewed more positively by other people
 reduces construct and external validity

Statistical Conclusion Validity
 the extent to which inferences about the sampling population, based on a sample, are accurate
 Type I Error: concluding there is a difference in the sampling population when there isn't (false alarm)
 risk: the probability of making this error (α), conventionally .05
 Type II Error: concluding there is no difference in the sampling population when there is one (miss)
 power: the probability of not making this error (1 - β), should be .80 or higher
 Threats
 uncontrolled variability
 random error
 violating assumptions hurts all stats

What causes Type I and Type II errors?
 Type I:
 bad luck
 one or more assumptions were not true
 Type II:
 bad luck
 one or more assumptions were not true
 noisy data and/or too small a sample

Within Designs
 do both conditions
 interest in a small effect
 very brief experiment
 heterogeneous subject population
 downsides: increased demand characteristics and variety of possible carry over effects

Between Subjects Design
 1/2 do one condition, 1/2 do other condition
 non repeatable measure
 long lasting manipulation
 need vanilla control condition
 fear of demand characteristics
 need to create equivalent groups by pseudo random assignment and covariates/ matching
 downsides: requires many more subjects

Random Assignment
 to produce equivalent groups in b/w design
 the means, across groups, of all EVs are equal
 1.true RA: each subject is independently and randomly assigned to a group
 2.blocked randomization: subjects are randomly assigned to one of the currently smallest groups
 3.pseudorandomization: the order of assignment to groups is set in advance and applied as the subjects arrive (effective with large group)
 alternatives/ additional procedures
 1.matching: measure everybody in advance on the variables you are worried about, then put the top two scorers in different conditions, then the next two, and so on; makes the groups the same, but it is a big pain
 2.Verification: include measures of potential confounds, discard entire dataset if RA fails
 3.inclusion of covariates (USE THIS ONE): include measures of the potential confounds and use these measures to remove their effects during the analysis; you never throw out data and need only one meeting, although each covariate costs 1 df
 *use true RA with covariates*
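A rough sketch of the three assignment schemes, assuming subjects arrive one at a time. All function names are hypothetical, and a seeded random.Random is passed in so the result is reproducible.

```python
import random

def true_random_assignment(subjects, groups, rng):
    """Each subject is independently and randomly assigned to a group."""
    return {s: rng.choice(groups) for s in subjects}

def blocked_randomization(subjects, groups, rng):
    """Each subject is randomly assigned to one of the currently smallest groups."""
    counts = {g: 0 for g in groups}
    assignment = {}
    for s in subjects:
        smallest = min(counts.values())
        candidates = [g for g in groups if counts[g] == smallest]
        chosen = rng.choice(candidates)
        assignment[s] = chosen
        counts[chosen] += 1
    return assignment

def pseudorandomization(subjects, groups, rng):
    """The order of assignment is fixed in advance, then applied as subjects arrive."""
    schedule = [groups[i % len(groups)] for i in range(len(subjects))]
    rng.shuffle(schedule)  # schedule is set before any subject shows up
    return dict(zip(subjects, schedule))

rng = random.Random(0)
subjects = list(range(1, 11))          # 10 hypothetical subject IDs
groups = ["treatment", "control"]

true_ra = true_random_assignment(subjects, groups, rng)
blocked = blocked_randomization(subjects, groups, rng)
pseudo = pseudorandomization(subjects, groups, rng)

def group_sizes(assignment):
    return [list(assignment.values()).count(g) for g in groups]

print(group_sizes(true_ra), group_sizes(blocked), group_sizes(pseudo))
```

True RA can produce unequal group sizes by chance; the other two schemes keep the sizes balanced.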

Counterbalancing
 to equalize all sequence and order effects in within design
 1.complete counterbalancing: all possible orders are used
 2.random partial counterbalancing: each subject gets the conditions in a pseudo random order
 3. latin square: k different orders are created, with each condition appearing once in each position
 4.balanced latin square: latin square where each condition is also followed by each other condition exactly once, ex: a precedes b once, c once, d once,...
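A minimal sketch of how these orders could be generated. The function names are assumptions. The balanced version uses a standard interleaving construction that works with k orders only when the number of conditions is even.

```python
from itertools import permutations

def complete_counterbalancing(conditions):
    """All possible orders (k! of them)."""
    return [list(p) for p in permutations(conditions)]

def latin_square(conditions):
    """k orders; each condition appears once in each position."""
    k = len(conditions)
    return [[conditions[(row + col) % k] for col in range(k)] for row in range(k)]

def balanced_latin_square(conditions):
    """k orders; each condition is also directly followed by each other
    condition exactly once. This construction assumes an even k."""
    k = len(conditions)
    if k % 2 != 0:
        raise ValueError("this construction needs an even number of conditions")
    first = [0]
    low, high = 1, k - 1
    while len(first) < k:          # builds 0, 1, k-1, 2, k-2, ...
        first.append(low)
        low += 1
        if len(first) < k:
            first.append(high)
            high -= 1
    return [[conditions[(idx + shift) % k] for idx in first] for shift in range(k)]

conds = ["a", "b", "c", "d"]
all_orders = complete_counterbalancing(conds)
square = latin_square(conds)
balanced = balanced_latin_square(conds)

# Collect every "X immediately followed by Y" pair in the balanced square
followed_by = [(row[i], row[i + 1]) for row in balanced for i in range(len(row) - 1)]

for row in balanced:
    print(row)
```

Complete counterbalancing grows as k!, which is why the Latin square variants are attractive once there are more than three or four conditions.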

Control Hierarchy
 1. exert control, Hold it constant: don't allow the potential confound to vary at all
 2. pre equalize, Equalize on Average: make the potential confound equal on average across conditions
 3. post equalize, Measure and Remove: remove the effects of the potential confound after the fact
 4. run Control Experiment: test the potential confound in a separate experiment

Inferential Stats
 go beyond the actual data in hand and make a best guess about the population from which the sample was taken, can be wrong
 1.Point Estimation (for the mean): get a best guess for what you are interested in, and get an estimate of how wrong you might be (the standard error)
 2.Hypothesis Testing: convert the two values from step one into a single yes/no answer to a question
 Paired Samples t-test for within design: with a within subjects design you have pairs of data; is there a difference between the two conditions?
 Independent Samples t-test for between design: with a between subjects design you have separate samples for condition 1 and condition 2, so the probability question becomes: what is the probability of getting two sample means that are this different if we assume both samples came from one distribution? If it is less than 5%, then conclude the population means are not the same
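A sketch of the two t statistics using only the standard library. The data are made up, the function names are assumptions, and the independent-samples version assumes equal variances (pooled). Turning t into a probability still needs a t table or a stats package, which is left out here.

```python
from math import sqrt
from statistics import mean, stdev, variance

def paired_t(cond1, cond2):
    """Within design: t for the mean of the pairwise differences."""
    diffs = [b - a for a, b in zip(cond1, cond2)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

def independent_t(group1, group2):
    """Between design: t for the difference between two sample means,
    using a pooled variance (assumes equal population variances)."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group2) - mean(group1)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# Hypothetical scores for condition 1 and condition 2
cond1 = [10, 12, 14, 16]
cond2 = [11, 14, 15, 18]

t_paired = paired_t(cond1, cond2)             # treats the values as pairs
t_independent = independent_t(cond1, cond2)   # treats them as separate samples

print(round(t_paired, 3), round(t_independent, 3))
```

With the same numbers, the paired t is much larger because stable differences between subjects drop out of the difference scores. This is one reason within designs suit small effects.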

Standard error of the mean
 a measure of how far any given sample mean might be from the mean of the sampled population
 SD / sqrt(N)
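A quick sketch of the calculation with made-up scores. The function name standard_error is an assumption. Note that statistics.stdev gives the sample standard deviation (it divides by N - 1).

```python
from math import sqrt
from statistics import mean, stdev

def standard_error(sample):
    """Standard error of the mean: sample SD divided by the square root of N."""
    return stdev(sample) / sqrt(len(sample))

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample

point_estimate = mean(scores)        # best guess for the population mean
sem = standard_error(scores)         # how far off that guess might be

print(point_estimate, round(sem, 3))
```

Because N sits under a square root, quadrupling the sample size only halves the standard error.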

Hypothesis Testing
 there are four possible outcomes from an experiment
 what is true for the sampled pop?
 1.the means for the two conditions are the same
 2.the means for the two conditions are different
 what did we conclude, based on sample?
 3.the means for the two conditions are the same
 4.the means for the two conditions are different
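The four outcomes come from crossing what is true with what was concluded. A tiny sketch follows; the function name is an assumption, and the labels for the two correct outcomes are borrowed from signal detection terminology rather than taken from the cards.

```python
def classify_outcome(truly_different, concluded_different):
    """Map (truth, conclusion) onto one of the four possible outcomes."""
    if truly_different and concluded_different:
        return "correct: hit"
    if not truly_different and concluded_different:
        return "Type I error (false alarm)"
    if truly_different and not concluded_different:
        return "Type II error (miss)"
    return "correct: correct rejection"

for truth in (False, True):
    for conclusion in (False, True):
        print(truth, conclusion, "->", classify_outcome(truth, conclusion))
```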

Correlational Study
 two measured variables
 one called predictor (cause)
 one called predicted (effect)

Quasi Experiment
 one stable measured variable (SV) that is treated as IV
 one labile measured variable treated as DV

third variable problem
 when Z causes the correlation between x and y
 spurious
 include Z as a covariate and recalculate the correlation (partial correlation)
 if partial is as strong as before, then not the cause
 if partial is smaller than original but not zero, then Z is part of the cause, but not the only cause
 if partial is now zero, then Z was the entire cause of x and y's correlation

Third Variable
a typically unmeasured variable that could be the cause of both the measured variables in a correlational study

Spurious
a significant relationship that is not causal in either direction

Crosslagged
 the correlation between one variable at one time and another variable at another time
 used to determine the more likely direction of causation
 if r^2(x1,y2) > r^2(x2,y1), then x is probably the cause

Partial with respect to Z
 the correlation between two variables after the effects of a third variable Z have been removed
 used to test and rule out a third variable explanation
 if pr_XY(Z) = r_XY, then Z is not a cause of both X and Y
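A sketch of the standard first-order partial correlation formula, computed from the three pairwise correlations. The function name and the example values are assumptions for illustration.

```python
from math import sqrt

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y after removing the linear effect of Z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Case 1: Z is unrelated to X and Y, so the partial equals the original r
unchanged = partial_correlation(r_xy=0.50, r_xz=0.0, r_yz=0.0)

# Case 2: Z accounts for the whole X-Y correlation (0.7 * 0.7 = 0.49)
explained_away = partial_correlation(r_xy=0.49, r_xz=0.7, r_yz=0.7)

# Case 3: Z accounts for only part of it
reduced = partial_correlation(r_xy=0.60, r_xz=0.5, r_yz=0.5)

print(round(unchanged, 2), round(explained_away, 2), round(reduced, 2))
```

Case 1 matches "Z is not the cause", case 2 matches "Z was the entire cause", and case 3 matches "Z is not the only cause".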

External Validity
 the extent to which the results from an experiment will generalize to other situations
 to minimize the need for external validity...
 1. do things to increase external validity like parallel random assignment and counter balancing
 2. reduce the need for EV by making the studied situation more similar to the target
 person and context specificity threaten external validity
 threats
 unique/atypical sample
 lack of mundane realism

context specificity
when the results from an experiment or study are unique to the situation

person specificity
when the results from an experiment or study are unique to the subjects

convenience sampling
when only easily recruited subjects are used

Probability sampling
 when each person in the population has a definable probability of being sampled
 2 types
 1.simple random sampling
 2.stratified random sampling

simple random sampling
 when all members of the population have a definable probability of being sampled, but no attempt is made to match the group sizes
 subtypes (use standard)
 1.standard: sample 500 people
 2.systematic: sample every tenth person on list
 3. bernoulli: each person has a 10% chance
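A minimal sketch of the three subtypes on a made-up population list. Function names are assumptions, and the random start in the systematic version is one common way to do it rather than something the card specifies.

```python
import random

def standard_sample(population, n, rng):
    """Standard: draw exactly n people without replacement."""
    return rng.sample(population, n)

def systematic_sample(population, step, rng):
    """Systematic: take every step-th person on the list, from a random start."""
    start = rng.randrange(step)
    return population[start::step]

def bernoulli_sample(population, p, rng):
    """Bernoulli: each person is included independently with probability p."""
    return [person for person in population if rng.random() < p]

rng = random.Random(1)
population = list(range(1000))   # hypothetical list of 1000 person IDs

standard = standard_sample(population, 100, rng)
systematic = systematic_sample(population, 10, rng)
bernoulli = bernoulli_sample(population, 0.10, rng)

print(len(standard), len(systematic), len(bernoulli))
```

The standard and systematic samples have a fixed size here; the Bernoulli sample size varies from run to run around 10% of the population.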

stratified random sampling
 the sizes of the groups in the population are taken into account
 subtypes
 proportional: force the groups %'s in the sample to match the %'s of the groups in the population
 important when samples are small
 nonproportional: quota sampling, force the group %'s, in the sample to be equal to each other
 only important when some groups in the population are very small (some stats require a minimum number of observations in every cell, or assume equal error across groups), or when the accessible population doesn't match the target population
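A rough sketch of proportional versus nonproportional (quota) stratified sampling. The function name, the group_of callback, and the population are all made up for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(population, group_of, n, rng, proportional=True):
    """Sample within each group separately.
    proportional=True: group shares in the sample match the population.
    proportional=False: every group gets the same quota."""
    strata = defaultdict(list)
    for person in population:
        strata[group_of(person)].append(person)
    sample = []
    for members in strata.values():
        if proportional:
            quota = round(n * len(members) / len(population))  # rounding may shift the total slightly
        else:
            quota = n // len(strata)
        quota = min(quota, len(members))  # cannot sample more people than the group has
        sample.extend(rng.sample(members, quota))
    return sample

rng = random.Random(2)
# Hypothetical population: 900 people in group "A", 100 in group "B"
population = [(i, "A") for i in range(900)] + [(i, "B") for i in range(900, 1000)]
group_of = lambda person: person[1]

prop = stratified_sample(population, group_of, 100, rng, proportional=True)
quota = stratified_sample(population, group_of, 100, rng, proportional=False)

count = lambda sample, g: sum(1 for person in sample if person[1] == g)
print(count(prop, "A"), count(prop, "B"), count(quota, "A"), count(quota, "B"))
```

In the proportional sample the small group contributes only 10 people, which is where the minimum-observations-per-cell concern comes from.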

To choose a sampling method...
 ask how important it is to have a sample that accurately represents the target population
 if not very: convenience
 if sort of: simple random sample
 if very: stratified random sampling
if matching population is important but population is huge then use cluster sampling: when people are conveniently pre grouped via an irrelevant variable

survey or questionnaire
a structured set of items designed to measure attitudes, beliefs, values, or behavioral tendencies

survey types
 1. face to face interview: often only weakly structured; + can go where you want topic wise; - highly susceptible to reactivity and exptr bias
 2. face to face survey: often paired with convenience sampling; + very fast; - limited to simple yes/no answers
 3. phone survey: often paired with simple random sampling; + very fast; - limited to simple yes/no answers
 4. written questionnaire: controlled setting vs take home; + can use more complicated item types (scales) and less reactivity; - less experimental realism, can suffer from biased attrition (when the probability of a given subject completing a survey depends on what their response is)
 5. electronic questionnaire: + lower reactivity; - lower realism

Item Types
 1. open ended questions: + less demand; - less control
 2. closed questions: + easily codified; - often require fillers to avoid demand
 3. Likert scales: sets of 7 point agree/disagree items; + usually the most reliable measure of attitude; - some subjects object to the lack of an "it depends" option
 4. Guttman scales: a set of ascending questions (how far will the subject go?), stop asking when they say no; + adaptive; - assumption of order may not always be accurate
 5. Thurstone scales: checklists where subjects indicate all that apply, each item pre rated for positivity; + often the best convergent and discriminant validity; - much more work
 6. Semantic differentials: pairs of opposites, subjects indicate a position between the extremes; + works well for overlapping constructs; - requires complicated pre processing

Naturalistic Observation
 studying behavior in everyday environments without getting involved
 key threat is reactivity, then observer bias

participant observation
 studying behavior from within the target group
 key threat is standard exptr bias, then observer bias
 not often possible because, without consent, observation can only occur in reasonable places

observer bias
 when the beliefs or expectancies of the observers influence what is recorded
 intercoder reliability must be .90+
 to reduce...
 use multiple observers
 prevent observer overload through checklists, time sampling, and event sampling

ex post facto quasi experiment
when you take only one sample and then divide the subjects into the groups after the fact

planned quasi experiment
when you take separate samples for each of the groups

longitudinal study
 aging
 when you follow the same subjects over time
 threat is time frame effects (zeitgeist)

cross sectional study
 aging
 when you take separate samples for each age group at the same time
 threat is cohort

solution to both aging studies
run a hybrid study and verify same results either way

Paradigm
a standard method for studying a particular issue

paradigm measure
a summary score requiring data from at least two conditions that is used to estimate a single theoretical construct

Stroop Original Version
 target construct is automaticity (of reading)
 task is to name the ink color
 paradigm manipulation is incongruence (of a to be ignored word with the correct vocal response for the trial)
 incongruent trial: the word "green" shown in a different ink color
 neutral trial: the word "house"
 paradigm measure is the increase in response time from neutral to incongruent trials

Stroop Complete version
 Target construct is the failure of attentional selectivity
 task is to name the ink color
 paradigm manipulation is congruence (of a to be ignored word with the correct vocal response for the trial)
 congruent trial: the word "black" shown in black ink
 incongruent trial: the word "blue" shown in a different ink color
 paradigm measure is the increase in response time from congruent to incongruent trials

The two Stroops
 original: incongruent vs neutral (ie never congruent) automaticity of reading
 complete: incongruent vs congruent (chance congruence) failure of selective attention
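A small sketch of the two paradigm measures as difference scores. The response times (in ms) are invented and the function name is an assumption; the point is only that each measure needs data from two conditions.

```python
from statistics import mean

def paradigm_measure(rt_by_condition, baseline, comparison):
    """Increase in mean response time from the baseline to the comparison condition."""
    return mean(rt_by_condition[comparison]) - mean(rt_by_condition[baseline])

# Hypothetical response times in milliseconds for one subject
rts = {
    "congruent":   [600, 620, 640],
    "neutral":     [640, 650, 660],
    "incongruent": [720, 750, 780],
}

# Original version: incongruent vs neutral
original_effect = paradigm_measure(rts, baseline="neutral", comparison="incongruent")

# Complete version: incongruent vs congruent
complete_effect = paradigm_measure(rts, baseline="congruent", comparison="incongruent")

print(original_effect, complete_effect)
```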

Exogenous Cuing
 target construct is capture of visual attention
 task is to respond to large white square
 paradigm manipulation is spatial validity of a prior cue (whether the place holder that flashes is in the same location as the target or elsewhere)
 paradigm measure is the advantage for valid over invalid trials

Pretest/Posttest with one group
 measure the subjects before and after treatment and test if they get better
 problems: before vs after is within subjects, and within subject manipulations need to be counterbalanced, but that cannot be done here

Post test only with two groups
 one group gets the treatment, the other does not and test if the treatment group ends up better
 problems: treatment vs control is between subjects and there could have been a failure of random assignment

Pretest/Posttest with two groups Analysis Option 1
 change score logic + level 4 confound control
 compare the change scores between the groups using an independent samples t test
 a significant advantage for the treatment group is evidence that the therapy is better than nothing

Pretest/Posttest with Two groups Analysis Option 2
 outcome logic + level 3 confound control
 compare the post scores between the groups but include the pre scores as a covariate
 a significant advantage for the treatment group is evidence that the therapy is better than nothing
 problem is that it doesn't tell if the treatment group got better or the control group got worse

Pretest/Posttest with two groups Analysis Option 3
 change score logic + level 3 confound control
 compare the change scores between the groups but also include the pre scores as a covariate
 a significant advantage for the treatment group is evidence that the therapy is better than nothing
 best option!

Non Equivalent Group Designs
 many subjects run at once
 subjects are already in groups (which can't be shuffled), so whole groups are assigned to conditions, instead of people
 problem 1: groups are not equal to start with; to fix, add covariates
 problem 2: different events can happen to the groups after the covariates are measured; to fix, add a control measure to assess these events

Won't inclusion of covariate allow us to remove any differences, regardless of source?
 covariates only remove main effects of confounds, they do not remove interactions
 covariates can't do anything about confounds that occur after you take the covariate measure

Don't these issues apply to all between subject designs and not just non equivalent group designs?
 yes but they are much more likely when group membership is inherent to the subject, instead of randomly assigned
 you cannot stop these group based confounds from happening, but you can get a measure of them
 this is like running a control experiment, but it's done in the same subjects, at the same time, so it's a control measure instead

Matched groups design
 pick subjects from the range where the groups overlap
 run the experiment on everyone, but only use the data from the matched subjects
 choose two subgroups that are equal, so you start with equal groups

Regression to the Mean
 rooted in true score theory observed score = T + e
 the expected value of e is zero and separate measures have independent errors
 a particularly high observed score
 probably includes a large positive e
 is expected to be followed by a lower score, since the odds of getting two large positive e's in a row are small
 the amount of regression that is expected to occur depends entirely on the unreliability of the measure
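A simulation sketch of regression to the mean under true score theory (observed = T + e). All numbers are arbitrary choices: 2000 simulated people, true scores and errors both with SD 10, and a fixed seed so the run is reproducible.

```python
import random
from statistics import mean

rng = random.Random(42)
n = 2000

# observed score = T + e, with independent errors on each measurement
true_scores = [rng.gauss(100, 10) for _ in range(n)]
test1 = [t + rng.gauss(0, 10) for t in true_scores]
test2 = [t + rng.gauss(0, 10) for t in true_scores]

# Select the top 10% on the first test
cutoff = sorted(test1)[-200]
top = [i for i in range(n) if test1[i] >= cutoff]

first_mean = mean(test1[i] for i in top)    # inflated by large positive e
retest_mean = mean(test2[i] for i in top)   # fresh e, expected value zero
overall_mean = mean(test1)

print(round(first_mean, 1), round(retest_mean, 1), round(overall_mean, 1))
```

The top scorers stay above average on retest but fall back toward the overall mean. Shrinking the error SD (a more reliable measure) shrinks the drop.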

How do you reduce unreliability so regression is also reduced?
 use multiple measures
 use a measure with very high test retest reliability

Interrupted Time Series Design
 allows you to remove the general effect of time
 add a non equivalent control group; the data from them will provide a measure of the effect of widespread events like 9/11

Ultimate time series design
 start with basic interrupted time series design to get an estimate of general trend across time
 add a staggered second group to get a second measure of the effect and get measure of effect of widespread events
 add a control measure to both groups to get a measure of the effect of local events in both groups

