# STATS 121

The flashcards below were created by user kccall93 on FreezingBlue Flashcards.

Bar graph: a graphical representation of categorical data. The names of the categories are listed on the x axis, and a bar is placed over each category name with height equal to the frequency (or percentage) in that category.

Bias: a condition that occurs when the design of a study systematically favors certain outcomes.

Blocking: the grouping of individuals according to some characteristic, like rats in the same litter or plots of land at the same location. The random allocation is carried out separately within each group.

Boxplot: a plot of data based on the five number summary. A line is drawn from the minimum observation to Q1; a box is drawn from Q1 to Q3 with a vertical line at the median; and a line is drawn from Q3 to the maximum observation. Good for side-by-side comparisons.

Categorical variable: a variable that can be classified into groups or categories such as gender, religion, zip code, etc. Typically, words are used to describe an individual.

Comparative study: a study where the explanatory variable has two active treatments rather than an active treatment versus a control. The purpose of the study is to determine which treatment works best rather than whether a treatment works.

Completely randomized design: an experimental design where all individuals participating in the experiment are assigned at random to the treatments.

Confounded variable: a variable whose effect on the response variable cannot be separated from the effect of the explanatory variable on the response variable. (Note: usually confounded variables are lurking variables, but only a few lurking variables are also confounded.)

Confounding: a situation where the effect of one variable on the response variable cannot be separated from the effect of another variable on the response variable.

Control: an "inactive" treatment where no experimental condition is applied to the individuals, in order to determine whether the active treatment works.
Randomizing together with a control enables the researcher to manage lurking variables when there is not a comparison group. Note: a control is not necessary for a valid experiment as long as two or more comparison treatments are used.

Convenience sample: a sample where the researcher contacts those subjects who are readily available and does not use any random selection. The results are almost surely biased.

Distribution: a list or a graph that shows the possible values of a variable together with the frequency of each value.

Dotplot: a one-dimensional plot of a quantitative data set where each value in the data set is represented by a dot above its corresponding location on the x axis.

Double blind: neither the subject nor the doctor, nurse or whoever is diagnosing the results knows which treatment the subject received.

Experiment: a study where a treatment is deliberately imposed on each individual in the study before responses are measured, in order to observe responses to the treatment. A valid experiment must have 1) control or comparison, 2) randomization and 3) replication.

Explanatory variable: a variable that may or may not explain the outcomes (responses) of a study. It is described using a phrase that describes all possible treatments. Note: an observational study can have an explanatory variable, but a valid experiment always has an explanatory variable.

Five number summary: minimum, Q1, median, Q3, maximum; preferred when data are very skewed or have outliers.

Histogram: a graphical display of a quantitative data set. Data are separated into intervals of equal width, and a bar is drawn over each interval with height equal to the frequency (or percentage) of data in that interval. Frequencies (or percentages) are given on the y axis (hence, a histogram gives a distribution). Histograms are described by shape, center and spread. Used for large data sets.
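As an illustration of the five number summary card, here is a minimal Python sketch. The function name `five_number_summary` is hypothetical, and the quartile convention is an assumption: Q1 is taken as the median of the observations below the overall median's position and Q3 as the median of those above it. Other textbooks and software use slightly different quartile rules, so results may differ by a small amount.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers.

    Assumed convention: when the number of observations is odd, the
    middle observation is left out of both halves before the quartiles
    are computed.
    """
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")

    half = n // 2
    lower = data[:half]                       # observations below the median position
    upper = data[half:] if n % 2 == 0 else data[half + 1:]  # observations above it

    return (data[0], median(lower), median(data), median(upper), data[-1])


if __name__ == "__main__":
    # Hypothetical data, chosen only to show the call.
    print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8]))
```

The five values returned are the same ones a boxplot draws: the whisker ends, the box edges, and the line at the median.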
Individual: the basic unit (or subject) of the experiment upon which a treatment is applied.

Interquartile range (IQR): a measure of variability recommended for skewed data or data with outliers; computed as IQR = Q3 - Q1.

Lack of realism: a weakness in experiments where the setting of the experiment does not realistically duplicate the conditions we really want to study.

Left skewed: a density curve where the left side of the distribution extends in a long tail (Mean < Median).

Lurking variable: a variable that has an important effect on the relationship among the variables in a study but is not taken into account.

Mean: a measure of the center of the data; it's the point that "balances" the data.

Median: a measure of the center of the data; it's the point such that half the numbers are smaller and the other half are larger (the midpoint of the ordered data set).

Multi-stage sample: sampling is conducted in stages. For a two-stage sample, the individuals are grouped according to some characteristic; groups are first randomly selected, and then individuals are randomly selected from those selected groups. (In a stratified sample, individuals are randomly selected from every group.) For example, states could be randomly selected; then school districts within the selected states; then schools within the selected school districts; and finally students would be randomly selected from the selected schools in the selected school districts in the selected states.
That would be a four-stage sample.

Non-response bias: bias resulting when individuals selected to be in a survey either cannot be contacted or refuse to answer survey questions.

Normal distribution: a bell-shaped symmetric density curve used to model many data sets that have a symmetric mound or bell shape.

Observational study: a study that merely observes conditions of individuals in a population and records information; the population is disturbed as little as possible. (Note: treatments are not imposed on units.)

Outlier: an observation that falls outside the overall pattern of the data set. Can be detected by checking whether observation < Q1 - 1.5 × IQR or observation > Q3 + 1.5 × IQR.

Pie chart: a graphical display of categorical data using a "pie"; each category is represented as a slice where the size of the slice is proportional to the percentage of data in that category. Not recommended by statisticians.

Placebo effect: the response of patients to any treatment even though it has no physical effect.

Population: the entire group of individuals about whom we desire to collect information.

Probability sample: a sample selected using a random device where each individual in the population has a chance (it doesn't have to be equal) of being selected. Probability samples are necessary for making inferences. Examples include SRS, stratified and multistage samples.

Q1: a location measure of the data such that one-fourth, or 25%, of the data is smaller than it.

Q3: a location measure of the data such that three-fourths, or 75%, of the data is smaller than it.

Quantitative variable: a variable with numerical values such as height or weight.

Random number table: a table consisting of the digits 0-9 whose order cannot be determined but in which, in the long run, each digit occurs 10% of the time.

Randomization: a method of assigning individuals in an experiment to treatment groups using some random device that eliminates bias and gives each unit the same probability of being assigned to any treatment group.
Randomization "balances" the treatment groups, thus averaging out lurking and extraneous variables. It allows us to use the laws of probability to make inferences. Randomization as a condition can be SRS or RAT (random allocation to treatments).

Range: the maximum observation minus the minimum observation.

Replication: having more than one individual in each treatment group. Replication is necessary for measuring variability. Also, the greater the replication, the more precise the results.

Response bias: bias resulting from individuals in a sample lying or giving an incorrect response because they do not have knowledge about the question or can't recall. Response bias could also result from the wording of the question or from interviewers influencing the response, either intentionally or unintentionally.

Response variable: a variable that gives the results (may not be a number) of the outcome of a study; measured on an individual.

Right skewed distribution: a density curve where the right side of the distribution extends in a long tail.

Sample: a subset of individuals in the population; the group of individuals about which we actually collect information.

Simple random sample: a sample of size n selected from the population in such a way that each possible sample of size n has an equal chance of being selected.

Standard deviation: a measure of the "average" or typical deviation of the observations about the mean; measures variability of data about the mean.

Standard Normal curve: a Normal distribution with a mean of zero and a standard deviation of one. Probabilities are given in Table A for values of the standard Normal variable.

Statistically significant: results of a study that differ too much from what we expected to attribute to chance variation alone.

Stemplot: a graphical representation of a quantitative data set. Leading values of each data point are presented as stems, and second digits are given as leaves.
Used for small data sets.

Stratified sampling: a sampling scheme where the population has been divided into strata according to some characteristic and a simple random sample is selected from within each stratum.

Symmetric distribution: a density curve where the right half is a mirror image of the left half of the distribution (Mean = Median).

Undercoverage bias: bias that occurs because the list of the population from which the sample is drawn is incomplete, meaning that some people in the population (for example, the homeless) are not listed for selection.

Voluntary response sample: a method of sample selection that consists of people choosing themselves by responding to a general appeal.

z-score: a measure of the number of standard deviations a value or observation is from the mean; a standardized value.

Author: kccall93; ID: 132115; Card set: STATS 121; Description: Test 1; Updated: 2012-02-01T02:42:40Z
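The z-score and outlier cards above describe short calculations that can be sketched in Python. The function names `z_score` and `is_outlier` are hypothetical, and the example numbers are made up for illustration. The quartiles are passed in as arguments because the cards do not fix a single rule for computing them.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


def is_outlier(x, q1, q3):
    """Apply the 1.5 * IQR rule from the Outlier card.

    An observation is flagged when it falls below Q1 - 1.5 * IQR
    or above Q3 + 1.5 * IQR, where IQR = Q3 - Q1.
    """
    iqr = q3 - q1
    return x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr


if __name__ == "__main__":
    # Hypothetical values: a height of 70 when the mean is 64 and the SD is 3.
    print(z_score(70, 64, 3))
    # Hypothetical quartiles Q1 = 4 and Q3 = 8, so the fences are -2 and 14.
    print(is_outlier(20, 4, 8))
```

A z-score computed this way can be looked up in a standard Normal table only when the data are reasonably modeled by a Normal distribution.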