2012-10-08

stats 140 midterm (vocab)
  1. statistics
    science of data
  2. individuals/cases
    objects being described by set of data
  3. variable
    any characteristic of an individual; it can take different values for different individuals
  4. quantitative variable
  5. qualitative variable
  6. values
    particular things variables take on
  7. observational study
    observes individuals and measures variables of interest but does not attempt to influence the responses. used to describe a group or situation
  8. response
    variable that measures an outcome or result of a study
  9. sampling
    gaining information about the whole by examining only a part
  10. sample surveys
    survey a group of individuals by studying only some of its members, chosen to represent the larger group
  11. population
    entire group of individuals about which we want information (the group we want to study)
  12. sample
    part of population from which we actually get the information and use it to draw conclusions about the whole
  13. census
    sample survey that attempts to include entire population in sample
  14. experiments
    deliberately imposing some treatment on individuals to observe their responses. can give evidence of cause and effect
  15. biased
    describes a statistical study whose design systematically favors certain outcomes
  16. convenience sampling
    selection of individuals who are easiest to reach
  17. voluntary response sampling
    sample that chooses itself by responding to a general appeal (e.g. a call-in poll)
  18. simple random sample
    choose a sample of n individuals from the population in such a way that every set of n individuals has an equal chance of being the sample actually selected
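    A minimal sketch of drawing an SRS in Python. The population list and the helper name simple_random_sample are made up for illustration; random.sample draws without replacement, so every set of n individuals is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n individuals chosen so that every set of n is equally likely."""
    rng = random.Random(seed)          # seed only makes the draw reproducible
    return rng.sample(population, n)   # sampling without replacement

# hypothetical population of 30 labeled individuals
students = [f"student_{i}" for i in range(1, 31)]
srs = simple_random_sample(students, 5, seed=1)
print(srs)
```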
  19. table of random digits
    • long string of digits with two properties...
    • 1. each entry is equally likely to be any of the digits 0-9
    • 2. entries are independent of each other
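    A rough sketch of building a short table of random digits with the two properties above. The seed and the grouping into blocks of 5 are arbitrary choices, not something from the notes.

```python
import random

rng = random.Random(140)  # arbitrary seed so the same table can be reproduced

# each call is independent of the others and equally likely to give 0..9
digits = [rng.randrange(10) for _ in range(40)]

# print in groups of 5 for readability
line = "".join(str(d) for d in digits)
groups = " ".join(line[i:i + 5] for i in range(0, len(line), 5))
print(groups)
```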
  20. parameter
    number that describes the population
  21. statistic
    number that describes a sample
  22. parameter is to __________ as statistic is to ______.
    population; sample
  23. variability
    describes how spread out values are
  24. SRS
    simple random sample
  25. p
    population proportion (a parameter): fraction of the population that has the characteristic
  26. p-hat
  27. margin of error
    how close the sample statistic is expected to be to the population parameter, e.g. plus or minus 2 percentage points
  28. 95% confident
    95% of all possible samples give a result within the margin of error of the truth (the confidence level is the % of all possible samples that satisfy the margin of error)
  29. confidence statements
    • fact about what happens in all possible samples and is used to say how trustworthy result of sample is
    • 1. margin of error
    • 2. level of confidence
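    A small simulation sketch of what a confidence statement claims: take many samples and count how often the sample proportion lands within the margin of error of the true proportion. The true p = 0.5, n = 1000, and the quick approximation 1/sqrt(n) for a 95% margin of error are all assumptions made for the example, not values from these notes.

```python
import random

rng = random.Random(0)   # fixed seed so the run is reproducible
p = 0.5                  # assumed true population proportion (parameter)
n = 1000                 # size of each sample
margin = 1 / n ** 0.5    # quick approximation for 95% confidence (assumption)
reps = 1000              # number of simulated samples

hits = 0
for _ in range(reps):
    # p_hat is the sample proportion (statistic) for one simulated sample
    p_hat = sum(rng.random() < p for _ in range(n)) / n
    if abs(p_hat - p) <= margin:
        hits += 1

coverage = hits / reps
print(f"margin of error: +/- {margin:.3f}")
print(f"share of samples within the margin: {coverage:.1%}")
```

    The share printed should come out near 95%, which is the "level of confidence" part of the statement.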
  30. sampling errors
    • errors caused by the act of taking a sample
    • -undercoverage
    • -random sampling error
    • -biased sampling methods
  31. random sampling error
    deviation between the sample statistic and the population parameter caused by chance in selecting a random sample
  32. nonsampling error
    • errors not related to the act of selecting a sample from the population
    • -processing errors
    • -response error
    • -nonresponse
  33. sampling frame
    list of individuals from which the sample is actually drawn (ideally every individual in the population)
  34. undercoverage
    occurs when some groups in the population are left out of the process of choosing the sample
  35. processing errors
    mistakes in mechanical tasks like arithmetic or entering responses into a computer
  36. response error
    when a subject gives an incorrect response (lie, guess, bad memory)
  37. nonresponse
    failure to obtain data from an individual selected for a sample (can't be contacted or refuses to cooperate)
  38. stratified random sample
    • 1. divide the sampling frame into distinct groups of individuals, called strata
    • 2. take a separate SRS in each stratum and combine these to make the complete sample
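    The two steps above can be sketched in Python. The frame of (class year, id) pairs, the helper name stratified_sample, and the choice of 3 individuals per stratum are made up for illustration.

```python
import random

def stratified_sample(frame, stratum_of, n_per_stratum, seed=None):
    """Split the frame into strata, then take a separate SRS in each one."""
    rng = random.Random(seed)
    strata = {}
    for individual in frame:
        strata.setdefault(stratum_of(individual), []).append(individual)
    sample = []
    for name in sorted(strata):
        # separate SRS inside this stratum
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample

# hypothetical sampling frame: 20 freshmen and 10 seniors
frame = [("fresh", i) for i in range(20)] + [("senior", i) for i in range(10)]
sample = stratified_sample(frame, lambda ind: ind[0], 3, seed=2)
print(sample)
```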
  39. probability sample
    sample chosen by chance
  40. response variable
    variable that measures an outcome or result of a study (dependent)
  41. explanatory variable
    variable that we think explains or causes change to response variable (independent)
  42. subjects
    individuals studied in an experiment
  43. treatment
    specific experimental condition applied to subjects
  44. lurking variable
    variable that has an important effect on the relationship among the variables in a study but isn't one of the explanatory variables studied
  45. confounded
    when the effects of 2 variables on a response variable cannot be distinguished from each other
  46. clinical trials
    experiment that studies effectiveness of medical treatments on actual patients
  47. placebo
    dummy treatment with no active ingredients
  48. placebo effect
    response to a dummy treatment
  49. double-blind
    neither the subjects nor the people recording the results know which treatment each subject received
  50. randomized comparative experiment
    experiment that compares two or more treatments, with subjects assigned to them at random so the groups are similar before treatment; allows cause-and-effect conclusions
  51. control group
    group used as a baseline for comparison; can be a placebo group or a group given no treatment at all
  52. control
    control the effects of lurking variables on the response, most simply by comparing 2 or more treatments
  53. randomize
    use impersonal chance to assign subjects to treatments
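    A minimal sketch of randomizing subjects to treatments. The subject numbers, the treatment names, and the helper name randomize are hypothetical; shuffling and then dealing subjects out in turn gives groups of (nearly) equal size.

```python
import random

def randomize(subjects, treatments, seed=None):
    """Assign subjects to treatments by chance, in near-equal group sizes."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)                  # impersonal chance decides the order
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

subjects = list(range(1, 21))              # 20 hypothetical subjects
groups = randomize(subjects, ["drug", "placebo"], seed=3)
print(groups)
```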
  54. statistically significant
    observed effect of a size that would rarely occur by chance
  55. comparative
    good studies compare groups, even when they are observational studies
  56. matching
    creating a control group for comparison by choosing individuals similar to those in the treatment group
  57. nonadherers
    subjects who participate but do not follow the experimental treatment
  58. dropouts
    subjects who begin an experiment that continues over an extended period of time but do not complete it
  59. generalizability
    whether results for the subjects studied hold accurately for the whole population
  60. completely randomized
    experimental design in which all the experimental subjects are allocated at random among all the treatments
  61. matched pair design
    compares 2 treatments using pairs of subjects that are as closely matched as possible
  62. block design
    random assignment of subjects to treatments is carried out separately within each block
  63. block
    group of experimental subjects known before the experiment to be similar in some way that is expected to affect the response to the treatments
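    A rough sketch of a block design: form the blocks first, then randomize separately inside each block. The subjects, the blocking variable (F/M), the treatment labels, and the helper name block_design are made up for the example.

```python
import random

def block_design(subjects, block_of, treatments, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_of(subject), []).append(subject)
    assignment = {}
    for name in sorted(blocks):
        members = blocks[name][:]
        rng.shuffle(members)               # randomization stays inside the block
        for i, subject in enumerate(members):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

# hypothetical subjects: 6 in block "F" and 6 in block "M"
subjects = [("F", i) for i in range(6)] + [("M", i) for i in range(6)]
assignment = block_design(subjects, lambda s: s[0], ["A", "B"], seed=4)
for subject in subjects:
    print(subject, assignment[subject])
```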
  64. measure
    to measure a property of a person or thing is to assign a number that represents the property
  65. instrument
    what we use to make a measurement
  66. units
    used to record the measurement
  67. variable
    the result of a measurement is a numerical variable
  68. valid
    a measure of a property is valid if it is relevant or appropriate as a representation of that property
  69. predictive validity
    can be used to predict success on tasks that are related to the property measured
  70. bias in measurement
    systematically tends to overstate or understate the true value of the measured property
  71. random error in measurement
    repeated measurements on the same individual give different results
  72. reliable
    if random error is small
  73. average in measurement
    the average of several repeated measurements of the same individual is more reliable than a single measurement
  74. distribution
    tells us what values a variable takes and how often it takes them
  75. frequency table
    • raw data 
    • values || frequency
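    A quick sketch of turning raw data into a frequency table. The raw values are invented for the example; collections.Counter does the tallying.

```python
from collections import Counter

raw = ["A", "B", "A", "C", "B", "A"]   # made-up raw data
freq = Counter(raw)                    # value -> how often it occurs

print("value || frequency")
for value in sorted(freq):
    print(f"{value} || {freq[value]}")
```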
  76. roundoff errors
    rounded entries don't quite add up to the total, which is rounded separately
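    A tiny illustration of a roundoff error, using made-up counts: three equal groups are each 33.3% after rounding, so the rounded percents add to 99.9 instead of 100.

```python
counts = [1, 1, 1]                 # made-up counts for three equal groups
total = sum(counts)

# round each percent to one decimal place, as a table would
percents = [round(100 * c / total, 1) for c in counts]
sum_of_rounded = round(sum(percents), 1)

print(percents)
print(sum_of_rounded)              # not quite 100 because each entry was rounded
```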
  77. pie chart
    show how a whole is divided into parts and forces us to see parts that make a whole
  78. bar graph
    compares the sizes of categories using the heights of bars; unlike a pie chart, can compare quantities that are not parts of a whole
  79. categorical variable
    places an individual into one of several groups or categories
  80. quantitative variable
    takes numerical values for which arithmetic operations like adding and averaging make sense
  81. pictogram
    bar graph in which pictures replace the bars; can mislead because the pictures are not proportional to the values
  82. line-graph
    displays change over time by plotting each value of the variable against the time it was measured
  83. histogram
    graph of the distribution of a quantitative variable; the bars touch
  84. center
    midpoint of distribution
  85. spread
    variability of the data (don't count outliers)
  86. shape
    peaks (unimodal, bimodal, multimodal), symmetric or skewed
  87. right skew (positively skewed)
    when the tail goes to the right
  88. left skew (negatively skewed)
    when the tail goes to the left
  89. stemplot
    stem is on the left and leaves are on the right
  90. median
    midpoint of the distribution: the number positioned so that half the observations are smaller and half are larger
  91. quartiles Q1 and Q3
    Q1 is the median of the observations below the overall median and Q3 is the median of those above it; with the median they divide the observations into quarters
  92. five number summary
    min, Q1, median, Q3, max
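    A sketch of computing the five number summary. It assumes the common textbook rule that Q1 and Q3 are the medians of the lower and upper halves (leaving out the overall median when the count is odd); other quartile rules exist and give slightly different answers. The scores are made up and the helper needs at least 2 observations.

```python
def median(sorted_values):
    """Median of a list that is already sorted."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using median-of-halves quartiles."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]           # observations below the median position
    upper = data[(n + 1) // 2 :]     # observations above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

scores = [5, 7, 8, 10, 12, 15, 18, 21, 30]   # made-up observations
summary = five_number_summary(scores)
print(summary)   # (5, 7.5, 12, 19.5, 30)
```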
  93. boxplot
    graph of the five number summary
  94. Mean (x-bar)
    average of set of observations
  95. mode
    most frequent number
  96. standard deviation (s)
    measures roughly the average distance of the observations from the mean
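    A short sketch of the mean, mode, and standard deviation using Python's statistics module. The observations are made up. statistics.stdev is the sample version, which divides the sum of squared deviations by n - 1 before taking the square root; that choice is assumed here since s describes a sample.

```python
import statistics

obs = [2, 4, 4, 4, 5, 5, 7, 9]       # made-up observations

x_bar = statistics.mean(obs)         # sum of observations / number of observations
mode = statistics.mode(obs)          # most frequent value
s = statistics.stdev(obs)            # sample standard deviation (n - 1 divisor)

print(f"mean = {x_bar}, mode = {mode}, s = {s:.3f}")
```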