Med Stats exam 1

The flashcards below were created by user cooxcooxbananas on FreezingBlue Flashcards.

  1. Statistics is the science of...
    collecting, organizing, summarizing, analyzing and interpreting data in order to make decisions
  2. biostatistics is statistics for...
    biomedical applications
  3. The two branches of statistics are...
    descriptive statistics and inferential statistics
  4. Descriptive statistics involves....
    organization, summarization and presentation of data
  5. Inferential statistics involves...
    using a sample to draw conclusions about a population
  6. Two kinds of variability are...
    • explained (or attributable) variability
    • unexplained variability ("noise")
  7. two things that unexplained variability in data leads to are...
    • uncertainty about conclusions drawn from data
    • unpredictability of the next observation or measurement
  8. approximation is when you use...
    a simpler idea, object or representation to stand for the more complex one of interest
  9. when you use approximation you gain...
    convenience, feasibility, reduced cost or effort and clarity
  10. when you use approximation you lose...
    some characteristics or information in the original
  11. summarization is one form of...
    approximation
  12. models are _______ to the real world
    approximations
  13. models are an...
    abstract representation of some phenomenon or process
  14. models always miss....
    some properties of the original
  15. statistical models explicitly recognize the presence of...
    unattributable variability in data
  16. A population is...
    the collection of all responses, measurements, or counts that are of interest
  17. A sample is a...
    subset of a population
  18. A census is...
    a complete collection of the population
  19. What are the four reasons to use a sample?
    • the population is too large to obtain data
    • saves time and money
    • sometimes units are destroyed in measurement
    • all members of a population may be difficult to contact
  20. a disadvantage to using a sample is...
    having some error
  21. A parameter is...
    a numerical description of a population characteristic
  22. A statistic is..
    a numerical description of a sample characteristic
  23. what are the 5 kinds of sampling techniques
    • simple random sampling
    • stratified sampling
    • cluster sampling
    • systematic sampling
    • Convenience Sampling
  24. every member of the population has an equal chance of being selected in what kind of sampling?
    simple random sampling
  25. What sampling is analogous to putting everyone's name in a hat and drawing out names at random?
    simple random sampling
  26. What is multi-stage random sampling?
    When subgroups of the population are first randomly selected and then, within each subgroup, simple random sampling is done.
  27. When a population is divided into groups (strata) according to some characteristic it is called ________ sampling.
    stratified
  28. The sampling in which a simple random sample is selected from each group and then combined to form a final sample is...
    stratified sampling
  29. What kind of sampling is useful when the population falls into subgroups that each have similar characteristics?
    cluster sampling
  30. What is the sampling in which the final sample consists of all members of one or more of the groups?
    cluster sampling
  31. In what sampling is each member of the population assigned a number and then ordered?
    systematic sampling
  32. In which sampling is the final sample made up of every kth member?
    systematic sampling
  33. What is the easiest sampling technique, but also the worst?
    convenience sampling
  34. In which sampling is only readily available data used?
    convenience sampling
  35. Which sampling is often not representative of the population?
    convenience sampling
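The sampling techniques on the cards above can be sketched in a few lines of Python. This is only an illustration: the function names, the numbered population, the two strata and the three "ward" clusters are all made up for the example, and a fixed seed is used so the draws are repeatable.

```python
import random

def simple_random_sample(population, n, seed=0):
    # Every member has an equal chance of selection (names in a hat).
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, k, seed=0):
    # Members are numbered and ordered; take every kth one from a random start.
    rng = random.Random(seed)
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, seed=0):
    # Draw a simple random sample from EACH stratum, then combine them.
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, seed=0):
    # Randomly pick whole clusters and keep ALL members of the chosen ones.
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical data: subjects numbered 1..100.
population = list(range(1, 101))
strata = {"ids_1_to_50": list(range(1, 51)), "ids_51_to_100": list(range(51, 101))}
clusters = {"ward_a": [1, 2, 3], "ward_b": [4, 5, 6], "ward_c": [7, 8, 9]}

srs = simple_random_sample(population, 10)
sys_sample = systematic_sample(population, 10)
strat = stratified_sample(strata, 5)
clus = cluster_sample(clusters, 2)
print(srs, sys_sample, strat, clus, sep="\n")
```

Convenience sampling has no counterpart here because it involves no random selection at all, which is why it is often not representative.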
  36. Every measurement obtained in a study should always have good ______ and ______.
    validity and reliability
  37. Validity is related to concepts of...
    accuracy and bias
  38. Reliability is related to concepts of...
    precision and variation
  39. Three sources of measurement are...
    • readings
    • ratings
    • reports
  40. Readings come from...
    readings on lab equipment
  41. "Ratings" are when...
    "Judges" are used to assess the condition of subjects using predefined criteria
  42. "Reports" refer to...
    when subjects provide their own recollections and reports of symptoms, conditions and performances.
  43. Almost all worthwhile scientific studies involve comparison of one or more groups with respect to a....
    response variable
  44. response variables are usually determined at the end of...
    the subject's participation in the study
  45. response variables relate, in varying degrees, to ....
    the primary and secondary questions
  46. In a randomized clinical trial, subjects are randomized to......
    treatment groups
  47. In a randomized clinical trial, the groups must be similar except for.....
    the treatment being received.
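As a rough sketch of what "randomized to treatment groups" can look like in practice, the snippet below shuffles subject IDs and deals them out to groups in turn. The function name, the group labels and the ten subject IDs are hypothetical; real trials typically use more elaborate schemes.

```python
import random

def randomize(subject_ids, groups=("treatment", "control"), seed=0):
    # Shuffle the subjects, then deal them out to the groups in turn,
    # so chance alone decides who ends up in which group.
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return {sid: groups[idx % len(groups)] for idx, sid in enumerate(shuffled)}

assignment = randomize(range(1, 11))
for sid in sorted(assignment):
    print(sid, assignment[sid])
```

Because assignment depends only on the shuffle, the groups should differ systematically only in the treatment received.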
  48. What kind of trial is the best scientific approach for a comparative study?
    Randomized clinical trial
  49. Why is the randomized clinical trial considered to be the best scientific approach for a comparative study?
    Because the differences observed can be attributed to treatment.
  50. Randomized clinical trial can sometimes be ______ or ______ impossible to conduct.
    ethically or practically
  51. In randomized clinical trials, there needs to be a link between ______ and _____.
    efficacy and effectiveness
  52. Efficacy is..
    how well the therapy works under ideal conditions
  53. Effectiveness is...
    how well therapy works in a real-world setting
  54. An uncontrolled trial generally provides a ________ view of therapy
  55. Sometimes a "no-treatment" control group is not....
  56. Historical controls deal with...
    using a comparison group obtained from the medical records of similar subjects.
  57. The advantages to using historical controls are..
    • it's cheap and simple
    • All subjects receive the new treatment, which investigators believe is superior
  58. The disadvantages to using historical controls are..
    • the quality and availability of historical data
    • Criteria of response may change
    • Ancillary patient care improves
  59. Using historical controls proves to be appropriate if the disease is...
    uniformly fatal initially and a new drug becomes available
  60. When using historical controls a decline in fatality would signify what?
    that the treatment works
  61. what are the four reasons that treatments and controls do differ?
    • sampling variability or chance
    • inherent differences between treatment and control subjects
    • Differences in the handling and evaluation of the treatment and control groups during the course of the investigation
    • True effect of the new procedure
  62. A good experimental design will reduce, if not eliminate what two factors?
    • inherent differences between treatment and control subjects
    • Differences in the handling and evaluation of the treatment and control groups during the course of the investigation
  63. Which design is the simplest, most used design?
    Parallel group design
  64. In which design are subjects randomized to groups and followed in a parallel fashion?
    parallel group design
  65. in parallel group design, each subject gets how many treatment assignments?
    one
  66. in which design do subjects act as their own control?
    cross-over design
  67. in which design are generally fewer subjects needed?
    cross-over design
  68. In cross-over design, subject differences do not interfere with...
    treatment comparisons
  69. in the cross-over design, subjects are randomized to order of.....
    treatments
  70. In which design is there a "washout" period so that the first treatment that subjects receive leaves the system?
    cross-over design
  71. What two problems are always a concern with the cross-over design?
    • carry-over effects
    • drop-outs
  72. What kind of therapies are cross-over designs appropriate for?
    therapies that may offer short-term relief of signs or symptoms and not a cure for a condition.
  73. in the cross-over design, comparisons are made between groups based on ......
    how we intended to treat the subject
  74. Our comparisons are based on how we intended to treat the subject because...
    Subjects may not always fully comply with the assigned treatment.
  75. Three examples of Observational studies are...
    • case-control study
    • cohort study
    • cross-sectional study
  76. In which study are characteristics of a sample observed at one point in time?
    cross-sectional study
  77. What is the advantage to a cross-sectional study?
    quick and cheap
  78. what are the disadvantages to a cross-sectional study?
    • often difficult to be sure that exposure precedes disease
    • only measures prevalence
  79. Which study is often called a "prospective study"?
    cohort study
  80. What is a cohort study?
    when you have a group of subjects that are classified according to some characteristics that might be related to an outcome and then followed over time to observe the outcome.
  81. What is the advantage to a cohort study?
    good for rare exposures
  82. what are the disadvantages of a cohort study?
    • time consuming
    • expensive
    • good follow-up is difficult
  83. why might a good follow up in a cohort study be difficult to obtain?
    subject dropouts
  84. What is a case-control study characterized by?
    the identification of the two study groups on the basis of the presence or absence of the outcome of interest, and by retrospective observation of antecedent factors under study
  85. In a case-control study, the control must be representative of...
    the population from which the cases came
  86. A matched case-control study is when...
    a control is matched to every case by certain factors, so that the two groups will be more similar
  87. what are the advantages of case-control study?
    • good for rare diseases
    • fast
    • inexpensive
    • can simultaneously examine several antecedent factors of interest
  88. what are the disadvantages of the case-control study?
    • not good for rare exposures
    • indirect way of assessing effects
  89. cohort and case-control studies are considered to be longitudinal, meaning...
    the subjects are studied at more than one time
  90. What is confounding?
    a mixing of the effect of a third factor into the exposure-response relationship
  91. Confounding can be controlled in the analysis if...
    information on the confounders is available
  92. What is a bias?
    a condition, tendency or inclination that prevents a fair comparison of groups
  93. A selection bias is...
    the decision to admit a subject to the trial, or which group to assign the subject to, is affected by knowledge of how the subject may respond to treatment.
  94. In a volunteer bias, subjects who volunteer, or refuse new treatments may be...
    very different
  95. many studies show that _____ tend to be healthier than the general population and are more likely to comply with medical recommendations.
    volunteers
  96. Using a "placebo effect" creates a ______ bias.
  97. what is a placebo effect?
    knowledge that a patient is being treated affects the patient's response to treatment
  98. if a placebo is being used, subjects should be ______ to the treatment if possible.
    blinded
  99. Assessment bias is when knowledge of...
    the treatment received by the subject affects the researcher's assessment
  100. a kind of bias in which an observer may be unconsciously prejudiced is...
    assessment bias
  101. assessment bias may be avoided by...
    double blinding
  102. Double blinding is...
    when neither the subject nor the assessor knows which treatment the subject is receiving
  103. A data set is..
    a set of values of one or more variables for a collection of individuals or units
  104. a binary endpoint variable is...
    classifying the members of a given set of objects into two groups on the basis of whether they have some property or not. (if they are sick or not)
  105. a nominal endpoint variable is...
    • classification based on a categorical sense. Subjects are classified by different qualitative categories.
    • (if the values / observations belonging to it can be assigned a code in the form of a number where the numbers are simply labels.)
  106. an ordinal endpoint variable is ...
    classification based on an order of rank (1st, 2nd, 3rd, etc.)
  107. a discrete endpoint variable is....
    if the values / observations belonging to it are distinct and separate, i.e. they can be counted (1,2,3,....)
  108. a continuous endpoint variable is...
    if the values / observations belonging to it may take on any value within a finite or infinite interval. You can count, order and measure continuous data (for example, height, weight, temperature).
  109. You can always convert a variable from a more ______ scale to less, but not ________.
    • informative
    • vice versa
  110. What is the tabular display?
    Frequency distribution
  111. What are the graphical displays?
    Frequency histogram, dot plot, stem-and-leaf plot, pie chart, pareto chart, scatter plot, time series
  112. Frequency distributions are _______ tables for a _____ variable
    • one-way
    • single
  113. The relative frequency is...
    Proportion of observations that take each value
  114. a list of relative frequencies is often called the...
    distribution of the variable
  115. Frequency distribution shows how observed values are distributed across...
    possible values.
  116. For what kind of data is there sometimes too many values to list in a frequency distribution?
    continuous and discrete
  117. When using frequency distributions, it is good to use cumulative frequency and cumulative relative frequency when you have what kind of data?
    ordinal or quantitative data
  118. Cumulative frequency
    running sum of frequencies
  119. cumulative relative frequency
    running sum of relative frequencies
  120. the cumulative relative frequency gives proportion of observations that are...
    less than or equal to the value
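The frequency, relative frequency and cumulative columns described on the cards above can be built by hand as follows. This is a small sketch using only the standard library; the ten data values are invented for illustration.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical discrete data (e.g. number of clinic visits per subject).
values = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]

counts = Counter(values)
levels = sorted(counts)                      # possible values, in order
freq = [counts[v] for v in levels]           # frequency of each value
n = sum(freq)
rel_freq = [f / n for f in freq]             # proportion taking each value
cum_freq = list(accumulate(freq))            # running sum of frequencies
cum_rel_freq = [c / n for c in cum_freq]     # proportion <= each value

print("value  freq  rel  cum  cum_rel")
for row in zip(levels, freq, rel_freq, cum_freq, cum_rel_freq):
    print(*row, sep="\t")
```

The last cumulative relative frequency is always 1, since every observation is less than or equal to the largest value.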
  121. Cumulative relative frequency is not sensible for what kind of data?
    nominal (unordered qualitative) data
  122. A histogram is a...
    graphical display of the frequency or relative frequency
  123. The horizontal scale of the histogram is
  124. the vertical scale of the histogram measures...
    frequency or relative frequency
  125. [image not included]
    This chart is a...
  126. The dot plot is a one-way...
    scatter plot
  127. a dot plot is an alternative to the...
  128. [image not included]
    This graph is called...
    dot plot
  129. A pie chart shows a relative frequency of...
    qualitative data values
  130. In a pie chart, the area of each category is proportional to ...
    the corresponding relative frequency
  131. Angle for data entry in a pie chart =
    relative frequency x 360°
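A quick worked example of the angle formula, using made-up category counts (the blood-type labels and numbers below are purely illustrative):

```python
# Hypothetical counts for a qualitative variable.
counts = {"A": 45, "B": 40, "AB": 5, "O": 10}

total = sum(counts.values())
# angle = relative frequency x 360 degrees
angles = {label: (count / total) * 360 for label, count in counts.items()}

for label, angle in angles.items():
    print(f"{label}: {angle:.1f} degrees")
```

The angles add up to 360 degrees because the relative frequencies add up to 1.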
  132. A Pareto chart is...
    a chart that is similar to histogram but used for qualitative data
  133. in a pareto chart, the vertical axis may represent...
    frequency or relative frequency
  134. in a pareto chart, the bars are ordered from...
    largest to smallest frequency
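The bar ordering for a Pareto chart can be sketched without any plotting library: sort the categories from largest to smallest frequency. The category names and counts here are hypothetical.

```python
from collections import Counter

# Hypothetical qualitative data: reason for an emergency visit.
counts = Counter({"infection": 12, "fracture": 30, "burn": 5, "other": 8})

# most_common() returns (category, frequency) pairs, largest frequency first,
# which is the left-to-right bar order of a Pareto chart.
ordered = counts.most_common()
total = sum(counts.values())

for category, freq in ordered:
    print(f"{category:10s} {freq:3d}  {freq / total:.2f}")
```

Either column (frequency or relative frequency) could serve as the vertical axis.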
  135. [image not included]
    This graph is a ...
    Pareto chart
  136. A graph for bivariate data is...
    scatter plot
  137. in a scatter plot, quantitative data is on which axis?
    both of them
  138. Each point of a scatter plot represents...
    one observation
  139. a scatter plot can show the relationship between...
    two quantitative variables
  140. [image not included]
    What kind of graph is this?
    Scatter plot
  141. A time series chart is for....
    entries taken regularly over time
  142. A time series chart is useful for...
    identifying trends
  143. [image not included]
    This type of graph is...
    a time series chart
  144. Using a mode is not useful with ____ data
    continuous
  145. the mode is useful for _____ data
    discrete (ex. Counts)
  146. the mean is only appropriate for ______ data
    quantitative
  147. Outliers are values which are...
    different from the bulk of the data
  148. the mean is very ______, always pulled in the direction of outliers.
    sensitive
  149. in response to outliers, the ____ is less sensitive than the mean
    median
  150. Average deviation always equals...
    0
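A small numeric illustration of the last few cards: adding one outlier drags the mean toward it while the median barely moves, and the deviations from the mean always average to zero. The data values are invented.

```python
from statistics import mean, median

data = [2, 3, 3, 4, 5]
with_outlier = data + [50]          # one extreme observation added

print(mean(data), median(data))                  # mean 3.4, median 3
print(mean(with_outlier), median(with_outlier))  # mean jumps, median is 3.5

# Deviations from the mean cancel out, so their average is 0
# (up to floating-point rounding).
deviations = [x - mean(data) for x in data]
average_deviation = sum(deviations) / len(deviations)
print(average_deviation)
```

This cancellation is the reason spread is measured with squared deviations (variance) rather than raw ones.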
  151. the advantages of sample standard deviation is...
    • uses every observation
    • mathematically manageable
  152. disadvantage to using standard deviation
    sensitive to extreme observations
  153. The sample variance is...
    the "average" of squared deviations
  154. The variance/Standard deviation is always...
    greater than or equal to 0
  155. In terms of (variance/SD) a larger value means _______ variability.
    more
  156. for (variance/SD) = 0, there is no...
    variability (all data have the same value)
  157. comparing two or more variances is only meaningful when in ...
    the same units
  158. the coefficient of variation is ...
    a unitless measure of relative variability
  159. the coefficient of variation can be used to...
    compare the relative variation between any sets of values
  160. the advantages of the CV are..
    dimensionless, independent of units used
  161. the disadvantages of the CV are..
    statistically awkward, and can only be used when the mean is greater than 0
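The following sketch computes the sample variance, standard deviation and coefficient of variation for the same hypothetical heights recorded in two different units. It assumes the usual definitions: sample variance divides the sum of squared deviations by n - 1, and CV = SD / mean. The helper name `cv` is made up for the example.

```python
from statistics import mean, stdev, variance

# Hypothetical heights, recorded in centimetres and again in metres.
heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]

def cv(values):
    # Coefficient of variation: SD relative to the mean (unitless).
    return stdev(values) / mean(values)

print(variance(heights_cm), stdev(heights_cm))  # in cm^2 and cm
print(variance(heights_m), stdev(heights_m))    # in m^2 and m: different numbers
print(cv(heights_cm), cv(heights_m))            # same value in either unit
```

The variance and SD change with the unit of measurement, while the CV does not, which is what makes it usable for comparing relative variation across differently scaled data.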
  162. p=
    the percentage of data less than or equal to desired percentile, divided by 100 (25th percentile = 25/100)=p=.25
  163. percentiles are also called...
    quantiles
  164. i=
    # of observation (1st,2nd,etc.)
  165. i formula=
  166. if i is an integer, then 100pth percentile =
  167. if i is not an integer, then 100pth percentile =
    x_k + (i - k)(x_{k+1} - x_k)
  168. k=?
    the integer part of i
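The interpolation rule on the cards above can be sketched as below. One caveat: the deck does not state the formula for i, so this example ASSUMES the common convention i = p(n + 1); other courses and software use different conventions and will give slightly different answers. The function name and data are hypothetical.

```python
import math

def percentile(sorted_values, p):
    """100p-th percentile by linear interpolation between order statistics."""
    n = len(sorted_values)
    i = p * (n + 1)            # ASSUMPTION: position formula, not given in the deck
    k = math.floor(i)          # k = integer part of i
    if k < 1:                  # position falls before the first observation
        return sorted_values[0]
    if k >= n:                 # position falls at or after the last observation
        return sorted_values[-1]
    x_k = sorted_values[k - 1]     # kth smallest value (lists are 0-indexed)
    x_k1 = sorted_values[k]        # (k+1)th smallest value
    return x_k + (i - k) * (x_k1 - x_k)

data = [2, 4, 6, 8, 10, 12, 14]    # already sorted, n = 7
print(percentile(data, 0.25))      # i = 2 is an integer, so the result is x_2
print(percentile(data, 0.40))      # i = 3.2, interpolates between x_3 and x_4
```

When i is an integer, i - k is 0 and the interpolation formula collapses to the single observation x_k.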
  169. Quartiles are...
    percentiles which divide the distribution into four equal parts
  170. lower quartile = Q1= ___ percentile
    25th
  171. middle quartile = Q2= ____percentile
    50th
  172. upper quartile = Q3= ____percentile
    75th
  173. Range =
    Largest observation - smallest observation (max-min)
  174. the advantage of range is that..
    it is easily determined
  175. the disadvantages of Range are..
    • only based on two values
    • depends on the number of observations
    • usually too sensitive to extreme observations to be useful
  176. Range never ______.
  177. IQR (interquartile range) is based on...
    the middle 50% of the data
  178. IQR = formula?
    Q3 - Q1
  179. the advantages to IQR are....
    • not sensitive to extreme values
    • independent of n
  180. the disadvantages of IQR are...
    • cannot be determined for small n
    • does not directly use majority of data
  181. a boxplot is...
    a graphical summary of the distribution of data for a single variable
  182. a boxplot can also be called a...
    box and whiskers plot
  183. the three parts to the boxplot are...
    • "box" covers interval from Q1 to Q3
    • whiskers extend from box to furthest observation within 1.5 x IQR from box
    • Observations beyond whiskers shown individually
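The numbers behind a boxplot can be computed without drawing anything. The sketch below uses Python's `statistics.quantiles` for the quartiles; its default method is one of several quartile conventions and may not match the one used in the course, so treat the exact quartile values as an assumption. The data are invented and include one extreme value.

```python
from statistics import quantiles

data = sorted([1, 2, 3, 4, 5, 6, 7, 30])

q1, q2, q3 = quantiles(data, n=4)   # quartiles (default 'exclusive' method)
data_range = data[-1] - data[0]     # max - min, driven by the extreme value
iqr = q3 - q1                       # spread of the middle 50% of the data

# Whiskers reach the furthest observations within 1.5 x IQR of the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
inside = [x for x in data if lower_fence <= x <= upper_fence]
whiskers = (inside[0], inside[-1])

# Observations beyond the whiskers are shown individually.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("box:", (q1, q3), "median:", q2)
print("range:", data_range, "IQR:", iqr)
print("whiskers:", whiskers, "outliers:", outliers)
```

Here the single extreme value inflates the range but leaves the IQR untouched, matching the advantages and disadvantages listed on the cards.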
  184. a boxplot ______ data near the center of the distribution and ____ individual observations far from center
    • summarizes
    • shows
  185. number of peaks in data is the...
    modality
  186. one peak in data =
    unimodal
  187. two peaks in data =
    bimodal
  188. symmetric data=
    data values are mirrored about a central value, one side is the mirror-image of the other
  189. skewed data=
    data values are more spread out in one direction than another
  190. right skew means it is ____ tailed and _____ skewed.
    • right
    • positively
  191. in the right skew..
    mean ___ median
    mean > median
  192. left skew means it is ____ tailed and _____ skewed.
    • left
    • negatively
  193. in the left skew..mean ___ median
    mean < median
  194. skewness pulls the mean in the direction of....
    the tail
  195. a symmetric and unimodal graph means...
    mean ___ median
    mean = median
  196. a reason a bimodal graph might occur is because...
    it may be due to random chance, or the sample is made up of two distinct subgroups
  197. multiplying every observation by positive constant C does what three things?
    • multiplies mean, median by C
    • multiplies SD, Range, IQR by C
    • multiplies variance by C^2
  198. adding a constant C to every observation does what two things...
    • Adds C to mean, median
    • Does not change measures of spread (Range, IQR, Variance, SD)
  199. linear transformations (adding or multiplying constant) changes what?
    changes location or location and spread but not the shape of distribution
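The effect of linear transformations on the summary measures can be checked numerically. This is a minimal sketch with invented data and an arbitrary positive constant C = 3.

```python
from statistics import mean, median, stdev, variance

data = [2, 4, 4, 6, 9]
C = 3                                 # any positive constant

scaled = [C * x for x in data]        # multiply every observation by C
shifted = [x + C for x in data]       # add C to every observation

def value_range(values):
    return max(values) - min(values)

# Multiplying: location and spread scale by C, variance by C squared.
print(mean(scaled), median(scaled), value_range(scaled), variance(scaled))

# Adding: location shifts by C, measures of spread are unchanged.
print(mean(shifted), median(shifted), value_range(shifted), variance(shifted))
```

In both cases the shape of the distribution is the same; only the axis it sits on has been rescaled or relabeled.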
  200. non-linear transformations change the ___ of a distribution
    shape
  201. non-linear transformations are useful when...
    modeling data
Card Set:
Med Stats exam 1
2012-09-12 16:29:13
Med Stats
