biostatistics exam 1 ch 1-3
for 1st exam in biostatistics class ch 1-3
systematic study of one or more problems, usually posed as research questions, germane to a specific discipline.
larger group researcher wants to draw conclusions about.
characteristic of a population.
usually unknown but are estimated with statistics.
group of the population that is actually studied.
characteristic of a sample.
a branch of applied mathematics that deals with collecting, organizing, and interpreting data using well-defined procedures.
purpose of statistics (3 parts)
describe and summarize info, reducing it to smaller, more meaningful data sets.
make predictions or generalize about occurrences based on observations.
identify associations, relationships or differences in observations.
types of statistics (2)
descriptive and inferential.
characterizes data by summarizing it into more understandable terms without losing or distorting much information.
provides predictions about a population's characteristics based on information from a sample of that population.
raw materials of research gathered from a sample that has been selected from a population.
characteristic being measured that varies among persons, events, or objects being studied.
a concept that a method of measurement has been determined for.
assignment of numerals to objects or events according to a set of rules.
types of measurement scales (4)
nominal, ordinal, interval & ratio.
lowest form of data.
organizes data into discrete units.
allows researcher to assign numbers that classify characteristics of people, objects or events into categories.
assignment of numerals is arbitrary.
qualitative nominal variable
categorical nominal variable
types of nominal variables
categorical and qualitative
places characteristics into categories and categories are ordered in some meaningful way.
distance between categories is unknown.
distances between category values are equal due to some accepted physical unit of measurement.
i.e. temperature in degrees Fahrenheit
types of interval variables
continuous and discrete
continuous interval variable
may take on any numerical value within a variable's range.
discrete interval variable
takes on only a finite number of values between two points.
most precise level of measurement.
meaningfully ordered characteristics with equal intervals between them and the presence of a zero point that is determined by nature.
i.e. pulse, bp, weight
clinical or substantive meaning of the results of statistical analysis.
principles of statistical data handling to fill the gap between getting data into the computer and running statistical tests.
summarize the key dilemmas that researchers face when entering data into the computer.
he wrote a book in 1996 about them. our book only covers 18.
dp - appropriate data principle
you cannot analyze what you do not measure.
must anticipate the variables needed to explain results.
dp - social consequences principle
data about people are about people.
can have social consequences.
i.e. if a drug is proven not to work better, is it unethical to advise people to take it?
dp - data control principle
take control of structure and flow of data.
monitor procedure for layout of data record.
dp - data efficiency principle
be efficient in getting your data into a computer, but not at the cost of losing crucial information.
dp - change awareness principle
data entry is an interactive process. try to use the computer to do as much computing and debugging as possible.
dp - data manipulation principle
let the computer do as much work as possible.
let it manipulate the data for you by instructing it to do so.
dp - original data principle
always save a computer file of the original, unaltered data.
dp - default principle
know your software's default settings and whether they meet your needs.
(especially concerning missing values)
dp - complex data structure principle
if your software can accommodate complex data structures, then you might benefit from using that software feature.
dp - software's data relations principle
know if your software can perform the following four relations and if so, what commands are necessary for it to do so: subsetting, catenation, merging and relational database construction.
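the first three relations can be sketched with plain python (the record sets and field names here are made up; real stats packages have their own commands for these):

```python
# hypothetical record sets to illustrate subsetting, catenation, and merging
patients = [{"id": 1, "age": 60}, {"id": 2, "age": 45}, {"id": 3, "age": 72}]
labs = [{"id": 1, "hgb": 13.1}, {"id": 3, "hgb": 11.8}]

# subsetting: keep only the rows meeting a condition
over_50 = [p for p in patients if p["age"] > 50]

# catenation: stack record sets end to end
all_rows = patients + [{"id": 4, "age": 38}]

# merging: join two record sets on a shared key (id)
lab_by_id = {row["id"]: row for row in labs}
merged = [{**p, **lab_by_id[p["id"]]} for p in patients if p["id"] in lab_by_id]

print(over_50, len(all_rows), merged)
```

relational database construction, the fourth relation, generalizes merging to many linked tables.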
dp - software's sorting principle
know how to perform a sort in your software and whether your software requires a sort before a by group analysis or before merging.
dp - impossibility/implausibility principle
use the computer to check for impossible and implausible data.
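a minimal sketch of such a check in python (the fields and plausibility limits are illustrative, not from the text):

```python
# flag impossible (can't happen) and implausible (very unlikely) values before analysis
records = [
    {"id": 1, "age": 34, "pulse": 72},
    {"id": 2, "age": -3, "pulse": 68},   # impossible: negative age
    {"id": 3, "age": 41, "pulse": 260},  # implausible: pulse far outside normal limits
]

flags = []
for rec in records:
    if not 0 <= rec["age"] <= 120:
        flags.append((rec["id"], "impossible age"))
    if not 30 <= rec["pulse"] <= 220:
        flags.append((rec["id"], "implausible pulse"))

print(flags)
```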
dp - burnstein's data sensibility principle
run your data all the way through to the final computer analysis and ask yourself whether the results make sense.
dp - extant error principle
data bugs exist. even if you've corrected mistakes, it's possible you've missed something.
dp - manual check principle
nothing can replace another pair of eyes to check over a data set.
check it yourself or get someone else to do it.
dp - error typology principle
debugging includes detection and correction of errors.
try to classify each error as you uncover it.
dp - kludge principle
sometimes the way to manipulate data is not elegant and seems to waste computer resources.
patching together computer commands awkwardly to make the data do what you want.
dp - atomicity principle
you cannot measure below the data level that you observe.
i.e. age 21-25 nominal (lowest)
vs. age? ___ more precise
using specific methods to advance the science base of the discipline by studying phenomena relevant to the goals of that discipline.
objects being described by a set of data.
quantitative research methods
experiments, surveys, correlational studies, meta-analysis, and psychometric evaluations.
simplest form of a chart for nominal or ordinal data.
category labels horizontally in a systematic order with vertical bars with spaces between.
appropriate for interval, ratio, and sometimes ordinal variables.
similar to bar chart except bars are placed side by side.
a chart for interval or ratio variables.
it is equivalent to a histogram but appears smoother; made by connecting the midpoints of the tops of the bars.
a circle partitioned into the percentage distribution of a quantitative variable; total area of 100% = 360 degrees.
data is organized into values or categories and then described with titles and captions.
a frequency distribution for interval or ratio variables.
an ordered array of values.
mean and formula
best known and widely used average.
the center of a frequency distribution.
x̄ = M = Σx / n
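the formula in python (the sample scores are made up):

```python
# sample mean: sum of all observations divided by n
scores = [4, 8, 6, 5, 7]
mean = sum(scores) / len(scores)
print(mean)  # 6.0
```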
measures of central tendency
- center of trend or average
middle value of a set of ordered numbers.
point where 50% of distribution falls below and above.
not affected by outliers.
- place #'s in order and the middle number is median if n is odd
- if n is even average the middle two #'s
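the two rules above as a small python function (a sketch, not library code):

```python
def median(values):
    """middle value of the ordered numbers; average the middle two if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # n odd: the single middle number
    return (ordered[mid - 1] + ordered[mid]) / 2  # n even: average the middle two

print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```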
most frequent value or category in a distribution.
not calculated, just observed.
if all scores are different then there is no mode.
- use when dealing with frequency distributions for nominal data.
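a sketch of finding the mode in python (ties between equally frequent values are ignored here for simplicity):

```python
from collections import Counter

def mode(values):
    """most frequent value; None when all scores are different (no mode)."""
    top_value, top_count = Counter(values).most_common(1)[0]
    return top_value if top_count > 1 else None

print(mode([2, 3, 3, 5]))  # 3
print(mode([1, 2, 3]))     # None
```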
having low variability.
having high variability.
numbers are spread out.
measure of spread or dispersion.
measure of degree to which scores in a distribution are spread out or clustered together.
types of variability
standard deviation (SD)
range of values extending from the 25th percentile to the 75th percentile.
- divide by 2 for the semi-interquartile range
max - min... highest # - lowest #.
simplest measure of variability.
sensitive to extreme values.
unstable, since it's based on only two numbers.
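the range in python (the numbers are made up):

```python
data = [12, 3, 7, 25, 9]
value_range = max(data) - min(data)  # highest # minus lowest #
print(value_range)  # 22
```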
standard deviation (SD)
measure of dispersion of scores around the mean.
most widely reported, indicates spread.
low SD means close together and high means spread out.
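the sample SD written out in python (this sketch uses the n - 1 sample formula; the numbers are made up):

```python
import math

def sample_sd(values):
    """dispersion of scores around the mean (sample formula, divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance)

print(sample_sd([2, 4, 4, 4, 5, 5, 7, 9]))  # about 2.14
```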
interquartile range (IQR)
-range of values from P25 to P75
not sensitive to outliers.
used on growth charts.
good with skewed data.
use with median.
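a sketch of the IQR using one common percentile convention (linear interpolation between ordered values; other conventions give slightly different numbers):

```python
def percentile(values, p):
    """value at the p-th percentile, interpolating between ordered observations."""
    ordered = sorted(values)
    k = (len(ordered) - 1) * p / 100      # fractional position in the ordered list
    lo = int(k)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (k - lo)

data = [1, 3, 5, 7, 9, 11, 13, 15]
iqr = percentile(data, 75) - percentile(data, 25)  # P75 - P25
print(iqr)      # 7.0
print(iqr / 2)  # semi-interquartile range
```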
a non-symmetrical distribution.
measure of symmetry.
measure of flatness.
measures of variability
types of data transformation
square root transformation
square root transformation
degrees of freedom
pearson's skewness coefficient
fisher's measure of skewness