# Statistics

The flashcards below were created by user clainez27 on FreezingBlue Flashcards.

1. Population
• a collection of persons, objects or items of interest.
• Whatever the researcher is studying
2. parameter
• a descriptive measure of the population. Usually denoted by Greek letters
• e.g. mean (µ), population variance (σ^2), population standard deviation (σ)
• data from a census are parameters
3. sample
a portion of the whole and, if taken properly, representative of the whole
4. statistic
• a descriptive measure of the sample. Usually denoted by Roman letters
• e.g. mean (x̄), sample variance (s^2), sample standard deviation (s)
• data from a sample are statistics
5. Descriptive Statistics
• Using data gathered on a group to describe or reach conclusions about that same group
• e.g. most athletic stats. The data is gathered from that group and conclusions are drawn about that group only. Basketball stats are about Basketball
6. Inferential Statistics
• gathering data from a sample and using the statistics generated to reach conclusions about the population from which the sample was taken
• sometimes referred to as inductive statistics
7. empirical rule
• The approximate percentage of values that lie within a given number of standard deviations of the mean of a set of data, if the data are normally distributed.
• Distance from the Mean: Values within Distance
• µ ± 1σ: 68%
• µ ± 2σ: 95%
• µ ± 3σ: 99.7%
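The percentages in the empirical rule can be checked against a normal distribution. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
```

The exact proportions are slightly different from the rounded 68%, 95% and 99.7% quoted on the card, which is why the rule is called approximate.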
8. Population Mean
• µ = (∑x)/N
• where x = actual data values
• N = # total terms
9. standard deviation
• square root of the variance
• σ = sqrt(σ^2)
• σ = sqrt((∑(x - µ)^2)/N)
10. sum of squares of x
• SSx
• The sum of the squared deviations about the mean of a set of values
11. variance
• average of the squared deviations about the arithmetic mean for a set of numbers
• Population Variance
• σ^2 = (∑(x - µ)^2)/N
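12. deviation from the mean
• x - µ
• the actual value of a given number minus the mean
13. mean absolute deviation
• the average of the absolute values of the deviations around the mean for a set of numbers
• MAD = (∑|x - µ|)/N
• where
• x - µ = actual value of a given number minus the mean
• N = number of terms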
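The population formulas on the preceding cards (mean, sum of squares, variance, standard deviation, and the average of absolute deviations) can be worked through on a small data set. This is only a sketch: the five values are hypothetical and are treated as a complete population.

```python
import math

# Hypothetical population values, chosen only for illustration.
data = [5, 9, 16, 17, 18]
N = len(data)

mu = sum(data) / N                          # population mean
ss_x = sum((x - mu) ** 2 for x in data)     # sum of squares of x (SSx)
variance = ss_x / N                         # population variance
sigma = math.sqrt(variance)                 # population standard deviation
mad = sum(abs(x - mu) for x in data) / N    # average of the absolute deviations

print(mu, ss_x, variance, round(sigma, 3), mad)
```

Each quantity builds on the one before it: the deviations x - µ feed SSx, SSx divided by N gives the variance, and its square root gives the standard deviation.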
14. Chebyshev's Theorem
• at least a proportion of 1 - 1/k^2 of the values will fall within ± k standard deviations of the mean, regardless of the shape of the distribution. Assume k > 1
• e.g. k = 2.5: 1 - 1/(2.5^2) = .84, so at least .84 of all values are within µ ± 2.5σ
• that is, at least 84% of all values will be within 2.5 standard deviations of the mean, µ
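The Chebyshev bound is a one-line calculation. A minimal sketch follows; the function name `chebyshev_lower_bound` is an illustrative choice, not a library function.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem assumes k > 1")
    return 1 - 1 / k**2

print(round(chebyshev_lower_bound(2.5), 2))  # the k = 2.5 example from the card
print(chebyshev_lower_bound(2))
```

Unlike the empirical rule, this bound does not require the data to be normally distributed, so it is weaker: for k = 2 it guarantees only 75% rather than about 95%.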
15. sample variance
• variance: s^2 = (∑(x - x̄)^2)/(n-1)
• also
• s^2 = (∑x^2 - ((∑x)^2)/n)/(n-1)
• where
• x = actual value
• x̄ = sample mean
• n = sample size
16. sample standard deviation
• s = sqrt(s^2), where
• s^2 = (∑x^2 - ((∑x)^2)/n)/(n-1)
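The two sample-variance formulas (the definitional form and the computational form) give the same result. The sketch below checks this on hypothetical data and compares against Python's standard-library `statistics.variance` and `statistics.stdev`, which also divide by n - 1.

```python
import math
import statistics

sample = [5, 9, 16, 17, 18]  # hypothetical sample values
n = len(sample)
x_bar = sum(sample) / n

# Definitional form: squared deviations from the sample mean over n - 1.
s2_definitional = sum((x - x_bar) ** 2 for x in sample) / (n - 1)

# Computational form: uses only the sum of x and the sum of x squared.
sum_x = sum(sample)
sum_x2 = sum(x**2 for x in sample)
s2_computational = (sum_x2 - sum_x**2 / n) / (n - 1)

s = math.sqrt(s2_definitional)  # sample standard deviation

print(s2_definitional, s2_computational, round(s, 3))
```

Note the parentheses around n - 1: the whole numerator is divided by n - 1, not by n and then reduced by 1.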
17. Percentiles
• measures of central tendency that divide a group of data into 100 parts
• 87.7% = 87th Percentile
• percentile location: i=(P/100)n
• where P = percentile
• i= percentile location
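• n = number of values in the data set (ordered from smallest to largest)
• if i is a whole number, then the Pth percentile is the average of the values at positions i and i+1
• if i is NOT a whole number, then the Pth percentile is the value at position (whole-number part of i) + 1
• e.g. i = 11.8: position = 11 + 1 = 12, so the percentile is the 12th value
• e.g. i = 11: the percentile is the average of the 11th and 12th values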
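The location rule can be sketched as a small function: compute i = (P/100)n, then either average the values at positions i and i+1 (when i is whole) or take the value at the next whole position. The function name and the data are hypothetical, and P is assumed to be a whole number from 1 to 99. Integer arithmetic is used so that the whole-number test is not affected by floating-point rounding.

```python
def percentile(values, p):
    """Pth percentile using the location rule i = (P/100) * n."""
    if not 1 <= p <= 99:
        raise ValueError("p is assumed to be a whole number from 1 to 99")
    ordered = sorted(values)
    n = len(ordered)
    whole, remainder = divmod(p * n, 100)   # i = whole + remainder/100
    if remainder == 0:
        # i is a whole number: average the ith and (i+1)th values (1-based).
        return (ordered[whole - 1] + ordered[whole]) / 2
    # i is not whole: take the value at position (whole part of i) + 1.
    return ordered[whole]

data = [106, 109, 114, 116, 121, 122, 125, 129]  # hypothetical values
print(percentile(data, 30))  # i = 2.4, so the 3rd value
print(percentile(data, 25))  # i = 2, so the average of the 2nd and 3rd values
```

Other textbooks and software use different interpolation conventions, so results from a library routine may differ slightly from this rule.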
18. frequency distribution
• a summary of data presented in the form of class intervals and frequencies
• e.g. 1 under 3, 3 under 5, etc.
• rule of thumb: use 5-15 classes
19. range
difference between the largest and smallest values of an ordered set of data
20. classes
• 5-15 rule of thumb
• arrangement of values in groups
21. cumulative frequency
running total of frequency through the classes of a frequency distribution
22. relative frequency
proportion of total frequency that is in any given class interval in a frequency distribution
23. class width
range/# classes
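A frequency distribution with its cumulative and relative frequencies can be built in a few lines. This sketch uses hypothetical data and fixes the classes by hand as "1 under 3", "3 under 5" and "5 under 7" (a class width of 2) rather than deriving the width from range / number of classes.

```python
from itertools import accumulate

data = [1.2, 2.5, 3.1, 3.8, 4.4, 5.0, 5.9, 6.3, 6.7, 6.9]  # hypothetical
lower_bound, class_width, num_classes = 1, 2, 3

# Count how many values fall in each class interval.
frequencies = [0] * num_classes
for value in data:
    class_index = int((value - lower_bound) // class_width)
    frequencies[class_index] += 1

cumulative = list(accumulate(frequencies))       # running total through the classes
relative = [f / len(data) for f in frequencies]  # proportion of the total in each class

for k in range(num_classes):
    start = lower_bound + k * class_width
    print(f"{start} under {start + class_width}: "
          f"{frequencies[k]} {cumulative[k]} {relative[k]}")
```

The last cumulative frequency always equals the number of values, and the relative frequencies sum to 1.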
24. histogram
vertical bar chart typically used to depict a frequency distribution
25. frequency polygon
graph in which line segments connect the dots depicting a frequency distribution
26. ogive
cumulative frequency polygon- most useful for running totals
27. pie chart
• data represented as portions of a whole
• slice angle = (interval frequency / total) * 360 degrees
28. stem & leaf
constructed by separating the digits of each number in the data into 2 groups (a stem and a leaf)
29. pareto
• Vertical bar chart that displays the most common types of defects
• ranked in order of occurrence, left to right
30. scatter plot
• 2 dimensional plot of pairs of points from 2 variables
• good for attempting to determine the relationship between 2 variables
31. census
• gather data from a whole population
• data from a census are parameters
32. Levels of Data
• Lowest to Highest
• Nominal
• Ordinal
• Interval
• Ratio
33. Nominal
• Lowest level of data: Used only to classify or categorize
• e.g. doctor, lawyer, educator, other
• NON-METRIC Data, aka qualitative data.
34. Ordinal
• Higher than Nominal, can be used to rank or order subjects
• NON-METRIC Data, aka qualitative data.
35. Interval
• Higher than Ordinal
• Distances between consecutive numbers have meaning and the data are always numerical
• e.g. temperature
36. Ratio
• Highest Level of data measurement
• Have the same properties as Interval data, but they have an absolute zero, which indicates absence
• Ratio of two numbers is meaningful
• e.g. Height, weight, Kelvin temperature, passenger miles
37. Parametric Stats
Must be Interval or Ratio
38. Non-Parametric Stats
Data can be nominal or ordinal; these methods can also be used to analyze interval or ratio data
39. grouped data
data that have been organized into a frequency distribution
40. ungrouped data
raw data or data that have not been summarized in any way
41. median
• middle value in an ordered array of numbers
• for an array with an odd number of values, the median is the middle value
• for an array with an even number of values, the median is the average of the two middle values
• the median is located at position (n+1)/2
• e.g. for 77 terms the median is the (77+1)/2 = 39th term
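The odd and even cases can be sketched as follows. The function is written out by hand for illustration; Python's standard library also provides `statistics.median`, which should give the same results.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: position (n+1)/2 in 1-based terms is index n//2 in 0-based terms.
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([3, 1, 2]))
print(median([4, 1, 3, 2]))
print(median(range(1, 78)))  # 77 terms: the median is the 39th term
```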
42. Quartiles
• same rules as percentiles: if i is a whole number, Qx is the average of the ith and (i+1)th values
• Q25 = Q1 = first 25% of values ending in the Q25 term
• Q50 = Q2 = first 50% of values ending in the Q50 term
• Q75 = Q3 = first 75% of values ending in the Q75 term
• Q2 is the median
43. measure of central tendency
yields information about the center, or middle part, of a group of values
44. mode
• the most frequently occurring value in a set of data
• bimodal- data set has two modes
• multimodal - data set has more than two modes
45. Inter Quartile Range : IQR
• The middle 50% of values
• IQR = Q3-Q1
• e.g. if Q3 is the 12th term (70) and Q1 is the 4th term (5), IQR = 70 - 5 = 65
46. Coefficient of Variation
• The ratio of the standard deviation to the mean, expressed as a percentage and denoted CV
• CV = (σ/µ)100
• e.g. for σ=4.84 & µ = 64.4, CV = 7.5%
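The coefficient of variation example can be reproduced directly. The function name below is an illustrative choice.

```python
def coefficient_of_variation(sigma, mu):
    """Standard deviation as a percentage of the mean."""
    return (sigma / mu) * 100

cv = coefficient_of_variation(4.84, 64.4)
print(f"CV = {cv:.1f}%")
```

Because the CV is unit-free, it is useful for comparing the spread of data sets measured on different scales.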
47. z score
• number of standard deviations a value (x) is above or below the mean of a set of numbers when the data are normally distributed
• z = (x-µ)/σ
• e.g. x = 1, µ = 4.28, σ = 2.491, z = -1.32
• x = 9, µ = 4.28, σ = 2.491, z = 1.89
• z scores still follow the empirical rule
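The z score formula can be checked numerically. This sketch takes µ = 4.28 and σ = 2.491 for both values, which is the assumption under which x = 9 gives z = 1.89.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

print(round(z_score(1, 4.28, 2.491), 2))
print(round(z_score(9, 4.28, 2.491), 2))
print(z_score(4.28, 4.28, 2.491))  # a value equal to the mean has z = 0
```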
48. coefficient of correlation
• correlation: measure of the degree of relatedness of variables
• coefficient of correlation = r
• r = (big equation)
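The card does not spell out the equation for r. One common choice is the Pearson product-moment coefficient, r = ∑(x - x̄)(y - ȳ) / sqrt(∑(x - x̄)^2 · ∑(y - ȳ)^2). The sketch below assumes that this is the formula intended, and the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation (assumed to be the card's r)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

x = [1, 2, 3, 4, 5]   # hypothetical paired data
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 4))
print(pearson_r([1, 2, 3], [2, 4, 6]))  # points on a rising straight line
```

Values of r range from -1 to +1, with values near 0 indicating little linear relationship.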
49. classical method of assigning probability
• involves an experiment which is a process that produces outcomes, and an event, which is an outcome of an experiment.
• P(E) = n_e/N
• Highest probability of an outcome is 1.
• Lowest probability is 0
50. a priori
probabilities can be determined prior to the experiment
51. intersection
• contains the elements common to both sets
• X = {1, 2, 3, 4}, Y = {2, 3, 6, 7}, X∩Y = {2, 3}
52. mutually exclusive events
• when the occurrence of one event precludes the occurrence of another event
• e.g. Male and Female. OK and Defective. A person can not be both Male and Female and a part may not be both OK and Defective
• formula: P(X∩Y) = 0
53. independent events
• events wherein the occurrence or nonoccurrence of one of the events does not affect the occurrence or nonoccurrence of the other event.
• e.g. Coin tosses or Die Rolls. The previous event does not influence the following event
• formula: Independent Events X & Y
• P(X|Y) = P(X) and P(Y|X) = P(Y)
54. complement
• All the elementary events of an experiment not in A comprise its complement.
• e.g. If the experiment is rolling a die and the event is rolling a 5, then the complement is 1,2,3,4,6
• A'
• P(A') = 1 - P(A)
55. relative frequency of occurrence method of assigning probabilities
the probability of an event occurring is equal to the number of times the event has occurred in the past divided by the total number of opportunities for the event to have occurred.
56. subjective probability
assigning probability based on the feelings or insights of the person determining the probability
57. mn counting rule
• For an operation that can be done in m ways and a second operation that can be done in n ways, the two operations can occur, in order, in mn ways.
• This rule can be extended to 3 or more operations
• e.g. # of Groups possible with the following factors
• gender, marital status, economic class = 2 (m/f) × 3 (single-never married/married/divorced) × 3 (lower/middle/upper)
• = 18 groups. Therefore 18 samples could be taken to represent all groups.
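The 18 groups can be enumerated rather than just counted. This sketch uses `itertools.product` from the standard library; the category labels are taken from the example and the variable names are illustrative.

```python
from itertools import product

genders = ["male", "female"]
marital_statuses = ["single-never married", "married", "divorced"]
economic_classes = ["lower", "middle", "upper"]

# Every ordered combination of one choice from each factor.
groups = list(product(genders, marital_statuses, economic_classes))

print(len(groups))  # 2 * 3 * 3
print(groups[0])
```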
58. sampling from a population with Replacement
• sampling n items from a population of size N with replacement would provide N^n possibilities
• e.g. A die being rolled 3 times in succession, how many different outcomes can occur?
• N = 6, n=3, 6^3 = 216
• A lottery of reusable numbers 6 digits long from 0-9
• N=10, n=6, 10^6 = 1000000
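The N^n count can be confirmed by listing the outcomes for the die example. This is a small sketch using `itertools.product` with `repeat`; the lottery case is only computed, not enumerated.

```python
from itertools import product

die_faces = range(1, 7)  # N = 6 possible faces

# All ordered outcomes of rolling the die n = 3 times.
rolls = list(product(die_faces, repeat=3))

print(len(rolls), 6**3)
print(10**6)  # six-digit lottery with reusable digits 0-9
```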
59. Combinations
• Sampling n items from a population of size N without replacement
• NCn = N!/(n!(N-n)!)
• e.g. three lawyers are to be sent to a conference from a pool of 16
• 16!/(3!13!) = 560
• combination because once selected the lawyer can not be selected again
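The lawyer example can be checked three ways: with the factorial formula, with `math.comb` (available in Python 3.8 and later), and by listing the groups with `itertools.combinations`. The numbering of lawyers from 0 to 15 is just a stand-in for names.

```python
import math
from itertools import combinations

N, n = 16, 3

by_formula = math.factorial(N) // (math.factorial(n) * math.factorial(N - n))
by_comb = math.comb(N, n)
by_listing = len(list(combinations(range(N), n)))  # lawyers numbered 0..15

print(by_formula, by_comb, by_listing)
```

Order does not matter in a combination, so sending lawyers (0, 1, 2) is the same group as (2, 1, 0) and is counted once.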
60. Spearman Rank
• r_s = 1 - (6∑d^2)/(n(n^2-1))
• where d = difference in the ranks of each pair
• n = number of pairs
• A value near +1 indicates a strong positive correlation
• A value near -1 indicates a strong negative correlation
• e.g. for x and y pairs, a Spearman's value of -.830 indicates a strong inverse correlation,
• that is, when x is high y is low and vice versa
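Spearman's rank correlation can be sketched by ranking each variable and applying the formula to the rank differences. The helper names and data are hypothetical, and the sketch assumes there are no tied values (ties need averaged ranks, which this does not handle).

```python
def ranks(values):
    """Rank each value from 1 (smallest) to n (largest); assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman(x, y):
    """r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the rank difference."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - (6 * d_squared) / (n * (n**2 - 1))

x = [10, 20, 30, 40, 50]  # hypothetical paired data
y = [5, 4, 3, 1, 2]
print(round(spearman(x, y), 3))  # strongly negative: y falls as x rises
```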
61. General Law of Addition
• P(A∪B) = P(A) + P(B) - P(A∩B)
• That is, the probability of A plus the probability of B minus the probability of A and B together
62. Special Law of Addition
• Applies only if the events are mutually exclusive
• i.e. male or female, so P(A∩B) = 0
• Then P(A∪B) = P(A) + P(B)
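Both addition rules can be verified on a single roll of a fair die, using sets for events and exact fractions for probabilities. The events chosen here are hypothetical examples.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Classical probability: outcomes in the event over total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

# Overlapping events: the intersection must be subtracted once.
A = {2, 4, 6}   # even number
B = {4, 5, 6}   # greater than 3
general = prob(A) + prob(B) - prob(A & B)

# Mutually exclusive events: the intersection is empty, so nothing is subtracted.
C = {1, 2}
D = {5, 6}
special = prob(C) + prob(D)

print(general, prob(A | B))
print(special, prob(C | D))
```

Subtracting P(A∩B) corrects for the outcomes 4 and 6, which would otherwise be counted in both A and B.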
63. General Law of Multiplication
• This gives the probability that both A & B will happen at the same time
• P(A∩B) = P(A) * P(B|A) = P(B) * P(A|B)
• P(A∩B) means that A & B MUST happen.
• P(A|B) is the probability of A given that B is true
64. Special Law of Multiplication
If X, Y are independent, P(X∩Y) = P(X) * P(Y)
65. Independent Events X,Y
• To test to determine if X & Y are independent events, the following must be true
• P(X|Y) = P(X) and P(Y|X) = P(Y)
66. Conditional Probability
P(X|Y) = P(X∩Y)/P(Y) = (P(X)*P(Y|X))/P(Y)
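The multiplication laws, the independence test and the conditional probability formula can all be checked on one roll of a fair die. The events X, Y and Z are hypothetical examples, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Classical probability: outcomes in the event over total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

def conditional(a, b):
    """P(A|B) = P(A∩B) / P(B)."""
    return prob(a & b) / prob(b)

X = {2, 4, 6}      # even number
Y = {4, 5, 6}      # greater than 3
Z = {1, 2, 3, 4}   # 4 or less

print(conditional(X, Y))                       # P(X|Y)
print(prob(X) * conditional(Y, X) / prob(Y))   # same value via P(X)*P(Y|X)/P(Y)
print(conditional(X, Y) == prob(X))            # X and Y are not independent
print(conditional(X, Z) == prob(X))            # X and Z are independent
```

For the independent pair X and Z, the special law of multiplication holds: P(X∩Z) equals P(X) * P(Z).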
Author: clainez27 ID: 232539 Card Set: Statistics Updated: 2013-09-03 02:56:41 Tags: Statistics