The flashcards below were created by user
on FreezingBlue Flashcards.
What is data analysis?
- to organize and summarize data.
- Individuals vs. variables
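What are the two types of variables?
Categorical and quantitative
What are the 6 W's and the H?
Who, What, Why, Where, When, Whom, How
What is the distribution of a variable?
The values the variable takes and how often it takes each one (often shown as a list of numbers or a graph).
What do you use for categorical variables?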
bar graphs and pie charts
What is used for quantitative variables?
Dot plots, stem plots and histograms.
If the data set is small, what do you use?
a dot plot
If the data set is medium sized, what do you use?
If the data set is large, what do you use?
What is a percentage graph?
How do you create an ogive graph if you have the classes and frequencies?
Add cumulative frequency and cumulative percentage columns.
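The two extra columns for an ogive can be sketched in a few lines of Python. This is only an illustration: the class labels and frequencies are made-up values, not data from the cards.

```python
from itertools import accumulate

# Hypothetical classes and frequencies (made up for illustration).
classes = ["0-9", "10-19", "20-29", "30-39"]
frequencies = [4, 6, 8, 2]

total = sum(frequencies)

# Cumulative column: running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Percentage column: each running total as a percent of all observations.
cumulative_pct = [100 * c / total for c in cumulative]

for label, f, c, p in zip(classes, frequencies, cumulative, cumulative_pct):
    print(f"{label:>6} {f:>4} {c:>4} {p:>6.1f}%")
```

The ogive is then drawn by plotting the cumulative percentages against the classes.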
What's one way you can describe a distribution?
What does SOCS stand for?
Shape, Outliers, Center and Spread
What types of shapes are there?
- Skewed to the right.
- skewed to the left
- Peaks: unimodal or bimodal
What's an outlier?
An observation that falls outside the overall pattern of the data.
What are center and spread?
- Center is the middle value.
- Spread is how much the data vary. One measure is the interquartile range, IQR = Q3 - Q1.
What are the 2 ways to describe the center?
What's the downside of a mean?
It is easily influenced by outliers (it always gets pulled toward the outlier).
Is the mean a resistant measure?
The mean is NOT a resistant measure.
How can you find the median?
Arrange the observations from least to greatest and find the middle number. If there are two middle numbers, average them.
Is the median a resistant measure?
Yes, the median is a resistant measure. It resists outliers and is less influenced by them.
If a graph is skewed, do you use the mean or the median?
If the graph is skewed, use median.
If the graph is normal, do you use mean or median?
If the graph is normal, you use mean.
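A small Python sketch of why the median is resistant and the mean is not. The numbers are made up for illustration.

```python
from statistics import mean, median

data = [2, 3, 4, 5, 6]            # made-up values, roughly symmetric
with_outlier = data + [100]       # same values plus one extreme observation

# Without the outlier the mean and median agree.
print(mean(data), median(data))

# With the outlier the mean jumps toward it; the median barely moves.
print(mean(with_outlier), median(with_outlier))
```

Here the mean goes from 4 to 20, while the median only moves from 4 to 4.5.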
How can you report the spread?
by standard deviation and quartiles.
What are quartiles?
example of Q1 and Q3
- the first quartile : 25th percentile
- the third quartile: 75th percentile.
What is the five number summary?
Minimum, first quartile (Q1), median, third quartile (Q3), maximum.
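A minimal Python sketch of the five number summary. It assumes the "median of each half" convention for quartiles, which is only one of several in use, so software may give slightly different quartiles. The function name and the data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]               # values below the median position
    upper = xs[half + (n % 2):]     # values above it (middle value skipped when n is odd)
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

data = [7, 1, 3, 9, 5, 11, 13]      # made-up values
print(five_number_summary(data))
```

For this data the summary is 1, 3, 7, 11, 13.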
What are box plots used for?
Which questions should you ask yourself when comparing boxplots?
- they are used to compare two or more distributions.
- you should ask yourself:
- spread (IQR = Q3 - Q1)
What is the rule for finding an outlier?
Any value more than 1.5 x IQR below Q1 or more than 1.5 x IQR above Q3 is an outlier.
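The 1.5 x IQR rule can be sketched in Python as follows. The quartiles are taken as the medians of the lower and upper halves, which is an assumption about the convention; the function name and data are hypothetical.

```python
from statistics import median

def iqr_outliers(values):
    """Return the values lying more than 1.5 x IQR below Q1 or above Q3."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    q1 = median(xs[:half])                 # median of the lower half
    q3 = median(xs[half + (n % 2):])       # median of the upper half
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in xs if x < low_fence or x > high_fence]

data = [10, 12, 13, 14, 15, 16, 18, 40]    # made-up values
print(iqr_outliers(data))
```

Here Q1 = 12.5 and Q3 = 17, so IQR = 4.5 and the fences are 5.75 and 23.75. Only 40 falls outside them.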
What is standard deviation?
It measures how far the observations typically fall from their mean. It is written with the Greek letter sigma (σ).
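A short Python sketch of the calculation, using made-up data. It computes the population standard deviation (sigma) by hand and compares it with the standard library's `pstdev`. The sample version, `stdev`, divides by n - 1 instead of n.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]    # made-up values

m = mean(data)

# Population standard deviation: square root of the average squared
# distance of the observations from their mean.
sigma_by_hand = sqrt(sum((x - m) ** 2 for x in data) / len(data))

print(sigma_by_hand)    # by hand
print(pstdev(data))     # library version, divides by n
print(stdev(data))      # sample version, divides by n - 1
```

For this data the mean is 5 and sigma is 2. The sample standard deviation is a little larger.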