E1 statistics flashcards.txt

The flashcards below were created by user Anonymous on FreezingBlue Flashcards.

Statistics: a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data. It is the science of data.

Statistical thinking: applying rational thought and the science of statistics to critically assess data and inferences.

In this course we will divide our study of statistics into two categories: descriptive statistics, where we organize and summarize the data, and inferential statistics, where we use data to make predictions and decisions about a population based on information from a sample.

Descriptive statistics: uses numerical and graphical methods to look for patterns in a data set, to summarize the information revealed in a data set, and to present that information in a convenient form.

Inferential statistics: uses sample data to make estimates, decisions, predictions, or other generalizations about a larger set of data.

Population: the set of all measurements of interest to the investigator. Typically, there are too many experimental units in a population to consider every one. If we can examine every single one, we conduct what is called a census.

Sample: a subset of measurements selected from the population of interest.

Parameter: a numerical measurement describing some characteristic of a population, computed from all of the population measurements. For example, a population average (mean), obtained from every item in the population, is a parameter.

Statistic: a numerical measurement describing some characteristic of a sample drawn from the population.

Variable: a characteristic that changes over time or varies across different individuals or objects.
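The difference between a parameter and a statistic can be seen in a minimal Python sketch. The population here is made up (1,000 simulated measurements), and the names `population`, `sample`, and `estimation_error` are illustrative assumptions, not part of the card set.

```python
import random
import statistics

# Hypothetical population: 1,000 simulated measurements (illustrative only).
rng = random.Random(0)
population = [rng.gauss(50, 10) for _ in range(1000)]

# Parameter: computed from every measurement in the population (a census).
population_mean = statistics.mean(population)

# Statistic: computed from a sample, a subset selected from the population.
sample = rng.sample(population, 30)
sample_mean = statistics.mean(sample)

# Inferential statistics uses the statistic to estimate the parameter;
# the gap between them is the error of that particular estimate.
estimation_error = sample_mean - population_mean

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
print(f"estimation error:            {estimation_error:.2f}")
```

In practice the parameter is usually unknown, which is why the sample statistic is used in its place; the census is computed here only because the population is simulated.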
Experimental unit: the individual or object on which a variable is measured, or about which we collect data (a person, place, thing, or event).

Measure of reliability: a statement about the degree of uncertainty of a statistical inference.

Continuous numerical data: result from infinitely many possible values that correspond to some continuous scale covering a range of values without gaps, interruptions, or jumps. Example: the finishing times of a marathon.

Discrete numerical data: result when the number of possible values is either finite or countable (that is, the possible values can be counted: 0, 1, 2, and so on). Example: the numbers of fatal automobile accidents last month in the 10 largest US cities.

Quantitative variables: numerical observations.

Continuous variables: can assume all of the infinitely many values corresponding to a line interval. Example: Y = the amount of milk that a cow produces, e.g. 2.343115 gallons per day.

Discrete variables: can assume only a countable number of values (e.g. 0, 1, 2, 3, . . .).

Qualitative variables: non-numerical or categorical observations.

Representative sample: exhibits characteristics typical of the target population. To get a sample that is representative, we often employ a random sampling approach.

Random sample: selected in such a way that every different sample of size n has an equal chance of selection.

Class: one of the categories into which data can be classified.

Class frequency: the number of observations belonging to the class.

Pareto diagram: a bar graph that arranges the categories by height from tallest (left) to shortest (right).

Stem-and-leaf display: shows the number of observations that share a common value (the stem) and the precise value of each observation (the leaf).

Histogram: like a bar chart for numerical data, but with no gaps between the bars (unless the frequency for a class is zero).
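As a rough illustration of class frequency, Pareto ordering, and the stem-and-leaf display described above, here is a short Python sketch. The data (`blood_types`, `scores`) are invented for the example, and using the tens digit as the stem is one common choice, assumed here.

```python
from collections import Counter, defaultdict

# Hypothetical qualitative data: one category (class) per observation.
blood_types = ["O", "A", "O", "B", "A", "O", "AB", "O", "A", "O"]

# Class frequency: the number of observations belonging to each class.
class_frequency = Counter(blood_types)

# Pareto ordering: classes arranged from highest to lowest frequency.
pareto_order = sorted(class_frequency.items(), key=lambda kv: kv[1], reverse=True)
for category, count in pareto_order:
    print(f"{category:>2} | {'#' * count} ({count})")

# Hypothetical quantitative data for a stem-and-leaf display.
scores = [62, 75, 71, 88, 93, 67, 75, 84, 90, 79]

# Assumed convention: stem = tens digit, leaf = ones digit.
stems = defaultdict(list)
for score in sorted(scores):
    stems[score // 10].append(score % 10)

for stem in sorted(stems):
    leaves = " ".join(str(leaf) for leaf in stems[stem])
    print(f"{stem} | {leaves}")
```

The text bars printed for the Pareto ordering stand in for a real bar graph; a plotting library would be needed for an actual diagram.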
Frequency distribution (or frequency table): lists data values (usually in groups), along with their corresponding frequencies (counts).

Lower class limits: the smallest numbers that can belong to the different classes.

Upper class limits: the largest numbers that can belong to the different classes.

Class boundaries: the numbers used to separate classes, but without the gaps created by class limits.

Class midpoints: the midpoints of the classes. Each class midpoint can be found by adding the lower class limit to the upper class limit and dividing the sum by 2.

Class width: the difference between two consecutive lower class limits or two consecutive lower class boundaries.

Relative frequency histogram: a bar graph in which the heights of the bars represent the proportion of occurrence for particular classes.

Central tendency: the tendency of the data to cluster, or center, about certain numerical values.

Variability: the spread of the data. It shows how strongly the data cluster around the central value(s).

Mean: found by summing all the measurements and then dividing by the number of measurements.

Median: the middle number when the measurements are arranged in numerical order. It is also called the 50th percentile, since 50% of the data is below the median and 50% is above.

Mode: the data value that occurs most frequently.

Skewed: describes a distribution in which one side has more extreme values than the other. If the population mean is greater than or less than the population median, the distribution is skewed; typically a mean greater than the median indicates a right skew, and a mean less than the median a left skew.

Range: the largest measurement minus the smallest measurement. Range = Max - Min.

Variance: for a sample of n measurements, the sum of the squared distances from the mean divided by n - 1.

Author: Anonymous. ID: 104596. Card set: E1 statistics flashcards.txt. Description: E1 statistics mcguckian. Updated: 2011-09-27T04:15:39Z.
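The mean, median, mode, range, sample variance, and frequency distribution cards above can be checked against a small worked example. This is a minimal Python sketch on an invented sample of ten measurements; the class limits (60-69, 70-79, and so on) and all variable names are assumptions chosen for illustration.

```python
import statistics

# Hypothetical sample of n = 10 measurements (illustrative only).
data = [62, 75, 71, 88, 93, 67, 75, 84, 90, 79]
n = len(data)

# Measures of central tendency.
mean = sum(data) / n               # sum of measurements / number of measurements
median = statistics.median(data)   # middle value of the sorted data
mode = statistics.mode(data)       # most frequently occurring value

# Measures of variability.
data_range = max(data) - min(data)  # Range = Max - Min
sample_variance = sum((x - mean) ** 2 for x in data) / (n - 1)

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={data_range}, sample variance={sample_variance:.2f}")

# Frequency distribution with assumed integer class limits 60-69, 70-79, ...
lower_limits = [60, 70, 80, 90]
class_width = lower_limits[1] - lower_limits[0]  # consecutive lower limits

table = []
for lower in lower_limits:
    upper = lower + class_width - 1              # upper class limit
    midpoint = (lower + upper) / 2               # class midpoint
    frequency = sum(lower <= x <= upper for x in data)
    relative_frequency = frequency / n           # proportion in this class
    table.append((lower, upper, midpoint, frequency, relative_frequency))
    print(f"{lower}-{upper}  midpoint={midpoint}  "
          f"freq={frequency}  rel. freq={relative_frequency:.1f}")
```

The relative frequencies are the bar heights a relative frequency histogram would use, and they sum to 1 across the classes.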