Today, we will be discussing distributions of numerical variables. Some variables are evenly distributed across all their values, others skew to have more high values than low, and some are the other way around, with more low values than high.
A lot of data sets have what is called a normal distribution, where the most common values are the ones closest to the average. Values much higher or much lower than average are much rarer, and the farther from average they are, the rarer they become. The normal distribution is represented by a curve known as the bell-shaped curve, the normal curve, or the Gaussian curve. The high point of the curve shows the density of the values near average, and the low levels at both ends represent the scarcity of values farther away from average.
One of the reasons to find the standard deviation of a set of numbers is to calculate the z-score of a raw value x, which is z(x) = (x - x-bar) / (standard deviation). This tells us how many standard deviations a value is away from the average. Negative z-scores are for values below average and positive z-scores are for values above average. z(x) = 0 only when x = x-bar, meaning the value is exactly at average. The first four pages in the class notes let us change z-scores into proportions, and using these numbers we can talk about the probability of finding values greater than some value x, or the probability of finding values between two values, call them x1 and x2.
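For anyone who wants to check their table lookups, here is a minimal sketch in Python of the z-score and the two kinds of probabilities described above. It assumes the data really are normally distributed, and it replaces the table in the class notes with the standard normal cumulative proportion computed from `math.erf`. The function names (`z_score`, `proportion_below`, `prob_greater`, `prob_between`) and the example average of 100 with standard deviation 15 are made up for illustration and do not come from the notes. The table in the notes may be laid out differently (for example, giving the proportion between the average and z rather than the proportion below z), so the numbers should be compared with that in mind.

```python
import math


def z_score(x, avg, sd):
    """How many standard deviations x lies from the average."""
    return (x - avg) / sd


def proportion_below(z):
    """Proportion of a normal distribution lying below a given z-score.

    Stands in for a z-table lookup (assumption: the data are normal).
    """
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_greater(x, avg, sd):
    """Probability of finding a value greater than x."""
    return 1.0 - proportion_below(z_score(x, avg, sd))


def prob_between(x1, x2, avg, sd):
    """Probability of finding a value between x1 and x2 (with x1 < x2)."""
    low = proportion_below(z_score(x1, avg, sd))
    high = proportion_below(z_score(x2, avg, sd))
    return high - low


# Hypothetical example: average 100, standard deviation 15.
avg, sd = 100, 15
print(z_score(130, avg, sd))                  # 2 standard deviations above average
print(round(prob_greater(130, avg, sd), 4))   # proportion of values above 130
print(round(prob_between(85, 115, avg, sd), 4))  # proportion within 1 sd of average
```

With these made-up numbers, a value of 130 has a z-score of 2, only about 2.3% of values fall above it, and roughly 68% of values fall between 85 and 115, which is one standard deviation on either side of the average.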