Wednesday, February 18, 2009

Class notes for 2/18

Increase and decrease, in absolute terms and in percentage: When we have numbers that have changed over time, we can get the absolute change by subtracting the old number from the new. If the numbers are ratio data, which as you will recall means that zero stands for the complete lack of a thing, then we can also discuss the percentage change, which we get through the formula (new-old)/old*100. Don't forget to put the parentheses around the subtraction in the numerator.

For instance, the illustration shows the differing fortunes of two American corporations over the last eight years by comparing each company's market capitalization in 2000 to the same statistic in 2008. (Market capitalization means how much money it would cost to buy all the company's stock at the price it is trading for.)

Apple has had a good century so far, and their stock's value has risen from $5.56 billion in 2000 to $85.25 billion in 2008. In absolute terms that is an increase of (85.25-5.56) = 79.69 billion dollars.

The percentage increase is even more impressive: (85.25-5.56)/5.56*100 = 1433.27..., so the market capitalization has increased by a whopping 1433.3%.

General Motors has had a bad eight years, and their market capitalization has shrunk from $28.3 billion in 2000 to $2.99 billion in 2008. (2.99-28.3) = -25.31, so the absolute decrease is about $25.3 billion. The percentage decrease is (2.99-28.3)/28.3*100 = -89.43..., which means the percentage decrease for GM's market capitalization is about 89.4%. This illustrates an important point. It is possible for a percentage increase to be more than 100%, and this happens any time a number more than doubles. But unless a formerly positive number goes negative, and that is a rare thing, the percentage decrease will not be more than 100%.
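
The two calculations above can be checked with a few lines of Python. This is only a sketch: the function names `absolute_change` and `percent_change` are made up for this example, and the figures are the market capitalizations quoted in these notes, in billions of dollars.

```python
def absolute_change(old, new):
    """Absolute change: new value minus old value."""
    return new - old


def percent_change(old, new):
    """Percentage change relative to the old value: (new - old) / old * 100."""
    if old == 0:
        raise ValueError("percentage change is undefined when the old value is 0")
    return (new - old) / old * 100


# Market capitalizations in billions of dollars (2000, 2008), from the notes above.
companies = {
    "Apple": (5.56, 85.25),
    "General Motors": (28.3, 2.99),
}

for name, (cap_2000, cap_2008) in companies.items():
    change = absolute_change(cap_2000, cap_2008)
    pct = percent_change(cap_2000, cap_2008)
    # The + in the format shows the sign, so increases and decreases are easy to tell apart.
    print(f"{name}: {change:+.2f} billion, {pct:+.1f}%")
```

A positive number that drops all the way to zero gives exactly -100%, which is why a percentage decrease cannot be more than 100% unless the new number is negative.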



Bar charts, also known as histograms: Bar charts are a popular way to represent categorical data, whether ordered or unordered. One of the great advantages of bar charts over pie charts is that it is easier to compare two data sets side by side. In the picture on the left, we see the percentages of people in four different demographic groups in the states of Florida (the yellow histograms) and Texas (the red histograms). What we see is that Texas has a slightly larger percentage of children under 5 years old (8% to 6%), of juveniles (19% to 16%) and of adults under the age of 65 (63% to 61%), but Florida makes up for that gap by having a significantly higher percentage of senior citizens aged 65 and over (17% to 10%). Because Texas has a much higher population than Florida, it makes sense for us to look at relative frequency instead of frequency, so we can compare the data sets side by side and not have the Texas numbers completely overwhelm the Florida numbers.
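
To see the side-by-side comparison without the picture, here is a rough Python sketch that draws the same relative frequencies as text bars. The helper names `text_bar` and `side_by_side` are invented for this example, and the numbers are the rounded percentages quoted in the paragraph above.

```python
# Rounded percentages quoted in the paragraph above.
groups = ["Under 5", "Juveniles", "Adults under 65", "65 and over"]
florida = [6, 16, 61, 17]
texas = [8, 19, 63, 10]


def text_bar(percent, symbol="#", scale=2):
    """Draw one bar as text, using roughly one symbol per `scale` percentage points."""
    return symbol * round(percent / scale)


def side_by_side(labels, first, second, names=("FL", "TX")):
    """Return the lines of a grouped text bar chart for two data sets."""
    lines = []
    for label, a, b in zip(labels, first, second):
        lines.append(label)
        lines.append(f"  {names[0]} {a:3d}% {text_bar(a)}")
        lines.append(f"  {names[1]} {b:3d}% {text_bar(b)}")
    return lines


print("\n".join(side_by_side(groups, florida, texas)))
```

Because both lists are percentages that add up to 100, the bars for the two states are on the same scale even though the populations are very different in size.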



Line charts: A very common use for line charts is to track numerical data over time. Newspapers and financial websites often show how the price of a commodity or stock is doing by using line charts, which look something like the profile of a mountain range. This graph was taken from kitco.com, a website devoted to trading metals, including gold, silver, copper and platinum. These are the prices of gold minute by minute for three days, Monday, Feb. 16, 2009 to Wednesday, Feb. 18. The Monday prices are shown in blue, the Tuesday prices in red and the Wednesday prices in green. What we can see is that prices rose on Monday from under $940 an ounce to nearly $960, Tuesday showed a steady climb from $960 to $970 an ounce, then Wednesday prices fell a bit early in the day, but rallied to finish near $980 an ounce.



Line charts can also be used to represent data that might be shown as histograms. This line chart in red and yellow shows the same demographic data we had in the bar chart section from above, with red showing Texas data and yellow showing Florida data.

There are two sets of lines in each color. The thin lines are the same as the histograms from the section above, while the thick lines that climb to 100% at the far right are ogives, pronounced "Oh-jives", which track the cumulative amount. Here are the numbers for each state, first for each demographic group and then for the accumulation of groups from youngest to oldest.

Florida:
Under 5 years old: 6.2%
6-18 years old: 16.0%
19-64 years old: 61.0%
65 and over: 16.8%

Florida (cumulative):
Under 5 years old: 6.2%
Under 18 years old: 22.2%
Under 65 years old: 83.2%
All age groups: 100%

Texas:
Under 5 years old: 8.1%
6-18 years old: 19.4%
19-64 years old: 62.5%
65 and over: 9.9%

Texas (cumulative):
Under 5 years old: 8.1%
Under 18 years old: 27.5%
Under 65 years old: 90.0%
All age groups: 100%
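
The cumulative numbers for an ogive are just running totals of the individual percentages. Here is a small Python sketch using the Florida figures listed above. The function name `cumulative` is made up for this example, and the rounding to one decimal place is an assumption chosen to match how the notes print the numbers.

```python
from itertools import accumulate

# Florida shares by age group, as listed above (percent of population).
florida = [
    ("Under 5 years old", 6.2),
    ("6-18 years old", 16.0),
    ("19-64 years old", 61.0),
    ("65 and over", 16.8),
]


def cumulative(shares):
    """Running totals of the shares, rounded to one decimal place."""
    return [round(total, 1) for total in accumulate(shares)]


running = cumulative([share for _, share in florida])
for (label, share), total in zip(florida, running):
    print(f"{label}: {share:.1f}% (cumulative {total:.1f}%)")
```

The same function works for the Texas list. The four Texas percentages as listed add up to 99.9 rather than 100, which is the kind of small gap that rounding each group to one decimal place can produce.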


Pie charts: Pie charts work best with unordered categorical data. The pie slices are ordered from largest to smallest. The most common setup is to draw a start line pointing straight up (12 o'clock) and place the slices clockwise from largest to smallest. This data shows the percentages of racial groups in the U.S. as of 2008. Besides the four main racial groups of whites (here identified as European Americans), Hispanics, African Americans and Asian Americans, all other racial groups account for less than 0.5%.

There is an alternative way of setting up pie charts where the "start line" points to 3 o'clock and the slices are placed from largest to smallest, moving counterclockwise. This picture is taken from the Sitemeter website attached to another blog that I maintain. Of the last 100 visitors, 70% are from the United States, 11% are from Canada and 10% have a country of origin the Sitemeter software cannot recognize, followed by 3% from the United Kingdom, 2% from Chile, and 1% each from Panama, Ireland, Guam and Germany.
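
Each slice of a pie chart takes up the same fraction of the 360 degrees as its share of the data. The Python sketch below sorts the Sitemeter percentages from largest to smallest and works out where each slice starts and ends, measured in degrees swept from the start line. The name `pie_slices` and the label "Unknown" for the unrecognized visitors are made up for this example.

```python
# Visitor shares from the Sitemeter example above (percent of the last 100 visitors).
visitors = {
    "United States": 70, "Canada": 11, "Unknown": 10, "United Kingdom": 3,
    "Chile": 2, "Panama": 1, "Ireland": 1, "Guam": 1, "Germany": 1,
}


def pie_slices(shares):
    """Return (label, start, end) in degrees swept from the start line, largest slice first."""
    total = sum(shares.values())
    # sorted() is stable, so slices of equal size keep their original order.
    ordered = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    slices = []
    swept = 0.0
    for label, value in ordered:
        sweep = value / total * 360
        slices.append((label, round(swept, 1), round(swept + sweep, 1)))
        swept += sweep
    return slices


for label, start, end in pie_slices(visitors):
    print(f"{label}: {start} to {end} degrees")
```

The angles are the same for both setups described above. The only difference is whether they are measured clockwise from 12 o'clock or counterclockwise from 3 o'clock.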
