Tips for analyzing the population data

Except for a few twists that I’ll tell you about in a moment, the population data are structurally identical to the raises data you worked with earlier. Specifically:

  • Each row represents a county, just like each row represented a department head in the “Raises” dataset.
  • The first column names a county, just like the “Raises” dataset’s first column named a department head.
  • The second column gives each county’s “old” (2010) population figure, just like the second column in the “Raises” dataset gave each department head’s old (pre-raise) salary.
  • The third column gives each county’s “new” (2015) population, just like the third column in the “Raises” dataset gave each department head’s new (post-raise) salary.

About those twists, though:

  • In the “Raises” dataset, every “New salary” figure was larger than its corresponding “Old salary” figure, because each department head was getting a raise, even if only a minimal one. In the population dataset, however, some of the “new” estimates are smaller than the “old” estimates, because populations in some of Tennessee’s counties declined between 2010 and now. This characteristic of the dataset means that if you calculate changes and percent changes from 2010 to now, some will be negative. That’s OK; when a county’s population shrinks, any measure of the change that has occurred should be negative.
  • There are a lot more rows of information in the population dataset (95) than in the salary dataset (22). You could make a data visualization involving all 95 rows of the population data, but it would be pretty big and complex. That’s why the assignment suggests making a visualization using only the 12 fastest-growing counties.
  • The “Percent of the whole” analysis shown in the salary figures demonstration would be tricky to implement on the population data, mainly because of the counties that saw declines in their population. If you wanted to use something like it, you’d have to divide the counties into those that gained population and those that lost population. For the counties that gained population, you could use the =sum function to calculate the total population gain, then show the percentage of that gain that occurred in each county that had a gain. You could do the same analysis for counties that lost population.
  • The population data contain a fourth column, “Region,” that indicates which region of Tennessee – West, Middle, or East – each county falls within. With just a little creative adaptation of the skills you’ve learned, you could examine population change differences not only among individual counties but also among these three grand divisions. If you’re working on this for a class, maybe your professor will give you some bonus points for figuring out how to do so – or at least trying.
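
If you’d like to double-check your spreadsheet work with a script, here’s a minimal Python sketch of the calculations described in the twists above: change and percent change (including negative ones), picking the fastest-growing counties, the split “percent of the whole” analysis, and a comparison by region. The county names and population figures are invented for illustration and are not Census Bureau data, and the function names (add_changes, fastest_growing, share_of_total, change_by_region) are my own hypothetical choices, not part of the assignment.

```python
# Hypothetical rows: (county, 2010 population, 2015 population, region).
# These figures are made up for illustration; they are NOT real Census data.
rows = [
    ("County A", 10000, 11000, "West"),
    ("County B", 20000, 19000, "Middle"),
    ("County C", 5000, 5500, "East"),
    ("County D", 8000, 7600, "East"),
    ("County E", 40000, 46000, "Middle"),
]


def add_changes(rows):
    """Return one dict per county with change and percent change added."""
    records = []
    for county, old, new, region in rows:
        change = new - old  # negative when the county lost population
        records.append({
            "county": county, "old": old, "new": new, "region": region,
            "change": change,
            "pct_change": change * 100 / old,
        })
    return records


def fastest_growing(records, n):
    """Return the n counties with the largest percent change."""
    return sorted(records, key=lambda r: r["pct_change"], reverse=True)[:n]


def share_of_total(records, gainers=True):
    """Percent of the total gain (or total loss) that occurred in each county.

    Gainers and losers are handled separately, because mixing positive and
    negative changes in one total would make the percentages meaningless.
    """
    if gainers:
        group = [r for r in records if r["change"] > 0]
    else:
        group = [r for r in records if r["change"] < 0]
    total = sum(r["change"] for r in group)
    return {r["county"]: r["change"] * 100 / total for r in group}


def change_by_region(records):
    """Percent change for each region, based on summed county populations."""
    totals = {}
    for r in records:
        old, new = totals.get(r["region"], (0, 0))
        totals[r["region"]] = (old + r["old"], new + r["new"])
    return {region: (new - old) * 100 / old
            for region, (old, new) in totals.items()}


records = add_changes(rows)
top_two = fastest_growing(records, 2)  # the assignment would use 12
gain_shares = share_of_total(records, gainers=True)
loss_shares = share_of_total(records, gainers=False)
region_changes = change_by_region(records)
```

With the real dataset, you’d call fastest_growing(records, 12) to get the 12 counties the assignment suggests charting. Note that the region comparison sums the old and new populations within each region before computing the percent change, rather than averaging the counties’ individual percent changes; that’s one reasonable choice, not the only one.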

Here’s a summary of the similarities and differences between the salary example and the population exercise. Note that the summary shows 2013 population figures, because that’s when I made the summary. I update the “new” population figures each year to the latest ones available from the Census Bureau:

[Image: raises and population datasets comparison]