(Watch the “Spreadsheet Basics for Journalists, Part 2” video to see a demonstration of the techniques covered on this page)
Reopen your Google Sheet
Now that you’ve learned some fundamentals about setting up and using a spreadsheet, let’s take a look at some of the analytical techniques you could use to figure out what any curious journalist would want to know: Who got the biggest raise?
Log back into drive.google.com, find the spreadsheet you created during Part 1, and double click on it to reopen it. It should look like this:
(Note: After reminding you in Part 1 to change Anderson’s salary back to 54109, I forgot to do so myself. Unfortunately, I didn’t remember until after I had finished everything in this part. So, if you haven’t already, please change Anderson’s salary to 55000, so that what you do will match what I do. Sorry — KB)
Sorting data with Google Sheets
It’s easy enough to eyeball the list and figure out that Daniel Anderson’s $5,810 raise is the largest (and that Maria Lopez’s measly $535 raise is the smallest). But it wouldn’t be so easy to do if there were, say 2,200 records to look at rather than only 22. And even with only 22 records, it would take you a while to put the raises in order, from highest to lowest, so that you could know, for example, who got the second largest raise, third largest raise, fourth largest raise, etc.
Fortunately, spreadsheets are lightning-fast at sorting numbers from highest to lowest – or, just as quickly, from lowest to highest. So, let’s sort all of the information we have by the numbers in the “Raise” column, putting the largest number at the top and the smallest number at the bottom.
First, left click on cell A1, (“Name”) then, holding down the left mouse button, drag down to cell A23 (“Wilson, Thomas”), then, still holding down the left mouse button, drag to the right, all the way to cell D23 (“1114”). Release your mouse button. All data should now be highlighted in light blue. Like this:
Now – with the data highlighted as shown above – click “Data / Sort range …” Your data will fade into the background, and a small menu will appear in the center of the screen. On that menu, check the box beside “Data has header row,” click the radio button next to “Z -> A,” and, finally, click the box next to “Sort by” and choose “Raise.” If you’ve set the menu up correctly, it will look like this:
You’re telling Google Sheets to sort the highlighted data (cell A1 to cell D23) by the figures in the “Raise” column, putting the largest figure at the top (Z -> A), but keeping the first row of data (the “header row”) at the top, because that’s the row that contains the headers, or labels, for each column of data: “Name,” “Old salary,” “New salary,” and “Raise.” When you’re all set, click the blue “Sort” button, and Google Sheets will do what you’ve told it to do:
Pretty cool, huh? Using the same procedure, you can sort by any of the other columns, too. You can even sort by the “Name” column, which contains words rather than numbers. If you tell Google Sheets to sort the data by the “Name” column, it will sort the names alphabetically if you choose “A -> Z” or in reverse alphabetical order if you choose “Z -> A.” Go ahead and try it, if you like.
It’s important to remember that Google Sheets will sort only the data that you highlight. So, for example, if you highlight only the “Names” column and tell Google Sheets to sort the names alphabetically, Google Sheets will happily do so. But the data in the “Old salary,” “New salary” and “Raise” columns will stay put, and you will no longer be able to tell who got which raise. If you make that mistake, you’ll have no choice but to scrap your data and start over.
If you want to highlight all of the data in your Google Sheet for sorting purposes, there’s a handy shortcut: Click on the blank square above the “1” and to the left of the “A.” In other words, right here:
Percent change: Another way to think about the “biggest raise.”
You might have noticed that there are at least two ways to think about who got the biggest raise. Sorting by “Raise” shows that Daniel Anderson got the largest raise in terms of additional dollars. But if you sort by “New salary,” you’ll find that Robert Williams got the largest raise in terms of who will have the biggest salary after the raises, if only because he had the biggest “Old salary.”
In fact, neither approach measures the size of the raises in terms of the financial impact they will have on the people getting them. For example, suppose you have two men, one making $20,000 a year and the other making $200,000 a year. Now suppose each of them gets a $10,000-per-year raise. The guy making $200,000 a year will barely notice, because the raise will add only 5 cents for every dollar he was already making. But the guy making $30,000 will be a whole lot more likely to notice, because he’ll be making about an extra 33 cents for every dollar he was originally making. So, even though both men get the same raise, the raise will have a much greater financial impact on the second one.
In mathematical terms, what we did, there, was calculate the “percent change” in each man’s income – that is, the change per “cent,” where “cent” means “100,” as it it does in words like “centipede” and “centimeter.” The rich guy whose raise amounted to 5 percent will be getting five additional pennies for every 100 pennies in his original salary.
If you need to ponder that for a bit to understand it, fine. But do yourself a favor and go ahead and memorize the steps for calculating a percent change. There are precisely two:
1. Subtract the old number from the new number to find the difference between the two
2. Divide the difference by the old number
So, in your Google Sheet, you can type a bold-faced “Percent change” column heading in cell E1, then figure the percent change in Daniel Anderson’s salary by typing this in cell E2:
… then hitting “Enter.” Remember that you can enter “C2” and “B2” by simply clicking those cells with your mouse as you type. When you’re done, your sheet should look like this:
So, it looks like Anderson got a .118, or 11.8 percent, raise. Go ahead and repeat that calculation for everyone else by grabbing the little blue square at the lower right corner of cell E2 with your mouse and dragging down to cell E23, in Maria Lopez’s row. Next, sort the data by your “Percent change” column, putting the largest percent change at the top, to learn who got the biggest percent increase in his or her income. Your results should look like this:
Admittedly, not a whole lot changed. Anderson has the largest salary increase, both in terms of raw dollars and percent change, and Lopez has the smallest, again in terms of both. But a few people toward the middle of the pack changed places. Elizabeth Thompson, for example, had the sixth largest raise in raw dollars but has the fifth largest raise in terms of percent increase.
We have learning something important, though – something potentially newsworthy: These raises were not handed out in an equitable fashion, either in terms of raw dollars or in terms of the percent increase over each recipient’s original salary. They ranged from as little as 1 percent to as much as nearly 12 percent. If the raises had been “across the board,” all of those percent change figures would have been exactly equal. Why are some department heads getting considerably larger raises than others, even after we account for the differences in their original salaries? Is the mayor rewarding exceptionally good performance by some department heads? Reflecting variations in the “market pay rate” for each department head’s skill set? Playing favorites? Hmm … let’s do some more digging.
Summing, averaging and other functions
The “Sum” function. You may recall that the plan we devised in Part 1 included determining “whether anyone’s raise was unusually large or small compared to the average, and what the total impact of the raises will be on the city’s finances.” A couple of spreadsheet “functions” can make quick work of both goals. Functions are like little, preprogrammed tasks that the spreadsheet will do for you if given the right instructions. For example, we could add up all of the raises by clicking on, say, cell D25 and typing:
… but that would be an awful lot of work. the “=sum” function can do the same thing in a fraction of the time. To use it, click on cell D25, type:
… then grab your mouse and highlight cells D2 through D23, the cells that contain the amount of each raise. Your screen should look like this:
Next hit “Enter.” Google Sheets will add a closing parenthesis to the function and display the sum of the numbers in cells D2 to D23, or, put another way, the total amount of money the mayor is proposing to spend on the raises: $60,442.
The “average” function. Another handy function will average the salaries for you. To get the average manually, you’d have to not only add up all the salaries but also divide your total by the number of salaries under consideration. The function, though, can do the job in a flash. To try it out, click on cell D26, type:
… and – once again – use your mouse to highlight cells D2 through D23. Just before you press “Enter,” your screen will look like this:
After you press “Enter,” you’ll see that the average salary increase was 2747.363636, or, rounding off, about $2,747. So, now you’ve learned that Anderson’s raise was more than twice the average, and Linda Martin’s raise was nearly so. You also can divide the whole group into those whose raises were above average and those whose raises were below average.
Yes, you can copy functions, just like you can copy calculations. Ready for another demo of the dazzling speed with which a spreadsheet does calculations? You saw in Part 1 that the formula for calculating Anderson’s raise, =C2-B2, if copied down the column, would similarly calculate each other department head’s raise. You can copy functions in exactly the same way. You can even copy two or more at the same time. To see how this works, use your mouse to highlight cells D25 and D26, the cells with the =sum and =average functions in them. Then grab the little, blue square in the lower right corner of cell D26 and, holding down the mouse button, drag left to cells B25 and B26. When you release the mouse button, your screen will look like this:
Now you also know the totals and averages for the old salaries as well as the new salaries. There are lots of functions – one, in fact, for just about anything you’d like the spreadsheet to do with any two or more cells. If you have something in mind and want to know whether there’s a function to get it done, just use Google and include the term “Excel,” Microsoft’s ubiquitous spreadsheet program. For example, a Google search of “Excel largest value in a range of cells” will put you onto the =max function, which will display the maximum, or largest, value in whatever range of cells you specify. Or you can consult this list of Google Sheets functions. If you have some spare time, try the functions below on the “Raise” column:
=max(D2:D23) – Finds and displays the largest raise.
=min(D2:D23) – Finds and displays the smallest raise.
=stdev(D2:D23) – Calculates the standard deviation of the raises, meaning the average of the differences between each raise and the average raise. This figure is a way of showing how much variation there was in the size of the raises.
=median(D2:D23) – Finds and displays the median raise, which is the raise that falls in the middle if you sort the raises from highest to lowest.
Percentage of the whole: One last way of comparing the raises
Now that we’ve totaled the amount of money the mayor wants to spend on all 22 raises ($60,442), it might be informative to look at what percentage of that total would go to each department head. So, for example, Anderson’s proposed raise is $5,810 / $60,442 = .096, or 9.6 percent, of the money set aside for the raises, while Maria Lopez’s lowly $535 raise is a mere $535 / $60,442 = .0089, or (if you round up), a mere 1 percent of the money available for raises.
These figures will sort in exactly the same order as the figures in the “Raise” column, but they will give you a clearer understanding, perhaps, of how each department head’s raise compares. For example, knowing that Anderson’s raise totals $5,810 while Lopez’s totals $535 makes it easy to see that Anderson’s raise is bigger than Lopez’s. But it’s easier to see how much bigger if you know that about 10 percent of the funds will go to Anderson’s raise, while only about 1 percent will go to Lopez’s raise. Basically, Anderson’s “slice” of the raise “pie” is roughly 10 times the size of Lopez’s.
To calculate this figure for all 22 department heads, start by typing a bold-faced “Pct. whole” heading in cell F1. Then, in cell F2, type this (and don’t forget the $ symbols; more about those in a moment):
… so that, just before you press “Enter,” your screen looks like this:
Next, press “Enter,” then copy the computation down the column by clicking on cell F2, grabbing the little blue square, and dragging down to cell F23. You should get this:
What’s with the dollar signs? Now, as promised, a quick word about the odd inclusion of those dollar signs in the computation. All they do is tell Google Sheets to keep the D25 cell address from changing as you copy the calculation down the column. The dollar sign in front of the column letter, “D,” says to keep the column the same, and the dollar sign in front of the row number, 25, says to keep the row the same. To see why these instructions are important, give it a try without the dollar signs. You’ll get what’s below, which amounts to incorrect information in cells F3, followed by “#DIV/0!” error codes in each of the subsequent cells, indicating that you have tried to divide something by zero, an operation for which there is no correct answer:
To understand how omitting the dollar signs messes things up, remember that whenever you copy a computation down a column, Google Sheets automatically adjusts the row and column locations for you. In this case, doing that without the dollar signs correctly divides Anderson’s raise by cell D25 (the total of the raises), but incorrectly divides Martin’s by cell D26 (the average of the raises), and, starting with Brown, everyone else’s by an empty cell, which Google Sheets interprets as a zero. Click on To understand why dividing by zero produces an error, consider that 8 / 4 = 2 because 4 * 2 = 8. Try the same thing with a zero, and you get: 8 / 0 = ?, because 0 * ? = 8. There is no known number you can substitute for the “?,” because 0 multiplied by anything is 0.
As you might have figured out, omitting the dollar sign in front of the column and writing the computation as D2/D$25 would work OK in this case, because copying the computation doesn’t involve changing the column. But it’s OK to go ahead and nail down both the cell and column location, just to be extra safe.
Formatting: Changing “5810” to “$5,810” and “0.1181134377” to “12%.”
If the lack of dollar signs and percentage symbols in the spreadsheet’s figures has been bugging you, this section will come as a relief.
The figures in the “Old salary,” “New salary” and “Raise” columns are all dollar amounts, so let’s tell Google Sheets to display them with dollar signs and, where needed, commas. Highlight the figures in all three columns. Or, if you prefer, highlight the whole of each column by clicking on the “B” in the gray-shaded area at the top of column B and, holding your mouse button down, dragging across the “C” and “D” at the tops of columns C and D.
Next, click “Format / Number / More Formats / Custom number format,” as shown here:
Click, then scroll through the resulting menu, find a format with an example that matches what you want, and double click on it (or click it once and then click “Apply.”) I’m choosing this one:
… which will format the figures with a dollar sign, any commas necessary to show place value divisions, and two digits for amounts less than one dollar. The format immediately below, by the way, does the same thing, except that it shows negative figures in red, which is what the “[Red]” attributed in the code on the left indicates.
Now select the figures in columns E and F, click “Format / Number / More Formats / Custom number format” again, but this time choose one of the percentage formats, because these figures are percentages, no dollar amounts:
I picked the percentage format with no decimal places. Google Sheets will round each percentage off to the nearest full percentage point. Other choices will show you fractions of percentage points, if that’s what you want. In the end, you should have something like this:
It also would be good to label the “Total” and “Average” rows. And if, like me, you’re fussy enough to want a line under the column headings and also under the row containing Maria Lopez’s information, you can add one by highlighting the row you want to underline, clicking the icon that looks like a window pane, and choosing the “Bottom border” icon:
One last thing: Let’s give our “Untitled spreadsheet” an actual name, like, maybe, “Analysis of Proposed Raises.” To do so, click the current name, “Untitled spreadsheet,” in the upper left corner. A window will pop up inviting you to “Rename spreadsheet.” Type the new name, “Analysis of Proposed Raises,” in the “Enter a new spreadsheet name” box, and click “OK.”
The finished spreadsheet should look like this:
Here in Part 2, you learned a pair of helpful comparative techniques, percent change and percent of the whole, and got introduced to two functions, =sum and =average, that illustrate the kinds of things that the considerable variety of functions available to a spreadsheet user can accomplish. You also learned that a spreadsheet can sort numbers in ascending or descending order and words in alphabetical or reverse-alphabetical sequence. Trust me; you will use that all the time. Finally, you learned how to format numbers as percentages or dollar amounts, and you saw that still other formats are possible.
Part 3, coming up next, will show you how to make an interactive graphic from some of the above data and put it on the Web.
Go to Part 3: Making an interactive graphic
Return to Part 1: Making a Plan & Getting Started