Microsoft Excel and other spreadsheets such as Calc in OpenOffice are widely
used to manage data and more importantly to perform computations and analysis.
One of the most attractive features of these spreadsheets is their ability and
ease of use for data visualization. In this workshop, we will present several
examples where you will have a chance to learn about and try some of these
features in Excel.
Excel Basics & Conventions
Open an Excel Spreadsheet by double clicking on its icon in the list of programs on your PC. Often the main Spreadshet "Sheet1" is the default page that will be opened. As you can see at the bottom this sheet, there might be several other Sheets available (usually 3 others). In the screen that is shown below, you can see three Sheets (Sheet1, Sheet2, and Sheet3).
A Spreadsheet like the one shown below has many cells on it. Each of the cells has an address, which is based on the Column and Row where it is located. For example, the cell in Column B and Row 4, has the address of B4. Of course, we haven't done anything with these cell yet, so it may not be clear why we need to know this. To illustrate this addressing format, we have marked four of these cells A8, B1, C4, and D10. It is very important to know how this addressing is done as we will use this convention throughout this lab. Just for the warm-up, write the cell numbers for the two cells that have the arrows pointed to them.
Activity (1) – Working with Data and Formulas
Entering Data and Building Formulas
Let's enter some data and start manipulating them in
Excel. The following spreadsheet holds the salaries for 4 employees in a
John Doe -- $45000
Jane Doe -- $55000
Jim Jungleboy -- $61300
Roger Tallman -- $42400
As you can see in the image below, the name of each employee is stored in Column A in Rows 6:9 (6 through 9) and the salary of each employee is stored in Column B in Rows 6:9 of the spreadsheet. For example, in cell B7 you will find the salary of Jane Doe which is $55,000. Also, you may have noticed that we had some other things above the data. For example, we have put a title and then in B3 we have defined a percentage.
We want to increase the salary of these employees by 3%. To do so we need a formula:
You can see this in the spreadsheet below. Note that in this formula we have used B6 that holds John Doe’s current salary. Repeat this for the rest of employees. Remember to press enter after each entry.
After you typed (or copied) the formulas for all employees, you will get something that looks like this:
Let’s compute the total salary for all the employees before and after the raise. A pre-defined function we want to use to compute the total salary is called sum. To compute the sum we need the range of cells for which we want to compute the sum. In our spreadsheet, the range of salaries before the raise is B6 through B9, i.e., B6, B7, B8, and B9. So go to cell A11 type Sum (this is a label), then go to cell B11 where you will compute the Sum of salaries before the raise and type:
The B6 : B9 will give us a range of cells starting from B6 and ending at B9. Note that we used the pre-defined function sum to compute the sum of salaries. There are many pre-defined functions in Excel that help you compute different things. To find out what these functions are click in a cell and then under Formulas -> Insert Functiont to see a list of available built-in funstions.
For the next sum, we want to do something a bit different. In cell C11 type:
then left click on the cell C6, and hold the button and drag down to C9 as you are still holding the button, then at C9 release the button. The C11 cell gets filled automatically due to fill automatic feature of Excel. The cell looks like this:
In the above example we use cell B3 to keep the percent of the raise, i.e., 3%. But in our salary calculations we didn’t use it because we used a fix 0.03. Now we want to use that cell in the calculation of the new salary. If we use:
C6 = B6 + B3*B6
and then use the automatic copy and fill option to copy the formula to other cells B3 will change as you move to different cells due to relative cell addressing. Let’s try this so you can see what will happen. Click in C6 and type:
= B6 + B3*B6
and press enter. You will get the same value as before. Now, click on that cell and drag the bottom-right of the cell down on cell C7. Once done, click in a blank cell, then double click on cell C7, you will see:
= B7 + B4*B7
Did you notice that B3 was not used and Excel automatically has used the next cell which is B4. To avoid this problem we can fix the cell B3 by placing a $ in front of it in the formula in cell C6. So let’s try this. Click on cell C6 and type:
= B6 + $B$3 * B6
Press enter. The result for C6 is the same. But this time when you drag the bottom-right of the cell on C7, you will see:
= B7 + $B$3*B7
But there is another way to do this which is very powerful and often more popular. Using this approach we name the cell that holds the rate. So, we name B3 as raise and will use word raise in the calculations. How do we do this?
Click on cell B3
At the top-left, right below Home, Click on where it says B3
Type raise instead of B3, then press Enter.
This will name the cell B3 as raise. To make sure that this has been saved, click on another cell, and then back on B3, the spce right below Home should say raise if the naming is done correctly. From now on we can use raise iin all formulas instead of B3.Let’s try this (see the screen below):
Now when we use the automatic fill option we no longer need to be worried about raise being changed as that will always stay as variable Raise.
What is interesting with this approach is that now we can change the value in cell B3 where we stored the percentage of the raise and that affects the calculations automatically without the need of copying the formula again. Here is an example:
Compute the average salary for both the current and the salaries after 3% raise. Label this calculation as Average Salary.Exercise 1.2
Compute the Min and Max of the original salaries and those for after 3% raise.Dynamics in Excel and Goal Seek
What parameter do you want to change to make it happen? In our case that is Raise, B3
What will change as a result of the change in the Raise? In our case sum of salaries, C11
What target value do we want to reach for sum? In our case, $240,000
Ok we are ready.
Click on cell C11 where you want to set the target sum of salaries.
* Click on Data at the top
* Locate the What if Analysis item. From the What if Analysis menu, select Goal Seek
* The following dialogue box appears. Enter the values as shown and pay attention on why you are entering the values in the fields this way.
Now click on Ok. Also, click on the next screen. Your calculations are done. You will get:
So you need 18% raise to get the sum of the salaries to $240,000. What a great raise in such a tough time.Exercise 1.3
Suppose we want to include a tax penalty rate in the salary calculation. This rate is 3.2% that will be applied to the portion of the salary that is higher than $40,000. Thus, in case of John Doe, the new salary is:
NewSalary = CurrentSalary + Raise*CurrentSalary – PenaltyRate*(CurrentSalary – 40000)
Note that the sum of salaries after the raise and penalty has to stay $240,000. So you need to run Goal Seek.Visualizing Data
Most scientific graphs show the relationship between
two quantities. In Excel, such a relationship is graphed using a scatter plot.
Visualization of the data will help us better understand the data.
Let's work on a problem together.
Now we will download a file from the Internet that contains some data and then will get that data into an Excel sheet. To get the data which is stored in two columns in a file called unknowData.txt right click on this file name unknownData.txt and use Save Link As (Or Save File As) to save the file on your PC, perhaps on your Desktop or a directory you may have created for the workshop.
You can double click on the link above which will display the content of the file and then use Save As, to save the file.
How to get this data into our excel sheet?
Getting the data to an Excel sheet involves several steps:
1) Open a New Excel Sheet, by going to the top-left corner and choosing New.
2) Then, on the top-left of the sheet, find the Open, and click on it
3) Browse to the location where you had saved the unknownData.txt file. Once you are in that directory, at the bottom where you have Files of type, choose All Files (*.*). This way you will be able to see the .txt file. Click on the unknownData.txt file and then on Open at the bottom.
4) You will get a new form called "Text Import Wizard -- ". In that screen, for the Original data type, Choose Delimited. I am sure you know why. The data you just downloaded contains two columns that are separated by one space. Thus, the identifier of each column is that space. Then click on Next.
4) The two column data are stored a Tab apart, so you can choose Tab as Delimiters, and then click on Next.
5) In the next screen, choose General for the Data type format, and click on Finish.
Now you should have two columns of data in your sheet. Excel will assume that the left column is intended for the horizontal axis. Graph a scatter plot of these points by highlighting the data, then clicking on Insert, Scatter. This produces a scatter plot of the two columns of data.
You should get a graph like the one shown below. Since you have only one set of data, click on Series1 on the right-hand-side of your plot and then press Delete to delete that caption.
Fitting Curves to
To label axis click on the scatter plot and then go to Design -> Chart Layouts and choose the left most option (or really whichever one you like but this will do). You can now right click on the labels to edit them.
Often scientists look at scatter plots like the one
above to determine the trend of the data by finding a function that fits the
data well. Excel provides an option for Curve Fitting. Here is a document that
explain the process in details Curve
Fitting in Excel . We will go through some steps below to fit a curve to
the data we have plotted above.
First be sure that your chart is highlighted (left click on the chart anywhere if it isn't). Go to Layout -> Trendline. Be sure to choose more Trendline options. You have 6 options and note that when selecting polynomial you need to set the order of the polynomial (2 is quadratic, 3 is cubic). At the bottom of the Format Trendline box choose Display Equation and Display R squared. Note also that you can choose to give the equation a name, choose to have the equation extrapolated forward or backward, and set a fixed intercept value. From the look of our plot, I guessed that this might be a polynomial of order 2, so I selected that one. I also played with the Line Color and Width and selected color Red for the Trendline and width of 2Pts.
The curve will appear on the graph along with the data.
always, you should use caution when interpreting the r-squared value. Mathematically this value must increase for
“curvier” functions such as polynomials of higher degree. Choosing between
different functions should be done visually, not relying solely on the r-squared value.
Can you guess what this formula is for in Physics? You are right. It is actually the formula for the height of a projectile defined as:
Height = -(1/2)*g*t^2 + v0y*t
So you can see that g = 9.8 m/s^2 and v0y = 28.284. Note that v0y is actually v0y = v0*cos(angle), where v0 is the initial velocity of the projectile and angle is the angle at which it was sent off.
The projectile motion is a two-dimensional motion where the projectile moves in x (distance) and in y (height) directions. In this motion the projectile is usually sent of from its initial location (x0 and y0) with an initial speed (v0) at an initial angle (angle).
Assuming our initial velocity is v0 = 40 (m/s) and the initial angle is angle = 35 degree. The initial horizontal and vertical speeds are computed using:
InitialHorizontalSpeed = v0 * cos(angle)
InitialVerticalSpeed = v0 * sin(angle)
On Earth gravitational constant g = 9.8 (m/s^2).
We want to create 4 different plots to:
Note that horizontal speed will never change and it is always the same as InitialHorizontalVelocity. The above values are computed as following (note three are time-dependent):
height = -(1/2)*g*time^2 + InitialVerticalSpeed * time + initialHeight
verticalSpeed = -g*time + IntialVerticalSpeed
Please note that in Excel you have to type angle by pi/180 to convert to Radian. you will type angle*PI()/180. for example:
Once again note that the last one is always constant because it is not time-dependent.
We need to compute these values at some time instances. We will use the total time that the projectile is traveling before it comes to stop and will create equally spaced time instances between 0 to TMAX. TMAX is computed as:
TMAX = 2*v0*sin(angle)/g
So you need to start the time from 0 and end it at TMAX. Use an Excel spreadsheet to solve this problem. Please use cell naming for angle, v0, Initial height, and g.
Create three plots.
Make observations based on the plots you have created.
What will happen to the graph if we repeat the same experiment on planets:
moon: g = 1.63333 m/s2
Mars: g = 3.26667 m/s2
Europa (moon of Jupiter): g = 1.61 m/s2
Create all the graphs for these cases and make your observations.
Some Sample Data and Assignment you can do on Your Own:
(graphing 1) Use linear regression to find a formula relating the amount of the specimen to the output from the gas chromatograph in the dataset.
(graphing 2) The inverse of Hubble’s constant can be used as an estimate of the time since the big bang. Use the data to fit a curve to find Hubble’s constant (note that the relationship has no intercept).
(graphing 3) Investigate Gompertz’ law for population growth.
(graphing 4) Investigate the relationship between mortality from breast cancer and average annual temperature..