Excel Practical Two: Frequency Distributions
__________________________________________________________________________
Go into your email and download & save ExcelLab2.xls
Biofuel Consumption
Column 2 in Worksheet 1 represents the 2013 biofuel consumption of some of the world’s countries. I have left out some of the countries with significantly large biofuel consumption figures. The units of measurement are TJ (terajoules).

If you have not previously loaded the Analysis ToolPak you may have to do so again:

Click on the File tab.

Click on Options

Under Addins, select Analysis ToolPak and click on the Go button.

Check Analysis ToolPak and click on OK.

On the Data tab, you can now click on Data Analysis

We want to replace establish a frequency distribution for the raw data. Excel isn’t very clever at picking bins but we’ll allow it a try. Go to Data > Data Analysis > Histogram. Select all of the raw data. Select an Output (Range) cell for the frequency distribution and also check the Cumulative Percentage. This should produce a frequency distribution and a cumulative percentage distribution.

Note that the bins are horrible. Now in another part of the worksheet, input nicer bins. For example,
0, 3000, 5000, 7000, 10000, 20000, 550000

Now go back to Data > Data Analysis > Histogram.

In the Input Range select the raw data

In the Bin Range select the bin data

In the Output Range input a cell in an empty region of the work sheet

Check Cumulative Percentage

Press Enter to produce a frequency distribution and the relative cumulative frequency distribution.

In an empty part of the worksheet, input two column headers: x< and f. Under x< write the upper class marks (i.e. 0, 3000, 5000, 7000… if you used the above bins). Now under f we want the number of data points less than these upper class marks. To do this multiply the relative cumulative frequencies by the total frequency (59).

Once this table is complete go to INSERT > CHARTS and pick the scatter plot with a smooth curve. This is your ogive.

(Roughly) Estimate the Median and Interquartile Range using the Ogive. To increase the number of tick marks on an axis right click the axis, press FORMAT axis, change the UNITS (MAJOR) and press ENTER.

Now calculate the MEDIAN and InterQUARTILE Range using the raw data.
Heights of Argentinian Males
Column 1 in Worksheet 2 represents (simulated) data of the heights of a sample of 170 Argentinian males.

Go to a free space in the worksheet and enter the following bins in a column:
145, 155, 165, 170, 175, 180, 185, 195, 205

Now go back to Data > Data Analysis > Histogram.

In the Input Range select the raw data

In the Bin Range select the bin data

In the Output Range input a cell in an empty region of the work sheet

Unheck Cumulative Percentage but check chart output

Press Enter to produce a frequency distribution and some class of a bar chart.

Excel is famously bad for producing histograms. There are at least three problems with an Excel histogram: see if you can find them all. There are workarounds but are simply not worth the hassle. Use the frequency distribution that Excel produced and produce a proper frequency distribution (with 145155, 155165, etc) and a proper histogram (on paper). Remember
Height = Frequency/Standard Width

Use Excel to help you find the sample mean of the frequency distribution. Compare this with the (mean) AVERAGE of the raw data.

Use your frequency distribution and histogram to calculate the median. Compare this with the MEDIAN of the raw data.

Use your histogram to estimate the mode. Verify your estimate by using a formula.

Use Excel to help you find the sample standard deviation of the frequency distribution (formula at back of notes). Compare this with STDEV.S of the raw data.

Using a formula from the back of the notes, calculate the coefficient of variation.

Using a formula from the back of the notes, calculate the coefficient of skewness.
If you have completed this use Excel to help you with questions 5, 2, 6, 4, 1, 8, 7, 11, 9, 12, 10 on P.33
 Page
Dostları ilə paylaş: 