Chemical spreadsheet notes


Steps involved in making graphs



Yüklə 327,71 Kb.
səhifə2/5
tarix09.01.2019
ölçüsü327,71 Kb.
#94260
1   2   3   4   5

Steps involved in making graphs.


  1. Data entry. Your program should have standardized data entry forms for your data – i.e. forms that you use each time you record data.

  2. Convert raw data to final results. This may involve a number of operations, from averaging replicate readings to tossing out bad data.

  3. Arrange data conveniently. This can include copying data from entry sheets to a summary sheet, transposing columns and rows, or sorting a data set, or filtering it to isolate a few values you want to portray.

  4. Select the type of graph you want and create it, using the Excel Chart function.

For the most part, this manual assumes that you have completed step 1and 2, and we therefore offer only a few remarks on these steps. However, here’s one data entry issue we would like you to be aware of:



Dealing with missing data points.


For the most part, spreadsheets collect data from a particular location on a chart: e.g. 5th through 12th rows, 2nd through 19th columns. If your program samples some sites or parameters irregularly, we recommend that you still keep a record for each date and site, even if it contains an empty value. This will retain the row-column integrity of your data array, and avoid later problems when graphing. We provide numerous examples below of repeating sequences of dates, site numbers, water depths, etc. If there are missing records, those sequences lose their relationship with one another, and errors result.
For example, here are two different ways to catalogue pH records for 2 different sites, 3 dates, with identical procedures for graphing the data sets. One site has a missing sample – perhaps that site wasn’t sampled that day, or perhaps bad data were discarded.

Deerfield River pH example







Date

Site

pH

4/19/98

COR-010

6.2

5/16/1998

COR-010




6/21/1998

COR-010

7.05

4/19/98

DER-010

6.19

5/16/1998

DER-010

7.04

6/21/1998

DER-010

6.58




Date

Site

pH

4/19/98

COR-010

6.2

6/21/1998

COR-010

7.05

4/19/98

DER-010

6.19

5/16/1998

DER-010

7.04

6/21/1998

DER-010

6.58






In the first example, a record exists for the missing data, and the chart shows the correct placement of pH values. In the second example, no record exists for site COR-010 on 5/16, and Excel wrongly places the 6/21 value in the 5/16 spot, giving an erroneous reading. You could tinker with the chart for a while to fix this error, but it’s more work, and in many cases when you are dealing with large data sets you are not even aware of missing data points, you may not catch the error.


Arrange data conveniently

File Organization

However you want to represent your data, it will help to first organize the data in an arrangement that is convenient for graphing. Typically, this means an array of adjacent cells within the row and column spreadsheet architecture. Note that it is not required to arrange the data in adjacent cells, but it usually helps. We provide some examples of graphs generated from non-adjacent data in this handbook.


You can either enter your data by hand (i.e. type it in), or cut and paste the appropriate cells from other worksheets you’ve used for data entry or computation. In either case, be sure to check your work for accuracy.
The following examples are taken from worksheets summary and/or chartsummary, which are found in workbooks lakecharts and Deerfield data. There are slightly different versions of summary and chartsummary in the two workbooks – reflecting different types of data you might want to capture for lake or river sampling, respectively. Note also that in both workbooks, chartsummary is a copy of the summary worksheet. We left both in the workbooks to emphasize our point that when you sort, filter, and otherwise manipulate data, it’s wise to do this from a copy of your data set, to avoid corrupting your original data. The examples given in these workbooks show some fields that a group might use to document their sampling program. Feel free to copy these examples and use for your program, adding or changing columns to reflect the indicators you are using.
A (partial) sample data set:


Waterbody

Date

Sample ID

Site

depth

TP

Secchi

Lkdpth

Temp

DO

DO Dpth

Onota Lake

5/5/00




D2







2.5













Onota Lake

5/24/00




D2







3.2













Onota Lake

6/13/00




D2







3.2













Onota Lake

6/28/00




D2







3.0













Onota Lake

7/11/00




D2







2.4













Onota Lake

7/20/00




D2







2.2













Onota Lake

7/28/00




D2







2.0













Onota Lake

8/8/00




D2







2.2













Onota Lake

8/16/00




D2







2.9













Onota Lake

8/22/00




D2







2.4













Onota Lake

8/30/00




D2







2.7













Onota Lake

9/7/00




D2







2.9













Onota Lake

9/14/00




D2







2.6













Onota Lake

9/22/00




D2







3.0













Onota Lake

9/29/00




D2







3.0













Onota Lake

10/14/00




D2







3.4













Onota Lake

11/2/00




D2







2.7













Lake Warner

5/5/2003




2




26.5

1.7

2.5

16

9.9

2.2

Lake Warner

6/3/03




2

0.5

22

1.7

2.4

19

8.72

2.1

Lake Warner

6/10/2003




2







1.5

2.5

17

9.02

1.5

Lake Warner

7/14/2003




2




40

1.9

2.4

26

5.68

2.1

Lake Warner

8/11/2003




2







1.5

2.5

24

5.06

2

Baker's Pond

6/6/96




1

0.5










17.6

7.43




We recommend that you use a file structure similar to this – one row for each data record, including identifying data (information about the collection such as water body, site number, date, etc.) as well as the actual results (e.g. DO, Secchi, temperature measurements).


One noticeable point about this file structure is that many fields (e.g. water body name, site) are repeated on each line, even though they rarely if ever change for some programs. It may seem unnecessary to repeat “Deerfield River” or “Lake Warner” for each data entry. Or you may always sample just beneath the water surface, so why bother to record depth? Ultimately, your program and/or database manager must decide what’s best for your program. But it is a good idea to remember two principles of database architecture:

  • Each data record should be uniquely identified. In this case, each row constitutes a record, and the unique information that identifies it will be some combination of water body name, site name/number, date, time, and/or sample ID code that you give it. The last thing you want is to have a data value and not know which collection it came from. And don’t forget to include year, if you expect your program to be around for more than a year.

  • Design your database so growth is easy to accomplish. You may only sample one water body now, but if you decide to sample tributaries next year, or want to compare your data with that from a neighboring program or contribute to a statewide data set, you’ll need to add that identifying information. It will likely be easier to do it now than later.


Tip 1. To repeat values in Excel:

Excel makes it easy to repeat values. If for instance, you have 20 samples taken at Lake Warner and 3 at Campus Pond, you can enter “Lake Warner” once in the water body column, then click on the small black square at the lower right corner of that cell, and holding the mouse key down, drag the mouse down to the 20th cell. Let go of the key and the name “Lake Warner” will appear in all of those cells. Then type in “Campus Pond in the 21st row, and do the same for the next 2 rows.


Repeating vs. incremental values: Note that Excel will sometimes increment values instead of repeating them, if the initial value appears to suggest a stepped increase. Try typing “Jan” in an empty cell, then dragging the lower right corner down a few cells. “Feb”, “Mar”, “Apr” etc. will appear. If you want to keep the same value (e.g. “Jan”, “Jan”, “Jan”), hold down the control key when you drag the mouse down (or across). Conversely, Excel isn’t incrementing and you want it to, hold down the control key when you drag to force increments. You can also force incrementing values by typing the first 2 values in a sequence. Then highlight both cells, and drag the black square down (or across) to produce the sequence. E.g. “1” and “1.5” in the first two cells will yield “2”, “2.5”, “3”, etc. in subsequent cells.
Tip2. Hide and “Unhide” cells

This feature is helpful when reviewing data sets with many columns, or when you want to arrange data in different ways for data entry purposes. For instance, your lake program collects bi-weekly Secchi, DO and temperature data, and monthly TP data. This suggests 2 different data entry/organization options:



  • Use different data forms for the different dates, as in the sample below. One is used just for Secchi, TP and DO; the other also has a TP field. This isn’t necessarily the best option, as you will still need to merge these data sets in order to organize all your data for later graphing, and you might get into some real file management headaches.




Waterbody

Date

Site

Secchi

Temp

DO

Lake Warner

5/19/2003

2

1.5

17

9.02

Lake Warner

6/17/2003

2

1.5

24

5.06




Waterbody

Date

Site

TP

Secchi

Temp

DO

Lake Warner

5/5/2003

2

26.5

1.7

16

9.9

Lake Warner

6/3/03

2

22

1.7

19

8.72

Lake Warner

7/1/2003

2

40

1.9

26

5.68



  • Use one form that includes all possible parameters, and fill in only the appropriate fields.

Waterbody

Date

Site

TP

Secchi

Temp

DO

Lake Warner

5/5/2003

2

26.5

1.7

16

9.9

Lake Warner

5/19/03

2



1.5

17

9.02

Lake Warner

6/3/2003

2

22

1.7

19

8.72

Lake Warner

6/17/2003

2




1.5

24

5.06

Lake Warner

7/1/2003

2

40

1.9

26

5.68

This makes more sense conceptually, but the arrangement may still be prone to errors, as data can easily be entered in the wrong cells. One way to minimize this problem (particularly when entering large amounts of data at one time) is to use the “Hide” function in Excel. On days when you are not entering TP data, hide the TP column by dragging the mouse over the column header, then right-clicking the mouse (or open the “Format” dialogue box), then selecting “Hide”. The column will disappear from view (but the data remains!), making it easier to enter only the data you want. Follow the same steps to reverse the process when you want to view all columns – just drag the mouse over the adjoining columns on either side of the hidden ones and select “Unhide”.



One worksheet or several?


Using a simple, uniform row-column format allows you to easily organize data you wish to graph. You can either store your data in one large worksheet by appending data from each collection to it, or you can save data in separate worksheets, according to the needs of your program. For instance, one worksheet for each year’s data; or if you sample different lakes or streams, you may want a separate file for each. As long as the format is the same for each, it’s a simple matter to copy the records you want records from several locations into a single location sheet that will facilitate graphing. Bear in mind that files can get quite large when you start adding a lot of charts. This might create problems when trying to save data to a floppy disk or send email attachments.

Copy records to a separate area


Once you’re ready to start graphing, we recommend that you copy all the data you plan to work with into a separate area; either a separate worksheet or somewhere else on the current worksheet. For instance, you would copy the 5 Lake Warner records shown above to a separate space to begin graphing. In the sample worksheets and charts we have created, you will notice that there we sometimes create one copy of a data subset for a single chart, and at other times create several charts from a data subset, for reasons given in the following bullet point.
Creating multiple copies of your data set – or of portions of your data set - will help you with graphing problems, but it is not without consequences. The pros and cons of having several copies of your data:

  • Pros: When you create a graph from sorted data and then rearrange the data (e.g. by resorting or by transposing rows and columns), the graph will automatically redraw itself to reflect the new data order – thereby destroying the original graph. We give examples of this below, in the river section showing graphs of one site over several dates, and graphs of several sites on one date. Also, it’s much easier to work with a graph if you can see the data points right next to it. If you make lots of graphs, this is impossible.

  • Cons: Multiple data sets leave you prone to data corruption. If you correct a mistake or update a value in one copy of the file, you’ll need to manually correct that value in all other copies as well. Similarly, additions to the data set are not as easily graphed. For instance, if you are distributing monthly reports on your monitoring program, adding each month’s results to a growing graph, you may have to enter the new data in multiple places to update each graph that is based on the data.


Tip 3. Keep headers in view when scrolling.

To keep the top row or rows in view when you scroll down through a large worksheet, place the cursor on the row just below the row or rows you want to remain visible. Open the “Window” tab, then select “Freeze Panes”. You can employ the same tactic to keep columns in view when scrolling left to right.





Yüklə 327,71 Kb.

Dostları ilə paylaş:
1   2   3   4   5




Verilənlər bazası müəlliflik hüququ ilə müdafiə olunur ©muhaz.org 2024
rəhbərliyinə müraciət

gir | qeydiyyatdan keç
    Ana səhifə


yükləyin