Intro to Spreadsheets with MS Excel
Many computer technologies developed as improvements on previous non-computer technologies. For example, word processing is so successful as an improvement on typewriting, that typewriting is pretty much obsolete. Similarly, spreadsheets, primarily used for arithmetic calculation, offer major advantages over non-computer technologies such as hand calculators and pencil and paper calculation.
One of the major advantages of the spreadsheet is that its formulas can recalculate immediately if any of the data upon which a formula depends, is changed. Indeed, probably the two most important spurs to the popularity of personal computers in the late 1970’s and early 1980’s were the power of word processing compared with typewriting, and the power of a spreadsheet, particularly to recalculate.
An Excel workbook may have several worksheets, indexed by their tabs. You may change the name of the sheet by clicking its tab and editing the name.
A worksheet has lettered columns and rows that are numbered. The intersection of a column and a row is a cell, whose address is named by the in the format
columnLabel rowLabel
For example, the cell in column G and row 5 has the address G5, as is confirmed in the “name box” when the cursor is in that cell.
Definitions of what cells display: a user may enter constant values, text, dates, Boolean constants (True or False), etc. A cell may also be defined by a formula. A formula typically starts with an equal sign. Many formulas are used to specify arithmetic calculations. Often we refer to other cells in formula, by using the cell’s address as a variable whose value is displayed in the cell.
Notice also that as we edit a formula, we can use “cursor pointing” clicking on a cell to place the cell’s address into the formula being edited. Often, this is faster and less error prone than typing the cell’s address.
Since many formulas are used for arithmetic you should know the arithmetic operators:
Operator
|
Explanation
|
Example
|
+
|
Addition
|
=B2+B3
|
-
|
Subtraction; also, as unary operator
|
=B2-B3
=-C8
|
*
|
multiplication
|
=B2*B3
|
/
|
Division
|
=D9/3
|
^ (caret)
|
Raise to power
|
=C12^2
|
Other elements used in formulas include functions. Excel provides hundreds of functions. Each is used in the format
FunctionName(parameterList)
where the parameter (or argument) list has 0 or more entries; if more than 1 parameter, then parameters are separated by commas. A parameter of a function may be a constant, a cell, a “range” of cells, or an expression. Among the commonly used functions:
-
Sum – used to total several arguments. For example, the formula
=SUM(B2:B4)
says to add up the values of the cells in the range B2:B4. A range of cells is a contiguous rectangle of cells, specified in the format
upperLeftCorner:lowerRightCorner.
Other valid uses of the Sum function
=Sum(B2:F6) (note range of multiple columns and multiple rows)
=Sum(B5, B7, B12:D15, 25*F4, 10)
-
The IF function is used to distinguish between cases. It uses the format
If(logicalCondition, expressionForTrue, expressionForFalse)
The “logicalCondition” is evaluated as either True or False. If it’s True, then the expressionForTrue is evaluated as yielding the value of the IF function; otherwise (the logicalCondition evaluates as False), then the expressionForFalse is evaluated as yielding the value of the IF function.
For example: Tax calculations typically take the form
Tax rate * taxable income
Suppose a state has a 2-tiered tax system, in which taxable income above a criterion level (say, $50,000) is taxed at one rate (say, 2%), and taxable income at or below the criterion level is taxed at a lower rate (say, 1.5%). Suppose, further, the taxable income is in cell D2. Then an expression for the tax rate is
If(D2 > 50000, 2%, 1.5%)
and the formula for the tax would be
= If(D2 > 50000, 2%, 1.5%) * D2
There are several ways in which the formula above can be improved. For example: The formula above could require 5 decimal places to show the exact product. However, US currency transactions usually use no more than 2 decimal places for fractions of a dollar. There is often a difference between the precision of a calculation and the precision of the display of a cell. The sum of unrounded numbers may appear to be slightly in error due to the accumulation of roundoffs. To avoid the appearance of such errors, we can use the Round function, whose form is
Round(expression, numberOfDecimalPlaces)
in order round off the value of the first parameter to the number of decimal places specified as the 2nd parameter. Thus, a better tax formula:
= Round(If(D2 > 50000, 2%, 1.5%) * D2, 2)
(the highlighted section corresponds to “expression,” above).
Another way to improve the tax calculation makes use of the observation that the “magic numbers” of the tax calculation ($50,000; 2%; 1.5%) could be changed by legislation. Suppose that, instead of the small number of clients we have in our example worksheet, we have 5,000 clients, each with their own row. If any of the parameters of the tax calculation is changed, then, if we compute as discussed above, each of the 5,000 cells would have to have its formula revised. Suppose, instead, we place these values in cells and refer to the cells in our tax calculation formulas. Then a change in tax legislation would call for revision of the 3 cells with these values, rather than 5,000 cells. Thus, if we place the criterion income in B13, the high rate in B14, and the low rate in B15, we might think that the formula
= ROUND(IF(D2 > B13, B14, B15) * D2, 2)
will work for us – but this formula won’t copy correctly down its column. For example, in the comparison, the reference to B13 is meant to refer to the criterion income, but this gets adjusted to B14 instead of remaining B13 when copied to the next row. What we need is a different kind of cell reference. The references used previously are relative references that adjust, as discussed above, through copy-and-paste operations. We can use a fixed (absolute) reference to fix (prevent adjustment of) the reference with respect to its column, its row, or both, during copy-and-paste operations. Thus, there are 4 ways to refer to B13:
-
B13 - relative in column, relative in row
-
$B13 - fixed in column, relative in row
-
B$13 - relative in column, fixed in row
-
$B$13 - fixed in column, fixed in row
Copying formulas: You can copy a formula, and what gets copied is the form of the formula, with its relative cell references adjusted according to the translation between the source of the copy and the destination of the paste. For example, when we copied the formula
=B2-C2
from D2 to D3, we found that the formula in D3 had been adjusted to
=B3-C3
corresponding to the fact that between the source (D2) and the destination (D3) is a translation of 0 columns (so the column references in the formula were unchanged) and of 1 row (so the row references in the formula were adjusted by 1 row).
However, fixed references stay fixed (unchanged) through copy-and-paste operations. For example, suppose the formula of cell F7 is copied to cell H11. Note the column translation is from column F to column H, which is 2 columns to the right; therefore, all relative column references are adjusted 2 columns to the right. Similarly, the row translation, from row 7 to row 11, is an increase of 11-7=4 rows, so relative row references are adjusted by an increase of 4. Everything else in the formula is copied without change.
If F7 has the formula
|
then H11 will have the formula
|
=B4 + C14
|
=D8 + E18
|
=$B4 + C14
|
=$B8 + E18
|
=B$4 + C14
|
=D$4 + E18
|
=$B$4 +C14
|
=$B$4 + E18
|
The first parameter of the IF function is a logical condition (evaluates as True or False). Typically, this condition is a comparison, such as the logical expression
D2 > B$13
The logical operators you might use to form comparisons are:
Explanation
|
Operator
|
Example
|
Is greater than
|
>
|
D2 > B$13
|
Is less than
|
<
|
H8 < I20
|
Is equal to
|
=
|
K5 = 0
|
Is greater than or equal to
|
>=
|
Q6 >= 2 * H5
|
Is less than or equal to
|
<=
|
L6 <= D$7
|
Is not equal to
|
<>
|
K9 <> 0
|
One way to copy: as in Word, block the cell(s) you with to copy from, then either click the Copy button or strike Ctrl-c; then move the cursor to the cell(s) that are to receive the copied formula(s); either click Paste, or strike Ctrl-v. Another way to copy: Notice the tiny square – a copy-and-paste-handle - in the bottom right corner of a cell you wish to copy from. Provided the cell(s) you wish to copy to is/are adjacent to the cell(s) you wish to copy from, you can hold down the left mouse button with the cursor over this copy-and-paste-handle and drag over the cells you wish to copy to.
Stretching or shrinking a column or row is often desirable. To change the width of a column, place the cursor among the column labels at the right edge of the column whose width is to be changed. Hold down the left mouse button while dragging the column in the appropriate direction (right for wider, left for narrower). Changing the height of a row is done similarly. For example, you may wish to make a column wider when you see an array of the “#” character in a cell in which you expect to see a number, as the appearance of an array of the “#” character indicates the column isn’t wide enough.
The average of multiple expressions can be computed by using the AVERAGE function, in the form
AVERAGE(parameterList)
where the parameterList is as we described for the SUM function. For example, the formula
=AVERAGE(B2:B8)
gives the average value among those cells in the range B2:B8 that have numerical values. The italicized words are stressed to make the point that if any of the parameters of the function do not have numerical values, they’re not included either in the total that’s the numerator of an average calculation or in the count that’s the denominator of an average calculation. For example, if we have a blank cell in B9, then using the formula above yields the same result as the formula
=AVERAGE(B2:B9)
Since the blank cell B9 is not counted in the denominator of the average calculation.
The maximum value of several expressions can be computed by using the MAX function, in the form
MAX(parameterList)
where the parameterList is as we described for the SUM function. For example, the formula
=MAX(B2:B8)
gives the maximum value among those cells in the range B2:B8.
The minimum value of several expressions can be computed by using the MIN function, in the form
MIN(parameterList)
where the parameterList is as we described for the SUM function. For example, the formula
=MIN(B2:B8)
gives the minimum value among those cells in the range B2:B8.
There are many other functions available for use in Excel formulas. Among these are trigonometric, logarithms, and exponential functions; financial analysis functions; and many others. See the Formulas tab, Function library section for additional information about available functions. One additional function you should know is the VLOOKUP function. This is a more efficient version of what might be computed as a sequence of nested Ifs – as
If(a, b, IF(c, d, If …)))
That is, we typically use the VLOOKUP (“vertical lookup”) function to distinguish among several cases. For example, suppose we want a formula for computing a student’s grade. Suppose the grade standards are as follows:
Minimum grade standard
|
Corresponding grade
|
-100%
|
F
|
42%
|
D-
|
45%
|
D
|
53%
|
D+
|
55%
|
C-
|
58%
|
C
|
72%
|
C+
|
75%
|
B-
|
78%
|
B
|
83%
|
B+
|
88%
|
A-
|
90%
|
A
|
95%
|
A+
|
If a student has an average of, say, 85%, then proper use of the VLOOKUP function will find that this “lookup value” (85%) falls between 83% and 88% in the first column of the range of data searched, and will yield that the corresponding grade (from the 2nd column) is a B+. Thus, the formula
=VLOOKUP(I2, A$18:B$30,2)
uses I2 (the student’s percentage) as the first parameter – the lookup value (to be looked up in the first column of the cells being searched – these must be listed in ascending order – otherwise, the VLOOKUP function is likely to return incorrect results); the range A$18:B$30 as the search range, which generally has cells from at least 2 columns, using the first of these columns as its list of criterion values in which the lookup value should be found; the 3rd parameter (here, the value 2) is the index of the column within the range of cells searched from which the value corresponding to the lookup value is to be found.
Formatting your worksheet
It’s possible for all your data to be correct, and your worksheet may look terrible.
Alignments: For example, the default alignment of text values is to the left, and the default alignment of numeric values is to the right. This can result (especially in a wide column) in a column header not appearing over the data it’s meant to head, with the result that the explanatory value of the column header is degraded. The alignment buttons are used to align the data of cells. Often, column headers over columns of numeric data should then be aligned to the right. Note also that often, a column of numbers that display the same number of decimal places, is expected to show these numbers with the decimal points aligned. This is easier to achieve by aligning to the right.
Wrapping text: Often it’s useful to let a cell display its data on more than one line. To achieve this, do the following:
-
Stretch the row tall enough to display the desired number of lines of data,
-
Arrange for the text to wrap by, on the Home tab of the Excel ribbon, clicking (in the Cells section) the Format button, then clicking Format Cells; on the resulting dialogbox, on the Alignment tab, check the Wrap Text checkbox and click OK.
Number formats: The Number section of the Home tab has several buttons that are used to format numbers, including
The Currency button (shows a dollar sign) will display a number with a currency symbol (usually, a dollar sign, but you may choose a different currency symbol from the associated drop-down menu). Note also that large numbers in the currency format show commas; note negative numbers in this format may be displayed in parentheses, without a leading minus sign – e.g.,
$ (67.40) rather than $ -67.40
-
The % button is used to display a number with a “%”. Don’t forget that percent means “divided by 100.”
-
The comma button is similar to the Currency format without a currency symbol. For example, a negative number may be shown in parentheses, without a leading minus sign
(8,345) instead of -8,345
A larger menu of number formats may be obtained by using (on the Home tab, in the Number section) the drop-down menu; the Number tab of the resulting dialogbox has a Category listing of different number formats, including General, those represented by buttons, and Scientific. For example, the value 8345 (in General format) may display as
8.345E+03
in the Scientific format, where the “E” represents “times 10 to the power” – in current example, 8.345 * 10+03.
Notice also in the Number section of the Home tab, the buttons to increase or decrease the number of displayed decimal places.
Other formatting capabilities include font effects familiar from Word: bold, italics, underlining, change of font size, change of font color, change of font style, etc. These are typically applied to entire cells.
You might also call attention to a cell or block of cells by using a “fill color” to change the background color. Boxer recommends: light background colors and dark text, to provide sufficient contrast that the data is easy to read.
The Merge and Center button, found in the Alignment section of the Home tab, may be used to merge together adjacent cells in a row so that the union of space occupied by the merged cells is considered to be a single cell, and the data displayed is centered in the space of the merged cell. This is useful, for example, if you wish to create a heading for multiple columns. This button is a switch – its action is undone by clicking again (assuming the highlight is on the merged cell).
To stress the power of recalculation, and to introduce a small number of new features, consider the following problem. Imagine yourself as charged with the task of determining pay raises for a small staff of employees, subject to certain constraints. Let’s say negotiations has produced an agreement so that the average rate of pay increase will be about 3%, but higher rated employees (on a scale of 1 to 5) will get a larger percentage increase than lower rated employees. The total staff pay increase must be at least 3% of this year’s total pay, and not more than $100 over 3% of this year’s total pay.
In mathematics, a common situation is expressing the idea of a number x falling between two other numbers by a chain of inequalities, e.g.,
0 < x < 10.
Recall this is an abbreviation for
0 < x and x < 10.
In Excel, we might think this is expressed as
0 <= x and x <= 10.
However, in Excel, the AND operator is a logical function, used in the form
AND(exp1, exp2, …)
where each of the parameters is a logical expression (evaluates as one of TRUE or FALSE). Thus, the formula
=AND(E13<=E10, E10 <=E14)
is Excel for
E13<=E10 and E10 <=E14
or what we might think of
E13 < E10 < E14 (note, however, the latter doesn’t mean the same thing in Excel).
We saw that, with our first set of guesses for percentage increases not yielding a valid solution, we were able to use the power of the spreadsheet to recalculate to make a series of modifications in individuals’ percentage increases; this series of modifications brought us quickly to a valid solution.
An amortization schedule is a document showing how a loan is repaid. The calculations include interest (important, because it represents the lender’s gross profit, and, sometimes, the borrower’s tax deduction), and the principal owed at each stage – after each payment – which is important for the following reason: as interest rates change, loans are often refinanced; in order to do so, we need to know what the borrower owes at the time of refinancing.
For our demonstration, we’ll use a simplified form of an ordinary annuity: the payment period coincides with the interest period (e.g., monthly payments and interest compounded monthly); a periodic payment is due at the end of the period; all payments are of the same size, with the possible exception of the last payment, which must be whatever is necessary to pay off the loan. If the last payment is larger than the other payments, it may be called a “balloon payment.”
Note also the use of a fixed reference in the formula used to calculate the interest that comes out of a payment.
Charts: Excel allows you to create a variety of charts based on the numbers that appear in a worksheet: bar and column charts, pie charts, line graphs, etc. Typically, a chart is created as follows: Block the data in a worksheet that is to be the basis of your chart. Then, on the Insert tab of the ribbon, in the Charts section, click the button corresponding to the desired chart type and start the process of creating the chart (this process has several steps).
(Note a worksheet can be deleted via: on the Home tab, click the drop-down menu button labeled Delete, and choose Delete Sheet from the resulting menu.)
For example, suppose we click the Column button. A menu of column chart sub-types appears. Suppose we choose a Stacked Column chart. We obtain a chart that is (typically) overlaid on the worksheet that was the source of the chart’s data. The chart is a graphic object that can be moved around the worksheet and resized, much as we have done with previously used graphic objects.
Our first clustered column chart has 3 clusters of columns – one for Revenues, one for Expenses, one for Gross Profit (the cluster names are the x-axis labels. The legends are the color codes for the columns (here, blue for Galleria, etc). Labels and legends are typically taken from row and column headers, provided that those cells are under the block when the chart is created. Note that a chart created without its labeling cells under the block will typically get default labels (1,2,3) and legends (Series1, Series2, Series3) that don’t help to understand the chart.
With the cursor on a chart, we can use Excel’s Design tab to alter the appearance of the chart in various ways. For example, you can select a different chart style to change the color scheme and 2-D or 3-D effects. You can choose a different Chart Layout to make use of, for example, chart titles, axis titles, etc. You can also click on a title and change its font properties in familiar fashion.
The Layout tab has more tools that can be used to affect the appearance of the chart. For example, in the Labels section of the Layout tab, we can edit Chart Title and Axis Titles (alternate to the tool discussed above on the Design tab); the Legend button can be used to place the legends of the chart in the desired position; etc.
Suppose we want our chart to display data that is not contiguous on the source worksheet (e.g., show Galleria and Eastern Hills, but not Boulevard).
-
One way to create such a chart: First, create a chart based on a block of data including the desired data. Then, in the chart, click on undesired chart data and strike DEL.
-
Another way: As before, create the chart with excess data. Then, on the Design tab, click Select Data; from the resulting dialogbox, select the undesired data series and click Remove.
-
Hold down the CTRL key while extending the mouse highlight over the (non-contiguous) cells you wish to base the chart upon. Then create the chart.
A stacked column chart is typically used to show how quantities contribute to totals. Note a common misuse of the stacked column chart is to include total data among the blocked data when the chart is created. For example, we got a stacked column show the sum of Total Revenues, Total Expenses, and Total Gross Profit – a sum that’s not likely to be meaningful. In our worksheet, this is the total of the entries of row 6. A different chart based on the same data might show a stacked column for, e.g., the Revenues data, with a block for Galleria, a block for Boulevard, a block for Eastern hills, and a block for Total – so that (since the sum of the heights of the 1st 3 blocks in the stack equals the height of the 4th block in the stack) the stack would be exactly twice as tall as it should be.
With the cursor on a chart, if you go to the Design tab and click the Switch Row/Column button, you interchange the roles of the rows and columns in determining legends and x-axis labels; in so doing, you change the structure of your chart. This button gives two possible arrangements for a chart; sometimes, both will be meaningful, and sometimes one will be nonsense and the other meaningful.
A pie chart is typically used to illustrate relative proportions of a total. After you have created the pie chart, you can choose (on the Design tab) a different Chart Layout – for example, you might choose to display the slices’ relative percentages of the total represented by the pie, or the data values that generated the slices. The Layout choice will also affect the formatting of the chart title, legends, axis titles, etc. (including whether or not you use them).
You can copy a chart into a Word document. Here’s an example:
A line chart is typically used to illustrate how quantities change with respect to time, or, more generally, how one quantity is related to another.
A stock chart is typically (not necessarily) used to illustrate changes in stock prices. The High-Low-Close form of the stock chart requires that for each of several “stocks,” we have in the worksheet 3 numerical data values in the order: highest value (e.g., high price); then, lowest value (e.g., low price); then, an intermediate value (e.g., closing price, which is between the low and the high). This chart produces, for each “stock,” a vertical line segment whose high point corresponds to the highest value; whose low point corresponds to the lowest value; with a marked point corresponding to the intermediate value.
A stock chart’s applications are not restricted to stock prices. For example, in our Grades worksheet, we can consider a graded experience to be a “stock,” since our summary data includes, for each graded experience, the high score, low score, and class average, listed in that order.
In a large worksheet, important labeling data may scroll offscreen, making it difficult to read or record or correct data. For example, in our Grades worksheet, in order to read the Grade column (col. K), we could lose our view of column A, where the students’ names appear. Thus, it would be easy to make a mistake in recording final grades in a different document. Similarly, if the class is large, it would be easy to record, say, for the student in row 35, data in the wrong column, since the column headers of row 2 are not visible when we view row 35. We can overcome this problem by “Splitting” the view of the worksheet into 4 panes (subwindows). This is done as follows.
-
Determine where in the worksheet you wish to split into the 4 panes. Place the cursor at the lower right cell at the intersection of the 4 cells at which the split should take place (the upper left cell of the lower right pane).
-
On the View tab, click the Split button. This is a switch, so when you’re done with the split view, you can click this button again to remove the split.
When your view of the worksheet is split, it usually will make sense to do further work in the lower right pane.
Calculation takes time. If you’re entering a large series of data values into a very large worksheet (perhaps, using a slow computer), you could find yourself having to pause for several seconds while formulas recalculate before you can enter the next data value. You can temporarily disable automatic recalculation, in order to save this time, as follows:
-
On the Formulas tab, click the Calculation Options button.
-
On the resulting menu, select Manual.
While in Manual recalculation mode, you can “manually” recalculate by striking the F9 key. Usually, when you finish editing data, you should restore automatic recalculation by selecting Automatic from the Calculations Options menu.
Often, calculations are so large and/or complex that it’s desirable to split them among multiple documents (worksheets). This implies that it may be useful for a formula in one worksheet to use a cell of another worksheet. A good example: tax calculations, in which you may wish to have a worksheet for each of the schedules used.
Suppose in Schedule B, cell B10 has the total interest. We wish to copy this value, say, to B9 of Schedule 1040. This can be by starting the definition of B9 with “=”, then cursor pointing to the desired cell (i.e., click on the tab of Schedule B, then click on that worksheet’s B10). This yields, for Schedule 1040’s B9, the formula
='Sched. B'!B10
More generally, reference to a cell of another worksheet in a formula takes the form
‘sourceWorksheetName’!cellName
Suppose you are working with calculations over a large worksheet, or over multiple worksheets. Reference in a formula to a desired cell may be awkward, because the cell might not be in view; finding it might be awkward, but on the other hand, if you guess at the address from memory, there’s a good chance your guess will be wrong. One solution: use the Name Box to give the cell a name that’s easy to remember. You can then refer to the cell via the name you have given it, as a fixed reference (thus, to refer to the cell as a relative reference, you would still use the address rather than the name). For example, we can use the formula
=VLOOKUP(L3,MinScore:HiGrade,2)
to look up the value of cell L3 in the range of cells MinScore:HiGrade and return the value in the same row and 2nd column of the range of cells.
In determining, in the homework, the formula for the management bonuses: You were given a verbal description, that they’re computed as
=(percentage, according to sales) * (sales for city)
or
=(percentage, according to sales) * B6
Now, the first factor depends on a comparison between actual sales (B6) and the level criteria, with the corresponding percentage in the same row as the matching criterion. Thus, we can determine the percentage by the expression
VLOOKUP(B6, $B12:$C14, 2)
which makes the formula (so far)
= VLOOKUP(B6, $B12:$C14, 2) * B6
and, since we wanted to round to the nearest dollar (0 decimal places), we obtain as a final version of the formula,
= ROUND(VLOOKUP(B6, $B12:$C14, 2) * B6, 0)
Another approach to the problem makes use of the IF function to determine the percentage. However, the VLOOKUP approach has the following advantages:
-
The use of IF yields a much longer and more error-prone notation, roughly because you were required to have at least 3 cases. The VLOOKUP version of the percentage, recall, is
VLOOKUP(B6, $B12:$C14, 2)
The easiest way to use the If function requires you to nest a use of IF inside another use of IF:
IF(B6 >= $B14, $C14, IF(B6 >= $B13, $C13, $C12))
-
Suppose we were to add levels to the lookup subtable. In the VLOOKUP approach, we merely have to modify the range of cells: the percentage, using 4 levels, would be
VLOOKUP(B6, $B12:$C15, 2)
In the IF approach, adding another level (another case to consider) requires another nesting of the IF function, so the percentage becomes
IF(B6 >= $B15, $C15, IF(B6 >= $B14, $C14, IF(B6 >= $B13, $C13, $C12)))
Dostları ilə paylaş: |