VLookup Assistant Excel Add-In.
This document provides details of an add-in written for Microsoft Excel, which provides assistance for the VLOOKUP function. Whilst this function is both useful and powerful, it is frustratingly easy to create a lookup which stubbornly returns #N/A or the wrong results whatever modifications are made.
This add-in provides support by analysing each element of the formula and how it relates to the range being looked into. It also tests the lookup using different Excel formulae to establish if the data you’re trying to find really exists. As an add-in, the assistant will be available to any spreadsheet opened.
What’s a VLOOKUP ?
A vlookup is a function which allows a given value to be looked up from within a table, and an alternative column to be returned from that table. It can also return the closest match to the value being looked up so that each row in the table can represent ranges of values. The table being looked into can be within a different spreadsheet to the one currently being worked on, and this spreadsheet can be shared between multiple users.
There are many advantages to using a vlookup. The main advantage is that tables of values don’t need to be repeated on the spreadsheet, significantly reducing space and data maintenance issues. Once a formula for a lookup has been set up, it can quickly and easily be copied to different areas in the spreadsheet so that it can refer to adjacent cells in a list. An example is shown below;
The vlookup function can be entered directly in a cell, by using Insert, Function, or by clicking on the Insert Function icon . Using the Insert Function options are better option for the novice because they open the “Function Arguments” window (shown below) which prompt you for each piece of information required, and offers help.
The arguments that should be specified for the vlookup are;
The value you’re trying to find (lookup_value). This can either be a literal value, or a cell reference, or a formula. If it is a cell reference or formula, then the value that will be looked up will be the value of the cell reference, or the result of the formula.
A reference to the cells where the results can be found (Table_Array). This is expressed in the form of an Excel range (e.g. $C$1:$D$5), or a named range of cells. The leftmost column must contain the data being looked up. For example, if we were looking up a branch number, then the full list of branch numbers should appear in the rightmost column. The data that should be returned must also exist in the range.
The relative column number in the Table Array which should be returned if the value is found. So if we have a table of branch numbers and branch names (e.g. in the range $C$1:$D$5), and we want the branch name, we’d specify column 2. Note that this is relative to the range specified in the formula – not the spreadsheet (i.e. in the example quoted, we’re returning column D, but we specify 2 as the relative column number, not 4).
An exact match indicator. A 0 in this column (which is the default) specifies that an exact match must be found. A value of 1 (or True) can be specified in which case the vlookup will return the last value which is less than the lookup value. The range of cells being looked into must be sorted in ascending order for this to work because the vlookup function will stop looking once it finds a value higher than the value being looked up, and return the previous value it saw.
Why do I need a VLOOKUP Assistant?
The specification of the VLOOKUP function above looks relatively simple, and the principle of looking up a value against a table sounds simple, so what can go wrong? The trouble is that all too often we find that whilst we know the data we’re looking for is in the lookup table, and generally, we can physically see the data in both locations, the VLOOKUP function can’t find it. This can lead to highly frustrating results;
Unfortunately, what we can see on screen isn’t necessarily the same as the way the computer sees it or stores it. Likewise, what we would call an exact match, and what the computer would call an exact match often differ. A prime example of this (and a common problem in VLOOKUPs) is when spaces appear before or after a value that is being looked up, or its corresponding value within the table we’re looking into. Usually these spaces are difficult to spot because they take up so little room, or because they appear at the end of left aligned text, at the beginning of right aligned text, or either side of centre aligned text!
Another common problem is when the value you’re looking up is held in Excel as text, and the corresponding value in the lookup table is held as a number (or vice versa). Generally, values held in Excel as text have a single quote ‘ in front of them. It is a common misconception that reformatting a numeric cell so that it looks like text will mean that it is treated as text. Unfortunately Excel itself doesn’t help to dispel this myth by stating that “Text format cells are treated as text even when a number is in the cell.”. Choosing Format cells against a selected cell which holds a number , and then choosing the category Text from the Number tab will make the cell look like text, but a Vlookup function still won’t find it if it is looking in a lookup table which holds text values.
The reason is due to the way computers store numbers and characters (text). Computers store everything as numbers. Ultimately they are stored as binary ones and zeros, but in the simplest terms one storage location (one byte) holds a value as two hexadecimal digits with a maximum value of FF, which corresponds to 256. Each character is represented by a number corresponding to that number in a “character map” of 256 different characters. Therefore one character can be stored in one byte of information. Numbers on the other hand, can simply be held as hexadecimal digits, and therefore one byte of information can hold a number between 0 and 255, which looks more like 3 digits to us.
Occasionally issues can occur because the different components of the vlookup function are incorrectly specified. A common mistake is with the column index number which specifies which column from the lookup range should be returned. In the examples I’ve shown above where only two columns are specified in the lookup range, this would be difficult to get wrong, but in much larger tables (especially where the size of the table is such that it can’t all be displayed on the screen together), it’s easy to miscount the number of columns.
Most special case scenarios are not covered by the assistant. Such cases are grammatical issues, such as missing or added punctuation marks, or plurals specified rather than a singular value. One case which is tested for is a missing leading zero off a text value. Leading zeros appear before many data items such as telephone numbers, but are often omitted when they are being looked up.
Unfortunately Excel isn’t great at telling you what the issue is, and the error message that gets returned isn’t useful in itself either. Specifying a column index number which goes outside the number of columns in the lookup range for example, will return a value of #REF! Likewise, if a value can’t be found, Excel treats that as an error and returns a value of #N/A. Sometimes, we can fully expect that a value won’t be found and we’d want to treat that as a space, or a zero. Unfortunately, that #N/A value will propagate to the results any formula entry which refers to the vlookup that can’t be found.
For example, let’s say we had set up a spreadsheet which showed on one single screen several aspects of a stores profile taken from multiple sheets or other spreadsheets. One of the fields we show are the number of seats the in-store restaurant has, and six other fields are for each line of the stores’ address. Below that data, we have some formulae which multiplies the number of seats by a factor to work out how much time would be necessary to clean every seat in the restaurant, this value is displayed in another cell, and ultimately becomes part of a total at the bottom of the sheet. If the store has a simple address with just a few lines, we’d prefer the missing lines to be shown as blanks rather than #N/A. Worse still, if the store doesn’t have a restaurant, the number of seats would show as #N/A, the time to clean these seats would show as #N/A, and the total number would also show as #N/A.
Sometimes, a value might get returned by a vlookup which we wouldn’t expect. This could be because we’ve specified that we don’t want an exact match, or because more than one cell with the lookup value exists.
A further danger with a vlookup is when the formula entry is repeated on subsequent cells, but the lookup range specified is dynamic rather than static. A static range has dollar signs “$” specified in front of any part of the range specification that should not change when the range specification is copied to another cell. For example, copying a formula using the range $C$1:$D$5 to any other cell will use exactly the same cell reference. For a dynamic range specification like C1:D5 however, copying this to a cell two columns across and one down, would refer to E2:F6. A common way to use a vlookup is to repeat the first vlookup on several rows below the first one, by setting up the formula in a single cell, and repeating it down by using auto-fill. If the formula is referring to a dynamic lookup range specification, then auto-fill will change this lookup range on each subsequent cell, resulting in an incorrect lookup range. This is contrary to the cell reference specifying what we want to lookup, which we probably do want to change on each subsequent row. It is good practice to specify the lookup range as a static range, or better still as a named range.
What issues are covered by the VLOOKUP Assistant?
After all of those issues, here’s VLOOKUP Assistant to the rescue!
Several options were considered for the implementation of a routine which would help identify and fix the issues listed above. The first prototype was in the form of a new Excel function which would be specified almost exactly the same as a VLOOKUP, with an additional argument specified at the end which could provide an override value for #N/A. The only trouble with this approach was with the danger of complacency on behalf of the user. Having a function which fixes all of those issues for you doesn’t help you to identify the cause, and choose a suitable remedy for yourself. The other issue would be that if the new function was installed on your computer as an add-in, then while it would be available to every spreadsheet you worked on, it wouldn’t work on another computer you opened the spreadsheet on.
The VLOOKUP Assistant tries to identify the issues listed above, and suggest a fix for them. In most cases, it will display a modified VLOOKUP function, and offer to copy this to the clipboard for you. The modified formula uses regular Excel functions available to any copy of Excel, including older issues. There is therefore no issue in implementing the new formula on your spreadsheet, and then another user opening the same spreadsheet on their computer. The Assistant also shows you why the anomaly may be happening so that your understanding of the issue is broadened.
The assistant works by analysing the lookup value and range that is referred to, counting the number of rows in the range, and examining the type of data being looked up. It uses other Excel functions (such as COUNTIF) which work in slightly different ways to establish the existence of the lookup value in the lookup range. It exploits features of these other functions such as the use of wildcards in a COUNTIF test, to see if blanks are present in the lookup range values. It also tries the vlookup in different ways by converting the lookup value to text or to a number before testing the lookup again. As the assistant uses each of these methods, it adds a narrative to the final description and then displays these options on screen with options to copy new formulae entries to the clipboard if necessary.
Examples of the assistant in use are shown below;
Here, the cell with the thick black line around it has been selected, and the assistant clicked. The #N/A indicates an error with the vlookup in this cell.
The countif function is used to count how many instances of 2007 it can find in the lookup range (i.e.1)
A value of zero has been specified in the Range_Lookup argument, indicating that an exact match only should be returned.
The assistant recognises that the data being looked up has been entered as a number, and that it needs to turn it into text to fix the vlookup. Furthermore, it recognises that it would also need to add a leading zero to the text to fix the vlookup. It displays the new formula and offers to copy it to the clipboard.
In this example, the assistant can’t even find an example of the text with the countif function.
Ultimately, it recognises that the text being looked up contains blanks on the left, and suggests removing them using the Excel TRIM function.
Having modified the formula, it offers to copy it to the clipboard for you.
If we look carefully at the lookup range shown in the table on the left, we can see that the value of “four” is slightly mis-aligned compared to the rest of the values. This is because it has spaces to the right of the text.
The assistant recognises this by adding wildcards to the end of the countif function.
Trying to modify the lookup value doesn’t help, so the assistant suggests using the ISERROR function to turn #N/A into zero, and offers to copy the new formula to the clipboard for you.
Finally, having recognised the existence of trailing blanks at the end of the lookup range, the assistant offers to remove them for you.
STOP Press ; … and now there’s more …
29 September 2006 : Replacement of #N/A formula offered as a default. Previously, the assistant would offer an alternative formula which tested for an error in the result of a VLOOKUP, and return a default value if found. This formula would only be shown however, if a fix for the formula could not be found. Now it is shown by default and an additional button added to the message window to allow the new function to be copied to the clipboard (see below for an example). This option is not offered if the VLOOKUP is successful.
30 September 2006 : Message window expanded.
The initial version of the assistant showed the results of the analysis using a standard message box. Such message boxes enable simple Yes / No buttons, but are limited to around 1,000 characters of text.
As the assistant’s analysis functionality expanded, much more text was enabled for display. Additionally, the offer to copy more than one formula to the clipboard complicates the issue of just having a yes / no button.
To circumvent this problem, a dedicated window display was added, with the capability to add scroll bars to the window to handle much more text, and also with the addition of further buttons to copy the formula which replaces #N/A to the clipboard, and a Help button which links to Microsoft’s Vlookup Help web page.
30 September 2006 : Column index num check. All too often with VLOOKUPs, the data you wish to extract is many columns over to the right of the identifying cell. For example, if you’re trying to look up week 26’s sales for store xyz from a matrix of 52 weeks sales values with an identifying column showing the store reference, you’d need to specify 27 for the column index number (26 weeks of sales + 1 column for the identifying column).
When the range becomes bigger than you can see on screen, it’s very easy to miscount the number of columns, and enter a number which is too high.
Now the VLOOKUP assistant checks for this, tests a changed value, and reports the results.
1 October 2006 : Warning message for dynamic table arrays. Once a VLOOKUP has been created, and proves to be successfully working, a common mistake is when the formula entry is copied to another area of the spreadsheet, and the table array (the area being looked into) is automatically changed, relative to where the formula is copied to.
Excel automatically assumes that if you copy a formula to another place in a spreadsheet, then it should act on cells in a similar position to the original formula, relative to it’s previous position. For example, If you had a formula in cell B1 which stated VLOOKUP(A1,G10:H20,2,0), and you copied that to the cell below (B2), then Excel would assume that you wanted to carry out the VLOOKUP against cell A2. The trouble is, that it would also change the area you were looking into by one row reference to G11:H21 which is probably incorrect.
Such a cell reference is called a dynamic range. In contrast to this, if each part of the range reference was preceeded by $, then Excel would not adjust the range when the value was moved or copied elsewhere. This is called a static range. It is generally good practice to specify ranges that you will never want to change as static. This can be achieved more simply, by pressing F4 after each part of the range (either side of the : sign). It is better practice to use a named range. This is a method where a range of cells are given a name. To name a range either ;
Select the range you wish to name, and click the name box (which appears above the top left cell). Type in the name of the range, and press enter. This name box also displays the name of the range currently selected, if one has been assigned.
Here, the range B5:D11 has been selected, and “OpeningTimes” has been entered in the name box.
Select the range of cells you wish to use, and select Insert, Name (or just press Ctrl F3). This will open the Define Name dialogue box, where you should enter the required name in the Name box.
Once a named range has been defined, it can be used in any formula entry by typing the range name where the actual range would usually appear. Named ranges are much easier to understand than cell references too – they make formula entries more meaningful and easer to read. The most important fact is that they are always used as static ranges.
Copying formulae which have dynamic ranges in the table array part of the VLOOKUP is a common cause for #N/A appearing in cells. The assistant now counts the number of “$” signs which appear in the range and reports a warning message if this is less than four. It checks to ensure that the table array reference is not a named range in advance.
1 October 2006 : Check for unsorted range in a find closest match lookup. The ability to find the closest match in a range of values can be very useful – especially when fine tolerances are specified. When the VLOOKUP assistant is coded this way, the data you’re looking up MUST be sorted in ascending order. The VLOOKUP function is quite efficient in the way it tries to search for your data, by stopping it’s search as soon as it finds a match to your specified criteria (starting at the top cell in your range, and moving towards the last cell). In the case of a closest match, if the cell it is checking is greater than the value you’re looking for, it stops it’s search, which is why the data needs to be sorted in Ascending order.
This is an important consideration when setting up the VLOOKUP, but more significantly care needs to be taken when setting the value of the range value argument of the VLOOKUP. If you are searching for an exact match, but specify closest match in the range value argument, then VLOOKUP will stop searching once it finds a value greater than your search value. It should also be noted than searching for the closest match is the default value for the range lookup option. Therefore if you specify nothing in this part of the formula, then the closest match will be sought. An example below illustrates the danger of this ;
In this example, the range value has been omitted, so VLOOKUP defaults to searching for the closest match.
As it searches the lookup array, it sees that the value Wednesday is higher (alphabetically) than “Friday”, so it stops searching and reports #N/A.
The assistant finds it using COUNTIF, and reports this first.
Next it recognises that the lookup range B5:D11 is dynamic, and warns you that this may cause a problem.
The third paragraph reports that you have not specified anything in the last argument, and hence defaults to closest match. Since it can find the value using VLOOKUP, it recognises that the sort order of the range is probably incorrect.
The assistant has alerted you that you haven’t specifies anything, and hence suggests remedial action whether you want an exact match, or the closest match.
1 October 2006 : Adding wildcards into the VLOOKUP function. One possible cause of VLOOKUP failure is when embedded space appear around the value searched for within the table array, but not in the lookup value. As far as Excel is concerned, the values don’t match. Now the VLOOKUP Assistant will suggest an alternative VLOOKUP function without modifying your underlying data. See below for an example ;
The VLOOKUP Assistant recognises that the values don’t match by trying the same lookup using a countif function with an added wildcard. It first adds the wildcard to the first part of the string, then the end, then around both parts of the lookup value. This identifies that blanks or other characters appear at the end of the data.
Previously, the assistant would offer to remove leading and trailing blanks from the lookup values in the table array.
Now it offers to modify the VLOOKUP formula to add a wildcard after the lookup value.
However, this option should be used with caution. As the Assistant tells you, the wildcard will work with more than just blanks, and searching for SIX would find SIX, SIXTEEN, SIXTY, SIXTIES, SIXPENCE and even SIXFIELDS.
Because this should be used with caution, the assistant still asks you if you wish to remove leading and trailing blanks from the table array, after the main window (left) has been displayed.
2 October 2006 : External Files ; On the previous version of the wizard, problems could arise if the VLOOKUP was trying to look into a range held in a spreadsheet which wasn’t already open. Now the wizard will try to open the spreadsheet so that it can continue it’s analysis. If it can’t open the spreadsheet for any particular reason, it alerts you to this fact, and displays the error message associated with the attempted open. If the results of the VLOOKUP are #NAME?, then this may be the cause of your vlookup failure (e.g. the file does not exist, it has been moved, or renamed). If it can open the spreadsheet, then it will ask you if you want it to be closed once the analysis is complete.
What’s an Add-in?
In Microsoft’s words, “Add-ins are programs that add optional commands and features to Microsoft Excel.”. Often, these take the form of macros created in a spreadsheet which are then saved in the “add-in” format, and in the users add-in directory (usually this is "\xe66\Application Data\Microsoft\AddIns\ " on the C drive if available, or D drive.) . Once in the add-in format, all sheets behind the spreadsheet disappear, and the macro’s behind the spreadsheet become available to any other spreadsheets opened.
Typically, within the spreadsheet where the macro was created, it would be attached to a toolbar which would be displayed with any other toolbars at the top of the screen. Once clicked, the macro would operate against whatever spreadsheet, worksheet and cell that was active.
Excel already comes with several optional add-ins courtesy of Microsoft. One example is the “Analysis ToolPak” which “Provides functions and interfaces for financial and scientific data analysis”. Further information regarding this add-in can be found at Microsoft’s web site.
How do I install this incredible piece of software?
The add-in itself should be copied to a safe location on your C or D drive. Once copied, it should be installed by;
Locate the add-in you saved in the step above, and click OK.
The add-in will then be copied to your add-in directory (usually this is "\xe66\Application Data\Microsoft\AddIns\ " on the C drive if available, or D drive.), and the name will appear in the list of add-ins with a tick against it.
Finally, the macro security should be set to Medium. To set or check this, choose Tools, Options, select the Security tab, and click the Macro Security… button, then choose Medium.