Update: Professor Pi has posted a much simpler method for Excel in his write-up, using a named range and the TRIMMEAN function. I originally used the MS Works spreadsheet, which has no TRIMMEAN function (as far as I know) so the below might be nice in Works, although a less painful alternative would be just to buy Excel.


Inspired by Sharq's write-up above, I decided to try my hand at a simpler way of calculating the interquartile mean in Excel. The operative word there is "try."

SharQ's write-up above shows that we can calculate the I-Q mean with one formula. We're going to break it down into several formulas, each of which will be a tiny step, so that at the end the process is clear.

To help us with the cases when the number of data points cannot be split evenly into fours, we will use this property of the interquartile mean: the I-Q mean of {x1, x2, x3, ..., xn} is the same as the I-Q mean of {x1, x1, x1, x1; x2, x2, x2, x2; ...; xn, xn, xn, xn}. In other words, if we create a new data set consisting of four copies of each element of the original set, the two sets will have the same interquartile mean. (This somewhat makes sense intuitively, since this is how the normal average works. If you are in a class with two exams and you score 90 and 95, your final grade will be the same as in a class with four exams in which you score 90, 90, 95, 95.) We will refer to the set with four copies of each item as the expanded set. Now that our expanded set is guaranteed to have a size that is a multiple of four, we no longer have to handle four cases in our formula, just one.

Enough talk. If you have the volition, fire up Excel and follow me.

Let's make some headers first. In A1, put "Data". In B1, put "IQ Mean". Our input will go into column A (starting on row 3), and our result will show up in cell B3. Skip C; label D "Range", E "Counters", F "Original", G "Expanded", H "Contribution", I "Sum", and J "Count."

In D3, put the formula   = COUNT(A3:A10000)
In D4, put the formula   = D3 * 3
In E2, put -1. (Read nothing into that.)
In E3, put the formula   = E2 + 1
In F3, put the formula   = A3
In G3, put the formula   = VLOOKUP(INT(E3/4), $E$3:$F$10000, 2) and don't panic.
In H3, put the formula = IF(AND(E3>=$D$3, E3<$D$4), G3, 0)
In I3, put the formula   = SUM(H3:H10000)
In J3, put the formula   = 2*COUNT(A3:A10000)
In B3, put the formula   = I3/J3

Now we want to copy the formulas in E3, F3, G3, H3 down at least a thousand times. To do this fast, highlight those four cells, hold Shift while hitting Page Down repeatedly, and then type Ctrl-D.

The construction of our spreadsheet is done. To use it, just enter in values (no more than 250) starting in cell A3 and going down the A column, then sort the values (using "Sort" on the Tools menu). Cell B3 will show the interquartile mean.

Although figuring it out yourself is half the fun, I'll explain it if you like...

In column F, we duplicate the entries of column A for convenience (and so that VLOOKUP works; that's later). The extra zeroes in the column (coming from empty cells in column A) can be ignored. In column E3, we number each value in column F, starting from 0 (since I'm a C-array kind of guy).

The formula in the G column is the magic part. In G3, it reads = VLOOKUP(INT(E3/4), $E$3:$F$10000, 2).

Huh?

Well, imagine that our columns E and F are a hashtable, mapping a value in column E to a value in column F. VLOOKUP(SOMEVALUE, $E$3:$F$10000, 2) returns the value in column F next to the cell in column E containing SOMEVALUE. (The parameter 2 tells Excel we want the second column in the range (the middle term).) In our actual formula, SOMEVALUE is INT(E3/4); this means that each number in column F will appear four times in column G. (INT rounds a decimal down to an integer.) So our column G looks like F[INT(0/4)], F[INT(1/4)], F[INT(2/4)], F[INT(3/4)], F[INT(4/4)], F[INT(5/4)], F[INT(6/4)], F[INT(7/4)], which is F[0], F[0], F[0], F[0], F[1], F[1], F[1], F[1]..., our expanded data set. (The VLOOKUP function is best understood by a warped mind; I am pleased to be at your service.)

In column H we eliminate the contributions of the low and high values from our expanded set. If the index of a value is not within our desired range (which comes from D3 and D4), we replace its original value with a zero. If the index is in the desired range, we pass the value on unmodified.

Finally, we add up column H, count how many values from the expanded set still remain, and find the quotient. This quotient is the interquartile mean.

This method uses a lot more space than SharQ's does above, but it requires (I think) a little less typing, complemented by a lot of copy-down-the-rows hullabaloo.

So now, armed with the power to calcuate interquartile means for any set with 250 elements or less, have fun, um, finding out your merit or something....


If you're lazy like me and just skipped to the bottom of the write-up (I only do that sparingly...) then here is a portion of the spreadsheet, with my current reputations entered in as data.

Data  IQ Mean       Range  Counters  Original  Expanded  Contrib  Sum  Count
                              -1
  -1  6.529...        17       0        -1        -1        0     222    34
  -1                  51       1        -1        -1        0
   1                           2         1        -1        0
   2                           3         2        -1        0
   3                           4         3        -1        0
   4                           .         4        -1        0
   5                           .         5        -1        0
   6                           .         6        -1        0
   6                                     6         1        0
   6                                     6         1        0
   8                                     8         1        0
  10                                    10         1        .
  11                                    11         2        .
  12                                    12         2        .
  13                                    13         2      (below,
  13                                    13         2       non-zero)
  17                                    17        ...     

(This calculation of merit is slightly off from the one displayed on Honor Roll and you; I'm not sure why.)

I and my army of imaginary minions have striven to ensure that no slip-ups occurred in the transcription of data from Excel to E2. But we are not perfect, so if you attempt this and it fails or Excel attacks you, let me know. Thanks.