Everything Statistics - September 29, 2001 (3) (thing) by Qeyser

Just some mathy questions directed at the Prof, but I'd like to hear answers from anyone who can answer this:

So I really like the idea of the proposed MNFP system of level advancement; however, I am moved by some of the comments that the median node rep might not be the best measure of central tendency. While it seems that the distribution of node reputation is roughly Gaussian, it does seem that some users have a larger right-side tail; that is, they have many more high-rep nodes than low-rep nodes.

Although a noder may not have enough high-rep nodes to make his distribution very skewed, the traditional measures of central tendency for Gaussian distributions may not be able to tell the whole story.

What I'm getting at is this: is there any way to meaningfully quantify the right-side skewness of a noder's rep distribution and thus be able to reward noders that have many more high-rep writeups than low-rep writeups?

Thanks for reading, Yours Truly Qeyser

Professor Pi's Comments:

Actually, the distribution of node reputation only resembles a normal (Gaussian) distribution. The real distribution is most likely closer to a Binomial Distribution. But other factors such as writeup nuking and C!ing (more "air-time") have influence on the shape of the distribution as well.

It would be very impractical to use models such as the Binomial Distribution or the Poisson Distribution (which is the limiting case of the former and would work out for larger rep-sums) because (1) far too much computation is required, with lots of slow factorial calculations, and (2) the whole procedure of calculating an "average" node reputation would become far too complex for the average noder to make sense of.

There is actually a parameter that can be used to evaluate the degree of asymmetry of a distribution; it's called skewness. It is a 3rd-order function of the node reputations (it is built from the cubes of their deviations from the mean). I doubt this parameter is practical in the evaluation of the "average" node distribution, and it would again make the entire procedure too complex.
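
For anyone who wants to see what that parameter looks like in practice, here is a minimal Python sketch. It assumes the common population definition (third central moment divided by the cube of the standard deviation); the function name `skewness` and the two reputation lists are made up for illustration and are not taken from any real noder or from E2's code.

```python
def skewness(reps):
    """Population skewness: third central moment over sigma cubed.

    Assumption: this is the plain moment-based definition, with no
    small-sample bias correction.
    """
    n = len(reps)
    if n < 2:
        raise ValueError("need at least two reputations")
    mean = sum(reps) / n
    m2 = sum((r - mean) ** 2 for r in reps) / n  # variance
    m3 = sum((r - mean) ** 3 for r in reps) / n  # third central moment
    if m2 == 0:
        return 0.0  # all reputations identical: treat as symmetric
    return m3 / m2 ** 1.5


# Hypothetical reputation lists, for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 4, 25, 40]

print(skewness(symmetric))     # zero for a symmetric list
print(skewness(right_tailed))  # positive: long tail of high-rep writeups
```

A positive value indicates a longer right-side tail, a negative value a longer left-side tail. Because the deviations are cubed, a single very high-rep writeup can dominate the result, which is one reason to be cautious about using it for level advancement.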

The median actually would be a fair measure of central tendency for the reputation-distributions we encounter, but as was already mentioned, it's easily broken by targeted downvoting on writeups with a reputation at, or slightly above, the median. It is not robust enough, since it is only a single-parameter description of the distribution. In order to make it more robust, it would be better to incorporate more factors, such as the 1st and 3rd quartile values, or the reputations of all the writeups between the 1st and 3rd quartile. Again, we don't calculate the mean of all the nodes, since that would favor the outlier points too much.
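
The comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `interquartile_mean` is a hypothetical helper that uses a simple trimming rule (drop `n // 4` writeups from each end, with no fractional weighting), and the reputation list is invented.

```python
from statistics import mean, median


def interquartile_mean(reps):
    """Mean of the middle half of the sorted reputations.

    Assumption: a simple variant that drops n // 4 values from each
    end; it does not apply fractional weights at the quartile edges.
    """
    if not reps:
        raise ValueError("need at least one reputation")
    ordered = sorted(reps)
    k = len(ordered) // 4  # number of writeups dropped from each end
    middle = ordered[k:len(ordered) - k]
    return sum(middle) / len(middle)


# Hypothetical noder with eleven writeups, two of them outliers.
reps = [1, 2, 3, 5, 5, 6, 8, 9, 12, 30, 45]

# One targeted downvote on the writeup sitting exactly at the median.
attacked = [1, 2, 3, 5, 5, 5, 8, 9, 12, 30, 45]

for label, data in (("before", reps), ("after", attacked)):
    print(label, median(data), interquartile_mean(data), mean(data))
```

In this made-up case the single downvote moves the median by a full point, while the interquartile mean moves by only a seventh of a point because the vote is averaged over the seven middle writeups. The plain mean barely notices the downvote either, but it sits far above both other measures because the two high-rep outliers pull it up.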
