An artistic experiment in the way we use language

"He multiplieth words without knowledge" - Job 35:16


Words have fascinated me since...as long as I can remember. I must have been rather like the child that Christine tells of - a whippersnapper of seven months who had learned and uttered his first word, simply by being totally intent on others' speech, watching like a hawk until he got it. His first word may not have been "Historiarum" or "Unmorrised", but it was certainly precocious (not the word, the utterance!). The usual age at which a child starts to speak is around twelve months, apparently.

Why, you ask, this sudden need to write about words? Well, look at two of the books I am currently reading: "The Meaning of Everything" is one (being a history of the Oxford English Dictionary), "Language Through the Looking Glass" the other, looking at language and linguistics.

Coupled with this is my current personal quest to write 750 words a day - an effort that sent me scurrying in search of an accurate word counting widget, as that embedded in E2 is an approximation, and I needed something approaching accuracy. Google "word count" and you will probably see wordcount.org at the top of the list, and whilst it wasn't what I was looking for at that moment, it certainly was what I had been needing for some time, without realising it.

So, What Is It?

Well, WordCount is a website (to be found at wordcount.org). Edit: Damnit, it requires Adobe Flash, so it's unusable nowadays. It's a list of words - in fact, the 86,800 most frequently used English words, ranked in order of frequency, with each word scaled relative to the words that precede and follow it, laid out in a "visual barometer of relevance". (I have to add that it's based on British English usage, using a thing called the British National Corpus, an online collection of texts totalling 100 million words.)

Is it useful? Well, yes it is, if you're fascinated by language and words! If you're not, then you'll rapidly become bored, and wander off to get your daily fix of online cartoons. But I digress. When it starts up (there's a Flash widget), the first thing you notice is the most frequently used word. Guess what it is. THE. I tested it by using a frequency analyser on the text I have written thus far, and it's "THE". The least-used word I'll come to later, but I want to play a little game for a moment.

Imagine that you took the thirty most common words from this list. How useful do you think they'd be? What do you think you could do with 'em? Well, try using all of them:

the of and to a in that it is was I for on you he be with as by at have are this not but had his they from she

Valuable? Well, yes. Invaluable words, in fact. We couldn't do without them, and in fact, there's a surprising amount we can say using them alone. Read them aloud, in order, at random or backwards. There's a pattern, a sense. "In that it is, was I! For on you he be..." and "but not this are have" both seem to have something sensible about them, and I just now plucked them at random from that seemingly senseless string.

I doubt you could communicate very much, however, using just those words. Basic English, a constructed language devised by one C. K. Ogden, used just 850 words. You'd certainly need to include more verbs and nouns, but I hope the point is made, that we rely heavily on certain words, which become the scaffolding on which all our language, written and spoken, is based.
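For anyone who'd rather test this on their own text than take my word for it, here is a minimal sketch of a frequency analyser in Python. It is not the tool I used (that one is linked at the end); the function names and the sample sentence are my own inventions, and the rule for what counts as a "word" (runs of letters, with apostrophes allowed inside) is an assumption that other counters may not share. It counts the words, ranks them, and works out what share of a text is made up of the thirty words listed above.

```python
import re
from collections import Counter

# The thirty most common words, copied from the list above.
TOP_THIRTY = (
    "the of and to a in that it is was I for on you he be with as by at "
    "have are this not but had his they from she"
).lower().split()


def tokenize(text):
    """Split text into lowercase words; apostrophes stay inside words."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def frequency_table(text):
    """Count how often each word occurs in the text."""
    return Counter(tokenize(text))


def top_thirty_share(text):
    """Fraction of all the words in the text that come from TOP_THIRTY."""
    counts = frequency_table(text)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[word] for word in TOP_THIRTY) / total


# Hypothetical sample text; paste in your own to try it.
sample = "The cat sat on the mat, and the dog was not amused by it."
counts = frequency_table(sample)
print(len(tokenize(sample)), "words in total")
print("Most common:", counts.most_common(3))
print("Share made up of the top thirty:", round(top_thirty_share(sample), 2))
```

Feed it a longer piece of ordinary prose and the scaffolding words should float to the top of the table, though the exact figures will depend on how you choose to split the text into words.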

Enough Natter, Where's the Fun?

Well, here comes the good bit of WordCount. You can flip back and forth to find the next or previous word, search for a particular word, or look for words by their rank. So, searching for word number 86,800 gives us the last word in the list. It's in the softlink table here. Guessed which it is yet? The middle word is also there, and that's softlinked too.

Of course, given a search facility, we all get curious about where certain of our favourites are in the batting order. And you know, the developers of the site thought that the statistics of search usage would be interesting, too. So they also included a widget they call the "QueryCount", which ranks the words searched for, in order. And guess what? Of the top fifteen words searched for, ten are rude, crude or lewd in nature. I'll leave you to discover which ones are and aren't - the URL is given at the end of this writeup.

People are also having fun with it - there are poetry competitions using neighbouring words, even conspiracy theories based on adjacent words. Examples include the classic "America ensure oil opportunity". I myself found "Bush admit specifically agents smell". Need we say more?

So anyway, I promised you something interesting, so here you go - I looked up my own name, Kevin (ranked 3,825, flanked by "Charter" and "indicates") and Weedon (at position 37,219 between "vigilantes" and "Montego"). My middle name is there, too, nudging up at 1,119 between "d" and "drive".

So the last word? I have to admit I'm being a meanie now - you'll pardon me for using ROT13 encoding so's not to spoil the surprise: "pbadhvfgnqbe". The middle word is "ybbcrq". Oh, and "wert" and "perch" are there, of course.
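If you have no ROT13 decoder to hand, a few lines of Python will do the job. This is a minimal sketch that leans on the rot_13 codec in Python's standard library; the function name rot13 is my own. ROT13 shifts each letter thirteen places along the alphabet, so running it twice gets you back where you started, and the same function both encodes and decodes.

```python
import codecs


def rot13(text):
    """Rotate each letter 13 places; non-letters pass through unchanged."""
    return codecs.decode(text, "rot_13")


# The two encoded words from the paragraph above.
for word in ("pbadhvfgnqbe", "ybbcrq"):
    print(rot13(word))
```

I've left the output off on purpose. Run it yourself if you want the answers.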

Just in case you were wondering, the list of the last 30 is given below, in reverse order.

Conquistador recrossed workless carniola tangency multilingualism lauro Golgotha homemakers savills tella Blick inro historiarum moyne criers allocatively chalkis sibomana tarrow mymouse Chudleigh pandanus arara stupendously UPF behead superette maji carob

There now, you're as wise as I am. Wondering about E2's favourite word? Webster may have it, but these buggers missed out on "unmorrised". Give 'em time.


http://www.wordcount.org/
http://www.wordcount.org/querycount.php
http://en.wikipedia.org/wiki/Basic_English
Word Counter: http://www.javascriptkit.com/script/script2/countwords.shtml
Frequency Analysis: http://www.georgetown.edu/cball/webtools/web_freqs.html
http://freakytrigger.co.uk/ft/wedge/2004/08/wordcount/
http://www.number27.org/projects/wordcount/conspiracy.html