Today I had one of my many epiphanies.

Over the years I, like so many philosopher kings, dilettantes, and schemers that have come before me, have come up with a lot of ideas. You know, not just simple "Hey, I wonder if it's time for me to water the plants again" but Big Ideas, ones that were revolutionary in my mind. More often than not these ideas died a quick death, but a number of times I have come up with an idea that was so good I wasn't the only one who thought of it, and I missed out on some cool chances.

In 2001, I began conceiving an idea for a three-dimensional search engine for musical tastes, to provide both data and recommendation. Enter, my idea almost pixel for pixel (though a bit kludgy with the Flash.)

In 2002, I began conceiving an idea for a repository of abandonware video games, but never got past the initial selfish collection stage. Enter Abandonia (who do it better than I would've thought to do it.)

And sometime in late 2003, I spent nearly 3 hours searching the Web for a way to use JavaScript to access a PHP file. I finally found it, on one of those dinky Tek-Tips for Ad Revenue pages you see strewn about, and I implemented XMLHttpRequest. Fast-forward to today, and it's one of the hottest web buzzwords, thanks to the new Google Suggest (I wrote a JavaScript autocomplete, too, you thieving cowards!)

My point is not so much that I am a programming genius who deserves praise and years of backpay (I'm looking at you, Google) as that the ideas that seem to come forth and take people by storm are not so far ahead of the curve - they're murky shadows in a straight and inevitable path. Blogging was the inevitable result of years of bad Geocities webpages, iTunes was the result of the seething underbelly of P2P networks, and, yes, Virginia, even Everything2 has its brothers and sisters in the content management and information delivery world.

Digression over. Today I had an epiphany.

I was reading another dry blog entry by Clay Shirky about XML namespacing and the new folksonomy and the Semantic Web and so on and so forth, and I finally saw the light: the doom of Google.

Now, lest you think that I am out for petty revenge, hear me out. I don't necessarily think Google the company is going to tank (and the stock market is bearing me out something fierce) and I don't think that the many features of Google will simply vanish in the face of the new Semantic Web - people need maps, they need calculators, and they need their sweet, sweet image porn - but let's face it, Google is 95% search engine.

Speaking of which, what is a search engine anyway? A lot of people talk about "programs" holding "keywords" and returning "relevant" information. Let's focus on these last two words: keywords and relevant.

Now in the age of the new folksonomy, where everything gets tagged by both authors and readers, the self-organizing web has its own distinct advantages.

  • The first advantage should be obvious: any given user knows what they've tagged something as, so they can find it pretty easily. I type in "bank", I get my bank's website, not a bunch of banks competing for my business or info about bank robbers or rivers or sound banks or Ernie Banks.
  • The second advantage (another obvious one) is that it makes it easy for people to separate out their content. There is an old adage that most people only visit maybe 10-20 sites on a regular basis. These core sites can be tagged as such. They can all be tagged "ON_A_REGULAR_BASIS."
  • This leads right to the third advantage: The web becomes more meaningful by becoming less encumbered by meaning! One man's meat is another man's poison, or trash and treasure or whatever you want to call it, but we get to decide where things are placed. The current search engine paradigm goes directly against this notion.

    An apt comparison might be between your pantry at home and the grocery store. Right now, search engines are grocery stores. You have to go walk down the aisles, pick out the produce, put it in your cart, take it home, and put it in the pantry. We want search engines to be like pantries - you get the stuff you want, where you want it, and you don't get the rest.
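
To make the "bank" example concrete, here is a minimal Python sketch of a per-user tag lookup. Everything in it is hypothetical: the `PersonalTagIndex` class, its method names, and the example URLs are made up for illustration, and a real folksonomy service would store far more than this.

```python
from collections import defaultdict


class PersonalTagIndex:
    """Hypothetical index mapping (user, tag) to the URLs that user tagged."""

    def __init__(self):
        self._urls = defaultdict(list)

    def tag(self, user, url, tag):
        """Record that `user` filed `url` under `tag` (duplicates are ignored)."""
        urls = self._urls[(user, tag)]
        if url not in urls:
            urls.append(url)

    def lookup(self, user, tag):
        """Return only what this user put under this tag, in tagging order."""
        return list(self._urls.get((user, tag), []))


index = PersonalTagIndex()
index.tag("alice", "https://mybank.example/login", "bank")
index.tag("bob", "https://baseball.example/ernie-banks", "bank")
index.tag("alice", "https://news.example/", "ON_A_REGULAR_BASIS")

# Alice's "bank" is her bank, not Bob's baseball page.
print(index.lookup("alice", "bank"))
```

The point of keying on the user as well as the tag is the pantry idea: the same word can sit on a different shelf for each person.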

Now, you may argue that Google's PageRank is a step in the right direction, and I would agree except that it's easily hijacked. And that is really only the symptom of the actual problem - that Google results are impersonally organized. So now we have a bunch of websites tagged by you and tagged by users AND tagged by authors. How will the new semantic search engine display these?

Enter my Big Idea: the new semantic search engine will serve more as a pantry than a grocery store. And what's more, the new search engine will be entirely automated.

The concept is fairly straightforward:

  • A few main categories will be clearly delineated, but extremely simplistic, with perhaps 10-15 overall. They will be all-inclusive, but not necessarily mutually exclusive.
  • Within these categories, co-tags (tags that appear alongside these category tags) that match a certain frequency threshold will be listed as subtags. These subtags will themselves have subtags, which will have subtags - ad infinitum.
  • In short, the entire breadth of user tags will fall into our large directory tree. The tags themselves will be sorted by popularity, and will also be searchable for easy location.
  • The important factor here is that all tags are determined by everybody. This is direct democracy at its finest, and it is much harder to hijack, because popularity wins the day every time, and bullshit tolerance is low for tag hijacking. In addition, if you created a special tag, say, "kthejoker", for things you felt were important to you or about you, then this gets passed along, too.
  • These personal tags become important at the second stage, which is application of the user's own tags. URLs associated with the tags will be listed in the following descending order: relevance to the user * relevance to their query, relevance to the user, relevance to the query. Now we are back to our original discussion, relevant keywords.

    Relevant is relative, and search engines should be made to understand personal relevance. The new Semantic Web appreciates popularity but values relevance. Our search engine does the same. If 99% of your tags are "PHP" and you type "arrays" in the search box, the search engine had better turn up arrays in PHP first.
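
The steps above can be sketched in Python under some loud assumptions. The `BOOKMARKS` data, the function names, the 0.3 threshold, and both scoring formulas are invented for illustration. In particular, "relevance to the user" is taken here to be the average share of the user's own tagging that a URL's tags account for, and the three-part ordering is read as a lexicographic sort key. That is one possible reading of the idea, not a finished design.

```python
from collections import Counter

# Hypothetical data: each bookmark is (user, url, set of tags).
BOOKMARKS = [
    ("kthejoker", "https://php.example/arrays", {"programming", "php", "arrays"}),
    ("kthejoker", "https://php.example/strings", {"programming", "php", "strings"}),
    ("kthejoker", "https://php.example/sessions", {"programming", "php"}),
    ("someone", "https://js.example/arrays", {"programming", "javascript", "arrays"}),
    ("someone", "https://js.example/dom", {"programming", "javascript"}),
    ("other", "https://cooking.example/stew", {"food", "recipes"}),
]


def subtags(bookmarks, parent, threshold):
    """Co-tags appearing on at least `threshold` of `parent`'s bookmarks, most popular first."""
    tagged = [tags for _, _, tags in bookmarks if parent in tags]
    if not tagged:
        return []
    counts = Counter(t for tags in tagged for t in tags if t != parent)
    keep = [(t, n) for t, n in counts.items() if n / len(tagged) >= threshold]
    # Sort by popularity, breaking ties alphabetically for stable output.
    return [t for t, _ in sorted(keep, key=lambda p: (-p[1], p[0]))]


def user_affinity(bookmarks, user):
    """Fraction of the user's own tag uses that each tag accounts for."""
    counts = Counter(t for u, _, tags in bookmarks if u == user for t in tags)
    total = sum(counts.values())
    return {t: n / total for t, n in counts.items()} if total else {}


def rank(bookmarks, user, query):
    """Order URLs by (user relevance * query relevance, user relevance, query relevance)."""
    affinity = user_affinity(bookmarks, user)
    scored = []
    for _, url, tags in bookmarks:
        to_query = 1.0 if query in tags else 0.0
        to_user = sum(affinity.get(t, 0.0) for t in tags) / len(tags)
        if to_query or to_user:  # drop URLs relevant to neither
            scored.append(((to_user * to_query, to_user, to_query), url))
    scored.sort(key=lambda p: p[0], reverse=True)
    return [url for _, url in scored]


print(subtags(BOOKMARKS, "programming", 0.3))
print(rank(BOOKMARKS, "kthejoker", "arrays"))
```

With this toy data, a user whose tags are mostly "php" and who searches for "arrays" gets the PHP arrays page ahead of the JavaScript one, and pages that match the user but not the query still come before anything that matches neither. Calling `subtags` again with a subtag as the parent gives the next level of the tree.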

And that, as they say, is all she wrote. This is something of a preliminary sketch. I pondered the idea of a web portal for some time, and finally I laid my pen down and said, "Forget it." I went and played a video game, read a book, took a nap, talked to Courtney, and then I said to myself, "Well, I need a forum on the web for my Big Ideas." Hence I arrived at E2, to set down some ideas and the initial template for the new Semantic Web search engine.