Update - I wrote a caching e2 proxy. If you're in edev, you have the URL (check your inbox)... If you're not but you have a strong desire to see it in its horribly incomplete state, /msg s_alanet. And those of you viewing this page through the proxy... Salud! More information below under "The Implementation."

Introduction

E2 has some problems. We've got a huge database to store, and hefty bandwidth requirements. We don't have the cash to buy a shiny cluster to host the site, or a bigger, phatter pipe to feed it through. There isn't really cash to hire full time developers or editors, though we have a lot of really wonderful volunteers that should be paid.

I was thinking about the whole distributing e2 thing, and I realized that we need a stopgap. An interim measure. Maybe a long term measure (after all, making a fully distributed e2 system is a big task - it might not be done for years!).

I was also thinking about the difference between e2 and a blog. I like my friends' blogs because they're very personal. I look at whose I want. If they're fancy I can have little discussions in the comment boards with my friends. I post stuff, they post stuff back - it's a lot of fun. But not very interesting for other people. E2 is the opposite - it's all about high quality writing for everyone to enjoy. If I post a story about my dog, it had better be good, not just some drivel that only people who have seen my dog will appreciate. Factual information - which would be boring on a blog - has a strong presence here.

We also have a lot of neat tools - the e2 node tracker, the E2 Nodegel Visualizer, and a lot of really cool clients.

How - I thought to myself - could these wonderful things be combined? It seems like they're all connected... but how?

And then it all came together.

The Idea

An Everything2 Hub - or an E2 Blog, or whatever you want to call it.

What I have in mind is a website that you can drop into your webspace on your ISP or university. It's a Perl or PHP script that automagically does everything to set itself up - a seed, so to speak. It probably uses flat files for maximum compatibility.

  • It provides a blog synced with your daylogs. You post as you see fit - it deals with making sure the right daylog writeup on E2 is appended to, edited, or created. Sort of a blogger for the E2 set. It also displays the blog entries (including converting hardlinks to hrefs).
  • It shows statistics relevant to you - latest writeups, for instance. People who are interested in you can hit your e2 hub and see what you've written. Optionally, more statistics could be available - most popular writeups, "goodest" writeups, least voted on writeups, etc.
  • It provides a cached view of E2. Click on a link in the blog and it'll take you to a cached view of the node. You only hit e2's server if you want to.
  • It provides "user services" to the owner of the blog. An integrated node tracker would be a good start (and pretty much a necessary feature to do the statistics). A nodegel visualizer would also be useful - in fact, there could be a whole subsection devoted to managing one's homenode. In addition, a scratchpad would be good (saving E2 from rendering lots of pageviews of the scratchpad), and the ability to make rough drafts of writeups, then upload them when they're ready, would be even better.
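The hardlink-to-href step in the first bullet can be sketched roughly as below. This is only an illustration in Python (not the Perl or PHP the hub would actually use). It assumes the bracket syntax [node title] and [node title|display text], and the base URL in E2_NODE_URL is a guess at the lookup scheme rather than anything confirmed here. The function name is made up for the example.

```python
import html
import re
from urllib.parse import quote

# Assumed base URL for node lookups; the real URL scheme may differ.
E2_NODE_URL = "http://everything2.com/index.pl?node="

# Matches [Node title] and [Node title|display text].
HARDLINK = re.compile(r"\[([^\[\]|]+)(?:\|([^\[\]]*))?\]")


def hardlinks_to_hrefs(text: str) -> str:
    """Replace E2-style hardlinks with HTML anchors (hypothetical helper)."""

    def repl(match):
        title = match.group(1).strip()
        # Fall back to the node title when no display text is given.
        label = (match.group(2) or title).strip()
        url = E2_NODE_URL + quote(title)
        return '<a href="{}">{}</a>'.format(html.escape(url), html.escape(label))

    return HARDLINK.sub(repl, text)
```

Text outside the brackets is passed through untouched, so a hub that also wanted to sanitize the rest of the entry would need a separate step for that.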

Basically what I'm talking about is a web-based client integrated with a web-based user info tool. A hub for your everything experience.

The Implementation

In the process of writing this thing, I've found some lingering XML malformations in E2's code. I've submitted patches for these. If you're using the e2hub and you get a blank page, it's probably because you hit a page with an unescaped character, like a bare & or something.
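For anyone curious what a bare & does to a strict XML parser, here is a small Python sketch. It is not the code the proxy uses, and the helper names are invented for the example. It retries a failed parse after escaping any ampersand that does not already begin an entity reference.

```python
import re
import xml.etree.ElementTree as ET

# An ampersand that does not start a named, decimal, or hex entity reference.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(xml_text: str) -> str:
    """Turn stray '&' into '&amp;' while leaving existing entities alone."""
    return BARE_AMP.sub("&amp;", xml_text)


def parse_leniently(xml_text: str) -> ET.Element:
    """Parse XML, retrying once with bare ampersands escaped (hypothetical helper)."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError:
        # The second attempt can still fail if something else is malformed.
        return ET.fromstring(escape_bare_ampersands(xml_text))
```

This only papers over one kind of malformation. HTML-only entities such as &nbsp; look like valid references to the regex but are still undefined in plain XML, so they would need their own handling.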

I also went through and cleaned a lot of stuff up in the code... It caches user information, nodes, and displays them. I'm about ready to call it 1.0 - I have a few issues with Simpleton's wonderful E2Interface library, mostly relating to me using it in ways it was not designed to be used, and when I clear those up, I'll be posting the code to the hub somewhere.

Last updated 15:52 23/2

So I've actually started to code some of the stuff mentioned here. Specifically, what I have at this point is a web-based caching E2 proxy. In other words, it grabs data from the various XML sources in the site to generate an e2-look-alike. It's got 100 megs of cache space, and it keeps data cached for one hour. This is all tweakable. Right now it supports viewing nodes and users. If you try to view something else it will tell you to go back to E2 and look at it. The whole thing is templated.
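To make the "100 megs, one hour" idea concrete, here is a minimal Python sketch of a size-bounded cache with a time-to-live. It is an illustration under assumptions, not the proxy's actual code: the class and function names are invented, it keeps everything in memory rather than on disk, and it evicts the oldest stored entry first, which is just one plausible policy.

```python
import time
from collections import OrderedDict


class TTLCache:
    """In-memory cache bounded by total bytes and by entry age (illustrative only)."""

    def __init__(self, max_bytes=100 * 1024 * 1024, ttl_seconds=3600, clock=time.time):
        self.max_bytes = max_bytes
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.used_bytes = 0
        self._entries = OrderedDict()  # key -> (stored_at, body), oldest first

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, body = entry
        if self.clock() - stored_at >= self.ttl_seconds:
            self._remove(key)  # stale: drop it so the caller refetches
            return None
        return body

    def put(self, key, body: bytes):
        if key in self._entries:
            self._remove(key)
        if len(body) > self.max_bytes:
            return  # too large to cache at all
        # Evict oldest-stored entries until the new body fits.
        while self._entries and self.used_bytes + len(body) > self.max_bytes:
            self._remove(next(iter(self._entries)))
        self._entries[key] = (self.clock(), body)
        self.used_bytes += len(body)

    def _remove(self, key):
        _, body = self._entries.pop(key)
        self.used_bytes -= len(body)


def fetch_cached(cache, key, fetch):
    """Return the cached body for key, calling fetch(key) only on a miss."""
    body = cache.get(key)
    if body is None:
        body = fetch(key)
        cache.put(key, body)
    return body
```

Here fetch stands in for whatever actually pulls the XML from E2. Both limits are constructor arguments, matching the "this is all tweakable" point above.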

This is the key to the hub concept, in my mind. Without a cache - without an interface to E2 - then I'm just talking about fancy newsletters. Now that I have a pretty solid cache implementation, I can really go somewhere.

If you have any questions, feel free to contact me.

Extensions

It's also not hard to imagine such a tool being turned towards making E2 more approachable. How often have you heard, "It seems nice, but I just don't think I'd fit in..."? We are losing talented contributors every day because they don't know what to make of hot nude thespians. Sure, it's a part of E2's charm.... but we're reaching the end of where charm can take us. We need quality to reach the next stage of development.

How better to attract quality contributors for music writeups than a hub that focuses on, say, "Music Reviews"? Or a "Computer Science" hub that focuses on terminology and concepts? A "classical literature" hub? E2 is a big place. If we want to lure in experts and promote really quality writing, we're going to have to start filtering things to fit the audience. Metanodes are a good start, but they're not enough.

It should always be easy to start free-associating - clicking through the database, wherever the information flow takes you. But if we want to attract serious contributors, we need to be able to present them focused information.

Conclusion

There are a lot of other areas that this concept could be extended along - letting people check their messages from their hub, for instance, or making a hub "farm", sort of like Sourceforge does for software projects or LiveJournal does for blogs. Such a site could even charge people a small fee for advanced functionality, or faster service - thus helping E2 stay afloat.

Call for Comment

But of course, all this is just conjecture. I would like to make it, or something like it, a reality. E2 should live on! So please, if you have any comments, thoughts, or suggestions on this, drop s_alanet a message or e-mail me at <bgarney@purdue.edu>. I'll note them down here (as well as thanking you profusely for your willingness to tolerate my ramblings :). If you are struck by the urge to start coding this, talk to me, too... I'd like to help (I don't have time to write it on my own).


And the responses come in...

Ouroboros says re Everything2 Hubs: like a "distributed" E2?
you said "re hubs: Except here we have everyone talking to a central server. On Making E2 Distributed, I was talking about a fully distributed, peer to peer model. This is just a bit more hierarchical."
(That is to say, this is an extension to the existing architecture whereas distributed e2 in my mind is a total redesign.)

PhillC sez re: Hubs. I like the idea, but perhaps the easiest 4 hubs would be E2 People, E2 Places, E2 Ideas, E2 Things. They could all be easily pulled from the DB.
you said "re hubs: True... Though given the arbitrary nature of the people/place/idea/thing distinction, I think topical hubs would be more useful."

HongPong sez:

well i think it would be good to get better clients available for everyone. there is no good reason for scratch pads. however i think it will be more effective to look at where e2 is getting processor eaten. also i think it would be good to have a perl/PHP script which could generate fuzzy connections from your nodes to other topical nodes... within topics and those of your friends...

i kno from my metanodes that e2 has a broad-level anarchy to its organization. my general idea is to make a new 'metanode' node type which appears bold in the softlinks.. ie my Middle east would always appear bold. this would point people to topical indices... eventually fuzzy math + something like voting could be used to create topic fields. think fractal organization

have you seen this thing called 'helloworld' which works by making fuzzy collections of data. http://www.cooperatingsystems.com/helloworld/factsheet/index.html

i support anything making topical stuff and metanodes better. i think in particular it could be possible to flag certain nodes and have it generate others you might be interested in... but how to do this without crushing the server...

me sez:

"re hubs: A goal of the E2 hub is to offload things from the central server. I think that a generic mechanism to 'cache' unfinished writeups would serve nicely in that role, as it would give people a chance to collaboratively review or individually rewrite wus before posting them on the live server. The scratchpad is a very convenient part of e2, but it's also a source of database usage and system load.
A huge portion of e2's load is caused by the complexity of its pages. Every pageview results in dozens and dozens of complex queries, as well as in the execution of lots of code. Making the system more complex by adding filtering rules or nodetypes will do nothing to reduce pageviews caused by casual browsing, while making page renders more expensive. Quite the opposite of what we wish to accomplish!
On the other hand, if we write a hub that works to off-load as much as possible - both caching content for casual browsers, as well as providing a "draft then upload" paradigm for nodes in progress, we cut two expensive operations out of the loop - we reduce page renders for passive consumers by at least an order of magnitude, and page renders/database updates caused by writing/rewriting.
As of 1pm PST today, E2 has gotten fifty thousand hits. What if we could reduce that to 25k?"

(Addendum to that: helloworld does look pretty neat... but for E2, I think that a web based solution will be more amenable than a program to download and run.)

Content syndication isn't a new idea altogether. There are a lot of sites (including Slashdot et al) that have RDF feeds and other "tickers" that allow you to syndicate content to other places. If you want a Slashdot or Everything2 "What's new" box on your homepage, it's really quite easy. With a bit of XML::Simple and basic knowledge of CGI/mod_perl you can cough one out.
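As a rough idea of how little code such a box takes, here is a sketch in Python rather than Perl with XML::Simple. The feed layout (a newwriteups element containing writeup entries with title and author children) is invented for the example and is not the actual format of any E2 or Slashdot ticker. The function name is likewise hypothetical.

```python
import html
import xml.etree.ElementTree as ET

# Made-up sample feed; real tickers will use their own element names.
SAMPLE_FEED = """<newwriteups>
  <writeup><title>Tom &amp; Jerry</title><author>someuser</author></writeup>
  <writeup><title>Everything2 Hubs</title><author>otheruser</author></writeup>
</newwriteups>"""


def whats_new_box(feed_xml: str, limit: int = 5) -> str:
    """Render the first `limit` feed entries as an HTML list."""
    root = ET.fromstring(feed_xml)
    items = []
    for writeup in root.findall("writeup")[:limit]:
        title = writeup.findtext("title", default="(untitled)")
        author = writeup.findtext("author", default="(unknown)")
        # Escape feed text so it cannot inject markup into the host page.
        items.append("<li>{} by {}</li>".format(html.escape(title), html.escape(author)))
    return "<ul>\n{}\n</ul>".format("\n".join(items))


if __name__ == "__main__":
    print(whats_new_box(SAMPLE_FEED))
```

A real box would fetch the feed over HTTP on a schedule and cache the rendered HTML, so the source site is not hit on every pageview.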

What s_alanet has put together is really quite interesting. It has a lot of features of E2 and has some interesting ideas as to the right way to syndicate content, but what's the long term effect of this? It is tough to imagine a website that exists completely as a collection of content hubs (to use his language), working as mirrors of the whole. There are a lot of issues with breaking down a website into a distributed application as such.

The first of which is authentication. Now you can always shortcut this and say that this is guest user only, but that severely strips out the feature set that is available. In essence, you are using a third party to authenticate. It's like telling your friend or secretary what your password is, and to fetch your messages and preferences. The same problem exists with things like the XP tracker from cowofdoom. Now I've met Will, and I'll go kick his ass if anything happens to my password (I still don't use it anymore), but the point remains that you simply can't trust a random third party with your authentication credentials.

Another issue for full time syndicate use is that you can't guarantee that a particular hub would be syndicating the content correctly (ie: non-maliciously). There's nothing to stop them from redirecting your votes to their writeups or other things. Similarly, there is no safeguard against people changing words inside of your messages or otherwise misreporting what is available here.

dem bones says "Get the hell off my website."



You simply can't trust a downstream publisher. There is no secure end-to-end connection that can be guaranteed or authenticated in any way. Even open source websites don't have to report their source correctly. There's not really any way that you can compile the source yourself and run a "pure" binary. Before we ship any binary version of an E2 client on this website in an official capacity, I (or another code administrator) will have to inspect the code thoroughly and build it on one of our machines here. Trojaned binaries are a problem.

Some of E2's features, such as leaving softlinks, won't work if you are using a third party website to view it. Right there, I feel that this is a great drawback. Sure there are technical ways around this that I can think of, but they involve a lot of pageloads to the e2 server, and break a lot of the caching that is done.

The temptation to want to reduce load from E2 is great. I personally would love to see pages load lightning fast without kicking all of your asses off ;). The problems that exist with non-first-party endeavors as such are the same as those of trust in any P2P application. Without some sort of crazy encryption or verifiable digital signatures, this sort of thing wouldn't work because of the security implications.

On the other hand, what this does make is some cool ways to keep your writing on E2, and your contributions here on another webpage in sync. Until we have some of the cool content management features due in later this year, there are some neat things that you should be able to do on a homepage with this sort of item. I can see wanting to collect certain writeups and format them differently, or wanting to perhaps make your own sort of collections of writeups. We as E2 are merely publishers (in a legal sense), and you're free to reproduce your writeups wherever you see fit. Syndication of things like softlinks and stuff is kinda cool (if not an excusable rip-off of our content here).

In the end, the point of E2 on one hand was to create a website where you could read for hours and days without leaving. Seeing as you've all been here for years, I'm assuming it's working. Spreading it across hubs is a neat novelty idea, and I definitely applaud its efforts. It's quite a cool trinket, but it isn't right for E2 full time.

Open for comments. Anything longer than two messages please send to jaybonci@everything2.com
