It's up:

http://www-cgi.cs.purdue.edu/cgi-bin/amcclure/index.pl

You should ignore the following, and read the HELP file at the above URL. I am leaving the below for historical reasons..

(this is primarily for the edev people..)

See Degrees of Everything: Code. Or go to the URL http://charon.sjs.org/~mcc/epathfind.pl . I dare not post a huge PRE'd perl script to this place.. and keep in mind that this script could never ever actually be used. But it's certainly interesting to think about. So:

OKOK.. ummm... i feel really nervous posting this somehow :) Just feels like it's arrogant of me or something.. dunno. Heh.. uumm anyway here goes:

One of the most fascinating things about e2, i think, is the bizarre connections softlinks make between things. The softlinks on e2 give you something unlike anything else in the universe-- if e2 is, as mr. bones says, a neural network, then the softlinks turn it into a kind of a neural map of everything in the universe, linking bit by bit concepts and ideas in unpredictable but weirdly logical ways and letting you see how they connect to each other.. blah blah blah and other obvious melodramatic rambling like this. At any rate, follow the chain at random (you've done this, i'm sure), and it's fascinating.. you go from hokey pokey to dance liberation front to cabaret law to cabaret to springtime for hitler to swastika to jainism to Vishnu.. which has little or nothing, it seems, to do with the hokey pokey.

What i've thought would be fascinating, though-- i've been thinking this since shortly after i got here (it was, in fact, my first contribution to that little blob of pain that is suggestions for e2), and i have run across others thinking the same thing-- would be this:

Some kind of script that would find these odd chains for you, given predetermined endpoints-- take any two nodes and snarf through softlinks until it found the path of softlinks through e2 that took you from the one to the other. The whole idea is rather like that silly game involving Kevin Bacon, but on a far more transcendental scale (What is the connection between Legend of Zelda and Supply Side Economics? Between Prussia and Dance Dance Revolution?).

So.. curious to see if i could do it, i actually went and wrote the code to said superdoc. It's rather quick and dirty, and i didn't spend very long on it, but in my humble opinion it is somewhat elegant. (I don't think it works, though.) It's pretty self explanatory-- enter start node, enter end node. It also has a little form where you can give it a list-- separated by commas-- of nodes that you want to avoid. (This would be in case your search goes through some node that has been neurotically linked to a huge number of things that it does not actually relate to, say MR. T ATE MY BALLS or learn how to spell, and you want to know what a purely logical path would look like.) The script then gives you the first path it finds between the two nodes, the number of nodes it touched, the number of times the path it followed converged back on itself, and the number of times it hit the nodes in your avoid list.
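For anyone who would rather see the idea than read the Perl, here is a minimal sketch of the search described above. It is in Python, not the original script, and it assumes the softlinks are already sitting in an in-memory dict called `links` (a hypothetical stand-in for the database lookups a real superdoc would make). The names `find_path`, `touched` and `stats` are made up for illustration.

```python
from collections import deque

def find_path(links, start, goal, avoid=()):
    """Breadth-first search over a softlink graph.

    links: dict mapping a node title to a list of softlinked titles
           (hypothetical stand-in for real database queries).
    Returns (path, stats); path is None when no route exists.
    """
    avoid = set(avoid)
    touched = {start: None}  # node -> node it was first reached from
    stats = {"touched": 1, "collisions": 0, "avoided": 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the "reached from" chain back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = touched[node]
            return path[::-1], stats
        for nxt in links.get(node, ()):
            if nxt in avoid:
                stats["avoided"] += 1      # hit a node on the avoid list
            elif nxt in touched:
                stats["collisions"] += 1   # path converged on itself
            else:
                touched[nxt] = node
                stats["touched"] += 1
                queue.append(nxt)
    return None, stats
```

Because the search goes breadth first, the first path it returns is also one with the fewest hops. Passing a node in `avoid` and rerunning gives an alternate route, if one exists.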

(One really nifty idea for version 2.0, by the way, to keep in mind, that would be totally impractical now even if the script worked, would be this: Instead of storing in the touch hash simply the first softlink the script finds and then ignoring any future path collisions, store in the touch hash a reference to a list containing every node that references the key node_id. Then change things so that instead of stopping its search the instant it finds a solution, the script doesn't stop until all its queues are cleaned out. The point of this would be: Once this was done, instead of simply giving you a simple linear list, the script could actually show you all paths from node a to node b. It would use SVG (SVG is cool)-- construct a box with a title for each node, and line them up, one row for each PASS, drawing lines from box to box where the softlinks are. The resulting map would be huge but fascinating, showing all the ways that the single idea billows out into random thoughts and then converges back to a single point... (If i haven't clearly explained that, msg me and i'll try to fix it. I'm assuming you've read the code, though.))
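A rough Python sketch of that version 2.0 bookkeeping, under some loudly stated assumptions: `links` is again a hypothetical in-memory dict of softlinks, and to keep the result finite this only remembers predecessors that lie on a shortest route and stops at the end of the pass that first reaches the goal, which is narrower than the "don't stop until the queues are empty" plan. The SVG drawing is left out entirely.

```python
def all_shortest_paths(links, start, goal):
    """Return every shortest softlink path from start to goal.

    Unlike a plain search, each node keeps a list of ALL the nodes
    that reached it during the pass in which it was first found.
    """
    preds = {start: []}   # node -> every predecessor on a shortest route
    depth = {start: 0}    # node -> number of the pass that first found it
    frontier = [start]
    while frontier and goal not in depth:
        next_frontier = []
        for node in frontier:
            for nxt in links.get(node, ()):
                if nxt not in depth:
                    depth[nxt] = depth[node] + 1
                    preds[nxt] = [node]
                    next_frontier.append(nxt)
                elif depth[nxt] == depth[node] + 1 and node not in preds[nxt]:
                    # A collision within the same pass: keep it this time.
                    preds[nxt].append(node)
        frontier = next_frontier
    if goal not in depth:
        return []

    def walk(node):
        # Expand the predecessor lists back toward the start.
        if node == start:
            return [[start]]
        return [p + [node] for prev in preds[node] for p in walk(prev)]

    return walk(goal)
```

Each value of `depth` corresponds to one row of the imagined map, and each entry in `preds` to one line drawn between boxes.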
So, look at the code if you're interested or have done Everything work before.. and to anyone who does, i wish to deeply, deeply apologize for the shoddy state of the code. I probably could have been doing something more productive around here, and the testing i did was minimal-- i'll do some more myself, later, for now i'm just tired. I am not 100% certain what constraints superdocs should be written under, and i'm still not 100% sure which API calls, if any, should be preceded by $db-> or when you're dealing with a node object. But, still. There you are. Isn't it nifty?
(From msgs: (dem bones says I think it's been 'kind of' done once before but only in E1 where it was much simpler. Not sure, though, by time I came aboard the doc was busted. Malda did it. ... keep in mind I'm not sure that's exactly what Malda did. It was called the Everything Wandermeister or something stupid like that...maybe ask nate) (anotherone brought up the same node).. it looks to me tho like the wandermeister is something slightly different, just something that you feed it a starting point only, it follows 10 random softlinks and gives you the endpoint it wound up at and the path it took. Interesting, but different.. Hm. :shrugs:)

WHY THIS SCRIPT CAN NEVER, EVER ACTUALLY BE RUN.

Think about this for a minute. This script is almost certain to be disgustingly resource-intensive. Branching out, the search is bound to touch a simply huge number of nodes in a very small number of steps, having to carefully hash and process each one. The script works on sheer brute force-- as far as i can tell, this is the only way to do it, given the task-- and as such every run must eat a huge gob of memory and a huge gob of processor time and then effectively DoS the sql server with link requests. Putting this up, especially since e2 is already so overtaxed, would more than likely have a disastrous effect on the normal operation of the server, as five or six people run the e2 pathfinder over and over while the normal page views slow to a crawl.

However, i don't think the time i spent writing this was exactly wasted, and not just because i had nothing better to do. First off, the script could be useful to people running their own Everythings (I'm going to try to nodeball it and post it on everydevel once i can find a way to test it a little on my half-functional everything install across the room..)

Still, though, there may be ways that this script actually could be publicly available on e2, which would be mad cool. Post if you can think of any. For example, maybe calls to the e2 pathfinder node could somehow renice themselves to the lowest priority (that would be a nice value of 19; -20 is the highest priority) (can MySQL queries be "niced" or prioritized as such..?), and/or monitor resource usage and simply return an error if the system's resources dip low enough that continuing to run the script is going to cause a problem. Kinda like slashdot's overload mode, have the script shut off if e2 can't handle it right now.. That's about all i can think of. Well, except-- hm. I feel bad asking this, but question-- what exactly is being done with the e2 stat server right now?
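The "return an error if the system is too busy" idea could look something like the sketch below. It is only a guess at the shape of such a check: the names `check_load` and `ServerBusy` and the threshold of 4.0 are made up, `os.getloadavg()` is only available on Unix-like systems, and it says nothing about how busy the SQL server itself is.

```python
import os

class ServerBusy(Exception):
    """Raised when the machine looks too loaded to start a search."""

def check_load(max_load=4.0, read_load=None):
    """Refuse to continue if the 1-minute load average is above max_load.

    read_load is an optional callable returning the current load; by
    default the 1-minute figure from os.getloadavg() (Unix only) is used.
    """
    if read_load is None:
        read_load = lambda: os.getloadavg()[0]
    load = read_load()
    if load > max_load:
        raise ServerBusy(f"load {load:.2f} is above {max_load:.2f}, try again later")
    return load
```

A search loop could call `check_load()` before it starts and again every so often, and turn a `ServerBusy` into a polite "come back later" page.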


Dataknife: Isn't that the kind of thing that UNIX processes/"nice" are meant to handle for you automatically in a much easier and more efficient manner?

trikyguy: Well, first off, the script wouldn't be a map of softlinks-- the softlinks are the map. The script would just find the quickest path on that map.. Secondly if you get "frivolous" softlinks in your path-- say, you want to go from Virginia Woolf to Beethoven, and the path it returns just goes Virginia Woolf->Gorilla->Beethoven.. and you don't think that "Gorilla" makes any sense there at all.. you can just add Gorilla to the "avoid" list, and rerun the search, and it will give you an alternate path. Thirdly, while the script could be rather easily altered to use hard links instead of soft links (although doing so would make it even MORE processor intensive), this isn't a great idea, since hard links are likely to be MORE illogical than soft links-- see, in many people's posts, you will find that they have simply hard linked every other word, whether said hard link is to something obvious or unrelated, because of a persistent belief that if the author of the post does not meet some vague quota of links, they will be systematically downvoted. Still, though, you have a point.

There is, actually, a simple way to prevent this script from hosing the server.
Make it follow a configurable number of links, such as 5, one level deep and then sleep() for a few seconds. Repeat until the solution is found, or all possible nodes have been exhausted. This is much slower, but the goal is to find the path between the 2 nodes, not to crash the server in the attempt.
That having been said, I wouldn't run it either.
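The throttling idea above might be sketched like this. It is Python rather than the original Perl, `links` is a hypothetical in-memory dict of softlinks, and the reading of "follow 5 links, then sleep" as "pause after every batch of newly followed links" is an assumption. The `sleep` parameter exists only so the pause can be swapped out when trying the code.

```python
import time
from collections import deque

def throttled_search(links, start, goal, batch=5, pause=2.0, sleep=time.sleep):
    """Breadth-first search that rests after every `batch` links followed."""
    if start == goal:
        return [start]
    seen = {start: None}   # node -> node it was reached from
    queue = deque([start])
    followed = 0
    while queue:
        node = queue.popleft()
        for nxt in links.get(node, ()):
            if nxt in seen:
                continue
            seen[nxt] = node
            if nxt == goal:
                # Rebuild the path by walking back to the start.
                path = [nxt]
                while seen[path[-1]] is not None:
                    path.append(seen[path[-1]])
                return path[::-1]
            queue.append(nxt)
            followed += 1
            if followed % batch == 0:
                sleep(pause)   # give the server a breather
    return None
```

The total work is the same as an unthrottled search; it is just spread out over a longer stretch of time.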
Mapping Everything
My writeup should be on this node, but I think it should be titled as the above mainly because of the multiple interpretations.

First, I am new and don't claim to completely understand the system. However, it seems to me that the real problem in making a good map is the frivolousness of soft links. A soft link is created even when there is no connection and the order in which the nodes were viewed was just random. I think i read somewhere that soft links are strengthened if they have been created by two people. There needs to be a quantitative way of measuring this. Any map should take advantage of this so that the relatedness between two nodes is evaluated by the strength of the soft links between them. Also, there should be a method that lets random soft links be removed.
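One way to picture "relatedness by softlink strength" is a weighted shortest-path search. The sketch below assumes, without any confirmation from the engine, that each softlink carries a positive number saying how strong it is, and it treats `1 / strength` as the cost of following the link so that strong links are cheap. The function name and the `links` layout (node -> {neighbor: strength}) are made up for illustration. This is Dijkstra's algorithm, swapped in for the plain breadth-first search.

```python
import heapq

def strongest_path(links, start, goal):
    """Find the path whose summed 1/strength cost is lowest.

    links: dict mapping node -> {neighbor: strength}; strength is a
           hypothetical positive number per softlink.
    Returns (path, cost), or (None, inf) when no route exists.
    """
    best = {start: 0.0}
    prev = {start: None}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1], cost
        if cost > best[node]:
            continue  # stale heap entry, a cheaper route was found already
        for nxt, strength in links.get(node, {}).items():
            if strength <= 0:
                continue  # treat non-positive strength as "no link"
            new_cost = cost + 1.0 / strength
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    return None, float("inf")
```

With costs like these, a longer chain of strong links can beat a short chain of weak, random ones, which is roughly the behavior being asked for.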

From another angle, a hard link map would be interesting. Maybe the map of everything should take hard links into account.

A way to pause and shut down the script without having it lose state might be the key.
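A small sketch of what "pause without losing state" could mean in practice: the whole search state is just the queue of nodes still to visit plus the record of where each node was reached from, so it can be written to a file and picked up later. This is Python, not the original script; the function names, the JSON file format and the in-memory `links` dict are all assumptions made for illustration.

```python
import json
from collections import deque

def new_state(start):
    """Fresh search state: nodes left to visit and where each came from."""
    return {"queue": [start], "seen": {start: None}}

def run_search(links, state, goal, max_steps):
    """Advance the search by at most max_steps nodes, updating state in place.

    Returns the path if the goal was reached during this run, else None.
    """
    queue = deque(state["queue"])
    seen = state["seen"]
    result = None
    for _ in range(max_steps):
        if not queue:
            break
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = seen[node]
            result = path[::-1]
            break
        for nxt in links.get(node, ()):
            if nxt not in seen:
                seen[nxt] = node
                queue.append(nxt)
    state["queue"] = list(queue)
    return result

def save_state(state, filename):
    with open(filename, "w") as fh:
        json.dump(state, fh)

def load_state(filename):
    with open(filename) as fh:
        return json.load(fh)
```

A run that returns `None` with an empty `state["queue"]` has exhausted every reachable node; otherwise the saved state can be loaded and handed back to `run_search` whenever the machine has time to spare.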

I might run this script if i could get a DB backup from E2 which i could import into a spare computer, and just let the thing run. Maybe have a way to get a "diff" of the database to import into the system, one including just the new softlinks and node titles. Make the script work as just a Perl script connected directly to the database, without any form of webserver in the middle, or even at all.

Might be interesting to be able to queue the maps from a web based form, and cache the completed ones for future viewing. Whenever the first map completes, it emails the requestor, puts the link up onto a separate computer for viewing, and then loads the next one.

How much memory and HD space would this require? CPU time would be trivial, because this would just sorta run until completion. Would a 500meg HD work well as one big swap file to supplement the physical memory?
