After a day here on Everything, I wrote an article called "my first day on everything" in which I lamented being short of XP. Someone, I don't know who, soft-linked me to "learn how to integrate", and I am now and forever in their debt.

From "learn how to integrate" I learned the importance of taking five minutes to follow the hard links in the writeup to create soft links back to me.

But, since I'm a little bit of a geek boy, I thought to myself, Why spend five minutes doing something simple when I can spend a couple of hours to write a program to do the very same thing?

So I did.

Here's a short Perl script that fetches your E2 writeup, follows every link it finds there, and so automagically creates the soft links for you.

If nothing else, it's a nice example of how to use HTML::Parser and LWP.


#!/usr/bin/perl -w

#
# $Id: e2autonoder.pl,v 1.2 2001/02/26 06:02:17 eric Exp $
#

use strict;

use HTML::Parser;
die "HTML::Parser needs to be version 3.x or higher!" 
    if ( $HTML::Parser::VERSION < 3 );

use LWP::UserAgent;


exit main();

##############################################################
############################################################## main
##############################################################
sub main {
    my $node_id = $ARGV[0] || return _usage();

    my $content = _get_e2_node( $node_id );          # download the node
    if ( !defined($content) ) {
        print STDERR "Error getting node no. $node_id.\n";
        return 1;    # exit non-zero on failure
    }

    my $parser = _e2_parser();                       # parse the node
    $parser->parse( $content );

    my %temp = map { $_ => 1 } @{$parser->{links}};  # make a unique list
    my @unique_uris = keys( %temp );

    _set_e2_soft_links( \@unique_uris );              # set soft links to node

    return 0;
}


##############################################################
############################################################## e2 interface
##############################################################
sub _get_e2_node {
    my $node_id = shift() || return undef;

    my $ua = LWP::UserAgent->new();
    $ua->agent( "e2autonoder" );
    my $req = HTTP::Request->new( "GET", 
        "http://www.everything2.com/index.pl?node_id=$node_id" );
    my $response = $ua->request( $req );

    return undef unless ( $response->is_success() );
    return $response->content();
}

sub _set_e2_soft_links {
    my $uri_ref = shift();

    my $ua = LWP::UserAgent->new();
    $ua->agent( "e2autonoder" );

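    foreach my $uri ( @$uri_ref ) {
        # visiting the link is what creates the soft link
        my $req = HTTP::Request->new( "GET", $uri );
        my $response = $ua->request( $req );
        if ( $response->is_success() ) {
            # this text shows up when the target node does not exist
            if ( $response->content() =~ />Here's the stuff:/ ) {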
                print "NAK $uri\n";
            }
            else {
                print "OK $uri\n";
            }
        }
        else {
            print STDERR "ERROR $uri\n";
        }
    }
}

##############################################################
############################################################## e2 node parser
##############################################################
###
### This is the E2 parser.  Its only job is to find and
### store the links in the actual writeup.  The writeup
### is defined as being the second table row after the
### topic (i.e. node name) is displayed.
###
sub _e2_parser {
    my $p = HTML::Parser->new( api_version => 3 );
    $p->handler( default => "" );
    $p->handler( start => \&_start, 'self, tagname, attr' );
    $p->handler( end => \&_end, 'self, tagname' );
    
    $p->{flag_topic} = 0;       # set when we get to the topic
    $p->{flag_writeup} = 0;     # set when $p is looking at the writeup
    $p->{links} = [];           # list of nodes linked to
    $p->{tr} = 0;               # incremented at <tr>, decremented at </tr>
    $p->{tr_rows} = 0;          # number of table rows we've seen

    return $p;
}

sub _end {
    my ($self, $tagname) = @_;

    if ( $tagname eq 'tr' ) { 
        ($self->{tr})--; 
        if ( $self->{tr} == 0 ) {
            $self->{tr_rows}++;
        }
        if ( $self->{flag_writeup} ) { $self->{flag_writeup} = 0; }
    }
}

sub _start {
    my ($self, $tagname, $attr) = @_;

    # an <h1> without a class attribute would trigger an "uninitialized" warning under -w
    if ( $tagname eq 'h1' && ( $$attr{class} || '' ) eq 'topic' ) {
        $self->{flag_topic} = 1;
        $self->{tr_rows} = 0;
        $self->{tr} = 0;
    }
    elsif ( $tagname eq 'tr' ) {
        ($self->{tr})++; 
        if ( $self->{flag_topic} && $self->{tr_rows} == 1 ) {
            $self->{flag_writeup} = 1;
        }
    }
    elsif ( $tagname eq 'a' && $self->{flag_writeup}) {
        my $cgi_param_string = $$attr{href} or return;
        $cgi_param_string =~ s/^.*?\?//;
        my $uri = "http://www.everything2.com/index.pl?$cgi_param_string";
        push( @{$self->{links}}, $uri );
    }

}


##############################################################
############################################################## miscellany
##############################################################
sub _usage {
    print <<__HERE__;
usage: e2autonoder.pl <node_id>
       where node_id is the id number of your writeup, *not* the full node.
__HERE__
}
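
The link-collecting part of the script (strip each href down to its query string, rebuild an absolute URI, throw away duplicates) can be tried on its own without touching the network. What follows is only a sketch of that step: the helper names href_to_uri and unique_in_order are hypothetical and not part of the script, the sample hrefs are made up, and the de-duplication keeps first-seen order, which the hash-based version in main() does not.

```perl
use strict;
use warnings;

# Hypothetical helper: rebuild an absolute E2 URI from an href, roughly
# the way _start() does. Returns nothing for a missing or empty href.
sub href_to_uri {
    my $href = shift;
    return unless defined $href && length $href;
    ( my $query = $href ) =~ s/^.*?\?//;    # drop everything up to the first '?'
    return "http://www.everything2.com/index.pl?$query";
}

# Hypothetical helper: remove duplicates while keeping first-seen order.
sub unique_in_order {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Made-up sample hrefs; the first and third normalize to the same URI.
my @hrefs = (
    '/index.pl?node=foo',
    'index.pl?node=bar',
    'index.pl?node=foo',
);

my @uris = unique_in_order( map { href_to_uri($_) } @hrefs );
print "$_\n" for @uris;
```

Because href_to_uri uses a bare return, an empty href simply disappears from the list that map builds, so no undefined entries reach the de-duplication step.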

I'll leave it as an exercise for the reader to edit the script to confirm that the links were made.

For hard links, see the E2 node autolinker.

Update (20010225): if a soft link can't be made because the target node does not exist, "NAK" is printed instead of "OK". Kudos to dmd for the suggestion.
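
The three outcomes the script prints for each link (OK, NAK, ERROR) boil down to a small decision that is easy to check in isolation. This is a sketch under assumptions: link_status is a hypothetical helper that does not appear in the script, and the HTML snippets are invented; only the marker text comes from the script itself.

```perl
use strict;
use warnings;

# Hypothetical helper: map a fetch result onto the script's three outcomes.
#   $is_success - true if the HTTP request succeeded
#   $content    - body of the response
sub link_status {
    my ( $is_success, $content ) = @_;
    return 'ERROR' unless $is_success;
    # same marker the script looks for when the target node is missing
    return $content =~ />Here's the stuff:/ ? 'NAK' : 'OK';
}

# Invented sample inputs, one per outcome.
print link_status( 1, '<p>a writeup</p>' ),         "\n";
print link_status( 1, "<b>Here's the stuff:</b>" ), "\n";
print link_status( 0, '' ),                         "\n";
```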