root log: April 2010 (log) by OldMiner

Hey guys! I never really got around to posting my first root log when I started, back in June of 2009, so I'd like to start out with some thoughts I wrote from way back then that lay out my philosophy on coding here. They come from when I had just gotten my shiny new splat and was more than a bit worried about breaking things.

Purpose of Root Logging

Complaints have been heard about the amount of verbiage we put into root logs lately. I felt it was important to clarify the multiple purposes these logs serve before I went on with all of my normal verbiage.

Root logs have three audiences:

the users of E2 at large
the present staff
the future coders

The first group, the users of E2, is our most important audience. What the staff of Everything2 does, we do for the people reading this. However, we coders, while doing our primary job, focus on the technical functioning of E2. The monthly log exercise gives us a chance to shift perspective to the human impact we have and to maintain communication with all you readers and writers out there. Without feedback from those of you using E2, it is difficult to determine how effective we are being. We're pretty much always going to like our code because it's ours. We rely on the direction of alex, the leadership of Oolong, and the feedback of everyone to determine whether we've succeeded in what we've tried.

On our side of the communication, if we do not express what we're trying to get done, we can look like shadowy stagehands shuffling madly behind the curtain. This is the main reason these logs are verbal as well as technical. We understand that our job on E2 can create the same sense of unease as a new mechanic working under the hood of your long-beloved automobile. But we love this car too, and we want you to know that! We make changes only because we expect them to make things better. And, personally, I'm frightened by the amount of negativity that has resulted from some changes which were small in programming terms but much larger in community terms. We never want to scare away a user with an unexpected change, and hopefully, by keeping communication clearer, we won't see these problems as much in the future.

For our second audience, the present staff, root logs serve the function that the clarification memo serves in large engineering projects. They allow us coders, who often presume that everyone is working under the same assumptions, to compare assumptions. We can clarify misunderstandings with each other, restate objectives, and evaluate our own progress to improve future work. Just as importantly, root logs let us compare notes with the directors of the site, who ultimately decide what we concentrate on. This gives them a regular chance to correct our course if we get a little too far off the map pursuing pet projects.

Finally, root logs provide a snapshot of past priorities, bugs experienced, and a state of the site from a coder's perspective. This allows us to more easily track down problems we caused but did not notice immediately and to passively teach newer coders how we did things.

Towards these ends, I try to write my root logs in inverted pyramid style, with general interest items first and increasingly technical talk as I go on. I use a lot of headings, which makes it easy to skim and skip sections. And I do my best to avoid jargon. Still, I lack brevity. That one I'm unlikely to ever overcome.

Node Titles and Accented Characters

E2 has had issues on and off relating to international characters. I could talk at length about the technical details, but that's not important right now. The short version is that, for a while, some nodes with characters like vowels with diaereses (e.g. ö, ä, ü) couldn't be reached. This is no longer a problem.

We only saw this particular problem recently because we had newer software on web3, which we just brought into rotation last month, when web6 keeled over.

The technical description of the matter, for those who are curious: This was caused by a combination of an updated version of Perl and a not completely updated version of CGI.pm. I initially updated both to the most recent versions available through apt. This gave us Perl 5.10, which changed the packing behavior of bytes of Unicode strings, and CGI.pm 2.79, which relied on the older packing behavior when URL-encoding strings. The result was that we took links, encoded them, and then, when they were decoded, got a different result than the original link. This made some nodes unreachable unless you were clever enough to either manually encode the URLs correctly or figure out the target node's node_id and use that directly. Let this be a lesson to me: Never use apt to manage Perl modules when CPAN works better. Once I updated CGI.pm using the CPAN module, all was right with the nodegel.
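To make the failure mode concrete, here is a minimal sketch of how the same title can turn into two different URLs depending on which bytes get percent-encoded. This is not the actual CGI.pm code. The helper names escape_bytes and unescape_bytes and the sample title are made up for illustration, and it only assumes the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: percent-encode every byte that is not URL-safe.
sub escape_bytes {
    my ($bytes) = @_;
    $bytes =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord $1)/ge;
    return $bytes;
}

# Hypothetical helper: turn %XX sequences back into raw bytes.
sub unescape_bytes {
    my ($escaped) = @_;
    $escaped =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $escaped;
}

my $title = "Mot\x{f6}rhead";    # sample title containing o-with-diaeresis

# The same character becomes different bytes under different encodings,
# and therefore different URLs.
my $url_utf8   = escape_bytes(encode('UTF-8',      $title));
my $url_latin1 = escape_bytes(encode('ISO-8859-1', $title));

# Decoding with the same assumption used for encoding gets the title back.
my $round_trip = decode('UTF-8', unescape_bytes($url_utf8));

# Decoding with a different assumption does not.
my $mismatch = decode('UTF-8', unescape_bytes($url_latin1));

print "$url_utf8\n";      # Mot%C3%B6rhead
print "$url_latin1\n";    # Mot%F6rhead
print $round_trip eq $title ? "round trip ok\n"    : "round trip broken\n";
print $mismatch   eq $title ? "mismatch harmless\n" : "mismatch breaks the link\n";
```

If the code that builds a link and the code that reads it back disagree about the byte representation, the decoded title no longer matches any node, which is the kind of symptom described above.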

Daylogs

Cool Man Eddie now makes the nodeshell for today's daylog link if it doesn't already exist. The bot choice was arbitrary and largely because I like Eddie and I thought using Klaproth would be a little more grim than most dayloggers' output. We might switch that to Virgil.

Credit and all, how it happened: Halspal had noted a while back that the links to the logs on the front page frequently go to "Findings:" pages, which is a bit confusing to new users. DonJaime mentioned how we had two ambiguous names for daylogs on one day, and then NanceMuse queried how one goes about posting the first daylog for a day, not being aware it was "first come, first nodeshell". I figured this showed pretty clearly we had an unintuitive interface. The lesson here is that the chatterbox is good for more than just wasting time.

Significant Events

Unplanned Downtime

On the positive side, 500 and 503 errors have been reduced a great deal. Something was done with web6, and I don't entirely know what, but Apache stopped crashing. I also changed the maximum number of file handles that pound could have open, so web2 stopped having as many errors trying to delegate requests to the backend webservers.

On the negative side, we have a continuing theme from last month. On March 30, all of Everything2 was down for a few hours. As we have seen before, heat had caused servers to shut down. To reduce issues with heat in the future, the dev server (web4) and tcwest/web5 were taken offline. For all of April the dev server was kept offline, and so development slowed to a crawl, including important work being done for our hosts.

The present setup for the site is quite anemic:

  everything2.com
 ---->   web2 ----*--- web3 ----*---> db2

That's right: we have just one server serving all of our requests. It's a pretty nice server all in all, but it's not enough. During high traffic periods, E2 drops some requests on the floor. As I understand it, we will not be adding new servers to this pool in the near future (a vague thing), so we'll do what we can to reduce the load from the code side. We still expect our servers to be moved to a more reliably ventilated location, which will at least make additional servers less of a threat to site health.

As of May 3rd, the dev server is back up, but for how long is uncertain. Having it up is a bit of a liability because of the heat it generates, but some coding is necessary, and it's pretty unsafe to do that on our lone production box.

April 1st

There's a tradition of doing something unusual around here for April Fool's Day. Last year, there was some name mixing and such which alex and Swap conspired to create. This year, kthejoker created a bit of an errand, which you can still follow, depending on how foolish you'd feel doing it a month late.

Coding Commentary: Experience Functions

This next section is pretty much just for those interested in E2's guts. This is a pretty simple coding tip.

You don't really need to understand Perl's package system to write code for E2, but it helps. Functions relating to voting, experience, and GP are mostly in the file Experience.pm (the 'pm' stands for Perl Module). Functions in this file are not available to be called by default. If you ever want to use one of these functions (such as getLevel() or hasVoted()) in your code, make sure that, at the top of your code, you have:

use Everything::Experience;

If you don't do this, your code will randomly fail, but it might appear to work without error when you test it. You can recognize an error like this because it'll report an issue locating Everything::HTML::getLevel, where getLevel is your target function.

The reason for this is twofold. First of all, Everything code runs inside of Apache, which uses mod_perl. This loads Perl into memory as soon as it's needed and keeps important data cached so future requests are served faster. There may be anywhere between 8 and 60 Apache processes running on E2's servers. When load is high, Apache spawns more processes, and when it's low, it kills the little-used ones. When a new process is started, it does not have Everything::Experience (and many other things) loaded. As soon as that process serves a page which has a use Everything::Experience; line, it loads those functions and they are available for every future call in that process. During high load times, code which needs these functions is more likely to be running on a fresh Apache process, and therefore more likely to have an error if it doesn't have the required use line.
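The "works when you test it, fails at random" behavior can be reproduced in a single script. The sketch below uses made-up stand-ins (Demo::Experience and Demo::HTML, with a hypothetical run_snippet helper) rather than the real Everything modules, and treats one Perl process as one Apache child.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Everything::Experience.
package Demo::Experience;
use Exporter 'import';
our @EXPORT = ('getLevel');
sub getLevel { return 3 }
# Pretend this package came from a file so that "use" can find it.
BEGIN { $INC{'Demo/Experience.pm'} = __FILE__ }

# Hypothetical stand-in for Everything::HTML, where node code gets eval'd.
package Demo::HTML;
sub run_snippet {
    my ($code) = @_;
    my $result = eval "package Demo::HTML; $code";
    return defined $result ? $result : "error: $@";
}

package main;

my @results = (
    # Fresh process: nothing has imported getLevel yet, so this fails.
    Demo::HTML::run_snippet('getLevel()'),
    # Some other page includes the use line, which imports the function.
    Demo::HTML::run_snippet('use Demo::Experience; getLevel()'),
    # The first snippet now works, even though it still has no use line.
    Demo::HTML::run_snippet('getLevel()'),
);
print "$_\n" for @results;
```

The first result is an "Undefined subroutine &Demo::HTML::getLevel" error and the other two are 3. Whether unguarded code works depends entirely on what the process happened to run beforehand.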

Coding Commentary: Stored Procedures and Stored Functions

Up until about the time I started this root log, we didn't use any SQL code stored in the database. However, as part of the changes I made to support range-style IP blocks, I introduced a stored function, ip_to_uint, which is used in IP Hunter. It saves us from writing the same procedure in multiple places. It changes a dotted IPv4 address into its equivalent unsigned int. This way we can do a range search on an IP already stored in the database in string notation, which is how we store IPs in most places. For instance, IP Hunter uses this WHERE clause: WHERE ip_to_uint(iplog.iplog_ipaddy) BETWEEN min_ip AND max_ip

CREATE FUNCTION `ip_to_uint`(ipin VARCHAR(255))
  RETURNS int(10) unsigned
  DETERMINISTIC
  BEGIN
    -- Weight each octet by its position: a.b.c.d = a*256^3 + b*256^2 + c*256 + d
    RETURN (CAST(SUBSTRING_INDEX(ipin,'.',1) AS UNSIGNED) * 256 * 256 * 256)
     + (CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(ipin,'.',2),'.',-1) AS UNSIGNED) * 256 * 256)
     + (CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(ipin,'.',3),'.',-1) AS UNSIGNED) * 256)
     + (CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(ipin,'.',4),'.',-1)  AS UNSIGNED))
     ;
  END
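For anyone who wants to check the arithmetic outside the database, here is a rough Perl equivalent of the same calculation. The cidr_to_range helper is purely hypothetical. It shows one way min_ip and max_ip could be derived from a range such as 127.0.0.1/24, and it is not how E2 actually fills those columns.

```perl
use strict;
use warnings;

# Perl counterpart of the ip_to_uint stored function.
sub ip_to_uint {
    my ($ip) = @_;
    my @octets = split /\./, $ip;
    return $octets[0] * 256**3
         + $octets[1] * 256**2
         + $octets[2] * 256
         + $octets[3];
}

# Hypothetical helper: turn "a.b.c.d/bits" into (min_ip, max_ip).
sub cidr_to_range {
    my ($cidr) = @_;
    my ($ip, $bits) = split m{/}, $cidr;
    my $size = 2 ** (32 - $bits);     # number of addresses in the block
    my $min  = ip_to_uint($ip);
    $min -= $min % $size;             # round down to the start of the block
    return ($min, $min + $size - 1);
}

my ($min_ip, $max_ip) = cidr_to_range('127.0.0.1/24');
my $candidate = ip_to_uint('127.0.0.200');
my $in_range  = ($candidate >= $min_ip && $candidate <= $max_ip) ? 1 : 0;

print ip_to_uint('192.168.1.1'), "\n";   # 3232235777
print "$min_ip - $max_ip\n";             # 2130706432 - 2130706687
print "in range: $in_range\n";           # in range: 1
```

The last comparison is the Perl version of the BETWEEN test in the WHERE clause above.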

I also wrote up a stored procedure for the purpose of cleaning up the IP range table after I did testing. It's not really useful for much anymore, but it demonstrates what we could be doing with stored procedures:

CREATE PROCEDURE `fixup_blacklistref`()
BEGIN
   DECLARE bl_id int;
   SELECT MIN(ipblacklist_id) INTO bl_id
     FROM ipblacklist
     WHERE ipblacklistref_id = 0
       OR ipblacklistref_id IS NULL
     ;
   WHILE bl_id IS NOT NULL DO
     INSERT INTO ipblacklistref
       ()
       VALUES
       ()
       ;
     UPDATE ipblacklist
       SET ipblacklistref_id = LAST_INSERT_ID()
       WHERE ipblacklist_id = bl_id
       ;
     SELECT MIN(ipblacklist_id) INTO bl_id
       FROM ipblacklist
       WHERE ipblacklistref_id = 0
         OR ipblacklistref_id IS NULL
       ;
   END WHILE;
   SELECT MIN(ipblacklistrange_id) INTO bl_id
     FROM ipblacklistrange
     WHERE ipblacklistref_id = 0
       OR ipblacklistref_id IS NULL
     ;
   WHILE bl_id IS NOT NULL DO
     INSERT INTO ipblacklistref
       ()
       VALUES
       ()
       ;
     UPDATE ipblacklistrange
       SET ipblacklistref_id = LAST_INSERT_ID()
       WHERE ipblacklistrange_id = bl_id
       ;
     SELECT MIN(ipblacklistrange_id) INTO bl_id
       FROM ipblacklistrange
       WHERE ipblacklistref_id = 0
         OR ipblacklistref_id IS NULL
       ;
   END WHILE;
END

We don't really have an automated way to document, patch, or run these, but I've gotten spoiled being able to use stored procedures outside of E2. Frankly, I'd really advise we avoid using these things at all. Functions require SUPER privilege in the database to be created and read, which only the root user has right now (for good reason). Keeping these things synched between dev and prod would require adding additional support to our existing patch system. And I don't think we have many on staff who can maintain this sort of code. So, er, what was I thinking? Guess I didn't really consider it a big issue at the time.

Coding Commentary: VARS, the Eighth Deadly Sin

This next section is pretty much just for those interested in E2's guts. I'm going to cover what VARS are, why they're awesome, and why, despite being awesome, I think they're terrible. Read on at danger of your own boredom.

All users on E2 have a set of values associated with them that we normally just call VARS. For things like your "mission drive within everything" on your homenode, your JavaScript preferences, and the order you like certain lists sorted, this is the catchall place for values that don't have any other home.

VARS are stored in the vars column of the setting database table. vars is presently a text column which can store up to 64 KiB. The setting_id column of the table is the same as the user_id of the associated user.

So we understand how this works, let's walk through an example. Say we have two variables:

hintSilly, value 1
wufoot, value "l:kill,c:vote,c:cfull,c:sendmsg,c:addto,r:social"

We take each name and value, URL-encode them both, and put an equals sign between them:

hintSilly=1
wufoot=l%3Akill%2Cc%3Avote%2Cc%3Acfull%2Cc%3Asendmsg%2Cc%3Aaddto%2Cr%3Asocial

Then we take each of those and put an ampersand between them to get the final text value we store in the database:

hintSilly=1&wufoot=l%3Akill%2Cc%3Avote%2Cc%3Acfull%2Cc%3Asendmsg%2Cc%3Aaddto%2Cr%3Asocial
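The walk-through above can be sketched in a few lines of Perl. This is an illustration of the format only, not the code E2 actually uses. The function names are made up, and the real implementation may treat empty values and unusual characters differently.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode anything that isn't URL-safe.
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

# Hypothetical helper: reverse of url_encode.
sub url_decode {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $text;
}

# Hash ref of settings -> the single string stored in the vars column.
# Keys are sorted only to make the output predictable.
sub vars_to_string {
    my ($vars) = @_;
    return join '&',
        map { url_encode($_) . '=' . url_encode($vars->{$_}) }
        sort keys %$vars;
}

# The stored string -> hash ref of settings.
sub string_to_vars {
    my ($string) = @_;
    my %vars;
    for my $pair (split /&/, $string) {
        my ($name, $value) = split /=/, $pair, 2;
        $vars{ url_decode($name) } = url_decode($value);
    }
    return \%vars;
}

my $stored = vars_to_string({
    hintSilly => 1,
    wufoot    => 'l:kill,c:vote,c:cfull,c:sendmsg,c:addto,r:social',
});
my $loaded = string_to_vars($stored);

print "$stored\n";
print "$loaded->{wufoot}\n";
```

Because = and & inside names and values get percent-encoded, splitting on those characters when reading the string back is safe.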

The nice thing about VARS is that they provide an easy way to store values associated with a user. A single line of code will create a value in VARS which will then automatically follow that user around forever.

$$VARS{important_information} = 'sex is great';

You can test this value anywhere else:

my $fornicationFan = $$VARS{important_information} || 'nope';

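This checks if the value is set for the current user. If it isn't set, the result is undef, so we assign the default value 'nope' instead. Note that || also falls back to the default when the stored value is false, such as 0 or an empty string.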
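One thing to watch with this idiom: || tests for truth, not for existence. A small sketch of the difference, using a made-up votesleft key and the defined-or operator // that arrived with Perl 5.10:

```perl
use strict;
use warnings;

# Hypothetical VARS where one value is set but false.
my $VARS = { votesleft => 0 };

my $with_or  = $$VARS{votesleft} || 'nope';               # 0 is false, so the default wins
my $with_dor = $$VARS{votesleft} // 'nope';               # 0 is defined, so 0 is kept
my $unset    = $$VARS{important_information} // 'nope';   # really unset, so the default wins

print "$with_or $with_dor $unset\n";   # prints "nope 0 nope"
```

For settings where 0 or an empty string is a meaningful value, // (or an explicit exists check) is probably the safer choice.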

Pros and Cons

VARS have a few advantages:

No special code to create them
Automatically available at no additional cost on each pageload
Automatically saved with no additional code at the end of each pageload
Don't add database cruft if they're being used for test code which doesn't wind up going live

However, VARS have some major disadvantages:

All values are loaded and cached on multiple levels even though many are not used on most pageloads (it's worth noting, as presently implemented, a user's homenode text has this same downside).
Changes may not register due to race conditions.
Settings can't be updated or searched without scanning the entire setting table.
It's difficult to check and update when the underlying code that settings refer to changes — for instance, if we wish to replace a nodelet with another, it's a bit difficult and very inefficient to change this for all users.
VARS have limited documentation, which is getting sparser with time as more VARS are made.
There is potential to lose all values if a single value becomes too long.

The last item is fairly worrisome but also fairly unlikely. Presently the potentially-very-long values of your notelet text and your custom CSS styles are stored in VARS. If these get too big, random VARS values of yours could just disappear, which would be, at the very least, weird. I would consider myself the worst case scenario in this regard because I've got large chunks of JavaScript in my notelet and pages of custom CSS. Yet my VARS are just under 30 KiB. I'd have to more than double that to be in trouble.

Getting rid of VARS?

There was a previous discussion in edev about eliminating VARS which petered out without any final action. Right now, they are strongly discouraged. If anybody wants to make a new setting, my default response is that they should create a dbtable. Well, really, the default response is "Do you really need a new setting?" However, I acknowledge it's a lot harder to think out and create a dbtable than use VARS, and more code is better than perfect code, so I'm laying out my thoughts of how to eliminate VARS and replace them with things that should be just as easy to use and better suited for the purposes we use VARS for now:

Homenode Items

Things like "motto" and "mission drive within everything" can be displayed the same way they are now but stored in registries.

Long Text Items

Style Defacer and the notelets should be stored in a new nodetype, "userdoc", inheriting from document, which has an associated node_id and an associated user_id. The node_id, in the above case, would be the ID of Style Defacer. Associating the editing page with the setting provides automatic documentation. Further, this makes userdocs inherit any improvements to document such as change tracking. The big win here is that the values in the userdoc only get loaded (and cached) when necessary. Moving these out of VARS eliminates the primary risk of accidentally overwriting other VARS values.

Homenode text should be moved to a similar setup, and document removed from the dbtables associated with the user nodetype.

Options We Just Plain No Longer Need, Dangit!

We should ditch a bunch of the existing VARS values outright, stop checking them, and stop setting them. This includes many values that exist solely to support the obsolescent pre-zen themes. DonJaime has already gotten a start on taking out code which references old VARS. The more options, the more likely things are to break under a particular configuration and the harder E2 is to use. And, yeah, part of the reason I fell in love with E2 was precisely because it was so damned inscrutable. We can partition our inscrutability a little bit, though, so just changing the background color as a level 0 user isn't a crazy affair that can be done in thirty different ways.

Entity-Attribute-Value Maps

The below is "somebody should do this, and it'll be me soon enough if nobody else does". It'll be on the agenda, I hope, at the next coder meeting.

I dreamt of eliminating all VARS to reduce the size of loading a user. Sad to say, though, many of the remaining values are used on every page load for a user. For these items, we can keep the same syntax for VARS for now, but change the storage mechanism on the backend. As sam512 noted in the edev discussion on this subject, the standard way of storing this sort of stuff is in a table that's something like:

user_id   |   setting_name - varchar(30)   | setting_value - varchar(50)
------------------------------------------------------------------------
220       |   'cools'                      | '5'
220       |   'theme_id'                   | '5555555'

He also noted that, if the value is almost always set (like 'cools' or 'theme_id'), it belongs in its own database column, most likely on the user table. I'm on the fence on this one. I'm tempted not to move things for now just because it means changing a lot of code. I'm also a bit concerned about the user table getting wide, since that increases the size of the data -- granted, we gain nothing if we're always getting this data from the settings table anyway.

Further, sam512 noted that it's counter-intuitive to use a string to store numeric data, and a good amount of our settings are ints. We're going to bow to pragmatism here, though. We already store these values as strings, and having multiple columns or multiple tables to keep things properly type-separated would make things hairier and harder to index.

Note that I intentionally made the column lengths short. This is both a database speed and behavior-shaping decision. We want the index keys to be short, and we want long data items to be stored in other tables. Things like lists of nodelets and your node trail belong in a different storage mechanism that's oriented towards lists.

Minor implementation note: Care will have to be taken so that delete $$VARS{no_thin_border} still wipes out the row in the database table when VARS are written back, not just blanks it. This works automatically with the current single-string VARS.
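One way to get that delete behavior is to compare the settings as they were loaded with the settings as they stand at the end of the pageload, then write only the differences. The sketch below is an assumption about how that could look. plan_vars_writes is a made-up name, and the actual SQL is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: work out which rows to write and which to remove.
# Takes the settings as loaded and as they are now (both hash refs).
sub plan_vars_writes {
    my ($loaded, $current) = @_;
    my (@upserts, @deletes);

    # New or changed settings need an INSERT or UPDATE.
    for my $name (sort keys %$current) {
        my $value = $current->{$name};
        push @upserts, [ $name, $value ]
            if !exists $loaded->{$name} || $loaded->{$name} ne $value;
    }

    # Settings that were loaded but are now gone need a DELETE,
    # so the row is removed rather than just blanked.
    for my $name (sort keys %$loaded) {
        push @deletes, $name if !exists $current->{$name};
    }

    return { upsert => \@upserts, delete => \@deletes };
}

my %loaded = (cools => '5', theme_id => '5555555', no_thin_border => '1');
my $VARS   = { %loaded };

$$VARS{cools} = '4';
delete $$VARS{no_thin_border};

my $plan = plan_vars_writes(\%loaded, $VARS);
print "upsert: $_->[0] = $_->[1]\n" for @{ $plan->{upsert} };
print "delete: $_\n"                for @{ $plan->{delete} };
```

A side benefit of diffing like this is that unchanged settings generate no writes at all.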

Entity-Attribute-List Maps

These are about the same as the previous value items, but I felt it was worth calling them out separately. In multiple places, such as the list of nodelets, we use VARS to store a comma-separated list (e.g. ",1,3,90,50023,3"), and we almost always add or remove a single item from this list. I haven't enumerated all of the use cases, but I'm thinking of a table structure of the sort:

user_id   |   setting_name - varchar(30)   | setting_ordinal - int | setting_value - varchar(50)
------------------------------------------------------------------------------------------------
220       |   'nodelets'                   | 0                     | '4444444'
220       |   'nodelets'                   | 1                     | '7777777'
220       |   'nodelets'                   | 2                     | '9999999'

Since we already have the $$VARS hash, lists of this sort could be array refs. Then straightforward operations like the below would work.

# Set this value to a new list
$$VARS{nodelets} = [ 4444444, 7777777, 9999999 ];
# Remove nodelet 2222222 if it exists
$$VARS{nodelets} = [ grep { $_ != 2222222 } @{$$VARS{nodelets}} ];
# Add 3333333 to the end of this list (even if it's already in it)
push @{$$VARS{nodelets}}, 3333333;
# Return a list of your node trail
return map { "<li class='trail'>" . linkNode($_) . "</li>" } @{$$VARS{node_trail}};

Doing this will require joining across an additional table when pulling in user vars and using a bit more CPU and RAM to construct the structure, so maybe it's a bad idea. It would make coding some things easier, as opposed to keeping things "just as easy", which is what I'm aiming for with all of the previous sections.
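For a sense of the extra work involved, here is a rough sketch of converting between table rows and array refs. The row layout follows the hypothetical table above, and the function names rows_to_lists and list_to_rows are invented for the example.

```perl
use strict;
use warnings;

# Rows as they might come back from the list table, in no particular order:
# [ setting_name, setting_ordinal, setting_value ]
my @rows = (
    [ 'nodelets', 2, '9999999' ],
    [ 'nodelets', 0, '4444444' ],
    [ 'nodelets', 1, '7777777' ],
);

# Hypothetical loader: build one array ref per setting, ordered by ordinal.
sub rows_to_lists {
    my (@rows) = @_;
    my %lists;
    for my $row (sort { $a->[1] <=> $b->[1] } @rows) {
        my ($name, undef, $value) = @$row;
        push @{ $lists{$name} }, $value;
    }
    return \%lists;
}

# Hypothetical saver: renumber the ordinals from the list positions.
sub list_to_rows {
    my ($name, $list) = @_;
    return map { [ $name, $_, $list->[$_] ] } 0 .. $#$list;
}

my $VARS = rows_to_lists(@rows);
push @{ $$VARS{nodelets} }, 3333333;

my @to_store = list_to_rows('nodelets', $$VARS{nodelets});
print join(' | ', @$_), "\n" for @to_store;
```

Renumbering every ordinal on save is the simplest approach, though it does mean rewriting the whole list even when only one item changed.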

Enumerated Patches

Below is an item-by-item list of all patches created during this month. It was partially machine-generated, but inspected and edited by hand. It might be boring, but it provides my understanding of every change that happened. This gives some helpful accountability in case stuff blows up long after the trigger for the explosion has been set.

I'm going to attempt to make one of these for each month, going backwards in time until we're covered back to June 2009, when I was intending to start this. When I post my May root log, February's patch list will go up in the February root log, and so on until we're covered.

Persistent abuser identifiers: by avalyn, Oolong, and in10se - Several improvements were made to admin-specific tools to easily identify and exclude the rare persistent trolls, spammers, and abusive users.
displayWriteupInfo: by kthejoker - April Fool's code
IP Blacklist: by OldMiner - Stop incorrectly reporting that an IP is already banned
Create a New User: by OldMiner - Two patches, one to be slightly more stringent when checking for valid emails, then one more to fix the now-overly-stringent check which didn't like dots in the username (e.g. Ye.Old.Miner.Not.Real@gmail.com)
writeupsNuked100: by OldMiner - Fixed to only count writeups. It used to count all nuked nodes, including nodeshells, for the "Welcome to Hell, Here's Your Accordion" achievement
admin toolset: by OldMiner - Added extra space between the "Delete Node" link and other links so it's even less likely to be clicked accidentally
Everything User Search: by OldMiner - Two patches. The first was to show editors when a writeup (and/or its e2node) has a node note and to reduce SQL queries -- we did several queries in an inner loop before. The second was to fix a bug in the first patch which made this information visible to authors, so you could tell if there was a note on your own writeups. Whoops.
nodenote: by OldMiner - Continuing the note display stuff, made node notes for parent e2node show up when viewing a writeup (before you only saw these if you clicked to the "full" view of the node)
private message XML ticker: by OldMiner - At the request of someone in clientdev, I added the requesting user's title in the XML so an XSLT parsing this can make a nice display without any state information.
zensearchform, Scratch Pads: by OldMiner - When I upgraded CGI.pm as part of fixing high-bit character handling in titles, it broke some of the weird calls we made to it. I unweirdified these calls so these worked again.
pollvote: by OldMiner - Added "use Everything::Experience" for hasVoted()
daylog: by OldMiner - Made Cool Man Eddie create the daylog e2node if it's not already there. This way, the daylog link is never dead.
Security Monitor, SQL Prompt, user maintenance delete: by OldMiner and alex - I added logging of the rare case of user deletions (only spam accounts which never logged in) as well as logging of SQL queries so I could look up and reuse old queries; the SQL query logging was nixed (and records erased) due to concerns about their sensitivity
IP Hunter: by OldMiner - Admin tool to investigate IPs. Made it deal well with deleted users
epicenterZen, Epicenter: by OldMiner - On pages which manually used setVars() on the current user, a user's C! and vote count, if 0, would be set to a string with a space because of the way the VARS-saving code works. This would evaluate to true, so you'd get screwy statements like "You have C!". Made it use "int" when checking C! and vote counts so that calling setVars doesn't make them display weirdly when they are 0.
The Old Hooked Pole: by OldMiner - Created this tool for deleting spam accounts and made it not break on usernames with spaces in them
log ip: by OldMiner - Two patches to reduce unnecessary logging. Excluded more unroutable addresses (previously, we just ignored 192.168.1.1, now we exclude all local address spaces). Stopped repeatedly logging IPs if we've seen them recently from the same user.
vitsection_maintenance: by OldMiner - Add E2 Bugs and Suggestions for E2 to Vitals
listcode: by OldMiner - This is the htmlcode used when viewing the code for htmlcodes, superdocs, and such on E2. It already linked htmlcode() calls to the appropriate htmlcode. I made it do additional linking to calls to the nodelet section function so that the section htmlcodes are easily browsed. See epicenter to see it in action.
Vitals: by OldMiner - This used the old htmlcode("foo", "arg1,arg2,arg3") calling style. Changed it to use the normal htmlcode("foo", "arg1", "arg2", "arg3") form
showmessages: by OldMiner - I twiddled with this a little because I'm working on expanding its functionality for a new nodelet. Made the spacing consistent, removed dead and commented code
openform2: by OldMiner - In response to a bug report a while back about being unable to chat while editing a scratch pad, I made this save scratch ID if it's present so chatting from scratch pads without AJAX works
uploaduserimage: by OldMiner - I added a datestamp subdirectory to the generated filename when a user uploads a new image and then made Apache ignore this additional directory. This causes the source URL to change whenever you upload a new image, so users should no longer see the incorrectly cached old images
Everything's Most Wanted: by OldMiner - Encoded parameters to guard against XSS
Favorite Noders: by OldMiner - In response to a bug report from GhettoAardvark, I rewrote this nodelet to use favorites as they exist now (links) rather than as a (Yech!) VARS setting. There's some work to still be done here in the same bug report.
IP Hunter: by OldMiner - Indicate banned address by striking through them
dbtable display page: by OldMiner - Make "show row count" link a form button instead of a weird, ugly link. Fixing my own bad UI.
SQL Prompt: by OldMiner - Report rows affected for INSERT/DELETE/UPDATE statements
IP Blacklist, IP Hunter, blacklistedIPs, check blacklist/: by OldMiner - Multiple patches. Allow adding banned ranges such as 127.0.0.1/24. Although these can be added and viewed, I was hesitant to place a call to check blacklist into Create a New User without code review. Since no one volunteered, I'm going to add some logging for my own reassurance, then roll it out in June.
Create a New User: by OldMiner - Enable Everything2 AJAX by default.
Universal Message XML Ticker, Other Users XML Ticker II: by in10se - add md5 for gravatar support; this allows chatterbox clients to see a hash of your email address, which might accidentally result in identifying you if you've used the email address you signed up to E2 with for a gravatar somewhere else
Kernel Blue: by OldMiner - In response to a bug report a while back, and with assistance from DonJaime, add left/right margin to "Up"/"Down" buttons so it's harder to click the wrong one by accident
ilikeit: by DonJaime - There was some concern that people were still seeing repeated "Somebody likes..." from the "I like it!" links. DonJaime added an additional layer of prevention so that robots and search spiders never got the link -- he had already ensured that they couldn't follow that link. Just in case, I also added code to temporarily log all "like it"s by non-personally-identifiable user agent to see if spiders are hitting us a lot. (Gecko users are the most likely to show appreciation.) likeit logger (eds and gods only) displays this information.
Universal Message XML Ticker: by in10se - Allow specific 'backtime' parameter when requesting message inbox so that response can be nearly empty if no new messages have been received since last check.
