So, it's November now, and we're well down the to-do list, but there's always more to do. It's an exciting time to be working on the site, and we're working super hard on both reliability improvements and new feature work. There are literally years of technical debt built up around these walls, and I am working full-tilt on shoring them up. I'm working to throw out a lot of code. There's a ton of junk here that the site doesn't need, and years of betas and unreleased features lying around. I'm going to be working to kill off the stuff that's making this place less nimble to develop for, so that we can really push this site forward. This log serves as a detailed record so that staff can easily help track down potential breakage, and it lists my thoughts and priorities day by day.


Last month

In October we really saw a huge amount of progress. I'll drop you the major highlights:

Solved the Google problem: We've been having a tough time trying to figure out how to interact with Google, Bing/Yahoo and Baidu. We're now generating correct metadata tags for description and language, along with noindex suggestions where appropriate (printable pages). This allows relevant, non-spammy links to our site to show up in the search engines. We are also publishing a sitemap (available from robots.txt) that gets updated once a day and pushes new nodes out to the search engines. This has been my number one priority, as new blood and interest in the site need to keep up so that we can continue to operate and keep the place interesting while I clean it out to really add some killer new features.

Eliminated a ton of technical debt: There's a lot more to go, but we're working very hard to eliminate years of decisions that, looking back, we wouldn't make now. I've deleted or pared down a lot of unused code, and a lot of code that was duplicated or simply broken. If you are the owner of said code, or there is something you missed, we can restore it out of the GitHub repository that powers the site.

Better operationalized webservers: I did a bunch of work for my own benefit to make the site easier to operate, so I can focus on code and less on kicking servers when they don't work.

Some legal stuff: I have also filed for a DMCA safe harbor provision so that we can appropriately handle DMCA copyright allegations if they ever crop up. This protects me and the site from liability. We'll be posting a DMCA policy shortly after the letter is accepted, but almost none of you will ever need to worry about it. The short takeaway from this is that it is the first business step to be able to "handle" images and other non-text based content sanely. It's not going to happen anytime soon (as there is plenty more to do elsewhere).

Engine improvements: I'll shift the burden of the explanation here to this month, as it is underway.

This month

I'm looking to get a feature out to users this month, but I'm not sure what it looks like yet, as it's still in the design phase. Once I have a few more details on the interface, I'll be good to go, but it very likely includes the nodeparam work from last month. This allows the application to tack any piece of data onto any type of node. I'm considering opening up the ability to attach information to book or movie reviews (like author, director, release date, rating, ISBN, that sort of thing), and have that as a bit of boilerplate information that goes along with a node or review. I'd like it to be crowd-sourced, so I'm looking at some other way for you to use your votes: perhaps as a way to verify that a particular piece of data looks right or that you agree with a classification or a piece of metadata. Like I said, the exact design of it is still floating around, but I'm excited about dipping our toe into crowd-sourced suggestive editing. Writeup licenses are also right around the corner, pending UI decisions.
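To make the idea concrete, here is a minimal sketch of node parameters with vote-based confirmation. It is written in Python purely as an illustration (the engine itself is Perl), and every name in it (`NodeParamStore`, `set_param`, `confirm`, `is_verified`, the threshold of 3) is hypothetical, since the actual design is still undecided.

```python
from dataclasses import dataclass, field


@dataclass
class NodeParamStore:
    """Hypothetical in-memory stand-in for a node parameter table."""

    # (node_id, key) -> value
    params: dict = field(default_factory=dict)
    # (node_id, key) -> set of user ids who confirmed the current value
    confirmations: dict = field(default_factory=dict)

    def set_param(self, node_id, key, value):
        """Attach a piece of data to any node; a new value resets confirmations."""
        self.params[(node_id, key)] = value
        self.confirmations[(node_id, key)] = set()

    def get_param(self, node_id, key, default=None):
        """Fetch a single key, which keeps lookups cheap and cacheable."""
        return self.params.get((node_id, key), default)

    def confirm(self, node_id, key, user_id):
        """Record one user's vote that the value looks right (one per user)."""
        if (node_id, key) not in self.params:
            raise KeyError(f"node {node_id} has no parameter {key!r}")
        self.confirmations[(node_id, key)].add(user_id)

    def is_verified(self, node_id, key, threshold=3):
        """Treat a value as verified once enough distinct users confirmed it."""
        return len(self.confirmations.get((node_id, key), ())) >= threshold
```

In this sketch a book review node could carry `author` or `ISBN` parameters, and a value only counts as verified after enough distinct users have spent a vote on it.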

There are a lot of projects in flight, so I'll be carrying forward some of last month's list. It's going to be listed as a stack of items that need to be addressed, so apologies for the format, but it's how I have to think about the complexities of the project:

    Top level projects
  • I'd love to work on a new layout, but there's a lot to do before we can sanely get there. The first thing we need to fix is how we are using Amazon's S3 to cache CSS and javascript (jscss.everything2.com)
    • S3 upload should happen directly from an update maintenance, which is doable, but requires some surrounding features
        Performance
      • The engine should support just-in-time minification of the Javascript and CSS
      • The Everything::S3 adapter needs to properly upload a regular and gzipped version of the file
      • We need to support marking a certain version of a CSS or javascript file as being what is in "production", so that is going to take a node parameter. This will prevent us from deleting it on upload, and we can allow developers to hit the webserver as the origin for a particular CSS or javascript file if they are working on it
      • The problem with S3, however, gets into the engine: currently S3 uses the version from the version table (the checkGlobalVersion type of stuff), which works great, but it breaks the test environment, and it can put us out of sync pretty hard if we get into it. I've changed the types to join on a new interim table (s3content), so that they can each manage their own individual sub-versions. However, to support this we have to implement a new pre-update maintenance type, so that the version can be set before the node is updated, and the post-commit update hook can then do the S3 uploading. I'm willing to bet that a pre-update handler for nodes is also going to allow me to unwind hacks in quite a few places.
    • Node backup also depends on better S3 integration
  • The days of themes are numbered. I need to wind down how the theme engine works, both for performance and because it is the major impediment to keeping ourselves agile and able to make changes quickly. Even if we say themes other than zen are unsupported, we have to figure out what codepaths are supported, and it is a huge issue to get done.
    • htmlpages do not go to the NodeCache right now because they are retrieved with getNodeWhere. There is a winding way in which we determine which page is fetched for which theme for which nodetype for which displaytype, but once themes are dead, we can make a straight decision tree and cache how this is done, rather than calculating it every pageload. We could do something with multiple themes, but it is burdensome to generate it for each theme and themesetting, so I just need to double down and kill themes.
    • I need to make sure we have CSS equivalents for the old themes so I can kill them and set everybody's variables properly as a part of migration so that it's painless.
    • Lucy-S noted a problem where XP is not being notified for the jukka dim emulation.
  • The concept of settings needs to die. Either the value is a tweakable item that can be solved with a production push, it is an arbitrary attribute on a node, or it belongs in a class of cached data objects that need to be represented as a packed JSON store, and not as the buggy, strange parameter assignment that these are now. This also means getting rid of $VARS eventually, and replacing it with one-off bits of data from the database. This approach lends itself very well to caching and only grabbing the keys that are needed when they are needed.
    • System settings especially needs to go. The entire %HTMLVARS structure causes us to have to bootstrap it during the in-server pageload. This lends itself to various bits of namespacing horror, but once it is gone, we can create an Everything::HTML emulation container that will allow us to pull htmlcodes into executable, in-library blocks, and start to remove them from being in the database. evalCode is our number one performance hit, by a huge, huge margin.
    • User settings will become node parameters which will have sanity, speed, and application-level logic surrounding each one, rather than the wild no-man's-land that we have now.
  • This will also help unwind the various namespace cross-pollution hacks that have put us in the mess that we are in. We need to finish the module consolidation and move Everything::MAIL::node2mail into Everything::HTML, and from Everything::HTML into Everything::Application over a few server pushes.
  • Mails going out from Everything::MAIL are not tagged as HTML mail, even though they are, and it looks really unprofessional.
  • Even more verbose crontabs on the servers (super, super low priority)
  • Eliminate the unneeded binaries so that there's less to clean up when we sweep for huge functional changes with grep
  • Move the rewrite conf into chef and out of the ecore/ tree, which is a weird place for it to live
  • Webserver stuff
    • Add in a reverse proxy rule for images, which might solve the Jukka Dim emulation problem noted above
    • Add in a reverse proxy rule for java chatterbox, move to S3
    • Add in a reverse proxy rule for TinyMCE, because having it off page breaks some stuff, including the html edit function
    • Increase the amount of memory allocated to each thread
  • Disallow <h1>s and possibly <h2>s in writeups, because they confuse Google. We might solve this with some kind of regex, but I have no idea how to solve this properly. Possibly with some CSS magic.
  • Category javascript isn't performing well
  • Our database transaction strategy has caused us some serious performance headaches, and continues to do so. It needs to be rethought and perhaps thrown out. Since we are moving more towards atomic transactions of individual properties and away from squishy implicit joins on node tables requiring cascading updates, we can consider its demise.
    • Because of this, new writeups is stuck on a 5 minute timer, and that needs to be adjusted. We might go as low as a minute, but it can't be updated as a part of the writeup update maintenance. It was being triggered on voting!
  • Node locking (as in the property on node) needs to die. Seriously, have you ever seen the page Node Locked? Yeah, that's what I thought. We just don't use it, and I don't think we're ever going to. It was more of a problem before atomic updates happened in InnoDB.
  • Small stuff:
    • Kill off isAdmin for something inside of Everything::Application
    • Kill off isInUsergroup for DB->isApproved, since that is really performant now.
    • Make it so that every usergroup doesn't "poison" the cache since they are permanent. Only mark system groups as permanent (gods, c_e, edev, chanops, etc)
    • Remove e2imagenode
    • Remove adminbar, and the admin nodelet, and supercloak. But to do that, I need to (and I kid you not) unwind the Everything::Mail problems above. I'm completely serious
    • Commit the patches to git, and delete a bunch of them, as they're often mucking up my search results. We don't need an archival record now that we have sane source control.
    • Take down the patch importer.
    • Update the email, and the email-sending capability, of the "verify your email account" flow. We are sending raw HTML as text, and the from address should be accounthelp@
    • We have to move the messaging system off of the htmlcodes and opcodes that it lives in now and into a library of some kind. It is madness, and message is our most fragile piece of code.
    • Node tracker keeps retitled writeups forever.
    • Add quick rename to security monitor
    • Egg commands should know about costumes.
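
As a rough illustration of the S3 items in the list above (per-node sub-versions set by a pre-update hook, then a post-commit hook that uploads a regular and a gzipped copy), here is a minimal sketch in Python rather than the engine's Perl. The `s3content` dict stands in for the interim table, and the function names, object key layout and headers are all assumptions made for the example; nothing here talks to S3.

```python
import gzip

# Hypothetical stand-in for the s3content table: node_id -> current sub-version.
s3content = {}

CONTENT_TYPES = {"css": "text/css", "js": "application/javascript"}


def pre_update(node_id):
    """Pre-update hook: bump this node's own sub-version before the save."""
    s3content[node_id] = s3content.get(node_id, 0) + 1
    return s3content[node_id]


def post_commit_uploads(node_id, extension, content):
    """Post-commit hook: build the plain and gzipped objects to upload.

    Returns {object_key: (body, headers)}; the key layout is made up here.
    """
    version = s3content[node_id]
    headers = {"Content-Type": CONTENT_TYPES[extension]}
    plain_key = f"{node_id}.{version}.{extension}"
    gzip_key = f"{node_id}.{version}.gzip.{extension}"
    return {
        plain_key: (content, dict(headers)),
        gzip_key: (gzip.compress(content), {**headers, "Content-Encoding": "gzip"}),
    }
```

Because each node carries its own version in this sketch, a test environment can bump versions without touching a global counter, which is the out-of-sync risk described above.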
So, without further ado, let's get on with it.


    Nov 2
  • Made Guest User unnukable
  • Made system settings unnukable
  • I need to kill off system settings. %HTMLVARS are horribly abused, and we need to move to a place where we can have a compiled execution context. To that end, I'm thinning out the keys in there so I have limited stuff to scrape out of the codebase.
  • Removed the systemsettings key from system settings. It was only there to make it un-nukable. Set the node parameter to do the same.
  • Made Piercisms Generator unnukable
  • Removed 'submissionsforquoteserver' from system settings. It was a maintenance node that doesn't exist anymore, and the key isn't being used.
  • Deleted all of the nodes from node heaven that were submissions for the Everything Quote Server. I have them on backup, I guess, but maintenance nodes really shouldn't be in node heaven.
    Nov 4
  • Worked on the backup system more, specifically a mechanism for pulling logs off of S3
  • Removed commented-out code in socialBookmark (opcode), so that I can tell where system settings are being used a bit better.
    Nov 6
  • Nuked hash2var, as no one was using it
  • Moved the htmlvars ed/gods logic out of coolit in favor of an E:A function
  • Changed the sendPrivateMessage logic in coolit to pass in a hash param instead of HTMLVARS
  • Changed Websterbless to allow all CEs and use the hash param version of sendPrivateMessage
    Nov 7
  • DonJaime reworked the login system so that we are no longer storing plaintext passwords in the system. We will need to work through some of the backlog of password hashing, but we're on the other side of it.
  • Got rid of the HTMLVARS carry for sendPrivateMessage in displayuserinfo, displayuserinfozen
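
For readers wondering what working through a password hashing backlog can look like, one common approach is to upgrade each account the next time its owner logs in successfully. The sketch below shows that pattern in Python; it is not DonJaime's implementation, and the record layout, the PBKDF2 choice and the iteration count are all assumptions made for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, not taken from the site


def hash_password(password, salt=None):
    """Return a salted PBKDF2-HMAC-SHA256 record for a password."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return {"salt": salt, "hash": digest}


def check_and_upgrade(record, attempt):
    """Verify a login attempt against a hypothetical user record.

    Legacy records look like {"plaintext": ...}; migrated ones hold
    {"salt": ..., "hash": ...}. A legacy record is rewritten in place
    the first time its owner logs in successfully.
    """
    if "plaintext" in record:
        ok = hmac.compare_digest(
            record["plaintext"].encode("utf-8"), attempt.encode("utf-8")
        )
        if ok:
            del record["plaintext"]
            record.update(hash_password(attempt))
        return ok
    candidate = hashlib.pbkdf2_hmac(
        "sha256", attempt.encode("utf-8"), record["salt"], ITERATIONS
    )
    return hmac.compare_digest(candidate, record["hash"])
```

Accounts that never log in again stay unmigrated under this scheme, so a real backlog cleanup would still need a separate pass for them.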