CRUD - Create, Read, Update and Delete - is the very simple-sounding solution to what is actually quite a complicated question in software engineering.

When designing a system, how can you anticipate the future needs of the system? Say something as seemingly simple as a blog. You create a post, right? So that's easy. But then posts have comments. And an author. And some pictures. And some tags. Eventually, you end up with a very unwieldy database table, where every time you add a new piece of functionality, it gets a little bit more messy.

the basics

The secret is that instead of creating complicated databases with lots of fields in them, it's possible to carefully design a system where CRUD are the only operations you need.

In an carefully designed CRUD system, an administrator creates a blog post, which an user can read. The user can e-mail the administrator with a typo, so the admin can update the blog post. When it has finished being useful, it can be deleted.

So far so good, but it doesn't sound particularly clever, right? Well, let's see what happens in a bit, when we add commenting to the mix.

It's all about databases...

... Specifically, about relational databases. Each of the user interactions can be mapped directly to a basic SQL statement - Create is INSERT. Read is SELECT. Update is, er, UPDATE. and Delete is, well, DELETE.

The idea is that all the information stored across a system can CRUD'ed easily, by using the bare-bone-basics of database instructions and a simple database design. The simplicity of this means that you end up with a lot of database tables with a lot of rows in them, but the strength in this simplicity is that databases are fast, and it's easy to cache simple database queries.

In practice

To continue with our example above, let's say that our intrepid site visitor wants to vote on a particular write-up. One way of doing this would be to read the current vote field from the blog post table, add one to it, and write it back to the database - but that would mean that the user can 'Vote', and 'vote' isn't one of the actions used in this solution.

Instead, when casting a vote, the user creates a vote for a post. This might sound like exactly what we did above, but from a database point of view, it's quite different: In this case, User 12 casts a vote for blog post 98, which is simply dumped into the correct database table, without first having to do reads. Add a datestamp (always handy for stats and bug hunting) and a unique ID, and you have a 4-field database table which is really quick to work with.

To tally the votes cast by a particular user, simply SELECT COUNT(*) 'votes' WHERE user=12, and it returns a number. To tally all votes cast for a particular post, you can SELECT COUNT(*) 'votes' WHERE blog_post=98, and it returns a number.

In practice, it turns out that doing many simple database queries can be faster on than doing big, juicy queries on huge tables - and it scales quite well, too. To add a comment, you don't comment on the post - you create a comment in a comments table. To subscribe to changes to a post, you don't subscribe to the post, but you create a subscription in a subscriptions table - and so forth.

A different way of working

Nobody thought to teach me about databases when I first started programming, and I've got some legacy web applications where I had a single database, with a single table, and the table would be 80 fields across. Don't get me wrong, it worked fine, but the problem is that adding more functionality, and readability of legacy code, becomes cumbersome enough that it becomes a hurdle to ongoing development.