MySQL is a RDBMS. It's fast. Really fast. However, speed is no substitute for quality: MySQL lacks transactions and some other critical features that, say, Oracle does have. In other words, MySQL is, as of yet, unreliable and limited.

For a rant, see this document: http://openacs.org/why-not-mysql.html (also a lot of contrary points, too - don't take my word for these =)

Personally, I chose PostgreSQL because MySQL came with a nasty license and PostgreSQL is under BSD license. (As pointed out, MySQL is under GPL now...) I didn't knew about MySQL's other faults at that time. Now I know better.


Updatish: Other reasons:

  • Yeah, Postgres' feature set is cooler. Transactions are nice.
  • PostgreSQL handles complex queries better than MySQL.
  • What the heck is auto_increment? In PostgreSQL we do it with name SERIAL that automagically creates a sequence. Oh, right... let me guess, MySQL has no such features?
  • ...uh, yeah, MySQL was built by those Swedes, dammit. <rivalry type="neighboring countries">My support goes anywhere else but there.</rivalry> =) (I wish Solid's databases would be open-source so I could use a Finnish database... Okay, enough of this silliness.)
MySQL is a lightweight, fast relational database program distributed under the GPL license. It is very fast. It runs on many popular operating systems, typically Linux, but also Microsoft OSs and BSD. It is stable (except, in my experience, in the case of a local client crashing on Windows 98, but hey, that's what you get with Win98).

The website at www.mysql.com is quite simple to use and contains good documentation on MySql. The site refers to MySQL as "the world's most popular Open Source database server". The company sells support and services, commercial licences, pro versions and suchlike.

MySQL is a fine program, a worthy open source, whoops, free software endeavour and the right tool for some jobs. It is widely used to serve web content. it is good for applications where there is much reading and not much writing, and where a failure or inconsistency will not cost money.

However, in it’s current form it is not the same kind of program as any of the others that are marketed as relational database management systems. Compared to these programs, mysql is a toy.

If Oracle or Microsoft SQL Server is what you are used to, then you might rapidly decide that MySQL is not a DBMS, just a program that reads and writes files of data.

Wait a minute. Isn’t that what a DBMS is?

No, a DBMS is more than that. Many hackers have a very limited conception of what a SQL database system can do, based on using MySQL. SQL in its own way a powerful and subtle language. It can ensure that all the heavy lifting is done on the server, and the client just specifies what it wants done and is never stuck doing read-a-row, process-the-row, write-the-row, next row loops.

Or at least, that's the potential when SQL is fully implemented.

If you have spent a lot of time using an industrial-strength DBMS, you will rapidly become frustrated at MySQL's limitations. To be blunt, it has no balls (see below).

It seems that MySQL is popular in open-source circles partly because of the herd-mentality – MySQL is well-known so it the first choice, partly because its speed is seen as more important – (speed means scalability, doesn't it? Erm, E2 lag anyone?), and because of low expectations of what the DBMS can do, i.e. its users don't know any better.

Whilst they take pride in subtle hacks and comprehensive knowledge in C or PERL, they are content with this junior league implementation of SQL.

Leaving aside the commercial alternatives such as Oracle, IBM's DB2, Informix, Sybase and Microsoft SQL Server, there are free open databases such as Interbase and PostgreSQL that support the features that MySQL lacks.

But MySQL is fast. Of course. A motorbike will easily outpace a pickup truck. That’s because the bike’s engine drags less stuff around with it. But which one are you going to use to move house? Which one do you want to be transporting you when it crashes? In fact, PostgreSQL is known to perform dramatically better than mySQL in some high volume, high load scenarios.

But UNIX is the standard, isn't it? One of the more annoying things about MySQL is that it favours C and UNIX standards over ANSI SQL standards, for e.g. in string escaping and comments. This suggests just plain blinkered ignorance on the part of its creators.

The missing balls of MySQL

The missing left testicle: Atomic, Consistent, Isolated and Durable transactions.

BEGIN TRANSACTION
Update Account set Balance = Balance – 100 where AccountId = 1234
Update Account set Balance = Balance + 100 where AccountId = 5678
COMMIT TRANSACTION

The updates above should succeed together or fail together. If the second one fails, the first must be undone.

This is one good reason why MySQL is so fast – it can’t do this. It doesn't even try to meet the ACID transaction test. It doesn't have to keep logs and audit trails because it will never be called upon to abort and roll back a pending transaction.

It is useless for applications where data integrity is of utmost importance, as in the financial example above.

The missing right testicle: Nested queries.

Select * from Person where
  Person.email_address in
    (select email_address from account where balance > 1000)

Imagine that you downloaded a C compiler, started to use it, and then discovered that it didn’t support {} blocks inside procedures. Is that still C? This is what MySQL does to the SQL language.

This limitation also applies to Delete and update statements. For e.g. you cannot do this:

 /* remove all dead people’s addresses */
 Delete Address
 from Person
 where Address.PersonId = Person.Id and
 Person.Deceased = 1

As is rightly pointed out elsewhere, this limitation requires that you write code that does two sequential reads of the two tables, and then process rows on the client one at a time. Erm, what was that about a speed advantage?

Other missing bits that I have come across include


The latest versions of MySQL have indeed added a table format that allows transactions.

MySQL is catching up, soon it will reach the functionality that commercial databases had in oh, 1980 or so.

Now it should be noted that most databases do not feel the need to have more than one kind of table. It should also be noted that some of the performance optimisations that this feature allows the developer to make manually are performed automatically, in dynamic response to user load by more advanced servers. E.G. MS SQL server will store a table in memory automatically if the table is small and often used.

The SQL language goes back to IBM projects circa 1975 and has transactions around 1980. SQL databases are a very mature technology and we expect an incredible degree of functionally and reliability. We get it, even from free software like PostgreSQL.

The assumption that one table is one file is purely an implementation detail, and in fact this assumption is false in some higher-end SQL servers which store the entire database in one large file. This implementation detail should not concern the person writing the SQL query. But here it does.

What happens if you start a transaction, update tables of more than one format, then roll it back? Data in some tables can be restored, in others it can't. Oops. Transactions are supposed to be ACID, and C stands for Consistent. Rolling back part of a transaction and leaving the rest is never going to be consistent, but that's the best you can do here. ACID is better thought of as a property of the database as a whole than a property of a single table.

To say that a table is "doesn't need to use transactions because it is non-volatile data" is to miss the point by mixing up two different concepts - the frequency of an operation and the need for it to happen without error when it does happen. Furthermore, when a table is not updated, then how does using transactions for updates add overhead?

The fact that mySQL now supports multiple table formats, each with a subset of features, is an accident of history. It is cruft. Sure, as a mySQL user, you can get better performance by working with it, and taking a guess way up front as to which tables need to be transactional. But it is definitely not a reason why a new DB user should go with mySQL. Quite the reverse.

But if your database is so delicately poised on the knife-edge of performance viability that the change from transactional mySQL to non-transactional mySQL will make or break, then I heartily recommend that you shell out $100 for more RAM, and change over to PostgreSQL.

Sure it is poor mySQL database design to update transactional and non-transactional tables in a transaction. If you are already using mySQL and designing mySQL databases. If you want to arse around with multiple table formats and sorta-transactional databases, by all means use MySQL. You could, however, think about your database at a higher level, in terms of compliance to the SQL standard.

Or if you don't know what transactions are, and you will be serving lots of non-mission critical text, and you do not already have a real database running, mySQL might even be the right tool for your job.

The previous writeups are very dismissive of MySQL, but they're also badly misinformed. MySQL is unique in that it allows you to use several completely different types of tables; you can even mix table types within the same database. Contrary to what many people would have you believe, MySQL is ACID-compliant if you use the InnoDB or Berkeley DB table types.

MySQL supports the following table types:

  • MyISAM
  • The default MySQL table type. MyISAM tables are extremely fast and very efficient for storing and retrieving large amounts of data very quickly. Each MyISAM table consists of one data file and one index file, which makes MyISAM tables easy to move or back up, since you just need to copy a few files. As of MySQL 4.0, MyISAM tables also support query caching, which makes things very speedy for servers that handle lots of identical select operations.

    Unfortunately, MyISAM tables do not support transactions and will be corrupted if the MySQL server dies in the middle of a write. Luckily, MySQL has a very good table repair tool, which will almost always return a corrupted MyISAM table to working order with little or no loss of data.

    Coincidentally, Everything2 itself runs off a MySQL database using MyISAM tables.

  • MERGE
  • A MERGE table is a collection of two or more MyISAM tables with identical columns and keys that are merged together and can be used like one big table. This is very useful in situations where you have huge tables or tables that you'd like to split onto separate hard drives for speed purposes, but which you still need to access as one table. MERGE tables are particularly useful for logs, where you may want to compress old data to a low-size, read-only format, but you still need the recent data to be read-write.

  • ISAM
  • Predecessor to MyISAM, included with MySQL mostly for backwards-compatibility, but is slated to disappear in version 4.1. ISAM tables are basically the same as MyISAM tables, except that they are not binary portable across platforms, they can't handle tables larger than four gigabytes, they have smaller key limits, and dynamic tables tend to get more fragmented.

  • HEAP
  • HEAP tables are stored completely in memory, and are thus extremely fast. HEAP tables are excellent for storing temporary data. Just remember that all that data will go away forever when the server shuts down or crashes. It goes without saying that you probably shouldn't create a HEAP table bigger than the amount of available RAM in your server, since that'll just force the data to be swapped to disk, negating all the benefits of being stored in memory. Insert operations are roughly 30% faster on HEAP tables than on MyISAM tables, while selects are only around 15% faster due to MyISAM's excellent query caching features.

  • InnoDB
  • InnoDB tables are robust, transaction-safe and ACID-compliant, with complete commit, rollback and crash-recovery capabilities. They also support foreign key constraints, unlike other MySQL table types, and are designed to process huge amounts of data very efficiently. If you hear someone complaining about MySQL's featureset, they probably haven't discovered InnoDB tables. InnoDB is used on many large production sites, the most recognizable of which is probably Slashdot.

  • Berkeley DB
  • Berkeley DB is a high-performance transactional table type that's been around for quite a while, and is actually used in many other applications than just MySQL. Thanks to its high performance, small footprint, low overhead and good scalability, Berkeley DB is used in lots of switches, routers and other embedded systems.

Why on Earth would anyone need a database with more than one table format, you ask? Let's imagine a scenario. You've been hired to develop an online banking website for the Everything Bank in Everything, Kansas. The bank itself already keeps track of all its transactions using a very beefy, professionally-installed rack of servers running Oracle. For the online banking site, you aren't allowed to mess with this database directly, so you'll have to roll your own, which will then send updates to the master every night.

After doing lots of research, you decide you'd like to use MySQL for the task, due to its support for multiple table types. The most important table in your database is the transaction table, which will keep track of all the money that flows in and out of your users' accounts. You'll want to use InnoDB for this, so that you don't credit someone's account with $5,000 and then fail to debit the source account due to a server crash in the middle of the update.

You also need a table to hold user account information. This needs to be speedy and able to be updated quickly, and must also be easy to back up, although it doesn't need to use transactions because this is just non-volatile data like usernames, addresses, etc. You decide to use MyISAM for this table, which will give you a nice balance of speed, efficiency, and functionality without the overhead of transaction support that you won't be using anyway.

In the interests of security, you'll want to store as little user information as possible in browser cookies. You decide to assign each user a secure id via a cookie, and store any temporary session data in a table on the server. You use a HEAP table for this, because you'll be reading and writing to this table on almost every pageload.

It's important to the bank that you keep a detailed log of site usage, so you configure Apache to use MySQL for logging. Once you've got a few months of log data stored, you'll split old months off into a separate compressed read-only MyISAM table and use a MERGE table to generate reports from all these separate tables as if they were one table, which makes your job lots easier.

And there you have it. You've managed to use most of the table types MySQL supports all in one database, which results in a database optimized for a wide range of uses and well-suited for the main task it was intended for. Of course, most of the time you won't need all these features. In your job as a web developer, you rarely ever really need transaction-safe tables, so you tend to use MyISAM most of the time, which results in much better performance since that unused transaction overhead isn't there.

And oh, what a smart web developer you are.

Log in or registerto write something here or to contact authors.