Simple, static web pages still have their place alongside complex, dynamic, database-driven behemoths. Some web sites are small and simple enough for a collection of .html files to be the best method to store and serve them.
Now that most web browsers have the ability to accept gzip-compressed HTML, these files can be stored in a compressed format as well as the original uncompressed one. Modern browsers will be able to retrieve these smaller versions while less up-to-date browsers can still get the original, uncompressed files. This all happens automatically, so the people browsing your site won't need to do anything differently. This is good for two main reasons:
- Because the files requested are usually the smaller ones, the people browsing your web site can get them much more quickly. This can really help out those on slower internet connections.
- The smaller transfer sizes help you out as well: added up over many visitors, they use significantly less of your bandwidth.
It's worth noting, however, that there are some drawbacks:
- You will be storing two versions of your files, the original and the compressed one. This means that each time you update an uncompressed file, you will need to recreate the compressed copy of it.
- Compressed HTML files don't get put through the Server-Side Includes processor.
- With the language method (the first of the two methods described below), if any of your visitors have specified in their web browsers that they can't read the language your site is written in, they will get an error page with status code 406. This page links to the right file but warns that it's in a language they don't know. Technically this is what's meant to happen, but it may look unprofessional to serve these error pages.
The only problem is actually setting Apache up to send the compressed files. While compressing the output of a PHP script on the fly gets a mention on many web designers' sites, there doesn't appear to be much documentation about doing the same thing with static HTML files, besides verbatim copies of the mod_mime related part of Apache's documentation.
So here's a step-by-step guide, just in case you have a site made up of good, old-fashioned HTML files and want to save your visitors time and yourself some bandwidth.
The language method
- Edit .htaccess. You need to add three lines to it. First, add the line Options +MultiViews. This enables Apache to work out what a file is, based on its extension, and give it to the user based on their preferences. Then add the line AddEncoding x-gzip .gz so that it recognizes the gzip extension and knows that it's what the user wants if their headers specify that they can accept compressed files. Lastly, add the line AddLanguage en .en (or the equivalent for whichever language the files are written in). I'll explain why you have to do that in the next step.
- Rename the uncompressed HTML files. If your site's visitors request file.html and that file exists, Apache will give it to them, even if they can accept compressed files and you have a compressed version of that file. For this to work, a file cannot exist with the exact name that the visitors request. This is why you have to rename all your files so that instead of ending in .html, they end in .html.en, or the equivalent with your language's ISO 639 code. Don't worry; your site's visitors will be blissfully unaware that any of this is happening.
- Make compressed copies of the HTML files. Now you can make the compressed copies of your files. Older versions of the gzip program can't keep the original file as well as the new, compressed version (versions from 1.6 onwards have a -k option for this), so the portable way is to write each compressed copy to a new file like this: gzip -c --best file.html.en > file.html.en.gz. Remember that if you later edit one of the original files, you'll need to run the command again to replace the compressed version of it.
- Test your site. There are several ways you can test whether this all worked, most of which involve complicated things like using telnet or viewing the HTTP response headers in your browser. If you can do the latter, look for a line which reads Content-Encoding: gzip. Don't worry if you don't have such a fancy browser, though, as there's a much simpler way of testing which version of the files you are looking at: make a file called test.html.en containing a line such as This is a compressed page, then compress it to test.html.en.gz. Next, replace the contents of test.html.en with a line such as This is an uncompressed page. Then try to access test.html in your browser and see which message you get. When I tried this, Firefox, Amaya, Opera, ELinks, and even Internet Explorer retrieved the compressed file. Success!
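Compressing the files one at a time gets tedious on anything but a tiny site, so the compression step can be sketched as a small shell function. This is only an illustration: compress_site is a made-up name, the default pattern assumes the .html.en naming used above, and the -nt test is not strictly POSIX, although bash, dash and most other shells support it.

```shell
# Compress every matching file under a directory, keeping the originals.
# A .gz copy is rebuilt only when it is missing or older than its source.
# "compress_site" is a hypothetical helper name, not part of Apache or gzip.
compress_site() {
    dir=${1:-.}                # directory to scan (default: current directory)
    pattern=${2:-*.html.en}    # which files to compress
    find "$dir" -type f -name "$pattern" | while IFS= read -r src; do
        if [ ! -e "$src.gz" ] || [ "$src" -nt "$src.gz" ]; then
            # -c writes to standard output, so the original file is untouched
            gzip -c --best "$src" > "$src.gz"
            echo "compressed $src"
        fi
    done
}
```

Running compress_site with no arguments works on the current directory. Running it again after editing a page should rebuild only the compressed copies that are out of date. File names containing newlines are not handled.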
The filetype method
- Edit .htaccess. With this method, you only need to add the first two lines: Options +MultiViews and AddEncoding x-gzip .gz. You don't need to specify anything about languages.
- Make compressed copies of the HTML files. This is pretty much the same as in the other method. Just use gzip -c --best file.html > file.html.gz.
- Change the hyperlinks to exclude the file extensions. This is where this method differs from the other one: don't ever link to, say, file.html directly. Instead, link to file and let Apache work out which version of the file the people visiting your site want.
- Test your site. Again, this is very similar to the other method. Create test.html and test.html.gz with slightly different content, then try going to test and see which message you get.
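If you have curl installed, checking the response headers is less fiddly than telnet, and it works for either method. The sketch below assumes curl is available. has_gzip_encoding and check_page are made-up helper names, and the URL you pass in is whatever test page you created on your own site.

```shell
# Succeeds if the HTTP response headers read from stdin declare gzip encoding.
# Apache may report the encoding as "gzip" or "x-gzip", so both are accepted.
has_gzip_encoding() {
    grep -qiE '^content-encoding:[[:space:]]*(x-)?gzip'
}

# Fetch only the headers of a URL while advertising gzip support.
# "check_page" is a hypothetical helper; pass it a URL on your own site.
check_page() {
    # -s: quiet, -o /dev/null: discard the body, -D -: print headers to stdout
    if curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' "$1" | has_gzip_encoding; then
        echo "compressed"
    else
        echo "uncompressed"
    fi
}
```

Call check_page with the address of test.html for the language method, or of test for the filetype method. It only reports which version Apache chose to send; it doesn't tell you why.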
It's worth noting that both methods don't just work with HTML files; they work with all text files. You can link to ascii.txt and have the files ascii.txt.en and ascii.txt.en.gz ready to serve, for instance, or you can link to ascii and have the files ascii.txt and ascii.txt.gz ready.
You can also offer the pages in different languages, but that's for another guide.
You should hopefully now have a static web site that can send pages in both compressed and uncompressed formats (or at least know how to set one up, if you're a real geek and reading this just for fun!). Give yourself a well-earned rest.
To speed things up, here's an .htaccess file to copy and paste (the last line is only needed for the language method):
Options +MultiViews
AddEncoding x-gzip .gz
AddLanguage en .en