Knowing your filesystem is an essential part of using a computer, not
only on Linux but on every operating system.
This writeup looks at four things you will meet on the ext2 (Linux)
filesystem: regular files, directories, hard links and soft links.
Why such a complicated fs?
You should know that your harddrive is divided up into several "blocks".
The size of these blocks depends on how you formatted your harddrive.
These blocks can hold data. But what good is data without meaning?
"seven and four!". That made no sense to you, did it? If I were to
tell you that those are the ages of my niece and my nephew, it would make sense.
We have to label the data. These labels are called "filenames". The
problem is that we also need to store the label somewhere, and point it at
the data. This is where "inodes" (no, these have nothing to do with iMacs)
come in. The "i" is usually taken to stand for "index". And the node
part... well, you're on Everything, so you can make that out. These inodes
are placed on your filesystem. Each one stores the permissions, the owner,
the size, the timestamps, and which blocks on your harddrive hold the data.
The filename itself is not in the inode: it sits in a directory entry, which
pairs the name with the number of the inode. If you read a file, the name
is looked up to find the inode, and the inode is queried to find out where
the data lives. Your harddrive will then "seek" to the appropriate magnetic
data on it, and pass n blocks of data along to your CPU. The CPU will
process that data, and give it back to the program that asked for it (your
browser, say).
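
If you want to see what an inode holds for a real file, here is a minimal sketch in Python. It assumes a Unix-like system; the helper name describe_inode and the scratch file are made up for illustration. It only reads what the kernel reports through os.stat().

```python
import os
import stat
import tempfile


def describe_inode(path):
    """Return a few of the fields the kernel reports from the inode of path."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                        # the inode number
        "permissions": stat.filemode(st.st_mode),  # e.g. -rw-r--r--
        "owner_uid": st.st_uid,                    # numeric owner
        "group_gid": st.st_gid,                    # numeric group
        "size_bytes": st.st_size,                  # length of the data
        "links": st.st_nlink,                      # number of hard links
    }


if __name__ == "__main__":
    # Hypothetical scratch file, just so there is something to look at.
    with tempfile.NamedTemporaryFile(suffix=".txt") as f:
        f.write(b"seven and four!")
        f.flush()
        for field, value in describe_inode(f.name).items():
            print(f"{field:12} {value}")
```

The numbers you get will differ from machine to machine; the point is only which kinds of information come back.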
Of course, this is nice, but we don't want the age of my nephew anywhere
near your physics thesis. This is a job for directories. Directories
store several files, or more directories (which are then called
subdirectories). This way you can organize information in a way you can
find it again.
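
What a directory actually keeps is a list of names, each paired with an inode number. A small sketch that prints that list, again in Python on a Unix-like system; list_entries and the directory names "family" and "physics" are invented for the example.

```python
import os
import tempfile


def list_entries(directory):
    """Return {name: inode number} for everything directly inside directory."""
    with os.scandir(directory) as entries:
        return {entry.name: entry.inode() for entry in entries}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        # Keep the ages and the thesis in separate directories.
        os.mkdir(os.path.join(top, "family"))
        os.mkdir(os.path.join(top, "physics"))
        with open(os.path.join(top, "family", "ages.txt"), "w") as f:
            f.write("seven and four!\n")
        with open(os.path.join(top, "physics", "thesis.txt"), "w") as f:
            f.write("E = mc^2\n")
        for sub in ("family", "physics"):
            print(sub, list_entries(os.path.join(top, sub)))
```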
Introduction to file attributes
Files are used to store data. This can be textual data, like your thesis,
or "binary" data, like the game you play to get your mind off the thesis.
Binary data cannot be read by humans, but the filesystem doesn't care
about that, since it isn't human. Files on ext2 filesystems have
permissions and attributes. Here's the output of a command to show
permissions on a regular text file:
tribbel:~/docs/culture% ls -li hamlet.all.txt
98700 -rw-r--r-- 1 tribbel staff 200059 May 30 17:41 hamlet.all.txt
Just so you know, this is the full Hamlet play by William Shakespeare.
The output shows, from left to right:
98700 - The number of the inode which provides information about the
    file. Usually, this is irrelevant. But we need it to explain
    hard links later on.
-rw-r--r-- - The file type and the permissions. The first position (which
    is a - in this case) shows what kind of file this is: a - for a
    regular file, a d for a directory, an l for a soft link.
    The next three indicate the owner's permissions (in this case
    "tribbel"). The 'r' is read, the 'w' write, and the - in the third
    spot is where an 'x' would appear, which means execute.
    The next three are for the group's permission (staff). Same
    story here. The group can only read (r) this file.
    The last three are for everyone who is not the owner, and
    not in the group staff.
    The so-called "sticky bit" does not live in the first position: when
    it is set, a 't' shows up in the very last position. On old Unix
    systems it asked the kernel to keep a program in swap for faster
    execution. Linux ignores it on files; on a directory it means that
    only the owner of a file (or of the directory, or root) may delete
    or rename files in it.
1 - The number of hard links to the file or directory. We'll get to
    this later.
tribbel - The owner of the file.
staff - The group who "owns" this file.
200059 - The size of the file in bytes.
May 30 17:41 - The date and time the file was last modified.
hamlet.all.txt - The filename.
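
The ten characters of the permission column can be pulled apart mechanically. A rough sketch, assuming the usual ten-character form; explain_mode and the FILE_TYPES table are names made up for this example and only cover the three file types mentioned here. stat.filemode() is the standard-library call that builds such a string from the numeric mode bits.

```python
import stat

# Only the three types this writeup talks about; anything else is "other".
FILE_TYPES = {"-": "regular file", "d": "directory", "l": "soft link"}


def explain_mode(mode):
    """Split a 10-character ls mode string into type, owner, group, others."""
    if len(mode) != 10:
        raise ValueError("expected something like -rw-r--r--")
    return {
        "type": FILE_TYPES.get(mode[0], "other"),
        "owner": mode[1:4],
        "group": mode[4:7],
        "others": mode[7:10],
    }


if __name__ == "__main__":
    print(explain_mode("-rw-r--r--"))
    print(explain_mode("drwxr-xr-x"))
    # The same kind of string, built from numeric mode bits.
    print(stat.filemode(stat.S_IFREG | 0o644))
    # A directory with the sticky bit set ends in 't'.
    print(stat.filemode(stat.S_IFDIR | 0o1777))
```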
Here's one for a directory:
tribbel:~/docs% ls -ldi culture
98691 drwxr-xr-x 3 tribbel staff 1024 Sep 6 16:24 culture
As you can see, the first character in the permissions list is a "d".
This indicates that it is a directory. Also, note that it is executable.
For a directory the 'x' means you may enter it and reach the files inside,
so if it is not set you can't change to that directory. Directories take up
space, depending on how many subdirectories and files they have. Usually
that is one block, which is 1024 bytes (1 kilobyte) here.
The `3' is the number of hard links, and for a directory it tells you how
many subdirectories there are. A directory always starts with two links: its
name in the directory above it, and its own `.' entry. Every subdirectory
adds one more, because its `..' entry points back up. So with 3 links, this
directory has one "real" subdirectory.
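
The arithmetic is easy to check. The sketch below builds a throwaway directory with one subdirectory and compares "2 plus the number of subdirectories" with what the filesystem reports. The function names and the "plays" subdirectory are invented, and this is an assumption about ext2-style filesystems: some others (Btrfs, for one) count directory links differently, so the reported number may not match there.

```python
import os
import tempfile


def count_subdirectories(directory):
    """Count the real subdirectories directly inside directory."""
    with os.scandir(directory) as entries:
        return sum(1 for e in entries if e.is_dir(follow_symlinks=False))


def expected_links(directory):
    """Name in the parent + '.' + one '..' per subdirectory."""
    return 2 + count_subdirectories(directory)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        culture = os.path.join(top, "culture")
        os.mkdir(culture)
        os.mkdir(os.path.join(culture, "plays"))  # one "real" subdirectory
        open(os.path.join(culture, "hamlet.all.txt"), "w").close()
        print("expected:", expected_links(culture))
        print("reported:", os.stat(culture).st_nlink)
```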
Hard links
Now for the famous hard link. A hard link is fairly simple. It is basically
a second filename for the same inode. Normally when you copy a file it
becomes something like this:
NAME a -> INODE 234 -> DATA-START .... DATA-END
NAME b -> INODE 235 -> DATA-START .... DATA-END
There are now two inodes, pointing to two different places on the harddrive
with the same data. A hardlink works like so:
NAME a -> INODE 234 -> DATA-START .... DATA-END
NAME b _____/^
The data will be stored only once on the harddrive, and there is still only
one inode, but two names now point to it. This is what the link count in the
ls output keeps track of. Neither name is "the original": the data stays
around until the last name is removed. Simple, eh?
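
You can check this for yourself. The sketch below gives a scratch file a second name with os.link() and then looks at the inode number and link count behind both names. The helper make_hard_link and the file names are made up for the example; it assumes a Unix-like system and a filesystem that supports hard links.

```python
import os
import tempfile


def make_hard_link(original, new_name):
    """Give the file at original a second name; return (inode, link count)."""
    os.link(original, new_name)
    st = os.stat(new_name)
    return st.st_ino, st.st_nlink


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")
        b = os.path.join(d, "b.txt")
        with open(a, "w") as f:
            f.write("seven and four!\n")
        inode, links = make_hard_link(a, b)
        print("same inode:", inode == os.stat(a).st_ino)
        print("link count:", links)
        os.remove(a)                # take away the first name...
        with open(b) as f:          # ...the data is still reachable
            print(f.read(), end="")
```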
Soft links
Another concept is the soft link, or symbolic link.
These are much like hard links, only different (well, duh).
This is a visualization of a soft link:
NAME a -> INODE 234 -> DATA-START .... DATA-END
NAME b -> INODE 235 -> DATA-START "path to a" DATA-END
The soft link has an inode of its own, and its data is nothing more than the
path of the original file. The filesystem follows that path for you, which
makes you believe both names lead to the same place. The softlink
can be deleted at will, and the original file will not be altered.
When you "edit a softlink" you will actually be editing the original file.
Delete the original, however, and the soft link is left pointing at nothing.
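
A short sketch to poke at a soft link, assuming a Unix-like system. The helper inspect_soft_link and the file names are invented for the example. os.lstat() looks at the link itself, while os.stat() follows it to the original.

```python
import os
import tempfile


def inspect_soft_link(link_path):
    """Return the path stored in the link and the inodes on both ends."""
    return {
        "stored_path": os.readlink(link_path),        # what the link contains
        "link_inode": os.lstat(link_path).st_ino,     # the link's own inode
        "target_inode": os.stat(link_path).st_ino,    # the original's inode
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")
        b = os.path.join(d, "b.txt")
        with open(a, "w") as f:
            f.write("seven and four!\n")
        os.symlink(a, b)
        print(inspect_soft_link(b))
        os.remove(a)  # delete the original
        print("link still there:", os.path.islink(b))
        print("target reachable:", os.path.exists(b))
```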