Just expanding jt's good writeup.
A pathname is a filename including its location in a directory tree. The word is used where "filename" alone would be ambiguous. For example, suppose I saved a file on my computer and gave it the name
fist.jpg
We usually call "fist.jpg" "the filename" of the file I saved.
But this is not really a complete description. The file probably resides on a filesystem with a directory structure, in a specific directory. A complete specification of the file's location must also say which directory of which filesystem. A pathname is such a specification.
A pathname can be absolute or relative. If absolute, it specifies the file's name and location all by itself; if relative, it specifies them relative to another location that is known from the context.
In Unix variants, all filesystems with available files are united into a single directory tree, using the mount operation. Therefore, if a file is on an available (mounted) filesystem, its location can always be specified as a sequence of directory names plus the filename itself.
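The concept and syntax of a pathname are hardcoded into the Unix system calls that take a file by name, such as open, stat, unlink and rename. (Calls such as read and write operate on an already opened file descriptor and never see the pathname.) A pathname is interpreted as a walk through the directory tree, as follows: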
- the pathname is chopped into components by splitting on the / character;
- an empty component specifies nothing; it is redundant;
- the components specify directory lookups, to be applied from left to right, starting at the root directory if the pathname is absolute (begins with /), or at a directory known from context if it is relative.
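The splitting step above can be sketched in a few lines of Python. This is only an illustration of the rules as listed, not the kernel's code: split_pathname is a hypothetical helper name, and the sketch ignores details such as the empty pathname and the special treatment some systems give a trailing slash.

```python
def split_pathname(pathname):
    """Return (is_absolute, components) for a Unix-style pathname."""
    # An absolute pathname begins with "/"; the walk then starts at the root.
    is_absolute = pathname.startswith("/")
    # Split on "/" and drop the empty components, which specify nothing.
    components = [c for c in pathname.split("/") if c]
    return is_absolute, components


print(split_pathname("/etc/syslog.conf"))  # (True, ['etc', 'syslog.conf'])
print(split_pathname("../a/./..//b"))      # (False, ['..', 'a', '.', '..', 'b'])
print(split_pathname("/"))                 # (True, [])
```

Note that "." and ".." survive the split: they are ordinary components that still have to be looked up, one after the other.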
The interpretation of a pathname as a file on the file system is known as pathname resolution. Every system call that operates on named files does so by accepting a pathname to the file as an argument and resolving it with this algorithm. This means that it is impossible on Unix variants to create, open, read or write files with / in their names.
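A small experiment makes the last point concrete. The sketch below assumes a Unix-like system and Python's standard library; try_create is a hypothetical helper, and the file names are made up. Asking for a file "named" a/b does not produce a name containing a slash: the system looks up a directory a first, and fails because it is not there.

```python
import os
import tempfile


def try_create(directory, name):
    """Try to create a file called `name` in `directory`; report what happened."""
    try:
        with open(os.path.join(directory, name), "w"):
            pass
        return "created"
    except OSError as exc:
        return type(exc).__name__


with tempfile.TemporaryDirectory() as tmp:
    plain = try_create(tmp, "plain.txt")
    # "a/b" is resolved as: look up "a", then create "b" inside it.
    slashed = try_create(tmp, "a/b")

print(plain, slashed)
```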
Some examples:
/ # specifies the root directory
/etc/syslog.conf # looks up "etc" in the root directory,
# then "syslog.conf" in the directory found
../a/./..//b # starting at the directory given by the context,
# consecutively looks up "..", "a", ".", "..", and "b"
In a consistent Unix filesystem, every directory has an entry "." that points to the directory itself, and ".." that points to some other directory in which the directory appears as an entry (or / if the directory is /), and the .. entries do not form any cycles.
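Two of these properties are easy to check from a program. The following is a minimal sketch, assuming a Unix-like system; it uses os.path.samefile, which compares the files that two pathnames resolve to rather than the pathname strings.

```python
import os

here = os.getcwd()
# "." names the directory itself.
dot_is_self = os.path.samefile(here, os.path.join(here, "."))
# ".." looked up in the root directory leads back to the root.
root_parent_is_root = os.path.samefile("/", "/..")

print(dot_is_self, root_parent_is_root)
```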
Nowadays, the use of . and .. to mean "the same directory" and "the parent directory" is also applied to filesystems that do not actually have . and .. entries in every directory.
Even in such cases, the directories are actually looked up on the filesystem.
This means that while ../a/./..//b can be simplified to ../a/../b, it cannot be simplified to ../b: the former will fail if ../a does not exist, is not a directory, or cannot be searched (the caller lacks execute permission on it), while the latter requires none of this.
This differs from relative URL resolution, which removes .. segments textually without actually looking up anything in a directory structure.
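The contrast can be observed directly. The sketch below assumes a Unix-like system (other systems may normalize pathnames before looking anything up); the directory names and the example.com URL are made up for the illustration. It builds a directory that contains b and cur but no a, then tries both forms of the pathname as seen from cur, and finally resolves the same relative reference as a URL.

```python
import os
import tempfile
from urllib.parse import urljoin

with tempfile.TemporaryDirectory() as tmp:
    cur = os.path.join(tmp, "cur")
    os.mkdir(cur)
    open(os.path.join(tmp, "b"), "w").close()
    # There is no directory "a", so the lookup of "../a" fails.
    long_form = os.path.exists(os.path.join(cur, "../a/../b"))
    # "../b" needs only ".." and "b", both of which exist.
    short_form = os.path.exists(os.path.join(cur, "../b"))

# URL resolution rewrites the dot segments as text; nothing is looked up.
url = urljoin("http://example.com/x/cur/", "../a/../b")

print(long_form, short_form, url)  # False True http://example.com/x/b
```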
The situation is complicated by the existence of symbolic links.
If an entry found during the lookup process turns out to be a symlink, the link is followed: its value is another pathname, which is resolved relative to the directory the symlink is in (or from the root directory, if the value is absolute), and the result is substituted for the entry. After that (if the value points to a valid directory or another symlink) the resolution process continues with the remaining components.
In Unix, and many other OSes, every piece of code runs in a current context that always defines, among other things, a current directory.
This is the directory against which a relative pathname is resolved.
The value of a symlink, however, is resolved against the directory in which the symlink appears. For example, the command
cd /; ln -s . home/rp/link; cd /etc; ls -l ../home/rp/link/.
will list the contents of /home/rp, not those of / or /etc.
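The same experiment can be repeated without touching the real /home, by letting a temporary directory play the role of /. This is a sketch under that assumption, for a Unix-like system on which symlinks can be created; the directory names simply mirror the command above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = os.path.realpath(tmp)  # stands in for "/"
    os.makedirs(os.path.join(root, "home", "rp"))
    os.mkdir(os.path.join(root, "etc"))
    # Equivalent of: ln -s . home/rp/link
    os.symlink(".", os.path.join(root, "home", "rp", "link"))
    # Equivalent of: cd /etc; ls ../home/rp/link/.
    via_link = os.path.join(root, "etc", "..", "home", "rp", "link", ".")
    listing = os.listdir(via_link)
    resolved = os.path.realpath(via_link)
    expected = os.path.join(root, "home", "rp")

print(listing)               # ['link']
print(resolved == expected)  # True
```

The listing contains link itself, which shows that the "." stored in the symlink was taken to mean the directory holding the symlink, not the directory the lookup started from.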