UNIX Power Tools

Chapter 18: Linking, Renaming, and Copying Files
 

18.2 What's Really in a Directory

Before you can understand moving and copying files, you need to know a bit more about how files are represented in directories. What does it mean to say that a file is really "in" a directory? It's easy to imagine that files are actually inside of something (some special chunk of the disk that's called a directory). But that's precisely wrong, and it's one place where the filing cabinet model (1.19) of a filesystem doesn't apply.

A directory really is just another file, and really isn't different from any other data file. If you want to prove this, try the command od -c .; on many UNIX systems, it dumps the current directory to the screen in raw form. (Many newer systems refuse to let ordinary programs read a directory this way; there, od just reports an error.) The output will certainly look ugly: a directory is not a text file, and it has lots of binary characters. But, if your system allows it, od -c (25.7) should let you see the names of the files that are in the current directory [and, probably, some names of files that have been deleted! Sorry, they're only the old directory entries; you can't get the files back (23.2). -JP]. If od -c doesn't work, use ls -if instead; it lists each entry's inode number and name in the order they are stored in the directory.

So a directory is really just a list of files. It contains filenames and inode numbers (1.22). That is, we can visualize a directory like this:

The file named    .          is inode 34346
The file named    ..         is inode 987
The file named    mr.ed      is inode 10674
The file named    joe.txt    is inode 8767
The file named    grok       is inode 67871
The file named    otherdir   is inode 2345

So when you give a filename like grok, the kernel looks up grok in the current directory and finds out that this file has inode 67871; it looks up this inode to find out who owns the file, where the data blocks are, and so on.
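The two-step lookup described above (name to inode number, then inode number to the file's details) can be imitated from user space. The following is a minimal Python sketch, not the kernel's actual code; the function names directory_entries and lookup are made up for illustration, and it assumes a POSIX-style system where os.scandir reports inode numbers.

```python
import os

def directory_entries(path="."):
    """Return {name: inode number} for the entries in a directory."""
    # os.scandir does not report "." and "..", so add them by hand.
    entries = {
        ".": os.stat(path).st_ino,
        "..": os.stat(os.path.join(path, "..")).st_ino,
    }
    with os.scandir(path) as it:
        for entry in it:
            entries[entry.name] = entry.inode()
    return entries

def lookup(name, path="."):
    """Imitate the two steps: name -> inode number, then read the inode."""
    inode_number = directory_entries(path)[name]
    info = os.lstat(os.path.join(path, name))  # owner, size, etc. live in the inode
    return inode_number, info.st_uid, info.st_size
```

Calling lookup("grok") in a directory that contains a file named grok returns its inode number, the numeric ID of its owner, and its size in bytes. None of that information comes from the directory itself except the inode number.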

What's more, some of these "files" may be directories in their own right. In particular, that's true of the first two entries, . and .., which are in every directory. A single . just refers to the current directory, while a double .. refers to the "parent" of the current directory (i.e., the directory that "contains" the current directory). The file otherdir is yet another directory that happens to be "within" the current directory. But there's no way you can tell that from its directory entry; UNIX doesn't know it's different until it looks up its inode.
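To see that the "is it a directory?" answer comes from the inode rather than from the name, here is a small Python sketch. The helper name entry_kind is hypothetical; the calls os.lstat, stat.S_ISDIR, and stat.S_ISREG are standard library functions that read and interpret the inode's mode bits.

```python
import os
import stat

def entry_kind(path):
    """Classify a name by the mode bits stored in its inode."""
    mode = os.lstat(path).st_mode
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"
```

With the example directory from this article, entry_kind("otherdir") would report a directory and entry_kind("grok") a regular file, assuming grok is an ordinary data file. Likewise, os.path.samefile("otherdir/..", ".") should be true, because the .. entry inside otherdir refers to the same inode as the current directory.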

Now that you know what a directory is, let's think about some basic operations. What does it mean to move, or rename, a file? If the file is staying in the same directory, the mv command just changes the file's name in the directory; it doesn't touch the data at all.

Moving a file into another directory takes a little more work, but not much. A command like mv dir1/foo dir2/foo means "delete foo's entry in dir1, and create a new entry for foo in dir2." Again, UNIX doesn't have to touch the data blocks or the inode at all.
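One way to check that a rename or a move within one filesystem leaves the file itself alone is to compare inode numbers before and after. This is only an illustrative Python sketch; rename_and_compare is a made-up helper, and it assumes both names are on the same filesystem.

```python
import os

def inode_of(path):
    """Return the inode number that a name currently points to."""
    return os.stat(path).st_ino

def rename_and_compare(src, dst):
    """Rename src to dst and return (inode before, inode after)."""
    before = inode_of(src)
    os.rename(src, dst)  # only directory entries change
    return before, inode_of(dst)
```

For a move such as dir1/foo to dir2/foo on the same filesystem, the two numbers returned should be equal: the entry moved, but the inode and the data blocks stayed where they were.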

The only time you actually need to copy data is if you're moving a file into another filesystem. In that case, you have to copy the file to the new filesystem; delete its old directory entry; return the file's data blocks to the "free list," which means that they can be re-used; and so on. It's a fairly complicated operation, but (still) relatively rare. (On some old versions of UNIX, mv won't let you move files between filesystems.)
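The cross-filesystem case can be sketched in Python as "try a plain rename first, and copy only if the system says the two names are on different devices." This is a simplified illustration, not how any particular mv is implemented; the function name move is hypothetical, and the sketch handles ordinary files only. (Python's own shutil.move takes a similar approach.)

```python
import errno
import os
import shutil

def move(src, dst):
    """Move a file, copying data only when src and dst are on different filesystems."""
    try:
        os.rename(src, dst)  # same filesystem: only directory entries change
        return "renamed"
    except OSError as exc:
        if exc.errno != errno.EXDEV:  # EXDEV means "cross-device link"
            raise
    shutil.copy2(src, dst)  # copy the data (and metadata where possible)
    os.unlink(src)          # drop the old entry; blocks are freed once no links remain
    return "copied"
```

The return value tells you which path was taken, which makes it easy to see how rarely the expensive copy is actually needed.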

Now let's see if you've understood. How does UNIX find out the name of the current directory? In our "current directory," there's an entry for ., which tells us that the current directory has inode 34346. Is the directory's name part of the inode? Sorry - it isn't. The directory's name is included in the parent directory. The parent directory is .., which is inode 987. So UNIX looks up inode 987, finds out where the data is, and starts reading every entry in the parent directory. Sooner or later, it will find one that corresponds to inode 34346. When it does that, it knows that it has found the directory entry for the current directory, and can read its name. Article 14.4 has a diagram and more explanation.
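The search described above is short enough to write out. The following Python sketch scans the parent directory for an entry whose inode matches the directory in question; name_in_parent is a made-up name, and the device-number comparison is an added precaution, since inode numbers are only unique within one filesystem.

```python
import os

def name_in_parent(path="."):
    """Find a directory's name by scanning its parent for a matching inode."""
    here = os.stat(path)
    parent = os.path.join(path, "..")
    with os.scandir(parent) as it:
        for entry in it:
            info = entry.stat(follow_symlinks=False)
            # Compare the device too: inode numbers repeat across filesystems.
            if (info.st_ino, info.st_dev) == (here.st_ino, here.st_dev):
                return entry.name
    return None  # no match, e.g. the root directory, which is its own parent
```

Repeating this step for the parent, the parent's parent, and so on, until the root is reached, is one way to build the full pathname of the current directory.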

Complicated? Yes, but if you understand this, you have a pretty good idea of how UNIX directories work.

- ML

