
SDB:Basics of partitions, filesystems, mount points

tagline: From openSUSE

Situation

When it comes to installing openSUSE, or to adding disks to or removing them from an existing system, people start talking about mounting and unmounting, and about filesystems and partitions. Some people have little real idea what these words mean in a Unix/Linux environment, which means that discussing problems in this area is full of potential misunderstandings.

Many people do have some knowledge based on familiarity with other Operating Systems. That increases the possibility of confusion.

Procedure

The real way to cope with this situation is to spend some time reading documentation. This SDB hopes to be a primer and a starting point for further reading.

All examples shown below were made by the superuser (root).

Disk

In this document, the word disk will be used as a generic term for anything that is seen by the system as a disk drive, whether it is a flash device that looks like a disk, an optical drive or a 'real' hard disk, or even a combination of several devices.

Because newer types of mass storage devices present themselves to the system as disks, they fall into this category. The system cannot see whether a disk is inside the system enclosure, beside it, on a table, or on another shelf, so there is no difference between what some people call internal and external disks.

Disks are normally supplied preformatted by the manufacturer. That means that damaged tracks are marked and replacements assigned, and that sector numbers are written. This is done to interface with the controller hardware/firmware. This type of formatting is also known as low-level formatting.

(Disk) Partitions

Unix systems always have had the possibility of assembling a storage sub-system from several partitions. These can be partitions spread across several physical devices, or can be multiple partitions on one physical device. Every disk must have at least one partition, but often there are more. We will limit ourselves here to the partitioning as done on disks in a PC environment.

In the beginning, when disks had little space, there was the idea that four partitions would be enough for all possible usage (which they were not). The data about the partitioning is written in the partition table. It is in a special place on the disk and the BIOS knows all about it.

When it became very apparent that four partitions were not going to be sufficient, a backward-compatible solution was found: make one of the four a special partition, which can hold more partitions. Newer BIOSes (nowadays all BIOSes) know about it. Thus we have three types of partitions:

  • primary partitions: these are the partitions of old, and there can be four of them (numbered 1, 2, 3, 4);
  • extended partition (in fact also a primary partition): there can be only one, it normally has the number after the highest created primary partition (3 or 4) and should contain all of the remaining space of the disk (otherwise space is wasted);
  • secondary partitions (more commonly called logical partitions): they are created inside the extended partition, and their numbers are always 5 and higher (the maximum is debatable, but enough for most people).

Some people are bewildered by the extended partition: it seems to be of no use, it takes a lot of disk space, and that space also seems to belong to other, useful partitions. The explanation is that the extended partition acts as a container for the secondary partitions, so the same space shows up both in the extended partition and in the secondary partitions that it contains.

An example of a disk with partitions of all three types is shown here:

beneden:~ # fdisk -l
Disk /dev/sda: 200.0 GB, 200049647616 bytes
255 heads, 63 sectors/track, 24321 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x1c841c84

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1         574     4604008+   c  W95 FAT32 (LBA)
Partition 1 does not end on cylinder boundary.
/dev/sda2   *         575        8967    67416772+   7  HPFS/NTFS
/dev/sda3            8968       24321   123331005    f  W95 Ext'd (LBA)
/dev/sda5            8968        9229     2104483+  82  Linux swap / Solaris
/dev/sda6            9230       11840    20972826   83  Linux
/dev/sda7           11841       24321   100253601   83  Linux

beneden:~ # 
  • Partition 1 is a restore partition put there by the manufacturer.
  • Partition 2 contains a Windows XP system (this system calls it the C: drive).
  • Partition 3 is the extended partition; it runs until the end of the disk and holds the following three:
  • Partition 5 is the openSUSE swap partition.
  • Partition 6 is the openSUSE root partition.
  • Partition 7 is the openSUSE /home partition.
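
The classification above can be sketched as a small shell function. This is only an illustration: partition_kind is a hypothetical helper, not part of fdisk, and it assumes that the Id values 5, f and 85 are the ones that mark an extended partition (f is the value seen in the listing above).

```shell
# Classify a partition of a PC (MBR) disk from its number and the
# hexadecimal Id column printed by fdisk -l.
# partition_kind is a hypothetical helper for this primer.
partition_kind() {
    local number=$1 id=$2
    if [ "$number" -ge 5 ]; then
        # Numbers 5 and higher always live inside the extended partition.
        echo "secondary"
    else
        case "$id" in
            5|f|85) echo "extended" ;;   # assumed list of extended Ids
            *)      echo "primary" ;;
        esac
    fi
}

partition_kind 2 7     # /dev/sda2 from the listing
partition_kind 3 f     # /dev/sda3
partition_kind 5 82    # /dev/sda5
```

For the three calls shown, the function prints primary, extended and secondary.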

Partition naming

In the partition table partitions are simply numbered as we saw above.

A special warning here about GRUB. GRUB has its own naming/numbering scheme, and the most important thing to know is that GRUB starts all disk and partition counting at 0 and not at 1! (This holds for GRUB Legacy; GRUB 2 still counts disks from 0 but counts partitions from 1.)

When the Linux system starts, it scans the hardware and when it finds disks and partitions it gives them names. These names are to be found as the names of the device special files in /dev/. Disks will have names like hda, hdb, hdc, ... or sda, sdb, sdc, ... and partitions of hda will be hda1, hda2, ... and those of sda will be sda1, sda2, ... where the number is the partition number.
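
As a small illustration of this naming scheme, the following sketch splits such a name into its disk part and its partition number. split_devname is a hypothetical helper, and it assumes the classic hdX/sdX names described above; other naming styles are not handled.

```shell
# Split a kernel name such as "sda7" into the disk and the partition number.
# split_devname is a hypothetical helper; it assumes hdX/sdX style names.
split_devname() {
    local name=$1
    local disk=${name%%[0-9]*}    # drop everything from the first digit on
    local part=${name#"$disk"}    # what is left is the partition number
    echo "disk=/dev/$disk partition=${part:-none}"
}

split_devname sda7
split_devname hdb
```

A name without a trailing number, like hdb, refers to the whole disk, so the function reports partition=none for it.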

What are these device special files? A special type of file that is associated with hardware. I/O to the hardware is done by reading from or writing to this file. This is the file metaphor in Unix/Linux. There are two types of these files: character device special file and block device special file. For disks we do I/O in blocks. Of course this is too long to pronounce so most people talk about device files. Normally they are created in the /dev/ directory (nowadays dynamically by udev). Here is what they look like when you list them:

boven:/dev # ls -l sda*
brw-r----- 1 root disk 8, 0 Apr 27 09:45 sda
brw-r----- 1 root disk 8, 1 Apr 27 09:45 sda1
brw-r----- 1 root disk 8, 2 Apr 27 09:45 sda2
brw-r----- 1 root disk 8, 3 Apr 27 09:45 sda3
boven:/dev #

Most things will be familiar to you. The b indicates a block device special file. Normal users are not allowed to read from or write to the disk directly, otherwise all security would be futile. The number 8 (called the major number) is the number the kernel uses internally for the driver that works with these devices (so it will not send tape commands to disks), and the numbers 0 to 3 (the minor numbers) tell the devices apart inside that driver.

Until recently these names (/dev/sda2, etc.) were used to mount the partitions, and when you give the command mount (which shows what is mounted) you will see them used.
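
The same information can be read without interpreting the ls listing. The sketch below assumes the stat tool from GNU coreutils, whose %t and %T formats print the major and minor numbers in hexadecimal; dev_numbers is a hypothetical helper name. It is tried on /dev/null, a character device that exists on every Linux system, so that it can be run without any particular disk.

```shell
# Print the type, major number and minor number of a device special file.
# dev_numbers is a hypothetical helper; it assumes stat -c (GNU coreutils).
dev_numbers() {
    local file=$1 kind hex_major hex_minor
    if [ -b "$file" ]; then
        kind=block
    elif [ -c "$file" ]; then
        kind=character
    else
        echo "$file: not a device special file" >&2
        return 1
    fi
    # %t and %T print the major and minor number in hexadecimal.
    hex_major=$(stat -c '%t' "$file")
    hex_minor=$(stat -c '%T' "$file")
    echo "$kind $((0x$hex_major)) $((0x$hex_minor))"
}

dev_numbers /dev/null
```

Called on one of the sda files listed above, it should report block together with the same two numbers that ls -l shows.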

boven:~ # mount
/dev/sda2 on / type ext3 (rw,acl,user_xattr)
/dev/sda3 on /home type ext3 (rw)
  ...
boven:~ #

This is based on the assumption that on every boot the system finds the disks in the same sequence. If this were not to occur, sda and sdb may be exchanged and the wrong partitions mounted (and the user become very confused)! This can happen not only at boot: a disk that is connected to a running system gets the 'next' name (e.g. sdc). So, for example, an external USB drive added to a system is not guaranteed to get the same device name (and hence the same mount point) every time, which is inconvenient.

To avoid this confusion we need a unique identifier for the disk, and there is one. In fact there are several. Look at /dev/disk/. There are several directories there:

boven:~ # ls -l /dev/disk
total 0
drwxr-xr-x 2 root root 360 Jul  5  2008 by-id
drwxr-xr-x 2 root root 100 Jul  5  2008 by-label
drwxr-xr-x 2 root root 220 Jul  5  2008 by-path
drwxr-xr-x 2 root root  80 Jul  5  2008 by-uuid
boven:~ #

Look inside these directories. Every one of them has some or all of the partitions, identified in a different way. Each of these files is not a device special file, but a symbolic link to one of the device special files that we saw earlier. When you look in /etc/fstab:

 
boven:~ # cat /etc/fstab
/dev/disk/by-id/scsi-SATA_Hitachi_HDT7250_VFJ201R23XUEXW-part2 /                    ext3       acl,user_xattr        1 1
/dev/disk/by-id/scsi-SATA_Hitachi_HDT7250_VFJ201R23XUEXW-part3 /home                ext3       defaults              1 2
/dev/disk/by-id/scsi-SATA_Hitachi_HDT7250_VFJ201R23XUEXW-part1 swap                 swap       defaults              0 0
  ...
boven:~ # 

You will see that openSUSE nowadays uses the /dev/disk/by-id/ names to make sure that the correct partitions are mounted:

boven:~ # ls -l /dev/disk/by-id
lrwxrwxrwx 1 root root  9 Jul  5  2008 scsi-SATA_Hitachi_HDT7250_VFJ201R23XUEXW -> ../../sda
lrwxrwxrwx 1 root root 10 Jul  5  2008 scsi-SATA_Hitachi_HDT7250_VFJ201R23XUEXW-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Jul  5  2008 scsi-SATA_Hitachi_HDT7250_VFJ201R23XUEXW-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Jul  5  2008 scsi-SATA_Hitachi_HDT7250_VFJ201R23XUEXW-part3 -> ../../sda3
  ...
boven:~ # 
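
To see which kernel name is currently behind each persistent name, the links can be resolved in a loop. This is a minimal sketch: show_persistent_names is a hypothetical helper, and the directory argument (defaulting to /dev/disk/by-id) is there only so that the function can also be tried on a test directory.

```shell
# For every symbolic link in a /dev/disk/by-* style directory, print the
# persistent name and the kernel device name it points to.
# show_persistent_names is a hypothetical helper for this primer.
show_persistent_names() {
    local dir=${1:-/dev/disk/by-id} link target
    for link in "$dir"/*; do
        [ -L "$link" ] || continue                  # only symbolic links
        target=$(readlink -f "$link") || continue   # follow the link
        echo "${link##*/} -> ${target##*/}"
    done
}

show_persistent_names
```

On a system without any entries in that directory the function simply prints nothing.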

Filesystems Organisation (short: fs)

Now we have one or more partitions and we want to use them. A few programs that make heavy use of disk I/O can use raw partitions, but mostly we want to store data organised in directories and files, or use the partition for swap. For use as a filesystem (or as swap), an empty filesystem must first be created on the partition. Many people call this formatting, but in Unix/Linux you create or make a filesystem, and the tool is mkfs. We will not go into details here, but what goes into an empty filesystem (inodes, etc.) depends on the filesystem type. A ReiserFS is different from an ext2 or ext3 fs. You have to choose, but the defaults are usually reasonable.

You do not need to call the mkfs of the fs type you want yourself. YaST will do this for you.

What is important here is that Linux can access not only filesystem types that are designed for it, but also types that are used by systems from MS-DOS to Windows Vista, such as FAT and NTFS. But there are restrictions. FAT has restrictions on filename length. Neither FAT nor NTFS has the Unix/Linux permission bits, which means that access can be restricted unless this is handled properly (with filesystem options at mount time).

Before we start to use the partitions, first some explanation about the terminology and concepts used in Unix/Linux in this area.

Unix/Linux systems view all of the disk area as one hierarchic tree starting from the root (/). This is contrary to the MS-DOS to Windows Vista concept, in which every disk (partition) has its own hierarchic tree (starting at A:, B:, C:, D:, etc.). While some people regard the Unix approach as confusing, it does bring significant advantages in advanced operations.

A directory is a special type of file that holds the descriptions of other files. All directories are nodes in the hierarchic tree. From other OSs and from the desktop metaphor words like map and folder are also used by end users, but the Unix/Linux word is directory.

The root of the hierarchic tree is thus also a directory. It is called the root directory or, for short, root. (This is the same word, but not the same entity, as the root user.) When used on its own, the root directory is written /.

The mount point is the place in the hierarchic tree where a filesystem is mounted. This point can be anywhere in the (already mounted) tree (there is no need to restrict oneself to mount points in /mnt/ or /media/). It is a directory. Every directory can be used as a mount point. But be careful. When a directory already contains files, mounting an fs on it will make these files unreachable! Only after unmounting will they be reachable again (they stay where they were and still occupy space on the disk).

Some system managers make use of this. When creating a mountpoint they will make a file in it with a useful name:

mkdir mountpoint
touch mountpoint/The-fs-is-not-mounted
mount /dev/... mountpoint
ls mountpoint

ls will now show the contents belonging to the mounted fs. When the fs is not mounted, the same ls will show the file The-fs-is-not-mounted.
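
A script can use the same trick to refuse to work on a mount point where nothing is mounted. The following is a sketch under assumptions: MARKER and fs_is_mounted are hypothetical names, and the check only makes sense for mount points that were prepared with the marker file as shown above.

```shell
# Decide from the marker file whether a filesystem is mounted on a directory.
# MARKER and fs_is_mounted are hypothetical names for this sketch.
MARKER=The-fs-is-not-mounted

fs_is_mounted() {
    local dir=$1
    [ -d "$dir" ] || { echo "$dir: no such directory" >&2; return 2; }
    if [ -e "$dir/$MARKER" ]; then
        echo "$dir: marker file visible, nothing is mounted here"
        return 1
    fi
    echo "$dir: marker file hidden, a filesystem is mounted here"
}

# Demonstration on a scratch directory that plays the role of the mount point.
demo=$(mktemp -d)
touch "$demo/$MARKER"
fs_is_mounted "$demo" || echo "refusing to write to $demo"
rm -rf "$demo"
```

A backup script could call fs_is_mounted first and stop when it fails, instead of silently filling the partition that holds the empty mount point.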

Filesystems Hierarchy

There is a standard organisation for the filesystems on Linux (see reference Linux Filesystem Hierarchy), but this can be extended and manipulated.

A lot of mounting is done high in the hierarchy. On the highest level there can (and must) be only one:

  • / (this is mounted readonly very early in the startup sequence, otherwise no tools would be available, not even mount, and later it is re-mounted read/write)

One level down:

  • /home (very often a separate mount point)
  • /tmp
  • /bin (mounted over the network for a very small system)
  • /sbin (NO, do not try this one, /sbin/ contains those things like mount needed at system startup before the rest of the mounting is done!)

Further down:

  • /mnt/... (nice place for a system manager to mount things during maintenance)
  • /media/... (conventional place for CDs, DVDs and the like)

And according to one's needs:

  • /home/movielover/Documents/movies/archive (A partition for this user only. When it is full he is on his own. Let him pay for an extra disk).

It is worth noting that under / it is normal to see a directory /home; to a casual observer this looks the same whether it is a separate partition mounted there or just a subdirectory of /. Particularly in server or high-performance usage, it can be worthwhile for some areas of the tree to have different characteristics from others (RAID arrays, logging of file access times, different filesystem types such as ext2, ext3, ext4, etc.). By carefully planning how the filesystem is laid out, this fine-grained control can be achieved.

For any mount operation to succeed, the existing tree must provide the point on to which the new sub-structure is mounted; this means that success of mounting is order-dependent.

Note also that if the normal behaviour whereby a new mount cloaks existing files in the mount directory is not desired, it is possible to merge two sets of files on different partitions, by using a special filesystem type (unionfs, aufs).

The mount tool and the fstab configuration file

The very act of (un)mounting is of course done by the kernel. The tool that the system manager uses to interact with the kernel for this is mount. Read man mount for more information. The tool needs basically four things:

  • which fs type to mount (with the -t option)
  • options that influence the fs, which are different for each fs type (with the -o option)
  • what to mount (for a local mount this is the block device special file of the partition)
  • where to mount (the mount point directory)
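
Put together, the four items form one command line. The sketch below only prints that command line and does not run it; mount_command is a hypothetical helper, while -t (fs type) and -o (options) are the real mount options. The values in the call are examples taken from the listings earlier in this document.

```shell
# Assemble, but do not run, a mount command from the four items listed above.
# mount_command is a hypothetical helper for this primer.
mount_command() {
    local fstype=$1 options=$2 device=$3 mountpoint=$4
    echo "mount -t $fstype -o $options $device $mountpoint"
}

mount_command ext3 acl,user_xattr /dev/sda2 /mnt
```

The printed line is what root would type to mount that partition on /mnt by hand.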

That can be a lot of typing when you are testing, but /etc/fstab comes in handy here. When you specify all those parameters in /etc/fstab, mount only needs either the mount point or the device special file. What is not specified is found in /etc/fstab.
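
What mount fills in from /etc/fstab can be imitated with a few lines of awk. This is an illustration only, not how mount is implemented: fstab_lookup is a hypothetical helper that searches an fstab-format file for a device or a mount point and prints the first four columns of the matching line.

```shell
# Find the fstab line for a device or a mount point and print the fields
# that mount would otherwise need on the command line.
# fstab_lookup is a hypothetical helper; the file defaults to /etc/fstab.
fstab_lookup() {
    local key=$1 file=${2:-/etc/fstab}
    awk -v key="$key" '
        /^[[:space:]]*#/ || NF < 4 { next }     # skip comments and blank lines
        $1 == key || $2 == key {
            printf "device=%s mountpoint=%s type=%s options=%s\n", $1, $2, $3, $4
            found = 1
            exit
        }
        END { exit !found }                     # status 1 when nothing matched
    ' "$file"
}

# Demonstration on a sample file, so that the real /etc/fstab is not needed.
sample=$(mktemp)
cat > "$sample" <<'EOF'
/dev/sda2  /      ext3  acl,user_xattr  1 1
/dev/sda3  /home  ext3  defaults        1 2
EOF
fstab_lookup /home "$sample"
rm -f "$sample"
```

Given only /home, the lookup finds the device, the fs type and the options, which is why mount /home is enough once the line is in /etc/fstab.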

But the real usage of /etc/fstab is that it is used at system startup to mount all those partitions that you always need. And do not put the line for /home/movielover/Documents/movies/archive before the line for /home in /etc/fstab.
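
The ordering rule can be checked mechanically. The following sketch reads an fstab-format file and complains about every mount point that is listed before the mount point of one of its parent directories. check_fstab_order is a hypothetical helper, not a standard tool, and it looks only at the order of the lines.

```shell
# Warn about mount points that are listed before their parent mount point.
# check_fstab_order is a hypothetical helper; the file defaults to /etc/fstab.
check_fstab_order() {
    local file=${1:-/etc/fstab}
    awk '
        /^[[:space:]]*#/ || NF < 2 { next }     # skip comments and blank lines
        $2 !~ /^\// { next }                    # swap has no mount point
        {
            # Any earlier entry that lies below this one was listed too soon.
            for (i = 1; i <= n; i++) {
                if (index(seen[i], $2 "/") == 1 || ($2 == "/" && seen[i] != "/")) {
                    printf "%s is listed before its parent %s\n", seen[i], $2
                    bad = 1
                }
            }
            seen[++n] = $2
        }
        END { exit bad + 0 }                    # status 1 when a problem was found
    ' "$file"
}

# Demonstration with the mistake described above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
/dev/sda2  /                                          ext3  defaults  1 1
/dev/sdb1  /home/movielover/Documents/movies/archive  ext3  defaults  1 2
/dev/sda3  /home                                      ext3  defaults  1 2
EOF
check_fstab_order "$sample" || echo "fix the order before rebooting"
rm -f "$sample"
```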

The umount tool only needs the mountpoint or the device special file.

From static to dynamic

We are almost leaving the basics now, so we will touch this only briefly here.

All the above is about where and how mounting is done at system start-up (boot). But nowadays many mass storage devices such as CDs/DVDs and USB sticks are connected to and disconnected from the system at the whim of the user, in the role of operator. This mention of the "operator" is not just for fun. Linux is a multi-user system, which means that more than one user can be logged in at any time. So when the "operator" inserts a CD or a USB stick, which user should be given access?

At the time of writing this is not a solved issue.

Now what must happen when a device is dynamically added, and who does what? First we must have device special files. udev creates these in the same fashion and by the same rules as it does at system start-up. The kernel signals the udevd daemon when new hardware becomes available. After you connect a USB stick you will find new entries in /dev/disk/. When, for example, the partition on the stick has a label, you will find it in /dev/disk/by-label/.

So much for the device special files. But now for the mounting. Here we have HAL, which also runs as a daemon. It gets a signal from udevd. It uses /media/. It mounts everything there. It keeps its administration there. This is not configurable. When the partition to be mounted has a label (e.g. Backup), it will use that label and will mount at /media/Backup/. When there is no label it will mount using the device type as a name and thus mount at /media/cdrom/ or /media/disk/. It will add numbers to avoid duplicate names (/media/disk-1). Thus you cannot get your wandering home directory mounted inside /home/ with HAL!
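
The naming rule just described can be written down as a few lines of shell. This is an illustration of the rule as stated here, not HAL's actual code: pick_media_dir is a hypothetical helper, and the base directory is an argument so that the function can be tried on a scratch directory instead of /media/.

```shell
# Choose a mount directory: the label if there is one, else the device type,
# with -1, -2, ... appended when the name is already in use.
# pick_media_dir is a hypothetical illustration, not HAL code.
pick_media_dir() {
    local base=$1 label=$2 devtype=$3
    local name=${label:-$devtype}
    local candidate="$base/$name" n=1
    while [ -e "$candidate" ]; do
        candidate="$base/${name}-$n"
        n=$((n + 1))
    done
    echo "$candidate"
}

# Demonstration on a scratch directory that stands in for /media.
media=$(mktemp -d)
pick_media_dir "$media" Backup disk    # labelled partition
mkdir "$media/disk"                    # pretend an unlabelled disk is mounted
pick_media_dir "$media" "" disk        # second unlabelled disk gets a number
rm -rf "$media"
```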

HAL will also signal the desktop, which will act in its own particular way. So in KDE3 this will be different from Gnome, etc. Also when there are different Desktop logins at the same time (or other programs that registered with HAL), it is not clear who will get the signal and will be the owner of the mountpoint!

Tools

This being a primer, we do not talk much about tools, but some are mentioned (e.g. fdisk, mkfs, mount). As always, they and configuration files like /etc/fstab have their man pages on your system (and on the Internet). They are there to be read by you. While YaST > System > Partitioner is a useful GUI, you may have to revert to CLI commands in runlevel 3 or single-user mode because the filesystems must be unmounted while you work on them. For the same reason you might even have to boot from a rescue CD/DVD and run fdisk and friends from there (otherwise you would not have an unmounted /). Bootable GUI tools like GParted are a step further. They know a lot about the internals of filesystems and can thus change their size, but this is far beyond the scope of this document.

Links
