SDB:SCSI Device Identification

With sg3_utils 1.48 and later (openSUSE Tumbleweed since 10/2023), unreliable SCSI device identifiers are not used, and /dev/disk/by-id symlinks for such identifiers are not created. It is possible to change this policy by editing 00-scsi-sg3_config.rules.

SCSI device identification and SCSI symlink generation in sg3_utils 1.48

With the update of sg3_utils to version 1.48, the way udev identifies SCSI devices and generates SCSI symlinks changes in Tumbleweed.

Referring to disk devices by their device node (like /dev/sda) is unreliable. In most situations, it is recommended to refer to devices by their content, i.e. using file system UUIDs or file system labels. Sometimes, however, the devices must be referred to by hardware properties. For this purpose, udev maintains various symlinks under /dev/disk/by-id. This article is specifically about SCSI symlinks, which have the format /dev/disk/by-id/scsi-*.

If you don't use dm-multipath, and don't use /dev/disk/by-id/scsi-* identifiers to refer to any disks or partitions anywhere, you can stop reading here.

SCSI device identifiers

SCSI devices can be identified in different ways by the contents of SCSI VPD (Vital Product Data) pages. One or more of the following identifiers may occur (see SCSI specification "SCSI Primary Commands" (SPC) 6, §7.7.6 for details):

  1. NAA registered extended: 36... (+ 31 hex digits)
  2. NAA registered: 35... (+ 15 hex digits)
  3. NAA extended: 32... (+ 15 hex digits)
  4. EUI-64: 2... (+ 16 hex digits)
  5. SCSI name string: 8... (+ utf-8 string)
  6. NAA Local ("L"): 33... (+ 15 hex digits)
  7. T10 vendor-ID ("T"): 1... (+ string)
  8. Vendor-specific ("V"): 0... (+ string)
  9. Serial number ("S"): S... (+ vendor/product/serial string)

The list above is sorted roughly top-down by the reliability (uniqueness) of these identifiers.
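
The prefix scheme in the list can be sketched as a small lookup. This is only an illustrative model of the list above, not code from sg3_utils; the function name `classify` and the table layout are assumptions made for this example.

```python
# Illustrative model of the identifier list above; not taken from sg3_utils.
# Each entry is (prefix, description, source letter). The letter is None for
# the identifier types 1)-5), which have no letter in the list.
IDENTIFIER_TYPES = [
    ("36", "NAA registered extended", None),
    ("35", "NAA registered", None),
    ("32", "NAA extended", None),
    ("2", "EUI-64", None),
    ("8", "SCSI name string", None),
    ("33", "NAA local", "L"),
    ("1", "T10 vendor ID", "T"),
    ("0", "vendor-specific", "V"),
    ("S", "serial number", "S"),
]


def classify(identifier):
    """Return (rank, description, letter) for an identifier, or None if unknown.

    rank is the position in the list above (1 = most reliable).
    """
    for rank, (prefix, description, letter) in enumerate(IDENTIFIER_TYPES, start=1):
        if identifier.startswith(prefix):
            return rank, description, letter
    return None
```

For a symlink such as /dev/disk/by-id/scsi-SATA_disk, the part after "scsi-" is what would be passed to `classify`.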

When udev processes events for SCSI devices, it determines the ID_SERIAL property (which is used e.g. by multipathd to detect identical devices) from these identifiers, and creates /dev/disk/by-id/scsi-$ID symlinks. This happens in the rules files 55-scsi-sg3_id.rules and 58-scsi-sg3_symlink.rules.

Where to look for these identifiers

SCSI device identifiers can occur in various places where block devices need to be referenced. The following incomplete list should be a good starting point:

  • /etc/fstab,
  • grub2 configuration files: /etc/default/grub, /boot/grub2/grub.cfg,
  • configuration files of other bootloaders like sd-boot,
  • /etc/lvm/lvm.conf,
  • /etc/crypttab,
  • /etc/mdadm.conf,
  • /etc/smartd.conf,
  • /etc/hdparm.conf,
  • libvirt configuration: disk devices with type="block", storage pool definitions,
  • storage configuration for other virtualization tools like VirtualBox or vagrant,
  • target configuration: backstores for targetcli, nvmetcli, or tgt,
  • manually written scripts or udev rules,
  • configuration files of 3rd party tools handling block devices, for example SCST or EMC PowerPath.

In /etc/fstab and bootloader configuration files, it should be possible to avoid the use of hardware identifiers completely, and use filesystem identifiers (by-uuid, by-label) instead. It is highly recommended to do this wherever possible.

Some tools may embed device identifiers in the device metadata of other devices. For example, mkfs.f2fs is known to do this if a multi-device F2FS file system is created.

If none of the identifiers 6)-9) above are used in any of these places on your system(s), and if you don't use dm-multipath, you are probably on the safe side, and you can stop reading here. Otherwise, please read on.
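
A search for the affected symlinks can be sketched as follows. This is a minimal, hypothetical helper, not an official tool: the names `find_unreliable_links` and `scan_files` are invented for this example, and the pattern only catches references that spell out the full /dev/disk/by-id/scsi-... path with one of the prefixes of identifiers 6)-9) (33, 1, 0, S).

```python
import re

# Full symlink paths whose identifier starts with an "unreliable" prefix:
# 33 (NAA local), 1 (T10 vendor ID), 0 (vendor-specific), S (serial number).
# The match stops at whitespace, quotes, angle brackets, or commas.
UNRELIABLE_LINK = re.compile(r"/dev/disk/by-id/scsi-(?:33|1|0|S)[^\s\"'<>,]*")


def find_unreliable_links(text):
    """Return all unreliable scsi-* symlink paths mentioned in text."""
    return UNRELIABLE_LINK.findall(text)


def scan_files(paths):
    """Map each readable file in paths to the unreliable links it mentions."""
    hits = {}
    for path in paths:
        try:
            with open(path, encoding="utf-8", errors="replace") as handle:
                found = find_unreliable_links(handle.read())
        except OSError:
            continue  # missing or unreadable file: nothing to report
        if found:
            hits[path] = found
    return hits
```

Calling `scan_files(["/etc/fstab", "/etc/crypttab", "/etc/default/grub"])` would check three of the locations listed above. An empty result does not prove that the system is unaffected, because identifiers can also be stored in other files or in device metadata.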

Historical policy for device identification and symlink creation

On all openSUSE and SUSE distributions with sg3_utils before 1.48 (including SLE 15 and Leap 15.x), all identifiers from the list above are used. This means they are tried (top-down) for determining ID_SERIAL, and symlinks are created for each identifier that is defined for a given device. This ensures that whatever identifier a user has chosen for referring to a device will be available as symlink under /dev/disk/by-id.

Reasons for changing this policy

The identifiers 6)-9) are not reliable; there can be multiple devices in the same system that use the same identifier, although they don't represent the same physical disk. This applies to 9) in particular: Some large storage arrays use the same serial number for all LUNs they expose. Using such identifiers for multipath can cause fatal data corruption. Moreover, the creation of these symlinks during boot can slow down booting a lot on large systems, where thousands of devices may be contending for the same symlink. Lastly, not creating useless symlinks tidies up /dev/disk/by-id and makes it easier to read.

New policy with sg3_utils 1.48 and later (10/2023)

sg3_utils 1.48 changes the policy for SCSI device identification (note: sg3_utils-1.48~20221101, a predecessor of 1.48, does not have this change). For determining ID_SERIAL only 1)-5) above are tried. Likewise, only symbolic links for identifiers of type 1)-5) are created. Under normal circumstances, this shouldn't have negative consequences, because SCSI devices manufactured in the last 15 years usually support one or more of the "reliable" identifiers 1)-5).

Possible problems and workarounds

In rare situations, a system may contain ancient or broken devices that don't have a reliable identifier, and be configured such that some of these devices are referred to by one of the unreliable hardware identifiers (no issue will occur if the devices are referenced e.g. by filesystem UUID). Such devices would appear to be missing from the system after updating sg3_utils to 1.48. For such cases, the policy change must be reverted.

The policy can be customized either by editing the udev rules file 00-scsi-sg3_config.rules (copy it from /usr/lib/udev/rules.d to /etc/udev/rules.d first, then edit the copy) or by using kernel command line parameters.
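
The copy step can be sketched as below. This is a minimal sketch under stated assumptions: the function name `prepare_override` and its parameters are hypothetical, and the choice not to overwrite an existing copy is a design decision of this example, made so that earlier local edits are preserved.

```python
import shutil
from pathlib import Path

RULE_NAME = "00-scsi-sg3_config.rules"


def prepare_override(vendor_dir="/usr/lib/udev/rules.d", admin_dir="/etc/udev/rules.d"):
    """Copy the packaged rules file to the admin directory for local editing.

    Returns the path of the editable copy. An existing copy is left untouched.
    """
    source = Path(vendor_dir) / RULE_NAME
    target = Path(admin_dir) / RULE_NAME
    if target.exists():
        return target  # keep existing local edits
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)  # raises FileNotFoundError if source is missing
    return target
```

Running this against the real directories requires root privileges. After editing the copy, the udev rules typically need to be reloaded, for example with udevadm control --reload.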

Device identification in virtual machines

libvirt's <serial> element, which is traditionally used for device identification in virtual machines, translates into an unreliable device ID, which will be ignored by default. Use the <wwn> element instead. Likewise, in qemu, use -device scsi-hd,wwn=... instead of -device scsi-hd,serial=.... Note that the value of <wwn> must consist of 16 hex digits.
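
The 16-hex-digit requirement can be checked before a value is written into a domain definition. The helper below is a hypothetical sketch, not part of libvirt; it deliberately accepts nothing but exactly 16 hexadecimal digits, as stated above.

```python
import re

# Exactly 16 hexadecimal digits, as required for the <wwn> value.
WWN_PATTERN = re.compile(r"[0-9a-fA-F]{16}")


def is_valid_wwn(value):
    """Return True if value consists of exactly 16 hex digits."""
    return WWN_PATTERN.fullmatch(value) is not None
```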

Customizing ID_SERIAL sources

If users need identifiers in addition to 1)-5), they can set the udev property ".SCSI_ID_SERIAL_SRC" (the leading "." is intentional) to any combination of the capital letters "LTVS" (see the list above). E.g. "ST" would mean to take 7) and 9) into account. To restore the pre-1.48 behavior, set ENV{.SCSI_ID_SERIAL_SRC}="LTVS". Alternatively, use the kernel boot parameter "udev.scsi_id_serial_src=LTVS".

The current upstream default is "T", so that 7) above is still enabled. To disable all "unreliable" IDs, set ENV{.SCSI_ID_SERIAL_SRC}="". If udev encounters a device for which ID_SERIAL can't be assigned, it prints a warning to the journal.

Note: regardless of this option, the udev rules always choose the "best" available identifier for any given device. Thus if ENV{.SCSI_ID_SERIAL_SRC}="T" is set, the T10 vendor ID (7) will be used, but only for devices that have none of the identifiers 1)-5).
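
The selection logic described in this section can be modeled as follows. This is a simplified sketch and not the actual udev rules: the labels such as "naa_reg" and the function name `pick_id_serial` are invented for this example, and the order simply follows the identifier list above.

```python
# Order of preference, following the identifier list above. Each entry is
# (label, source letter); the letter is None for the types 1)-5), which are
# always considered. The labels are hypothetical names used only here.
PREFERENCE = [
    ("naa_regext", None),
    ("naa_reg", None),
    ("naa_ext", None),
    ("eui64", None),
    ("name_string", None),
    ("naa_local", "L"),
    ("t10_vendor", "T"),
    ("vendor_specific", "V"),
    ("serial", "S"),
]


def pick_id_serial(available, serial_src="T"):
    """Return the best permitted identifier, or None if there is none.

    available maps labels to the identifier strings a device reports;
    serial_src models the value of .SCSI_ID_SERIAL_SRC.
    """
    for label, letter in PREFERENCE:
        if label not in available:
            continue
        if letter is None or letter in serial_src:
            return available[label]
    return None
```

With `serial_src="T"`, a device that reports both an NAA registered ID and a T10 vendor ID gets the NAA ID, while a device that reports only a serial number gets no ID_SERIAL at all.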

Customizing symlink generation

Symlink generation is controlled by the variable ".SCSI_SYMLINK_SRC". It can also be set to any combination of the letters "LTVS", with the same meaning of the letters as above, and controls which "scsi-..." symlinks udev will create under /dev/disk/by-id/, in addition to the symlinks for the identifiers 1)-5) and the symlink /dev/disk/by-id/scsi-$ID_SERIAL, which will always be created. Again, to restore the previous behavior of the udev rules, set ENV{.SCSI_SYMLINK_SRC}="LTVS", or use the kernel boot parameter "udev.scsi_symlink_src=LTVS".
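
The resulting set of symlinks can be modeled in the same spirit. This sketch is an assumption-laden illustration rather than the logic of 58-scsi-sg3_symlink.rules: the labels and the function name `expected_symlinks` are hypothetical, and the partition handling follows the "-part$P" suffix that this article describes for partitions.

```python
# Hypothetical labels for the identifier types; not names used by udev.
RELIABLE = ("naa_regext", "naa_reg", "naa_ext", "eui64", "name_string")
OPTIONAL = {"L": "naa_local", "T": "t10_vendor", "V": "vendor_specific", "S": "serial"}


def expected_symlinks(available, id_serial, symlink_src, partition=None):
    """List the scsi-* symlinks this model expects for one device.

    available maps labels to identifier strings, id_serial is the value of
    ID_SERIAL (or None), and symlink_src models .SCSI_SYMLINK_SRC.
    """
    # Identifiers 1)-5) always get a symlink.
    ids = [available[label] for label in RELIABLE if label in available]
    # Identifiers 6)-9) only if their letter is enabled.
    for letter in "LTVS":
        label = OPTIONAL[letter]
        if letter in symlink_src and label in available:
            ids.append(available[label])
    # The scsi-$ID_SERIAL symlink is always created.
    if id_serial and id_serial not in ids:
        ids.append(id_serial)
    suffix = f"-part{partition}" if partition is not None else ""
    return [f"/dev/disk/by-id/scsi-{ident}{suffix}" for ident in ids]
```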

Notes

  • The symlink generation applies to both disks and partitions. Partition symlinks have a "-part$P" suffix appended to the name of the symlink of their parent disk.
  • ATA and USB devices have additional identifiers generated by the transport (/dev/disk/by-id/ata-* and /dev/disk/by-id/usb-*), which are unaffected by this change. These devices often don't expose proper SCSI IDs, but they actually don't need them because the ATA or USB identifiers are unique.