Little Step

linux

Recently we got a disk write problem during our service running.

The dmesg log is as following:

[08:32:30 2021] NET: Registered protocol family 38
[08:32:30 2021] EXT4-fs (dm-0): warning: mounting fs with errors, running e2fsck is recommended
[08:32:30 2021] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
[08:32:30 2021] ata2.00: exception Emask 0x10 SAct 0x2 SErr 0x800000 action 0x6 frozen
[08:32:30 2021] ata2.00: irq_stat 0x08000000, interface fatal error
[08:32:30 2021] ata2: SError: { LinkSeq }
[08:32:30 2021] ata2.00: failed command: READ FPDMA QUEUED
[08:32:30 2021] ata2.00: cmd 60/08:08:78:19:c1/00:00:12:00:00/40 tag 1 ncq dma 4096 in
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:32:30 2021] ata2.00: status: { DRDY }
[08:32:30 2021] ata2: hard resetting link
[08:32:31 2021] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[08:32:31 2021] ata2.00: configured for UDMA/133
[08:32:31 2021] ata2: EH complete
[08:32:31 2021] ata2.00: Enabling discard_zeroes_data
[08:33:08 2021] ata2.00: exception Emask 0x0 SAct 0xc0000 SErr 0x400001 action 0x6 frozen
[08:33:08 2021] ata2: SError: { RecovData Handshk }
[08:33:08 2021] ata2.00: failed command: READ FPDMA QUEUED
[08:33:08 2021] ata2.00: cmd 60/08:90:00:1d:c1/00:00:12:00:00/40 tag 18 ncq dma 4096 in
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[08:33:08 2021] ata2.00: status: { DRDY }
[08:33:08 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:08 2021] ata2.00: cmd 61/08:98:00:18:c4/00:00:6f:00:00/40 tag 19 ncq dma 4096 out
                         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[08:33:08 2021] ata2.00: status: { DRDY }
[08:33:08 2021] ata2: hard resetting link
[08:33:08 2021] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[08:33:08 2021] ata2.00: configured for UDMA/133
[08:33:08 2021] ata2.00: device reported invalid CHS sector 0
[08:33:08 2021] ata2: EH complete
[08:33:08 2021] ata2.00: Enabling discard_zeroes_data
[08:33:08 2021] ata2.00: exception Emask 0x10 SAct 0xc00000 SErr 0x400100 action 0x6 frozen
[08:33:08 2021] ata2.00: irq_stat 0x08000000, interface fatal error
[08:33:08 2021] ata2: SError: { UnrecovData Handshk }
[08:33:08 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:08 2021] ata2.00: cmd 61/08:b0:00:18:c4/00:00:6f:00:00/40 tag 22 ncq dma 4096 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:08 2021] ata2.00: status: { DRDY }
[08:33:08 2021] ata2.00: failed command: READ FPDMA QUEUED
[08:33:08 2021] ata2.00: cmd 60/08:b8:00:1d:c1/00:00:12:00:00/40 tag 23 ncq dma 4096 in
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:08 2021] ata2.00: status: { DRDY }
[08:33:08 2021] ata2: hard resetting link
[08:33:08 2021] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[08:33:08 2021] ata2.00: configured for UDMA/133
[08:33:08 2021] ata2: EH complete
[08:33:08 2021] ata2.00: Enabling discard_zeroes_data
[08:33:08 2021] ata2: limiting SATA link speed to 1.5 Gbps
[08:33:08 2021] ata2.00: exception Emask 0x10 SAct 0x3fc SErr 0x400100 action 0x6 frozen
[08:33:08 2021] ata2.00: irq_stat 0x08000000, interface fatal error
[08:33:08 2021] ata2: SError: { UnrecovData Handshk }
[08:33:08 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:08 2021] ata2.00: cmd 61/30:10:10:18:c4/00:00:6f:00:00/40 tag 2 ncq dma 24576 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:08 2021] ata2.00: status: { DRDY }
[08:33:08 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:08 2021] ata2.00: cmd 61/18:18:40:18:c4/00:00:6f:00:00/40 tag 3 ncq dma 12288 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:08 2021] ata2.00: status: { DRDY }
[08:33:08 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:08 2021] ata2.00: cmd 61/08:20:60:18:c4/00:00:6f:00:00/40 tag 4 ncq dma 4096 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:08 2021] ata2.00: status: { DRDY }
[08:33:08 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:09 2021] ata2.00: cmd 61/08:28:68:18:c4/00:00:6f:00:00/40 tag 5 ncq dma 4096 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:09 2021] ata2.00: status: { DRDY }
[08:33:09 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:09 2021] ata2.00: cmd 61/08:30:58:18:c4/00:00:6f:00:00/40 tag 6 ncq dma 4096 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:09 2021] ata2.00: status: { DRDY }
[08:33:09 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:09 2021] ata2.00: cmd 61/38:38:70:18:c4/00:00:6f:00:00/40 tag 7 ncq dma 28672 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:09 2021] ata2.00: status: { DRDY }
[08:33:09 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:09 2021] ata2.00: cmd 61/08:40:a8:18:c4/00:00:6f:00:00/40 tag 8 ncq dma 4096 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:09 2021] ata2.00: status: { DRDY }
[08:33:09 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:09 2021] ata2.00: cmd 61/20:48:b8:18:c4/00:00:6f:00:00/40 tag 9 ncq dma 16384 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:09 2021] ata2.00: status: { DRDY }
[08:33:09 2021] ata2: hard resetting link
[08:33:09 2021] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[08:33:09 2021] ata2.00: configured for UDMA/133
[08:33:09 2021] ata2: EH complete
[08:33:09 2021] ata2.00: Enabling discard_zeroes_data

The default speed for SATA is 6.0 Gbps, but during the device running, something hardware problem happens, and the original speed is not met.

After several handshakes, the speed is limited to 1.5 Gbps.

The whole procedure is normal for disk problem, but it takes 39 seconds(from 08:32:30 to 08:33:09), and during this time, the disk is blocked and programs can't write data to the disk.

It certainly is a hardware problem, maybe caused by some dust in the hard disk interface or due to violent vibration, but how can we mitigate this problem in the system level?

We checked the normal write speed of the disk and found the lowest speed(1.5Gbps) is enough for our usage. So the simple way is the limit the SATA speed at this speed to reduce the handshake times when hardware problem happends.

To implement this limit we can add libata.force option to kernel:

GRUB_CMDLINE_LINUX="libata.force=1.5"

#sata #linux #disk

How to use

We have a device running Ubuntu and will be powered off directly without shutdown gracefully.

In order to keep the file system from damage, we use the OverlayFS provided in Linux kernel since 3.18.

The usage of OverlayFS is very simple, as following:

# first install overlayroot package
$ sudo apt-get install overlayroot

# second, change the config file /etc/overlayroot.conf
# the simple config is as following
# we enable swap, and disable recurse overlay
$ cat /etc/overlayroot.conf
overlayroot="tmpfs:swap=1,recurse=0"

After rebooting, we should see something like this:

$ df -h
Filesystem              Size  Used Avail Use% Mounted on
udev                     16G  8.0K   16G   1% /dev
tmpfs                   3.1G   74M  3.1G   3% /run
/dev/sda3                96G   16G   76G  17% /media/root-ro
tmpfs-root               16G   60M   16G   1% /media/root-rw
overlayroot              16G   60M   16G   1% /
tmpfs                    16G   24K   16G   1% /dev/shm
tmpfs                   5.0M  4.0K  5.0M   1% /run/lock
tmpfs                    16G     0   16G   0% /sys/fs/cgroup

$ mount
[...]
configfs on /sys/kernel/config type configfs (rw,relatime)
overlayroot on /var/cache/apt/archives type overlay (rw,relatime,lowerdir=/media/root-ro,upperdir=/media/root-rw/overlay,workdir=/media/root-rw/overlay-workdir/_)
overlayroot on /opt/var/cache/apt/archives type overlay (rw,relatime,lowerdir=/media/root-ro,upperdir=/media/root-rw/overlay,workdir=/media/root-rw/overlay-workdir/_)
overlayroot on /var/lib/apt/lists type overlay (rw,relatime,lowerdir=/media/root-ro,upperdir=/media/root-rw/overlay,workdir=/media/root-rw/overlay-workdir/_)
overlayroot on /opt/var/lib/apt/lists type overlay (rw,relatime,lowerdir=/media/root-ro,upperdir=/media/root-rw/overlay,workdir=/media/root-rw/overlay-workdir/_)

How to disable

When you change some file under OverlayFS, and after the reboot, the file will keep the same.

But sometimes you do want to change the original file, how to disable this feature?

We have these methods:

  • Remount the disk with rw, and change the file under lowerdir
  • Use overlayroot-chroot tool provided by the package
  • Disable OverlayFS when booting
  • Disable by overlayroot.conf

Remount

If you just want to change some files, this is very direct, remount the block device and change the file under lowerdir.

# remount with read-write
$ sudo mount -o remount,rw /dev/sda3

# say we want to change overlayroot.conf
# note: we must change under the file under the lowerdir: /media/root-ro
$ sudo vim /media/root-ro/etc/overlayroot.conf

# remount with read-only
$ sudo mount -o remount,ro /dev/sda3

Use overlayroot-chroot

If we want to install some package and keep the package after reboot, we can't use the first method, since the package may change many files under different directories.

We still have a simple way, you run overlayroot-chroot with root, make changes, and the changes will be saved after reboot.

$ sudo overlayroot-chroot

The change may not take effect immediately after exit the command, you can mount again like this:

$ sudo mount -o remount /

Disable OverlayFS when booting

The overlayroot-chroot method may solve 90% of the problem, but it do has some limitation.

The overlayroot-chroot just like chroot into the lower filesystem, and remount with writable.

If you have some scripts-say postinstall in some package, checks the chroot mode, it may refuse to execute under this case.

To fix this problem, we can disable OverlayFS during the booting phase.

We can edit the boot command line, append overlayroot=disabled and boot again.

linux /vmlinuz-4.15.0-123-lowlatency root=UUID=bfb40993-3xxxx ro systemd.unit=multi-user.target overlayroot=disabled

Under this case, the OverlayFS will be disabled completely.

Disable by overlayroot.conf

We can disable OverlayFS during the boot time, but if we want to keep it disabled after several reboots, we have a simple way.

You can change the overlayroot.conf config file using the remount method, and comment all the lines, and then reboot again.

If we want to enable, we can remount and un-comment the config lines, and after reboot, the OverlayFS will be enabled agian.

$ cat /media/root-ro/etc/overlayroot.conf
# overlayroot="tmpfs:swap=1,recurse=0"

#overlayfs #linux

If you write a bash script file, and want to keep the running log, what is the easy way?

You can just echo the log messages to stderr or stdout, and add redirection arguments when calling the script.

You can also define one log function and call it to log messages in the whole script, but it is a little complex.

Recently, I learn a easy way to log messages for simple shell script: using exec builtin command.


# !/bin/env bash

# add this at the begnning of script
exec > /tmp/debug.log 2>&1

# other commands may echo to stderr or stdout
if [ ... ]; then
    echo "success"
else
    echo "fail"
fi

Here is the help for exec command:

exec: exec [-cl] [-a name] [command [arguments ...]] [redirection ...]
    Replace the shell with the given command.

    Execute COMMAND, replacing this shell with the specified program.
    ARGUMENTS become the arguments to COMMAND.  *If COMMAND is not specified,
    any redirections take effect in the current shell.*

    Options:
      -a name	pass NAME as the zeroth argument to COMMAND
      -c	execute COMMAND with an empty environment
      -l	place a dash in the zeroth argument to COMMAND

    If the command cannot be executed, a non-interactive shell exits, unless
    the shell option `execfail' is set.

    Exit Status:
    Returns success unless COMMAND is not found or a redirection error occurs.

Check this message: If COMMAND is not specified, any redirections take effect in the current shell.

So it is very useful for simple script just run several commands for workflow, and especially very convenient to embed script in config files.

#Shell #Bash #Linux