Little Step

Recently I listened to a podcast about JSON and XML, and I got to know the interesting fact that the name JavaScript was a compromise between Netscape and Sun, and that the language is only nominally related to Java.

Full link: JSON vs XML With Douglas Crockford

Originally it (JavaScript) wasn't supposed to be called JavaScript. It was going to be called Mocha, and there was tension between Sun and Netscape.

So Sun had been claiming that if you write to the Java virtual machine, it doesn’t matter what operating system you’re running on, and that means we can be liberated from Microsoft. And Netscape said, “If you target all of your applications to the browser, the browser can run on all of the operating systems, so you’re no longer dependent on Microsoft.”

So they decided to have an alliance, and the first thing they agreed on was that Netscape would put Java into the Netscape browser, so they did, in the form of applets. So you could write applets in Java and they would run in the Netscape browser.

The next thing Sun demanded was, you have to kill Mocha, which I think by that time had been renamed LiveScript, because you’re making this look bad. We’re telling everybody that Java is the last programming language you’ll ever need, and you have this stupid looking thing called LiveScript. Why are you doing that? This is just confusion.

So Netscape thought they could do a similar thing for their Navigator browser: if they could get people programming on the browser in the same way they did on HyperCard, but now with photographs and color and maybe sound effects, it could be a lot more interesting, and you can't do that in Java.

But Sun was not happy about this. They said, "We thought we agreed that Java was going to be how you script the web." And Netscape probably said, "Listen, we can't rebuild everything to make it centered around the JVM. That's too much work, and this scripting thing we have works and is beginner-friendly."

And so, they were at an impasse and their alliance almost broke, when someone, it might have been Marc Andreessen, and it might have been a joke, suggested that they change the name of LiveScript to JavaScript and tell people it's not a different language, it's a subset of Java. It's just this little reduced version of Java, it's Java's stupid little brother. It's the same thing. It's not a different thing. And Sun said, "Yeah, okay." And they held a press conference and they went out and lied to the world about what JavaScript was, and that's why the language has this stupid confusing name.

#json #javascript

As we all know, every object has a this pointer, but what if we want a smart pointer to the current object? How do we get one?

It turns out that it's not as easy as creating a shared_ptr from the this pointer, since we may end up with many shared_ptrs pointing to the same object without knowing about each other.

Take this code as an example: if we create a shared_ptr via the dangerous() function, we end up with two shared_ptrs owning the same object. Both sp1 and sp2 point to the same object, and when both go out of scope, the object is destroyed twice.

#include <memory>
using std::shared_ptr;

struct S {
  shared_ptr<S> dangerous() {
    return shared_ptr<S>(this);  // don't do this! creates a second, unrelated owner
  }
};

int main() {
  shared_ptr<S> sp1(new S);
  shared_ptr<S> sp2 = sp1->dangerous();  // sp1 and sp2 don't know about each other
  return 0;  // double delete when both go out of scope
}

How do we fix this problem? Use the std::enable_shared_from_this helper class defined in <memory>, introduced in C++11.

#include <memory>
using std::enable_shared_from_this;
using std::shared_ptr;

struct S : enable_shared_from_this<S> {
  shared_ptr<S> dangerous() {
    return shared_from_this();  // shares ownership with the existing shared_ptr
  }
};

int main() {
  shared_ptr<S> sp1(new S);              // the object must already be owned by a shared_ptr
  shared_ptr<S> sp2 = sp1->dangerous();  // not dangerous: use_count() is now 2

  return 0;
}
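
One caveat worth knowing: shared_from_this() only works when the object is already managed by a shared_ptr. A minimal sketch of what happens otherwise (since C++17 this throws std::bad_weak_ptr; before C++17 it was undefined behavior):

#include <iostream>
#include <memory>

struct S : std::enable_shared_from_this<S> {
  std::shared_ptr<S> get_shared() { return shared_from_this(); }
};

int main() {
  S s;  // stack object, not owned by any shared_ptr
  try {
    auto sp = s.get_shared();  // C++17: throws std::bad_weak_ptr
  } catch (const std::bad_weak_ptr&) {
    std::cout << "no shared_ptr owns this object yet\n";
  }
  return 0;
}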

#cpp #shared_ptr

There are a lot of computing tasks that are not appropriate to run synchronously, such as media transcoding.

For these tasks, the work is submitted to a central controller service and then dispatched to many workers.

A message queue is commonly used in this case for communication between the controller and the workers, and is sometimes even used to implement the task queue itself.

RQ is a very simple task queue backed by Redis, and it is easy to use; you don't even need to write worker-specific code.

On the producer side, RQ enqueues the function and its arguments into a Redis queue, serialized with Python's pickle library. The worker retrieves the job from the Redis queue, deserializes it, and forks a process to do the actual work.

Here is a simple example that shows how simple the logic is.

You have one function that does the real work, such as:

# fib.py
def slow_fib(n):
    if n <= 1:
        return 1
    else:
        return slow_fib(n-1) + slow_fib(n-2)

And you create jobs and enqueue them:

# run_example.py
import os
import time

from rq import Connection, Queue

from fib import slow_fib


def main():
    # Range of Fibonacci numbers to compute
    fib_range = range(20, 34)

    # Kick off the tasks asynchronously
    async_results = {}
    q = Queue()
    for x in fib_range:
        async_results[x] = q.enqueue(slow_fib, x)

    start_time = time.time()
    done = False
    while not done:
        os.system('clear')
        print('Asynchronously: (now = %.2f)' % (time.time() - start_time,))
        done = True
        for x in fib_range:
            result = async_results[x].return_value
            if result is None:
                done = False
                result = '(calculating)'
            print('fib(%d) = %s' % (x, result))
        print('')
        print('To start the actual work in the background, run a worker:')
        print('    python examples/run_worker.py')
        time.sleep(0.2)

    print('Done')


if __name__ == '__main__':
    # Tell RQ what Redis connection to use
    with Connection():
        main()

On the producer side, you run it like this:

python3 run_example.py
Asynchronously: (now = 8.04)
fib(20) = 10946
fib(21) = 17711
fib(22) = 28657
fib(23) = 46368
fib(24) = 75025
fib(25) = 121393
fib(26) = 196418
fib(27) = 317811
fib(28) = 514229
fib(29) = 832040
fib(30) = 1346269
fib(31) = 2178309
fib(32) = 3524578
fib(33) = 5702887

To start the actual work in the background, run a worker:
    python examples/run_worker.py
Done

On the worker side, you only need this (but make sure the command is executed in the same directory as fib.py):

rqworker
15:36:32 Worker rq:worker:bd9fbdd72217489288bcf6c47e499f9c: started, version 1.11.0
15:36:32 Subscribing to channel rq:pubsub:bd9fbdd72217489288bcf6c47e499f9c
15:36:32 *** Listening on default...
15:36:32 Cleaning registries for queue: default
15:36:32 default: fib.slow_fib(20) (0c5dc1dd-b8b8-4a23-9220-1f4f03781c53)
15:36:32 default: Job OK (0c5dc1dd-b8b8-4a23-9220-1f4f03781c53)
15:36:32 Result is kept for 500 seconds
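
For reference, the run_worker.py referenced in the producer's output can be as small as this; a minimal sketch that does the same thing as the rqworker CLI:

# run_worker.py
from rq import Connection, Queue, Worker

if __name__ == '__main__':
    # Tell RQ what Redis connection to use
    with Connection():
        # listen on the default queue and process jobs as they arrive
        Worker([Queue()]).work()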

But there are two things we need to pay attention to: a) the actual task function needs to live in a separate module (fib.py in this case), and b) rqworker needs to be executed in the same source directory as the producer (run_example.py in this case), so the worker can import fib.slow_fib.

#rq #redis #taskq

We have an IoT device with several SIM cards aggregated to provide communication for the upper-layer applications.

During testing, we hit several communication problems. After a short debugging session and some contact with the operators, we learned that the cause was the operators' IoT DNS restrictions.

Say we have two SIM cards, and we want to access example.com.

We send the DNS query via the first card, and after that we can access the site via the first SIM card. But if we access the site via the second card, the communication is blocked: from the second card's point of view, the name was never resolved, so the destination is unknown.

To get around it, the simple solution is to perform name resolution periodically from all the SIM cards.

Ping is the first tool that came to mind; it has a -I option to specify the interface/address.

-I interface
       interface is either an address, or an interface name. If interface is an address, it sets source address to specified interface address. If interface is an
       interface name, it sets source interface to specified interface. For IPv6, when doing ping to a link-local scope address, link specification (by the '%'-notation
       in destination, or by this option) is required.

But it doesn't work: the option only binds the ICMP messages to the address, not the DNS query, even when we use a domain name as the target.

The other option is dig, which has an option similar to ping's:

-b address[#port]
    Set the source IP address of the query. The address must be a valid address on one of the host's network interfaces, or "0.0.0.0" or "::". An
    optional port may be specified by appending "#<port>"

It ensures the DNS query is sent from the specified address, and after running dig periodically from each card's address, the problem was solved.
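
A minimal shell sketch of that periodic resolution (the two source addresses are hypothetical; use the addresses assigned to your SIM interfaces):

#!/bin/sh
# refresh-dns.sh -- resolve the target name from each SIM card's address
for addr in 10.64.0.2 10.64.1.2; do
    dig -b "$addr" example.com +short
done

# run it periodically, e.g. every 5 minutes from cron:
# */5 * * * * /usr/local/bin/refresh-dns.sh >/dev/null 2>&1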

#IOT #dig #DNS

Today I learned a trick on nixCraft for saving a file from vim when you don't have root permission.

The simple case is like this:

You open a config file in vim, and after editing you hit a permission problem when saving: you don't have write permission on the file.

So how do we keep the current edits and save the change? The simple command is :w !sudo tee %.

Explanation as below:

:w – Write the file (actually the buffer).
!sudo – Run a shell command with sudo.
tee – The output of the write (vim :w) command is redirected to tee, which runs as root and writes the buffer to the file.
% – The % expands to the current file name.

But what about Emacs? How do we do this in Emacs? There is an equivalent:

C-x C-f /sudo::/path/to/file

It uses the TRAMP module to do the same thing.

#vim #emacs

Recently we hit a disk write problem while our service was running.

The dmesg log is as follows:

[08:32:30 2021] NET: Registered protocol family 38
[08:32:30 2021] EXT4-fs (dm-0): warning: mounting fs with errors, running e2fsck is recommended
[08:32:30 2021] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
[08:32:30 2021] ata2.00: exception Emask 0x10 SAct 0x2 SErr 0x800000 action 0x6 frozen
[08:32:30 2021] ata2.00: irq_stat 0x08000000, interface fatal error
[08:32:30 2021] ata2: SError: { LinkSeq }
[08:32:30 2021] ata2.00: failed command: READ FPDMA QUEUED
[08:32:30 2021] ata2.00: cmd 60/08:08:78:19:c1/00:00:12:00:00/40 tag 1 ncq dma 4096 in
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:32:30 2021] ata2.00: status: { DRDY }
[08:32:30 2021] ata2: hard resetting link
[08:32:31 2021] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[08:32:31 2021] ata2.00: configured for UDMA/133
[08:32:31 2021] ata2: EH complete
[08:32:31 2021] ata2.00: Enabling discard_zeroes_data
[08:33:08 2021] ata2.00: exception Emask 0x0 SAct 0xc0000 SErr 0x400001 action 0x6 frozen
[08:33:08 2021] ata2: SError: { RecovData Handshk }
[08:33:08 2021] ata2.00: failed command: READ FPDMA QUEUED
[08:33:08 2021] ata2.00: cmd 60/08:90:00:1d:c1/00:00:12:00:00/40 tag 18 ncq dma 4096 in
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[08:33:08 2021] ata2.00: status: { DRDY }
[08:33:08 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:08 2021] ata2.00: cmd 61/08:98:00:18:c4/00:00:6f:00:00/40 tag 19 ncq dma 4096 out
                         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[08:33:08 2021] ata2.00: status: { DRDY }
[08:33:08 2021] ata2: hard resetting link
[08:33:08 2021] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[08:33:08 2021] ata2.00: configured for UDMA/133
[08:33:08 2021] ata2.00: device reported invalid CHS sector 0
[08:33:08 2021] ata2: EH complete
[08:33:08 2021] ata2.00: Enabling discard_zeroes_data
[08:33:08 2021] ata2.00: exception Emask 0x10 SAct 0xc00000 SErr 0x400100 action 0x6 frozen
[08:33:08 2021] ata2.00: irq_stat 0x08000000, interface fatal error
[08:33:08 2021] ata2: SError: { UnrecovData Handshk }
[08:33:08 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:08 2021] ata2.00: cmd 61/08:b0:00:18:c4/00:00:6f:00:00/40 tag 22 ncq dma 4096 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:08 2021] ata2.00: status: { DRDY }
[08:33:08 2021] ata2.00: failed command: READ FPDMA QUEUED
[08:33:08 2021] ata2.00: cmd 60/08:b8:00:1d:c1/00:00:12:00:00/40 tag 23 ncq dma 4096 in
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:08 2021] ata2.00: status: { DRDY }
[08:33:08 2021] ata2: hard resetting link
[08:33:08 2021] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[08:33:08 2021] ata2.00: configured for UDMA/133
[08:33:08 2021] ata2: EH complete
[08:33:08 2021] ata2.00: Enabling discard_zeroes_data
[08:33:08 2021] ata2: limiting SATA link speed to 1.5 Gbps
[08:33:08 2021] ata2.00: exception Emask 0x10 SAct 0x3fc SErr 0x400100 action 0x6 frozen
[08:33:08 2021] ata2.00: irq_stat 0x08000000, interface fatal error
[08:33:08 2021] ata2: SError: { UnrecovData Handshk }
[08:33:08 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:08 2021] ata2.00: cmd 61/30:10:10:18:c4/00:00:6f:00:00/40 tag 2 ncq dma 24576 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:08 2021] ata2.00: status: { DRDY }
[08:33:08 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:08 2021] ata2.00: cmd 61/18:18:40:18:c4/00:00:6f:00:00/40 tag 3 ncq dma 12288 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:08 2021] ata2.00: status: { DRDY }
[08:33:08 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:08 2021] ata2.00: cmd 61/08:20:60:18:c4/00:00:6f:00:00/40 tag 4 ncq dma 4096 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:08 2021] ata2.00: status: { DRDY }
[08:33:08 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:09 2021] ata2.00: cmd 61/08:28:68:18:c4/00:00:6f:00:00/40 tag 5 ncq dma 4096 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:09 2021] ata2.00: status: { DRDY }
[08:33:09 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:09 2021] ata2.00: cmd 61/08:30:58:18:c4/00:00:6f:00:00/40 tag 6 ncq dma 4096 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:09 2021] ata2.00: status: { DRDY }
[08:33:09 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:09 2021] ata2.00: cmd 61/38:38:70:18:c4/00:00:6f:00:00/40 tag 7 ncq dma 28672 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:09 2021] ata2.00: status: { DRDY }
[08:33:09 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:09 2021] ata2.00: cmd 61/08:40:a8:18:c4/00:00:6f:00:00/40 tag 8 ncq dma 4096 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:09 2021] ata2.00: status: { DRDY }
[08:33:09 2021] ata2.00: failed command: WRITE FPDMA QUEUED
[08:33:09 2021] ata2.00: cmd 61/20:48:b8:18:c4/00:00:6f:00:00/40 tag 9 ncq dma 16384 out
                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
[08:33:09 2021] ata2.00: status: { DRDY }
[08:33:09 2021] ata2: hard resetting link
[08:33:09 2021] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[08:33:09 2021] ata2.00: configured for UDMA/133
[08:33:09 2021] ata2: EH complete
[08:33:09 2021] ata2.00: Enabling discard_zeroes_data

SATA links can run at up to 6.0 Gbps (this controller had negotiated 3.0 Gbps), but while the device was running a hardware problem occurred and the original speed could not be maintained.

After several hard resets and renegotiations, the kernel limited the link speed to 1.5 Gbps.

This recovery procedure is normal for a disk problem, but it took 39 seconds (from 08:32:30 to 08:33:09), and during that time the disk was blocked and programs couldn't write data to it.

It is certainly a hardware problem, maybe caused by dust in the disk connector or by strong vibration, but how can we mitigate it at the system level?

We checked the normal write speed of the disk and found that the lowest link speed (1.5 Gbps) is enough for our usage. So the simple mitigation is to cap the SATA link at this speed, which reduces the number of renegotiations when the hardware problem happens.

To implement this limit we can add the libata.force option to the kernel command line (the kernel documents the value as 1.5Gbps):

GRUB_CMDLINE_LINUX="libata.force=1.5Gbps"
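
For completeness, a sketch of applying and verifying it on an Ubuntu-style system; libata.force also accepts a port prefix, e.g. 2:1.5Gbps to limit only ata2:

# after editing /etc/default/grub, regenerate the config and reboot
$ sudo update-grub
$ sudo reboot

# then verify the negotiated speed
$ dmesg | grep 'SATA link up'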

#sata #linux #disk

How to use

We have a device running Ubuntu that may be powered off directly, without a graceful shutdown.

To keep the file system from being damaged, we use OverlayFS, which has been available in the Linux kernel since 3.18.

Using it via the overlayroot package is very simple, as follows:

# first install overlayroot package
$ sudo apt-get install overlayroot

# second, change the config file /etc/overlayroot.conf
# the simple config is as following
# we enable swap, and disable recurse overlay
$ cat /etc/overlayroot.conf
overlayroot="tmpfs:swap=1,recurse=0"

After rebooting, we should see something like this:

$ df -h
Filesystem              Size  Used Avail Use% Mounted on
udev                     16G  8.0K   16G   1% /dev
tmpfs                   3.1G   74M  3.1G   3% /run
/dev/sda3                96G   16G   76G  17% /media/root-ro
tmpfs-root               16G   60M   16G   1% /media/root-rw
overlayroot              16G   60M   16G   1% /
tmpfs                    16G   24K   16G   1% /dev/shm
tmpfs                   5.0M  4.0K  5.0M   1% /run/lock
tmpfs                    16G     0   16G   0% /sys/fs/cgroup

$ mount
[...]
configfs on /sys/kernel/config type configfs (rw,relatime)
overlayroot on /var/cache/apt/archives type overlay (rw,relatime,lowerdir=/media/root-ro,upperdir=/media/root-rw/overlay,workdir=/media/root-rw/overlay-workdir/_)
overlayroot on /opt/var/cache/apt/archives type overlay (rw,relatime,lowerdir=/media/root-ro,upperdir=/media/root-rw/overlay,workdir=/media/root-rw/overlay-workdir/_)
overlayroot on /var/lib/apt/lists type overlay (rw,relatime,lowerdir=/media/root-ro,upperdir=/media/root-rw/overlay,workdir=/media/root-rw/overlay-workdir/_)
overlayroot on /opt/var/lib/apt/lists type overlay (rw,relatime,lowerdir=/media/root-ro,upperdir=/media/root-rw/overlay,workdir=/media/root-rw/overlay-workdir/_)

How to disable

When you change a file under OverlayFS, the change goes to the tmpfs upper layer, so after a reboot the file is back to its original content.

But sometimes you do want to change the original file, so how do we get around this feature?

We have these methods:

  • Remount the disk with rw, and change the file under lowerdir
  • Use overlayroot-chroot tool provided by the package
  • Disable OverlayFS when booting
  • Disable by overlayroot.conf

Remount

If you just want to change a few files, this is the most direct way: remount the block device read-write and change the files under the lowerdir.

# remount with read-write
$ sudo mount -o remount,rw /dev/sda3

# say we want to change overlayroot.conf
# note: we must change the file under the lowerdir: /media/root-ro
$ sudo vim /media/root-ro/etc/overlayroot.conf

# remount with read-only
$ sudo mount -o remount,ro /dev/sda3

Use overlayroot-chroot

If we want to install a package and keep it after reboot, we can't use the first method, since a package may change many files under different directories.

We still have a simple way: run overlayroot-chroot as root, make the changes, and the changes will persist after reboot.

$ sudo overlayroot-chroot

The changes may not take effect immediately after exiting the command; you can remount like this:

$ sudo mount -o remount /

Disable OverlayFS when booting

The overlayroot-chroot method may solve 90% of the cases, but it does have some limitations.

overlayroot-chroot essentially chroots into the lower filesystem and remounts it writable.

If some script, say a postinstall script in a package, checks for chroot mode, it may refuse to execute in this situation.

To fix this problem, we can disable OverlayFS during the boot phase.

We can edit the boot command line, append overlayroot=disabled, and boot again.

linux /vmlinuz-4.15.0-123-lowlatency root=UUID=bfb40993-3xxxx ro systemd.unit=multi-user.target overlayroot=disabled

In this case, OverlayFS is disabled completely for that boot.
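
After booting this way, the root device should be mounted directly on / again, roughly like this (a sketch based on the earlier df output; device names vary):

$ df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        96G   16G   76G  17% /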

Disable by overlayroot.conf

Appending to the boot command line disables OverlayFS for a single boot, but if we want to keep it disabled across several reboots, we have a simple way.

You can change the overlayroot.conf config file using the remount method above, comment out all the lines, and then reboot.

If we want to enable it again, we can remount, un-comment the config lines, and after a reboot OverlayFS will be enabled again.

$ cat /media/root-ro/etc/overlayroot.conf
# overlayroot="tmpfs:swap=1,recurse=0"

#overlayfs #linux

I have several related packages in different git repositories, and each repository has several branches. It is a headache to package all of them at release time.

So I decided to create a new repository that includes all of the packages, so that a single branch can be packaged in one go.

The project structure is simple: we have several packages in the pkgs directory, and each has its own build.sh.

Since it is very simple, I just wanted to add one Makefile for the whole thing; the content is like this:

CWD:=$(shell pwd)
BUILD:=$(CWD)/build
PACKAGE_NAME:= app-$(shell date "+%Y%m%d%H%M").tar.gz

package:
        for pkg in $(ls -1 $(CWD)/pkgs); do \
                echo "### starting build package: $pkg..."; \
                $(CWD)/pkgs/$pkg/build.sh  $(BUILD); \
                echo "### finish build package $pkg"; \
                echo ; \
        done

And after I ran make, I had no success:

for pkg in ; do \
        echo "### starting build package: kg..."; \
        /home/xxx/app/pkgs/kg/build.sh  /home/xxx/app/build; \
        echo "### finish build package kg"; \
        echo ; \
done

The ls command substitution and $pkg were not expanded correctly: make expanded $(ls -1 ...) itself, to nothing, and treated $p in $pkg as an empty make variable, leaving the literal kg.

After some searching I learned that recipe lines in a makefile are expanded by make before they are handed to the shell, so the statements are effectively expanded twice, and we need double dollar signs for shell variables.

Essentially, gmake scans the command-line for shell built-ins (like for and if) and “shell special characters” (like | and &). If none of these are present in the command-line, gmake will avoid the overhead of the shell invocation by invoking the command directly (literally just using execve to run the command).
[...]
gmake expands command-lines before executing them.

Command expansion is why you can use gmake features like variables (eg, $@) and functions (eg, $(foreach)) in the recipe. It is also why you must use double dollar signs if you want to reference shell variables in your recipe...

The correct recipe is:

for pkg in $$(ls -1 $(CWD)/pkgs); do \
        echo "### starting build package: $$pkg..."; \
        $(CWD)/pkgs/$$pkg/build.sh  $(BUILD); \
        echo "### finish build package $$pkg"; \
        echo ; \
done

Since CWD and BUILD are make variables, they are referenced with single dollar signs. The ls substitution and pkg belong to the shell, so they are referenced with double dollar signs.
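
As an aside, the loop can also be written in make's own language, which sidesteps the quoting issue entirely. A sketch using $(wildcard), $(notdir) and $(foreach), assuming the same layout:

PKGS := $(notdir $(wildcard $(CWD)/pkgs/*))

package:
        $(foreach pkg,$(PKGS),$(CWD)/pkgs/$(pkg)/build.sh $(BUILD);)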

#shell #makefile

I have been using ppscheck to monitor the PPS status recently, and found a problem with the tool.

The task is quite simple: check the PPS status with ppscheck, query the output periodically, and drive it all from Python's subprocess module.

Sample code is like this:

cmd = "sudo ppscheck /dev/ttyS0"
args = shlex.split(cmd)
proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
fd = proc.stdout.fileno()

poller = select.epoll()
poller.register(fd, select.EPOLLIN)

I create a process and register its stdout fd with epoll, but when it runs, the fd never becomes readable.

I tried executing the command from a shell, and the output was normal. Then I redirected all the output to a file, and found the file was empty!

# running OK
$ sudo ppscheck /dev/ttyS0

# redirect all output to file, file is empty
$ sudo ppscheck /dev/ttyS0 > pps.log 2>&1
$ cat pps.log
$

The first thing that came to me is that ppscheck buffers its output, and since stdout is a pipe rather than a terminal, the stream is fully buffered instead of line-buffered.

And after some searching I found that stdbuf can be used to change the buffering mode of a program.

After adding "stdbuf -oL" to the command, the problem was solved.

cmd = "sudo stdbuf -oL ppscheck /dev/ttyS0"

And now I was wondering how stdbuf is implemented to achieve this.

Could it read the output of the program and call flush() whenever it receives a newline? Then I realized it can't be done that way: the original program itself is buffered, so nothing arrives until its buffer is full and it produces the output.

So I checked the source code of stdbuf; the main code is as follows:

/* main function */
if (! set_libstdbuf_options ())
  {
    error (0, 0, _("you must specify a buffering mode option"));
    usage (EXIT_CANCELED);
  }

/* Try to preload libstdbuf first from the same path as
   stdbuf is running from.  */
set_program_path (program_name);
if (!program_path)
  program_path = xstrdup (PKGLIBDIR);  /* Need to init to non-NULL.  */
set_LD_PRELOAD ();
free (program_path);

execvp (*argv, argv);

int exit_status = errno == ENOENT ? EXIT_ENOENT : EXIT_CANNOT_INVOKE;
error (0, errno, _("failed to run command %s"), quote (argv[0]));
return exit_status;

/* set_libstdbuf_options */
if (*stdbuf[i].optarg == 'L')
  ret = asprintf (&var, "%s%c=L", "_STDBUF_",
                  toupper (stdbuf[i].optc));
else
  ret = asprintf (&var, "%s%c=%" PRIuMAX, "_STDBUF_",
                  toupper (stdbuf[i].optc),
                  (uintmax_t) stdbuf[i].size);
if (ret < 0)
  xalloc_die ();

if (putenv (var) != 0)
{
  die (EXIT_CANCELED, errno,
       _("failed to update the environment with %s"),
       quote (var));
}

The above logic is very simple: basically it only sets the buffering options in the environment and then executes the program with execvp.

The key part is how the buffering mode is actually applied; the program uses a nice trick to achieve this: LD_PRELOAD.

stdbuf has another part, called libstdbuf.

stdbuf sets two kinds of environment variables before running the program. The first kind carries the buffering modes: the variables are _STDBUF_I, _STDBUF_O and _STDBUF_E for stdin, stdout and stderr. The second is the LD_PRELOAD environment variable, to which libstdbuf is added. When libstdbuf is loaded, it applies the buffering modes; the code is as follows:

/* Use __attribute to avoid elision of __attribute__ on SUNPRO_C etc.  */
static void __attribute ((constructor))
stdbuf (void)
{
  char *e_mode = getenv ("_STDBUF_E");
  char *i_mode = getenv ("_STDBUF_I");
  char *o_mode = getenv ("_STDBUF_O");
  if (e_mode) /* Do first so can write errors to stderr  */
    apply_mode (stderr, e_mode);
  if (i_mode)
    apply_mode (stdin, i_mode);
  if (o_mode)
    apply_mode (stdout, o_mode);
}

/* core part of apply_mode */
if (setvbuf (stream, buf, setvbuf_mode, size) != 0)
  {
    fprintf (stderr, _("could not set buffering of %s to mode %s\n"),
             fileno_to_name (fileno (stream)), mode);
    free (buf);
  }

The __attribute ((constructor)) marks the function as an initializer that runs when the shared library is loaded, a GCC feature.

So when libstdbuf.so is loaded, the stdbuf function is called and the buffering mode is set with setvbuf.

And after that, the original program is executed normally.
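
To see the trick in isolation, here is a toy version of the preloaded library (my own sketch, not the coreutils code) that line-buffers stdout for any dynamically linked program:

/* mystdbuf.c -- toy libstdbuf: line-buffer stdout when loaded.
   Build: gcc -shared -fPIC -o mystdbuf.so mystdbuf.c
   Use:   LD_PRELOAD=./mystdbuf.so ppscheck /dev/ttyS0 > pps.log */
#include <stdio.h>

static void __attribute__ ((constructor))
init_buffering (void)
{
  /* same effect as stdbuf -oL: switch stdout to line buffering */
  setvbuf (stdout, NULL, _IOLBF, 0);
}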

#stdbuf #libstdbuf #ld #LD_PRELOAD #ppscheck #coreutils

Expect is a very useful tool for automating interactive applications, but it is also very old. It is based on the Tcl language, which was created in 1988 and is not seen much nowadays outside the testing domain.

Expect scripts are written in Tcl, and the syntax is a little weird compared with other languages popular nowadays.

The only scenario where I used expect, many years ago, was automating SSH logins. And after I learned how to log in with an SSH key, I abandoned it.

I came back to expect recently because of the same scenario, SSH login. I need to jump through several SSH hops to reach the target server, and due to a system restriction, the SSH key can't be saved permanently and is lost after a reboot.

The scenario is a little more complicated than my old case. In order to learn how to write the expect script better, I read the book Exploring Expect: A Tcl-based Toolkit for Automating Interactive Programs. The book is nicely written and worth reading if you want to learn a little about Tcl or expect.

Here are some tips I learned from the book.

The first important thing is how to debug

The simplest way is to use the -d option. Here are several variants:

# 1: add -d with expect command
$ expect -d sample.exp

# 2: add -d at the first line of expect script
#!/usr/bin/env expect -d

# 3: add exp_internal 1 in the script
spawn telnet abc.net
exp_internal 1

expect "Login: "
send "don\r"
expect "Password: "
send "swordfish\r"

# 4: you can also use the -D option to trigger a gdb-like debugger
$ expect -D1 sample.exp
1: expect "hi\n"

dbg1.0>

# 5: use strace <level> to print statements before they are executed,
# like set -x in the shell
expect -c "strace 4" sample.exp

Expect with patterns and actions

You can use expect with many patterns and actions, just like switch in C:

expect {
    "hi" { send "You said hi\n"}
    "hello" { send "Hello yourself\n"}
    "bye" { send "That was unexpected\n"}

    # a special pattern default(without quotation) is for timeout and EOF
    default {send "timeout or eof\n"}
}

The expect command supports both glob and regex pattern matching. The options are -gl and -re; the default is glob matching (-gl).

The matched string is saved in expect_out(0,string), and the match plus any previously unmatched output is saved in the variable expect_out(buffer).

The command exp_continue allows expect itself to continue executing rather than returning as it normally would.

Spawn new process

You can use the spawn command to create a new process like this:

spawn ftp abc.net
expect "Name"
send "anonymous\r"
expect "Password:"
send "don@libes.com\r"

But if the command is dynamic, held in a variable for example, you need to use it with the eval command. eval in expect is like eval in the shell: it expands the variables and then executes the command.

#!/usr/local/bin/expect --
set timeout [lindex $argv 0]

# spawn the command from argv with eval
eval spawn [lrange $argv 1 end]
expect

Interact with the spawned process and continue

The simplest usage of interact is to return control to the user.

But interact also provides patterns and actions, just like expect.

A simple example is:

spawn ftp abc.net
# ...

interact {
    "~d"        {puts [exec date]}
    "~e"        exit
    "foo"       {puts "bar"}
}

When the user types "~d", the date command is executed and the result is echoed, and so on.

It also provides a way to break or continue execution, like this:

while {1} {
    interact "+" break "-" continue
}

In the above loop, if the user presses "+", interact returns and the loop breaks. If "-" is pressed, interact returns and the while loop continues.

Signal handling

The trap command is used to handle signals; you can give it a procedure name or an inline body:

trap intproc SIGINT
trap {
    send_user "bye bye"
    exit
} SIGINT

And a special signal, SIGWINCH, is delivered on window size changes; the handler looks like this:

trap {
    set rows [stty rows]
    set cols [stty columns]
    stty rows $rows columns $cols < $spawn_out(slave,name)
} WINCH
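
Putting the pieces together, here is a minimal sketch of my jump-host scenario (host names, prompts, and passing the password on the command line are all placeholders; a real script should obtain the secret more safely):

#!/usr/bin/env expect
# jump.exp -- hop through a jump host to the target, then hand over control
set timeout 15
set password [lindex $argv 0]

spawn ssh user@jumphost
expect "assword:"
send "$password\r"
expect "$ "                  ;# assumed shell prompt on the jump host

send "ssh user@target\r"
expect "assword:"
send "$password\r"
expect "$ "

interact                     ;# give the session back to the user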

#tcl #expect #ssh