Aug 23, 2020

Linux: The time Command

The time command is used to time a running program along with its resource usage. The time command runs the specified program, together with its arguments. The command prints to stderr the time taken by the program together with the sub-command’s resource usage. For example, the following is the output when timing a sleep command:

$/usr/bin/time sleep 10
0.00user 0.00system 0:10.00elapsed 0%CPU (0avgtext+0avgdata 1956maxresident)k
0inputs+0outputs (0major+75minor)pagefaults 0swaps

Although being a very simple command, time provides a lot of information to analyze the behaviour of a program and can be useful in serious observability workflows. The time command is usually run first on a program to get a general idea about its runtime and resource usage, before using other heavy observability tools. By default, on most Linux systems(using the command present at /usr/bin/ instead of the shell built-in), the corresponding metrics are print out to stderr:

Elapsed time - The time between the invocation of the program and its termination
User CPU time - The time the program spent in user mode
System CPU time - The time the program spent in kernel mode
CPU Percentage - The percentage of CPU the program used during its lifetime
Average Text Segment Shared size - The average shared text segment of the program in Kbytes
Average Unshared Data Segment Unshared size - The average unshared data segment of the program in kbytes
Maximum RSS size - The maximum resident set size of the program during its lifetime in Kbytes
Filesystem Inputs count - The number of filesystem inputs performed by the program
Filesystem Outputs count - The number of filesystem outputs performed by the program
Major Page Faults count - The number of major page faults generated by the program. This results in the kernel reading the page from disk
Minor Page Faults count - The number of minor page faults generated by the program. This results in the kernel only updating the page table entries for the program as the requested page is already in memory
Swap count - The number of time the program has been swapped out to the disk due to memory constraints

To get the output that is Posix portable, the switch -p can be used. When run with this option, the time command prints only the elapsed time, user CPU time and the system CPU time. This can be seen below:

$/usr/bin/time -p sleep 10
real 10.00
user 0.00
sys 0.00

The exit status of the time command is the exit status of the sub-command that was run. It is 127 if the requested command is not found, 128 if the command was found and not invoked and if something else entirely went wrong, the return value is between 1-125.

Now, let us try to time two different algorithms to see which is more efficient with regards to CPU. Both the algorithms presented below are sorting algorithms and are CPU bound tasks. Without suspense, one algorithm is Bubble Sort which runs in O(N^2) time and the other is Quick Sort which runs in O(N lg N) time.

To generate numbers to sort, we can generate the input file using a script like below:

#!/bin/bash

start_range=$1
end_range=$2
count=$3
echo $count
shuf -i $start_range-$end_range -r -n $count

And we will use the following command to generate 100000 numbers in the range 0-1000000:

$./generate_numbers.sh 0 1000000 100000 > numbers.out
$wc -l numbers.out
100001 numbers.out

The number of lines is 100001 since we also add the count of the numbers as the first line, as required by the sorting programs. And now running the Bubble Sort on the numbers, we get the following metrics:

$/usr/bin/time ./sort1 < numbers.out
139.80user 0.00system 2:19.84elapsed 99%CPU (0avgtext+0avgdata 3600maxresident)k
0inputs+0outputs (0major+228minor)pagefaults 0swaps

And, running the same input on Quick Sort:

$/usr/bin/time ./sort2 < numbers.out
0.08user 0.00system 0:00.09elapsed 98%CPU (0avgtext+0avgdata 3624maxresident)k
0inputs+0outputs (0major+229minor)pagefaults 0swaps

We can now easily see how different the algorithms are, with respect to the same input. Although, the above does not come much of a surprise, many times you want to run similar algorithms to see which is faster. This command will also help in situations, where you have two black box programs and you want to find the more efficient one to use.

Next, let us analyze the memory usage of a two programs. Both the program that we will time, will try to solve the same usecase - searching for a number in a very long list. Both the programs will store the input in a std::vector<int>, and during the life of the program, we want to search for a number in the given input.

The first program will do a linear scan on the std::vector<int>, constructed out on the input, and try to find for a presence of a number. The pattern will be similar to the one below:

// Read input and store them in std::vector<int>
int num_to_search = some_num;
// numbers is a std::vector<int>
for (const int n : numbers) {
  if (n == num_to_search) {
    // num_to_search is found
  }
}

The other program will use an in-memory unordered map to hold the numbers to do a lookup, to determine if the requested number is present or not. The high level strategy would be as follows:

// Read input and store them in std::vector<int>
int num_to_search = some_num;
// numbers_map is a global std::unordered_map<int, bool>
if (numbers_map.find(num) != numbers.end()) {
  // num_to_search is found
}

We shall now generate 1000000 numbers as shown below:

$./generate_numbers.sh 0 1000000 1000000 > numbers.out
$wc -l numbers.out
1000001 numbers.out

And running time for the first program, the output is as follows:

$/usr/bin/time ./search1 < numbers.out
0.34user 0.01system 0:00.36elapsed 100%CPU (0avgtext+0avgdata 7084maxresident)k
0inputs+0outputs (0major+1110minor)pagefaults 0swaps

Again running time for the second program, the output is now:

$/usr/bin/time ./search2 < numbers.out
1.34user 0.06system 0:01.41elapsed 99%CPU (0avgtext+0avgdata 32312maxresident)k
0inputs+0outputs (0major+8757minor)pagefaults 0swaps

Clearly, we can make out the second program that takes more memory(7084 Kb vs 32312 Kb) and is less efficient in terms of the space used. This can easily be done for large programs to determine the memory hoggers which might warrant using other specialized tools to find the exact cause of the large memory usage.

Let us now use time to check the I/O done by the program. In this scenario, let us read/write to a file and check the counters on the inputs/outputs metrics. We shall use the below program to perform our input/output tests:

#include <iostream>
#include <limits>

#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    const unsigned int buf_size = 1024;
    std::string file_name = "some_tmp_file";
    char read_buffer[buf_size];
    int fd = open(file_name.c_str(),
            O_RDWR | O_APPEND | O_CREAT,
            S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    std::string data = "This is some sample data";
    int n_written = write(fd, data.c_str(), data.size());
    std::cout << "Written " << n_written << " bytes to " << file_name
        << std::endl;
    lseek(fd, 0, SEEK_SET);
    int n_read = read(fd, read_buffer, buf_size);
    std::cout << "Read " << n_read << " bytes from " << file_name
        << std::endl;
    close(fd);
    return 0;
}

In the above, we are writing to the file some_tmp_file using the write() system call and reading the written bytes using the read() system call. Now, timing the above program, time produces the following output:

$g++ read_write.cpp -o read_write
$/usr/bin/time ./read_write
Written 24 bytes to some_tmp_file
Read 24 bytes from some_tmp_file
0.00user 0.00system 0:00.00elapsed 50%CPU (0avgtext+0avgdata 3500maxresident)k
0inputs+8outputs (0major+132minor)pagefaults 0swaps

Let us try to work out how 0 was captured in inputs and 8 for outputs, by running the time command on the program above. The inputs/outputs reported by time is the number of blocks written to or read from the disk. On my machine, a single block size is 512 Kb. And the OS reads and writes data in the units of a page size, and here on my machine the page size 4096 Kb and hence 4096/512 = 8 is being captured in the output.

So, why is 0 captured in the inputs? This is because the input from the file some_tmp_file is read from the page cache and not from the disk. The inputs/outputs of the time command captures only the inputs/outputs from the filesystem. Let us change our program to only read from the file and comment out the write part to the same file. The output after the change:

$g++ read_write.cpp -o read_write
$/usr/bin/time ./read_write
Read 24 bytes from some_tmp_file
0.00user 0.00system 0:00.00elapsed 100%CPU (0avgtext+0avgdata 3448maxresident)k
0inputs+0outputs (0major+132minor)pagefaults 0swaps

So, the inputs is still 0? Yes, this is expected since, the read again hits the page cache - the page being bought into the cache during the first access. The only way it will be taken off the page cache is when the system needs more memory when running processes or when we clear the page cache manually. Here, we will manually clear the page cache and then try again. This is done as below (Note, we are dropping all cache entires in the system and not just for the above program), before again running the program:

$free
              total        used        free      shared  buff/cache   available
Mem:       32887568      290460    28512824        1140     4084284    32117516
Swap:             0           0           0

$sudo sysctl -w vm.drop_caches=1

$free
              total        used        free      shared  buff/cache   available
Mem:       32887568      295724    32236100        1140      355744    32139788
Swap:             0           0           0

$/usr/bin/time ./read_write
Read 24 bytes from some_tmp_file
0.00user 0.00system 0:00.01elapsed 54%CPU (0avgtext+0avgdata 3352maxresident)k
632inputs+0outputs (6major+126minor)pagefaults 0swaps

In the above, not only is the input from the file some_tmp_file bought in from the disk, other files too are mapped into memory (for e.g. the text/data section of the executable, shared libraries etc.), since we cleared the entire cache in the system, and hence the high count. This also explains the high major page faults.

time Internals

Now, let us look at how the time command may be implemented, within Linux. We shall not go too much into the inner details but will look at it from a higher level as to how it works and what system calls it utilizes.

The main idea of implementing a custom_time command is to do a fork() to create a child and then follow it up with an exec() call to run the sub-command to time. And once the sub-command finishes, use the getrusage() system call to get the resource usage of the child and print it to stderr. And finally, return the exit status of the sub-command that was run, as the return status of the custom_time command. Also, instead of doing a wait() and then calling getrusage(), the above can be done in a single call using the wait3() or the wait4() system call.

A high level pattern in code will look like:

// Other code
pid_t pid = fork();
if (pid == -1) {
  // Error. Do something
} else if (pid == 0) {
  // Child Process
  execl("/usr/bin/sleep", "/usr/bin/sleep", "10", NULL);
} else {
  // Parent Process
  int status;
  rusage usage;
  int ret = wait4(pid, &status, 0, &usage);
  // Use usage
}

Here, we exec() into the program that we want to time. The parent then uses wait4() to wait and get the resource usage of the child which was under observation. The detailed fields of the rusage struct can be seen here. A thing to note — the rusage struct does not capture the total elapsed time. You need to surround the exec() and the wait4() call with some sort of time function like:

start = time();
# ...
execl("/usr/bin/sleep", "/usr/bin/sleep", "10", NULL);
# ...
int ret = wait4(pid, &status, 0, &usage);
end = time();
elapsed = end - start; // For some unit
# ...

This is a very good blog post to help you with choosing the right method to time your calls.

If you are interested to check out the source of the time command, you can check out Linux’s implementation here.

That’s it. Hope this post made it clear about the use of the time command. It is always useful to use the time command to quickly time a program and check it’s resource usage. The time command is usually used as the first step to get a general idea before turning to more heavy observability tools.

For any discussion, tweet here.

[1] https://man7.org/linux/man-pages/man1/time.1.html
[2] https://man7.org/linux/man-pages/man2/wait4.2.html
[3] https://man7.org/linux/man-pages/man2/getrusage.2.html
[4] https://git.savannah.gnu.org/gitweb/?p=time.git;a=tree;f=src;h=99622ce9a13a64d2369e2344ae7626a928b36217;hb=HEAD