Thursday, December 12, 2013

The 'cd' command

As everyone knows, the 'cd' command in Linux changes the current working directory to a new directory. Really?! To be precise, it changes the current working directory of the calling process. It cannot change the 'cwd' of any other process. So what's the big deal?

Internally, 'cd' uses the chdir() system call to change the current working directory. As mentioned above, chdir() can only change the 'cwd' of the process that calls it, not that of any other process. So what? :-) Now think of the bash process that is executing the 'cd' command. How does it change its own 'cwd'? Other commands like 'ls' or 'dir' are usually executed with the fork+exec combination, i.e. by spawning a new process. For 'cd' that cannot work: a child process can change only its own 'cwd', and that change disappears when the child exits, leaving bash's 'cwd' untouched.

Aha! Here is the interesting part. How do you work around it? What does bash do? Bash embeds the implementation of 'cd' in its own executable, i.e. 'cd' is a builtin command of bash itself rather than a stand-alone executable like 'ls' or 'dir'. The user still perceives it as a stand-alone command because of this bash trick ;-). When bash resolves a command name, it checks its builtins before it searches the directories listed in PATH, so 'cd' never has to exist as a file. Since 'cd' runs inside the bash process, changing to the new working directory is straightforward :-). You can check whether any executable named 'cd' can be found on your PATH, as in the session below ;-) (Some systems do ship a stand-alone 'cd' binary, but it cannot change your shell's directory for the reason explained above. In bash, 'type cd' reports "cd is a shell builtin".)

nandakumar@heramba ~ $ which ls || echo -e "get lost :-)"
/bin/ls
nandakumar@heramba ~ $ which cd || echo -e "get lost :-)"
get lost :-)
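
The split between builtin and external commands can be sketched in C. This is only a toy illustration, not how bash is actually written: run_command() is a hypothetical helper that handles 'cd' inside the calling process and hands everything else to fork+exec.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run one command the way a toy shell might.
 * Returns 0 on success, non-zero on failure. */
int run_command(char *const argv[])
{
    if (argv[0] == NULL)
        return 0;

    if (strcmp(argv[0], "cd") == 0) {
        /* Builtin: chdir() runs inside the shell process itself. */
        const char *target = argv[1] ? argv[1] : getenv("HOME");
        if (target == NULL) {
            fprintf(stderr, "cd: HOME not set\n");
            return 1;
        }
        if (chdir(target) != 0) {
            perror("cd");
            return 1;
        }
        return 0;
    }

    /* Everything else: fork a child and exec the program found via PATH. */
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);
        perror(argv[0]);        /* reached only if exec failed */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return 1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
}
```

Passing { "cd", "/", NULL } changes the caller's own 'cwd' to "/", while { "ls", NULL } would run in a child process and leave the caller where it was.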

Even I was not aware of this until I recently read Linux System Programming by Robert Love. The beauty of the book is how accurately Mr. Robert Love presents such minute details. At the end of the day, there was a happy learner!

Thursday, December 5, 2013

Thread local storage with gcc __thread keyword

gcc provides the __thread keyword to make a global or static variable (in general, any variable that would otherwise live in the data segment) local to each thread.

This is useful when your code needs thread-specific data.

Consider an example:

#include <stdio.h>
#include <pthread.h>

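void iterate(void)
{
        /* One copy of i, shared by every thread */
        static int i=0;
        i++;
        /* pthread_t is opaque; the cast assumes it is an integer type, as on Linux */
        printf("Thread id: %x, i=%d\n", (unsigned int)pthread_self(), i);
}

void* thread_func (void* data)
{
        (void)data; /* unused */
        iterate();
        return NULL;
}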

int main()
{
        pthread_t tid[5];
        int i=0;

        for (i=0; i<5; i++)
                pthread_create(&tid[i], NULL, thread_func, NULL);

        for (i=0; i<5; i++)
                pthread_join(tid[i], NULL);
}


Here is the output from one run (the order of the lines varies between runs):

Thread id: 6ebcc700, i=1
Thread id: 6e3cb700, i=2
Thread id: 6d3c9700, i=4
Thread id: 6cbc8700, i=5
Thread id: 6dbca700, i=3


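A static variable lives in the data segment, and there is only one copy of it, shared by every thread in the process. That is why the five threads above see i climb from 1 to 5. The unsynchronized i++ is also a data race, so the code is not thread safe. Adding the __thread keyword gives each thread its own copy of i. Here is the modified snippet; the rest of the program stays the same.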

<snip>

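void iterate(void)
{
        /* __thread: every thread gets its own copy of i, starting at 0 */
        static __thread int i=0;
        i++;
        /* pthread_t is opaque; the cast assumes it is an integer type, as on Linux */
        printf("Thread id: %x, i=%d\n", (unsigned int)pthread_self(), i);
}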

<snip>


And the output:

Thread id: 31339700, i=1
Thread id: 2f335700, i=1
Thread id: 30b38700, i=1
Thread id: 30337700, i=1
Thread id: 2fb36700, i=1

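The __thread keyword is not limited to plain integer types: it also works with pointers, structs and other plain-old-data types. The restrictions are that the variable must have static storage duration (global, file-scope static or function-scope static) and that its initializer must be a constant expression, so in C++ it does not suit objects with non-trivial constructors or destructors. For data that needs run-time setup or cleanup, the POSIX API mentioned next is an alternative. There is also some overhead in using __thread, since each access has to locate the calling thread's copy of the variable; how much depends on the platform and on whether the code lives in the executable or in a shared library. (A look at the generated assembly will show it, IMHO.)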
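
As a quick check that __thread is happy with struct and pointer types, here is a small sketch, assuming gcc on Linux. The names struct stats, tls_stats, tls_ptr and run_workers() are made up for illustration.

```c
#include <pthread.h>
#include <stddef.h>

struct stats {
    int calls;
    const char *label;
};

/* Each thread gets its own copy of both variables. The initializers
 * must be constant expressions. */
static __thread struct stats tls_stats = { 0, "unset" };
static __thread int *tls_ptr = NULL;

void *worker(void *arg)
{
    int local = *(int *)arg;

    tls_ptr = &local;               /* pointer-typed TLS variable */
    tls_stats.calls += *tls_ptr;    /* struct-typed TLS variable */
    tls_stats.label = "worker";

    /* Report this thread's private count back to the caller. */
    *(int *)arg = tls_stats.calls;
    return NULL;
}

/* Runs two workers, stores each worker's private count in results[], and
 * returns the calling thread's own tls_stats.calls (or -1 on error). */
int run_workers(int results[2])
{
    pthread_t tid[2];
    int args[2] = { 10, 20 };
    int started = 0;

    while (started < 2 &&
           pthread_create(&tid[started], NULL, worker, &args[started]) == 0)
        started++;

    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    if (started < 2)
        return -1;

    results[0] = args[0];
    results[1] = args[1];
    return tls_stats.calls;
}
```

The two workers end up with counts of 10 and 20, while the calling thread's own tls_stats.calls stays at 0, because no worker ever touches the caller's copy.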

POSIX also provides the pthread_getspecific() and pthread_setspecific() APIs for TLS. I will try to experiment with them in the future.
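
Until then, here is a rough sketch of what a per-thread counter might look like with the POSIX key API. The names counter_key, bump_counter() and worker() are made up for this post; the pthread_* calls are the standard ones.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t counter_key;
static pthread_once_t counter_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    /* free() is called on each thread's non-NULL value when it exits. */
    pthread_key_create(&counter_key, free);
}

/* Increment and return the calling thread's private counter,
 * or return -1 if the per-thread slot could not be set up. */
int bump_counter(void)
{
    pthread_once(&counter_once, make_key);

    int *p = pthread_getspecific(counter_key);
    if (p == NULL) {
        /* First call on this thread: allocate its private slot. */
        p = calloc(1, sizeof *p);
        if (p == NULL)
            return -1;
        if (pthread_setspecific(counter_key, p) != 0) {
            free(p);
            return -1;
        }
    }
    return ++*p;
}

/* Thread body: bump twice and hand the final value back through arg. */
void *worker(void *arg)
{
    bump_counter();
    *(int *)arg = bump_counter();
    return NULL;
}
```

Unlike __thread, the key API lets you attach a destructor (free() here), at the cost of a function call and a heap allocation per thread.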