Tuesday, August 26, 2014

Set or Unset a particular bit in an integer - A short program

Many times we need to set or unset a bit without using a conditional construct. It is easy with an 'if' construct; just for fun, is it possible without one? Here is the program I wrote:

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

uint32_t
set_unset_bit (uint32_t src, uint32_t pos, bool set)
{
    /* Clear the bit at 'pos', then OR in the requested value. */
    return ((src & ~((uint32_t)1 << (pos-1))) | ((uint32_t)!!set << (pos-1)));
}

int main (int argc, char **argv)
{
    uint32_t test1 = 0;
    uint32_t test2 = 0xFFFFFFFF;

    printf("%x\n", set_unset_bit(test1, 10, 1));
    printf("%x\n", set_unset_bit(test2, 11, 0));
}


This works for uint32_t types as expected. What is the approach? Basically, you force the bit of interest to zero and then OR in whatever the caller requested (set or unset, i.e. 1 or 0). The first part, "(src & ~((uint32_t)1 << (pos-1)))", forcibly clears the bit at the requested position 'pos', while the second part, "| (!!set << (pos-1))", sets or clears it based on the request. Hope you enjoy the small code snippet.

Why do you think '!!set' is required? Read my previous article here.

Output:

nandakumar@heramba ~/setbit $ ./a.out
200
fffffbff

Final note: this works only for uint32_t, but it can be extended to other types too! For uint64_t, make sure to cast '1' to uint64_t before shifting!

Sunday, August 24, 2014

A simple read-write lock implementation - Problem 1

Here I implemented a very basic read-write lock based on the pthread library. You may be wondering how the code below works:

void
my_rwlock_write_lock (my_rwlock_t *rw_lock)
{
    if (!rw_lock) {
        return;
    }

    pthread_mutex_lock(&rw_lock->rw_mutex);
    if (rw_lock->num_readers) {
        pthread_cond_wait(&rw_lock->rw_cond, &rw_lock->rw_mutex);
    }
}

Hell, where is the unlock? Does it not cause a permanent deadlock?

If you look at the manpage of pthread_cond_wait(), the function atomically unlocks the mutex and waits on the condition variable. So there is no need to worry!
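For reference, the canonical pattern from the POSIX documentation re-checks the predicate in a loop, because pthread_cond_wait() is allowed to wake spuriously. A sketch with hypothetical names:

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int             ready;  /* the predicate being waited on */

/* The mutex is held on entry, atomically released while waiting, and
 * re-acquired before pthread_cond_wait() returns. The while loop guards
 * against spurious wakeups, which POSIX explicitly permits. */
void
wait_until_ready (void)
{
    pthread_mutex_lock(&lock);
    while (!ready) {                      /* re-check after every wakeup */
        pthread_cond_wait(&cond, &lock);  /* unlock + wait, atomically */
    }
    pthread_mutex_unlock(&lock);
}
```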

Now coming to the problem. As you are aware, the previous implementation ran in a continuous loop, and hence readers and writers were grabbing their time slices properly (even though not fairly). Now let me remove the for-loops from all the thread routines and examine the output.

nandakumar@heramba ~/rwlock $ ./a.out
Grabbed read-lock -- 1
Grabbed read-lock -- 2
Grabbed write-lock -- 1


Voila, only one writer gets woken up while the other waits forever. What is the reason? There is no one to wake it up! pthread_cond_signal() wakes up only one thread, and hence only one of the writer threads gets woken up. How do we get around this problem? Simple: wake up all writer threads waiting on the lock. The writer threads will then be scheduled as per the scheduling policy, but none of them will be left blocked on the condition.

The modified part of read_unlock is below.

void
my_rwlock_read_unlock (my_rwlock_t *rw_lock)
{
    if (!rw_lock) {
        return;
    }

    pthread_mutex_lock(&rw_lock->rw_mutex);
    if (rw_lock->num_readers) {
        rw_lock->num_readers--;
        if (!rw_lock->num_readers) {
            pthread_cond_broadcast(&rw_lock->rw_cond);
        }
    }
    pthread_mutex_unlock(&rw_lock->rw_mutex);
}


And here is output:

nandakumar@heramba ~/rwlock $ ./a.out
Grabbed read-lock -- 1
Grabbed read-lock -- 2
Grabbed write-lock -- 1
Grabbed write-lock -- 2


Works as expected! Another way is to signal the threads waiting on the condition variable when the write lock is unlocked. Here is the modified code snippet and its output. Note that the conditional signal is still required while unlocking the final read lock.

void
my_rwlock_write_unlock (my_rwlock_t *rw_lock)
{
    if (!rw_lock) {
        return;
    }

    pthread_cond_broadcast(&rw_lock->rw_cond);
    pthread_mutex_unlock(&rw_lock->rw_mutex);
}


Output:

nandakumar@heramba ~/rwlock $ gcc my_rwlock.c -lpthread
nandakumar@heramba ~/rwlock $ ./a.out
Grabbed read-lock -- 2
Grabbed read-lock -- 1
Grabbed write-lock -- 2
Grabbed write-lock -- 1


I still feel #1 is the better solution, since it is the reader's responsibility to wake up the writing threads.

Finally, here is a comparison between the initial implementation and the one with pthread_cond_broadcast() in read_unlock().

Output with pthread_cond_signal:

Grabbed read-lock -- 1
Grabbed read-lock -- 2
Grabbed write-lock -- 1
Grabbed read-lock -- 1
Grabbed read-lock -- 2
Grabbed write-lock -- 2
Grabbed read-lock -- 1
Grabbed read-lock -- 2
Grabbed write-lock -- 1
Grabbed read-lock -- 1
Grabbed read-lock -- 2


Output with pthread_cond_broadcast:

Grabbed read-lock -- 1
Grabbed read-lock -- 2
Grabbed write-lock -- 1
Grabbed read-lock -- 2
Grabbed write-lock -- 2
Grabbed read-lock -- 1
Grabbed write-lock -- 2
Grabbed read-lock -- 2
Grabbed write-lock -- 1
Grabbed read-lock -- 1
Grabbed write-lock -- 2
Grabbed write-lock -- 1
Grabbed read-lock -- 2
Grabbed read-lock -- 1


As you can see, the initial implementation does not guarantee a fair chance, besides the inherent problem I mentioned above. With the second, even though not in order, all threads get a fair chance to access the shared resource. The second run also confirms that no regression was introduced by the change!

There may be far better solutions to circumvent the problem. The comment box is always at your service to help me :)

Hope you enjoyed the sequel. By the way, there are still more problems. Stay tuned for further analysis!

Thursday, August 21, 2014

A simple read-write lock implementation

Here is a simple read-write lock I implemented in quick succession, just to understand how pthread rwlocks might be implemented. I have not dug deep into the pthread code yet, but I believe it uses some similar mechanism. This implementation functions as per the definition of read-write locks. Note that this is a very basic implementation I coded to satisfy my inquisitiveness. There are problems with it, though, which I will be documenting further.

How does it work?

i)  The structure consists of a mutex, a condition variable and a reader count.
ii) As soon as a reader grabs the lock, the reader count is atomically incremented.
iii) Note that the mutex is common to both readers and writers.
iv) If a thread wants to acquire the write lock, it first atomically checks whether the reader count is zero. If yes, it can safely acquire the lock.
v) If there are readers pending, the writer blocks on the condition variable.
vi) Once all readers are done reading, the condition variable is signalled, waking the waiting writer thread.

Bugs:

If readers hold the lock for the same amount of time as writers, then writers are severely starved and may not even get a chance to hold the lock. I will simulate this in the next write-up.


For brevity, error checks are offloaded to readers and users!

Here is the code then. Execute, test and enjoy! You are open to provide comments on improvements.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>     /* sleep() */
#include <pthread.h>

typedef struct my_rwlock {
    pthread_cond_t   rw_cond;
    pthread_mutex_t  rw_mutex;
    int              num_readers;
} my_rwlock_t;

#define MY_RWLOCK_INITIALIZER {PTHREAD_COND_INITIALIZER, PTHREAD_MUTEX_INITIALIZER, 0}

my_rwlock_t g_rw_lock = MY_RWLOCK_INITIALIZER;

void
my_rwlock_init (my_rwlock_t *rw_lock)
{
    if (!rw_lock) {
        return;
    }

    memset(rw_lock, 0, sizeof(my_rwlock_t));
}

void
my_rwlock_read_lock (my_rwlock_t *rw_lock)
{
    if (!rw_lock) {
        return;
    }

    pthread_mutex_lock(&rw_lock->rw_mutex);
    rw_lock->num_readers++;
    pthread_mutex_unlock(&rw_lock->rw_mutex);
}

void
my_rwlock_read_unlock (my_rwlock_t *rw_lock)
{
    if (!rw_lock) {
        return;
    }

    pthread_mutex_lock(&rw_lock->rw_mutex);
    if (rw_lock->num_readers) {
        rw_lock->num_readers--;
        if (!rw_lock->num_readers) {
            pthread_cond_signal(&rw_lock->rw_cond);
        }
    }
    pthread_mutex_unlock(&rw_lock->rw_mutex);
}

void
my_rwlock_write_lock (my_rwlock_t *rw_lock)
{
    if (!rw_lock) {
        return;
    }

    pthread_mutex_lock(&rw_lock->rw_mutex);
    if (rw_lock->num_readers) {
        pthread_cond_wait(&rw_lock->rw_cond, &rw_lock->rw_mutex);
    }
}

void
my_rwlock_write_unlock (my_rwlock_t *rw_lock)
{
    if (!rw_lock) {
        return;
    }

    pthread_mutex_unlock(&rw_lock->rw_mutex);
}

void*
thread_routine1 (void *arg)
{
    for (;;) {
        my_rwlock_read_lock(&g_rw_lock);
        printf("Grabbed read-lock -- 1\n");
        sleep(5);
        my_rwlock_read_unlock(&g_rw_lock);
        sleep(5);
    }
}

void*
thread_routine2 (void *arg)
{
    for (;;) {
        my_rwlock_read_lock(&g_rw_lock);
        printf("Grabbed read-lock -- 2\n");
        sleep(8);
        my_rwlock_read_unlock(&g_rw_lock);
        sleep(5);
    }
}

void*
thread_routine3 (void *arg)
{
    for (;;) {
        my_rwlock_write_lock(&g_rw_lock);
        printf("Grabbed write-lock -- 1\n");
        sleep(5);
        my_rwlock_write_unlock(&g_rw_lock);
        sleep(5);
    }
}

void*
thread_routine4 (void *arg)
{
    for (;;) {
        my_rwlock_write_lock(&g_rw_lock);
        printf("Grabbed write-lock -- 2\n");
        sleep(8);
        my_rwlock_write_unlock(&g_rw_lock);
        sleep(5);
    }
}

int
main (int argc, char *argv[])
{
    pthread_t tid[4];

    pthread_create(&tid[0], NULL, thread_routine1, NULL);
    pthread_create(&tid[1], NULL, thread_routine2, NULL);
    pthread_create(&tid[2], NULL, thread_routine3, NULL);
    pthread_create(&tid[3], NULL, thread_routine4, NULL);

    for (;;) {
        sleep(10);
    }
}

Monday, August 18, 2014

The msgbuf structure in SysV message queue

Most of you who are familiar with SysV message queues must be aware of formatting the message into the following weird structure. The weirdest part is none other than char mtext[1]. What is this unusual stuff? It is nothing but an expandable structure.

struct msgbuf {
    long mtype;       /* message type, must be > 0 */
    char mtext[1];    /* message data */
};


For example:

If you wish to send a message of 40 bytes, then one needs to allocate a structure of the following size:

sizeof(long)+40;

Of-course, first element should be of type 'long' which represents message type.

Now, why this weird mtext[1]? Could it not have been mtext[0]? Simple: SysV IPC was introduced long ago, and back then there was no C standard allowing zero-length arrays in a structure. That is the sole reason behind mtext[1]. I scratched my head over this for a very long time, and it has finally been settled [hope even the louses are out]. If there is any other reason, please leave a comment in the box.