Friday, March 30, 2012

Validating link local ipv6 address

I have seen string comparison used in many places to validate a link-local IPv6 address. However, I feel this approach is not ideal and may not fit all use cases.

For example, in C++ you might write:

string s = "fe80::1234";
s.find("fe80::");

or something of this sort. However, what if the IP address is written in capitals? The above code will fail.

OK, let's take the string library approach:

strncasecmp(addr, "fe80:",5) == 0;

This may also work fine. However, I feel we can do this better, and the right way, using the standard library functions provided by glibc.
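To make the weakness of prefix matching concrete, here is a minimal sketch of the strncasecmp check wrapped in a function. The name looks_link_local_by_prefix is hypothetical and only used for illustration. The comments list inputs where a pure text comparison gives a questionable answer.

```c
#include <stddef.h>    /* NULL */
#include <strings.h>   /* strncasecmp */

/* Hypothetical helper: returns 1 when the text starts with "fe80:"
 * (ignoring case), 0 otherwise. It never parses the address. */
int looks_link_local_by_prefix(const char *text)
{
    return text != NULL && strncasecmp(text, "fe80:", 5) == 0;
}

/*
 * "FE80::1"              -> 1  (fine, case is ignored)
 * "febf::1"              -> 0  (but febf::1 is inside fe80::/10, the range
 *                               that IN6_IS_ADDR_LINKLOCAL() tests)
 * "fe80:not-an-address"  -> 1  (accepted although it is not an IPv6 address)
 */
```

So the text comparison both rejects addresses the library macro accepts and accepts strings that are not addresses at all.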

If you need to check a string, use inet_pton() and IN6_IS_ADDR_LINKLOCAL() together to do the validation. If you already have a sockaddr_in6 handy, you only need IN6_IS_ADDR_LINKLOCAL(). Here is the program I wrote. I feel it keeps things modular to some extent. There are two APIs: one takes a string as its argument, while the other takes a sockaddr. In the second case, the sockaddr is cast to sockaddr_in6 and then checked for a link-local IPv6 address.

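#include <stdio.h>
#include <sys/socket.h>   /* struct sockaddr, AF_INET6 */
#include <netinet/in.h>   /* struct sockaddr_in6, IN6_IS_ADDR_LINKLOCAL */
#include <arpa/inet.h>    /* inet_pton */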

typedef _Bool BOOL;

#define B_TRUE  1
#define B_FALSE 0

BOOL is_link_local_ip (const struct sockaddr* sock);
BOOL is_link_local_ip_str (const char* address);

/* Common helper: works on the raw 128-bit address */
static inline BOOL is_link_local_addr (const struct in6_addr* addr)
{
    /* The macro yields nonzero for addresses in fe80::/10 */
    return IN6_IS_ADDR_LINKLOCAL(addr) ? B_TRUE : B_FALSE;
}

BOOL is_link_local_ip (const struct sockaddr* sock)
{
    /* Only an AF_INET6 sockaddr may be viewed as sockaddr_in6 */
    if (sock && sock->sa_family == AF_INET6) {
        const struct sockaddr_in6 *temp =
                (const struct sockaddr_in6*)(sock);
        return is_link_local_addr (&temp->sin6_addr);
    }

    return B_FALSE;
}

BOOL is_link_local_ip_str (const char *address)
{
    struct in6_addr addr;

    /* inet_pton() returns 1 only for a valid IPv6 address string;
       0 means invalid text and -1 means unsupported family */
    if (address && inet_pton(AF_INET6, address, &addr) == 1)
        return is_link_local_addr (&addr);

    return B_FALSE;
}

int main ()
{
    const char* ip_addr = "FE80::224:2CFF:FEA1:8a1";

    if (is_link_local_ip_str(ip_addr))
        printf("Address is ipv6 link_local\n");
    else
        printf("Address is not ipv6 link local\n");
}


So here is the output: Address is ipv6 link_local

You can test the program with different test vectors. Apart from link-local, there are a lot of similar macros that come in handy. Please refer to: http://uw714doc.sco.com/en/man/html.3N/inet.3N.html
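As a rough sketch of how some of those other macros could be combined, the function below labels an address string with its kind. The name ipv6_kind and the label strings are my own assumptions for illustration. The macros themselves come from <netinet/in.h>.

```c
#include <stddef.h>       /* NULL */
#include <sys/socket.h>   /* AF_INET6 */
#include <netinet/in.h>   /* struct in6_addr, IN6_IS_ADDR_* macros */
#include <arpa/inet.h>    /* inet_pton */

/* Hypothetical helper: parse the text as IPv6 and return a short label.
 * The order of the checks matters only for readability here, since the
 * tested ranges do not overlap. */
const char *ipv6_kind(const char *text)
{
    struct in6_addr addr;

    if (text == NULL || inet_pton(AF_INET6, text, &addr) != 1)
        return "invalid";

    if (IN6_IS_ADDR_UNSPECIFIED(&addr)) return "unspecified";  /* ::        */
    if (IN6_IS_ADDR_LOOPBACK(&addr))    return "loopback";     /* ::1       */
    if (IN6_IS_ADDR_LINKLOCAL(&addr))   return "link-local";   /* fe80::/10 */
    if (IN6_IS_ADDR_SITELOCAL(&addr))   return "site-local";   /* fec0::/10 */
    if (IN6_IS_ADDR_MULTICAST(&addr))   return "multicast";    /* ff00::/8  */
    if (IN6_IS_ADDR_V4MAPPED(&addr))    return "v4-mapped";    /* ::ffff:a.b.c.d */

    return "other";
}
```

Some test vectors to try: "FE80::224:2CFF:FEA1:8a1" and "febf::1" (link-local), "::1" (loopback), "ff02::1" (multicast), "2001:db8::1" (other), and "192.168.1.1" (invalid, since it is not IPv6 text).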

As always, leave a comment if you have a better approach.

Wednesday, March 21, 2012

Convert warnings to error with GCC

What happens if you compile and run the code below?

int i = 10;
printf("%s Bad format specifier\n", i);


nandakumar@HERAMBA ~ $ gcc format.c
format.c: In function ‘main’:
format.c:6:2: warning: format ‘%s’ expects argument of type ‘char *’, but argument 2 has type ‘int’ [-Wformat]


Output: Segmentation fault

This is a common programming mistake. Here printf treats the value 10 as an address and tries to print the string stored there. However, 10 is not a valid address in the process's address space, so the program crashes with a segmentation fault. GCC does warn about the invalid format, but programmers ignore warnings too often, and when errors do happen a lot of time is spent debugging them. In the example above the warning gets noticed because the program is tiny. On larger projects, warnings are easily overlooked.

There are two solutions though:

1) Write the code carefully
2) Catch the error at compile time itself

The first seems to be the ideal solution, but mistakes do happen. The second can be achieved using GCC options. GCC provides the compiler switch '-Werror' to convert all warnings into errors. You can also turn only specific warnings into errors. For example, in the above case we can use just '-Werror=format' to make the format warning an error. This greatly helps in catching such bugs early.

What does the output look like now?

nandakumar@HERAMBA ~ $ gcc -Werror format.c
format.c: In function ‘main’:
format.c:6:2: error: format ‘%s’ expects argument of type ‘char *’, but argument 2 has type ‘int’ [-Werror=format]
cc1: all warnings being treated as errors


Now we can see the change! The bad format is reported as an error, so the build stops and the bug cannot slip through, which makes our life easier. That is it for now. Leave a comment if you have further suggestions.
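The same check can be extended to your own printf-style wrappers. The sketch below assumes a hypothetical logging function called log_msg. GCC's format attribute tells the compiler to check its arguments the way it checks printf, so '-Werror=format' should reject a bad call to it as well.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical logging helper. The attribute says: argument 1 is a
 * printf-style format string, and the values to check start at argument 2. */
int log_msg(const char *fmt, ...) __attribute__((format(printf, 1, 2)));

int log_msg(const char *fmt, ...)
{
    char buf[256];
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, sizeof buf, fmt, ap);  /* format into a bounded buffer */
    va_end(ap);

    if (n >= 0)
        fputs(buf, stderr);

    return n;   /* length of the fully formatted message, or negative on error */
}

/*
 * A correct call:   log_msg("%d items\n", 10);
 * A bad call such as
 *     log_msg("%s Bad format specifier\n", 10);
 * gets the same format diagnostic as printf, and becomes a hard error
 * when compiled with -Werror=format.
 */
```

Without the attribute, GCC has no way to know that the first argument of log_msg is a format string, so the mismatch would go unreported.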

Thursday, March 15, 2012

Thread Safety of 'errno' variable

Everyone knows that when a system call fails, 'errno' is set to indicate the error, so 'errno' holds the value from the last system call that failed. It is therefore good practice to save the 'errno' value for later processing if any other system call is going to be made in between.

For example:

socket(...); /* do not worry about args */
printf("Trying to open socket\n");
printf("The errno was = %d\n", errno);


So what is the problem with the above code? Well, printf internally uses the write system call. If printf fails, 'errno' will reflect printf's failure, not the socket error.

Now consider an example. I create two sockets: an AppleTalk socket of type SOCK_STREAM, and a datagram socket with an arbitrary address family, 200. Stream sockets are basically meant for TCP- and SCTP-style protocols, so the AppleTalk call with SOCK_STREAM fails, and the call with the arbitrary family fails too. Watch how 'errno' differs.

socket(AF_APPLETALK, SOCK_STREAM, 0);
socket(200, SOCK_DGRAM, 0);
printf("%d\n", errno);


Output: 97

Now I move the printf so that it comes right after the first call.

socket(AF_APPLETALK, SOCK_STREAM, 0);
printf("%d\n", errno);
socket(200, SOCK_DGRAM, 0);


Output: 94

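You can see the difference in 'errno', which confirms our observation: on Linux, 97 is EAFNOSUPPORT (from the second call) and 94 is ESOCKTNOSUPPORT (from the first). So it is always good practice to save the 'errno' of the particular system call you care about before you proceed further.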
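A minimal sketch of that habit follows. It uses open() on a missing file instead of socket() so that the failure is easy to reproduce anywhere. The function name open_errno is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: try to open 'path' read-only.
 * Returns 0 on success, otherwise the errno value left by open(). */
int open_errno(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        int saved_errno = errno;   /* save first, before any other call */

        /* These calls may themselves change errno, so from here on
         * only saved_errno is trusted. */
        printf("Trying to open %s\n", path);
        fprintf(stderr, "open failed: %s\n", strerror(saved_errno));

        return saved_errno;
    }

    close(fd);
    return 0;
}
```

Calling open_errno() on a path that does not exist should give back ENOENT no matter what the printing in between did to 'errno'.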

That brings up the question of thread safety. If 'errno' were an ordinary global variable residing in the data segment, it would not be thread-safe. Does glibc put mutex locks around it? That would be horrible, and since programs read 'errno' directly, it would not even make sense. So how is 'errno' managed? What if 'errno' in one thread is modified by a system call returning in another thread?

Well, glibc solves this by making 'errno' a per-thread variable. For more information on GCC per-thread variables, see here: http://gcc.gnu.org/onlinedocs/gcc-3.3.1/gcc/Thread-Local.html
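To get a feel for what a per-thread variable is, here is a small sketch using GCC's __thread keyword from the page linked above. The names tls_counter, bump_counter and tls_demo are made up for this illustration, and this is not how glibc itself stores 'errno'.

```c
#include <pthread.h>

/* Each thread gets its own copy of this variable, initialised to 0. */
static __thread int tls_counter = 0;

/* Thread body: changes only the calling thread's copy and reports it. */
static void *bump_counter(void *arg)
{
    int *seen = arg;

    tls_counter += 5;
    *seen = tls_counter;
    return NULL;
}

/* Hypothetical demo: sets the caller's copy to 1, lets a worker thread
 * modify its own copy, then returns the caller's copy again.
 * The worker's value is stored in *worker_value.
 * Returns -1 if the thread could not be created. */
int tls_demo(int *worker_value)
{
    pthread_t tid;

    tls_counter = 1;
    if (pthread_create(&tid, NULL, bump_counter, worker_value) != 0)
        return -1;
    pthread_join(tid, NULL);

    return tls_counter;   /* untouched by the worker */
}
```

The worker starts from its own fresh copy (0) and ends at 5, while the caller's copy stays at 1. Compile with -pthread.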

So a change to 'errno' is visible only within the thread that made it, not across threads. Hence you can safely assume that 'errno' is local to a thread and completely isolated from other threads. Consider the example:


#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <stdio.h>
#include <pthread.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>

#define MAX_THREADS 3

pthread_barrier_t t_barrier;

void* syscall_thread1(void*);
void* syscall_thread2(void*);
void* syscall_thread3(void*);

int main()
{
    pthread_t tid[MAX_THREADS];

    /* It is not required to include main thread. Only for illustration */
    if (pthread_barrier_init(&t_barrier, NULL, MAX_THREADS+1)) {
        printf("Error in initializing pthread barrier\n");
        exit(1);
    }

    if (pthread_create(&tid[0], NULL, syscall_thread1, NULL)) {
        printf("Failed to create thread instance %d\n", 1);
    }
   
    if (pthread_create(&tid[1], NULL, syscall_thread2, NULL)) {
        printf("Failed to create thread instance %d\n", 2);
    }

    if (pthread_create(&tid[2], NULL, syscall_thread3, NULL)) {
        printf("Failed to create thread instance %d\n", 3);
    }

    /*
     * Make sure the threads start simultaneously,
     * at least from the code's point of view;
     * the scheduler may still run them in any order
     */
    pthread_barrier_wait(&t_barrier);

    /* Wait for each thread to complete */
    pthread_join(tid[0], NULL);
    pthread_join(tid[1], NULL);
    pthread_join(tid[2], NULL);

    pthread_barrier_destroy(&t_barrier);

    return 0;
}

void* syscall_thread1(void *instance)
{
    pthread_barrier_wait(&t_barrier);
    /* pthread_t is an unsigned long on Linux; cast to match %lu */
    printf("In thread = %lu\n", (unsigned long)pthread_self());
    if ((socket(AF_APPLETALK, SOCK_STREAM, 0)) == -1)
        perror("socket:");
    return NULL;
}

void* syscall_thread2(void *instance)
{
    pthread_barrier_wait(&t_barrier);
    printf("In thread = %lu\n", (unsigned long)pthread_self());
    if ((socket(200, SOCK_STREAM, 0)) == -1)
        perror("socket:");
    return NULL;
}

void* syscall_thread3(void *instance)
{
    pthread_barrier_wait(&t_barrier);
    printf("In thread = %lu\n", (unsigned long)pthread_self());
    if ((open("some_non_existing_file", O_RDWR)) == -1)
        perror("open:");
    return NULL;
}


Output:

In thread = 3079547760
socket:: Socket type not supported
In thread = 3071155056
socket:: Address family not supported by protocol
In thread = 3062762352
open:: No such file or directory


Works as expected! However, the order of the prints depends on how the Linux scheduler runs the threads. On older setups you may have to compile your program with _REENTRANT defined (for example by passing -pthread to gcc) to get per-thread 'errno' storage.


How about digging glibc code for a while?

/usr/include/errno.h says 'errno' is per-thread variable.

#ifdef  _ERRNO_H

/* Declare the `errno' variable, unless it's defined as a macro by
   bits/errno.h.  This is the case in GNU, where it is a per-thread
   variable.  This redeclaration using the macro still works, but it
   will be a function declaration without a prototype and may trigger
   a -Wstrict-prototypes warning.  */

#ifndef errno
extern int errno;
#endif


Another snippet, from /usr/include/i386-linux-gnu/bits/errno.h:

#  if !defined _LIBC || defined _LIBC_REENTRANT
/* When using threads, errno is a per-thread value.  */
#   define errno (*__errno_location ())
#  endif
# endif /* !__ASSEMBLER__ */
#endif /* _ERRNO_H */


Basically, the __errno_location() function returns the address of the 'errno' variable that the caller should use. As per the FAQ here: http://pauillac.inria.fr/~xleroy/linuxthreads/faq.html,

"Thus, for programs not linked with LinuxThreads, defining _REENTRANT makes no difference w.r.t. errno processing. But LinuxThreads redefines __errno_location() to return a location in the thread descriptor reserved for holding the current value of errno for the calling thread. Thus, each thread operates on a different errno location."

which means that each thread gets its own copy of 'errno'.
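One way to observe this is to compare the address of 'errno' in two threads. The sketch below is only an illustration with hypothetical names (record_errno_address, errno_is_per_thread). It relies on '&errno' being valid, which holds when 'errno' expands to (*__errno_location ()) as in the header above.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Thread body: store the address of this thread's errno as an integer. */
static void *record_errno_address(void *arg)
{
    uintptr_t *out = arg;

    *out = (uintptr_t)&errno;
    return NULL;
}

/* Hypothetical check: returns 1 if a worker thread's errno lives at a
 * different address than the caller's errno, 0 if it is the same address,
 * and -1 if the thread could not be created. */
int errno_is_per_thread(void)
{
    pthread_t tid;
    uintptr_t worker_addr = 0;

    if (pthread_create(&tid, NULL, record_errno_address, &worker_addr) != 0)
        return -1;
    pthread_join(tid, NULL);

    return worker_addr != (uintptr_t)&errno;
}
```

If 'errno' were a single global, both threads would report the same address. With a per-thread 'errno' the addresses differ. Compile with -pthread.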

A quick cscope search for __errno_location() leads to hurd/hurd/threadvar.h, which says it all. (This file belongs to the GNU Hurd port of glibc rather than the Linux one, but it shows the idea.) What exactly it means needs more digging, which I am still doing :-). The value seems to be obtained at some offset from the thread's stack pointer (just an assumption).

/* Return the location of the value for the per-thread variable with index
   INDEX used by the thread whose stack pointer is SP.  */

extern unsigned long int *__hurd_threadvar_location_from_sp
  (enum __hurd_threadvar_index __index, void *__sp);
_HURD_THREADVAR_H_EXTERN_INLINE unsigned long int *
__hurd_threadvar_location_from_sp (enum __hurd_threadvar_index __index,
                                   void *__sp)
{
  unsigned long int __stack = (unsigned long int) __sp;
  return &((__stack >= __hurd_sigthread_stack_base &&
            __stack < __hurd_sigthread_stack_end)
           ? __hurd_sigthread_variables
           : (unsigned long int *) ((__stack & __hurd_threadvar_stack_mask) +
                                    __hurd_threadvar_stack_offset))[__index];
}

/* Return the location of the current thread's value for the
   per-thread variable with index INDEX.  */

extern unsigned long int *
__hurd_threadvar_location (enum __hurd_threadvar_index __index) __THROW
     /* This declaration tells the compiler that the value is constant
        given the same argument.  We assume this won't be called twice from
        the same stack frame by different threads.  */
     __attribute__ ((__const__));

_HURD_THREADVAR_H_EXTERN_INLINE unsigned long int *
__hurd_threadvar_location (enum __hurd_threadvar_index __index)
{
  return __hurd_threadvar_location_from_sp (__index,
                                            __thread_stack_pointer ());
}


The conclusion is that glibc makes 'errno' a per-thread variable, so it is safe to use 'errno' in multithreaded programs. Any comments or suggestions are very welcome.

Thanks to the wonderful book Linux System Programming by Robert Love, through which I came to know that 'errno' is thread-safe.