Tuesday, December 21, 2010

good C programming practices

The other day I was reading a survey on programming languages and I found that as of today, C programming language is still in the top 3 most used programming language by developers across the globe. With so much code being written in C, I thought to share some of the good C programming practices that I have learned over the years.

Some good C programming practices
1. Say no to gets() and strcpy()
This is one of the most widely known fact that functions like gets(), strcpy(), sprintf() etc should not be used any more. They are still there in library to support the code that’s already there using these functions. If you look at the man page of gets(), it says :
Never use gets(). Because it is impossible to tell without knowing the data in advance how many characters gets() will read, and because gets() will continue to store characters past the end of the buffer, it is extremely dangerous to use. It has been used to break computer security. Use fgets() instead.
Even compilers like gcc warn you when you try to compile a code using these functions. A warning like the one shown below is thrown by gcc :
warning: the `gets’ function is dangerous and should not be used.
So we see that the whole development environment is aware of this fact so you should also honour this fact and start using functions like fgets(), strncpy(), snprintf etc. which properly take care that only that much input should be copied that can be handled by the buffer.
2. Have a single point of return
If the ‘return’ statements are used liberally in a function then that’s not a good programming practice at all. Having multiple return points in code makes the management of code complex. My experience says that usually that code that is being handled by multiple developers has this problem. But nonetheless any body can make this mistake. Having only one point of return is the best practice.
For example :
#include<stdio.h>
...
...

int main(void)
{
    ...
    ...
    ...
    if(/*some condition*/)
        return 1;     // Not a good programming practice
    else if(/*some condition*/)
        return 2;     // Not a good programming practice
    else
        return -1;    // Not a good programming practice
}
A better code would be :
#include<stdio.h>
...
...

int main(void)
{
    int ret = 0;
    ...
    ...
    ...
    if(/*some condition*/)
        ret = 1; 
    else if(/*some condition*/)
        ret = 2;   
    else
        ret = -1;

    return ret
}
3. Have basic optimization in place
Sometimes while writing a logic, we tend to ignore how things are being achieved through that logic. What I mean is that we should always look forward to have some basic optimization in place.
For example look at the following piece of code :
...
...
...
int len = 0;
char *buff = "Linux";

while(/*Some condition*/)
{

    ...
    ... // updating 'len' here
    ...
    if(len > strlen(buff))
    {
        //Do some stuff here
    }
    ...
    ...
    ...
}
...
...
...
So we see that in the above shown code, the function strlen() is being used again and again in while loop to calculate the length of the string pointed to by buff. This is not an optimized code as the buffer ‘buff’ is not changed anywhere inside the while loop so the length of this buffer will always be the same. So its better to call the function strlen() once before the while loop and use a variable containing the length of buffer ‘buff’ inside the while loop. Something like this:
...
...
...
int len = 0;
int buff_len = 0;

char *buff = "Linux";
buff_len = strlen(buff);

while(/*Some condition*/)
{

    ...
    ... // updating 'len' here
    ...
    if(len > buff_len)
    {
        //Do some stuff here
    }
    ...
    ...
    ...
}
...
...
...
So this way make sure that your code has basic optimization in place.
4. CODE DOCUMENTATION
------------------------------------------------------------------------------

  - Add FIXME, TODO and XXX comments appropriately;

  - Never commit commented-out code without adding the reason
    why it's still there but disabled. Removing the code is
    better than commenting-out it;

  - Use doxygen for code documentation;

  - Add simple READMEs and pictures as needed, but concentrate
    on doxygen;

  - Never use a language different than English.
5. Use macros instead of hard-coded values
Suppose you have written thousands of lines of code that is spread over hundreds of files. You have used a hard-coded value (for some purpose) say ‘1024’ all over your code and across files. Now suddenly you realize that you have to change this value to ‘2048’. So what will you do? Identify all the places and change the value to 2048 manually or run a find command in your IDE and try to replace all the 1024’s with 2048?
Well, both of the above ways are not good. The better practice is to use a C macro to hold the value in one of the header files and use this macro all over the place so that if there is a requirement to change the value, the change is to be done at one place only. Also, using a relevant macro name will make it clear that what’s the value being used for.
6. Always use corresponding library calls
Suppose you want to open a file through C code on Linux. You know that there are two functions available for this. The fopen() and the open() function. While the fopen() function is a library call, the open() is a system call. Which function would you use?
Well, if you choose to use open() system call for this then I must say that it’s not a good practice to use a system call if a corresponding library call exists.
This is because:
§  The system calls are system dependent. For example, any code written with Linux system call is not guaranteed to work on a non-Linux machine. While any system having C library will have the corresponding library function. So the issue here is portability.
§  Secondly, the system calls are relatively expensive (in terms of time consumption) as compared to library calls.
So make sure that wherever available, always use library calls.


7. CODE INSTRUMENTATION
------------------------------------------------------------------------------

  - use const for non modifiable function parameters;
  - const is usually better than #define;
  - use static for internal functions;
  - use safer functions like snprintf(), strncpy() and alikes;
  - check validity of all received parameters;
  - use assert() where appropriate;
  - do not use alloca() or other non-recommended functions;
  - check for return errors even from malloc() and other
    standard system calls;
  - compile and run your code with ElectricFence enabled while
    developing;
  - never commit debug printf() and alike calls unless they're
    inside debug-enabled macros;
  - use valgrind to profile you application and find memleaks
    regularly.


8. COMMITS AND CHANGELOGS
------------------------------------------------------------------------------

  - in a changelog, what matters is not "what" but "why";
  - several commits are usually better than a big one;
  - do not commit unrelated modifications at once unless they're
    really trivial;
  - commit early, commit often;
  - always run "svn diff" and "svn status" before a commit and
    document all changes in the changelogs;
  - your changelog must be useful in a "svn log" operation, not
    just to people receiving the diff by e-mail (think about
    its usage years later, by someone not involved in the project)
  - changelog messages should respect the code text width limit
    (around 80 columns);
  - as expected, code in the repository should always compile,
    without warnings.


9. UNIT TESTS
------------------------------------------------------------------------------

  - All code should be written along with unit-tests. The tool
    used to implement the tests is "check", available on
    http://check.sourceforge.net/.

    The build-system infra-structure provides a configure option to
    allow check usage.

    The tests must be implemented inside a sub-directory called utests
    with the following modules:

    README                 --> duu :-)
    check_<component name> --> the test-manager
    check_<unit name>      --> implements tests for interfaces
                               exported by unit
    check_<...>

    Just to remember, a unit, or module, is a collection of
    source-code files (.c) with an interface exported through
    a header file (.h).

    All interfaces exported by units must be tested (that is,
    all non-static functions). The tests should implement at
    least the following test cases:

      - standard usage
      - upper and bottom limits of pre-allocated buffers
      - upper and bottom limits of variables (like sizes and
        lengths)
      - error conditions
      - invalid input parameters
      - test each possible use-case scenario

    Use incremental tests, that is, test functions before using them in
    other test-cases (for example, if you need to call function A to
    prepare the environment to test function B, then test function A
    first).

    Whenever possible, static functions should also be tested. In this
    case, you can include the .c file in the utest file, something like:

      #include "mymodule.c"

    If the test needs an external entity to work (for example,
    it needs to communicate with a hardware device), or if the test
    requires specific privileges, put the test inside a condition for
    an environment variable and document it in a README file.

Wednesday, December 15, 2010

Reduce C-language coding errors with X macros

X macros are a powerful coding technique that makes extensive use of the C-language pre-processor. This technique has the capability to eliminate several classes of common bugs.

It seems to me that the C preprocessor gets a bad rap. Granted, there are ways to use the preprocessor inappropriately, but to limit its use because of that constrains a valuable tool that can reduce coding errors and improve developer productivity though automatic code generation.

Code Ordering Dependencies
I discovered X macros a few years ago when I started making use of function pointers in my code. Frequently I would write code like this:

/* declare an enumeration of state codes */ 
enum{STATE_0, STATE_1, STATE_2, ... , STATE_N, NUM_STATES};

/* declare a table of function pointers */ 
p_func_t jumptable[NUM_STATES] = {func_0, func_1, func_2, ... , func_N};


The issue with this type of code is maintainability. The ordering of the array initializers has to match the ordering of the state code enumeration exactly. Historically I would comment this type of code liberally to warn future users about this dependency, but protection based on commenting is really no protection at all. What I needed was a tool that would automatically enforce the dependency.

I began investigating solutions for this problem and discovered that in the C99 standard there was a new way to initialize arrays. An improved way to write the above code is as follows:

/* declare an enumeration of state codes */ 
enum{STATE_0, STATE_1, STATE_2, ... , STATE_N, NUM_STATES};

/* declare a table of function pointers */ 
p_func_t jumptable[NUM_STATES] = {
         [STATE_1] = func_1,
         [STATE_0] = func_0,
         [STATE_2] = func_2,
          ... ,
         [STATE_N] = func_N
};


Now even if I change the ordering of the enumeration, the jumptable logic doesn’t break. Much better. My only problem was that the C compiler I was working with was not compliant with the C99 standard. Back to square one.

X macros to the Rescue
One day while talking shop with a friend of mine, I explained my problem and he suggested using the C preprocessor to enforce the ordering. He explained the basic concept: Use preprocessor directives to define a table in the form of a macro and then redefine how the macro is expanded, as required.
Here's how this technique enforces my code ordering dependency:

#define STATE_TABLE            \
        ENTRY(STATE_0, func_0) \
        ENTRY(STATE_1, func_1) \
        ENTRY(STATE_2, func_2) \
        ...                    \
        ENTRY(STATE_X, func_X)

/* declare an enumeration of state codes */ 
enum{
#define ENTRY(a,b) a,
    STATE_TABLE
#undef ENTRY
    NUM_STATES
};

/* declare a table of function pointers */ 
p_func_t jumptable[NUM_STATES] = {
#define ENTRY(a,b) b,
    STATE_TABLE
#undef ENTRY
};

In the case of the enumeration the table expands to ‘a’ which is the first column of the state table; the state code. In the case of the array, the table expands to ‘b’ which is the second column, the name of the function pointer. 

The code based on the X macro table is expanded in the same order for both the enumeration and the array. The preprocessor now enforces the dependency!

Cleaning up the code
One thing I don’t like about this implementation is the presence of #define and#undef throughout the code, which to me is ugly and makes the code less readable. Let’s look at a technique for getting rid of them.

You will notice that in my definition of the STATE_TABLE macro I don’t take any parameters. There is nothing to prevent me from passing the definition of ENTRYdirectly to the STATE_TABLE macro instead of defining it separately:

#define EXPAND_AS_ENUMERATION(a,b) a,
#define EXPAND_AS_JUMPTABLE(a,b) b, 
#define STATE_TABLE(ENTRY)     \
        ENTRY(STATE_0, func_0) \
        ENTRY(STATE_1, func_1) \
        ENTRY(STATE_2, func_2) \
        ...                    \
        ENTRY(STATE_X, func_X)

/* declare an enumeration of state codes */ 
enum{
    STATE_TABLE(EXPAND_AS_ENUMERATION);
    NUM_STATES
};

/* declare a table of function pointers */ 
p_func_t jumptable[NUM_STATES] = {
    STATE_TABLE(EXPAND_AS_JUMPTABLE);
};


Much better, but is there anything else that we could use the X macro table for? Since every function pointer corresponds to an actual function, we could use the table to generate function prototypes for us:

#define EXPAND_AS_PROTOTYPES(a,b) static void b(void);
STATE_TABLE(EXPAND_AS_PROTOTYPES);


Now I no longer need to remember to add a prototype when I add new states. The preprocessor can take care of it and will expand the table into the following code automatically:

static void func_0(void);
static void func_1(void);
static void func_2(void);
...
static void func_X(void);

Register Initialization
That's not the only way X macros can be used. In my code I commonly have to interface to custom FPGAs. These devices usually have many memory mapped registers that need initialization. It's easy to forget to initialize a newly defined register, but using X macros, this is another task we can automate.

#define EXPAND_AS_INITIALIZER(a,b) a = b;
#define REGISTER_TABLE(ENTRY) \
    ENTRY(reg_0, 0x11)        \
    ENTRY(reg_1, 0x55)        \
    ENTRY(reg_2, 0x1b)        \
    ...                       \
    ENTRY(reg_X, 0x33)
static void init_registers(void){
         REGISTER_TABLE(EXPAND_AS_INITIALIZER);
}


Simple; and as new registers are added, no code needs to be updated to initialize it - we just add a row to the table and the preprocessor does the rest. We can further improve this code to take into account not only the initialization, but the declaration of the registers:

#define FPGA_ADDRESS_OFFSET (0x8000)
#define EXPAND_AS_INITIALIZER(a,b,c) a = c;
#define EXPAND_AS_DECLARATION(a,b,c) volatile uint8_t a _at_ b;
#define REGISTER_TABLE(ENTRY)                   \
    ENTRY(reg_0, FPGA_ADDRESS_OFFSET + 0, 0x11) \
    ENTRY(reg_1, FPGA_ADDRESS_OFFSET + 1, 0x55) \
    ENTRY(reg_2, FPGA_ADDRESS_OFFSET + 2, 0x1b) \
    ...                                         \
    ENTRY(reg_X, FPGA_ADDRESS_OFFSET + X, 0x33)

/* declare the registers */
REGISTER_TABLE(EXPAND_AS_DECLARATION);


This code uses a compiler specific directive _at_ to place the variables at absolute addresses. This may not be possible with other compilers. Secondly, more than one table may be required to take into account different types of register declarations. You may need to have a read-only register table, a write-only register table, an uninitialized register table, etc.

I hope that this introduction to X macros has provided a glimpse into the power of this coding technique. In a future article I plan to dig a little deeper and show some more advanced uses of X macros to facilitate automatic code generation.


X macros are a powerful coding technique that makes extensive use of the C-language pre-processor (http://en.wikipedia.org/wiki/C_preprocessor). This technique has the capability to eliminate several classes of common bugs.

It seems to me that the C preprocessor gets a bad rap. Granted, there are ways to use the preprocessor inappropriately, but to limit its use because of that constrains a valuable tool that can reduce coding errors and improve developer productivity though automatic code generation.

Code Ordering Dependencies
I discovered X macros a few years ago when I started making use of function pointers in my code. Frequently I would write code like this:

/* declare an enumeration of state codes */ 

enum{STATE_0, STATE_1, STATE_2, ... , STATE_N, NUM_STATES};



/* declare a table of function pointers */ 

p_func_t jumptable[NUM_STATES] = {func_0, func_1, func_2, ... , func_N};



The issue with this type of code is maintainability. The ordering of the array initializers has to match the ordering of the state code enumeration exactly. Historically I would comment this type of code liberally to warn future users about this dependency, but protection based on commenting is really no protection at all. What I needed was a tool that would automatically enforce the dependency.

I began investigating solutions for this problem and discovered that in the C99 standard there was a new way to initialize arrays. An improved way to write the above code is as follows:

/* declare an enumeration of state codes */ 
enum{STATE_0, STATE_1, STATE_2, ... , STATE_N, NUM_STATES};

/* declare a table of function pointers */ 
p_func_t jumptable[NUM_STATES] = {
         [STATE_1] = func_1,
         [STATE_0] = func_0,
         [STATE_2] = func_2,
          ... ,
         [STATE_N] = func_N
};

Now even if I change the ordering of the enumeration, the jumptable logic doesn’t break. Much better. My only problem was that the C compiler I was working with was not compliant with the C99 standard. Back to square one.

X macros to the Rescue
One day while talking shop with a friend of mine, I explained my problem and he suggested using the C preprocessor to enforce the ordering. He explained the basic concept: Use preprocessor directives to define a table in the form of a macro and then redefine how the macro is expanded, as required.
Here's how this technique enforces my code ordering dependency:

#define STATE_TABLE            \
        ENTRY(STATE_0, func_0) \
        ENTRY(STATE_1, func_1) \
        ENTRY(STATE_2, func_2) \
        ...                    \
        ENTRY(STATE_X, func_X)

/* declare an enumeration of state codes */ 
enum{
#define ENTRY(a,b) a,
    STATE_TABLE
#undef ENTRY
    NUM_STATES
};

/* declare a table of function pointers */ 
p_func_t jumptable[NUM_STATES] = {
#define ENTRY(a,b) b,
    STATE_TABLE
#undef ENTRY
};

In the case of the enumeration the table expands to ‘a’ which is the first column of the state table; the state code. In the case of the array, the table expands to ‘b’ which is the second column, the name of the function pointer. 

The code based on the x-macro table is expanded in the same order for both the enumeration and the array. The preprocessor now enforces the dependency!

Cleaning up the code
One thing I don’t like about this implementation is the presence of #define and #undef throughout the code, which to me is ugly and makes the code less readable. Let’s look at a technique for getting rid of them.

You will notice that in my definition of the STATE_TABLE macro I don’t take any parameters. There is nothing to prevent me from passing the definition of ENTRY directly to the STATE_TABLE macro instead of defining it separately:

#define EXPAND_AS_ENUMERATION(a,b) a,
#define EXPAND_AS_JUMPTABLE(a,b) b, 
#define STATE_TABLE(ENTRY)     \
        ENTRY(STATE_0, func_0) \
        ENTRY(STATE_1, func_1) \
        ENTRY(STATE_2, func_2) \
        ...                    \
        ENTRY(STATE_X, func_X)

/* declare an enumeration of state codes */ 
enum{
    STATE_TABLE(EXPAND_AS_ENUMERATION);
    NUM_STATES
};

/* declare a table of function pointers */ 
p_func_t jumptable[NUM_STATES] = {

    STATE_TABLE(EXPAND_AS_JUMPTABLE);

};



Much better, but is there anything else that we could use the x-macro table for? Since every function pointer corresponds to an actual function, we could use the table to generate function prototypes for us:

#define EXPAND_AS_PROTOTYPES(a,b) static void b(void);



STATE_TABLE(EXPAND_AS_PROTOTYPES);



Now I no longer need to remember to add a prototype when I add new states. The preprocessor can take care of it and will expand the table into the following code automatically:

static void func_0(void);

static void func_1(void);

static void func_2(void);

...

static void func_X(void);



Register Initialization
That's not the only way X macros can be used. In my code I commonly have to interface to custom FPGAs. These devices usually have many memory mapped registers that need initialization. It's easy to forget to initialize a newly defined register, but using X macros, this is another task we can automate.

#define EXPAND_AS_INITIALIZER(a,b) a = b;
#define REGISTER_TABLE(ENTRY) \

    ENTRY(reg_0, 0x11)        \

    ENTRY(reg_1, 0x55)        \

    ENTRY(reg_2, 0x1b)        \

    ...                       \

    ENTRY(reg_X, 0x33)



static void init_registers(void){

         REGISTER_TABLE(EXPAND_AS_INITIALIZER);

}



Simple; and as new registers are added, no code needs to be updated to initialize it - we just add a row to the table and the preprocessor does the rest. We can further improve this code to take into account not only the initialization, but the declaration of the registers:

#define FPGA_ADDRESS_OFFSET (0x8000)



#define EXPAND_AS_INITIALIZER(a,b,c) a = c;

#define EXPAND_AS_DECLARATION(a,b,c) volatile uint8_t a _at_ b;



#define REGISTER_TABLE(ENTRY)                   \

    ENTRY(reg_0, FPGA_ADDRESS_OFFSET + 0, 0x11) \

    ENTRY(reg_1, FPGA_ADDRESS_OFFSET + 1, 0x55) \

    ENTRY(reg_2, FPGA_ADDRESS_OFFSET + 2, 0x1b) \

    ...                                         \

    ENTRY(reg_X, FPGA_ADDRESS_OFFSET + X, 0x33)



/* declare the registers */

REGISTER_TABLE(EXPAND_AS_DECLARATION);



This code uses a compiler specific directive _at_ to place the variables at absolute addresses. This may not be possible with other compilers. Secondly, more than one table may be required to take into account different types of register declarations. You may need to have a read-only register table, a write-only register table, an uninitialized register table, etc.
I hope that this introduction to X macros has provided a glimpse into the power of this coding technique. In a future article I plan to dig a little deeper and show some more advanced uses of X macros to facilitate automatic code generation. 

Andrew Lucas leads a team of firmware developers at NCR Canada. He is responsible for the firmware architecture of their intelligent deposit modules found inside NCR's line of ATMs.





PS: Contents are simply copied from  
http://www.embedded.com