Tuesday, December 21, 2010

good C programming practices

The other day I was reading a survey on programming languages and I found that as of today, C programming language is still in the top 3 most used programming language by developers across the globe. With so much code being written in C, I thought to share some of the good C programming practices that I have learned over the years.

Some good C programming practices
1. Say no to gets() and strcpy()
This is one of the most widely known fact that functions like gets(), strcpy(), sprintf() etc should not be used any more. They are still there in library to support the code that’s already there using these functions. If you look at the man page of gets(), it says :
Never use gets(). Because it is impossible to tell without knowing the data in advance how many characters gets() will read, and because gets() will continue to store characters past the end of the buffer, it is extremely dangerous to use. It has been used to break computer security. Use fgets() instead.
Even compilers like gcc warn you when you try to compile a code using these functions. A warning like the one shown below is thrown by gcc :
warning: the `gets’ function is dangerous and should not be used.
So we see that the whole development environment is aware of this fact so you should also honour this fact and start using functions like fgets(), strncpy(), snprintf etc. which properly take care that only that much input should be copied that can be handled by the buffer.
2. Have a single point of return
If the ‘return’ statements are used liberally in a function then that’s not a good programming practice at all. Having multiple return points in code makes the management of code complex. My experience says that usually that code that is being handled by multiple developers has this problem. But nonetheless any body can make this mistake. Having only one point of return is the best practice.
For example :
#include<stdio.h>
...
...

int main(void)
{
    ...
    ...
    ...
    if(/*some condition*/)
        return 1;     // Not a good programming practice
    else if(/*some condition*/)
        return 2;     // Not a good programming practice
    else
        return -1;    // Not a good programming practice
}
A better code would be :
#include<stdio.h>
...
...

int main(void)
{
    int ret = 0;
    ...
    ...
    ...
    if(/*some condition*/)
        ret = 1; 
    else if(/*some condition*/)
        ret = 2;   
    else
        ret = -1;

    return ret
}
3. Have basic optimization in place
Sometimes while writing a logic, we tend to ignore how things are being achieved through that logic. What I mean is that we should always look forward to have some basic optimization in place.
For example look at the following piece of code :
...
...
...
int len = 0;
char *buff = "Linux";

while(/*Some condition*/)
{

    ...
    ... // updating 'len' here
    ...
    if(len > strlen(buff))
    {
        //Do some stuff here
    }
    ...
    ...
    ...
}
...
...
...
So we see that in the above shown code, the function strlen() is being used again and again in while loop to calculate the length of the string pointed to by buff. This is not an optimized code as the buffer ‘buff’ is not changed anywhere inside the while loop so the length of this buffer will always be the same. So its better to call the function strlen() once before the while loop and use a variable containing the length of buffer ‘buff’ inside the while loop. Something like this:
...
...
...
int len = 0;
int buff_len = 0;

char *buff = "Linux";
buff_len = strlen(buff);

while(/*Some condition*/)
{

    ...
    ... // updating 'len' here
    ...
    if(len > buff_len)
    {
        //Do some stuff here
    }
    ...
    ...
    ...
}
...
...
...
So this way make sure that your code has basic optimization in place.
4. CODE DOCUMENTATION
------------------------------------------------------------------------------

  - Add FIXME, TODO and XXX comments appropriately;

  - Never commit commented-out code without adding the reason
    why it's still there but disabled. Removing the code is
    better than commenting-out it;

  - Use doxygen for code documentation;

  - Add simple READMEs and pictures as needed, but concentrate
    on doxygen;

  - Never use a language different than English.
5. Use macros instead of hard-coded values
Suppose you have written thousands of lines of code that is spread over hundreds of files. You have used a hard-coded value (for some purpose) say ‘1024’ all over your code and across files. Now suddenly you realize that you have to change this value to ‘2048’. So what will you do? Identify all the places and change the value to 2048 manually or run a find command in your IDE and try to replace all the 1024’s with 2048?
Well, both of the above ways are not good. The better practice is to use a C macro to hold the value in one of the header files and use this macro all over the place so that if there is a requirement to change the value, the change is to be done at one place only. Also, using a relevant macro name will make it clear that what’s the value being used for.
6. Always use corresponding library calls
Suppose you want to open a file through C code on Linux. You know that there are two functions available for this. The fopen() and the open() function. While the fopen() function is a library call, the open() is a system call. Which function would you use?
Well, if you choose to use open() system call for this then I must say that it’s not a good practice to use a system call if a corresponding library call exists.
This is because:
§  The system calls are system dependent. For example, any code written with Linux system call is not guaranteed to work on a non-Linux machine. While any system having C library will have the corresponding library function. So the issue here is portability.
§  Secondly, the system calls are relatively expensive (in terms of time consumption) as compared to library calls.
So make sure that wherever available, always use library calls.


7. CODE INSTRUMENTATION
------------------------------------------------------------------------------

  - use const for non modifiable function parameters;
  - const is usually better than #define;
  - use static for internal functions;
  - use safer functions like snprintf(), strncpy() and alikes;
  - check validity of all received parameters;
  - use assert() where appropriate;
  - do not use alloca() or other non-recommended functions;
  - check for return errors even from malloc() and other
    standard system calls;
  - compile and run your code with ElectricFence enabled while
    developing;
  - never commit debug printf() and alike calls unless they're
    inside debug-enabled macros;
  - use valgrind to profile you application and find memleaks
    regularly.


8. COMMITS AND CHANGELOGS
------------------------------------------------------------------------------

  - in a changelog, what matters is not "what" but "why";
  - several commits are usually better than a big one;
  - do not commit unrelated modifications at once unless they're
    really trivial;
  - commit early, commit often;
  - always run "svn diff" and "svn status" before a commit and
    document all changes in the changelogs;
  - your changelog must be useful in a "svn log" operation, not
    just to people receiving the diff by e-mail (think about
    its usage years later, by someone not involved in the project)
  - changelog messages should respect the code text width limit
    (around 80 columns);
  - as expected, code in the repository should always compile,
    without warnings.


9. UNIT TESTS
------------------------------------------------------------------------------

  - All code should be written along with unit-tests. The tool
    used to implement the tests is "check", available on
    http://check.sourceforge.net/.

    The build-system infra-structure provides a configure option to
    allow check usage.

    The tests must be implemented inside a sub-directory called utests
    with the following modules:

    README                 --> duu :-)
    check_<component name> --> the test-manager
    check_<unit name>      --> implements tests for interfaces
                               exported by unit
    check_<...>

    Just to remember, a unit, or module, is a collection of
    source-code files (.c) with an interface exported through
    a header file (.h).

    All interfaces exported by units must be tested (that is,
    all non-static functions). The tests should implement at
    least the following test cases:

      - standard usage
      - upper and bottom limits of pre-allocated buffers
      - upper and bottom limits of variables (like sizes and
        lengths)
      - error conditions
      - invalid input parameters
      - test each possible use-case scenario

    Use incremental tests, that is, test functions before using them in
    other test-cases (for example, if you need to call function A to
    prepare the environment to test function B, then test function A
    first).

    Whenever possible, static functions should also be tested. In this
    case, you can include the .c file in the utest file, something like:

      #include "mymodule.c"

    If the test needs an external entity to work (for example,
    it needs to communicate with a hardware device), or if the test
    requires specific privileges, put the test inside a condition for
    an environment variable and document it in a README file.

No comments:

Post a Comment