Notes for C and C++ Programming

Tips, hints, and tricks for developers in programming C/C++.

Last update: 2022-06-29

Size of a datatype#

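The size of a type is determined by the implementation (the compiler and its target ABI), not by the hardware alone. sizeof(char) is always 1 by definition, but sizeof(int) is not fixed: it is 4 on most 32-bit and 64-bit platforms and 2 on some 8-bit and 16-bit ones. Starting with C99, bool is available through stdbool.h (as a macro for _Bool); its size is implementation-defined, and it is 1 byte on most compilers.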

Note that the size of a pointer is typically 4 bytes when compiling for a 32-bit target and 8 bytes when compiling for a 64-bit target.

Use int instead of char or uint8_t?

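It depends on the target machine. On cores that work natively on 32-bit words and cannot access unaligned memory (e.g. Cortex-M0 processors), int is often faster, because narrower types may need extra instructions to extend or mask their values.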

Use int32_t or int?

The primitive datatypes char, short, int, and long may have different sizes on different platforms (for example, int is 2 bytes on old machines and on 8-bit Arduino boards).

The stdint.h (C) and cstdint (C++) headers define int8_t, int16_t, int32_t, and so on, which have the same size on every platform that provides them.

Therefore, it is recommended to use the stdint.h types when:

  • the code runs on an embedded system and has to know the exact size of its data
  • the code is used on multiple platforms
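
As a rough illustration of the difference between primitive and fixed-width types, the sketch below checks the fixed-width sizes and prints the platform-dependent ones. The helper names fixed_width_sizes_ok and print_primitive_sizes are made up for this note, and the check assumes 8-bit bytes.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 when the fixed-width types have the
 * sizes their names promise (assuming 8-bit bytes, i.e. CHAR_BIT == 8). */
int fixed_width_sizes_ok(void) {
    return sizeof(int8_t) == 1 && sizeof(int16_t) == 2 &&
           sizeof(int32_t) == 4 && sizeof(int64_t) == 8;
}

/* Prints sizes that depend on the compiler and the target. */
void print_primitive_sizes(void) {
    printf("short=%zu int=%zu long=%zu pointer=%zu\n",
           sizeof(short), sizeof(int), sizeof(long), sizeof(void *));
}
```

Calling print_primitive_sizes() on a desktop compiler and on an 8-bit toolchain should show different numbers, while fixed_width_sizes_ok() is expected to hold on both.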

⚠ Do NOT use sizeof on array argument#

The function fun() below receives a parameter arr[] as an array, then it tries to find out the number of elements in that array, using the sizeof operator.

In main, there is also a statement calculating the number of elements in the array arr[].

But the two methods return different results.

int fun(int arr[]) {
    return sizeof(arr)/sizeof(arr[0]); // WRONG
}

int main(void) {
    int arr[4] = {0, 0, 0, 0};
    int arr_size = sizeof(arr)/sizeof(arr[0]); // RIGHT
    if (arr_size == fun(arr)) {} // ???
    return 0;
}

In C, array parameters are treated as pointers. So inside fun(), the expression sizeof(arr)/sizeof(arr[0]) becomes:

sizeof(int *)/sizeof(int)

Note that sizeof(int *) is the size of a pointer, which is typically 4 or 8 bytes depending on the target (32-bit or 64-bit). With a 4-byte int, fun() therefore returns 1 or 2 instead of 4.
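
One common way around this is to compute the length where the real array is still in scope and pass it along with the pointer. The following is a minimal sketch; ARRAY_LEN and sum_elements are illustrative names, not part of the original example.

```c
#include <stddef.h>

/* Valid only where the real array (not a pointer to it) is in scope. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* The length is passed in because it cannot be recovered from the pointer. */
int sum_elements(const int *arr, size_t len) {
    int sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += arr[i];
    }
    return sum;
}
```

A caller that owns int arr[4] would write sum_elements(arr, ARRAY_LEN(arr)), so the element count is evaluated before the array decays to a pointer.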

⚠ Compare with a float number#

The code below runs in an infinite loop. Why?

#include <iostream>

int main() {
    int i = 0;
    for (float x = 100000001.0f; x <= 100000010.0f; x += 1.0f) {
        std::cout << i << std::endl;
    }

    return 0;
}

Most C/C++ compilers use IEEE 754 to represent float and double numbers.

A float is single precision: it has a 24-bit significand, which gives only about 7 significant decimal digits.

The closest float to 100000001.0f is 100000000.0
The closest float to 100000010.0f is 100000008.0

Around 10^8, adjacent float values are 8 apart, so x += 1.0f rounds back to 100000000.0 every time. x never changes, and the condition x <= 100000008.0 stays true forever.

To fix the loop, use double x.
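
The lost increment can be checked directly. The sketch below uses two hypothetical helpers: float_step_is_lost tests whether adding 1.0f changes a value at all, and count_iterations shows an integer counter as an alternative to double x. It assumes an IEEE 754 single-precision float.

```c
/* Returns 1 if adding 1.0f to x leaves it unchanged, which means a loop
 * using "x += 1.0f" would never advance from that value. */
int float_step_is_lost(float x) {
    volatile float next = x + 1.0f;  /* volatile forces rounding to float */
    return next == x;
}

/* An integer counter steps exactly over the same range. */
int count_iterations(void) {
    int count = 0;
    for (int i = 100000001; i <= 100000010; i++) {
        count++;
    }
    return count;
}
```

Under that assumption, float_step_is_lost(100000000.0f) reports a lost step while float_step_is_lost(1000.0f) does not, and the integer loop runs its 10 iterations.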

⚠ Comparison numbers#

  • When comparing, the compiler implicitly converts the operands to a common type:

    • different sizes: from the smaller type to the bigger type
    • same size: from signed to unsigned

#include <stdio.h>
#include <stdint.h>

int main() {
    uint32_t x = 100;
    int32_t  y =  -1;

    // y is converted to uint32_t:
    // y becomes UINT32_MAX
    if (x > y) {
        printf("OK. Good.");
    } else {
        printf("WTF ???");
    }

    return 0;
}

#include <stdio.h>
#include <stdint.h>

int main() {
    uint32_t x = 100;
    int64_t  y =  -1;

    // x is converted to int64_t:
    // x stays 100
    if (x > y) {
        printf("OK. Good.");
    } else {
        printf("WTF ???");
    }

    return 0;
}

  • When an unsigned variable is compared with a negative literal, the compiler implicitly converts the literal to the unsigned type of the variable.

    #include <stdio.h>
    int main() {
        unsigned int x = 100;
        // -1 is converted to unsigned int
        // and becomes UINT_MAX
        if (x > -1) {
            printf("OK. Good.");
        } else {
            printf("WTF ???");
        }
        return 0;
    }
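
One way to avoid the surprise is to handle the negative case before any conversion takes place. This is a minimal sketch with hypothetical helper names; naive_greater spells out the conversion the compiler would otherwise apply silently.

```c
#include <stdint.h>

/* Returns 1 if x > y in the mathematical sense, 0 otherwise. */
int u32_greater_than_i32(uint32_t x, int32_t y) {
    if (y < 0) {
        return 1;              /* every unsigned value exceeds a negative one */
    }
    return x > (uint32_t)y;    /* y is non-negative, so the cast keeps its value */
}

/* The implicit conversion written out explicitly, for contrast. */
int naive_greater(uint32_t x, int32_t y) {
    return x > (uint32_t)y;    /* -1 becomes UINT32_MAX here */
}
```

With x = 100 and y = -1, the first function answers "greater" and the second does not, which matches the behavior described above.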

⚠ Accessing a[i] is actually pointer accessing#

The expression a[i] is translated to *(a+i).
That’s why this weird code printf("%c", 3["abcdef"]); still runs.
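
The equivalence can be written down as a small check. The function names below are illustrative only.

```c
/* All three expressions read the same element: s[i], *(s + i) and i[s]. */
int same_element(const char *s, int i) {
    return s[i] == *(s + i) && s[i] == i[s];
}

/* 3["abcdef"] is *(3 + "abcdef"), i.e. the same as "abcdef"[3]. */
char fourth_char(void) {
    return 3["abcdef"];
}
```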

C/C++ does not check the index range, therefore it is easy to access a memory block which is out of range of the array:

int a[] = {1, 2};
int b[] = {3, 4, 5};
// undefined behavior: if a and b happen to be allocated next to each other, b[-1] may read a[1]
printf("%d", b[-1]);

Buffer overflow exploits are based on exactly this kind of access to memory outside the bounds of an array.

Short-circuit evaluation#

At runtime, in an expression with the AND operator such as if (a && b && c), (b) is evaluated only if (a) is true, and (c) is evaluated only if both (a) and (b) are true.

The same goes for the OR operator: in the expression if (a || b || c), if the sub-expression (a) is true, the others are not evaluated.

This helps when the next condition depends on the previous one, such as when accessing a pointer:

if (pointer != NULL && pointer->member == 3) {}

or, checking for a higher priority condition first:

if (flag_a || flag_b) {}

However, as a consequence, do not rely on the 2nd or later conditions being evaluated in all cases, especially when they have side effects.
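
The effect on side effects can be made visible with a call counter. This is a small sketch; check, calls_for_and, and calls_for_or are hypothetical names used only here.

```c
static int calls = 0;   /* counts how often check() actually runs */

static int check(int value) {
    calls++;
    return value;
}

/* Returns the number of check() calls made while evaluating a && b. */
int calls_for_and(int a, int b) {
    calls = 0;
    int result = check(a) && check(b);
    (void)result;       /* only the call count matters here */
    return calls;
}

/* Returns the number of check() calls made while evaluating a || b. */
int calls_for_or(int a, int b) {
    calls = 0;
    int result = check(a) || check(b);
    (void)result;
    return calls;
}
```

calls_for_and(0, 1) and calls_for_or(1, 0) each make a single call, because the second operand is skipped.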

Increase performance in if conditions

To improve performance, place the short and simple condition first to make use of short-circuit evaluation.

For example:

if ( a==1 &&
     objB.getVal() == 2
) {}

can run faster than:

if ( objB.getVal() == 2 &&
     a == 1
) {}

Use lookup table instead of if-else or switch#

In a long if-else if chain or switch statement, the conditions are checked against the values one by one in written order (unless the compiler turns a switch into a jump table). A lookup table can reduce the number of checks and, in some cases, is easier to maintain.

Here print("Delta") is executed after 4 comparisons, and it takes more when there are more values:

if (a == 0) {
    print("Alpha");
} else if (a == 1) {
    print("Beta");
} else if (a == 2) {
    print("Gamma");
} else if (a == 3) {
    print("Delta");
} else {
    // nothing
}

Here print() is executed after only 2 comparisons, no matter how many values there are:

static const char* messages[] = { "Alpha", "Beta", "Gamma", "Delta" };

if (a >= 0 && a < 4) {
    print(messages[a]);
}

Know the variable scope#

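  • Do not repeatedly create and destroy an object in a loop

for (int i=0; i<10; i++) {
    ClassA a;   // constructed and destroyed in every iteration
    // ... use a ...
}

ClassA a;       // constructed once and reused
for (int i=0; i<10; i++) {
    // ... use a ...
}

  • Only create an object if needed

for (int i=0; i<10; i++) {
    ClassA a;   // created in every iteration, used only once
    if (i == 5) {
        // ... use a ...
    }
}

for (int i=0; i<10; i++) {
    if (i == 5) {
        ClassA a;   // created only when needed
        // ... use a ...
    }
}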

Initialize value of elements in array and struct#

The usual ways to initialize the elements of an array are:

int arr[5] = {1, 1, 1, 1, 1}; // results [1, 1, 1, 1, 1]
int arr[ ] = {1, 1, 1, 1, 1}; // results [1, 1, 1, 1, 1]
int arr[5] = { };             // results [0, 0, 0, 0, 0] (C++, C23, or GNU C; use {0} in older C)
int arr[5] = {1};             // results [1, 0, 0, 0, 0]

In addition, C99 supports designated initializers, which initialize array elements by index; GCC also accepts them as an extension in C90 mode.

To specify an array index, write [index] = before the element value. For example:

int a[6] = { [4] = 29, [2] = 15 }; // result [0, 0, 15, 0, 29, 0]

The index values must be constant expressions, even if the array being initialized is automatic.

Each initializer element that does not have a designator applies to the next consecutive element of the array or structure. For example:

int a[6] = { [1] = v1, v2, [4] = v4 }; // result [0, v1, v2, 0, v4, 0]

To initialize a range of elements to the same value, write [first ... last] = value. This is a GNU extension. For example:

int widths[] = { [0 ... 9] = 1, [10 ... 99] = 2, [100] = 3 };

Designators are also available for struct members:

struct point { int x, y; };
struct point p = { .y = yvalue, .x = xvalue }; // result p = { xvalue, yvalue };

Labeling the elements of an array initializer is especially useful when the indices are characters or belong to an enum type. For example:

int whitespace[256] = {
    [' '] = 1, ['\t'] = 1, ['\v'] = 1,
    ['\f'] = 1, ['\n'] = 1, ['\r'] = 1
};

You can also write a series of .fieldname and [index] designators before an = to specify a nested sub-object to initialize; the list is taken relative to the sub-object corresponding to the closest surrounding brace pair. For example, with the struct point declaration above:

struct point ptarray[10] = { [2].y = yv2, [2].x = xv2, [0].x = xv0 };
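
For indices that belong to an enum type, a designated initializer keeps each entry tied to its enumerator. The sketch below uses only standard C99 syntax; the enum and the names color_names and color_name are invented for illustration.

```c
enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE, COLOR_COUNT };

/* Each name is tied to its enum value, so reordering the enum
 * cannot silently shift the table. */
static const char *const color_names[COLOR_COUNT] = {
    [COLOR_BLUE]  = "blue",
    [COLOR_RED]   = "red",
    [COLOR_GREEN] = "green",
};

/* Returns the name for c, or "unknown" when c is out of range. */
const char *color_name(enum color c) {
    return ((unsigned)c < (unsigned)COLOR_COUNT) ? color_names[c] : "unknown";
}
```

The entries are deliberately listed out of order to show that the designator, not the position, decides where each value goes.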

Pass an object as a parameter#

If x is a read-only parameter:

  • If x is big and can be NULL, use a pointer to const: function(const T* x);
  • If x is big and cannot be NULL, use a const reference: function(const T& x);
  • In other cases, pass by value: function(const T x);

If x is an output parameter:

  • If x can be NULL, use a pointer: function(T* x);
  • If x cannot be NULL, use a reference: function(T& x);

Buffered stdout but non-buffered stderr#

stdout (or cout) is buffered, so the output reaches the console in portions, which users are usually not aware of. If the program prints something and then crashes before the buffer is flushed, the user sees nothing, which makes debugging harder.

Thus, for important information, including debug output, it is better to use stderr, which is unbuffered by default, or cerr, which flushes after every output operation. Another way is to make stdout unbuffered by calling setbuf(stdout, NULL); before any output is written.
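
A third option, when only a few messages matter, is to flush explicitly after writing them. The helper below is a hypothetical sketch built on the standard fprintf and fflush calls.

```c
#include <stdio.h>

/* Hypothetical helper: writes one line and flushes immediately, so the
 * text is not left in the buffer if the program crashes right after.
 * Returns 0 on success, -1 on failure. */
int log_line(FILE *stream, const char *message) {
    if (fprintf(stream, "%s\n", message) < 0) {
        return -1;
    }
    return fflush(stream) == 0 ? 0 : -1;
}
```

For example, log_line(stdout, "checkpoint reached") keeps stdout buffered for ordinary output while making sure this particular line is written out.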

Negative error code#

The simplest way to tell the caller whether a function succeeded is to return a boolean value: false in case of error, true in case of success. To report more than two states, a number can be returned instead. With signed numbers, negative values can indicate different kinds of failure and non-negative values different kinds of success:

/*
 * func() - A function that does something
 * input:
 *      none
 * output:
 *      -2: error on buffer
 *      -1: error on transmission
 *       0: success but no response
 *       1: success with a response = NACK
 *       2: success with a response = ACK
 */
int func(void);
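
On the caller's side, this convention lets a single sign test separate failure from success. The sketch below mirrors the values listed above; the enum and function names are hypothetical.

```c
/* Hypothetical status codes mirroring the values listed above. */
enum xfer_status {
    XFER_ERR_BUFFER       = -2,
    XFER_ERR_TRANSMISSION = -1,
    XFER_OK_NO_RESPONSE   =  0,
    XFER_OK_NACK          =  1,
    XFER_OK_ACK           =  2
};

/* Any negative code is a failure, whatever its exact meaning. */
int xfer_failed(int status) {
    return status < 0;
}

/* Maps a status code to a short description. */
const char *xfer_describe(int status) {
    switch (status) {
    case XFER_ERR_BUFFER:       return "error on buffer";
    case XFER_ERR_TRANSMISSION: return "error on transmission";
    case XFER_OK_NO_RESPONSE:   return "success but no response";
    case XFER_OK_NACK:          return "success with NACK";
    case XFER_OK_ACK:           return "success with ACK";
    default:                    return "unknown status";
    }
}
```

New failure codes can be added later with further negative values without changing callers that only test xfer_failed().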

Bitwise operation#

Working with bits is very common in C, and in programming in general. To avoid typos and mix-ups when specifying flags, they can be defined using bit shifting:

#define FLAG1 (1 << 0)
#define FLAG2 (1 << 1)
#define FLAG3 (1 << 2)
#define FLAG4 (1 << 3)
#define FLAG5 (1 << 4)

and some macros can be used to work on bits:

#define IS_SET(flag, bit) (((flag) & (bit)) ? true : false)
#define SET_BIT(var, bit) ((var) |= (bit))
#define REMOVE_BIT(var, bit) ((var) &= ~(bit))
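
Put together, the flags and macros can be used as in the sketch below. It repeats the definitions so that it compiles on its own, with two small assumed changes: unsigned literals (1u) for the shifts and a != 0 test so that stdbool.h is not needed. demo_flags is a made-up name.

```c
#include <stdint.h>

#define FLAG1 (1u << 0)
#define FLAG2 (1u << 1)
#define FLAG3 (1u << 2)

#define IS_SET(flag, bit)    (((flag) & (bit)) != 0)
#define SET_BIT(var, bit)    ((var) |= (bit))
#define REMOVE_BIT(var, bit) ((var) &= ~(bit))

/* Sets FLAG1 and FLAG3, then clears FLAG1 again. */
uint32_t demo_flags(void) {
    uint32_t flags = 0;
    SET_BIT(flags, FLAG1);      /* flags == 0x1 */
    SET_BIT(flags, FLAG3);      /* flags == 0x5 */
    REMOVE_BIT(flags, FLAG1);   /* flags == 0x4 */
    return flags;
}
```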

Macro overloading#

A macro can be defined to expand to another macro, depending on the number of arguments:

#define GET_FUNC(_1, _2, _3, NAME,...) NAME
#define FUNC(...)                                                   \
    GET_FUNC(__VA_ARGS__,                                           \
        FUNC3,                                                      \
        FUNC2                                                       \
    )(__VA_ARGS__)

When we use FUNC(x, y), it expands to
GET_FUNC(x, y, FUNC3, FUNC2)(x, y). The 4th argument, FUNC2, is picked as NAME, so the result is
FUNC2(x, y).

When we use FUNC(x, y, z), it expands to
GET_FUNC(x, y, z, FUNC3, FUNC2)(x, y, z). The 4th argument is now FUNC3, so the result is
FUNC3(x, y, z).

To support 4 arguments, and so on, extend the pattern:

#define GET_FUNC(_1, _2, _3, _4, NAME,...) NAME
#define FUNC(...)                                                   \
    GET_FUNC(__VA_ARGS__,                                           \
        FUNC4,                                                      \
        FUNC3,                                                      \
        FUNC2                                                       \
    )(__VA_ARGS__)
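
A complete, compilable instance of the pattern might look like the following. The names SUM, PICK_SUM, sum2, and sum3 are invented for this sketch, and a trailing 0 is added so that the variadic part is never empty, which some strict compilers before C23 complain about.

```c
static int sum2(int a, int b)        { return a + b; }
static int sum3(int a, int b, int c) { return a + b + c; }

/* PICK_SUM expands to its 4th argument; the trailing 0 only keeps the
 * variadic part non-empty. */
#define PICK_SUM(_1, _2, _3, NAME, ...) NAME
#define SUM(...) PICK_SUM(__VA_ARGS__, sum3, sum2, 0)(__VA_ARGS__)
```

SUM(1, 2) selects sum2 and SUM(1, 2, 3) selects sum3, following the same expansion steps as FUNC above.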

Use goto if it can reduce the complexity#

Using goto is generally considered bad and harmful, but when used with care it can reduce complexity, for example by keeping all cleanup code in one place.

For example:

void func(...) {
    byte* buf1=malloc(...);
    byte* buf2=malloc(...);
    FILE* f=fopen(...);
    if (f==NULL)
        goto func_cleanup_and_exit;
    if (something_goes_wrong_1)
        goto func_close_file_cleanup_and_exit;

func_close_file_cleanup_and_exit:
    fclose(f);
func_cleanup_and_exit:
    free(buf1);
    free(buf2);
}
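
A self-contained variant of the same idea, using a single cleanup label, is sketched below. The function process and its fail_at parameter are hypothetical; fail_at only simulates an error at a given step.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: fail_at selects which step fails (0 = none).
 * Returns 0 on success, -1 on failure. */
int process(size_t size, int fail_at) {
    int result = -1;
    char *buf1 = NULL;
    char *buf2 = NULL;

    buf1 = malloc(size);
    if (buf1 == NULL || fail_at == 1)
        goto cleanup;

    buf2 = malloc(size);
    if (buf2 == NULL || fail_at == 2)
        goto cleanup;

    memset(buf1, 0, size);
    memcpy(buf2, buf1, size);
    result = 0;

cleanup:
    free(buf2);   /* free(NULL) is a no-op, so one label is enough */
    free(buf1);
    return result;
}
```

Every exit path goes through the same label, so a buffer cannot be leaked by an early return that forgets to free it.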




Compiler warnings#

It is worth turning on -Wall to see many common problems, which are usually small errors that are hard to notice. In GCC, it is also possible to turn all warnings into errors with -Werror. If enabled, any warning stops the compilation and must be fixed.

Example 1:

int f1(int a, int b, int c) {
    int ret = a + b + c;
    printf("%d", ret);
}   // missing: return ret;

int main() {
    printf("%d", f1(1,2,3));
}

The program still compiles and runs, but main() prints a wrong value because f1() never returns one. Using the result of such a function is undefined behavior; in practice the caller reads whatever happens to be in the return register. With -Wall, GCC reports this through -Wreturn-type.

Example 2:

bool f1() {
    return condition ? true : false;
}

bool is a 1-byte datatype.

On x86, the compiler generates code that sets AL, the lowest byte of the EAX register, to 0x01 or 0x00, because the return size is 1 byte. The higher bytes of EAX are not changed.

However, if a different file calls f1() without a declaration, f1() is assumed to return int, not bool. The compiler then generates code that compares all 4 bytes of an int of which only the lowest byte was updated by f1(). The higher bytes may hold leftover values, which leads to a wrong comparison. With -Wall, GCC reports the missing declaration through -Wimplicit-function-declaration:

int main() {
    if (f1()) {} // f1() is not declared in this file, so it is treated as returning int
}

Cached data#

Modern CPUs have L1, L2, and L3 caches. Data is fetched from memory in whole cache lines: if the L1 cache has 64-byte lines, for example, each access that misses the cache brings in 64 bytes at once.

So if a data structure is larger than 64 bytes, it pays to divide it into 2 parts: the most frequently accessed fields and the rarely accessed ones, and to place the most frequently accessed fields in the first 64 bytes. The same applies to C++ classes.
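
As a sketch of this layout, the struct below puts its frequently used fields first and checks their position with offsetof. The struct, its fields, and the 64-byte line size are all assumptions made for illustration; the benefit also depends on the object itself starting on a cache-line boundary.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64   /* assumption: typical, but not universal */

struct session {
    /* Hot fields: touched on every operation. */
    uint64_t packets_seen;
    uint64_t bytes_seen;
    uint32_t state;
    uint32_t flags;
    /* Cold fields: rarely read. */
    char     name[64];
    char     description[192];
    uint64_t created_at;
};

/* Returns 1 when the last hot field ends inside the first cache line. */
int hot_fields_in_first_line(void) {
    return offsetof(struct session, flags) + sizeof(uint32_t)
           <= CACHE_LINE_SIZE;
}
```

A check like this can be kept next to the struct so that adding a field in the wrong place is noticed early.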