- C Programming Language Review
  - C Compilation Process
  - C Macros
  - Specific Sized Numbers
  - Undefined Behaviours
    - What are Undefined Behaviours
    - Frequent Undefined Behaviours
      - Signed Overflow
      - Division by Zero
      - NULL Pointer Dereference
      - Value of a Pointer to Object with Ended Lifetime
      - Use of Indeterminate Value
      - String Literal Modification
      - Access Out Of Bounds
      - Pointer Used After Freed
  - Memory Model in C
C Programming Language Review
This document is not intended to be a primer on the C programming language. It only gives you the essential information you need to complete your programming assignments.
We will focus on common errors, caveats, and important concepts that should reduce your cognitive overhead when you program in C. We will also recommend some tools that help you speed up your development and debugging.
C Compilation Process
Unlike interpreted languages, C is compiled: a compiler translates C source code into machine code before the program runs.
A full compilation in C is depicted in the following figure:
A detailed explanation can be found here.
C Macros
You often see C preprocessor macros defined to create "small functions".
But they aren't actual functions; the preprocessor just rewrites the text of the program before it is compiled.
#include simply copies the named file into the current file, and a function-like macro is expanded by substituting its arguments textually.
Example:
#define twox(x) (x + x)
// twox(3); => (3 + 3);
// this can lead to unexpected behaviour:
// int y = 2;
// int z = twox(y++); => z = (y++ + y++);
// y is modified twice with no sequence point in between, which is undefined behaviour; z is not guaranteed to be 4
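As a minimal sketch of how both pitfalls can be avoided, the snippet below contrasts a fully parenthesised macro with a static inline function. The names TWOX and twox_fn are hypothetical and only illustrate the idea.

```c
/* Parenthesise every use of the parameter and the whole expansion so that
 * operator precedence at the call site cannot change the meaning. */
#define TWOX(x) ((x) + (x))

/* A real function evaluates its argument exactly once, so a side effect
 * in the argument (such as y++) happens only once. */
static inline int twox_fn(int x)
{
    return x + x;
}
```

With int y = 2;, the call twox_fn(y++) returns 4 and leaves y equal to 3. TWOX(y++) would still expand to two increments, so parentheses alone do not fix the side-effect problem.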
Specific Sized Numbers
C only guarantees the minimum and relative sizes of "int", "short", and so on. The range that each type can represent depends on the implementation.
The integer data types range in size from at least 8 bits to at least 32 bits. The C99 standard extends this range to include integer sizes of at least 64 bits.
The types are ordered by width, so a wider type is guaranteed to be at least as large as a narrower one. E.g. long long int can represent all values that a long int can represent.
If you need an exact width, you can use the {u|}int{#}_t types to specify:
- signed or unsigned
- number of bits
For example:
- uint8_t is an unsigned 8-bit integer
- int64_t is a signed 64-bit integer
All these types are defined in the header file stdint.h rather than in the language itself.
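The following is a small sketch of how the fixed-width types might be used. The helper names wrap_add_u8 and square_i64 are made up for illustration and are not part of stdint.h.

```c
#include <stdint.h>

/* Unsigned arithmetic wraps modulo 2^N, so the result is well defined.
 * The operands are promoted to int for the addition, and the cast
 * converts the sum back to 8 bits. */
uint8_t wrap_add_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Widening to int64_t before multiplying keeps the product of two
 * 32-bit values in range. */
int64_t square_i64(int32_t v)
{
    return (int64_t)v * (int64_t)v;
}
```

For instance, wrap_add_u8(250, 10) yields 4, because 260 does not fit in 8 bits and wraps around.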
Undefined Behaviours
The C language standard precisely specifies the observable behavior of C language programs, except for:
- Undefined behaviours
- Unspecified behaviours
- Implementation-defined behaviours
- Locale-specific behaviours
More information about these can be found here.
We are going to focus on undefined behaviours in this section.
What are Undefined Behaviours
- The language definition says: "We don't know what will happen, nor do we care, for that matter".
- This often means unpredictable behaviour.
- Often contributes to bugs that seem random and hard to reproduce.
What we have to do is to pay attention to these possible behaviours and avoid them in the source code.
We will use UB and undefined behaviours interchangeably in the later sections.
Frequent Undefined Behaviours
Here are some common undefined behaviours you may have already encountered:
Signed Overflow
#include <limits.h>
int foo(int a)
{
    int b = INT_MAX + a; // UB whenever a > 0: the sum overflows int, so b can be anything
    return b;
}
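One common way to avoid this is to test whether the sum would overflow before performing it. The sketch below assumes a hypothetical helper named checked_add; it reports failure instead of overflowing.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *out and returns true if the sum fits in an int.
 * Returns false, without touching *out, if it would overflow. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

The comparisons are written so that they cannot overflow themselves: INT_MAX - b is only evaluated when b is positive, and INT_MIN - b only when b is negative.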
Division by Zero
#include <stdio.h>
int func() {
    int gv;
    printf("Enter an integer number: ");
    scanf("%d", &gv);
    return (23 / gv); // UB if gv is 0
}
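A guarded division could look like the sketch below. The name checked_div is hypothetical. Besides a zero divisor, it also rejects INT_MIN / -1, whose result does not fit in an int and is therefore undefined as well.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores num / den in *out and returns true when the division is defined.
 * Returns false, without touching *out, otherwise. */
bool checked_div(int num, int den, int *out)
{
    if (den == 0) {
        return false;               /* division by zero */
    }
    if (num == INT_MIN && den == -1) {
        return false;               /* quotient would overflow int */
    }
    *out = num / den;
    return true;
}
```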
NULL Pointer Dereference
#include <stddef.h>
int foo(int* p)
{
    int x = *p;
    if (!p)
        return x; // Either UB above or this branch is never taken
    else
        return 0;
}
int bar()
{
    int* p = NULL;
    return *p; // Unconditional UB
}
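The fix is to test the pointer before dereferencing it, not after. A minimal sketch, assuming a hypothetical helper named read_or_default:

```c
#include <stddef.h>

/* Returns *p when p is a valid, non-NULL pointer and fallback otherwise.
 * The NULL check happens before the dereference. */
int read_or_default(const int *p, int fallback)
{
    if (p == NULL) {
        return fallback;
    }
    return *p;
}
```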
Value of a Pointer to Object with Ended Lifetime
int* fun(int x) {
    int y = 2;
    y = x + y;
    return &y; // y's lifetime ends when fun returns; using the returned pointer is UB
}
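Two usual alternatives are to return the value itself or to allocate an object that outlives the call. The sketch below uses the hypothetical names add_two and add_two_heap to show both options.

```c
#include <stdlib.h>

/* Option 1: return the value itself; no pointer outlives the call. */
int add_two(int x)
{
    int y = 2;
    return x + y;
}

/* Option 2: allocate the object on the heap so it outlives the call.
 * The caller owns the result and must free() it. Returns NULL on failure. */
int *add_two_heap(int x)
{
    int *y = malloc(sizeof *y);
    if (y != NULL) {
        *y = x + 2;
    }
    return y;
}
```

Returning by value is usually the simpler choice for small objects, since it leaves nothing for the caller to free.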
Use of Indeterminate Value
#include <stdio.h>
int main() {
    int a;
    int b = a; // UB
    printf("a = %d\n", a);
    printf("b = %d\n", b);
    return 0;
}
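The remedy is to give every variable a value before it is first read. As a small sketch, the hypothetical function sum_first initialises its accumulator at the point of declaration:

```c
/* Sums the first n elements of values. The accumulator starts at 0,
 * so it is never read while indeterminate, even when n is 0. */
int sum_first(const int *values, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += values[i];
    }
    return total;
}
```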
String Literal Modification
#include <stdio.h>
int main() {
    char *p = "some text here";
    p[2] = 'O'; // UB
}
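If the text needs to change, it should live in writable storage such as a char array, which holds its own copy of the literal. The helper set_char below is a hypothetical illustration of writing into such a buffer.

```c
#include <stddef.h>
#include <string.h>

/* Overwrites s[i] only if i is inside the string.
 * s must point to writable storage, for example a char array. */
void set_char(char *s, size_t i, char c)
{
    if (s != NULL && i < strlen(s)) {
        s[i] = c;
    }
}
```

For example, after char text[] = "some text here"; the call set_char(text, 2, 'O') is fine because text is an array initialised with a copy of the literal. Declaring pointers to literals as const char * lets the compiler reject accidental writes.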
Access Out Of Bounds
#include <stdio.h>
int main() {
    int arr[5] = { 1, 2, 3, 4, 5 };
    int b = arr[7]; // UB
    printf("b = %d\n", b);
}
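C does not check array indices, so the program has to do it. A minimal sketch, assuming a hypothetical accessor get_at that takes the array length explicitly:

```c
#include <stdbool.h>
#include <stddef.h>

/* Copies arr[i] into *out and returns true when i is a valid index.
 * Returns false, without reading the array, when i is out of bounds. */
bool get_at(const int *arr, size_t len, size_t i, int *out)
{
    if (i >= len) {
        return false;
    }
    *out = arr[i];
    return true;
}
```

For a true array (not a pointer), the length can be computed as sizeof arr / sizeof arr[0].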
Pointer Used After Freed
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
    char str[9] = "tutorial";
    char ftr[9] = "aftertut";
    int bufsize = strlen(str) + 1;
    char *buf = (char *)malloc(bufsize);
    if (!buf) {
        return EXIT_FAILURE;
    }
    free(buf);
    strcpy(buf, ftr); // UB
    printf("buf = %s\n", buf);
    return EXIT_SUCCESS;
}
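A defensive pattern is to finish all work with a buffer before freeing it, and to clear the pointer right after the free so that later code can test for NULL. The helpers dup_string and release below are hypothetical names used only for this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of src, or NULL if allocation fails.
 * The caller owns the copy and must release it. */
char *dup_string(const char *src)
{
    size_t size = strlen(src) + 1;   /* include the terminating '\0' */
    char *buf = malloc(size);
    if (buf != NULL) {
        memcpy(buf, src, size);
    }
    return buf;
}

/* Frees *p and sets it to NULL so a stale pointer is not left behind. */
void release(char **p)
{
    free(*p);
    *p = NULL;
}
```

Clearing the pointer does not make a later dereference defined, but it turns a silent use-after-free into a NULL pointer that can be checked for.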
This list goes on; you can find more information about undefined behaviours in this thesis and in the C99 Standard.