Sunday, 9 February 2014

What are buffer overflows?

To the uninitiated, the term "buffer overflow" may evoke images of hackers, bugs and exploits, but this article shows that you can easily create your own buffer overflow too.

A buffer overflow occurs when a value is assigned to a variable, but the value exceeds the memory allocated for the variable. A typical example is the C-string:

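#include <cstdlib>   // system
#include <cstring>   // strcpy
#include <iostream>
using namespace std;

int main()
{
    // Each array holds 8 bytes, but an 8-character string needs 9 (8 + '\0'),
    // so even these first two copies write one byte past the end of the array.
    char a[8], b[8];
    strcpy(a,"12345678");
    strcpy(b,"abcdefgh");
    cout << "Before:" << endl;
    cout << a << endl;
    cout << b << endl;

    // Deliberate overflow: 27 bytes are copied into an 8-byte array.
    // This is undefined behaviour; the result depends on compiler and platform.
    strcpy(b,"abcdefghijklmnopqrstuvwxyz");
    cout << "After:" << endl;
    cout << a << endl;
    cout << b << endl;
    system("pause");   // Windows-only: waits for a key press
    return 0;
}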


On my system it produces the following output:

Before:
12345678
abcdefgh
After:
qrstuvwxyz
abcdefghijklmnopqrstuvwxyz
Press any key to continue...


As observed, although no change was made to the variable a, its value has changed. It has been overwritten by text from variable b. The memory looks something like this, where \0 indicates the end of the C-string:

  starting of variable b                            starting of variable a
             v                                                v
Before: ...  a  b  c  d  e  f  g  h  \0  ?  ?  ?  ?  ?  ?  ?  1  2  3  4  5  6  7  8  \0 ...
After : ...  a  b  c  d  e  f  g  h   i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z  \0 ...


The memory address increases from left to right. It is useful to imagine this region of memory as a "stack", that is, the FIRST item to go in will come out LAST. On my system the bottom of the stack has a higher address than the top of the stack, so the stack grows toward lower addresses. Here the compiler placed the variable b at a lower memory address than a (i.e. higher on the stack), which is why the variable b is on top (to the left) of variable a. This ordering is not guaranteed: a compiler is free to arrange local variables differently.

Note that char a[8] and char b[8] each declare a variable that takes only 8 bytes, and the compiler arranges the stack accordingly. (Homework: What are those question marks?) When variable b is written with the string a-z, strcpy does not check the size of the destination, so the characters after h simply "overflow" past the end of b and eventually into the next variable, thus overwriting the variable a!

In summary, a buffer overflow occurs when data is larger than the memory allocated for it. Trivial programmer mistakes can lead to hard-to-detect bugs and, even worse, security exploits. Fortunately, many modern compilers and debugging tools can check for buffer overflows at run time, for example the run-time error checks (RTC) in Microsoft Visual C++.
