Memory management examples
May. 31st, 2009 04:31 pm

I wrote a long essay for Liv trying to give a whirlwind tour through the different ways programs have allocated memory between 1980s BASICs and today, but realised I didn't necessarily have wide enough knowledge to really do it justice.
It's obviously quite specific as it's aimed at people who've done some programming, but not recently.
If you've any suggestions for how it could be helpful, please shout. I was originally going to put it on toothywiki so someone could make sweeping changes if they happened to feel like it, but didn't feel like trying to format it.
1. STATIC MEMORY ALLOCATION
Suppose you know in advance how many memory locations your program will need to store things. Then you can just designate them in advance, when the program is written or compiled, and not need to do anything special while the program runs.
In C and other languages, this is represented by global variables. Global variables are declared somewhere near the start of the program and can be accessed at any time while the program runs. That means the compiler knows exactly how many global variables there are, so when it compiles the program it can say "OK, there's 20 bytes of code, and a further 10 bytes to store variable 1, variable 2, etc."
A simple part of a program might be:
unsigned int N; // global variable
int sum_1_to_20() {
N=20;
return N * (N+1) / 2;
}
And this is compiled into machine code:
mov dword ptr [N (405120h)],14h   // Store 20 (0x14) in memory location 0x405120
mov eax,dword ptr [N (405120h)]   // Load memory location 0x405120 into register eax
lea ecx,[eax+1]                   // Put that value plus one into register ecx
imul eax,ecx                      // Multiply register eax by register ecx
shr eax,1                         // Shift all the bits in eax right (ie. divides by two)
ret                               // Return from function (answer in register eax)
FOOTNOTE
In actual fact, if you tried this example, you wouldn't get _exactly_ that unless you knew what to fiddle with, because the compiler should normally optimise the code. In this case, the compiler would probably do something like:
mov eax, 0xD2 // store 0xD2 (210) in register eax
ret // Return from function (answer in register eax)
Which gives the same result but is much shorter and doesn't need any memory locations at all. But it doesn't really illustrate the principle :)
2. DYNAMIC MEMORY ALLOCATION
Sometimes you don't know in advance. If a function is recursive (or just non-trivial), you may need an undetermined number of local variables. In that case, these are allocated by the program as it goes along. One way is to start from the last byte in memory and work backwards. There is generally a special CPU register ("esp" on many Intel CPUs) used to keep track of how far you've got. Local variables are stored at a fixed offset relative to the bottom of the stack (where esp points), so you might have:
( variables for ) ( variables for )( variables for )
(innermost function) (2nd outermost func)(outermost function)
[ BLANK ][ MEMORY ][ var1 ][ var2 ] ... [ var1 ][ var2 ][ var1 ][ var2 ]
^
|
esp
If your function was:
unsigned int factorial(volatile unsigned char N) {
// Calculate factorial using recursion.
// (Probably not the best way in a real program)
if (N<=0) {
return 1;
} else {
return N * factorial(N-1);
}
}
N is a parameter of the function, but in fact, this acts almost identically to a local variable. This would become something like:
// When the program starts, the stack looks something like
// A B C D E
// [BLANKMEM][BLANKMEM][0x00000006][0x00000007][0x00000008]
// ^-esp ^-param ^-parameter of outermost function
mov eax,dword ptr [esp+4] // Parent func put 0x00000006 value in stack
// int is four bytes, so at esp+4 (C)
// We read this into register eax.
test eax,eax // Test if eax is zero or not.
ja factorial+0Eh (40106Eh) // Jump to memory location 40106Eh if not
// [N was zero]
mov eax,1 // store our answer (ie. 1) in eax
ret // return
// [N was not zero]
mov ecx,dword ptr [esp+4] // Read 0x0000006 (C) into register ecx
sub ecx,1 // Subtract one
push ecx // Push ecx. ie:
// Store 0x0000005 at memory loc esp (B)
// Decrement esp by 4. (A)
call factorial (40100Ah) // Call next innermost function
// next-innermost answer stored in eax
// ie. 5! aka 120
mov edx,dword ptr [esp+8] // Read 0x0000006 (C) into register edx
add esp,4 // Increment esp by 4 (undoing "push") (B)
// Now esp at (B) same as when we started
// Which is important
imul eax,edx // Multiply eax=120 by edx=0x00000006
// Store our answer in eax as well
ret // return
In fact, my diagram omitted several things, such as the fact that call pushes the memory address the CPU is supposed to return to onto the stack as well, and ret takes it off again.

During execution, esp should generally point to the word of memory just below the last value allocated, so local variables are always at esp+a_bit and parameters at esp+a_bit_more. Notice that when you come out of a function, you should restore the stack to the state it was in when you went in.
3A. POINTERS
However, our plan so far relied on each variable being local to a function. This is not always the case. You may want to say something like:
void func() {
FILE* f = fopen("c:/foo.txt", "r");
char firstLine[1000];
/* errorcheck */ if (f == 0) return;
fgets(firstLine, 1000, f);
printf( "%s", firstLine);
fclose(f);
}
Those are all real C library functions. What does fopen actually do? One way or another, it's going to store some information about the file somewhere in memory. This is a FILE structure (an object) consisting of a few small variables storing information about the file, and a pointer to a buffer of a few thousand bytes. fopen says "Hey, I need some memory to store a FILE structure, and some more for a buffer".
fgets looks in this memory to see the first line of the file, reading more data into the buffer if necessary. fclose says "Hey, I'm done with this memory".
How does the address of the memory get from fopen to fgets to fclose? FILE* is the address of a FILE structure in memory, ie. a pointer. When fopen allocates the memory it returns the address, and then fclose knows which one to get rid of.
In this case, your program doesn't need to know anything about the contents of FILE or where it is in memory, and it probably shouldn't. In C you can mess with the address directly if you like, but that's generally meaningless: the only valid things to do with the address are done inside fopen, fgets and fclose.
Nowadays the library functions might often use a simple class instead of a pointer, to stop your program being able to do this. High level languages often do this, and don't make it obvious which "=" operations are saying "copy this data" and which are saying "copy the address of this data". But it's useful to know the difference in your head.
(An object which contains nothing more than this is called a "handle". Alternatively, if you only want to call some other function, there's a syntax which means "access this parameter as if it were a local variable, but internally, don't allocate any memory for it, just make it point to the variable the caller passed in".)
What you do know is that a FILE* only takes up approx four bytes in memory, so you can safely pass it to other functions without worrying that the entire 3GB file is being copied. However, you have to be careful: if one function uses fopen to create a FILE* and passes it to another function which calls fclose, the first function mustn't try to read from it afterwards without checking whether it's been closed.
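To make the "passing the address around" idea concrete, here's a small sketch (readFirstLine and demo are invented names for this example, and tmpfile() is used instead of fopen so it doesn't depend on a real file existing on disk). The helper function only ever receives the few-byte address, never a copy of the file:

```cpp
#include <cstdio>
#include <cstring>
#include <cassert>

// readFirstLine just receives the address of the FILE structure --
// the file's contents are never copied, only the pointer is.
void readFirstLine(FILE* f, char* buf, int bufSize) {
    if (fgets(buf, bufSize, f) == 0)
        buf[0] = '\0';
}

int demo() {
    FILE* f = tmpfile();          // a scratch FILE*, like fopen would return
    if (f == 0) return 1;
    fputs("hello\nworld\n", f);
    rewind(f);                    // go back to the start of the file
    char line[100];
    readFirstLine(f, line, 100);  // passes ~4-8 bytes, not the whole file
    fclose(f);                    // one fclose for the one tmpfile
    return strcmp(line, "hello\n") == 0 ? 0 : 1;
}
```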
The reason why pointers are tricky is that they directly expose "writing to arbitrary parts of memory". If you used an object instead, the fgets and fclose functions could include some code that checks the memory hasn't gone away. If you use a raw pointer, it's quicker, because you don't have any of that checking code. But it's dangerous: if you get rid of the memory, and then call fclose again, it will attempt to look in that memory, see random garbage[1], and do something unpredictable. (This may not actually happen with FILE*s, but it's the sort of thing that does happen every day, sometimes millions of times a second :))
[1] This is an actual technical term, honest :)
3B. THE HEAP
fopen says "I need some memory to store..." Where is that memory? In example 2 we just stored all the memory we needed consecutively, but that doesn't work, because we might open lots of files, and then close them in some random order, so we'd end up with lots of gaps.
In fact, every compiled program contains a simple memory manager that asks the operating system for a bunch of memory and keeps track of which bits are being used. This does sensible things like putting requests for small amounts of memory in small gaps and leaving large gaps for potentially large requests, etc.
But when you're coding, you don't have to worry about this for years and years -- it's low level even for C programmers. All you need to know is that the memory is allocated _somewhere_.
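As a rough illustration of what that built-in memory manager does, here's a deliberately over-simplified toy (toy_malloc and toy_free are invented names; real mallocs track variable-sized blocks, merge neighbouring gaps, and much more -- this only shows the "hand out addresses and note down which bits are taken" idea):

```cpp
#include <cstddef>
#include <cassert>

// One big pool carved into fixed-size slots, plus a table noting
// which slots are currently in use.
const int SLOT_SIZE = 32;
const int NUM_SLOTS = 64;
static char pool[SLOT_SIZE * NUM_SLOTS];
static bool used[NUM_SLOTS];          // the "notes" about what's taken

void* toy_malloc(size_t n) {
    if (n > SLOT_SIZE) return 0;      // too big for this toy
    for (int i = 0; i < NUM_SLOTS; i++)
        if (!used[i]) {               // find the first free slot
            used[i] = true;
            return pool + i * SLOT_SIZE;   // hand back its address
        }
    return 0;                         // pool exhausted
}

void toy_free(void* p) {
    // Like free(), this only needs the address: the slot number can be
    // recomputed from it, so the caller never says how big it was.
    int i = (int)(((char*)p - pool) / SLOT_SIZE);
    used[i] = false;
}
```

Notice how a freed slot becomes a "gap" that the next request can reuse.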
In C, there are honest-to-god functions which do this job. Eg:
char * str = malloc(100); // Malloc grabs memory from operating system, notes down
// internally that the first hundred bytes are being used
// and returns the address of the memory you can use.
str[0]='H'; // Put things in that memory
str[1]='e';
str[2]='l';
str[3]='l';
...
int * pi1 = malloc(sizeof(int)); // Malloc allocates another four bytes elsewhere
free(str); // malloc keeps a list of what memory is used, so
// free only needs to know the address to know which
int * i2 = malloc(sizeof(int)); // Malloc allocates another four bytes
printf(str); // DON'T DO THIS! The second call to malloc may or may not
// have reused the memory that was allocated to "str".
// As soon as you call free, the memory location in "str"
// becomes useless. Ideally you should set it to "0" at
// once because 0 is ALWAYS wrong[1], and then if you use
// it, you'll know immediately.
// On the other hand you MUST remember to call free at some
// point, otherwise malloc will keep holding onto that
// memory until your program ends.
In C++, there's a special language construct to allocate memory. You can say:
int * p = new int;
*p = 3;
...
delete p;
"new" is fancier than malloc because it allocates memory for a specific type of variable, but when it's compiled it does much the same thing. When you say "new" you get new memory (potentially a LOT of memory), and each call to "new" has to be matched with a call to "delete".
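A small sketch of that matching-up rule (sum_squares is an invented example). It also shows the array form, new[], which has to be matched with delete[] rather than plain delete:

```cpp
#include <cassert>

int sum_squares(int n) {
    int* total = new int;        // new picks the size from the type (int)
    *total = 0;
    int* squares = new int[n];   // the array form allocates n ints at once
    for (int i = 0; i < n; i++) {
        squares[i] = i * i;
        *total += squares[i];
    }
    int answer = *total;
    delete[] squares;            // new[] must be matched with delete[]...
    delete total;                // ...and plain new with plain delete
    return answer;
}
```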
Higher level languages don't require you to explicitly request memory at all: you just make a variable, and the language rushes in and makes sure the right memory is available for it, in a way described below.
[1] If your computer can actually give you memory address zero, it's written into the C specification that the compiler has to use some other way of specifying that a memory address is definitely invalid, and it has to turn "if(str==0)" into an appropriate test.
3B-AND-A-HALF. A DIGRESSION
C has two syntaxes for using variables. One is:
{
int i; // <----- program chooses memory location for i automatically
i = 2*3;
...
} // <----- i goes away
The other is:
{
int * i = &j; // <----- explicitly choose which memory address variable i
// refers to. (The same one as an earlier variable j! & means "address of".)
*i = 2*3;
...
} // <----- i goes away. However, j still lives at that memory address
These two snippets of code do much the same thing, except that in the second example the variable i uses a specific memory location (one that was allocated earlier for variable j), while in the first example a fresh memory address is chosen for i.

The two different syntaxes can do much the same thing, except that the first allows you to allocate a variable automatically on the stack, and the second allows you to make a variable refer to a specific memory location explicitly. If there were EITHER a syntax for "make int i point to a different memory location instead" OR a syntax for "allocate a new memory location on the stack for a local variable, and make int * i point to it", you could get by with just one.
I think some people would find one of those syntaxes less confusing, because you'd always access a variable the same way, rather than sometimes saying i (for a local variable) and sometimes saying *i (for a pointer).
4. WHERE WE ARE SO FAR
In C, we've so far discovered two ways of allocating memory. If we declare a local variable in a function, a memory location (or more if the variable is bigger) is set aside for it on the stack. This is automatically set aside when the function begins, and goes away again when the function ends.
If we want to share memory between functions, we need to explicitely reserve memory somewhere other than the stack (by calling malloc or new) and then give another function a pointer to it, and then eventually call free or delete.
This works perfectly well. However, there are at least two obvious mistakes you can make. One is that if there's a piece of memory you allocate in 17 places and free in 23 places, you may forget one. If this happens in a loop, every time round you will reserve slightly more memory than you free again, and eventually your program will have millions of copies of this memory allocated but never used.
The other is that if you use the same memory in two different parts of the program, one of them has to deallocate it, but if you don't know which part of the program will run first, it's easy for one to delete it while the other is still using it. Eg. you can allocate a local variable and return a pointer to (or into the middle of) the memory allocated for that variable. In that case, the memory is deallocated at the end of the function, and the program may try to read it later on.
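A sketch of the right way round (makeNumber and useNumber are invented names). The broken version would be "int* bad() { int x = 42; return &x; }", which hands back the address of stack memory that vanishes at the "}". Allocating on the heap instead makes the memory outlive the function, at the price that the caller must remember to delete it exactly once:

```cpp
#include <cassert>

int* makeNumber() {
    int* p = new int;    // heap memory survives past the "}"
    *p = 42;
    return p;           // safe: the address stays valid until delete
}

int useNumber() {
    int* n = makeNumber();
    int copy = *n;      // read it while it is still allocated
    delete n;           // deallocate exactly once...
    n = 0;              // ...and zero the pointer so any reuse fails loudly
    return copy;
}
```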
Even if you don't make that mistake, imagine if your function returned a big object, but the parent function only needed one element of it. You'd ideally like the memory for the object to go away as soon as the parent function has used what it needs, but if you allocated the memory (so you can return the address of the memory rather than copying the memory itself from one func to another), you need to call free or delete afterwards.
In PERL you can blithely allocate variables wherever you like, and the memory is always supposed to just work without you having to worry about it. How does this work?
5. MANAGED MEMORY AND SMART POINTERS
When is memory officially leaked? When we allocate it and don't deallocate it. But to deallocate it, we need to know the address. Consider:
void func() {
int * i = new int;
*i = 2;
}
The pointer is a local variable so at the "}" it goes away, at which point the memory we allocated is definitely never going to be deallocated because the program doesn't store the memory location for it anywhere. Ideally, the memory would be delete'd at that point.
A smart pointer is an object that works like a pointer, except that the object can include some code that runs when the local variable gets to a "}" and goes away, which can helpfully deallocate the memory for you. Compare:
void func1() {
// calculateNewMatrix() allocates lots of memory on the heap with new
ReallyBigMatrix * m = calculateNewMatrix();
*m = *m + *m;
delete m; // Mustn't forget!
}
void func2() {
// calculateNewMatrix() allocates lots of memory on the heap with new
std::auto_ptr<ReallyBigMatrix> m(calculateNewMatrix());
// cleverly m works just like a normal pointer.
*m = *m + *m;
// No need to delete anything, auto_ptr calls delete automatically at "}"
}
In this case, we didn't really need auto_ptr. But if you've done any refactoring, you'll know that whenever you see the same code duplicated in multiple functions, it would be better if it was all in one place.

Before, every function that called new (or called another function which allocated some memory) had to call delete at the end. If we use auto_ptr, they don't, and even better, they can't FORGET to. This is a massive saving.
In PERL, every variable is a smart pointer; there's no special syntax for it, as that's just the way the language works. C++ jumps through hoops to be backwards-compatible and keep old code working on new compilers without any changes at all. Thus in C++, the special syntax is for the new way of doing things, and the simpler syntax is for the original way.
There is _some_ overhead to using auto_ptr, as it does some checks to make sure the pointer is valid, etc, when it's used. However, if you're only doing something fairly simple, this should be low or non-existent compared to what you would have written anyway, and it's normally 100% worth it.
Now we can explicitly tell variables to be allocated and go away if we want to, but we don't have to.
5A. REFERENCE COUNTING
If you want multiple pointers to the same bit of memory, you can have an even smarter pointer called a reference-counting pointer, eg. shared_ptr in the forthcoming C++ standard. This works like auto_ptr, except that you can copy a shared_ptr and have two different pointers to the same object; they count the number of pointers left, and the memory is automatically reclaimed when the _last_ one goes away. Eg:
void func(int flag) {
int * ans;
{
int * i = new int; // Allocate first block of memory
int * j = new int; // Allocate second block of memory
*i = 10;
*j = 20;
if ( flag ) ans = i; else ans = j;
// << do something else to *i >>
// << do something else to *j >>
if (flag) delete j; else delete i; // <-- delete whichever one ans isn't
}
// << Do something with *ans >>
delete ans; // <--- delete whichever block is left
}
This becomes:
void func(int flag) {
std::shared_ptr<int> ans;
{
std::shared_ptr<int> i(new int); // Allocate first block of memory (1 ptr)
std::shared_ptr<int> j(new int); // Allocate second block of memory (1 ptr)
*i = 10;
*j = 20;
if ( flag ) ans = i; else ans = j; // One block of memory now has 2 ptrs to it
// << do something else to i >>
// << do something else to j >>
} // <--- i and j go away. One block loses its last shared_ptr which deletes it
// << Do something with ans >>
} // <--- ans goes away. The other block loses its last shared_ptr which deletes it.
Again, all variables in PERL work like this. If it's done right, it's a lot harder to leak memory accidentally.
Traditional C programmers deplore the overhead of counting the number of uses. Traditional PERL programmers say it ought to be more efficient if it's built into the language, rather than provided as a library.
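To see where that overhead comes from, here's a toy reference-counted pointer for ints only (CountedInt and demoCount are invented names; the real shared_ptr is a template and handles assignment and many corner cases this sketch ignores). The count lives on the heap, shared by every copy of the pointer:

```cpp
#include <cassert>

struct CountedInt {
    int* value;
    int* count;                  // shared by every copy of this pointer
    explicit CountedInt(int v) : value(new int(v)), count(new int(1)) {}
    CountedInt(const CountedInt& o) : value(o.value), count(o.count) {
        ++*count;                // one more pointer to the same int
    }
    ~CountedInt() {
        if (--*count == 0) {     // the last pointer is going away...
            delete value;        // ...so reclaim the memory
            delete count;
        }
    }
private:
    CountedInt& operator=(const CountedInt&);  // assignment omitted in toy
};

int demoCount() {
    CountedInt a(7);             // count is 1
    int seen;
    {
        CountedInt b(a);         // copy: count is now 2
        seen = *b.count;
    }                            // b's "}": count drops back to 1
    return seen;
}                                // a's "}": count hits 0, memory reclaimed
```

That ++/-- on every copy and every "}" is exactly the overhead being deplored.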
6. GARBAGE COLLECTION
Reference counting is a simple sort of garbage collection. But real garbage collection runs extra code to shuffle memory around as necessary, etc.
If you don't like reference counting (eg. because you have to run code every time a variable leaves scope, or because Large Block of Memory A contains a smart pointer to Large Block of Memory B, and Large Block of Memory B contains a smart pointer to Large Block of Memory A, and so neither of those pointers ever go out of scope), the alternative is a complete garbage collector.
This is code that's included in your program (or interpreter) that runs every so often, checks which memory still has pointers pointing to it, and deletes all the rest.
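The circular "A points to B, B points to A" leak from above can be demonstrated directly (a sketch using the std::shared_ptr that later became standard; Block, alive and demoCycle are invented names). The usual cure in the reference-counting world is weak_ptr, a pointer that doesn't add to the count:

```cpp
#include <memory>
#include <cassert>

// Two blocks that point at each other keep each other's count at 1
// forever, so neither is ever reclaimed -- a leak reference counting
// cannot see, but a full garbage collector would.
struct Block {
    std::shared_ptr<Block> other;   // smart pointer to the partner block
    static int alive;               // how many Blocks currently exist
    Block()  { ++alive; }
    ~Block() { --alive; }
};
int Block::alive = 0;

int demoCycle() {
    {
        std::shared_ptr<Block> a(new Block);
        std::shared_ptr<Block> b(new Block);
        a->other = b;               // A points to B...
        b->other = a;               // ...and B points to A
    }   // a and b go away, but each Block still holds a pointer to the
        // other, so both counts stay at 1 and no destructor ever runs.
    return Block::alive;
}
```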
With reference counting the programmer still has to avoid some mistakes (which may be harder to see than if she was doing it all by hand from the beginning). Whereas with garbage collection, every so often some simple request for 4 new bytes of memory may trigger a big chuntering as the program suddenly has to rearrange a lot of things, and it's impossible for your program to control when that happens.
PERL's garbage collector was always based on reference counters. Others are much more advanced, and the question of how to write garbage collectors such that they do their job well is really interesting (and beyond the scope of my knowledge). They are probably how things will often go from now on, but serious programmers still like control over exactly when objects are allocated and deallocated.