Process and Process Memory

Huican Ping Notes

 

Basic Stuff.

    When the kernel reads an executable program into system memory and runs it, the program becomes a process. System memory can be thought of as divided into two distinct regions: user space and kernel space. Every process has its own user space (a private virtual address space; on classic 32-bit Linux it is about 3GB, with the top 1GB reserved for the kernel), and processes are prevented from interfering with one another. The change from user mode to kernel mode, for example when a process makes a system call, is called a mode switch; a context switch is the kernel suspending one process and resuming another.

 

 

Process Memory.

Above we said that each process runs in its own private address space, but how is that space organized? Generally, a user process is divided into three segments (or regions): text, data, and stack.

 

A diagram makes this clearer.

 

Fig. 1.

 

Text Segment.

 

The text segment (a.k.a. the instruction segment) contains the executable program code and constant data. The operating system marks the text segment read-only, so the process cannot modify it. Multiple processes can share the same text segment: if a second copy of the program is executed concurrently, the system references the previously loaded text segment through a pointer rather than loading a duplicate. Shared text is the default with the C/C++ compiler; if needed, it can be turned off with the -N option at compile time.

 

Data Segment.

The data segment, which is contiguous (in a virtual sense) with the text segment, can be subdivided into initialized data (e.g. in C/C++, variables that are declared as static or are static by virtue of their placement) and uninitialized (or zero-initialized) data. The uninitialized data area is also called BSS (Block Started by Symbol). For example, initialized global or static variables go in the initialized data section, and uninitialized ones go in BSS. During its execution lifetime, a process may request additional data segment space. Library memory allocation routines (e.g., new, malloc, calloc) in turn use the system calls brk and sbrk to extend the size of the data segment. The newly allocated space is added to the end of the current uninitialized data area. This area of available memory is also called the "heap". Loosely speaking, the whole data area is sometimes called the heap, but strictly the term refers only to the unmapped area in the figure.

 

Stack Segment. 

The stack segment is used by the process to store automatic variables, register variables, and function call information. In the figure above, the stack grows toward the uninitialized data segment.

 

The u area.

In addition to the text, data, and stack segments, the OS also maintains for each process a region called the u area (user area). The u area contains information specific to the process (e.g. open files, current directory, signal actions, accounting information) and a system stack segment for process use. If the process makes a system call (e.g., a call to write from the function main), the stack frame information for the system call is stored in the system stack segment. Again, this information is kept by the OS in an area that the process doesn't normally have access to. Thus, if this information is needed, the process must use special system calls to access it. Like the process itself, the contents of the u area for the process are paged in and out by the OS.

 

Process Memory Addresses.

The system keeps track of the virtual addresses associated with each user process segment. This address information is available to the process and can be obtained by referencing the external variables etext, edata, and end. The addresses (not the contents) of these three variables correspond respectively to the first valid address above the text, initialized data, and uninitialized data segments. The program below shows how this information can be obtained and displayed.


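#include <cstdint>
#include <iostream>

// Symbols supplied by the linker; only their addresses are meaningful.
extern "C" {
extern char etext, edata, end;
}

// Convert an address to an integer so it prints as plain hex.
static std::uintptr_t adr(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

int main() {
    std::cout << std::hex;
    std::cout << "Adr etext: " << adr(&etext) << "\t ";
    std::cout << "Adr edata: " << adr(&edata) << "\t ";
    std::cout << "Adr end: " << adr(&end) << "\n";

    const char *s1 = "hello"; // points at a string literal (constant data)
    static int a = 1;         // in initialized data segment
    static int b;             // in uninitialized data segment (BSS)
    char s2[] = "hello";      // array copy lives in the stack area
    int *c = new int;         // allocated from the heap

    std::cout << adr(s1) << std::endl;
    std::cout << adr(&a) << std::endl;
    std::cout << adr(&b) << std::endl;
    std::cout << adr(c) << std::endl;   // heap
    std::cout << adr(s2) << std::endl;  // stack
    delete c;
    return 0;
}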


$> g++ add.cxx
$> a.out
Adr etext: 8048a9f Adr edata: 8049d74 Adr end: 8049e10
8048acf //string literal: constant data, between etext and edata
8049c40 //a: initialized data segment (below edata)
8049e0c //b: uninitialized data segment (between edata and end)
83e5008 //heap: library allocation routines extended the data segment (above end)
bfffbca0 //stack