(Originally recorded 2020-05-11)
You are happily writing your high-performance computing code and then suddenly something unexpected happens. The program crashes and reports:
This error occurs when your program tries to use a memory address outside the range of addresses that the operating system has granted it.
In the simplified model of the CPU that we have been using in class, we glossed over an important component – the memory management unit (MMU).
In that model, the CPU repeatedly fetches instructions from memory and interprets those instructions, some of which may be instructions to load or store data from memory. We illustrated memory as a large array, with data stored in different entries of that array.
The actual memory in your computer does operate as a big array in some sense. Memory accepts addresses (in the form of electrical signals on its address lines) and returns the corresponding data associated with that address (again in the form of electrical signals on its data lines). Similarly, the CPU sends requests to memory in the form of addresses and receives the data back.
But. Actual memory accesses can’t be as simple as sending the addresses generated by the CPU directly to memory and then receiving the data back. Memory is a shared resource in a computer system. Your laptop has hundreds of processes (programs) running on it at any one time – they all expect to be able to read and write to arbitrary memory addresses. However, those arbitrary memory accesses can’t just go directly to physical memory – otherwise processes could read and write the memory of other processes (a bad thing). To enable processes to execute as if they were the only processes using memory, computer systems add a layer of indirection. Programs generate virtual addresses that are translated by the MMU into physical addresses. These translations are carefully managed by the operating system so that the same physical address is not mapped to more than one process (except in well-defined and agreed-upon circumstances). As far as each process is concerned, it is the only process running.
Memory is also a finite resource. Each process can’t have a valid translation for every possible address; otherwise we would need enough physical memory to back a full 64-bit address space for every process. Rather, a process must request the memory that it wants to use from the operating system. If the operating system grants that request, it also sets up the translations to physical memory for the range of addresses that it is granting.
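As a small illustration of a process asking the operating system for memory, the sketch below uses Python's standard `mmap` module to request an anonymous mapping (one not backed by a file). The size of four pages and the name `region` are arbitrary choices for this example; a C or C++ program would typically get its memory through `malloc` or `new`, which in turn ask the operating system on the program's behalf.

```python
import mmap

# Ask the OS for four pages of memory that are not backed by any file.
# Passing -1 as the file descriptor requests an anonymous mapping.
length = 4 * mmap.PAGESIZE
region = mmap.mmap(-1, length)

# The OS has granted the range, so reads and writes inside it succeed.
region[0:5] = b"hello"
data = region[0:5]
print(data)

# Hand the memory back to the OS when done.
region.close()
```

Once the mapping is closed, the addresses it covered no longer belong to the process.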
Translations from virtual to physical addresses are made at the granularity of pages, usually 4 KiB (4096 bytes) in size, which takes 12 bits of address to index. That is, the part of a virtual address that is translated is the upper part, which distinguishes between pages. The remaining low-order bits (the offset) select the specific byte within the page and are not translated.
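The split between page number and offset is just a shift and a mask. Here is a minimal sketch, assuming 4096-byte pages; the function name `split_address` and the sample address are made up for illustration.

```python
PAGE_SIZE = 4096                          # assumed page size in bytes
OFFSET_BITS = PAGE_SIZE.bit_length() - 1  # 12 bits select a byte within a page
OFFSET_MASK = PAGE_SIZE - 1               # 0xFFF

def split_address(vaddr):
    """Return (virtual page number, offset within the page)."""
    page_number = vaddr >> OFFSET_BITS    # upper bits: the part that gets translated
    offset = vaddr & OFFSET_MASK          # lower 12 bits: passed through unchanged
    return page_number, offset

page_number, offset = split_address(0x12345678)
print(hex(page_number), hex(offset))      # 0x12345 0x678
```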
Translations are maintained in a page table (each process has its own page table), which is in turn stored in memory. The translation process involves looking up the physical frame number (the upper part of the physical address) in the page table and then combining it with the offset from the virtual address to generate a full physical address.
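To make the lookup concrete, here is a toy translation in Python. It assumes 4096-byte pages and a flat, three-entry page table with made-up frame numbers. Real hardware uses multi-level tables and caches recent translations, so this is only a sketch of the idea.

```python
PAGE_SIZE = 4096
OFFSET_BITS = 12

# Hypothetical page table for one process:
# index = virtual page number, value = physical frame number.
page_table = [7, 3, 42]

def translate(vaddr):
    """Translate a virtual address to a physical address."""
    page_number = vaddr >> OFFSET_BITS
    offset = vaddr & (PAGE_SIZE - 1)
    frame = page_table[page_number]          # look up the physical frame number
    return (frame << OFFSET_BITS) | offset   # frame number + untranslated offset

# Virtual page 1 maps to frame 3, and the offset 0xabc is carried over.
print(hex(translate(0x1ABC)))                # 0x3abc
```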
The page table is also used to keep track of what memory has been allocated to a process. The page table is an array that is indexed by the page number and any address can be presented to it to return a translation. However, some of the bits in the contents of the page table are used to indicate whether or not the translation is valid. Only valid translations – to addresses that have been allocated by the OS – are processed and sent to physical memory.
But if a CPU can generate an arbitrary address to present to the MMU, what happens when there is not a valid translation corresponding to that address? You guessed it: a segmentation fault.
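The whole path can be simulated in a few lines. The sketch below assumes 4096-byte pages and an eight-entry page table in which only pages 0 and 2 have been allocated. The `PageTableEntry` and `SegmentationFault` names are invented for this example. On a real system the MMU raises a fault to the operating system, which then terminates the process with the segmentation fault message.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096
OFFSET_BITS = 12

class SegmentationFault(Exception):
    """Stand-in for the fault a real MMU would signal to the OS."""

@dataclass
class PageTableEntry:
    frame: int = 0       # physical frame number
    valid: bool = False  # True only if the OS has allocated this page

# Hypothetical page table: every entry starts out invalid,
# then the "OS" grants pages 0 and 2 to the process.
page_table = [PageTableEntry() for _ in range(8)]
page_table[0] = PageTableEntry(frame=5, valid=True)
page_table[2] = PageTableEntry(frame=9, valid=True)

def translate(vaddr):
    """Translate a virtual address, faulting if no valid translation exists."""
    page_number = vaddr >> OFFSET_BITS
    offset = vaddr & (PAGE_SIZE - 1)
    if page_number >= len(page_table) or not page_table[page_number].valid:
        raise SegmentationFault(f"no valid translation for {vaddr:#x}")
    return (page_table[page_number].frame << OFFSET_BITS) | offset

print(hex(translate(0x2010)))    # page 2 is valid: 0x9010

try:
    translate(0x1000)            # page 1 was never allocated
except SegmentationFault as err:
    print("Segmentation fault:", err)
```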