Swapping Out and Discarding Pages
When physical memory becomes scarce the Linux memory management subsystem
must attempt to free physical pages.
This task falls to the kernel swap daemon (kswapd).
The kernel swap daemon is a special type of process, a kernel thread.
Kernel threads are processes that have no virtual memory; instead they run in kernel mode in the physical address space.
The kernel swap daemon is slightly misnamed in that it does more than merely swap pages out to the system's
swap files.
Its role is to make sure that there are enough free pages in the system to keep the
memory management system operating efficiently.
The kernel swap daemon (kswapd) is started by the
kernel init process at startup time and sits waiting for the kernel swap timer to periodically expire.
Every time the timer expires, the swap daemon looks to see if the number of
free pages in the system is getting too low.
It uses two variables, free_pages_high and free_pages_low, to decide if it should free some pages.
So long as the number of free pages in the system remains above free_pages_high, the kernel
swap daemon does nothing; it sleeps again until its timer next expires.
For the purposes of this check the kernel swap daemon takes into account the number of pages currently
being written out to the swap file.
It keeps a count of these in nr_async_pages, which is incremented each time a page is queued waiting to
be written out to the swap file and decremented when the write to the swap device has completed.
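The check described above can be sketched as follows. This is only an illustration: the struct and function names are hypothetical, and it assumes that pages counted in nr_async_pages are treated as if they were already free, which is one plausible reading of "takes into account".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the counters named in the text. */
struct swap_counters {
    unsigned long nr_free_pages;   /* pages currently free */
    unsigned long nr_async_pages;  /* pages queued for writing to swap */
    unsigned long free_pages_low;  /* lower threshold */
    unsigned long free_pages_high; /* upper threshold */
};

/* Assumption: pages already on their way to swap count as soon-to-be-free. */
static unsigned long effective_free(const struct swap_counters *c)
{
    return c->nr_free_pages + c->nr_async_pages;
}

/* True when the daemon has work to do on this timer expiry. */
static bool needs_freeing(const struct swap_counters *c)
{
    assert(c->free_pages_low <= c->free_pages_high);
    return effective_free(c) < c->free_pages_high;
}
```

With 100 free pages and a high mark of 120 the daemon would act; if 30 more pages are already being written out, it would go back to sleep.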
free_pages_low and free_pages_high are set at
system startup time and are related to the number of physical pages in the system.
If the number of free pages in the system has fallen below free_pages_high or, worse still,
free_pages_low, the kernel swap daemon will try three ways to reduce the number of physical pages being
used by the system:
- Reducing the size of the buffer and page caches,
- Swapping out System V shared memory pages,
- Swapping out and discarding pages.
If the number of free pages in the system has fallen below free_pages_low, the kernel swap daemon
will try to free 6 pages before it next runs.
Otherwise it will try to free 3 pages.
Each of the above methods is tried in turn until enough pages have been freed.
The kernel swap daemon remembers which method it used the last time that it attempted to free
physical pages.
Each time it runs it will start trying to free pages using this last successful method.
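A minimal sketch of this "try each method in turn, starting with the last one that worked" loop is given below. The toy_pool structure and all function names are hypothetical stand-ins for the three real methods; only the targets of 6 and 3 pages come from the text.

```c
#include <assert.h>

enum { NR_METHODS = 3 }; /* caches, shared memory, swap out/discard */

/* Toy model: how many pages each method can still free (illustration only). */
struct toy_pool {
    int avail[NR_METHODS];
};

/* Try to free one page with the given method; returns 1 on success. */
static int try_free_one(struct toy_pool *p, int method)
{
    if (p->avail[method] > 0) {
        p->avail[method]--;
        return 1;
    }
    return 0;
}

/* Target from the text: 6 pages when below the low mark, otherwise 3. */
static int pages_to_free(unsigned long free_pages, unsigned long free_pages_low)
{
    return free_pages < free_pages_low ? 6 : 3;
}

/*
 * Free up to 'target' pages. *state is the index of the method that
 * last succeeded and is kept between calls. Returns the number freed.
 */
static int free_pages_round(struct toy_pool *p, int *state, int target)
{
    int freed = 0;
    int failures = 0; /* consecutive methods that yielded nothing */

    assert(*state >= 0 && *state < NR_METHODS);
    while (freed < target && failures < NR_METHODS) {
        if (try_free_one(p, *state)) {
            freed++;
            failures = 0;                       /* stay with what works */
        } else {
            *state = (*state + 1) % NR_METHODS; /* move to the next method */
            failures++;
        }
    }
    return freed;
}
```

The loop gives up once every method has failed in succession, so a system with nothing left to reclaim does not spin forever.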
After it has freed sufficient pages, the swap daemon sleeps again until its timer expires.
If the reason that the kernel swap daemon freed pages was that the number of free pages in the
system had fallen below free_pages_low, it only sleeps for half its usual time.
Once the number of free pages is more than free_pages_low the kernel swap daemon goes
back to sleeping longer between checks.
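The sleep policy can be written down in a few lines. The function name and the tick-based units are assumptions made for this sketch, and the decision is modeled on the current free page count rather than on the recorded reason for the last run.

```c
#include <assert.h>

/*
 * Returns how long the daemon should sleep before its next check.
 * Below the low mark it sleeps for half the usual interval.
 */
static unsigned long next_sleep_ticks(unsigned long usual_ticks,
                                      unsigned long free_pages,
                                      unsigned long free_pages_low)
{
    assert(usual_ticks > 0);
    return free_pages < free_pages_low ? usual_ticks / 2 : usual_ticks;
}
```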
Swapping Out and Discarding Pages
The swap daemon looks at each process in the system
in turn to see if it is a good candidate for swapping.
Good candidates are processes that can be swapped (some cannot) and that have
one or more pages which can be swapped or discarded from memory.
Pages are swapped out of physical memory into the system's swap files only if
the data in them cannot be retrieved another way.
A lot of the contents of an executable image come from the image's file and
can easily be re-read from that file.
For example, the executable instructions of an image will never be modified
by the image and so will never be written to the swap file.
These pages can simply be discarded; when they are again referenced
by the process, they will be brought back into memory from the
executable image.
Once the process to swap has been located, the swap daemon looks through
all of its virtual memory regions looking for areas which
are not shared or locked.
Linux does not swap out all of the swappable pages of the process that
it has selected. Instead it removes only a small number of pages.
Pages cannot be swapped or discarded if they are locked in memory.
The Linux swap algorithm uses page aging.
Each page has a counter (held in the mem_map_t data structure) that
gives the kernel swap daemon some idea whether or not a page is worth swapping.
Pages age when they are unused and rejuvenate on access;
the swap daemon only swaps out old pages.
The default action when a page is first allocated is to give it an initial age of 3.
Each time it is touched, its age is increased by 3 to a maximum of 20.
Every time the kernel swap daemon runs it ages pages, decrementing their
age by 1.
These default actions can be changed and for this reason they (and other
swap related information) are stored in the swap_control data structure.
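The aging rules above can be sketched as follows, using the default values from the text (initial age 3, plus 3 per touch, maximum 20, minus 1 per daemon run). The structure and function names here are hypothetical; they are not the fields of the real swap_control or mem_map_t structures.

```c
#include <assert.h>

/* Hypothetical subset of the tunables the text places in swap_control. */
struct aging_params {
    unsigned int initial_age; /* age given to a newly allocated page */
    unsigned int touch_step;  /* added each time the page is touched */
    unsigned int max_age;     /* ceiling on a page's age */
    unsigned int decay_step;  /* subtracted on each swap daemon run */
};

static const struct aging_params default_aging = { 3, 3, 20, 1 };

/* Stand-in for the per-page counter held in mem_map_t. */
struct page_sketch {
    unsigned int age;
};

static void page_init(struct page_sketch *pg, const struct aging_params *p)
{
    assert(p->initial_age <= p->max_age);
    pg->age = p->initial_age;
}

/* Rejuvenate a page on access, clamping at the maximum age. */
static void page_touch(struct page_sketch *pg, const struct aging_params *p)
{
    pg->age += p->touch_step;
    if (pg->age > p->max_age)
        pg->age = p->max_age;
}

/* Age a page on a daemon run, never going below zero. */
static void page_age(struct page_sketch *pg, const struct aging_params *p)
{
    pg->age = pg->age > p->decay_step ? pg->age - p->decay_step : 0;
}

/* Only pages whose age has reached zero are candidates for swapping. */
static int page_is_old(const struct page_sketch *pg)
{
    return pg->age == 0;
}
```

Under these defaults a freshly allocated page that is never touched becomes a candidate after three daemon runs, while a heavily used page needs twenty idle runs.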
If the page is old (age = 0), the swap daemon will process it further.
Dirty pages, those that have been modified while in memory, are the pages which can be swapped out.
Linux uses an architecture specific bit in the PTE to describe pages this way.
However, not all dirty pages are necessarily written to the swap file.
Every virtual memory region of a process may have its own swap operation
(pointed at by the vm_ops pointer in the vm_area_struct); if it does, that method is used.
Otherwise, the swap daemon will allocate a page in the swap file and write the
page out to that device.
The page's page table entry is replaced by one which
is marked as invalid but which contains information about where the page is in
the swap file.
This is an offset into the swap file
where the page is held and an indication of which swap file is being used.
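To make the idea of an invalid page table entry that still carries location information concrete, here is a sketch of one possible encoding. The bit layout is purely an assumption for illustration (bit 0 as the present bit, 7 bits for the swap file index, the remaining bits for the offset); the real layout is architecture specific, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT      0x1u /* assumed "valid" bit; clear for swapped pages */
#define SWP_TYPE_SHIFT   1
#define SWP_TYPE_BITS    7
#define SWP_OFFSET_SHIFT (SWP_TYPE_SHIFT + SWP_TYPE_BITS)

/* Build an invalid PTE recording which swap file and which offset. */
static uint32_t make_swap_pte(unsigned int swap_file, uint32_t offset)
{
    assert(swap_file < (1u << SWP_TYPE_BITS));
    assert(offset < (1u << (32 - SWP_OFFSET_SHIFT)));
    return (offset << SWP_OFFSET_SHIFT) |
           ((uint32_t)swap_file << SWP_TYPE_SHIFT);
}

static int pte_is_present(uint32_t pte)
{
    return (pte & PTE_PRESENT) != 0;
}

/* Which swap file holds the page. */
static unsigned int swap_pte_file(uint32_t pte)
{
    return (pte >> SWP_TYPE_SHIFT) & ((1u << SWP_TYPE_BITS) - 1);
}

/* Offset of the page within that swap file. */
static uint32_t swap_pte_offset(uint32_t pte)
{
    return pte >> SWP_OFFSET_SHIFT;
}
```

Because the present bit is clear, any access through such an entry faults, and the fault handler can decode the swap file and offset to bring the page back.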
Whatever the swap method used, the original physical page
is made free by putting it back into the free_area.
Clean (or rather not dirty) pages can be discarded and put back into the free_area.
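The per-page decision described in this section can be summarized in a small sketch. The page_state structure, the enum, and the function name are all hypothetical; in the real kernel the dirty flag lives in the PTE and the region's swap operation is reached through vm_ops.

```c
#include <assert.h>
#include <stdbool.h>

enum swap_action {
    ACTION_KEEP,           /* page is locked or not yet old */
    ACTION_DISCARD,        /* clean page: free it, re-read later if needed */
    ACTION_REGION_SWAPOUT, /* dirty page: use the region's own swap operation */
    ACTION_WRITE_TO_SWAP   /* dirty page: allocate a swap slot and write it */
};

/* Hypothetical summary of what the daemon knows about one page. */
struct page_state {
    unsigned int age;        /* 0 means old */
    bool locked;             /* locked pages are never swapped or discarded */
    bool dirty;              /* modified since it was brought in */
    bool region_has_swapout; /* the region supplies its own swap operation */
};

static enum swap_action choose_action(const struct page_state *pg)
{
    if (pg->locked || pg->age > 0)
        return ACTION_KEEP;
    if (!pg->dirty)
        return ACTION_DISCARD;
    return pg->region_has_swapout ? ACTION_REGION_SWAPOUT
                                  : ACTION_WRITE_TO_SWAP;
}
```

In every case other than ACTION_KEEP the physical page ends up back in the free_area; the actions differ only in whether, and how, the contents are saved first.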
If enough of the swappable process' pages have been swapped out or discarded, the swap daemon
will again sleep.
The next time it wakes it will consider the next process in the system.
In this way, the swap daemon nibbles away at each process' physical pages until the system
is again in balance. This is much fairer than swapping out whole processes.