% % Copyright 2014, General Dynamics C4 Systems % % This software may be distributed and modified according to the terms of % the GNU General Public License version 2. Note that NO WARRANTY is provided. % See "LICENSE_GPLv2.txt" for details. % % @TAG(GD_GPL) % \chapter{\label{ch:vspace}Address Spaces and Virtual Memory} A virtual address space in seL4 is called a VSpace. In a similar way to a CSpace (see \autoref{ch:cspace}), a VSpace is composed of objects provided by the microkernel. Unlike CSpaces, these objects for managing virtual memory largely correspond to those of the hardware. Consequently, each architecture defines its own objects for the top-level VSpace and further intermediate paging structures. Common to every architecture is the \obj{Page}, representing a frame of physical memory. The kernel also includes \obj{ASID Pool} and \obj{ASID Control} objects for tracking the status of address spaces. These VSpace-related objects are sufficient to implement the hardware data structures required to create, manipulate, and destroy virtual memory address spaces. It should be noted that, as usual, the manipulator of a virtual memory space needs the appropriate capabilities to the required objects. \section{Objects} \subsection{Hardware Virtual Memory Objects} Each architecture has a top-level paging structure (level 0) and a number of intermediate levels. The top-level paging structure corresponds directly to the higher-level concept of a VSpace in seL4. For each architecture, the VSpace is realised as a different object, as determined by the architectural details. In general, each paging structure at each level contains slots where the next level paging structure, or a specifically sized frame of memory, can be mapped. If the previous level is not mapped, a mapping operation will fail. Developers need to manually create and map all paging structures. The size and type of structure at each level, and the number of bits in the virtual address resolved for that level, is hardware defined. seL4 provides methods for operating on these hardware paging structures including mapping and cache operations. Mapping operations are invoked on the capability being mapped, e.g. to map a level 1 paging structure at a specific virtual address, the capability to the corresponding object is invoked with a map operation, where the top-level structure is passed as an argument. In general, the top-level structure has no invocations for mapping, but is used as an argument to several other virtual-memory related object invocations. For some architectures, the top-level page table can be invoked for cache operations. By making these cache related operations invocations on page directory capabilities in addition to the page capabilities themselves, the API allows users more flexible policy options. For example, a process that has delegated a page directory can conduct cache operations on all frames mapped from that capability without access to those capabilities directly. The rest of this section details the paging structures for each architecture. \subsubsection{IA-32} On IA-32, the VSpace is realised as a \texttt{PageDirectory}, which covers the entire 4\,GiB range in the 32-bit address space, and forms the top-level paging structure. Second level page-tables (\texttt{PageTable} objects) each cover a 4\,MiB range. Structures at both levels are indexed by 10 bits in the virtual address. \begin{tabularx}{\textwidth}{Xlll} \toprule \emph{Object} & \emph{Address Bits} & \emph{Level} & \emph{Methods} \\ \midrule \texttt{PageDirectory} & 22---31 & 0 & \autoref{group__x86__seL4__X86__PageDirectory} \\ \texttt{PageTable} & 12---21 & 1 & \autoref{group__x86__seL4__X86__PageTable} \\ \bottomrule \end{tabularx} \subsubsection{x64} On x86-64, the VSpace is realised as a \texttt{PML4}. Three further levels of paging structure are defined, as shown in the table below. All structures are indexed with 9 bits of the virtual address. \begin{tabularx}{\textwidth}{Xlll} \toprule \emph{Object} & \emph{Address Bits} & \emph{Level} & \emph{Methods} \\ \midrule \texttt{PML4} & 39---47 & 0 & None \\ \texttt{PDPT} & 30---38 & 1 & \autoref{group__x86__64__seL4__X86__PDPT} \\ \texttt{PageDirectory} & 21---29 & 2 & \autoref{group__x86__seL4__X86__PageDirectory} \\ \texttt{PageTable} & 12---20 & 3 & \autoref{group__x86__seL4__X86__PageTable} \\ \bottomrule \end{tabularx} \subsubsection{AArch32} Like IA-32, ARM AArch32 realise the VSpace as a \texttt{PageDirectory}, which covers the entire 4\,GiB address range, and a second-level \texttt{PageTable}. The second-level structures on AArch32 cover 1\,MiB address ranges. ARM AArch32 processors have a two-level page-table structure. The top-level page directory covers a range of 4\,GiB and each page table covers a 1\,MiB range. \begin{tabularx}{\textwidth}{Xlll} \toprule \emph{Object} & \emph{Address Bits} & \emph{Level} & \emph{Methods} \\ \midrule \texttt{PageDirectory} & 20---31 & 0 & \autoref{group__aarch32__seL4__ARM__PageDirectory} \\ \texttt{PageTable} & 12---19 & 1 & \autoref{group__arm__seL4__ARM__PageTable} \\ \bottomrule \end{tabularx} \subsubsection{AArch64} ARM AArch64 processors have a four-level page-table structure, where the VSpace is realised as a \texttt{PageGlobalDirectory}. All paging structures are index by 9 bits of the virtual address. \begin{tabularx}{\textwidth}{Xlll} \toprule \emph{Object} & \emph{Address Bits} & \emph{Level} & \emph{Methods} \\ \midrule \texttt{PageGlobalDirectory} & 39---47 & 0 & \autoref{group__aarch64__seL4__ARM__PageGlobalDirectory} \\ \texttt{PageUpperDirectory} & 30---38 & 1 & \autoref{group__aarch64__seL4__ARM__PageUpperDirectory} \\ \texttt{PageDirectory} & 21---29 & 2 & \autoref{group__aarch64__seL4__ARM__PageDirectory} \\ \texttt{PageTable} & 12---20 & 3 & \autoref{group__arm__seL4__ARM__PageTable} \\ \bottomrule \end{tabularx} \subsection{RISC-V} RISC-V provides the same paging structure for all levels, \texttt{PageTable}. The VSpace is then realised as a \texttt{PageTable}. \subsubsection{RISC-V 32-bit} 32-bit RISC-V \texttt{PageTables} are indexed by 10 bits of virtual address. \begin{tabularx}{\textwidth}{Xlll} \toprule \emph{Object} & \emph{Address Bits} & \emph{Level} & \emph{Methods} \\ \midrule \texttt{PageTable} & 22---31 & 0 & \autoref{group__riscv__seL4__RISCV__PageTable} \\ \texttt{PageTable} & 12---21 & 1 & \autoref{group__riscv__seL4__RISCV__PageTable} \\ \bottomrule \end{tabularx} \subsubsection{RISC-V 64-bit} 64-bit RISC-V follows the SV39 model, where \texttt{PageTables} are indexed by 9 bits of virtual address. Although RISC-V allows for multiple different numbers of paging levels, currently seL4 only supports exactly three levels of paging structures. \begin{tabularx}{\textwidth}{Xlll} \toprule \emph{Object} & \emph{Address Bits} & \emph{Level} & \emph{Methods} \\ \midrule \texttt{PageTable} & 30---38 & 0 & \autoref{group__riscv__seL4__RISCV__PageTable} \\ \texttt{PageTable} & 21---29 & 1 & \autoref{group__riscv__seL4__RISCV__PageTable} \\ \texttt{PageTable} & 12---20 & 2 & \autoref{group__riscv__seL4__RISCV__PageTable} \\ \bottomrule \end{tabularx} \subsection{Page} A \obj{Page} object corresponds to a frame of physical memory that is used to implement virtual memory pages in a virtual address space. The virtual address for a \obj{Page} mapping must be aligned to the size of the \obj{Page} and must be mapped to a suitable VSpace, and every intermediate paging structure required. To map a page readable, the capability to the page that is being invoked must have read permissions. To map the page writeable, the capability must have write permissions. The requested mapping permissions are specified with an argument of type \texttt{seL4\_CapRights} given to the mapping function. If the capability does not have sufficient permissions to authorise the given mapping, then the mapping permissions are silently downgraded. Specific mapping permissions are dependant on the architecture and are documented in the \autoref{sec:api_reference} for each function. At minimum, each architecture defines \texttt{Map}, \texttt{Unmap},\texttt{Remap} and \texttt{GetAddress} methods for pages. Methods for page objects for each architecture can be found in the \autoref{sec:api_reference}, and are indexed per architecture in the table below. \begin{tabularx}{\textwidth}{Xl} \toprule \emph{Architectures} & \emph{Methods} \\ \midrule IA32, X64 & \autoref{group__x86__seL4__X86__Page} \\ AArch32, AArch64 & \autoref{group__arm__seL4__ARM__Page} \\ RISC-V & \autoref{group__riscv__seL4__RISCV__Page} \\ \bottomrule \end{tabularx} Each architecture also defines a range of page sizes. In the next section we show the available page sizes, as well as the \emph{mapping level}, which refers to the level of the paging structure at which this page must be mapped. \subsubsection{AArch32 page sizes} \begin{tabularx}{\textwidth}{Xll} \toprule \emph{Constant} & \emph{Size} & \emph{Mapping level} \\ \midrule \texttt{seL4\_PageBits} & 4\,KiB & 1 \\ \texttt{seL4\_LargePageBits} & 64\,KiB & 1 \\ \texttt{seL4\_SectionBits} & 1\,MiB & 0 \\ \texttt{seL4\_SuperSectionBits} & 16\,MiB & 0 \\ \bottomrule \end{tabularx} Mappings for sections and super sections consume 16 slots in the page table and page directory respectively. \subsubsection{AArch64 page sizes} \begin{tabularx}{\textwidth}{Xll} \toprule \emph{Constant} & \emph{Size} & \emph{Mapping level} \\ \midrule \texttt{seL4\_PageBits} & 4\,KiB & 3 \\ \texttt{seL4\_LargePageBits} & 2\,MiB & 2 \\ \texttt{seL4\_HugePageBits} & 1\,GiB & 1 \\ \bottomrule \end{tabularx} \subsubsection{IA-32 page sizes} \begin{tabularx}{\textwidth}{Xll} \toprule \emph{Constant} & \emph{Size} & \emph{Mapping level} \\ \midrule \texttt{seL4\_PageBits} & 4\,KiB & 1 \\ \texttt{seL4\_LargePageBits} & 4\,MiB & 0 \\ \bottomrule \end{tabularx} \subsubsection{X64 page sizes} \begin{tabularx}{\textwidth}{Xll} \toprule \emph{Constant} & \emph{Size} & \emph{Mapping level} \\ \midrule \texttt{seL4\_PageBits} & 4\,KiB & 3 \\ \texttt{seL4\_LargePageBits} & 2\,MiB & 2 \\ \texttt{seL4\_HugePageBits} & 1\,GiB & 1 \\ \bottomrule \end{tabularx} \subsubsection{RISC-V 32-bit page sizes} \begin{tabularx}{\textwidth}{Xll} \toprule \emph{Constant} & \emph{Size} & \emph{Mapping level} \\ \midrule \texttt{seL4\_PageBits} & 4\,KiB & 1 \\ \texttt{seL4\_LargePageBits} & 4\,MiB & 0 \\ \texttt{seL4\_HugePageBits} & 512\,MiB & 0 \\ \bottomrule \end{tabularx} \subsubsection{RISC-V 64-bit page sizes} \begin{tabularx}{\textwidth}{Xll} \toprule \emph{Constant} & \emph{Size} & \emph{Mapping level} \\ \midrule \texttt{seL4\_PageBits} & 4\,KiB & 2 \\ \texttt{seL4\_LargePageBits} & 2\,MiB & 1 \\ \texttt{seL4\_HugePageBits} & 1\,GiB & 0 \\ \bottomrule \end{tabularx} \subsection{ASID Control} For internal kernel book-keeping purposes, there is a fixed maximum number of applications the system can support. In order to manage this limited resource, the microkernel provides an \obj{ASID Control} capability. The \obj{ASID Control} capability is used to generate a capability that authorises the use of a subset of available address-space identifiers. This newly created capability is called an \obj{ASID Pool}. \obj{ASID Control} only has a single \texttt{MakePool} method for each architecture, listed in the table below. \begin{tabularx}{\textwidth}{Xl} \toprule \emph{Architectures} & \emph{Methods} \\ \midrule IA32, X64 & \autoref{group__x86__seL4__X86__ASIDControl} \\ AArch32, AArch64 & \autoref{group__arm__seL4__ARM__ASIDControl} \\ RISC-V & \autoref{group__riscv__seL4__RISCV__ASIDControl} \\ \bottomrule \end{tabularx} \subsection{ASID Pool} An \obj{ASID Pool} confers the right to create a subset of the available maximum applications. For a VSpace to be usable by an application, it must be assigned to an ASID. This is done using a capability to an \obj{ASID Pool}. The \obj{ASID Pool} object has a single method, \texttt{Assign}, for each architecture: \begin{tabularx}{\textwidth}{Xl} \toprule \emph{Architectures} & \emph{Methods} \\ \midrule IA32, X64 & \autoref{group__x86__seL4__X86__ASIDPool} \\ AArch32, AArch64 & \autoref{group__arm__seL4__ARM__ASIDPool} \\ RISC-V & \autoref{group__riscv__seL4__RISCV__ASIDPool} \\ \bottomrule \end{tabularx} \section{Mapping Attributes} A parameter of type \texttt{seL4\_ARM\_VMAttributes} or \texttt{seL4\_x86\_VMAttributes} is used to specify the cache behaviour of the page being mapped; possible values for ARM that can be bitwise OR'd together are shown in \autoref{tbl:vmattr_arm} \ifxeightsix and an enumeration of valid values for IA-32 are shown in \autoref{tbl:vmattr_ia32}\fi. \begin{table}[htb] \begin{center} \begin{tabularx}{\textwidth}{p{0.33\textwidth}X} \toprule Attribute & Meaning \\ \midrule \texttt{seL4\_ARM\_PageCacheable} & Enable data in this mapping to be cached \\ \texttt{seL4\_ARM\_ParityEnabled} & Enable parity checking for this mapping\\ \texttt{seL4\_ARM\_ExecuteNever} & Map this memory as non-executable \\ \bottomrule \end{tabularx} \caption{\label{tbl:vmattr_arm} Virtual memory attributes for ARM page table entries.} \end{center} \end{table} \begin{table}[htb] \begin{center} \begin{tabularx}{\textwidth}{p{0.33\textwidth}X} \toprule Attribute & Meaning \\ \midrule \texttt{seL4\_x86\_WriteBack} & Read and writes are cached \\ \texttt{seL4\_x86\_CacheDisabled} & Prevent data in this mapping from being cached \\ \texttt{seL4\_x86\_WriteThrough} & Enable write through cacheing for this mapping \\ \texttt{seL4\_x86\_WriteCombining} & Enable write combining for this mapping \\ \bottomrule \end{tabularx} \caption{\label{tbl:vmattr_ia32} Virtual memory attributes for x86 page table entries.} \end{center} \end{table} \section{Sharing Memory} seL4 does not allow \obj{Page Table}s to be shared, but does allow pages to be shared between address spaces. To share a page, the capability to the \obj{Page} must first be duplicated using the \apifunc{seL4\_CNode\_Copy}{cnode_copy} method and the new copy must be used in the \apifunc{seL4\_ARM\_Page\_Map}{arm_page_map} \ifxeightsix or \apifunc{seL4\_x86\_Page\_Map}{x86_page_map} \fi method that maps the page into the second address space. Attempting to map the same capability twice will result in an error. \section{Page Faults} Page faults are reported to the exception handler of the executed thread. See \autoref{sec:vm-fault}.