README_VTD

=====================================================================================
Intel VT-d (Virtualization Technology for Directed I/O) Driver Overview
=====================================================================================

An overview of Intel Virtualization Technology for Directed I/O can be found on pages
14-19 of the architecture specification (see DOCUMENTATION below).

In the specification and this README, a domain is defined as an abstract, isolated
environment that consists of a subset of the host physical memory. In the case of
Arrakis, each domain generally corresponds to an application.
11
=====================================================================================

RELEVANT FILES

Contains errors that may be returned when creating/destroying domains and
adding/removing devices:
errors/errno.fugu

Added an x86-64 VNode identity syscall and VNode identity structure:
kernel/arch/x86_64/syscall.c
include/barrelfish_kpi/capabilities.h
include/arch/x86_64/barrelfish/invocations_arch.h

Added a reference to the root PML4 VNode capability:
lib/barrelfish/capabilities.c
include/barrelfish/caddr.h

Modified the x86-64 page table entries by adding a vtd_snoop field. Also changed
paging_x86_64_map_large() and paging_x86_64_map() to set the vtd_snoop field:
include/target/x86_64/barrelfish_kpi/paging_target.h
kernel/include/target/x86_64/paging_kernel_target.h

Defined the constant VREGION_FLAGS_VTD_SNOOP, which allows the vtd_snoop field to be
set from user space. This constant is bitwise OR'd with the primary flags used in
user mappings, such as VREGION_FLAGS_READ_WRITE:
include/barrelfish/vregion.h

Added RPC calls allowing applications to use the VT-d:
if/acpi.if
include/acpi_client/acpi_client.h
lib/acpi_client/acpi_client.c
usr/acpi/acpi_service.c

Changes needed to add a call to the VT-d initialization function:
usr/acpi/acpi.c
usr/acpi/Hakefile

Added a command-line option to disable translation through the VT-d:
usr/acpi/acpi_main.c
usr/acpi/acpi_shared.h
52
Debug printer and the switch that enables it:
usr/acpi/vtd_debug.h

This header contains the functions and structures used for second-level translation.
They are currently only used to establish the identity pagetable, but they may also
be used to establish arbitrary mappings:
usr/acpi/vtd_sl_paging.h

Where the VT-d driver implementation is contained:
usr/acpi/intel_vtd.h
usr/acpi/intel_vtd.c
usr/acpi/vtd_domains.h

Added an ACPI-enumerated device declaration structure type:
usr/acpi/acpica/include/actbl2.h

Mackerel specifications for the remapping hardware register set of each hardware
unit. The offset to the IOTLB registers is found in one of the remapping registers,
so two separate specifications were created. Another one will need to be created for
the set of fault registers:
devices/vtd.dev
devices/vtd_iotlb.dev
devices/Hakefile

Where we add the remaining devices to the identity domain:
usr/pci/pcimain.c

Contains queries for finding devices to add to the identity domain:
usr/skb/programs/pci_queries.pl

Changes made to support DMA remapping through the VT-d for applications using the
network stack and the e10k driver:
if/e10k.if
usr/drivers/e10k/e10k_qdriver.c
usr/drivers/e10k/e10k_cdriver.c
usr/drivers/e10k/e10k_vf.c
usr/drivers/e10k/Hakefile
lib/arranet/arranet.c
lib/arranet/Hakefile

=====================================================================================
94
IMPLEMENTATION

The VT-d driver is currently coupled with the ACPI daemon.

Domains and domain-ids are managed by a sorted and bounded doubly-linked list.
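
The idea can be pictured with a minimal sketch like the following. The names and
layout here are hypothetical; the actual structures live in usr/acpi/vtd_domains.h.
Keeping the list sorted makes finding the lowest unused domain-id within the bounds a
single walk to the first gap:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch of a sorted, bounded domain list; the real
 * structures are defined in usr/acpi/vtd_domains.h. */
struct dom {
    int id;                  /* domain-id assigned to this domain    */
    struct dom *prev, *next;
};

struct dom_list {
    int min_id, max_id;      /* bounds common to all remapping units */
    struct dom *head;        /* kept sorted by ascending id          */
};

/* Find the lowest unused id in [min_id, max_id] and insert a node
 * for it in sorted position.  Returns the new id, or -1 if all ids
 * within the bounds are taken. */
static int dom_insert(struct dom_list *l)
{
    int id = l->min_id;
    struct dom *cur = l->head, *prev = NULL;
    /* Walk the sorted list; the first gap yields the lowest free id. */
    while (cur != NULL && cur->id == id) {
        id++;
        prev = cur;
        cur = cur->next;
    }
    if (id > l->max_id)
        return -1;

    struct dom *n = malloc(sizeof(*n));
    n->id   = id;
    n->prev = prev;
    n->next = cur;
    if (prev) prev->next = n; else l->head = n;
    if (cur)  cur->prev  = n;
    return id;
}
```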
100
Remapping hardware units are managed by a simple linked list and use application
pagetables for translation (except for the identity domain). As a result, applications
may be required to flush the processor caches (more on this below).

At the very end of the initialization function in acpi.c, we make a call to vtd_init
(implemented in intel_vtd.c), where we parse the DMAR table, which is comprised of
remapping structures that contain devices under their scope. While parsing, we create
remapping unit structures and report all devices we find to the SKB. After we have
completed parsing the DMAR table, we construct the identity domain.
110
Since each application corresponds to a domain, we want each domain to have access to
all devices. Hence, we establish minimum and maximum domain-id bounds across all
units. We finally execute a query to retrieve all applicable devices explicitly found
in the remapping structures and add them to the identity domain. The devices eligible
for insertion into the identity domain are all PCIe devices/bridges and all PCI
devices (but not bridges) that reside on the root bus. The reason for the latter is
that PCI devices behind the same bus share the same source-id on their transactions.
This also implies that all devices that reside behind the same PCI bridge must be
contained in the same domain.
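
The source-id mentioned above is the standard PCI requester id: a 16-bit value formed
from the bus, device, and function numbers. The following is only an illustration of
that encoding, not code taken from the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Standard PCI requester-id ("source-id") encoding: 8-bit bus,
 * 5-bit device, 3-bit function.  Conventional PCI devices behind
 * a PCIe-to-PCI bridge all issue transactions with the bridge's
 * secondary bus number, so the remapping hardware cannot tell
 * them apart -- hence they must share a domain. */
static uint16_t source_id(uint8_t bus, uint8_t dev, uint8_t fn)
{
    return (uint16_t)((uint16_t)bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x7);
}
```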
120
Devices that are found during PCI bus enumeration are reported to the SKB by the PCI
daemon. After PCI device discovery is complete, an RPC call is made to the ACPI daemon
to extract these devices from the SKB and add them to the identity domain. However,
since drivers are started by Kaluga during device discovery, it is possible for a
device to be added before this call occurs.
126
For each hardware unit, after all of this occurs, we set the root table, enable DMA
remapping, and report to the SKB which segment translation is enabled on and whether
flushing is required for that segment.

If remapping hardware units are present on the platform, translation is enabled by
default. However, the command-line option "vtd_force_off" can be supplied to the ACPI
daemon to disable translation on all hardware units.
134
An application that wishes to use the VT-d then does the following (excluding any
other changes, such as using virtual addresses in place of physical ones):

(1) Executes a query to determine whether translation is enabled for specific
    segments and whether flushing the CPU caches is required for those segments.

(2) Constructs a domain using its root PML4 as an argument. A reference to it can be
    found in the list of well-known capabilities in include/barrelfish/caddr.h. It is
    identified as cap_vroot.

(3) Adds devices to the constructed domain using that PML4 as one of the arguments
    to vtd_domain_add_device(). Applications that want to use the same device can do
    so by making SR-IOV copies.

(4) Adds syscall(s) to flush the cache/TLB entries if flushing is required.

(5) Removes the devices from the domain when done.
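
In code, the sequence above looks roughly like the following. This is a sketch with
stubbed-out calls: apart from vtd_domain_add_device(), which is named above, the
function names and signatures are hypothetical stand-ins for the actual RPC interface
in include/acpi_client/acpi_client.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ACPI-daemon RPCs; only
 * vtd_domain_add_device() is named in this README. */
static bool vtd_translation_enabled(int seg)  { (void)seg; return true; }
static bool vtd_flush_required(int seg)       { (void)seg; return false; }
static int  vtd_create_domain(uint64_t vroot) { (void)vroot; return 0; }
static int  vtd_domain_add_device(int seg, int bus, int dev, int fn,
                                  uint64_t vroot)
{ (void)seg; (void)bus; (void)dev; (void)fn; (void)vroot; return 0; }
static int  vtd_domain_remove_device(int seg, int bus, int dev, int fn,
                                     uint64_t vroot)
{ (void)seg; (void)bus; (void)dev; (void)fn; (void)vroot; return 0; }

int use_vtd(uint64_t cap_vroot)
{
    /* (1) Query segment 0, since Arrakis does not support PCI segments. */
    if (!vtd_translation_enabled(0))
        return -1;
    bool flush = vtd_flush_required(0);

    /* (2) Construct a domain from the application's root PML4 (cap_vroot). */
    if (vtd_create_domain(cap_vroot) != 0)
        return -1;

    /* (3) Add the device(s) the application uses (example BDF 4:0.0). */
    if (vtd_domain_add_device(0, 4, 0, 0, cap_vroot) != 0)
        return -1;

    /* (4) If required, flush the caches before handing addresses to the
     *     device (omitted here; a syscall on the real system). */
    (void)flush;

    /* (5) Remove the device from the domain when done. */
    return vtd_domain_remove_device(0, 4, 0, 0, cap_vroot);
}
```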
152
=====================================================================================

ASSUMPTIONS AND CAVEATS

The VT-d operations, such as constructing a domain, require passing the application's
root PML4 VNode capability. As inter-core transfers of VNode capabilities have not
yet been implemented, any application wishing to use the VT-d must be on the same
core as the ACPI daemon.

The calls for adding/removing devices require the user to specify the segment the
device belongs to. Arrakis currently doesn't (and probably won't) support PCI
segments, which are simply logical collections of PCI buses used for rather complex
topologies or hierarchies that may require more than 256 buses. Since this is the
case, a value of 0 should be used for the segment number.

Currently, the implementation assumes that there is a single remapping unit for each
segment. Changes will need to be made to account for the possibility that more than
one hardware unit resides on a single segment.
171
Attempting to map the entire address space with the current implementation, which
constructs the pagetable structure by creating and mapping frames, is infeasible. As
a result, the identity page table only covers the first 1024 GB of physical memory,
but the amount mapped can easily be changed.
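
For a sense of scale, assuming the identity map were built from 2-MiB second-level
superpages (the actual granularity is determined by vtd_sl_paging.h), the number of
4-KiB paging-structure pages grows linearly with the mapped range:

```c
#include <assert.h>
#include <stdint.h>

#define KiB (1024ULL)
#define MiB (1024ULL * KiB)
#define GiB (1024ULL * MiB)

/* Number of 4-KiB paging-structure pages needed to identity-map
 * `size` bytes with 2-MiB second-level superpages: one PML4 page,
 * plus one PDPT per 512 GiB, plus one page directory per GiB.
 * A back-of-the-envelope sketch, not code from the driver. */
static uint64_t sl_table_pages(uint64_t size)
{
    uint64_t pds   = (size + GiB - 1) / GiB;  /* 512 2-MiB PDEs each */
    uint64_t pdpts = (pds + 511) / 512;
    return 1 + pdpts + pds;
}
```

With 4-KiB second-level pages instead, each mapped GiB needs a further 512 pages of
page tables, which is what makes building the full map by allocating and mapping
frames one by one infeasible.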
176
Since device drivers are started by Kaluga during PCI bus enumeration, translation
has to be enabled before this. However, the paths of devices under the scope of the
remapping structures contained in the DMAR table require knowing secondary bus
numbers for path lengths greater than 2, and this information is only reported to
the SKB during bus enumeration. Currently, we avoid this problem by assuming that
the path length of each device is 2.
183
Coherency with the processor caches for pages and paging entries is determined by
the Snoop Control and Page-walk Coherency fields in the Extended Capability register,
respectively. If the Snoop Control bit is set to 1, then second-level page table
entries with the SNP bit set will result in the remapping unit snooping the processor
caches (for pages). Earlier, it was noted that x86-64 page table entries have been
modified to contain a vtd_snoop field. The field repurposed for this was an available
field, so this should not cause any problems with application mappings. Also, the
vtd_snoop field is treated as reserved on hardware implementations having the Snoop
Control bit set to 0. This should allow the VT-d to be used by hardware that does not
support snooping of the CPU caches for pages.
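
Whether an application must flush can thus be read off two bits of the Extended
Capability register: per the VT-d architecture specification, Page-walk Coherency is
bit 0 and Snoop Control is bit 7. The driver accesses these through the
Mackerel-generated accessors from devices/vtd.dev rather than by hand, so the
following is only an illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extended Capability register bits (VT-d architecture spec):
 * bit 0 = C  (Page-walk Coherency), bit 7 = SC (Snoop Control). */
#define ECAP_C  (1ULL << 0)
#define ECAP_SC (1ULL << 7)

/* An application only has to flush the processor caches if the
 * remapping unit lacks snoop control or coherent page-walks. */
static bool flush_required(uint64_t ecap)
{
    return (ecap & ECAP_C) == 0 || (ecap & ECAP_SC) == 0;
}
```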
194
Hardware units that don't have both of these bits set may require applications to
flush the processor caches. Flushing may also be required if the addresses used by
the devices in the application's domain are not obtained from the provided
user-mapping functions (e.g. vspace_map_one_frame_attr()). For performance, syscalls
to flush the cache should be avoided in the I/O path of the application.
201
Note that there may be other problems or issues that we are as yet unaware of.
203
=====================================================================================

TESTING

The implementation has only been tested with applications using the e10k drivers,
with the appropriate changes made to e10k_vf.c, e10k_cdriver.c, e10k_qdriver.c, and
arranet.c (where we construct the domain and add the appropriate device(s)).

These applications are:
Memcached (using the UDP protocol)
UDP echo server

In arranet.c, the processor caches have to be flushed, when required, after
initializing the TX packet descriptors in lwip_arrakis_start().
218
=====================================================================================

TODO

Fault logging
Interrupt remapping
...

=====================================================================================

DOCUMENTATION

Based on the latest VT-d architecture specification (September 2013):
http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf