\documentclass[a4paper,twocolumn]{article}

\usepackage{abstract}
\usepackage{xspace}
\usepackage{amssymb}
\usepackage{latexsym}
\usepackage{tabularx}
\usepackage[T1]{fontenc}
\usepackage{calc}
\usepackage{listings}
\usepackage{color}
\usepackage{url}

\title{Device trees everywhere}

\author{David Gibson \texttt{<{dwg}{@}{au1.ibm.com}>}\\
  Benjamin Herrenschmidt \texttt{<{benh}{@}{kernel.crashing.org}>}\\
  \emph{OzLabs, IBM Linux Technology Center}}

\newcommand{\R}{\textsuperscript{\textregistered}\xspace}
\newcommand{\tm}{\textsuperscript{\texttrademark}\xspace}
\newcommand{\tge}{$\geqslant$}
%\newcommand{\ditto}{\textquotedbl\xspace}

\newcommand{\fixme}[1]{$\bigstar$\emph{\textbf{\large #1}}$\bigstar$\xspace}

\newcommand{\ppc}{\mbox{PowerPC}\xspace}
\newcommand{\of}{Open Firmware\xspace}
\newcommand{\benh}{Ben Herrenschmidt\xspace}
\newcommand{\kexec}{\texttt{kexec()}\xspace}
\newcommand{\dtbeginnode}{\texttt{OF\_DT\_BEGIN\_NODE}\xspace}
\newcommand{\dtendnode}{\texttt{OF\_DT\_END\_NODE}\xspace}
\newcommand{\dtprop}{\texttt{OF\_DT\_PROP}\xspace}
\newcommand{\dtend}{\texttt{OF\_DT\_END}\xspace}
\newcommand{\dtc}{\texttt{dtc}\xspace}
\newcommand{\phandle}{\texttt{linux,phandle}\xspace}
\begin{document}

\maketitle

\begin{abstract}
  We present a method for booting a \ppc{}\R Linux\R kernel on an
  embedded machine.  To do this, we supply the kernel with a compact
  flattened-tree representation of the system's hardware based on the
  device tree supplied by Open Firmware on IBM\R servers and Apple\R
  Power Macintosh\R machines.

  The ``blob'' representing the device tree can be created using \dtc,
  the Device Tree Compiler, which turns a simple text representation
  of the tree into the compact representation used by the kernel.  The
  compiler can produce either a binary ``blob'' or an assembler file
  ready to be built into a firmware or bootwrapper image.

  This flattened-tree approach is now the only supported method of
  booting a \texttt{ppc64} kernel without Open Firmware, and we plan
  to make it the only supported method for all \texttt{powerpc}
  kernels in the future.
\end{abstract}

\section{Introduction}

\subsection{OF and the device tree}

Historically, ``everyday'' \ppc machines have booted with the help of
\of (OF), a firmware environment defined by IEEE1275 \cite{IEEE1275}.
Among other boot-time services, OF maintains a device tree that
describes all of the system's hardware devices and how they're
connected.  During boot, before taking control of memory management,
the Linux kernel uses OF calls to scan the device tree and transfer it
to an internal representation that is used at run time to look up
various device information.

The device tree consists of nodes representing devices or
buses\footnote{Well, mostly.  There are a few special exceptions.}.
Each node contains \emph{properties}, name--value pairs that give
information about the device.  The values are arbitrary byte strings,
and for some properties, they contain tables or other structured
information.

\subsection{The bad old days}

Embedded systems, by contrast, usually have a minimal firmware that
might supply a few vital system parameters (size of RAM and the like),
but nothing as detailed or complete as the OF device tree.  This has
meant that the various 32-bit \ppc embedded ports have required a
variety of hacks spread across the kernel to deal with the lack of
device tree.  These vary from specialised boot wrappers to parse
parameters (which are at least reasonably localised) to
CONFIG-dependent hacks in drivers to override normal probe logic with
hardcoded addresses for a particular board.  As well as being ugly of
itself, such CONFIG-dependent hacks make it hard to build a single
kernel image that supports multiple embedded machines.

Until relatively recently, the only 64-bit \ppc machines without OF
were legacy (pre-POWER5\R) iSeries\R machines.  iSeries machines often
only have virtual IO devices, which makes it quite simple to work
around the lack of a device tree.  Even so, the lack means the iSeries
boot sequence must be quite different from the pSeries or Macintosh,
which is not ideal.

The device tree also presents a problem for implementing \kexec.  When
the kernel boots, it takes over full control of the system from OF,
even re-using OF's memory.  So, when \kexec comes to boot another
kernel, OF is no longer around for the second kernel to query.

\section{The Flattened Tree}

In May 2005 \benh implemented a new approach to handling the device
tree that addresses all these problems.  When booting on OF systems,
the first thing the kernel runs is a small piece of code in
\texttt{prom\_init.c}, which executes in the context of OF.  This code
walks the device tree using OF calls, and transcribes it into a
compact, flattened format.  The resulting device tree ``blob'' is then
passed to the kernel proper, which eventually unflattens the tree into
its runtime form.  This blob is the only data communicated between the
\texttt{prom\_init.c} bootstrap and the rest of the kernel.

When OF isn't available, either because the machine doesn't have it at
all or because \kexec has been used, the kernel instead starts
directly from the entry point taking a flattened device tree.  The
device tree blob must be passed in from outside, rather than generated
by part of the kernel from OF.  For \kexec, the userland
\texttt{kexec} tools build the blob from the runtime device tree
before invoking the new kernel.  For embedded systems the blob can
come either from the embedded bootloader, or from a specialised
version of the \texttt{zImage} wrapper for the system in question.

\subsection{Properties of the flattened tree}

The flattened tree format should be easy to handle, both for the
kernel that parses it and the bootloader that generates it.  In
particular, the following properties are desirable:

\begin{itemize}
\item \emph{relocatable}: the bootloader or kernel should be able to
  move the blob around as a whole, without needing to parse or adjust
  its internals.  In practice that means we must not use pointers
  within the blob.
\item \emph{insert and delete}: sometimes the bootloader might want to
  make tweaks to the flattened tree, such as deleting or inserting a
  node (or whole subtree).  It should be possible to do this without
  having to effectively regenerate the whole flattened tree.  In
  practice this means limiting the use of internal offsets in the blob
  that need recalculation if a section is inserted or removed with
  \texttt{memmove()}.
\item \emph{compact}: embedded systems are frequently short of
  resources, particularly RAM and flash memory space.  Thus, the tree
  representation should be kept as small as conveniently possible.
\end{itemize}

\subsection{Format of the device tree blob}
\label{sec:format}

\begin{figure}[htb!]
  \centering
  \footnotesize
  \begin{tabular}{r|c|l}
    \multicolumn{1}{r}{\textbf{Offset}}& \multicolumn{1}{c}{\textbf{Contents}} \\\cline{2-2}
    \texttt{0x00} & \texttt{0xd00dfeed} & magic number \\\cline{2-2}
    \texttt{0x04} & \emph{totalsize} \\\cline{2-2}
    \texttt{0x08} & \emph{off\_struct} & \\\cline{2-2}
    \texttt{0x0C} & \emph{off\_strs} & \\\cline{2-2}
    \texttt{0x10} & \emph{off\_rsvmap} & \\\cline{2-2}
    \texttt{0x14} & \emph{version} \\\cline{2-2}
    \texttt{0x18} & \emph{last\_comp\_ver} & \\\cline{2-2}
    \texttt{0x1C} & \emph{boot\_cpu\_id} & \tge v2 only\\\cline{2-2}
    \texttt{0x20} & \emph{size\_strs} & \tge v3 only\\\cline{2-2}
    \multicolumn{1}{r}{\vdots} & \multicolumn{1}{c}{\vdots} & \\\cline{2-2}
    \emph{off\_rsvmap} & \emph{address0} & memory reserve \\
    + \texttt{0x04} & ...& table \\\cline{2-2}
    + \texttt{0x08} & \emph{len0} & \\
    + \texttt{0x0C} & ...& \\\cline{2-2}
    \vdots & \multicolumn{1}{c|}{\vdots} & \\\cline{2-2}
    & \texttt{0x00000000}- & end marker\\
    & \texttt{00000000} & \\\cline{2-2}
    & \texttt{0x00000000}- & \\
    & \texttt{00000000} & \\\cline{2-2}
    \multicolumn{1}{r}{\vdots} & \multicolumn{1}{c}{\vdots} & \\\cline{2-2}
    \emph{off\_strs} & \texttt{'n' 'a' 'm' 'e'} & strings block \\
    + \texttt{0x04} & \texttt{~0~ 'm' 'o' 'd'} & \\
    + \texttt{0x08} & \texttt{'e' 'l' ~0~ \makebox[\widthof{~~~}]{\textrm{...}}} & \\
    \vdots & \multicolumn{1}{c|}{\vdots} & \\\cline{2-2}
    \multicolumn{1}{r}{+ \emph{size\_strs}} \\
    \multicolumn{1}{r}{\vdots} & \multicolumn{1}{c}{\vdots} & \\\cline{2-2}
    \emph{off\_struct} & \dtbeginnode & structure block \\\cline{2-2}
    + \texttt{0x04} & \texttt{'/' ~0~ ~0~ ~0~}  & root node\\\cline{2-2}
    + \texttt{0x08} & \dtprop & \\\cline{2-2}
    + \texttt{0x0C} & \texttt{0x00000005} & ``\texttt{model}''\\\cline{2-2}
    + \texttt{0x10} & \texttt{0x00000008} & \\\cline{2-2}
    + \texttt{0x14} & \texttt{'M' 'y' 'B' 'o'} & \\
    + \texttt{0x18} & \texttt{'a' 'r' 'd' ~0~} & \\\cline{2-2}
    \vdots & \multicolumn{1}{c|}{\vdots} & \\\cline{2-2}
    & \texttt{\dtendnode} \\\cline{2-2}
    & \texttt{\dtend} \\\cline{2-2}
    \multicolumn{1}{r}{\vdots} & \multicolumn{1}{c}{\vdots} & \\\cline{2-2}
    \multicolumn{1}{r}{\emph{totalsize}} \\
  \end{tabular}
  \caption{Device tree blob layout}
  \label{fig:blob-layout}
\end{figure}

The blob format we devised was first described on the
\texttt{linuxppc64-dev} mailing list \cite{noof1}.  The format has
since evolved through various revisions, and the current version is
included as part of the \dtc git tree \cite{dtcgit} (see
\S\ref{sec:dtc}).

Figure \ref{fig:blob-layout} shows the layout of the blob of data
containing the device tree.  It has three sections of variable size:
the \emph{memory reserve table}, the \emph{structure block} and the
\emph{strings block}.  A small header gives the blob's size and
version and the locations of the three sections, plus a handful of
vital parameters used during early boot.
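
As a rough illustration of the header in Figure \ref{fig:blob-layout},
the C sketch below reads the header fields from a blob held in memory.
It is not code from the kernel or from \dtc: the names
\texttt{dt\_header}, \texttt{dt\_read\_header} and \texttt{be32} are
invented for this example, and it assumes that the header fields are
stored big-endian, which the description above does not state
explicitly.

\begin{lstlisting}[language=C,frame=single,
    basicstyle=\footnotesize\ttfamily,tabsize=3,breaklines=true]
#include <stddef.h>
#include <stdint.h>

#define DT_MAGIC 0xd00dfeedu

/* Header fields, converted to host byte order. */
struct dt_header {
	uint32_t totalsize;
	uint32_t off_struct;
	uint32_t off_strs;
	uint32_t off_rsvmap;
	uint32_t version;
	uint32_t last_comp_ver;
	uint32_t boot_cpu_id;	/* version >= 2 only */
	uint32_t size_strs;	/* version >= 3 only */
};

/* Read a 32-bit big-endian value (assumed order). */
static uint32_t be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) |
	       ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | p[3];
}

/* Returns 0 on success, -1 if the blob looks invalid. */
int dt_read_header(const uint8_t *blob, size_t len,
                   struct dt_header *h)
{
	if (len < 0x1c || be32(blob) != DT_MAGIC)
		return -1;
	h->totalsize = be32(blob + 0x04);
	h->off_struct = be32(blob + 0x08);
	h->off_strs = be32(blob + 0x0c);
	h->off_rsvmap = be32(blob + 0x10);
	h->version = be32(blob + 0x14);
	h->last_comp_ver = be32(blob + 0x18);
	h->boot_cpu_id = 0;
	h->size_strs = 0;
	if (h->version >= 2) {
		if (len < 0x20)
			return -1;
		h->boot_cpu_id = be32(blob + 0x1c);
	}
	if (h->version >= 3) {
		if (len < 0x24)
			return -1;
		h->size_strs = be32(blob + 0x20);
	}
	/* The blob must fit in the buffer we were given. */
	return h->totalsize <= len ? 0 : -1;
}
\end{lstlisting}

Because every location in the header is an offset from the start of
the blob rather than a pointer, the same routine works unchanged after
the blob has been copied to a new address.
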

The memory reserve table gives a list of regions of memory that
the kernel must not use\footnote{Usually such ranges contain some data
structure initialised by the firmware that must be preserved by the
kernel.}.  The list is represented as a simple array of (address,
size) pairs of 64-bit values, terminated by a zero-size entry.  The
strings block is similarly simple, consisting of a number of
null-terminated strings appended together, which are referenced from
the structure block as described below.
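
Scanning the reserve table might look like the following C sketch.
The function name \texttt{dt\_range\_reserved} and the
\texttt{max\_entries} bound are assumptions made for this example, as
is the big-endian encoding of each 64-bit value.

\begin{lstlisting}[language=C,frame=single,
    basicstyle=\footnotesize\ttfamily,tabsize=3,breaklines=true]
#include <stddef.h>
#include <stdint.h>

/* Read a 64-bit big-endian value (assumed order). */
static uint64_t be64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Return 1 if [addr, addr + size) overlaps an entry
 * of the reserve table at rsvmap, else 0.  The scan
 * stops at the zero-size entry; max_entries guards
 * against a table with no terminator.  Address
 * wrap-around is not handled in this sketch.
 */
int dt_range_reserved(const uint8_t *rsvmap,
                      size_t max_entries,
                      uint64_t addr, uint64_t size)
{
	for (size_t i = 0; i < max_entries; i++) {
		uint64_t base = be64(rsvmap + 16 * i);
		uint64_t len = be64(rsvmap + 16 * i + 8);

		if (len == 0)
			break;	/* end marker */
		if (addr < base + len && base < addr + size)
			return 1;
	}
	return 0;
}
\end{lstlisting}

An early-boot allocator could call such a check before handing out a
region of memory.
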

The structure block contains the device tree proper.  Each node is
introduced with a 32-bit \dtbeginnode tag, followed by the node's name
as a null-terminated string, padded to a 32-bit boundary.  Then
follow all of the node's properties, each introduced with a
\dtprop tag, then all of the node's subnodes, each introduced with
its own \dtbeginnode tag.  The node ends with an \dtendnode tag, and
after the \dtendnode for the root node comes an \dtend tag, indicating
the end of the whole tree\footnote{This is redundant, but included for
ease of parsing.}.  The structure block starts with the \dtbeginnode
introducing the description of the root node (named \texttt{/}).

Each property, after the \dtprop, has a 32-bit value giving an offset
from the beginning of the strings block at which the property name is
stored.  Because it's common for many nodes to have properties with
the same name, this approach can substantially reduce the total size
of the blob.  The name offset is followed by the length of the
property value (as a 32-bit value) and then the data itself padded to
a 32-bit boundary.
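
The tag stream lends itself to a simple linear walk, sketched in C
below.  Several things here are assumptions rather than facts taken
from this paper: the numeric tag values (\texttt{0x1}, \texttt{0x2},
\texttt{0x3} and \texttt{0x9}), the big-endian encoding, and the
names \texttt{dt\_walk} and \texttt{dt\_counts}.  The sketch takes the
order of the two property fields from the description above (name
offset, then length); the format documentation in the \dtc tree should
be treated as the authority before using it on a real blob.

\begin{lstlisting}[language=C,frame=single,
    basicstyle=\footnotesize\ttfamily,tabsize=3,breaklines=true]
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed tag values; not given in the text. */
#define OF_DT_BEGIN_NODE 0x1u
#define OF_DT_END_NODE   0x2u
#define OF_DT_PROP       0x3u
#define OF_DT_END        0x9u

struct dt_counts { unsigned nodes, props; };

static uint32_t be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) |
	       ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | p[3];
}

/* Round n up to the next multiple of 4. */
static size_t align4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* Walk the structure block; 0 if well formed. */
int dt_walk(const uint8_t *s, size_t len,
            struct dt_counts *c)
{
	size_t off = 0;
	int depth = 0;

	c->nodes = c->props = 0;
	while (off + 4 <= len) {
		uint32_t tag = be32(s + off);

		off += 4;
		switch (tag) {
		case OF_DT_BEGIN_NODE: {
			/* name, then padding to 4 bytes */
			const uint8_t *nul =
				memchr(s + off, 0, len - off);
			if (!nul)
				return -1;
			off = align4((size_t)(nul - s) + 1);
			depth++;
			c->nodes++;
			break;
		}
		case OF_DT_PROP: {
			uint32_t vlen;

			if (depth == 0 || off + 8 > len)
				return -1;
			/* s + off: name offset (skipped);
			 * s + off + 4: value length */
			vlen = be32(s + off + 4);
			off += 8;
			if (vlen > len - off)
				return -1;
			off = align4(off + vlen);
			c->props++;
			break;
		}
		case OF_DT_END_NODE:
			if (depth-- == 0)
				return -1;
			break;
		case OF_DT_END:
			return depth == 0 ? 0 : -1;
		default:
			return -1;
		}
	}
	return -1;	/* no OF_DT_END seen */
}
\end{lstlisting}

Note that nothing in the walk needs an absolute address, and that a
node or property can be skipped knowing only its own length, which is
what keeps insertion and deletion cheap.
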

\subsection{Contents of the tree}
\label{sec:treecontents}

Having seen how to represent the device tree structure as a flattened
blob, what actually goes into the tree?  The short answer is ``the
same as an OF tree''.  On OF systems, the flattened tree is
transcribed directly from the OF device tree, so for simplicity we
also use OF conventions for the tree on other systems.

In many cases a flattened tree can be simpler than a typical
OF-provided device tree.  The flattened tree need only provide those
nodes and properties that the kernel actually requires; the flattened
tree generally need not include devices that the kernel can probe itself.
For example, an OF device tree would normally include nodes for each
PCI device on the system.  A flattened tree need only include nodes
for the PCI host bridges; the kernel will scan the buses thus
described to find the subsidiary devices.  The device tree can include
nodes for devices where the kernel needs extra information, though:
for example, for ISA devices on a subsidiary PCI/ISA bridge, or for
devices with unusual interrupt routing.

Where they exist, we follow the IEEE1275 bindings that specify how to
describe various buses in the device tree (for example,
\cite{IEEE1275-pci} describes how to represent PCI devices).  The
standard has not been updated for a long time, however, and lacks
bindings for many modern buses and devices.  In particular,
embedded-specific devices such as the various System-on-Chip buses are
not covered.  We intend to create new bindings for such buses, in
keeping with the general conventions of IEEE1275 (a simple such
binding for a System-on-Chip bus was included in \cite{noof5}, a
revision of \cite{noof1}).

One complication arises for representing ``phandles'' in the flattened
tree.  In OF, each node in the tree has an associated phandle, a
32-bit integer that uniquely identifies the node\footnote{In practice
usually implemented as a pointer or offset within OF memory.}.  This
handle is used by the various OF calls to query and traverse the tree.
Sometimes phandles are also used within the tree to refer to other
nodes in the tree.  For example, devices that produce interrupts
generally have an \texttt{interrupt-parent} property giving the
phandle of the interrupt controller that handles interrupts from this
device.  Parsing these and other interrupt related properties allows
the kernel to build a complete representation of the system's
interrupt tree, which can be quite different from the tree of bus
connections.

In the flattened tree, a node's phandle is represented by a special
\phandle property.  When the kernel generates a flattened tree from
OF, it adds a \phandle property to each node, containing the phandle
retrieved from OF.  When the tree is generated without OF, however,
only nodes that are actually referred to by phandle need to have this
property.

Another complication arises because nodes in an OF tree have two
names.  First they have the ``unit name'', which is how the node is
referred to in an OF path.  The unit name generally consists of a
device type followed by an \texttt{@} followed by a \emph{unit
address}.  For example, \texttt{/memory@0} is the full path of a
memory node at address 0, and \texttt{/ht@0,f2000000/pci@1} is the
path of a PCI bus node under a HyperTransport\tm bus node.  The form
of the unit address is bus dependent, but is generally derived from
the node's \texttt{reg} property.  In addition, nodes have a property,
\texttt{name}, whose value is usually equal to the first part of the
unit name (the part before the \texttt{@}).  For example, the nodes in
the previous example would have
\texttt{name} properties equal to \texttt{memory} and \texttt{pci},
respectively.  To save space in the blob, the current version of the
flattened tree format only requires the unit names to be present.
When the kernel unflattens the tree, it automatically generates a
\texttt{name} property from the node's path name.
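
Deriving the \texttt{name} value from a path is a small string
operation, sketched here in C.  The helper
\texttt{dt\_name\_from\_path} is hypothetical and is not the kernel's
own routine; it simply takes the last path component and cuts it at
the \texttt{@}.

\begin{lstlisting}[language=C,frame=single,
    basicstyle=\footnotesize\ttfamily,tabsize=3,breaklines=true]
#include <stddef.h>
#include <string.h>

/*
 * Copy the name part of the last component of path
 * into out, so "/ht@0,f2000000/pci@1" gives "pci".
 * Returns the name length, or -1 if out is too small.
 */
int dt_name_from_path(const char *path, char *out,
                      size_t outlen)
{
	const char *unit = strrchr(path, '/');
	size_t n;

	unit = unit ? unit + 1 : path;
	n = strcspn(unit, "@");	/* stop at '@' or end */
	if (n + 1 > outlen)
		return -1;
	memcpy(out, unit, n);
	out[n] = '\0';
	return (int)n;
}
\end{lstlisting}
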

\section{The Device Tree Compiler}
\label{sec:dtc}

\begin{figure}[htb!]
  \centering
  \begin{lstlisting}[frame=single,basicstyle=\footnotesize\ttfamily,
    tabsize=3,numbers=left,xleftmargin=2em]
/memreserve/ 0x20000000-0x21FFFFFF;

/ {
	model = "MyBoard";
	compatible = "MyBoardFamily";
	#address-cells = <2>;
	#size-cells = <2>;

	cpus {
		#address-cells = <1>;
		#size-cells = <0>;
		PowerPC,970@0 {
			device_type = "cpu";
			reg = <0>;
			clock-frequency = <5f5e1000>;
			timebase-frequency = <1FCA055>;
			linux,boot-cpu;
			i-cache-size = <10000>;
			d-cache-size = <8000>;
		};
	};

	memory@0 {
		device_type = "memory";
		memreg: reg = <00000000 00000000
		               00000000 20000000>;
	};

	mpic@0x3fffdd08400 {
		/* Interrupt controller */
		/* ... */
	};

	pci@40000000000000 {
		/* PCI host bridge */
		/* ... */
	};

	chosen {
		bootargs = "root=/dev/sda2";
		linux,platform = <00000600>;
		interrupt-controller =
			< &/mpic@0x3fffdd08400 >;
	};
};
\end{lstlisting}
  \caption{Example \dtc source}
  \label{fig:dts}
\end{figure}

As we've seen, the flattened device tree format provides a convenient
way of communicating device tree information to the kernel.  It's
simple for the kernel to parse, and simple for bootloaders to
manipulate.  On OF systems, it's easy to generate the flattened tree
by walking the OF maintained tree.  However, for embedded systems, the
flattened tree must be generated from scratch.

Embedded bootloaders are generally built for a particular board.  So,
it's usually possible to build the device tree blob at compile time
and include it in the bootloader image.  For minor revisions of the
board, the bootloader can contain code to make the necessary tweaks to
the tree before passing it to the booted kernel.

The device trees for embedded boards are usually quite simple, and
it's possible to construct the necessary blob by hand, but doing
so is tedious.  The ``device tree compiler'', \dtc{}\footnote{\dtc can
be obtained from \cite{dtcgit}.}, is designed to make creating device
tree blobs easier by converting a text representation of the tree
into the necessary blob.

\subsection{Input and output formats}

As well as the normal mode of compiling a device tree blob from text
source, \dtc can convert a device tree between a number of
representations.  It can take its input in one of three different
formats:
\begin{itemize}
\item source, the normal case.  The device tree is described in a text
  form, described in \S\ref{sec:dts}.
\item blob (\texttt{dtb}), the flattened tree format described in
  \S\ref{sec:format}.  This mode is useful for checking a pre-existing
  device tree blob.
\item filesystem (\texttt{fs}), input is a directory tree in the
  layout of \texttt{/proc/device-tree} (roughly, a directory for each
  node in the device tree, a file for each property).  This is useful
  for building a blob for the device tree in use by the currently
  running kernel.
\end{itemize}

In addition, \dtc can output the tree in one of three different
formats:
\begin{itemize}
\item blob (\texttt{dtb}), as in \S\ref{sec:format}.  The most
  straightforward use of \dtc is to compile from ``source'' to
  ``blob'' format.
\item source (\texttt{dts}), as in \S\ref{sec:dts}.  If used with blob
  input, this allows \dtc to act as a ``decompiler''.
\item assembler source (\texttt{asm}).  \dtc can produce an assembler
  file, which will assemble into a \texttt{.o} file containing the
  device tree blob, with symbols giving the beginning of the blob and
  its various subsections.  This can then be linked directly into a
  bootloader or firmware image.
\end{itemize}

For maximum applicability, \dtc can both read and write any of the
existing revisions of the blob format.  When reading, \dtc takes the
version from the blob header, and when writing it takes a command line
option specifying the desired version.  It automatically makes any
adjustments to the tree that are necessary for the specified
version.  For example, format versions before 0x10 require each node
to have an explicit \texttt{name} property.  When \dtc creates such a
blob, it will automatically generate \texttt{name} properties from the
unit names.

\subsection{Source format}
\label{sec:dts}

The ``source'' format for \dtc is a text description of the device
tree in a vaguely C-like form.  Figure \ref{fig:dts} shows an
example.  The file starts with \texttt{/memreserve/} directives, which
give address ranges to add to the output blob's memory reserve table;
the device tree proper follows.

Nodes of the tree are introduced with the node name, followed by a
\texttt{\{} ... \texttt{\};} block containing the node's properties
and subnodes.  Properties are given as just {\emph{name} \texttt{=}
  \emph{value}\texttt{;}}.  The property values can be given in any
of three forms:
\begin{itemize}
\item \emph{string} (for example, \texttt{"MyBoard"}).  The property
  value is the given string, including the terminating null byte.
  C-style escapes (\verb+\t+, \verb+\n+, \verb+\0+ and so forth) are
  allowed.
\item \emph{cells} (for example, \texttt{<0 8000 f0000000>}).  The
  property value is made up of a list of 32-bit ``cells'', each given
  as a hex value.
\item \emph{bytestring} (for example, \texttt{[1234abcdef]}).  The
  property value is given as a hex bytestring.
\end{itemize}
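
To make the \emph{cells} form concrete, the C sketch below turns a
cell list into the bytes of a property value.  It is an illustration
only and not how \dtc itself parses its input: the function
\texttt{dt\_encode\_cells} is hypothetical, references are not
handled, and the big-endian cell encoding is an assumption.

\begin{lstlisting}[language=C,frame=single,
    basicstyle=\footnotesize\ttfamily,tabsize=3,breaklines=true]
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Encode a cell list such as "<0 8000 f0000000>"
 * into out as big-endian 32-bit cells.  Returns the
 * number of bytes written, or -1 on a syntax error
 * or if out is too small.
 */
long dt_encode_cells(const char *src, uint8_t *out,
                     size_t outlen)
{
	size_t n = 0;

	if (*src++ != '<')
		return -1;
	for (;;) {
		char *end;
		unsigned long v;

		while (*src == ' ' || *src == '\t' ||
		       *src == '\n')
			src++;
		if (*src == '>')
			return (long)n;
		/* cells are hex with no prefix needed */
		v = strtoul(src, &end, 16);
		if (end == src || v > 0xffffffffUL ||
		    n + 4 > outlen)
			return -1;
		out[n++] = (uint8_t)(v >> 24);
		out[n++] = (uint8_t)(v >> 16);
		out[n++] = (uint8_t)(v >> 8);
		out[n++] = (uint8_t)v;
		src = end;
	}
}
\end{lstlisting}

Under these assumptions the three-cell example above occupies 12
bytes, the string \texttt{"MyBoard"} occupies 8 bytes including its
terminator, and the bytestring \texttt{[1234abcdef]} occupies 5.
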

Cell properties can also contain \emph{references}.  Instead of a hex
number, the source can give an ampersand (\texttt{\&}) followed by the
full path to some node in the tree.  For example, in Figure
\ref{fig:dts}, the \texttt{/chosen} node has an
\texttt{interrupt-controller} property referring to the interrupt
controller described by the node \texttt{/mpic@0x3fffdd08400}.  In the
output tree, the value of the referenced node's phandle is included in
the property.  If that node doesn't have an explicit phandle property,
\dtc will automatically create a unique phandle for it.  This approach
makes it easy to create interrupt trees without having to explicitly
assign and remember phandles for the various interrupt controller
nodes.

The \dtc source can also include ``labels'', which are placed on a
particular node or property.  For example, Figure \ref{fig:dts} has a
label ``\texttt{memreg}'' on the \texttt{reg} property of the node
\texttt{/memory@0}.  When using assembler output, corresponding labels
in the output are generated, which will assemble into symbols
addressing the part of the blob with the node or property in question.
This is useful for the common case where an embedded board has an
essentially fixed device tree with a few variable properties, such as
the size of memory.  The bootloader for such a board can have a device
tree linked in, including a symbol referring to the right place in the
blob to update the parameter with the correct value determined at
runtime.
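
A bootloader using such a label might patch the memory size along the
lines of the C sketch below.  This is a sketch under assumptions: the
helper \texttt{dt\_set\_memory\_size} is hypothetical, the property
value is assumed to be big-endian, and the pointer passed in is
assumed to address the value of the \texttt{reg} property laid out as
in Figure \ref{fig:dts} (two address cells followed by two size
cells).  How the label appears as a linker symbol is not specified
here, so the sketch takes a plain pointer.

\begin{lstlisting}[language=C,frame=single,
    basicstyle=\footnotesize\ttfamily,tabsize=3,breaklines=true]
#include <stdint.h>

/* Store a 64-bit value big-endian (assumed order). */
static void put_be64(uint8_t *p, uint64_t v)
{
	for (int i = 7; i >= 0; i--) {
		p[i] = (uint8_t)v;
		v >>= 8;
	}
}

/*
 * reg addresses the 16-byte value of /memory@0's reg
 * property: <address size>, two cells each.  Only
 * the size half is overwritten.
 */
void dt_set_memory_size(uint8_t *reg, uint64_t size)
{
	put_be64(reg + 8, size);
}
\end{lstlisting}

Because the value is overwritten in place and keeps its length, no
other part of the blob needs adjusting.
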

\subsection{Tree checking}

Between reading in the device tree and writing it out in the new
format, \dtc performs a number of checks on the tree:
\begin{itemize}
\item \emph{syntactic structure}:  \dtc checks that node and property
  names contain only allowed characters and meet length restrictions.
  It checks that a node does not have multiple properties or subnodes
  with the same name.
\item \emph{semantic structure}: In some cases, \dtc checks that
  properties whose contents are defined by convention have appropriate
  values.  For example, it checks that \texttt{reg} properties have a
  length that makes sense given the address forms specified by the
  \texttt{\#address-cells} and \texttt{\#size-cells} properties.  It
  checks that properties such as \texttt{interrupt-parent} contain a
  valid phandle.
\item \emph{Linux requirements}:  \dtc checks that the device tree
  contains those nodes and properties that are required by the Linux
  kernel to boot correctly.
\end{itemize}
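
As one example of a semantic check, the length test on \texttt{reg}
could be written as in the C sketch below.  This is our own
illustration of the idea, not the check as implemented in \dtc; the
name \texttt{dt\_reg\_len\_ok} is hypothetical.

\begin{lstlisting}[language=C,frame=single,
    basicstyle=\footnotesize\ttfamily,tabsize=3,breaklines=true]
#include <stddef.h>
#include <stdint.h>

/*
 * A reg property should hold a whole number of
 * (address, size) entries, where the parent node's
 * #address-cells and #size-cells give the number of
 * 32-bit cells in each part.  Returns 1 if reg_len
 * (in bytes) is plausible, 0 otherwise.
 */
int dt_reg_len_ok(size_t reg_len,
                  uint32_t addr_cells,
                  uint32_t size_cells)
{
	size_t entry =
		4 * ((size_t)addr_cells + size_cells);

	if (entry == 0)
		return reg_len == 0;
	return reg_len % entry == 0;
}
\end{lstlisting}

In Figure \ref{fig:dts}, the \texttt{reg} of \texttt{/memory@0} is 16
bytes long under a parent with two address cells and two size cells,
and the CPU node's \texttt{reg} is 4 bytes under one address cell and
no size cells, so both would pass.
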

These checks catch simple problems with the device tree early, rather
than leaving them to be debugged on an embedded kernel.  With the blob
input mode, \dtc can also be used for diagnosing problems with an
existing blob.

\section{Future Work}

\subsection{Board ports}

The flattened device tree has always been the only supported way to
boot a \texttt{ppc64} kernel on an embedded system.  With the merge of
\texttt{ppc32} and \texttt{ppc64} code it has also become the only
supported way to boot any merged \texttt{powerpc} kernel, 32-bit or
64-bit.  In fact, the old \texttt{ppc} architecture exists mainly just
to support the old ppc32 embedded ports that have not been migrated
to the flattened device tree approach.  We plan to remove the
\texttt{ppc} architecture eventually, which will mean porting all the
various embedded boards to use the flattened device tree.

\subsection{\dtc features}

While it is already quite usable, there are a number of extra features
that \dtc could include to make creating device trees more convenient:
\begin{itemize}
\item \emph{better tree checking}: Although \dtc already performs a
  number of checks on the device tree, they are rather haphazard.  In
  many cases \dtc will give up after detecting a minor error early and
  won't pick up more interesting errors later on.  There is a
  \texttt{-f} parameter that forces \dtc to generate an output tree
  even if there are errors.  At present, this needs to be used more
  often than one might hope, because \dtc is bad at deciding which
  errors should really be fatal, and which rate mere warnings.
\item \emph{binary include}: Occasionally, it is useful for the device
  tree to incorporate as a property a block of binary data for some
  board-specific purpose.  For example, many of Apple's device trees
  incorporate bytecode drivers for certain platform devices.  \dtc's
  source format ought to allow this by letting a property's value be
  read directly from a binary file.
\item \emph{macros}: it might be useful for \dtc to implement some
  sort of macros so that a tree containing a number of similar devices
  (for example, multiple identical ethernet controllers or PCI buses)
  can be written more quickly.  At present, this can be accomplished
  in part by running the source file through CPP before compiling with
  \dtc.  It's not clear whether ``native'' support for macros would be
  more useful.
\end{itemize}

\bibliographystyle{amsplain}
\bibliography{dtc-paper}

\section*{About the authors}

David Gibson has been a member of the IBM Linux Technology Center,
working from Canberra, Australia, since 2001.  Recently he has worked
on Linux hugepage support and performance counter support for ppc64,
as well as the device tree compiler.  In the past, he has worked on
bringup for various ppc and ppc64 embedded systems, the orinoco
wireless driver, ramfs, and a userspace checkpointing system
(\texttt{esky}).

Benjamin Herrenschmidt was a MacOS developer for about 10 years, but
ultimately saw the light and installed Linux on his Apple PowerPC
machine.  After writing a bootloader, BootX, for it in 1998, he
started contributing to the PowerPC Linux port in various areas,
mostly around the support for Apple machines.  He became official
PowerMac maintainer in 2001.  In 2003, he joined the IBM Linux
Technology Center in Canberra, Australia, where he ported the 64-bit
PowerPC kernel to Apple G5 machines and the Maple embedded board,
among other things.  He's a member of the ppc64 development ``team''
and one of his current goals is to make the integration of embedded
platforms smoother and more maintainable than in the 32-bit PowerPC
kernel.

\section*{Legal Statement}

This work represents the views of the authors and does not necessarily
represent the view of IBM.

IBM, \ppc, \ppc Architecture, POWER5, pSeries and iSeries are
trademarks or registered trademarks of International Business Machines
Corporation in the United States and/or other countries.

Apple and Power Macintosh are registered trademarks of Apple
Computer Inc. in the United States, other countries, or both.

Linux is a registered trademark of Linus Torvalds.

Other company, product, and service names may be trademarks or service
marks of others.

\end{document}