1%if false  
2  Copyright (c) 2009, ETH Zurich.
3  All rights reserved.
4  
5  This file is distributed under the terms in the attached LICENSE file.
6  If you do not find this file, copies can be found by writing to:
7  ETH Zurich D-INFK, Haldeneggsteig 4, CH-8092 Zurich. Attn: Systems Group.
8%endif
9
10%include polycode.fmt
11
12%if false
13
14> module PureExpressions where
15
16> import Data.Char
17
18> import {-# SOURCE #-} Constructs
19
20%endif
21
22\section{Filet-o-Fish pure expressions}
23\label{sec:fof_syntax_core}
24
25The core of Filet-o-Fish is organized around the purely functional
26core of C. It consists of C types as well as C expressions. 
27
28
29%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
30\subsection{Types}
31%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
32
33%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
34\subsubsection{Data-type Definitions}
35%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
36
37The |TypeExpr| data-type encompasses the following types:
38\begin{itemize}
39        \item Void,
40        \item Integers, of various signedness and size,
41        \item Float,
42        \item Named structures and unions,
43        \item Named pointers, ie. a pointer recurring in a structure
44             or union,
45        \item Arrays, and
46        \item Pointers
47\end{itemize}
48
49Note that a value of type |TInt| or |TFloat| is a \emph{constant},
50like $2$, $3 / 7$, or \verb!sizeof(struct foo)!. In FoF meta-language,
51a C variable is \emph{not} a value -- but a construct. So, the type of
52the variable
53$x$ defined by \verb!int32_t x = 4! is \emph{not} |TInt Signed TInt32|. 
54
55> data TypeExpr = TVoid
56>               | TInt Signedness Size
57>               | TFloat
58>               | TChar
59>               | TStruct AllocStruct String TFieldList
60>               | TUnion AllocUnion String TFieldList
61>               | TCompPointer String 
62>               | TEnum String [(String, Int)]
63>               | TArray AllocArray TypeExpr
64>               | TPointer TypeExpr Mode
65>               | TTypedef TypeExpr String
66>               | TFun String Function TypeExpr [(TypeExpr, Maybe String)]
67>                 deriving (Eq, Show)
68
69%%%%%%%%%%%%%%%%
70\paragraph{Functions}
71%%%%%%%%%%%%%%%%
72
73A function is represented by an Haskell function, taking a list of
74arguments and computing the body of the function. In the jargon, this
75is called an \emph{higher-order abstract syntax}. So, the function
76definition is represented by the following type:
77
78> data Function = Fun ([PureExpr] -> FoFCode PureExpr)
79
80Because |TypeExpr| is showable, |Function| has to be showable
81too. While we could define a more complete |Show| instance for
82|Function|, we will not do so here and simply return an opaque name.
83
84> instance Show Function where
85>     show _ = "<fun>"
86
87Concerning equality, this becomes more tricky. We would have to define
88what "equality" means and if that definition is decidable. Here, we
89consider syntactic equality and although we could decide whether two
90functions are syntactically equal or not, we will not do so for the
91moment. We simply consider functions as always distinct.
92
93> instance Eq Function where
94>     _ == _ = False
95
96
97%%%%%%%%%%%%%%%%
98\paragraph{Composed data-types}
99%%%%%%%%%%%%%%%%
100
101
102Composed data-types have several allocation policies: they might be
103declared dynamically, using malloc, or statically, on the stack. This
104is reflected by the following definitions. We chose to use differents
105definitions for each kind of data-type because they are likely to
106evolve in future versions and diverge from this common scheme.
107
108> data AllocStruct = StaticStruct
109>                  | DynamicStruct
110>                    deriving (Eq, Show)
111> data AllocUnion = StaticUnion
112>                 | DynamicUnion
113>                    deriving (Eq, Show)
114> data AllocArray = StaticArray Int
115>                 | DynamicArray
116>                    deriving (Eq, Show)
117
118Both Structures and Unions rely on the |TFieldList|
119synonym. Basically, the type of a Structure corresponds to its name
120as well as the list of its field names and respective types.
121
122> type TFieldList = [(String, TypeExpr)]
123
124
125%%%%%%%%%%%%%%%%
126\paragraph{Integers}
127%%%%%%%%%%%%%%%%
128
129Signedness and size of integers is defined as usual. An integer is
130either signed or unsigned and its size may vary from 8 to 64
131bits. Interestingly, we derive |Ord| on these data-types: |Ord|
132provides us with a comparison function on the signedness and size. In
133practice, we can check that a cast is a correct \emph{downcasting} by
134enforcing that the sign and size we cast to is \emph{bigger} than the
135original sign and size.
136
137> data Signedness = Unsigned 
138>                 | Signed
139>                   deriving (Eq, Ord, Show)
140>
141> data Size = TInt8
142>           | TInt16
143>           | TInt32
144>           | TInt64
145>             deriving (Eq, Ord, Show)    
146
147
148%%%%%%%%%%%%%%%%
149\paragraph{Pointers}
150%%%%%%%%%%%%%%%%
151
152As we understand that the suspense is unbearable, we are going to
153reveal you the type of $x$ defined above. Actually, the type of $x$ is
154|TPointer (TInt Signed TInt32) Avail|. A pointer? Indeed, a variable
155does actually \emph{points} to a location in memory. This choice
156allows us to capture the notion of variables and pointers in a
157single abstraction, called a \emph{reference cell}.
158
159A reference cell can be in one of the following states: either
160|Avail|able or |Read|. This
161distinction makes sense during the compilation process, it can ignored
162otherwise.
163
164> data Mode = Avail
165>           | Read
166>             deriving (Eq, Show)
167
168%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
169\subsubsection{Smart Constructors}
170%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
171
172In some circumstances, it is necessary to explicitly write the type of
173an expression. However, explicitly combining the previously defined
174types can be quite cumbersome. For example, we can naturally define
175the base types as follow:
176
177> voidT :: TypeExpr
178> voidT = TVoid
179>
180> uint8T, uint16T, uint32T, uint64T :: TypeExpr 
181> uint8T = TInt Unsigned TInt8
182> uint16T = TInt Unsigned TInt16
183> uint32T = TInt Unsigned TInt32
184> uint64T = TInt Unsigned TInt64
185>
186> int8T, int16T, int32T, int64T :: TypeExpr
187> int8T = TInt Signed TInt8
188> int16T = TInt Signed TInt16
189> int32T = TInt Signed TInt32
190> int64T = TInt Signed TInt64
191>
192> floatT :: TypeExpr
193> floatT = TFloat
194>
195> charT :: TypeExpr
196> charT = TChar
197
198> uintptrT :: TypeExpr
199> uintptrT = TCompPointer "void"
200
201
202And, similarly, we can build up composed types by applying them on
203smaller types:
204
205> arrayDT :: TypeExpr -> TypeExpr
206> arrayDT typ = TArray DynamicArray typ 
207>
208> arrayST :: Int -> TypeExpr -> TypeExpr
209> arrayST size typ = TArray (StaticArray size) typ 
210>
211> ptrT :: TypeExpr -> TypeExpr
212> ptrT typ = TPointer typ Avail
213>
214> structDT, unionDT,
215>  structST, unionST :: String -> TFieldList -> TypeExpr
216> structDT name fields = TStruct DynamicStruct name fields
217> unionDT name fields = TUnion DynamicUnion name fields
218> structST name fields = TStruct StaticStruct name fields
219> unionST name fields = TUnion StaticUnion name fields
220>
221> enumT :: String -> [(String, Int)] -> TypeExpr
222> enumT name fields = TEnum name fields
223>
224> typedef :: TypeExpr -> String -> TypeExpr
225> typedef typ name = TTypedef typ name
226
227Finally, the named pointer -- which is actually a \emph{fix-point} --
228takes as input the name of the structure or union it refers to.
229
230> cptrT :: String -> TypeExpr
231> cptrT id = TCompPointer id
232
233
234%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
235\subsection{Pure Expressions}
236%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
237
238In a first step, we are going to define the expressions composing FoF
239meta-language. As for types, this consists in a data-type, |PureExpr|,
240capturing the syntax of expressions. Then, we also define some smart
241constructors.
242
243%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
244\subsubsection{Data-type Definitions}
245%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
246
247An expression is one of the following object:
248\begin{itemize}
249\item |void|, the only object populating the type |Void|,
250\item an integer, of specific signedness and size,
251\item a float,
252\item a reference to an object in memory,
253\item a unary operation, applied to an object,
254\item a binary operation, applied on two objects,
255\item the |sizeof| operator, applied to a type,
256\item a conditional expression, testing an object against $0$,
257      returning one of two objects, and
258\item a |cast| operator, casting an object to a given type
259\end{itemize}
260
261
262> data PureExpr = Void
263>               | CLInteger Signedness Size Integer
264>               | CLFloat Float
265>               | CLChar Char
266>               | CLRef Origin TypeExpr VarName
267>               | Unary UnaryOp PureExpr
268>               | Binary BinaryOp PureExpr PureExpr
269>               | Sizeof TypeExpr
270>               | Test PureExpr PureExpr PureExpr
271>               | Cast TypeExpr PureExpr
272>               | Quote String
273>                 deriving (Eq, Show)
274
275%%%%%%%%%%%%%%%%
276\paragraph{Variable names}
277%%%%%%%%%%%%%%%%
278
279A reference is identified by a name. A |Generated| name has been
280forged by FoF. A |Provided| name has been defined by the compiler
281designer. An |Inherited| name results from an operation performed on
282another variable. We carefully track the origin of names for compilation purpose: for
283example, if a variable name has been |Generated|, we should try to
284eliminate it, to make the compiled code more readable.
285
286> data VarName = Generated String
287>              | Provided String
288>              | Inherited Int VarName
289>                deriving (Show, Eq)
290
291A reference is also decorated by its \emph{origin}. This field is used by
292the compiler to identify the scope of variables. Therefore, the
293compiler can enforce some safety checks, such as verifying that the
294address of a local variable is not assigned to a global one, for
295example. Sadly, this information is not always precisely maintained
296nor correctly used in the current implementation. More care and more
297checks should be added in the future, to ensure the correctness of the
298generated code.
299
300> data Origin = Local
301>             | Global
302>             | Param
303>             | Dynamic
304>               deriving (Eq, Show)
305
306%%%%%%%%%%%%%%%%
307\paragraph{Unary operations}
308%%%%%%%%%%%%%%%%
309
310The unary operations are either the arithmetic \emph{minus} operation,
311or the logic \emph{complement} operation, or the logic \emph{negation}
312operation.
313
314> data UnaryOp = Minus | Complement | Negation
315>              deriving (Eq, Show)
316
317%%%%%%%%%%%%%%%%
318\paragraph{Binary operations}
319%%%%%%%%%%%%%%%%
320
321The binary operations are either arithmetic operators ($+$, $-$,
322$\times$, $/$, and $\%$), Boolean operators ($<<$, $>>$, $\&$,
323bitwise-or, and \^{}), or comparison operators ($<$, $<=$, $>$, $>=$,
324$==$, and $!=$).
325
326> data BinaryOp = Plus | Sub | Mul    | Div   | Mod
327>               | Shl  | Shr | AndBit | OrBit | XorBit 
328>               | Le   | Leq | Ge     | Geq   | Eq  | Neq
329>                 deriving (Eq, Show)
330
331
332%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
333\subsubsection{Smart Constructors}
334%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
335
336As usual, we define some constructors for the C programmer to feel at
337home with FoF. Let us start with the constants first:
338
339> void :: PureExpr
340> void = Void
341>
342> int8, int16, int32, int64 :: Integer -> PureExpr
343> int8 x = CLInteger Signed TInt8 x
344> int16 x = CLInteger Signed TInt16 x
345> int32 x = CLInteger Signed TInt32 x
346> int64 x = CLInteger Signed TInt64 x
347>
348> uint8, uint16, uint32, uint64 :: Integer -> PureExpr
349> uint8 x = CLInteger Unsigned TInt8 x
350> uint16 x = CLInteger Unsigned TInt16 x
351> uint32 x = CLInteger Unsigned TInt32 x
352> uint64 x = CLInteger Unsigned TInt64 x
353>
354> charc :: Char -> PureExpr
355> charc x = CLInteger Unsigned TInt8 (toInteger $ ord x)
356>
357> float :: Float -> PureExpr
358> float x = CLFloat x
359>
360> cchar :: Char -> PureExpr
361> cchar x = CLChar x
362
363> opaque :: TypeExpr -> String -> PureExpr
364> opaque t s = CLRef Local t (Provided s)
365
366Then come the unary operators:
367
368> minus, comp, neg :: PureExpr -> PureExpr
369> minus = Unary Minus 
370> comp = Unary Complement
371> neg = Unary Negation 
372
373And the binary operators. Note that they are defined
374\emph{infix}. Therefore, it becomes possible to write the following
375code:
376
377> exampleInfix :: PureExpr
378> exampleInfix = (uint8 1) .<. ((uint8 2) .+. (uint8 4))
379
380Although not specified yet, we could have set up the left/right
381associativity and precedence rules of these operators. This would
382reduce the parenthesizing overhead. It is just a matter of doing it.
383
384> (.+.), (.-.), (.*.), (./.), (.%.),
385>          (.<<.), (.>>.), (.&.), (.|.), (.^.),
386>          (.<.), (.<=.), (.>.), 
387>          (.>=.), (.==.), (.!=.) :: PureExpr -> PureExpr -> PureExpr
388> (.+.) = Binary Plus
389> (.-.) = Binary Sub
390> (.*.) = Binary Mul
391> (./.) = Binary Div
392> (.%.) = Binary Mod
393> (.<<.) = Binary Shl
394> (.>>.) = Binary Shr
395> (.&.) = Binary AndBit
396> (.|.) = Binary OrBit
397> (.^.) = Binary XorBit
398> (.<.) = Binary Le
399> (.<=.) = Binary Leq
400> (.>.) = Binary Ge
401> (.>=.) = Binary Geq
402> (.==.) = Binary Eq
403> (.!=.) = Binary Neq
404
405Finally, |sizeof|, conditionals, and |cast| have their straightforward
406alter-ego in FoF:
407
408> sizeof :: TypeExpr -> PureExpr
409> sizeof t = Sizeof t
410>
411> test :: PureExpr -> PureExpr -> PureExpr -> PureExpr
412> test c ift iff = Test c ift iff
413>
414> cast :: TypeExpr -> PureExpr -> PureExpr
415> cast t e = Cast t e
416
417
418When compiling foreign function calls, one might need to turn a
419(Haskell) string into a FoF \emph{quote} object. This is achieved by
420the following combinator. One must avoid using this operation as much
421as possible: this quotation has no semantic meaning, therefore one
422should use it only when we are really sure we are not interested in
423the quoted semantic anymore.
424
425> quote :: String -> PureExpr
426> quote s = Quote s