README.md revision 279549
# LIBUCL

[![Build Status](https://travis-ci.org/vstakhov/libucl.svg?branch=master)](https://travis-ci.org/vstakhov/libucl)[![Coverity](https://scan.coverity.com/projects/4138/badge.svg)](https://scan.coverity.com/projects/4138)

**Table of Contents**  *generated with [DocToc](http://doctoc.herokuapp.com/)*

- [Introduction](#introduction)
- [Basic structure](#basic-structure)
- [Improvements to the JSON notation](#improvements-to-the-json-notation)
	- [General syntax sugar](#general-syntax-sugar)
	- [Automatic arrays creation](#automatic-arrays-creation)
	- [Named keys hierarchy](#named-keys-hierarchy)
	- [Convenient numbers and booleans](#convenient-numbers-and-booleans)
- [General improvements](#general-improvements)
	- [Comments](#comments)
	- [Macros support](#macros-support)
	- [Variables support](#variables-support)
	- [Multiline strings](#multiline-strings)
- [Emitter](#emitter)
- [Validation](#validation)
- [Performance](#performance)
- [Conclusion](#conclusion)

## Introduction

This document describes the main features and principles of `UCL`, the
universal configuration language.

The libucl API documentation is available on [this page](doc/api.md).

## Basic structure

UCL is heavily influenced by the `nginx` configuration format, taken as an example of a convenient configuration
system. At the same time, UCL is fully compatible with the `JSON` format and can parse JSON files.
For example, you can write the same configuration in the following ways:

* in nginx-like notation:

```nginx
param = value;
section {
    param = value;
    param1 = value1;
    flag = true;
    number = 10k;
    time = 0.2s;
    string = "something";
    subsection {
        host = {
            host = "hostname";
            port = 900;
        }
        host = {
            host = "hostname";
            port = 901;
        }
    }
}
```

* or in JSON:

```json
{
    "param": "value",
    "param1": "value1",
    "flag": true,
    "subsection": {
        "host": [
        {
            "host": "hostname",
            "port": 900
        },
        {
            "host": "hostname",
            "port": 901
        }
        ]
    }
}
```

## Improvements to the JSON notation

Several features make UCL configuration more convenient to edit than strict JSON:

### General syntax sugar

* Braces are not required around the top-level object; top-level content is automatically treated as an object:

```json
"key": "value"
```
is equivalent to:
```json
{"key": "value"}
```

* Quotes are not required for strings and keys. Moreover, `:` may be replaced by `=` or even omitted for objects:

```nginx
key = value;
section {
    key = value;
}
```
is equivalent to:
```json
{
    "key": "value",
    "section": {
        "key": "value"
    }
}
```

* No comma mess: you can safely place a comma or semicolon after the last element in an array or an object:

```json
{
    "key1": "value",
    "key2": "value",
}
```

### Automatic arrays creation

* Non-unique keys in an object are allowed and are automatically converted to arrays internally:

```json
{
    "key": "value1",
    "key": "value2"
}
```
is converted to:
```json
{
    "key": ["value1", "value2"]
}
```

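The conversion above can be modelled in a few lines of Python. This is only a sketch of the described behaviour, not libucl's implementation; `build_object` and its `(key, value)` pair input are hypothetical names chosen for illustration.

```python
def build_object(pairs):
    """Build a dict from (key, value) pairs; repeated keys become a list."""
    obj = {}
    promoted = set()  # keys that were already turned into an implicit array
    for key, value in pairs:
        if key not in obj:
            obj[key] = value              # first occurrence: plain value
        elif key in promoted:
            obj[key].append(value)        # third and later occurrences
        else:
            obj[key] = [obj[key], value]  # second occurrence: start an array
            promoted.add(key)
    return obj


result = build_object([("key", "value1"), ("key", "value2"), ("other", 1)])
print(result)
```

Tracking promoted keys separately keeps an explicit array value (for example `[1, 2]`) from being confused with an implicit one.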
### Named keys hierarchy

UCL accepts named keys and organizes them into an object hierarchy internally. Here is an example of this process:
```nginx
section "blah" {
	key = value;
}
section foo {
	key = value;
}
```

is converted to the following object:

```nginx
section {
	blah {
		key = value;
	}
	foo {
		key = value;
	}
}
```

Plain definitions may be more complex and contain more than a single level of nested objects:

```nginx
section "blah" "foo" {
	key = value;
}
```

is represented as:

```nginx
section {
	blah {
		foo {
			key = value;
		}
	}
}
```

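As a rough illustration of this transformation (not the parser's actual code), the Python sketch below wraps a section body in one nested object per name and merges sections that share a parent. The helper names `nest` and `merge` are hypothetical.

```python
def nest(names, body):
    """Wrap body in one nested dict per name, outermost name first."""
    result = body
    for name in reversed(names):
        result = {name: result}
    return result


def merge(dst, src):
    """Recursively merge src into dst so sibling sections share a parent."""
    for key, value in src.items():
        if isinstance(value, dict) and isinstance(dst.get(key), dict):
            merge(dst[key], value)
        else:
            dst[key] = value
    return dst


config = {}
merge(config, nest(["section", "blah"], {"key": "value"}))
merge(config, nest(["section", "foo"], {"key": "value"}))
print(config)
```

Here `config` ends up with a single `section` object holding both `blah` and `foo`, matching the first example above.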
### Convenient numbers and booleans

* Numbers can have suffixes to specify standard multipliers:
    + `[kKmMgG]` - standard base-10 multipliers (so `1k` is translated to 1000)
    + `[kKmMgG]b` - power-of-2 multipliers (so `1kb` is translated to 1024)
    + `[ms|s|min|d|w|y]` - time multipliers; all time values are translated to a floating point number of seconds, for example `10min` is translated to 600.0 and `10ms` is translated to 0.01
* Hexadecimal integers can be written with the `0x` prefix, for example `key = 0xff`. However, floating point values can use the decimal base only.
* Booleans can be specified as `true` or `yes` or `on` and `false` or `no` or `off`.
* It is still possible to treat numbers and booleans as strings by enclosing them in double quotes.

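The suffix rules above can be sketched in Python as follows. This is an illustrative model under stated assumptions, not libucl's parser: the function name `parse_number` is hypothetical, time suffixes are assumed to be lowercase only, and `y` is left out because the text does not say how long a year is.

```python
import re

# Multipliers taken from the list above (assumption: `y` is omitted).
DECIMAL = {"k": 1000, "m": 1000 ** 2, "g": 1000 ** 3}
BINARY = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
# Time units as (numerator, denominator) in seconds, to avoid rounding noise.
TIME = {"ms": (1, 1000), "s": (1, 1), "min": (60, 1),
        "d": (86400, 1), "w": (604800, 1)}

NUMBER = re.compile(r"(-?\d+(?:\.\d+)?)([A-Za-z]*)")


def parse_number(text):
    """Convert a suffixed number literal to an int or float."""
    if text.lower().startswith("0x"):
        return int(text, 16)  # hexadecimal is for integers only
    match = NUMBER.fullmatch(text)
    if match is None:
        raise ValueError(f"not a number: {text!r}")
    digits, suffix = match.groups()
    if suffix in TIME:
        numerator, denominator = TIME[suffix]
        return float(digits) * numerator / denominator  # float seconds
    value = float(digits) if "." in digits else int(digits)
    if suffix == "":
        return value
    letter = suffix[0].lower()
    if letter in DECIMAL and len(suffix) == 1:
        return value * DECIMAL[letter]  # e.g. 1k
    if letter in BINARY and suffix[1:] == "b":
        return value * BINARY[letter]   # e.g. 1kb
    raise ValueError(f"unknown suffix: {suffix!r}")


for literal in ("1k", "1kb", "10min", "0xff"):
    print(literal, "->", parse_number(literal))
```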
## General improvements

### Comments

UCL supports different styles of comments:

* single line: `#`
* multiline: `/* ... */`

Multiline comments may be nested:
```c
# Sample single line comment
/*
 some comment
 /* nested comment */
 end of comment
*/
```

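Nesting means that a comment ends only when every opening `/*` has been matched, which a depth counter captures. The Python sketch below shows the idea; `strip_comments` is a hypothetical helper, and it deliberately ignores quoted strings (an assumption that a real parser cannot make).

```python
def strip_comments(text):
    """Remove `#` line comments and nestable /* ... */ comments."""
    out = []
    depth = 0  # number of currently open /* ... */ comments
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "/*":
            depth += 1
            i += 2
        elif pair == "*/" and depth > 0:
            depth -= 1
            i += 2
        elif depth > 0:
            i += 1  # inside a multiline comment: drop the character
        elif text[i] == "#":
            # Single line comment: skip up to (not including) the newline.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(strip_comments("a = 1; /* outer /* inner */ still outer */ b = 2;"))
```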
### Macros support

UCL supports external macros, both multiline and single-line:
```nginx
.macro "sometext";
.macro {
    Some long text
    ....
};
```

Moreover, each macro can accept an optional list of arguments in parentheses. These
arguments are themselves a UCL object that is parsed and passed to the macro as
options:

```nginx
.macro(param=value) "something";
.macro(param={key=value}) "something";
.macro(.include "params.conf") "something";
.macro(#this is multiline macro
param = [value1, value2]) "something";
.macro(key="()") "something";
```

UCL also provides a convenient `include` macro that loads content from other files
into the current UCL object. This macro accepts either a path to a file:

```nginx
.include "/full/path.conf"
.include "./relative/path.conf"
.include "${CURDIR}/path.conf"
```

or a URL (if UCL is built with URL support provided by either `libcurl` or `libfetch`):

	.include "http://example.com/file.conf"

The `.include` macro supports a set of options:

* `try` (default: **false**) - if this option is `true`, then UCL treats errors while loading
this file as non-fatal. For example, such a file can be absent without stopping the parsing
of the top-level document.
* `sign` (default: **false**) - if this option is `true`, UCL loads and checks the signature for
a file from the path named `<FILEPATH>.sig`. Trusted public keys should be provided to the UCL API after
the parser is created but before any configurations are parsed.
* `glob` (default: **false**) - if this option is `true`, UCL treats the filename as a glob pattern and loads
all files that match the specified pattern (normally the format of patterns is defined in the `glob` manual page
for your operating system). This option is meaningless for URL includes.
* `url` (default: **true**) - allow URL includes.
* `priority` (default: 0) - specify the priority for the include (see below).

The UCL parser uses priorities to decide how objects are rewritten when other files are included,
as follows:

* If two objects have the same priority, they form an implicit array
* If a new object has a higher priority, it overwrites the old one
* If a new object has a lower priority, it is ignored

By default, the priority of the top-level object is set to zero (the lowest priority). Currently,
you can define up to 16 priorities (from 0 to 15). Includes with higher priorities
rewrite keys from objects with lower priorities, as specified by the policy.

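The three rules can be written down as a small Python model. It is a sketch of the stated policy only: `include_key` and the `(values, priority)` storage layout are hypothetical and say nothing about how libucl stores objects internally.

```python
MAX_PRIORITY = 15  # priorities range from 0 to 15


def include_key(obj, key, value, priority=0):
    """Apply the priority policy; obj maps key -> (values, priority)."""
    if not 0 <= priority <= MAX_PRIORITY:
        raise ValueError("priority must be between 0 and 15")
    if key not in obj:
        obj[key] = ([value], priority)
        return
    values, current = obj[key]
    if priority == current:
        values.append(value)            # same priority: implicit array
    elif priority > current:
        obj[key] = ([value], priority)  # higher priority: overwrite
    # lower priority: the new value is ignored


config = {}
include_key(config, "port", 80)               # top-level document
include_key(config, "port", 81)               # same priority
include_key(config, "port", 8080, priority=5) # include with priority 5
include_key(config, "port", 1, priority=2)    # ignored
print(config)
```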
### Variables support

UCL supports variables in input. Variables are registered by the user of the UCL parser and can be written in the following forms:

* `${VARIABLE}`
* `$VARIABLE`

UCL currently does not support nested variables. To escape a variable, use a double dollar sign:

* `$${VARIABLE}` is converted to `${VARIABLE}`
* `$$VARIABLE` is converted to `$VARIABLE`

However, if no valid variables are found in a string, no expansion is performed (and `$$` thus remains unchanged). This behavior may
change in future libucl releases.

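A minimal Python model of these expansion rules is shown below. It is a sketch, not libucl's algorithm: the name `expand` is hypothetical, a bare `$VARIABLE` name is assumed to be the longest run of word characters, and references to unregistered variables are assumed to be left as they are.

```python
import re

# `$$` escape, `${NAME}` form, or bare `$NAME` form.
TOKEN = re.compile(r"\$\$|\$\{(\w+)\}|\$(\w+)")


def expand(text, variables):
    """Expand registered variables in text following the rules above."""
    names = [m.group(1) or m.group(2) for m in TOKEN.finditer(text)]
    if not any(name in variables for name in names if name):
        return text  # no valid variable: nothing changes, `$$` included

    def replace(match):
        name = match.group(1) or match.group(2)
        if name is None:
            return "$"  # `$$` collapses to a single dollar sign
        return variables.get(name, match.group(0))

    return TOKEN.sub(replace, text)


print(expand("${ABI}/$$HOME", {"ABI": "x86"}))
```

Both syntaxes share one regular expression so that an escaped `$$` is consumed before the text after it can be mistaken for a variable reference.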
### Multiline strings

UCL can handle multiline strings as well as single line ones. It uses a shell/Perl-like notation for such objects:
```
key = <<EOD
some text
splitted to
lines
EOD
```

In this example `key` will be interpreted as the following string: `some text\nsplitted to\nlines`.
Here are some rules for this syntax:

* The multiline terminator must start immediately after the `<<` symbols and must consist of capital letters only (e.g. `<<eof` or `<< EOF` won't work);
* The terminator must be followed by a single newline character (no spaces are allowed between the terminator and the newline character);
* To finish a multiline string, place the terminator string immediately after a newline and follow it with a newline (no spaces or other characters are allowed here either);
* The initial and final newlines are not inserted into the resulting string, but you can still specify newlines at the beginning and at the end of a value, for example:

```
key <<EOD

some
text

EOD
```

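The rules above translate into a short line-based reader. The following Python sketch is illustrative only; `parse_heredoc` is a hypothetical helper that expects the input already split into lines, with the first line holding the key and the `<<` marker.

```python
import re

# key, optional `=`, then `<<` immediately followed by capital letters only.
START = re.compile(r"(\w+)\s*=?\s*<<([A-Z]+)")


def parse_heredoc(lines):
    """Return (key, value) for a multiline string given as a list of lines."""
    match = START.fullmatch(lines[0])
    if match is None:
        raise ValueError("not a multiline string start")
    key, terminator = match.groups()
    body = []
    for line in lines[1:]:
        if line == terminator:  # terminator alone on its line, no spaces
            return key, "\n".join(body)
        body.append(line)
    raise ValueError("unterminated multiline string")


text = "key = <<EOD\nsome text\nsplitted to\nlines\nEOD"
print(parse_heredoc(text.split("\n")))
```

Joining the body lines with `\n` is what leaves out the newline after the opening marker and the one before the terminator, while empty body lines still produce leading or trailing newlines in the value.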
## Emitter

Each UCL object can be serialized to one of the four supported formats:

* `JSON` - canonical JSON notation (with a space-indented structure);
* `Compacted JSON` - compact JSON notation (without spaces or newlines);
* `Configuration` - nginx-like notation;
* `YAML` - YAML inlined notation.

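To make the difference between the two JSON flavours concrete, here is a sketch using Python's standard `json` module rather than libucl's emitter. The indentation width and exact whitespace produced by libucl may differ; the sample object is made up for illustration.

```python
import json

obj = {"section": {"key": "value", "list": [1, 2]}}

# Indented output, comparable in spirit to the `JSON` format.
pretty = json.dumps(obj, indent=4)

# No spaces or newlines, comparable in spirit to `Compacted JSON`.
compact = json.dumps(obj, separators=(",", ":"))

print(pretty)
print(compact)
```

Both strings decode to the same object; only the whitespace differs.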
## Validation

UCL allows validation of objects. It uses the same schema language that is used for JSON: [json schema v4](http://json-schema.org). UCL supports the full set of JSON Schema features except remote references, which are unlikely to be useful for configuration objects. A schema definition can itself be written in UCL instead of JSON, which makes schemas easier to write. Moreover, since UCL supports multiple values for a key in an object, it is possible to specify the generic integer constraints `maxValues` and `minValues` to limit the number of values under a single key. UCL is currently not fully strict about validating the schemas themselves, so UCL users should supply valid schemas (as defined in JSON Schema draft v4) to ensure that input objects are validated properly.

## Performance

Are the UCL parser and emitter fast enough? Here are some numbers.
I took a 19 MB file consisting of ~700 thousand lines of JSON (obtained via
http://www.json-generator.com/). Then I ran the jansson library, which performs JSON
parsing and emitting, and compared it with UCL. Here are the results:

```
jansson: parsed json in 1.3899 seconds
jansson: emitted object in 0.2609 seconds

ucl: parsed input in 0.6649 seconds
ucl: emitted config in 0.2423 seconds
ucl: emitted json in 0.2329 seconds
ucl: emitted compact json in 0.1811 seconds
ucl: emitted yaml in 0.2489 seconds
```

In this test UCL parses about twice as fast as jansson (0.66 versus 1.39 seconds) and emits slightly faster. Moreover,
UCL compiled with optimizations (`-O3`) is roughly twice as fast again:
```
ucl: parsed input in 0.3002 seconds
ucl: emitted config in 0.1174 seconds
ucl: emitted json in 0.1174 seconds
ucl: emitted compact json in 0.0991 seconds
ucl: emitted yaml in 0.1354 seconds
```

You can run your own benchmarks with `make check` in the libucl top-level directory.

## Conclusion

UCL has a clear design that should be very convenient for reading and writing. At the same time, it is compatible with
JSON and can therefore be used as a simple JSON parser. The macro logic makes it possible to extend the configuration
language (for example by including some Lua code), and comments make it easy to disable or enable parts of a configuration
quickly.
370