# LIBUCL

[![Build Status](https://travis-ci.org/vstakhov/libucl.svg?branch=master)](https://travis-ci.org/vstakhov/libucl)

**Table of Contents** *generated with [DocToc](http://doctoc.herokuapp.com/)*

- [Introduction](#introduction)
- [Basic structure](#basic-structure)
- [Improvements to the json notation](#improvements-to-the-json-notation)
  - [General syntax sugar](#general-syntax-sugar)
  - [Automatic arrays creation](#automatic-arrays-creation)
  - [Named keys hierarchy](#named-keys-hierarchy)
  - [Convenient numbers and booleans](#convenient-numbers-and-booleans)
- [General improvements](#general-improvements)
  - [Comments](#comments)
  - [Macros support](#macros-support)
  - [Variables support](#variables-support)
  - [Multiline strings](#multiline-strings)
- [Emitter](#emitter)
- [Validation](#validation)
- [Performance](#performance)
- [Conclusion](#conclusion)

## Introduction

This document describes the main features and principles of the configuration
language called `UCL` - universal configuration language.

If you are looking for the libucl API documentation you can find it at [this page](doc/api.md).

## Basic structure

UCL is heavily influenced by the `nginx` configuration format, which serves as an example of a convenient configuration
system. However, UCL is fully compatible with the `JSON` format and is able to parse JSON files.
For example, you can write the same configuration in the following ways:

* in nginx-like notation:

```nginx
param = value;
section {
    param = value;
    param1 = value1;
    flag = true;
    number = 10k;
    time = 0.2s;
    string = "something";
    subsection {
        host = {
            host = "hostname";
            port = 900;
        }
        host = {
            host = "hostname";
            port = 901;
        }
    }
}
```

* or in JSON:

```json
{
    "param": "value",
    "section": {
        "param": "value",
        "param1": "value1",
        "flag": true,
        "number": 10000,
        "time": 0.2,
        "string": "something",
        "subsection": {
            "host": [
                {
                    "host": "hostname",
                    "port": 900
                },
                {
                    "host": "hostname",
                    "port": 901
                }
            ]
        }
    }
}
```

## Improvements to the json notation

There are various things that make UCL configuration more convenient for editing than strict JSON:

### General syntax sugar

* Braces are not necessary to enclose the top-level object: it is automatically treated as an object:

```json
"key": "value"
```
is equivalent to:
```json
{"key": "value"}
```

* Quotes are not required for strings and keys. Moreover, `:` may be replaced with `=` or even omitted for objects:

```nginx
key = value;
section {
    key = value;
}
```
is equivalent to:
```json
{
    "key": "value",
    "section": {
        "key": "value"
    }
}
```

* No comma mess: you can safely place a comma or semicolon after the last element in an array or an object:

```json
{
    "key1": "value",
    "key2": "value",
}
```

### Automatic arrays creation

* Non-unique keys in an object are allowed and are automatically converted to arrays internally:

```json
{
    "key": "value1",
    "key": "value2"
}
```
is converted to:
```json
{
    "key": ["value1", "value2"]
}
```
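
The conversion above can be imitated outside of UCL. The following is a minimal Python sketch, not libucl's implementation: it uses the standard `json` module's `object_pairs_hook` to collect repeated keys into a list. The helper name `merge_duplicate_keys` is hypothetical.

```python
import json

def merge_duplicate_keys(pairs):
    """Collect values of repeated keys into a list, in first-seen order."""
    merged = {}
    seen = {}  # how many times each key has appeared so far
    for key, value in pairs:
        seen[key] = seen.get(key, 0) + 1
        if seen[key] == 1:
            merged[key] = value                 # first occurrence: plain value
        elif seen[key] == 2:
            merged[key] = [merged[key], value]  # second occurrence: start a list
        else:
            merged[key].append(value)           # later occurrences: extend it
    return merged

text = '{"key": "value1", "key": "value2"}'
result = json.loads(text, object_pairs_hook=merge_duplicate_keys)
print(result)  # {'key': ['value1', 'value2']}
```

Counting occurrences separately keeps an explicit array value (written once) distinct from an array that was created implicitly from repeated keys.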

### Named keys hierarchy

UCL accepts named keys and organizes them into an object hierarchy internally. Here is an example of this process:
```nginx
section "blah" {
    key = value;
}
section foo {
    key = value;
}
```

is converted to the following object:

```nginx
section {
    blah {
        key = value;
    }
    foo {
        key = value;
    }
}
```

Plain definitions may be more complex and contain more than a single level of nested objects:

```nginx
section "blah" "foo" {
    key = value;
}
```

is presented as:

```nginx
section {
    blah {
        foo {
            key = value;
        }
    }
}
```
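
One way to picture this nesting is as a walk down a path of names, creating an object at each step. The Python sketch below is only an illustration with plain dictionaries; `insert_named_section` is a hypothetical helper, not part of libucl.

```python
def insert_named_section(root, names, body):
    """Nest `body` under each name in `names`, creating objects as needed."""
    node = root
    for name in names:
        # Reuse the existing object for this name or create an empty one.
        node = node.setdefault(name, {})
    node.update(body)
    return root

config = {}
# section "blah" { key = value; }
insert_named_section(config, ["section", "blah"], {"key": "value"})
# section foo { key = value; }
insert_named_section(config, ["section", "foo"], {"key": "value"})
print(config)
```

Both definitions end up under the same `section` object, which matches the converted form shown above.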

### Convenient numbers and booleans

* Numbers can have suffixes to specify standard multipliers:
    + `[kKmMgG]` - standard base 10 multipliers (so `1k` is translated to 1000)
    + `[kKmMgG]b` - power of 2 multipliers (so `1kb` is translated to 1024)
    + `[ms|s|min|d|w|y]` - time multipliers; all time values are translated to a floating point number of seconds, for example `10min` is translated to 600.0 and `10ms` is translated to 0.01
* Hexadecimal integers can be written with the `0x` prefix, for example `key = 0xff`. However, floating point values can use the decimal base only.
* Booleans can be specified as `true`, `yes` or `on` and `false`, `no` or `off`.
* It is still possible to treat numbers and booleans as strings by enclosing them in double quotes.
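
The rules in this list can be summarized in a small Python sketch. It is an approximation for illustration, not the libucl parser: the function name `parse_scalar`, the 365-day year, and the decision to return unknown input unchanged are all assumptions.

```python
import re

# Size multipliers from the list above; matched case-insensitively.
DECIMAL = {"k": 1000, "m": 1000 ** 2, "g": 1000 ** 3}
BINARY = {"kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}
# Time suffixes expressed in milliseconds. The 365-day year is an
# assumption of this sketch.
MILLISECONDS = {"ms": 1, "s": 1000, "min": 60_000, "d": 86_400_000,
                "w": 604_800_000, "y": 31_536_000_000}
TRUE_WORDS = {"true", "yes", "on"}
FALSE_WORDS = {"false", "no", "off"}

_HEX = re.compile(r"^0[xX][0-9a-fA-F]+$")
_NUMBER = re.compile(r"^(-?\d+(?:\.\d+)?)([A-Za-z]*)$")

def parse_scalar(text):
    """Convert an unquoted value to bool, int or float; else return it unchanged."""
    if text in TRUE_WORDS:
        return True
    if text in FALSE_WORDS:
        return False
    if _HEX.match(text):
        return int(text, 16)
    match = _NUMBER.match(text)
    if match is None:
        return text
    digits, suffix = match.groups()
    if suffix == "":
        return float(digits) if "." in digits else int(digits)
    if suffix in MILLISECONDS:
        # Time values always become a float number of seconds.
        return float(digits) * MILLISECONDS[suffix] / 1000.0
    if suffix.lower() in BINARY:
        return int(float(digits) * BINARY[suffix.lower()])
    if suffix.lower() in DECIMAL:
        return int(float(digits) * DECIMAL[suffix.lower()])
    return text

for sample in ["1k", "1kb", "10min", "10ms", "0xff", "yes", "off"]:
    print(sample, "->", parse_scalar(sample))
```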

## General improvements

### Comments

UCL supports different styles of comments:

* single line: `#`
* multiline: `/* ... */`

Multiline comments may be nested:
```c
# Sample single line comment
/*
 some comment
 /* nested comment */
 end of comment
*/
```
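
Nesting means that a comment ends only when every opened `/*` has been closed. A depth counter is enough to track this, as in the following Python sketch. It is a simplified illustration (the hypothetical `strip_comments` ignores quoted strings, which a real parser has to respect) and not libucl's lexer.

```python
def strip_comments(text):
    """Remove `#` comments and nested `/* ... */` comments from `text`."""
    out = []
    depth = 0  # number of currently open multiline comments
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "/*":
            depth += 1
            i += 2
        elif pair == "*/" and depth > 0:
            depth -= 1
            i += 2
        elif depth > 0:
            i += 1  # inside a multiline comment: drop the character
        elif text[i] == "#":
            # Single line comment: skip up to, but not including, the newline.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

sample = "a = 1; /* x /* nested */ y */ b = 2; # tail\nc = 3;"
print(strip_comments(sample))
```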

### Macros support

UCL supports external macros, both multiline and single line ones:
```nginx
.macro "sometext";
.macro {
    Some long text
    ....
};
```
There are three internal macros provided by UCL:

* `include` - read a file `/path/to/file` or a URL `http://example.com/file` and include it at the current place of
the UCL configuration;
* `try_include` - try to read a file or URL and include it, but do not raise a fatal error if the file or URL is not accessible;
* `includes` - read a file or a URL like the `include` macro, but also fetch and check the signature file (whose name is obtained
by appending the `.sig` suffix).

The public keys used by the last macro are specified by the application that uses UCL.

### Variables support

UCL supports variables in input. Variables are registered by the user of the UCL parser and can be written in the following forms:

* `${VARIABLE}`
* `$VARIABLE`

UCL currently does not support nested variables. To escape a variable, use a double dollar sign:

* `$${VARIABLE}` is converted to `${VARIABLE}`
* `$$VARIABLE` is converted to `$VARIABLE`

However, if no valid variables are found in a string, no expansion is performed (and `$$` thus remains unchanged). This behavior
may change in future libucl releases.
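
The interaction between expansion and escaping is easier to see in code. Below is a Python sketch of the rules as described in this section. It is not how libucl is implemented: the `expand` function and its regular expression are assumptions, and unknown variables are simply left as written.

```python
import re

# Matches an escaped dollar sign, `${NAME}` or `$NAME`.
_TOKEN = re.compile(r"\$\$|\$\{(\w+)\}|\$(\w+)")

def expand(text, variables):
    """Expand registered variables in `text`; `$$` escapes a dollar sign."""
    names = [braced or bare for braced, bare in _TOKEN.findall(text)]
    if not any(name in variables for name in names):
        # No registered variable in the string: nothing is expanded,
        # so `$$` also stays as it is.
        return text

    def substitute(match):
        if match.group(0) == "$$":
            return "$"
        name = match.group(1) or match.group(2)
        # Unregistered names are left untouched.
        return variables.get(name, match.group(0))

    return _TOKEN.sub(substitute, text)

registered = {"ABI": "x86"}  # hypothetical variable registered by the application
print(expand("${ABI}/lib", registered))
print(expand("$$ABI is $ABI", registered))
print(expand("price: $$5", registered))
```

The last call contains no registered variable, so the string is returned unchanged, including the doubled dollar sign.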

### Multiline strings

UCL can handle multiline strings as well as single line ones. It uses shell/perl-like notation for such objects:
```
key = <<EOD
some text
splitted to
lines
EOD
```

In this example `key` will be interpreted as the following string: `some text\nsplitted to\nlines`.
Here are some rules for this syntax:

* The multiline terminator must start immediately after the `<<` symbols and must consist of capital letters only (e.g. `<<eof` or `<< EOF` won't work);
* The terminator must be followed by a single newline character (no spaces are allowed between the terminator and the newline);
* To finish a multiline string, place the terminator on its own line: immediately after a newline and immediately followed by a newline (no spaces or other characters are allowed);
* The initial and the final newlines are not inserted into the resulting string, but you can still specify newlines at the beginning and at the end of a value, for example:

```
key <<EOD

some
text

EOD
```
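
The rules above translate into a short line-based routine. The Python sketch below is illustrative only and handles a single `key <<TERMINATOR` definition; `parse_heredoc` is a hypothetical name and the code is not taken from libucl.

```python
import re

# Key, optional `=`, then `<<` immediately followed by capital letters only.
_OPENER = re.compile(r"^(\w+)\s*=?\s*<<([A-Z]+)$")

def parse_heredoc(text):
    """Return (key, value) for one multiline string definition."""
    lines = text.split("\n")
    match = _OPENER.match(lines[0])
    if match is None:
        raise ValueError("not a multiline string opener")
    key, terminator = match.groups()
    body = []
    for line in lines[1:]:
        if line == terminator:  # the terminator must be alone on its line
            # Joining with "\n" leaves out the initial and final newlines.
            return key, "\n".join(body)
        body.append(line)
    raise ValueError("terminator not found")

print(parse_heredoc("key = <<EOD\nsome text\nsplitted to\nlines\nEOD"))
print(parse_heredoc("key <<EOD\n\nsome\ntext\n\nEOD"))
```

In the second call the empty lines inside the body survive, so the value starts and ends with a newline, as in the last example above.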

## Emitter

Each UCL object can be serialized to one of the four supported formats:

* `JSON` - canonical json notation (indented with spaces);
* `Compacted JSON` - compact json notation (without spaces or newlines);
* `Configuration` - nginx-like notation;
* `YAML` - yaml inlined notation.
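
To illustrate the difference between the first two formats, here is a Python sketch that uses the standard `json` module as a stand-in. It does not call the libucl emitter, and the exact whitespace libucl produces may differ; the `config` object is a made-up example.

```python
import json

# Hypothetical object standing in for a parsed UCL configuration.
config = {"key": "value", "list": [1, 2]}

# Canonical form: one member per line, indented with spaces.
pretty = json.dumps(config, indent=4)

# Compact form: no spaces after separators and no newlines.
compact = json.dumps(config, separators=(",", ":"))

print(pretty)
print(compact)
```

Both strings describe the same object; only the whitespace differs.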

## Validation

UCL allows validation of objects. It uses the same schema that is used for json: [json schema v4](http://json-schema.org). UCL supports the full set of json schema features with the exception of remote references, which are unlikely to be useful for configuration objects. A schema definition can be written in UCL format instead of JSON, which simplifies writing schemas. Moreover, since UCL supports multiple values for a key in an object, it is possible to specify the generic integer constraints `maxValues` and `minValues` to limit the number of values for a single key. UCL is currently not completely strict about validation schemas themselves, so UCL users should supply valid schemas (as defined in json-schema draft v4) to ensure that input is validated properly.

## Performance

Are the UCL parser and emitter fast enough? Well, here are some numbers.
I took a 19Mb file that consists of ~700 thousand lines of json (obtained via
http://www.json-generator.com/). Then I ran the jansson library, which performs json
parsing and emitting, and compared it with UCL. Here are the results:

```
jansson: parsed json in 1.3899 seconds
jansson: emitted object in 0.2609 seconds

ucl: parsed input in 0.6649 seconds
ucl: emitted config in 0.2423 seconds
ucl: emitted json in 0.2329 seconds
ucl: emitted compact json in 0.1811 seconds
ucl: emitted yaml in 0.2489 seconds
```

So far, UCL seems to be significantly faster than jansson at parsing and slightly faster at emitting. Moreover,
UCL compiled with optimizations (-O3) performs significantly faster:
```
ucl: parsed input in 0.3002 seconds
ucl: emitted config in 0.1174 seconds
ucl: emitted json in 0.1174 seconds
ucl: emitted compact json in 0.0991 seconds
ucl: emitted yaml in 0.1354 seconds
```
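
To put "significantly" and "slightly" into numbers, the ratios can be computed from the timings reported above. The short Python sketch below only does this arithmetic; the variable names are arbitrary and no new measurements are involved.

```python
# Timings in seconds, copied from the benchmark output above.
jansson_parse = 1.3899
jansson_emit = 0.2609
ucl_parse = 0.6649
ucl_emit_json = 0.2329
ucl_parse_o3 = 0.3002

def speedup(baseline, candidate):
    """How many times faster `candidate` is compared to `baseline`."""
    return baseline / candidate

parse_ratio = speedup(jansson_parse, ucl_parse)    # UCL vs jansson, parsing
emit_ratio = speedup(jansson_emit, ucl_emit_json)  # UCL vs jansson, emitting json
o3_ratio = speedup(ucl_parse, ucl_parse_o3)        # effect of -O3 on UCL parsing

print(f"parse: {parse_ratio:.2f}x, emit: {emit_ratio:.2f}x, -O3 parse: {o3_ratio:.2f}x")
```

With these figures, parsing is roughly twice as fast as jansson, emitting is about 10% faster, and `-O3` roughly halves the UCL parsing time again.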

You can run your own benchmarks by running `make test` in the libucl top directory.

## Conclusion

UCL has a clear design that should be very convenient for reading and writing. At the same time it is compatible with
the JSON language and can therefore be used as a simple JSON parser. The macro logic makes it possible to extend the configuration
language (for example by including some lua code), and comments allow parts of a configuration to be disabled or enabled
quickly.