1[comment {-*- tcl -*- doctools manpage}]
2[comment ===========================================]
3[manpage_begin treeql n 1.3.1]
4[copyright {2004 Colin McCormack <coldstore@users.sourceforge.net>}]
5[copyright {2004 Andreas Kupries <andreas_kupries@users.sourceforge.net>}]
6[moddesc   {Tree Query Language}]
7[titledesc {Query tree objects}]
8[category  {Data structures}]
9[require Tcl 8.2]
10[require snit]
11[require struct::list]
12[require struct::set]
13[require treeql [opt 1.3.1]]
14[description]
15[para]
16
17This package provides objects which can be used to query and transform
18tree objects following the API of tree objects created by the package
19[package struct::tree].
20
21[para]
22
23The tree query and manipulation language used here, TreeQL, is
24inspired by Cost (See section [sectref References] for more
25information).
26
27[para]
28
29[package treeql], the package, is a fairly thin query facility over
30tree-structured data types.  It implements an ordered set of nodes
31(really a list) which are generated and filtered through the
32application of TreeQL operators to each node in turn.
33
34
35[comment ===========================================]
36[section API]
37
38[subsection {TreeQL CLASS API}]
39
40The command [cmd treeql] is a [package snit]::type which implements
41the Treeql Query Language. This means that it follows the API for
42class commands as specified by the package [package snit].
43
44Its general syntax is
45
46[list_begin definitions]
47
48[call [cmd treeql] [arg objectname] [option -tree] [arg tree] \
49	[opt "[option -query] [arg query]"] \
50	[opt "[option -nodes] [arg nodes]"] \
51	[opt [arg args]...]]
52
53The command creates a new tree query object and returns the fully
54qualified name of the object command as its result.
55
56The API the returned command is following is described in the section
57[sectref {TreeQL OBJECT API}]
58
59[para]
60
61Each query object is associated with a single [arg tree] object. This
62is the object all queries will be run against.
63
64[para]
65
66If the option [option -nodes] was specified then its argument is
67treated as a list of nodes. This list is used to initialize the node
68set. It defaults to the empty list.
69
70[para]
71
72If the option [option -query] was specified then its argument will be
73interpreted as an object, the [term {parent query}] of this query. It
74defaults to the object itself. All queries will be interpreted in the
75environment of this object.
76
77[para]
78
79Any arguments coming after the options are treated as a query and run
80immediately, after the [term {node set}] has been initialized. This
81uses the same syntax for the query as the method [method query].
82
83[para]
84
85The operations of the TreeQL available for this are explained in the
86section about [sectref {The Tree Query Language}]. This section also
87explains the term [term {node set}] used above.
88
89[list_end]
90
91
92[subsection {TreeQL OBJECT API}]
93
94As [package treeql] has been implemented in [package snit] all the
95standard methods of [package snit]-based classes are available to the
96user and therefore not listed here. Please read the documentation for
97[package snit] for what they are and what functionality they provide
98
99[para]
100
101The methods provided by the package [package treeql] itself are listed
102and explained below.
103
104[list_begin definitions]
105
106[call [arg qo] [method query] [arg args]...]
107
108This method interprets its arguments as a series of TreeQL operators
109and interpretes them from the left to right (i.e. first to last).
110
111Note that the first operator uses the [term {node set}] currently
112known to the object to perform its actions.
113
114In other words, the [term {node set}] is [emph not] cleared, or
115modified in other ways, before the query is run. This allows the user
116to run several queries one after the other and have each use the
117results of the last. Any initialization has to be done by any query
118itself, using TreeQL operators.
119
120The result of the method is the [term {node set}] after the last
121operator of the query has been executed.
122
123[para]
124
125[emph Note] that uncaught errors will leave the [term {node set}] of
126the object in an intermediate state, per the TreeQL operators which
127were executed successfully before the error occurred.
128
129[para]
130
131The above means in detail that:
132
133[list_begin enumerated]
134[enum]
135
136The first argument is interpreted as the name of a query operator, the
137number of arguments required by that operator is then determined, and
138taken from the immediately following arguments.
139
140[para]
141
142Because of this operators cannot have optional arguments, all
143arguments have to be present as defined.  Failure to do this will, at
144least, confuse the query interpreter, but more likely cause errors.
145
146[enum]
147
148The operator is applied to the current node set, yielding a new node
149set, and/or manipulating the tree object the query object is connected
150to.
151
152[enum]
153
154The arguments used (i.e. operator name and arguments) are removed from
155the list of method arguments, and then the whole process is repeated
156from step [lb]1[rb], until the list of arguments is empty or an error
157occurred.
158
159[list_end]
160
161[para]
162[example {
163    # q is the query object.
164
165    q query root children get data
166
167    # The above query
168    # - Resets the node set to the root node - root
169    # - Adds the children of root to the set - children
170    # - Replaces the node set with the       - get data
171    #   values for the attribute 'data',
172    #   for all nodes in the set which
173    #   have such an attribute.
174    # - And returns this information.
175
176    # Below we can see the same query, but rewritten
177    # to show the structure as it is seen by the query
178    # interpreter.
179
180    q query \\
181	    root \\
182	    children \\
183	    get data
184}]
185[para]
186
187The operators of the TreeQL language available for this are explained
188in the section about [sectref {The Tree Query Language}]. This section
189also explains the term [term {node set}] used above.
190
191
192[call [arg qo] [method result]]
193
194This method returns a list containing the current node set.
195
196
197[call [arg qo] [method discard]]
198
199This method returns the current node set (like method
200[method result]), but also destroys the query object ([arg qo]).
201
202This is useful when constructing and using sub-queries (%AUTO% objects
203immediately destroyed after use).
204
205[list_end]
206
207
208[comment ===========================================]
209[section {The Tree Query Language}]
210
211This and the following sections specify the Tree Query Language used
212by the query objects of this package in detail.
213
214[para]
215
216First we explain the general concepts underneath the language which
217are required to comprehend it. This is followed by the specifications
218for all the available query operators. They fall into eight
219categories, and each category has its own section.
220
221[para]
222[comment {
223    Local table of contents just for this section.
224}]
225[list_begin enumerated]
226[enum]
227[sectref {TreeQL Concepts}]
228[enum]
229[sectref {Structural generators}]
230[enum]
231[sectref {Attribute Filters}]
232[enum]
233[sectref {Attribute Mutators}]
234[enum]
235[sectref {Attribute String Accessors}]
236[enum]
237[sectref Sub-queries]
238[enum]
239[sectref {Node Set Operators}]
240[enum]
241[sectref {Node Set Iterators}]
242[enum]
243[sectref {Typed node support}]
244[list_end]
245[para]
246
247
248[comment ===========================================]
249[subsection {TreeQL Concepts}]
250
251The main concept which has to be understood is that of the
252[term {node set}].
253
254Each query object maintains exactly one such [term {node set}], and
255essentially all operators use it and input argument and for their
256result.
257
258This structure simply contains the handles of all nodes which are
259currently of interest to the query object.
260
261To name it a [term set] is a bit of a misnomer, because
262
263[list_begin enumerated]
264[enum]
265A node (handle) can occur in the structure more than once, and
266
267[enum]
268the order of nodes in the structure is important as well.
269
270Whenever an operator processes all nodes in the node set it will do so
271in the order they occur in the structure.
272
273[list_end]
274[para]
275
276Regarding the possible multiple occurrence of a node, consider a node
277set containing two nodes A and B, both having node P as their
278immediate parent. Application of the TreeQL operator "parent" will
279then add P to the new node set twice, once per node it was parent
280of. I.e. the new node set will then be {P P}.
281
282
283[comment ===========================================]
284[subsection {Structural generators}]
285
286All tree-structural operators locate nodes in the tree based on a
287structural relation ship to the nodes currently in the set and then
288replace the current node set with the set of nodes found
289
290Nodes which fulfill such a relationship multiple times are added to
291the result as often as they fulfill the relationship.
292
293[para]
294
295It is important to note that the found nodes are collected in a
296separate storage area while processing the node set, and are added to
297(or replacing) the current node set only after the current node set
298has been processed completely.
299
300In other words, the new nodes are [emph not] processed by the operator
301as well and do not affect the iteration.
302
303[para]
304
305When describing an operator the variable [var N] will be used to refer
306to any node in the node set.
307
308[list_begin definitions]
309[def [method ancestors]]
310
311Replaces the current node set with the ancestors for all nodes [var N]
312in the node set, should [var N] have a parent. In other words, nodes
313without a parent do not contribute to the new node set. In other
314words, uses all nodes on the path from node [var N] to root, in this
315order (root last), for all nodes [var N] in the node set. This
316includes the root, but not the node itself.
317
318
319[def [method rootpath]]
320
321Replaces the current node set with the ancestors for all nodes [var N]
322in the node set, should [var N] have a parent. In other words, nodes
323without a parent do not contribute to the new node set.
324
325In contrast to the operator [method ancestors] the nodes are added in
326reverse order however, i.e. the root node first.
327
328
329[def [method parent]]
330
331Replaces the current node set with the parent of node [var N], for all
332nodes [var N] in the node set, should [var N] have a parent. In other
333words, nodes without a parent do not contribute to the new node set.
334
335
336[def [method children]]
337
338Replaces the current node set with the immediate children of node
339[var N], for all nodes [var N] in the node set, should [var N] have
340children. In other words, nodes without children do not contribute to
341the new node set.
342
343
344[def [method left]]
345
346Replaces the current node set with the previous/left sibling for all
347nodes [var N] in the node set, should [var N] have siblings to the
348left. In other words, nodes without left siblings do not contribute to
349the new node set.
350
351
352[def [method right]]
353
354Replaces the current node set with the next/right sibling for all
355nodes [var N] in the node set, should [var N] have siblings to the
356right. In other words, nodes without right siblings do not contribute
357to the new node set.
358
359
360[def [method prev]]
361
362Replaces the current node set with all previous/left siblings of node
363[var N], for all nodes [var N] in the node set, should [var N] have
364siblings to the left. In other words, nodes without left siblings are
365ignored. The left sibling adjacent to the node is added first, and the
366leftmost sibling last (reverse tree order).
367
368
369[def [method esib]]
370
371Replaces the current node set with all previous/left siblings of node
372[var N], for all nodes [var N] in the node set, should [var N] have
373siblings to the left. In other words, nodes without left siblings are
374ignored. The leftmost sibling is added first, and the left sibling
375adjacent to the node last (tree order).
376
377[para]
378
379The method name is a shorthand for [term {Earlier SIBling}].
380
381
382[def [method next]]
383
384Replaces the current node set with all next/right siblings of node
385[var N], for all nodes [var N] in the node set, should [var N] have
386siblings to the right. In other words, nodes without right siblings do
387not contribute to the new node set. The right sibling adjacent to the
388node is added first, and the rightmost sibling last (tree order).
389
390
391[def [method root]]
392
393Replaces the current node set with a node set containing a single
394node, the root of the tree.
395
396
397[def [method tree]]
398
399Replaces the current node set with a node set containing all nodes
400found in the tree. The nodes are added in pre-order (parent first,
401then children, the latter from left to right, first to last).
402
403
404[def [method descendants]]
405
406Replaces the current node set with the nodes in all subtrees rooted at
407node [var N], for all nodes [var N] in the node set, should [var N]
408have children. In other words, nodes without children do not
409contribute to the new node set.
410
411[para]
412
413This is like the operator [method children], but covers the children
414of children as well, i.e. all the [term {proper descendants}]. "Rooted
415at [var N]" means that [var N] itself is not added to the new set,
416which is also implied by [term {proper descendants}].
417
418
419[def [method subtree]]
420
421Like operator [method descendants], but includes the node [var N]. In
422other words:
423
424[para]
425
426Replaces the current node set with the nodes of the subtree of node
427[var N], for all nodes [var N] in the node set, should [var N] have
428children. In other words, nodes without children do not contribute to
429the new node set. I.e this is like the operator [method children], but
430covers the children of children, etc. as well. "Of [var N]" means that
431[var N] itself is added to the new set.
432
433
434[def [method forward]]
435
436Replaces the current node set with the nodes in the subtrees rooted at
437the right siblings of node [var N], for all nodes [var N] in the node
438set, should [var N] have right siblings, and they children. In other
439words, nodes without right siblings, and them without children are
440ignored.
441
442[para]
443This is equivalent to the operator sequence
444
445[example {next descendants}]
446
447
448[def [method later]]
449
450This is an alias for the operator [method forward].
451
452
453[def [method backward]]
454
455Replaces the current node set with the nodes in the flattened previous
456subtrees, in reverse tree order.
457
458[para]
459This is nearly equivalent to the operator sequence
460
461[example {prev descendants}]
462
463The only difference is that this uses the nodes in reverse order.
464
465
466[def [method earlier]]
467
468Replaces the current node set with the nodes in the flattened previous
469subtrees, in tree order.
470
471[para]
472This is equivalent to the operator sequence
473
474[example {prev subtree}]
475
476[list_end]
477
478
479[comment ===========================================]
480[subsection {Attribute Filters}]
481
482These operators filter the node set by reference to attributes of
483nodes and their properties. Filter means that all nodes not fulfilling
484the criteria are removed from the node set. In other words, the node
485set is replaced by the set of nodes fulfilling the filter criteria.
486
487
488[list_begin definitions]
489
490[def "[method hasatt] [arg attr]"]
491
492Reduces the node set to nodes which have an attribute named
493[arg attr].
494
495
496[def "[method withatt] [arg attr] [arg value]"]
497
498Reduces the node set to nodes which have an attribute named
499[arg attr], and where the value of that attribute is equal to
500[arg value] (The "==" operator is [cmd {string equal -nocase}]).
501
502
503[def "[method withatt!] [arg attr] [arg val]"]
504
505This is the same as [method withatt], but all nodes in the node set
506have to have the attribute, and the "==" operator is
507[cmd {string equal}], i.e. no [option -nocase].
508
509The operator will fail with an error if they don't have the attribute.
510
511
512[def "[method attof] [arg attr] [arg vals]"]
513
514Reduces the node set to nodes which which have an attribute named
515[arg attr] and where the value of that attribute is contained in the
516list [arg vals] of legal values. The contained-in operator used here
517does glob matching (using the attribute value as pattern) and ignores
518the case of the attribute value, [emph {but not}] for the elements of
519[arg vals].
520
521
522[def "[method attmatch] [arg attr] [arg match]"]
523
524Same as [method withatt], but [cmd {string match}] is used as the "=="
525operator, and [arg match] is the pattern checked for.
526
527[para]
528
529[emph Note] that [arg match] is a interpreted as a partial argument
530[emph list] for [cmd {string match}]. This means that it is
531interpreted as a list containing the pattern, and the pattern element
532can be preceded by options understand by [cmd {string match}], like
533[option -nocase].
534
535This is especially important should the pattern contain spaces. It has
536to be wrapped into a list for correct interpretation by this operator
537
538[list_end]
539
540
541[comment ===========================================]
542[subsection {Attribute Mutators}]
543
544These operators change node attributes within the underlying tree. In
545other words, all these operators have [term {side effects}].
546
547
548[list_begin definitions]
549
550[def "[method set] [arg attr] [arg val]"]
551
552Sets the attribute [arg attr] to the value [arg val], for all nodes
553[var N] in the node set.
554
555The operator will fail if a node does not have an attribute named
556[arg attr]. The tree will be left in a partially modified state.
557
558
559[def "[method unset] [arg attr]"]
560
561Unsets the attribute [arg attr], for all nodes [var N] in the node
562set.
563
564The operator will fail if a node does not have an attribute named
565[arg attr]. The tree will be left in a partially modified state.
566
567[list_end]
568
569
570
571[comment ===========================================]
572[subsection {Attribute String Accessors}]
573
574These operators retrieve the values of node attributes from the
575underlying tree. The collected results are stored in the node set, but
576are not actually nodes.
577
578[para]
579
580In other words, they redefine the semantics of the node set stored by
581the query object to contain non-node data after their completion.
582
583[para]
584
585The query interpreter will terminate after it has finished processing
586one of these operators, silently discarding any later query elements.
587It also means that our talk about maintenance of a node set is not
588quite true. It is a node set while the interpreter is processing
589commands, but can be left as an attribute value set at the end of
590query processing.
591
592
593[list_begin definitions]
594
595[def "[method string] [arg op] [arg attr]"]
596
597Applies the string operator [arg op] to the attribute named
598[arg attr], for all nodes [var N] in the node set, collects the
599results of that application and places them into the node set.
600
601[para]
602
603The operator will fail if a node does not have an attribute named
604[arg attr].
605
606[para]
607
608The argument [arg op] is interpreted as partial argument list for the
609builtin command [cmd string].  Its first word has to be any of the
610sub-commands understood by [cmd string].  This has to be followed by
611all arguments required for the subcommand, except the last.  that last
612argument is supplied by the attribute value.
613
614
615[def "[method get] [arg pattern]"]
616
617For all nodes [var N] in the node set it determines all their
618attributes with names matching the glob [arg pattern], then the values
619of these attributes, at last it replaces the node set with the list of
620these attribute values.
621
622
623[def [method attlist]]
624
625This is a convenience definition for the operator
626[method {getvals *}]. In other words, it replaces the node set with a
627list of the attribute values for all attributes for all nodes [var N]
628in the node set.
629
630
631[def "[method attrs] [arg glob]"]
632
633Replaces the current node set with a list of attribute lists, one
634attribute list per for all nodes [var N] in the node set.
635
636
637[def "[method attval] [arg attname]"]
638
639Reduces the current node set with the operator [method hasatt], and
640then replaces it with a list containing the values of the attribute
641named [arg attname] for all nodes [var N] in the node set.
642
643[list_end]
644
645
646[comment ===========================================]
647[subsection Sub-queries]
648
649Sub-queries yield node sets which are then used to augment, reduce or
650replace the current node set.
651
652[list_begin definitions]
653
654[def "[method andq] [arg query]"]
655
656Replaces the node set with the set-intersection of the node set
657generated by the sub-query [arg query] and itself.
658
659[para]
660
661The execution of the sub-query uses the current node set as its own
662initial node set.
663
664
665[def "[method orq] [arg query]"]
666
667Replaces the node set with the set-union of the node set generated by
668the sub-query [arg query] and itself. Duplicate nodes are removed.
669
670[para]
671
672The execution of the sub-query uses the current node set as its own
673initial node set.
674
675
676[def "[method notq] [arg query]"]
677
678Replaces the node set with the set of nodes generated by the sub-query
679[arg query] which are also not in the current node set. In other word
680the set difference of itself and the node set generated by the
681sub-query.
682
683[para]
684
685The execution of the sub-query uses the current node set as its own
686initial node set.
687
688[list_end]
689
690
691[comment ===========================================]
692[subsection {Node Set Operators}]
693
694These operators change the node set directly, without referring to the
695tree.
696
697[comment {
698    Should have a 'reverse' as well.
699}]
700
701[list_begin definitions]
702
703[def [method unique]]
704
705Removes duplicate nodes from the node set, preserving order. In other
706words, the earliest occurrence of a node handle is preserved, every
707other occurrence is removed.
708
709
710[def [method select]]
711
712Replaces the current node set with a node set containing only the
713first node from the current node set
714
715
716[def "[method transform] [arg query] [arg var] [arg body]"]
717
718First it interprets the sub-query [arg query], using the current node
719set as its initial node set.
720
721Then it iterates over the result of that query, binding the handle of
722each node to the variable named in [arg var], and executing the script
723[arg body].
724
725The collected results of these executions is made the new node set,
726replacing the current one.
727
728[para]
729
730The script [arg body] is executed in the context of the caller.
731
732
733[def "[method map] [arg var] [arg body]"]
734
735Iterates over the current node set, binding the handle of each node to
736the variable named in [arg var], and executing the script [arg body].
737
738The collected results of these executions is made the new node set,
739replacing the current one.
740
741[para]
742
743The script [arg body] is executed in the context of the caller.
744
745
746[def "[method quote] [arg val]"]
747
748Appends the literal value [arg val] to the current node set.
749
750
751[def "[method replace] [arg val]"]
752
753Replaces the current node set with the literal list value [arg val].
754
755[list_end]
756
757
758[comment ===========================================]
759[subsection {Node Set Iterators}]
760
761[list_begin definitions]
762
763[def "[method foreach] [arg query] [arg var] [arg body]"]
764
765Interprets the sub-query [arg query], then performs the equivalent of
766operator [method over] on the nodes in the node set created by that
767query. The current node set is not changed, except through side
768effects from the script [arg body].
769
770[para]
771
772The script [arg body] is executed in the context of the caller.
773
774
775[def "[method with] [arg query] [arg body]"]
776
777Interprets the [arg query], then runs the script [arg body] on the
778node set generated by the query. At last it restores the current node
779set as it was before the execution of the query.
780
781[para]
782
783The script [arg body] is executed in the context of the caller.
784
785
786[def "[method over] [arg var] [arg body]"]
787
788Executes the script [arg body] for each node in the node set, with the
789variable named by [arg var] bound to the name of the current node.
790
791The script [arg body] is executed in the context of the caller.
792
793[para]
794
795This is like the builtin [cmd foreach], with the node set as the
796source of the list to iterate over.
797
798[para]
799
800The results of executing the [arg body] are ignored.
801
802
803[def [method delete]]
804
805Deletes all the nodes contained in the current node set from the tree.
806
807[list_end]
808
809
810[comment ===========================================]
811[subsection {Typed node support}]
812
813These filters and accessors assume the existence of an attribute
814called [const @type], and are short-hand forms useful for cost-like
815tree query, html tree editing, and so on.
816
817
818[list_begin definitions]
819
820[def [method nodetype]]
821
822Returns the node type of nodes.
823
824Attribute string accessor. This is equivalent to
825
826[example {get @type}]
827
828
829[def "[method oftype] [arg t]"]
830
831Reduces the node set to nodes whose type is equal to [arg t], with
832letter case ignored.
833
834
835[def "[method nottype] [arg t]"]
836
837Reduces the node set to nodes whose type is not equal to [arg t], with
838letter case ignored.
839
840
841[def "[method oftypes] [arg attrs]"]
842
843Reduces set to nodes whose @type is an element in the list [arg attrs]
844of types. The value of @type is used as a glob pattern, and letter
845case is relevant.
846
847[list_end]
848
849
850[section Examples]
851
852... TODO ...
853
854[section References]
855
856[list_begin enumerated]
857[enum]
858[uri http://wiki.tcl.tk/COST COST] on the Tcler's Wiki.
859
860[enum]
861
862[uri http://wiki.tcl.tk/treeql TreeQL] on the Tcler's Wiki. Discuss
863this package there.
864
865[list_end]
866
867[section {BUGS, IDEAS, FEEDBACK}]
868
869This document, and the package it describes, will undoubtedly contain
870bugs and other problems.
871
872Please report such in the category [emph treeql] of the
873[uri {http://sourceforge.net/tracker/?group_id=12883} {Tcllib SF Trackers}].
874
875Please also report any ideas for enhancements you may have for either
876package and/or documentation.
877
878
879[keywords {tree query language} {structured queries}]
880[keywords tree TreeQL Cost XPath DOM XSLT]
881[manpage_end]
882