1[comment {-*- tcl -*- doctools manpage}] 2[comment ===========================================] 3[manpage_begin treeql n 1.3.1] 4[copyright {2004 Colin McCormack <coldstore@users.sourceforge.net>}] 5[copyright {2004 Andreas Kupries <andreas_kupries@users.sourceforge.net>}] 6[moddesc {Tree Query Language}] 7[titledesc {Query tree objects}] 8[category {Data structures}] 9[require Tcl 8.2] 10[require snit] 11[require struct::list] 12[require struct::set] 13[require treeql [opt 1.3.1]] 14[description] 15[para] 16 17This package provides objects which can be used to query and transform 18tree objects following the API of tree objects created by the package 19[package struct::tree]. 20 21[para] 22 23The tree query and manipulation language used here, TreeQL, is 24inspired by Cost (See section [sectref References] for more 25information). 26 27[para] 28 29[package treeql], the package, is a fairly thin query facility over 30tree-structured data types. It implements an ordered set of nodes 31(really a list) which are generated and filtered through the 32application of TreeQL operators to each node in turn. 33 34 35[comment ===========================================] 36[section API] 37 38[subsection {TreeQL CLASS API}] 39 40The command [cmd treeql] is a [package snit]::type which implements 41the Treeql Query Language. This means that it follows the API for 42class commands as specified by the package [package snit]. 43 44Its general syntax is 45 46[list_begin definitions] 47 48[call [cmd treeql] [arg objectname] [option -tree] [arg tree] \ 49 [opt "[option -query] [arg query]"] \ 50 [opt "[option -nodes] [arg nodes]"] \ 51 [opt [arg args]...]] 52 53The command creates a new tree query object and returns the fully 54qualified name of the object command as its result. 55 56The API the returned command is following is described in the section 57[sectref {TreeQL OBJECT API}] 58 59[para] 60 61Each query object is associated with a single [arg tree] object. This 62is the object all queries will be run against. 63 64[para] 65 66If the option [option -nodes] was specified then its argument is 67treated as a list of nodes. This list is used to initialize the node 68set. It defaults to the empty list. 69 70[para] 71 72If the option [option -query] was specified then its argument will be 73interpreted as an object, the [term {parent query}] of this query. It 74defaults to the object itself. All queries will be interpreted in the 75environment of this object. 76 77[para] 78 79Any arguments coming after the options are treated as a query and run 80immediately, after the [term {node set}] has been initialized. This 81uses the same syntax for the query as the method [method query]. 82 83[para] 84 85The operations of the TreeQL available for this are explained in the 86section about [sectref {The Tree Query Language}]. This section also 87explains the term [term {node set}] used above. 88 89[list_end] 90 91 92[subsection {TreeQL OBJECT API}] 93 94As [package treeql] has been implemented in [package snit] all the 95standard methods of [package snit]-based classes are available to the 96user and therefore not listed here. Please read the documentation for 97[package snit] for what they are and what functionality they provide 98 99[para] 100 101The methods provided by the package [package treeql] itself are listed 102and explained below. 103 104[list_begin definitions] 105 106[call [arg qo] [method query] [arg args]...] 107 108This method interprets its arguments as a series of TreeQL operators 109and interpretes them from the left to right (i.e. first to last). 110 111Note that the first operator uses the [term {node set}] currently 112known to the object to perform its actions. 113 114In other words, the [term {node set}] is [emph not] cleared, or 115modified in other ways, before the query is run. This allows the user 116to run several queries one after the other and have each use the 117results of the last. Any initialization has to be done by any query 118itself, using TreeQL operators. 119 120The result of the method is the [term {node set}] after the last 121operator of the query has been executed. 122 123[para] 124 125[emph Note] that uncaught errors will leave the [term {node set}] of 126the object in an intermediate state, per the TreeQL operators which 127were executed successfully before the error occurred. 128 129[para] 130 131The above means in detail that: 132 133[list_begin enumerated] 134[enum] 135 136The first argument is interpreted as the name of a query operator, the 137number of arguments required by that operator is then determined, and 138taken from the immediately following arguments. 139 140[para] 141 142Because of this operators cannot have optional arguments, all 143arguments have to be present as defined. Failure to do this will, at 144least, confuse the query interpreter, but more likely cause errors. 145 146[enum] 147 148The operator is applied to the current node set, yielding a new node 149set, and/or manipulating the tree object the query object is connected 150to. 151 152[enum] 153 154The arguments used (i.e. operator name and arguments) are removed from 155the list of method arguments, and then the whole process is repeated 156from step [lb]1[rb], until the list of arguments is empty or an error 157occurred. 158 159[list_end] 160 161[para] 162[example { 163 # q is the query object. 164 165 q query root children get data 166 167 # The above query 168 # - Resets the node set to the root node - root 169 # - Adds the children of root to the set - children 170 # - Replaces the node set with the - get data 171 # values for the attribute 'data', 172 # for all nodes in the set which 173 # have such an attribute. 174 # - And returns this information. 175 176 # Below we can see the same query, but rewritten 177 # to show the structure as it is seen by the query 178 # interpreter. 179 180 q query \\ 181 root \\ 182 children \\ 183 get data 184}] 185[para] 186 187The operators of the TreeQL language available for this are explained 188in the section about [sectref {The Tree Query Language}]. This section 189also explains the term [term {node set}] used above. 190 191 192[call [arg qo] [method result]] 193 194This method returns a list containing the current node set. 195 196 197[call [arg qo] [method discard]] 198 199This method returns the current node set (like method 200[method result]), but also destroys the query object ([arg qo]). 201 202This is useful when constructing and using sub-queries (%AUTO% objects 203immediately destroyed after use). 204 205[list_end] 206 207 208[comment ===========================================] 209[section {The Tree Query Language}] 210 211This and the following sections specify the Tree Query Language used 212by the query objects of this package in detail. 213 214[para] 215 216First we explain the general concepts underneath the language which 217are required to comprehend it. This is followed by the specifications 218for all the available query operators. They fall into eight 219categories, and each category has its own section. 220 221[para] 222[comment { 223 Local table of contents just for this section. 224}] 225[list_begin enumerated] 226[enum] 227[sectref {TreeQL Concepts}] 228[enum] 229[sectref {Structural generators}] 230[enum] 231[sectref {Attribute Filters}] 232[enum] 233[sectref {Attribute Mutators}] 234[enum] 235[sectref {Attribute String Accessors}] 236[enum] 237[sectref Sub-queries] 238[enum] 239[sectref {Node Set Operators}] 240[enum] 241[sectref {Node Set Iterators}] 242[enum] 243[sectref {Typed node support}] 244[list_end] 245[para] 246 247 248[comment ===========================================] 249[subsection {TreeQL Concepts}] 250 251The main concept which has to be understood is that of the 252[term {node set}]. 253 254Each query object maintains exactly one such [term {node set}], and 255essentially all operators use it and input argument and for their 256result. 257 258This structure simply contains the handles of all nodes which are 259currently of interest to the query object. 260 261To name it a [term set] is a bit of a misnomer, because 262 263[list_begin enumerated] 264[enum] 265A node (handle) can occur in the structure more than once, and 266 267[enum] 268the order of nodes in the structure is important as well. 269 270Whenever an operator processes all nodes in the node set it will do so 271in the order they occur in the structure. 272 273[list_end] 274[para] 275 276Regarding the possible multiple occurrence of a node, consider a node 277set containing two nodes A and B, both having node P as their 278immediate parent. Application of the TreeQL operator "parent" will 279then add P to the new node set twice, once per node it was parent 280of. I.e. the new node set will then be {P P}. 281 282 283[comment ===========================================] 284[subsection {Structural generators}] 285 286All tree-structural operators locate nodes in the tree based on a 287structural relation ship to the nodes currently in the set and then 288replace the current node set with the set of nodes found 289 290Nodes which fulfill such a relationship multiple times are added to 291the result as often as they fulfill the relationship. 292 293[para] 294 295It is important to note that the found nodes are collected in a 296separate storage area while processing the node set, and are added to 297(or replacing) the current node set only after the current node set 298has been processed completely. 299 300In other words, the new nodes are [emph not] processed by the operator 301as well and do not affect the iteration. 302 303[para] 304 305When describing an operator the variable [var N] will be used to refer 306to any node in the node set. 307 308[list_begin definitions] 309[def [method ancestors]] 310 311Replaces the current node set with the ancestors for all nodes [var N] 312in the node set, should [var N] have a parent. In other words, nodes 313without a parent do not contribute to the new node set. In other 314words, uses all nodes on the path from node [var N] to root, in this 315order (root last), for all nodes [var N] in the node set. This 316includes the root, but not the node itself. 317 318 319[def [method rootpath]] 320 321Replaces the current node set with the ancestors for all nodes [var N] 322in the node set, should [var N] have a parent. In other words, nodes 323without a parent do not contribute to the new node set. 324 325In contrast to the operator [method ancestors] the nodes are added in 326reverse order however, i.e. the root node first. 327 328 329[def [method parent]] 330 331Replaces the current node set with the parent of node [var N], for all 332nodes [var N] in the node set, should [var N] have a parent. In other 333words, nodes without a parent do not contribute to the new node set. 334 335 336[def [method children]] 337 338Replaces the current node set with the immediate children of node 339[var N], for all nodes [var N] in the node set, should [var N] have 340children. In other words, nodes without children do not contribute to 341the new node set. 342 343 344[def [method left]] 345 346Replaces the current node set with the previous/left sibling for all 347nodes [var N] in the node set, should [var N] have siblings to the 348left. In other words, nodes without left siblings do not contribute to 349the new node set. 350 351 352[def [method right]] 353 354Replaces the current node set with the next/right sibling for all 355nodes [var N] in the node set, should [var N] have siblings to the 356right. In other words, nodes without right siblings do not contribute 357to the new node set. 358 359 360[def [method prev]] 361 362Replaces the current node set with all previous/left siblings of node 363[var N], for all nodes [var N] in the node set, should [var N] have 364siblings to the left. In other words, nodes without left siblings are 365ignored. The left sibling adjacent to the node is added first, and the 366leftmost sibling last (reverse tree order). 367 368 369[def [method esib]] 370 371Replaces the current node set with all previous/left siblings of node 372[var N], for all nodes [var N] in the node set, should [var N] have 373siblings to the left. In other words, nodes without left siblings are 374ignored. The leftmost sibling is added first, and the left sibling 375adjacent to the node last (tree order). 376 377[para] 378 379The method name is a shorthand for [term {Earlier SIBling}]. 380 381 382[def [method next]] 383 384Replaces the current node set with all next/right siblings of node 385[var N], for all nodes [var N] in the node set, should [var N] have 386siblings to the right. In other words, nodes without right siblings do 387not contribute to the new node set. The right sibling adjacent to the 388node is added first, and the rightmost sibling last (tree order). 389 390 391[def [method root]] 392 393Replaces the current node set with a node set containing a single 394node, the root of the tree. 395 396 397[def [method tree]] 398 399Replaces the current node set with a node set containing all nodes 400found in the tree. The nodes are added in pre-order (parent first, 401then children, the latter from left to right, first to last). 402 403 404[def [method descendants]] 405 406Replaces the current node set with the nodes in all subtrees rooted at 407node [var N], for all nodes [var N] in the node set, should [var N] 408have children. In other words, nodes without children do not 409contribute to the new node set. 410 411[para] 412 413This is like the operator [method children], but covers the children 414of children as well, i.e. all the [term {proper descendants}]. "Rooted 415at [var N]" means that [var N] itself is not added to the new set, 416which is also implied by [term {proper descendants}]. 417 418 419[def [method subtree]] 420 421Like operator [method descendants], but includes the node [var N]. In 422other words: 423 424[para] 425 426Replaces the current node set with the nodes of the subtree of node 427[var N], for all nodes [var N] in the node set, should [var N] have 428children. In other words, nodes without children do not contribute to 429the new node set. I.e this is like the operator [method children], but 430covers the children of children, etc. as well. "Of [var N]" means that 431[var N] itself is added to the new set. 432 433 434[def [method forward]] 435 436Replaces the current node set with the nodes in the subtrees rooted at 437the right siblings of node [var N], for all nodes [var N] in the node 438set, should [var N] have right siblings, and they children. In other 439words, nodes without right siblings, and them without children are 440ignored. 441 442[para] 443This is equivalent to the operator sequence 444 445[example {next descendants}] 446 447 448[def [method later]] 449 450This is an alias for the operator [method forward]. 451 452 453[def [method backward]] 454 455Replaces the current node set with the nodes in the flattened previous 456subtrees, in reverse tree order. 457 458[para] 459This is nearly equivalent to the operator sequence 460 461[example {prev descendants}] 462 463The only difference is that this uses the nodes in reverse order. 464 465 466[def [method earlier]] 467 468Replaces the current node set with the nodes in the flattened previous 469subtrees, in tree order. 470 471[para] 472This is equivalent to the operator sequence 473 474[example {prev subtree}] 475 476[list_end] 477 478 479[comment ===========================================] 480[subsection {Attribute Filters}] 481 482These operators filter the node set by reference to attributes of 483nodes and their properties. Filter means that all nodes not fulfilling 484the criteria are removed from the node set. In other words, the node 485set is replaced by the set of nodes fulfilling the filter criteria. 486 487 488[list_begin definitions] 489 490[def "[method hasatt] [arg attr]"] 491 492Reduces the node set to nodes which have an attribute named 493[arg attr]. 494 495 496[def "[method withatt] [arg attr] [arg value]"] 497 498Reduces the node set to nodes which have an attribute named 499[arg attr], and where the value of that attribute is equal to 500[arg value] (The "==" operator is [cmd {string equal -nocase}]). 501 502 503[def "[method withatt!] [arg attr] [arg val]"] 504 505This is the same as [method withatt], but all nodes in the node set 506have to have the attribute, and the "==" operator is 507[cmd {string equal}], i.e. no [option -nocase]. 508 509The operator will fail with an error if they don't have the attribute. 510 511 512[def "[method attof] [arg attr] [arg vals]"] 513 514Reduces the node set to nodes which which have an attribute named 515[arg attr] and where the value of that attribute is contained in the 516list [arg vals] of legal values. The contained-in operator used here 517does glob matching (using the attribute value as pattern) and ignores 518the case of the attribute value, [emph {but not}] for the elements of 519[arg vals]. 520 521 522[def "[method attmatch] [arg attr] [arg match]"] 523 524Same as [method withatt], but [cmd {string match}] is used as the "==" 525operator, and [arg match] is the pattern checked for. 526 527[para] 528 529[emph Note] that [arg match] is a interpreted as a partial argument 530[emph list] for [cmd {string match}]. This means that it is 531interpreted as a list containing the pattern, and the pattern element 532can be preceded by options understand by [cmd {string match}], like 533[option -nocase]. 534 535This is especially important should the pattern contain spaces. It has 536to be wrapped into a list for correct interpretation by this operator 537 538[list_end] 539 540 541[comment ===========================================] 542[subsection {Attribute Mutators}] 543 544These operators change node attributes within the underlying tree. In 545other words, all these operators have [term {side effects}]. 546 547 548[list_begin definitions] 549 550[def "[method set] [arg attr] [arg val]"] 551 552Sets the attribute [arg attr] to the value [arg val], for all nodes 553[var N] in the node set. 554 555The operator will fail if a node does not have an attribute named 556[arg attr]. The tree will be left in a partially modified state. 557 558 559[def "[method unset] [arg attr]"] 560 561Unsets the attribute [arg attr], for all nodes [var N] in the node 562set. 563 564The operator will fail if a node does not have an attribute named 565[arg attr]. The tree will be left in a partially modified state. 566 567[list_end] 568 569 570 571[comment ===========================================] 572[subsection {Attribute String Accessors}] 573 574These operators retrieve the values of node attributes from the 575underlying tree. The collected results are stored in the node set, but 576are not actually nodes. 577 578[para] 579 580In other words, they redefine the semantics of the node set stored by 581the query object to contain non-node data after their completion. 582 583[para] 584 585The query interpreter will terminate after it has finished processing 586one of these operators, silently discarding any later query elements. 587It also means that our talk about maintenance of a node set is not 588quite true. It is a node set while the interpreter is processing 589commands, but can be left as an attribute value set at the end of 590query processing. 591 592 593[list_begin definitions] 594 595[def "[method string] [arg op] [arg attr]"] 596 597Applies the string operator [arg op] to the attribute named 598[arg attr], for all nodes [var N] in the node set, collects the 599results of that application and places them into the node set. 600 601[para] 602 603The operator will fail if a node does not have an attribute named 604[arg attr]. 605 606[para] 607 608The argument [arg op] is interpreted as partial argument list for the 609builtin command [cmd string]. Its first word has to be any of the 610sub-commands understood by [cmd string]. This has to be followed by 611all arguments required for the subcommand, except the last. that last 612argument is supplied by the attribute value. 613 614 615[def "[method get] [arg pattern]"] 616 617For all nodes [var N] in the node set it determines all their 618attributes with names matching the glob [arg pattern], then the values 619of these attributes, at last it replaces the node set with the list of 620these attribute values. 621 622 623[def [method attlist]] 624 625This is a convenience definition for the operator 626[method {getvals *}]. In other words, it replaces the node set with a 627list of the attribute values for all attributes for all nodes [var N] 628in the node set. 629 630 631[def "[method attrs] [arg glob]"] 632 633Replaces the current node set with a list of attribute lists, one 634attribute list per for all nodes [var N] in the node set. 635 636 637[def "[method attval] [arg attname]"] 638 639Reduces the current node set with the operator [method hasatt], and 640then replaces it with a list containing the values of the attribute 641named [arg attname] for all nodes [var N] in the node set. 642 643[list_end] 644 645 646[comment ===========================================] 647[subsection Sub-queries] 648 649Sub-queries yield node sets which are then used to augment, reduce or 650replace the current node set. 651 652[list_begin definitions] 653 654[def "[method andq] [arg query]"] 655 656Replaces the node set with the set-intersection of the node set 657generated by the sub-query [arg query] and itself. 658 659[para] 660 661The execution of the sub-query uses the current node set as its own 662initial node set. 663 664 665[def "[method orq] [arg query]"] 666 667Replaces the node set with the set-union of the node set generated by 668the sub-query [arg query] and itself. Duplicate nodes are removed. 669 670[para] 671 672The execution of the sub-query uses the current node set as its own 673initial node set. 674 675 676[def "[method notq] [arg query]"] 677 678Replaces the node set with the set of nodes generated by the sub-query 679[arg query] which are also not in the current node set. In other word 680the set difference of itself and the node set generated by the 681sub-query. 682 683[para] 684 685The execution of the sub-query uses the current node set as its own 686initial node set. 687 688[list_end] 689 690 691[comment ===========================================] 692[subsection {Node Set Operators}] 693 694These operators change the node set directly, without referring to the 695tree. 696 697[comment { 698 Should have a 'reverse' as well. 699}] 700 701[list_begin definitions] 702 703[def [method unique]] 704 705Removes duplicate nodes from the node set, preserving order. In other 706words, the earliest occurrence of a node handle is preserved, every 707other occurrence is removed. 708 709 710[def [method select]] 711 712Replaces the current node set with a node set containing only the 713first node from the current node set 714 715 716[def "[method transform] [arg query] [arg var] [arg body]"] 717 718First it interprets the sub-query [arg query], using the current node 719set as its initial node set. 720 721Then it iterates over the result of that query, binding the handle of 722each node to the variable named in [arg var], and executing the script 723[arg body]. 724 725The collected results of these executions is made the new node set, 726replacing the current one. 727 728[para] 729 730The script [arg body] is executed in the context of the caller. 731 732 733[def "[method map] [arg var] [arg body]"] 734 735Iterates over the current node set, binding the handle of each node to 736the variable named in [arg var], and executing the script [arg body]. 737 738The collected results of these executions is made the new node set, 739replacing the current one. 740 741[para] 742 743The script [arg body] is executed in the context of the caller. 744 745 746[def "[method quote] [arg val]"] 747 748Appends the literal value [arg val] to the current node set. 749 750 751[def "[method replace] [arg val]"] 752 753Replaces the current node set with the literal list value [arg val]. 754 755[list_end] 756 757 758[comment ===========================================] 759[subsection {Node Set Iterators}] 760 761[list_begin definitions] 762 763[def "[method foreach] [arg query] [arg var] [arg body]"] 764 765Interprets the sub-query [arg query], then performs the equivalent of 766operator [method over] on the nodes in the node set created by that 767query. The current node set is not changed, except through side 768effects from the script [arg body]. 769 770[para] 771 772The script [arg body] is executed in the context of the caller. 773 774 775[def "[method with] [arg query] [arg body]"] 776 777Interprets the [arg query], then runs the script [arg body] on the 778node set generated by the query. At last it restores the current node 779set as it was before the execution of the query. 780 781[para] 782 783The script [arg body] is executed in the context of the caller. 784 785 786[def "[method over] [arg var] [arg body]"] 787 788Executes the script [arg body] for each node in the node set, with the 789variable named by [arg var] bound to the name of the current node. 790 791The script [arg body] is executed in the context of the caller. 792 793[para] 794 795This is like the builtin [cmd foreach], with the node set as the 796source of the list to iterate over. 797 798[para] 799 800The results of executing the [arg body] are ignored. 801 802 803[def [method delete]] 804 805Deletes all the nodes contained in the current node set from the tree. 806 807[list_end] 808 809 810[comment ===========================================] 811[subsection {Typed node support}] 812 813These filters and accessors assume the existence of an attribute 814called [const @type], and are short-hand forms useful for cost-like 815tree query, html tree editing, and so on. 816 817 818[list_begin definitions] 819 820[def [method nodetype]] 821 822Returns the node type of nodes. 823 824Attribute string accessor. This is equivalent to 825 826[example {get @type}] 827 828 829[def "[method oftype] [arg t]"] 830 831Reduces the node set to nodes whose type is equal to [arg t], with 832letter case ignored. 833 834 835[def "[method nottype] [arg t]"] 836 837Reduces the node set to nodes whose type is not equal to [arg t], with 838letter case ignored. 839 840 841[def "[method oftypes] [arg attrs]"] 842 843Reduces set to nodes whose @type is an element in the list [arg attrs] 844of types. The value of @type is used as a glob pattern, and letter 845case is relevant. 846 847[list_end] 848 849 850[section Examples] 851 852... TODO ... 853 854[section References] 855 856[list_begin enumerated] 857[enum] 858[uri http://wiki.tcl.tk/COST COST] on the Tcler's Wiki. 859 860[enum] 861 862[uri http://wiki.tcl.tk/treeql TreeQL] on the Tcler's Wiki. Discuss 863this package there. 864 865[list_end] 866 867[section {BUGS, IDEAS, FEEDBACK}] 868 869This document, and the package it describes, will undoubtedly contain 870bugs and other problems. 871 872Please report such in the category [emph treeql] of the 873[uri {http://sourceforge.net/tracker/?group_id=12883} {Tcllib SF Trackers}]. 874 875Please also report any ideas for enhancements you may have for either 876package and/or documentation. 877 878 879[keywords {tree query language} {structured queries}] 880[keywords tree TreeQL Cost XPath DOM XSLT] 881[manpage_end] 882