README revision 1.1.1.1
1Copyright 2001, 2004 Free Software Foundation, Inc. 2 3This file is part of the GNU MP Library. 4 5The GNU MP Library is free software; you can redistribute it and/or modify 6it under the terms of the GNU Lesser General Public License as published by 7the Free Software Foundation; either version 3 of the License, or (at your 8option) any later version. 9 10The GNU MP Library is distributed in the hope that it will be useful, but 11WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY 12or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public 13License for more details. 14 15You should have received a copy of the GNU Lesser General Public License 16along with the GNU MP Library. If not, see http://www.gnu.org/licenses/. 17 18 19 20 21 22 23 GMP EXPRESSION EVALUATION 24 ------------------------- 25 26 27 28THIS CODE IS PRELIMINARY AND MAY BE SUBJECT TO INCOMPATIBLE CHANGES IN 29FUTURE VERSIONS OF GMP. 30 31 32 33The files in this directory implement a simple scheme of string based 34expression parsing and evaluation, supporting mpz, mpq and mpf. 35 36This will be slower than direct GMP library calls, but may be convenient in 37various circumstances, such as while prototyping, or for letting a user 38enter values in symbolic form. "2**5723-7" for example is a lot easier to 39enter or maintain than the equivalent written out in decimal. 40 41 42 43BUILDING 44 45Nothing in this directory is a normal part of libgmp, and nothing is built 46or installed, but various Makefile rules are available to compile 47everything. 48 49All the functions are available through a little library (there's no shared 50library since upward binary compatibility is not guaranteed). 51 52 make libexpr.a 53 54In a program, prototypes are available using 55 56 #include "expr.h" 57 58run-expr.c is a sample program doing evaluations from the command line. 59 60 make run-expr 61 ./run-expr '1+2*3' 62 63t-expr.c is self-test program, it prints nothing if successful. 64 65 make t-expr 66 ./t-expr 67 68The expr*.c sources don't depend on gmp-impl.h and can be compiled with just 69a standard installed GMP. This isn't true of t-expr though, since it uses 70some of the internal tests/libtests.la. 71 72 73 74SIMPLE USAGE 75 76int mpz_expr (mpz_t res, int base, const char *e, ...); 77int mpq_expr (mpq_t res, int base, const char *e, ...); 78int mpf_expr (mpf_t res, int base, const char *e, ...); 79 80These functions evaluate simple arithmetic expressions. For example, 81 82 mpz_expr (result, 0, "123+456", NULL); 83 84Numbers are parsed by mpz_expr and mpq_expr the same as mpz_set_str with the 85given base. mpf_expr follows mpf_set_str, but supporting an "0x" prefix for 86hex when base==0. 87 88 mpz_expr (result, 0, "0xAAAA * 0x5555", NULL); 89 90White space, as indicated by <ctype.h> isspace(), is ignored except for the 91purpose of separating tokens. 92 93Variables can be included in expressions by putting them in the varargs list 94after the string. "a", "b", "c" etc in the expression string designate 95those values. For example, 96 97 mpq_t foo, bar; 98 ... 99 mpq_expr (q, 10, "2/3 + 1/a + b/2", foo, bar, NULL); 100 101Here "a" will be the value from foo and "b" from bar. Up to 26 variables 102can be included this way. The NULL must be present to indicate the end of 103the list. 104 105Variables can also be written "$a", "$b" etc. This is necessary when using 106bases greater than 10 since plain "a", "b" etc will otherwise be interpreted 107as numbers. For example, 108 109 mpf_t quux; 110 mpf_expr (f, 16, "F00F@-6 * $a", quux, NULL); 111 112All the standard C operators are available, with the usual precedences, plus 113"**" for exponentiation at the highest precedence (and right associative). 114 115 Operators Precedence 116 ** 220 117 ~ ! - (unary) 210 118 * / % 200 119 + - 190 120 << >> 180 121 <= < >= > 170 122 == != 160 123 & 150 124 ^ 140 125 | 130 126 && 120 127 || 110 128 ? : 100/101 129 130Currently only mpz_expr has the bitwise ~ % & ^ and | operators. The 131precedence numbers are of interest in the advanced usage described below. 132 133Various functions are available too. For example, 134 135 mpz_expr (res, 10, "gcd(123,456,789) * abs(a)", var, NULL); 136 137The following is the full set of functions, 138 139 mpz_expr 140 abs bin clrbit cmp cmpabs congruent_p divisible_p even_p fib fac 141 gcd hamdist invert jacobi kronecker lcm lucnum max min nextprime 142 odd_p perfect_power_p perfect_square_p popcount powm 143 probab_prime_p root scan0 scan1 setbit sgn sqrt 144 145 mpq_expr 146 abs, cmp, den, max, min, num, sgn 147 148 mpf_expr 149 abs, ceil, cmp, eq, floor, integer_p, max, min, reldiff, sgn, 150 sqrt, trunc 151 152All these are the same as the GMP library functions, except that min and max 153don't exist in the library. Note also that min, max, gcd and lcm take any 154number of arguments, not just two. 155 156mpf_expr does all calculations to the precision of the destination variable. 157 158 159Expression parsing can succeed or fail. The return value indicates this, 160and will be one of the following 161 162 MPEXPR_RESULT_OK 163 MPEXPR_RESULT_BAD_VARIABLE 164 MPEXPR_RESULT_BAD_TABLE 165 MPEXPR_RESULT_PARSE_ERROR 166 MPEXPR_RESULT_NOT_UI 167 168BAD_VARIABLE is when a variable is referenced that hasn't been provided. 169For example if "c" is used when only two parameters have been passed. 170BAD_TABLE is applicable to the advanced usage described below. 171 172PARSE_ERROR is a general syntax error, returned for any mal-formed input 173string. 174 175NOT_UI is returned when an attempt is made to use an operand that's bigger 176than an "unsigned long" with a function that's restricted to that range. 177For example "fib" is mpz_fib_ui and only accepts an "unsigned long". 178 179 180 181 182ADVANCED USAGE 183 184int mpz_expr_a (const struct mpexpr_operator_t *table, 185 mpz_ptr res, int base, const char *e, size_t elen, 186 mpz_srcptr var[26]) 187int mpq_expr_a (const struct mpexpr_operator_t *table, 188 mpq_ptr res, int base, const char *e, size_t elen, 189 mpq_srcptr var[26]) 190int mpf_expr_a (const struct mpexpr_operator_t *table, 191 mpf_ptr res, int base, unsigned long prec, 192 const char *e, size_t elen, 193 mpf_srcptr var[26]) 194 195These functions are an advanced interface to expression parsing. 196 197The string is taken as pointer and length. This makes it possible to parse 198an expression in the middle of somewhere without copying and null 199terminating it. 200 201Variables are an array of 26 pointers to the appropriate operands, or NULL 202for variables that are not available. Any combination of variables can be 203given, for example just "x" and "y" (var[23] and var[24]) could be set. 204 205Operators and functions are specified with a table. This makes it possible 206to provide additional operators or functions, or to completely change the 207syntax. The standard tables used by the simple functions above are 208available as 209 210 const struct mpexpr_operator_t * const mpz_expr_standard_table; 211 const struct mpexpr_operator_t * const mpq_expr_standard_table; 212 const struct mpexpr_operator_t * const mpf_expr_standard_table; 213 214struct mpexpr_operator_t is the following 215 216 struct mpexpr_operator_t { 217 const char *name; 218 mpexpr_fun_t fun; 219 int type; 220 int precedence; 221 }; 222 223 typedef void (*mpexpr_fun_t) (void); 224 225As an example, the standard mpz_expr table entry for multiplication is as 226follows. See the source code for the full set of standard entries. 227 228 { "*", (mpexpr_fun_t) mpz_mul, MPEXPR_TYPE_BINARY, 200 }, 229 230"name" is the string to parse, "fun" is the function to call for it, "type" 231indicates what parameters the function takes (among other things), and 232"precedence" sets its operator precedence. 233 234A NULL for "name" indicates the end of the table, so for example an mpf 235table with nothing but addition could be 236 237 struct mpexpr_operator_t table[] = { 238 { "+", (mpexpr_fun_t) mpf_add, MPEXPR_TYPE_BINARY, 190 }, 239 { NULL } 240 }; 241 242A special type MPEXPR_TYPE_NEW_TABLE makes it possible to chain from one 243table to another. For example the following would add a "mod" operator to 244the standard mpz table, 245 246 struct mpexpr_operator_t table[] = { 247 { "mod", (mpexpr_fun_t) mpz_fdiv_r, MPEXPR_TYPE_BINARY, 125 }, 248 { (const char *) mpz_expr_standard_table, NULL, MPEXPR_TYPE_NEW_TABLE } 249 }; 250 251Notice the low precedence on "mod", so that for instance "45+26 mod 7" 252parses as "(45+26)mod7". 253 254 255Functions are designated by a precedence of 0. They always occur as 256"foo(expr)" and so have no need for a precedence level. mpq_abs in the 257standard mpq table is 258 259 { "abs", (mpexpr_fun_t) mpq_abs, MPEXPR_TYPE_UNARY }, 260 261Functions expecting no arguments as in "foo()" can be given with 262MPEXPR_TYPE_0ARY, or actual constants to be parsed as just "foo" are 263MPEXPR_TYPE_CONSTANT. For example if a "void mpf_const_pi(mpf_t f)" 264function existed (which it doesn't) it could be, 265 266 { "pi", (mpexpr_fun_t) mpf_const_pi, MPEXPR_TYPE_CONSTANT }, 267 268 269Parsing of operator names is done by seeking the table entry with the 270longest matching name. So for instance operators "<" and "<=" exist, and 271when presented with "x <= y" the parser matches "<=" because it's longer. 272 273Parsing of function names, on the other hand, is done by requiring a whole 274alphanumeric word to match. For example presented with "fib2zz(5)" the 275parser will attempt to find a function called "fib2zz". A function "fib" 276wouldn't be used because it doesn't match the whole word. 277 278The flag MPEXPR_TYPE_WHOLEWORD can be ORed into an operator type to override 279the default parsing style. Similarly MPEXPR_TYPE_OPERATOR into a function. 280 281 282Binary operators are left associative by default, meaning they're evaluated 283from left to right, so for example "1+2+3" is treated as "(1+2)+3". 284MPEXPR_TYPE_RIGHTASSOC can be ORed into the operator type to work from right 285to left as in "1+(2+3)". This is generally what's wanted for 286exponentiation, and for example the standard mpz table has 287 288 { "**", (mpexpr_fun_t) mpz_pow_ui, 289 MPEXPR_TYPE_BINARY_UI | MPEXPR_TYPE_RIGHTASSOC, 220 } 290 291Unary operators are postfix by default. For example a factorial to be used 292as "123!" might be 293 294 { "!", (mpexpr_fun_t) mpz_fac_ui, MPEXPR_TYPE_UNARY_UI, 215 } 295 296MPEXPR_TYPE_PREFIX can be ORed into the type to get a prefix operator. For 297instance negation (unary minus) in the standard mpf table is 298 299 { "-", (mpexpr_fun_t) mpf_neg, 300 MPEXPR_TYPE_UNARY | MPEXPR_TYPE_PREFIX, 210 }, 301 302 303The same operator can exist as a prefix unary and a binary, or as a prefix 304and postfix unary, simply by putting two entries in the table. While 305parsing the context determines which style is sought. But note that the 306same operator can't be both a postfix unary and a binary, since the parser 307doesn't try to look ahead to decide which ought to be used. 308 309When there's two entries for an operator, both prefix or both postfix (or 310binary), then the first in the table will be used. This makes it possible 311to override an entry in a standard table, for example to change the function 312it calls, or perhaps its precedence level. The following would change mpz 313division from tdiv to cdiv, 314 315 struct mpexpr_operator_t table[] = { 316 { "/", (mpexpr_fun_t) mpz_cdiv_q, MPEXPR_TYPE_BINARY, 200 }, 317 { "%", (mpexpr_fun_t) mpz_cdiv_r, MPEXPR_TYPE_BINARY, 200 }, 318 { (char *) mpz_expr_standard_table, NULL, MPEXPR_TYPE_NEW_TABLE } 319 }; 320 321 322The type field indicates what parameters the given function expects. The 323following styles of functions are supported. mpz_t is shown, but of course 324this is mpq_t for mpq_expr_a, mpf_t for mpf_expr_a, etc. 325 326 MPEXPR_TYPE_CONSTANT void func (mpz_t result); 327 328 MPEXPR_TYPE_0ARY void func (mpz_t result); 329 MPEXPR_TYPE_I_0ARY int func (void); 330 331 MPEXPR_TYPE_UNARY void func (mpz_t result, mpz_t op); 332 MPEXPR_TYPE_UNARY_UI void func (mpz_t result, unsigned long op); 333 MPEXPR_TYPE_I_UNARY int func (mpz_t op); 334 MPEXPR_TYPE_I_UNARY_UI int func (unsigned long op); 335 336 MPEXPR_TYPE_BINARY void func (mpz_t result, mpz_t op1, mpz_t op2); 337 MPEXPR_TYPE_BINARY_UI void func (mpz_t result, 338 mpz_t op1, unsigned long op2); 339 MPEXPR_TYPE_I_BINARY int func (mpz_t op1, mpz_t op2); 340 MPEXPR_TYPE_I_BINARY_UI int func (mpz_t op1, unsigned long op2); 341 342 MPEXPR_TYPE_TERNARY void func (mpz_t result, 343 mpz_t op1, mpz_t op2, mpz_t op3); 344 MPEXPR_TYPE_TERNARY_UI void func (mpz_t result, mpz_t op1, mpz_t op2, 345 unsigned long op3); 346 MPEXPR_TYPE_I_TERNARY int func (mpz_t op1, mpz_t op2, mpz_t op3); 347 MPEXPR_TYPE_I_TERNARY_UI int func (mpz_t op1, mpz_t op2, 348 unsigned long op3); 349 350Notice the pattern of "UI" for the last parameter as an unsigned long, or 351"I" for the result as an "int" return value. 352 353It's important that the declared type for an operator or function matches 354the function pointer given. Any mismatch will have unpredictable results. 355 356For binary functions, a further type attribute is MPEXPR_TYPE_PAIRWISE which 357indicates that any number of arguments should be accepted, and evaluated by 358applying the given binary function to them pairwise. This is used by gcd, 359lcm, min and max. For example the standard mpz gcd is 360 361 { "gcd", (mpexpr_fun_t) mpz_gcd, 362 MPEXPR_TYPE_BINARY | MPEXPR_TYPE_PAIRWISE }, 363 364Some special types exist for comparison operators (or functions). 365MPEXPR_TYPE_CMP_LT through MPEXPR_TYPE_CMP_GE expect an MPEXPR_TYPE_I_BINARY 366function, returning positive, negative or zero like mpz_cmp and similar. 367For example the standard mpf "!=" operator is 368 369 { "!=", (mpexpr_fun_t) mpf_cmp, MPEXPR_TYPE_CMP_NE, 160 }, 370 371But there's no obligation to use these types, for instance the standard mpq 372table just uses a plain MPEXPR_TYPE_I_BINARY and mpq_equal for "==". 373 374Further special types MPEXPR_TYPE_MIN and MPEXPR_TYPE_MAX exist to implement 375the min and max functions, and they take a function like mpf_cmp similarly. 376The standard mpf max function is 377 378 { "max", (mpexpr_fun_t) mpf_cmp, 379 MPEXPR_TYPE_MAX | MPEXPR_TYPE_PAIRWISE }, 380 381These can be used as operators too, for instance the following would be the 382>? operator which is a feature of GNU C++, 383 384 { ">?", (mpexpr_fun_t) mpf_cmp, MPEXPR_TYPE_MAX, 175 }, 385 386Other special types are used to define "(" ")" parentheses, "," function 387argument separator, "!" through "||" logical booleans, ternary "?" ":", and 388the "$" which introduces variables. See the sources for how they should be 389used. 390 391 392User definable operator tables will have various uses. For example, 393 394 - a subset of the C operators, to be rid of infrequently used things 395 - a more mathematical syntax like "." for multiply, "^" for powering, 396 and "!" for factorial 397 - a boolean evaluator with "^" for AND, "v" for OR 398 - variables introduced with "%" instead of "$" 399 - brackets as "[" and "]" instead of "(" and ")" 400 401The only fixed parts of the parsing are the treatment of numbers, whitespace 402and the two styles of operator/function name recognition. 403 404As a final example, the following would be a complete mpz table implementing 405some operators with a more mathematical syntax. Notice there's no need to 406preserve the standard precedence values, anything can be used so long as 407they're in the desired relation to each other. There's also no need to have 408entries in precedence order, but it's convenient to do so to show what comes 409where. 410 411 static const struct mpexpr_operator_t table[] = { 412 { "^", (mpexpr_fun_t) mpz_pow_ui, 413 MPEXPR_TYPE_BINARY_UI | MPEXPR_TYPE_RIGHTASSOC, 9 }, 414 415 { "!", (mpexpr_fun_t) mpz_fac_ui, MPEXPR_TYPE_UNARY_UI, 8 }, 416 { "-", (mpexpr_fun_t) mpz_neg, 417 MPEXPR_TYPE_UNARY | MPEXPR_TYPE_PREFIX, 7 }, 418 419 { "*", (mpexpr_fun_t) mpz_mul, MPEXPR_TYPE_BINARY, 6 }, 420 { "/", (mpexpr_fun_t) mpz_fdiv_q, MPEXPR_TYPE_BINARY, 6 }, 421 422 { "+", (mpexpr_fun_t) mpz_add, MPEXPR_TYPE_BINARY, 5 }, 423 { "-", (mpexpr_fun_t) mpz_sub, MPEXPR_TYPE_BINARY, 5 }, 424 425 { "mod", (mpexpr_fun_t) mpz_mod, MPEXPR_TYPE_BINARY, 6 }, 426 427 { ")", NULL, MPEXPR_TYPE_CLOSEPAREN, 4 }, 428 { "(", NULL, MPEXPR_TYPE_OPENPAREN, 3 }, 429 { ",", NULL, MPEXPR_TYPE_ARGSEP, 2 }, 430 431 { "$", NULL, MPEXPR_TYPE_VARIABLE, 1 }, 432 { NULL } 433 }; 434 435 436 437 438INTERNALS 439 440Operator precedence is implemented using a control and data stack, there's 441no C recursion. When an expression like 1+2*3 is read the "+" is held on 442the control stack and 1 on the data stack until "*" has been parsed and 443applied to 2 and 3. This happens any time a higher precedence operator 444follows a lower one, or when a right-associative operator like "**" is 445repeated. 446 447Parentheses are handled by making "(" a special prefix unary with a low 448precedence so a whole following expression is read. The special operator 449")" knows to discard the pending "(". Function arguments are handled 450similarly, with the function pretending to be a low precedence prefix unary 451operator, and with "," allowed within functions. The same special ")" 452operator recognises a pending function and will invoke it appropriately. 453 454The ternary "? :" operator is also handled using precedences. ":" is one 455level higher than "?", so when a valid a?b:c is parsed the ":" finds a "?" 456on the control stack. It's a parse error for ":" to find anything else. 457 458 459 460FUTURE 461 462The ternary "?:" operator evaluates the "false" side of its pair, which is 463wasteful, though it ought to be harmless. It'd be better if it could 464evaluate only the "true" side. Similarly for the logical booleans "&&" and 465"||" if they know their result already. 466 467Functions like MPEXPR_TYPE_BINARY could return a status indicating operand 468out of range or whatever, to get an error back through mpz_expr etc. That 469would want to be just an option, since plain mpz_add etc have no such 470return. 471 472Could have assignments like "a = b*c" modifying the input variables. 473Assignment could be an operator attribute, making it expect an lvalue. 474There would want to be a standard table without assignments available 475though, so user input could be safely parsed. 476 477The closing parenthesis table entry could specify the type of open paren it 478expects, so that "(" and ")" could match and "[" and "]" match but not a 479mixture of the two. Currently "[" and "]" can be added, but there's no 480error on writing a mixed expression like "2*(3+4]". Maybe also there could 481be a way to say that functions can only be written with one or the other 482style of parens. 483 484 485 486---------------- 487Local variables: 488mode: text 489fill-column: 76 490End: 491