History log of /seL4-l4v-master/HOL4/src/parse/selftest.sml
Revision Date Author Comments
# d77d0c68 24-Aug-2020 Michael Norrish <michael.norrish@data61.csiro.au>

Provide a regression test entry-point for pretty-printing output

Occasionally bare strings can be annoyingly ugly


# 792751aa 09-Jul-2019 Michael Norrish <Michael.Norrish@nicta.com.au>

Fix base_lexer's regexp for UTF8 characters less quotation marks


# 5ef6cbbd 08-Jul-2019 Michael Norrish <Michael.Norrish@nicta.com.au>

Add single guillemets ‹...› as delimiters to string-literal o/loads


# 66e2a3a3 05-Jul-2019 Michael Norrish <Michael.Norrish@nicta.com.au>

Rework handling of string literals

Can now do the parsing side of github issue #510, allowing string
literals to be invisibly injected into other types, just as numbers
are. Add to this feature with the ability to use the «...» delimiters
as a separate flavour of string (with the underlying term being the
application of an injection function to a "normal" string literal).

Pretty-printing still needs to be done.


# 17578090 23-Jun-2019 Michael Norrish <Michael.Norrish@nicta.com.au>

Change testutils API to allow for counting of failures

Don't want to necessarily just always instantly abort. Changes die's
type from

string -> 'a

to

string -> unit

This causes a number of knock-on effects.


# 6958db7b 08-Jan-2019 Michael Norrish <Michael.Norrish@nicta.com.au>

Make thyname$++ lex successfully once more


# 30aea1a3 07-Jan-2019 Michael Norrish <Michael.Norrish@nicta.com.au>

More tests for lexing with $s.


# a1cca210 02-Jan-2019 Michael Norrish <Michael.Norrish@nicta.com.au>

Further robustification of lexing around the magic $ character

More tests to narrow the space of expected behaviour


# ffcc4e09 01-Jan-2019 Michael Norrish <Michael.Norrish@nicta.com.au>

Make "quoted variables" ($var$(..)) syntax more robust

In particular, they may contain quotation symbols and
backslash-escaped characters that are not necessarily UTF8 safe.

Thus: “$var$(\172)” parses to a variable whose name (as revealed by
dest_var), is the string containing character 172. The pretty-printer
correctly reverses this.

There are yet two infelicities in all this:
- quoted variables with challenging contents (spaces, quotations) need
to be preceded by whitespace to avoid completely confusing the lexer
- the pretty-printer doesn't safely print back quoted variables in
such contexts, so you get something that won't parse

The latter is easy to fix. The first is not.


# 0e8cc2e4 31-Dec-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Add better tests for non-UTF8 chars in string literals; fix bugs

This closes #632 (again)

Thanks to Brian Campbell for keeping me honest


# dc6976e6 31-Dec-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Allow for string/char literals that aren't valid UTF8

UTF8 machinery was being applied to the contents of HOL's own string
literals, which led to problems if one wrote “"\172"” for example
because the SML level UTF8 machinery objected to \172. The fix is to
make sure that the SML/UTF8 stuff doesn't get called on string literal
tokens at all.

Closes #632


# dc2b4626 05-Dec-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Tests and fixes for proper handling of suffix list-forms


# 56a8aae3 04-Dec-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Failing test case for parsing listspec as suffix


# e00c7e37 05-Dec-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Modify testutils.require_msg and derivatives to return values

It can be useful to get back the result of a passing test for use in
further tests that build on that.

Knock-on effects where it is now not possible to simply List.app such
functions; they must now have their results explicitly ignored.


# 8649983c 29-Nov-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Fix bug in lexing of strings like x'0 followed by symbolic chars


# ae6c8a5b 27-Nov-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Allow \z inside $var$(...) syntax to denote empty string.

This then lets p/printer print $var$(...*\z) and not have to print
something that looks like an end-comment.


# 6037e247 27-Nov-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Treat Unicode lambda specially: it's neither symbolic nor alphabetic

It's not symbolic and so mustn't merge with preceding symbolic
characters (as in the test-case, :=λ is two tokens), and it's not
alphabetic because it mustn't merge with following variables names, as
in λx.

(OTOH, backslash is clearly symbolic: it appears in /\ and \/ both.)


# 2f084c8f 27-Nov-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Fix error in handling of 0xF when it follows a left paren


# a6f74cd4 26-Nov-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Rework lexing & printing to support new variable syntaxes

First, support $var$(...) syntax, allowing otherwise unparseable
variables to get into the kernel.

Also support a shorter syntax to encode variables with many trailing
prime characters. Now all of x'4', x'⁴' and x'''' parse to the kernel
variable that really has 4 prime characters after the x. The $var$
syntax is used to print kernel variables that have the first two forms
as their "real" names.

Lexing (on the example in the comment at the end of term_tokens.sml at
least) may be roughly a third faster.

Closes #614
Closes #613


# 311d2865 22-Nov-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Implement a toString function for term_tokens type

Use this in src/parse's selftest to get better output when term_token
functions fail.


# 17990a1d 08-Nov-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Implement "input only type abbreviations"

Closes #599


# 654be2c5 24-Oct-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Tweak testutils

- output on failures now prettier
- more utility functions
- make 'a testresult same as 'a Exn.result


# 793c19de 17-Sep-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Get chained Unicode negations to print properly

Selftests for both the pretty-printing behaviour and the underlying
token-splitting function that this depends on.


# b440d91d 22-Aug-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Fix ParseDatatype bug when parsing Datatype syntax

One test case would be

foo = Con1 bool; bar = Con2 foo (bool -> bool);

which didn't cope with the first semicolon, which is next to the bool
argument without an intervening space.

The previous handling of the "feature" whereby atomic types don't need
to be separated from other arguments involved a recursive call to the
type parser on a specially constructed sub-buffer. This was
disgusting and broken. Now the code just calls a parse_atom function
made visible from inside the type parser.

Add some illustrative test-cases.

Thanks to Oskar Abrahamsson for the bug report.


# 6c7720fd 03-Jun-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Fix type-grammar pretty-printing; include regression test


# 10f0150e 03-Jun-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Implement Haskell-style $ as infix function application operator

With documentation and some simple test cases.


# f179922a 14-May-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Lex tokens consisting of only $-signs as identifiers

This adds behaviour to the system: previously it was impossible to get
anything out of quotations containing bare $-signs. Now these are
lexed as term identifiers. (The type parser was already capable of
lexing such things sensibly.)

I am strongly inclined to now use single dollar-signs as in Haskell,
as an alternative form of adding parentheses. For example,

f $ g $ x + 1

is a lighter way of writing f (g (x + 1)). The effect is achieved by
making $ a right-associative infix with precedence looser than normal
function application.

Closes #512


# 798ac6f3 10-May-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Complete implementation of new API for declaring+using monads

Monadsyntax in general (the do..od notation) can be turned on and off.
In addition, specific monad instances can be declared and subsequently
enabled by just calling something like

monadsyntax.enable_monad "option"

Having state_transformerTheory as an ancestor turns on monadsyntax, and
enables the state monad. (Enabling a monad causes a bunch of calls to
overload to occur; disabling a monad undoes those calls.)


# f52ec0a0 18-Mar-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Fix p/printers and selftests to agree with one another again


# 09c5dc00 14-Mar-2018 Michael Norrish <Michael.Norrish@nicta.com.au>

Up to and boolTheory and slightly beyond


# 1e968be0 04-Oct-2017 Michael Norrish <Michael.Norrish@nicta.com.au>

Implement find-replace for strings

Substitutions are applied to longest left-most matches in preference.


# 99630098 22-Aug-2017 Michael Norrish <Michael.Norrish@nicta.com.au>

Improve datatype parsing, including ; as terminator inside records

More tests included.


# a3898a0d 13-Aug-2017 Michael Norrish <Michael.Norrish@nicta.com.au>

Fix remaining listsyntax bugs (those shown up in existing test-cases)

Next and final step is to tweak let pretty-printing.


# 7c8db103 10-Aug-2017 Michael Norrish <Michael.Norrish@nicta.com.au>

Reproduce pred_set selftest failure in src/parse selftest


# 990ce29e 10-Aug-2017 Michael Norrish <Michael.Norrish@nicta.com.au>

Handle infix cons-operators (like pred_set$INSERT) better

Still can't get past the self-test in src/pred_set/src

Term-grammars now maintain their own max-timestamp so that successive
rules' timestamps are guaranteed to increase. (Previously, the new
timestamp just got made bigger than the timestamps associated with
"related" grammar rules. This required a scan of all the grammar's
rules, and a careful definition of what it was to be related.)


# 08ea2f7d 10-Aug-2017 Michael Norrish <Michael.Norrish@nicta.com.au>

Fix some problems in printing of list-forms

Add a number of test-cases, some of which still fail


# ba84673c 10-Aug-2017 Michael Norrish <Michael.Norrish@nicta.com.au>

Parse normal list-forms, including GSPEC, but fail to print them


# 4ed22e30 10-Aug-2017 Michael Norrish <Michael.Norrish@nicta.com.au>

Parsing of lets (even with ;'s) works; absyn post-processing doesn't


# ede4d6e9 09-Aug-2017 Michael Norrish <Michael.Norrish@nicta.com.au>

src/parse compiles but parser broken.

Completed change to remove LISTRULE form from grammar. Instead, all
list syntax is handled with embedded ListForms within normal rules.


# 7944f9bc 09-Aug-2017 Michael Norrish <Michael.Norrish@nicta.com.au>

Write a function to find possible "list" reductions within a phrase

Will need lots of test-cases as it's painfully complicated


# e8895bc2 08-Aug-2017 Michael Norrish <Michael.Norrish@nicta.com.au>

WIP for change to list syntax handling.

Very broken at the moment


# 744f9a3b 21-Apr-2016 Michael Norrish <Michael.Norrish@nicta.com.au>

Fix many selftests to have prettier output


# d2d3cc44 02-Nov-2014 Michael Norrish <michael.norrish@nicta.com.au>

Pretty-printer now annotates consts with their real names and types.

In "complicated" overload situations, such as the overloads for NOTIN
and <>, the annotation can't point to a "real" underlying constant,
but does still provide the type for the emacs tool-tips.

In passing, also create a new annotation form, the SymConst as well as
the Const. This is done for the TeX backend's benefit, so that it can
avoid treating things like + the same as normal identifiers. This
works well for math-mode munging where you want "+" to map to $+$, but
want "INSERT" to map to \textsf{INSERT}.

Closes #39


# 7d77e3c7 03-Aug-2011 Michael Norrish <Michael.Norrish@nicta.com.au>

Minor new regression test for lexing decimal fractions.


# c5b67214 28-Jul-2011 Michael Norrish <Michael.Norrish@nicta.com.au>

Fix some fiddly lexing problems to do with the new decimals syntax.


# 19456733 28-Jul-2011 Michael Norrish <Michael.Norrish@nicta.com.au>

Fix tokenizing of decimals when they're concatenated with various symbols.

Also allow decimals to include eye-helping underscores.


# 752c22cd 28-Jul-2011 Michael Norrish <Michael.Norrish@nicta.com.au>

Add some term token regression tests.


# d3e399bb 28-Jul-2011 Michael Norrish <Michael.Norrish@nicta.com.au>

Parse decimal fractions, turning them into divisions.

Client theories need to overload the magic string

GrammarSpecials.decimal_fraction_special

to their division constant in order to get this to work.


# 808c0442 09-Jun-2011 Michael Norrish <Michael.Norrish@nicta.com.au>

More lexing tests.


# 23031709 09-Jun-2011 Michael Norrish <Michael.Norrish@nicta.com.au>

Add tests for the base_lexer and its bogus handling of comments.


# 407b5983 09-Jan-2011 Ramana Kumar <ramana.kumar@gmail.com>

Replace add_ann_string with a more general add_xstring

Allows a custom size and/or annotation to be given to a string for
pretty-printing. For code using add_ann_string, you can get add_xstring
instead and define
fun add_ann_string s = add_xstring {s=s,ann=SOME ann,sz=NONE}
for an easy upgrade.


# 71f8aa27 20-Nov-2009 Thomas Tuerk <Thomas.Tuerk@cl.cam.ac.uk>

cleanup and code to produce wrong nestings of begin_style / end_style added


# ddfd4fe1 20-Nov-2009 Thomas Tuerk <Thomas.Tuerk@cl.cam.ac.uk>

Fix of some trouble with the vt100_terminal. I used the 88 color extension. This extension is for example not present in the terminals on Macs. Without this extension, the terminals displayed bound and free variables as blinking(!). I removed the usage of the extension and hope it is working now.

In order to see, how the terminal actually displays annotations and styles, I added a selftest. If you are using a vt100 terminal, it might be interesting to execute this selftest manually several times and play around with extensions like the 88-color one. However, it should hopefully work on old terminals without any extensions now.