History log of /seL4-l4v-10.1.1/HOL4/src/parse/type_tokens.sml
# d3e399bb 28-Jul-2011 Michael Norrish <Michael.Norrish@nicta.com.au>

Parse decimal fractions, turning them into divisions.

Client theories need to overload the magic string

GrammarSpecials.decimal_fraction_special

to their division constant in order to get this to work.
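
As a rough sketch, assuming the client is a theory of reals whose
division constant is the infix / (both the theory and the type
annotation here are illustrative only), the overload might look like:

val _ = Parse.overload_on
          (GrammarSpecials.decimal_fraction_special,
           ``$/ : real -> real -> real``);

With that in place, a literal such as 3.14 should parse as a division,
presumably 314 / 100 at the relevant type.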


# 8564ac5b 29-Sep-2009 Michael Norrish <Michael.Norrish@nicta.com.au>

More rampant Unicodification.
* the Unicode trace is now *on* by default. You can turn this off by
doing
set_trace "Unicode" 0
In the emacs mode, you can also type M-h C-u. No script file should
need changing as a result of this.

* Type variables can now be printed as Greek lower-case letters. This
is controlled by a trace "Greek tyvars", which is a slave to the
Unicode trace. (This means that if you turn Unicode on, Greek
tyvars is turned on too; and if you turn Unicode off, Greek tyvars
is turned off as well. However, you can adjust the trace directly using
set_trace as well, if you wish.)

The effect of "Greek tyvars" is to allow lower-case Greek letters
alpha to omega to act as aliases for 'a, 'b ... forms. This
aliasing is done in both parsing and printing. The lambda is
*not* adjusted in this way. It'd be surprising if anyone ever
wanted more than alpha, beta, gamma and delta, of course.

This is purely a presentational matter. You may see a Greek alpha
on your screen, but to the kernel, it's still mk_vartype "'a".
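
A tiny illustrative sketch of what the aliasing means in an
interactive session (the explicit set_trace is redundant whenever
Unicode is already on):

val _ = set_trace "Greek tyvars" 1;
val ty1 = ``:α -> β``;
val ty2 = ``:'a -> 'b``;
(* ty1 and ty2 are the same hol_type: Type.compare (ty1, ty2) = EQUAL *)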

* There is one possible backwards incompatibility that I can think
of. The quotation filter had to be adjusted to push it into type
parsing when it saw

``:

followed by a lower-case Greek letter. Previously, it saw such a
thing as the start of a term parse.


# 5c771b3d 15-Jan-2009 Michael Norrish <Michael.Norrish@nicta.com.au>

Modify lexing so that "mixed" tokens are possible if a user puts them
into the grammar. For example, one might do

set_fixity "-->a" (Infix(NONASSOC, 450))

This then causes ``p -->a q`` to be parsed as an infix application of
-->a to p and q. If you wrote ``p --> a q``, the -->a token would not
be seen. Without -->a being in the grammar, it will be lexed as
before, a symbolic token (-->) followed by an alphanumeric one (a).
(Note also that things like --a-->, where the mixing happens in the
middle of the token, are just as possible.)
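
As a small sketch of the sort of declaration this enables (aliasing
-->a to an existing constant with overload_on is purely for
illustration):

val _ = set_fixity "-->a" (Infix(NONASSOC, 450));
val _ = overload_on ("-->a", ``$==>``);  (* say, an alias for ==> *)
val tm = ``p -->a q``;  (* one infix application of -->a, not --> then a *)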

This change also allows traditional alphanumeric identifiers to
combine with Unicode subscripts. Using LaTeX notation, a_1 is now
possible.

The system builds with selftest level 1, so I believe I have preserved
backwards compatibility. (I have also added a test to
src/bool/selftest.sml to check that _1 works properly as a case
expression pattern.)


# 7d307422 22-Aug-2007 Michael Norrish <Michael.Norrish@nicta.com.au>

Update to support ty1[ty2] syntax for array types. This means that a
32-bit word type now prints and parses as :bool[32]. I haven't quite
managed to rebuild all of the system (because I'm about to go to bed),
but I'm pretty confident the change won't break anything serious.
Where it might break things is in input such as

f:num list -> bool [3;4]

The above used to work, but is pretty unlikely. It will now break
because the type parser will grab the [ token that used to "belong"
only to the term level.
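
For illustration: after this change ``x:bool[32]`` constrains x to a
32-bit word type, while input like the example above can presumably
be rescued by parenthesising the constraint, as in

(f:num list -> bool) [3;4]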


# 99110e15 21-Jan-2007 Michael Norrish <Michael.Norrish@nicta.com.au>

Types can now be numerals. Numeral types have as many elements in
them as the number provided. This is thanks to the fact that they
encode into type-operators provided by fcpTheory (bit0 and bit1), and
the unit type. The plan is to have these types used for the indexes
of finite cartesian products, and to further support :elem[num] as the
notation for these "array" types. 32 bit words will thus be
:bool[32]. If people want to save two characters, they can still
abbreviate :bool[32] to :word32 of course. This is not dependent
typing, so writing things like :bool[n] will be an error.

Specific type abbreviations are no longer required because it is all
special-cased in the type parser/printer combination.

Note that :0 is not allowed!
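
A rough illustration (assuming fcpTheory is loaded, and reading bit0
and bit1 as "double" and "double plus one" on element counts):

val six = ``:6``;  (* a type with exactly six elements *)
(* on that reading, 6 encodes as bit0 (bit1 one), i.e. 2 * (2*1 + 1) *)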


# 1e693f02 14-Apr-2005 Michael Norrish <Michael.Norrish@nicta.com.au>

Changed the lexing implementation in src/parse to use our own copy of mllex.
Speed is comparable, the code is conforming SML, and mllex is also something
of a standard for the other implementations. I will eventually remove
all uses of mosmllex and mosmlyac from the distribution, perhaps starting
with tools/Holmake.
Functionality is slightly altered: it is now not possible to have
antiquotations inside comments. If there is hue and cry about this, I can
restore this functionality, though in a rather hackish way. None of the
core distribution relies on being able to do this.


# 52b44cd4 24-Sep-2002 Keith Wansbrough <keith@lochan.org>

Further improve error reporting:

* make_atom failures (e.g., update of nonexistent field) now give
located errors; ditto all friends in Parse_support. Modified
absyn_to_preterm and Parse_support.* to pass locations downward to
accommodate this.

* Fix a bug in parse_term that was causing ranges to be confused for
certain sequences of tokens (was reporting left edge of last token
to right edge of first token, oops!).

* Add text " (possibly missing semicolon)" to error "May not use
record update functions at top level", as this is a common error.

* Fix a small bug in location-handling in type_tokens.

Other changes:

* Remove another printing of the tail of the input (no longer
necessary).

* Various code tidy-ups.

* Reduce Loc_Near (Loc_Near locn) to Loc_Near locn (otherwise they
pile up dramatically!).


# 275a997c 03-Sep-2002 Keith Wansbrough <keith@lochan.org>

This commit contains the initial version of the error-location
reporting code I have been working on over the last couple of weeks.

Basically, error messages will now report the source location they
relate to, enabling users to locate bugs more easily within large
definitions.

So far, errors are reported by the lexer, the term parser, and the
datatype parser. Type parser errors will follow, and (hopefully)
typecheck errors.

src/0/locn is a small location-handling library. locn.locn is the
type of locations, and 'a locn.located is a synonym for 'a *
locn.locn. There is a location, Loc_None, for compiler-generated
text, and another, Loc_Unknown, for text of unknown location.
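
A sketch of roughly what that interface looks like; only the names
mentioned in this log (locn, located, Loc_None, Loc_Unknown, and
Loc_Near from the 24-Sep-2002 entry above) are taken from the text,
and the position-carrying constructor and its payload are
placeholders:

structure locn = struct
  datatype locn =
      Loc of int * int    (* placeholder for a real source-span payload *)
    | Loc_None            (* compiler-generated text *)
    | Loc_Unknown         (* text whose location is unknown *)
    | Loc_Near of locn    (* "near" an already-known location *)
  type 'a located = 'a * locn
end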

src/0/Feedback now includes some functions for generating located errors.

The lexer now returns located tokens, term_token and type_token
maintain the locations, and parse_term maintains them *internally* on
each terminal and nonterminal. When the system is complete, locations
will persist into absyn and preterm (if I understand the code
correctly!) to enable the typechecker to yield located errors.

I would appreciate any feedback:

* Is the performance impact significant?
* Are any error messages incorrect? (there's a lot of
location-manipulation code in parse_term, some of which I don't
fully understand, and surely some bugs)
* Have I broken anything for anyone, or am I likely to?

The actual output format of the location information is provisional
only, and I intend to generate absolute file positions (rather than
fragment offsets) shortly.

--KW 8-)

NB: tags pre-locn-change and post-locn-change cover this commit.


# a1dfcb9b 26-Jul-2002 Michael Norrish <Michael.Norrish@nicta.com.au>

A complete re-implementation of lexing. Speeds things up considerably
through the use of mosmllex. (The example at the bottom of term_tokens.sml
goes through four to five times faster than it used to.) The basic
design is now more stateful than it used to be. There is a new type,
called the qbuf, which provides buffered access to a quotation. The
implementation of parse_type changes completely because it can no longer
do the monadic-parsing trick of backing up over failed parses. The
implementation of the term parser doesn't really change at all.


# 67005981 09-Apr-2001 Michael Norrish <Michael.Norrish@nicta.com.au>

Changes to make the type parser handle thy$tyname syntax. This syntax
can only be used in suffix position: (bool,bool) min$fun, for example.


# 67cbea0e 10-Feb-2001 Konrad Slind <konrad.slind@gmail.com>

Support for a new representation of strings. Minor changes.


# d28425f8 06-Sep-1999 Michael Norrish <Michael.Norrish@nicta.com.au>

Quite radical changes which allow "." to be used as the record
field selection operator. The lexer has changed quite dramatically:
it is no longer responsible for determining whether or not a token
is a "keyword"; instead this is done at the parsing level.

The lexer is also now unable to cope with identifiers that are a mix
of alphabetically acceptable and symbolically acceptable characters.


# 6e2257c8 11-Aug-1999 Michael Norrish <Michael.Norrish@nicta.com.au>

A number of changes, doing away with the old Pretype and Strm modules.
These changes support the new implementation of the ParseDatatype
module, which now refers to the type parser provided in this directory,
meaning that ParseDatatype can see new user-defined type infixes as and
when they become available.


# 0058df84 28-Jun-1999 Michael Norrish <Michael.Norrish@nicta.com.au>

Fixed to bring it up to date with Taupo release 0.
(Changes have come across from parse_branch development.)