Module System Cleanup
Author: Joachim Schimpf
Last update: 02/2003
This information describes changes to the module system that were implemented
in release 5.0.
Topics
-
Simplify the implementation and semantics of tools
-
sort out some problems related to "lazy import" and clashes
-
implement reexport
-
implement better semantics for :/2, i.e. specifying lookup module rather
than definition module
-
a related change is to get rid of the null-module descriptors that are
currently used for :/2
-
allow [m1,m2]:Goal as a shorthand for m1:Goal, m2:Goal
-
Possibly split procedure descriptors into
  -
  definition descriptor (at most 1 per module)
  -
  visibility descriptor (at most 1 per module)
  -
  qualified reference descriptor (possibly many per module)
-
Get rid of protected-property, declared-flag
-
Simplify the module system by removing the interface/body separation
-
Change default visibility for containers (record,setval) to local
-
Towards a multi-language system
-
Get rid of "global" visibility
Differences for the user
-
:/2 specifies the visibility module, and accepts a list of modules on the
left hand side
-
visibility changes are not allowed (except local->export and import->reexport)
-
export in module-interface and body are no longer treated differently -
in fact the interface-body separation is made obsolete
-
lazy import does not cause visibility clashes; these occur only when an
ambiguously imported name is actually used. Ambiguity is resolved via explicit
import.
-
globals are deprecated and replaced by exports that are imported automatically
when used
-
setting a handler now automatically exports the handler
-
Definition before use: When an exported predicate is being used, it (or
the whole module that defines it) has to be imported first (via import,
use_module, reexport or import/from).
The implementation of :/2
Originally, :/2 was called call_explicit and defined as referring to the
definition module. This does not fit well into the overall scheme, since
everything else is based on the notion of visibility.
The original implementation used non-standard procedure descriptors
that did not belong to a module ("null" module) and were never deallocated.
Several bugs were due to these "null"-module descriptors.
The main problem with letting :/2 specify the lookup module instead
of the definition module is that this could potentially create lookup chains,
which are costly to implement. However, when only exported predicates can
be accessed this way, the visible predicate is the same as the defined
one, so the problem of chains does not occur. It is also possible,
and potentially useful, to apply the exported/global restriction only to
compiled calls while allowing unlimited access for metacalls: this creates
no problems with chains (since visibility is resolved at call time) and
simplifies precise metaprogramming using :/2 and @/2.
New implementation: The "null" descriptors are replaced by qualified-access-descriptors.
They refer from a use in one module to a definition in another module.
This is similar to an import-descriptor, but while there can be only one
import descriptor for a particular name (specifying the visible one), there
may be many qualified-access-descriptors.
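
The difference between the two descriptor kinds can be illustrated with a small Python model. This is only a sketch: the class name ModuleTable, its fields and its methods are hypothetical and do not correspond to the kernel's actual data structures. It shows that a second import of the same name from a different module is rejected, while any number of qualified references can coexist.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleTable:
    """Hypothetical per-module descriptor table (illustrative names only)."""
    name: str
    # At most one import descriptor per predicate: functor -> home module.
    imports: dict = field(default_factory=dict)
    # Any number of qualified-access descriptors: (functor, home module).
    qualified: set = field(default_factory=set)

    def add_import(self, functor, home):
        existing = self.imports.get(functor)
        if existing is not None and existing != home:
            raise ValueError(f"{functor} is already imported from {existing}")
        self.imports[functor] = home

    def add_qualified(self, functor, home):
        # A reference of the form home:Goal never competes with the import
        # descriptor or with other qualified references.
        self.qualified.add((functor, home))


cm = ModuleTable("cm")
cm.add_import(("p", 1), "m1")     # p/1 as visible in cm
cm.add_qualified(("p", 1), "m1")  # m1:p(_) used in cm
cm.add_qualified(("p", 1), "m2")  # m2:p(_) used in cm as well
```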
Simplifying consistency/redefinition
The original implementation allowed certain dynamic redefinitions, e.g.
tool->nontool. The requirement for a redefinition to be allowed is that
a call compiled under the original assumption is still valid in the redefined
case. As a fortunate side effect, this policy also solved the following
problem that occurs when recompiling a module that exports a predicate
to which a call has already been compiled elsewhere:
    :- module_interface(m).
    :- export p/1.
    % at this point p/1 is exported but not yet known as a tool
    :- tool(p/1, p/2).
    :- begin_module(m).
    p(X,Y) :- ...
Assume m has been compiled, and a call to p/1 has then been compiled using
the tool calling convention. When module m is now recompiled, the export directive
exports a non-tool, which fortunately is compatible with the tool call, but this
is just a lucky special case.
Inter-module consistency checks
To solve the above problem, the actual export (i.e. the updating of the
corresponding import descriptors and the consistency check) can be delayed.
Actual export is done:
-
at export time only if already code-defined
-
at code-definition time otherwise
Note that we introduce a new descriptor state here (descriptor exported
but corresponding imports not checked or updated) that didn't exist before.
An additional flag TO_EXPORT is introduced to indicate this state, which
is halfway between LOCAL and EXPORT.
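
The delayed export can be sketched as follows. This is a minimal Python model under stated assumptions: the Descriptor class and the functions publish, export and define_code are hypothetical names, and publish merely stands in for updating the import descriptors and running the consistency check.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    scope: str = "LOCAL"        # LOCAL or EXPORT
    to_export: bool = False     # export requested, importers not yet checked
    code_defined: bool = False


def publish(desc, log):
    """Stand-in for updating import descriptors and checking consistency."""
    desc.scope = "EXPORT"
    desc.to_export = False
    log.append("published")


def export(desc, log):
    if desc.code_defined:
        publish(desc, log)      # actual export happens at export time
    else:
        desc.to_export = True   # defer until the code is defined


def define_code(desc, log):
    desc.code_defined = True
    if desc.to_export:
        publish(desc, log)      # actual export happens at code-definition time
```

In this model a descriptor that has been exported but not yet code-defined stays LOCAL with TO_EXPORT set, so a later tool declaration is known before importers are checked.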
Intra-module consistency checks
Once the local descriptor has been referenced, every single declaration
must make a consistent change. The following table indicates what changes
are allowed.
Predicate property   | Change when already referenced (call compiled) | Change when already code defined
code                 | yes                                            | yes
modes & uniftype     | no                                             | no
inline trans         | no                                             | yes
adding tool property | no                                             | no
debugged             | yes                                            | no
spy,trace,skip,start | yes                                            | yes
parallel             | yes                                            | no
demon                | yes                                            | no
waking prio          | yes                                            | yes
calling convention   | no                                             | no
dynamic              | yes                                            | no
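
The table can be read as a lookup, sketched here in Python. The entries are transcribed from the table above; the function name change_allowed is hypothetical, and combining the two columns by conjunction when a predicate is both referenced and code defined is an assumption, not something the table states.

```python
# property -> (allowed once referenced, allowed once code defined)
ALLOWED = {
    "code": (True, True),
    "modes & uniftype": (False, False),
    "inline trans": (False, True),
    "adding tool property": (False, False),
    "debugged": (True, False),
    "spy,trace,skip,start": (True, True),
    "parallel": (True, False),
    "demon": (True, False),
    "waking prio": (True, True),
    "calling convention": (False, False),
    "dynamic": (True, False),
}


def change_allowed(prop, referenced, code_defined):
    """Return True if changing prop is consistent with the descriptor state."""
    ok_when_referenced, ok_when_defined = ALLOWED[prop]
    # Assumption: a change must be permitted by every condition that holds.
    return (ok_when_referenced or not referenced) and \
           (ok_when_defined or not code_defined)
```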
Protected procedures
In the previous implementation, the protect-mechanism was used to enforce
that redefinitions of predicates that were treated specially by the compiler
were made beforehand, i.e. before any calls had been compiled. This should
now be taken care of by the general mechanism, i.e. the restrictions on
changes of the calling convention when calls have already been compiled.
The protected-property has therefore been removed.
For a small number of control constructs, i.e.
    ,/2 ;/2 ->/2 -?-> :/2 true/0
it would be good to forbid redefinition altogether. Otherwise all code that
analyses goals (the compiler, tr_goals/3 and the like) would have to check
whether those constructs have been redefined. Forbidding redefinition simply
means that these goals can be relied upon even without explicit sepia_kernel qualification.
Import/lazy import
It probably makes sense to have both lazy and immediate import, with the
following meaning:
-
When a module is (lazily) imported (use_module(m) or import(m)) no checks
are done, the fact is just memorized. In particular, it is not an error
(or warning) when two imported modules export the same names.
-
Only when the name is referenced for the first time does the import link
get established: it is checked whether there is a unique import, and ambiguity
is an error.
-
Explicit import (e.g. import p/3 from m) immediately creates
the import link.
-
Import links could previously be removed by abolish. This is no longer
necessary to resolve ambiguous imports, which makes it possible to restrict the
functionality of abolish and get rid of unwanted dynamicity in the interfaces.
-
When a predicate is referenced before having been imported in any way:
  -
  we assume a default calling convention and default properties for the predicate.
  If a later import proves not to be compatible with these assumptions, this
  is an error.
  -
  consequently, imports should always textually precede any call of a predicate.
  -
  the same applies to a use->local sequence (to allow one-pass processing)
Necessary changes:
-
when a module interface is imported, exports must no longer be mapped into
import-froms, i.e. get rid of the distinction between exporting in the
interface or in the body.
-
ambiguity must be reported when it happens during lazy import (make/will_lazy_import),
but not already at the import-module directive.
-
the former notion of global can be treated as lazy import from sepia_kernel
(actually now eclipse_language)!
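
The lazy import scheme described in this section can be sketched in Python. This is a simplified model, not the kernel implementation: the Module class, its fields and its method names are hypothetical, reexport is ignored, and a reference to an unknown predicate simply returns None instead of creating a descriptor with default assumptions.

```python
class AmbiguousImport(Exception):
    pass


class Module:
    def __init__(self, name, exports=()):
        self.name = name
        self.exports = set(exports)
        self.local = set()   # locally defined predicates
        self.lazy = []       # modules imported as a whole
        self.links = {}      # established import links: functor -> Module

    def import_module(self, other):
        # Lazy import: the fact is only memorised, no clash check is done.
        self.lazy.append(other)

    def import_from(self, functor, other):
        # Explicit import: the link is created immediately.
        self.links[functor] = other

    def resolve(self, functor):
        """Called on first reference; establishes the import link."""
        if functor in self.local:
            return self      # a local definition hides any lazy import
        if functor not in self.links:
            candidates = [m for m in self.lazy if functor in m.exports]
            if len(candidates) > 1:
                raise AmbiguousImport(functor)
            if not candidates:
                return None
            self.links[functor] = candidates[0]
        return self.links[functor]


m1 = Module("m1", {("p", 1)})
m2 = Module("m2", {("p", 1), ("q", 0)})
cm = Module("cm")
cm.import_module(m1)
cm.import_module(m2)             # both export p/1: no error at this point
```

In this model the clash on p/1 only surfaces when cm.resolve(("p", 1)) is called, and it disappears after cm.import_from(("p", 1), m1).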
Getting rid of globals
In ECLiPSe prior to 5.0, predicates could be declared global.
This visibility class has been removed in order to simplify things.
The main use of global predicates was for the ECLiPSe built-in predicates,
which were automatically visible everywhere (unless hidden by a local or
imported definition - this was used in compatibility packages). The new
scheme is as follows:
-
By default, the module eclipse_language is implicitly (lazily)
imported into every new module. This is the way the built-ins are provided.
-
The set of predicates reexported from eclipse_language defines the set
of builtins (previously this was the set of global declarations).
-
Local redefinitions still hide this (lazy) import.
-
Compatibility packages that provide alternative implementations of builtins:
the import from eclipse_language and the import from a compatibility module
are now equivalent. Possible solutions:
  -
  The importer has to resolve the conflict
  -
  The compatibility library has to be used instead of the kernel (via
  module/3, create_module/3)
Alternative ways to deal with this problem (not implemented):
-
The kernel-exports are somehow declared "weak", the library exports "strong"
- but this approach doesn't nest
-
Partial ordering: if something is visible via one import and hidden via
another, it should be hidden (see algorithm below)
What is implemented in 5.0 is: no automatic resolution, except arithmetic
comparisons, which are explicitly resolved in favour of eclipse_language
by the ambiguity-handler. Compatibility packages should be used instead
of eclipse_language rather than in addition.
Resolution of some import-ambiguities (not implemented):
    vis := next_imported()
    while (clash := next_imported())
    {
        if (vis visible in definition module of clash)
            % hidden
            vis := clash
        else if (clash visible in definition module of vis)
            ;
        else
            error(ambiguous_import)
    }
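
A runnable Python rendering of this (unimplemented) algorithm might look as follows. It is a sketch under assumptions: "X visible in the definition module of Y" is modelled by a hypothetical table visible that maps a module to the set of definition modules whose version of the name that module can itself see, and the module name compat is invented for the example.

```python
class AmbiguousImport(Exception):
    pass


def resolve_clash(candidates, visible):
    """Pick the definition module that wins among clashing imports.

    candidates: definition modules exporting the same name, in import order.
    visible:    hypothetical table, module -> set of definition modules whose
                version of the name is visible from inside that module.
    """
    it = iter(candidates)
    vis = next(it)
    for clash in it:
        if vis in visible.get(clash, ()):
            vis = clash      # clash's module saw vis and replaced it: hidden
        elif clash in visible.get(vis, ()):
            pass             # vis's module saw clash and replaced it: keep vis
        else:
            raise AmbiguousImport((vis, clash))
    return vis


# A compatibility module that imports eclipse_language and redefines a builtin.
visible = {"compat": {"eclipse_language"}}
winner = resolve_clash(["eclipse_language", "compat"], visible)
```

With this table the compatibility module wins regardless of import order, while two unrelated modules exporting the same name still raise an error.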
Tools
Tools are predicates which get a caller module argument added when called.
The current implementation allows all kinds of redefinitions, which
is probably excessive.
Suggested changes, not all done in release 5.0 yet:
-
disallow tool/1 and always require the tool body to be specified when declaring
a tool. This allows the tool->body mapping to be done at compile time, making
it possible to inline the calls properly.
    % tool definition in module tm:
    :- tool(t/3, tb/4).
    % Called in module cm:
    t(a,b,c)     ----->  tm:tb(a,b,c,cm)
    t(a,b,c)@xm  ----->  tm:tb(a,b,c,xm)
The qualification with :/2 is necessary because the tool body might not
be visible in module cm.
Note that this transformation can, for compiled calls, be done by inlining:
    :- inline(t/3, t_t/3).
    t_t(t(A,B,C), tm:tb(A,B,C,M), M).
For metacalls, the same transformation must be done, probably in the emulator.
For delay/waking, the mapping should happen at delay time (i.e. the
tool body is delayed instead of the original call), so the waking code
does not have to deal with this complication and can be simpler and more
efficient.
Also, the compiler does not have to deal with tool calls, they are
all removed (replaced by body calls) by inlining.
All this means that tool interfaces can never get called, and therefore
do not need any code generated. The code field in the procedure descriptor
can be used to hold a pointer to the descriptor of the tool body.
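
The tool-to-body mapping described above can be modelled in Python as a rewrite on goal terms. This is an illustrative sketch only: goals are represented as (name, args) tuples, the TOOLS table and the function expand_tool_call are hypothetical, and the table entry mirrors the tool(t/3, tb/4) declaration in module tm used in the example.

```python
# (name, arity) of tool interface -> (defining module, body name)
TOOLS = {("t", 3): ("tm", "tb")}


def expand_tool_call(goal, caller_module, context_module=None):
    """Replace a tool interface call by a qualified call to the tool body.

    goal is (name, args); context_module models an explicit Goal@Module.
    """
    name, args = goal
    entry = TOOLS.get((name, len(args)))
    if entry is None:
        return goal                      # not a tool: leave unchanged
    def_module, body = entry
    module_arg = caller_module if context_module is None else context_module
    # Qualified with :/2 because the body may not be visible in the caller.
    return (":", def_module, (body, args + (module_arg,)))


plain = expand_tool_call(("t", ("a", "b", "c")), "cm")
with_context = expand_tool_call(("t", ("a", "b", "c")), "cm", "xm")
```

Here plain corresponds to tm:tb(a,b,c,cm) and with_context to tm:tb(a,b,c,xm).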
To do: define restrictions on visibility of tool body.
-
Does it have to be defined in the same module as the tool interface,
or just visible? (just visible from an implementation standpoint, but the
restriction may make sense for programming discipline?)
-
Is it automatically exported/imported with the interface? (exported probably
yes, imported probably no).
-
Most (all?) flag settings on the interface should be propagated to the
body.
-
The tool-to-body link cannot be changed (after the tool has been referenced),
i.e. the tool interface cannot be redefined; the body, however, can.
-
For debugging etc, it would be nice if there was a one-to-one mapping between
tool interface and body, so that body calls could be mapped back to interface
calls, and the body predicates would almost never be visible.
Restrictions on redefinition
In the previous implementation lots of dynamic redefinitions were allowed.
This is problematic when calls are compiled and properties of the callee
are used in the process, like
-
modes
-
tool property
-
inlining
-
external calling convention
It is safer not to allow redefinitions and require forward declarations
for everything:
-
Within a module, forward declarations can be avoided by going to multipass
compilation, at least within a single file.
-
For interactive, incremental compilation it might be enough to have a simple
default assumption (prolog convention, local, general mode, no tool, no
inlining, etc).
-
For references across module boundaries, the exporting module interface
must provide all the information and must be known at compile time.
Abolish
That leaves the problem of the abolish-primitive: There is no way to inform
the possibly inlined calls of the abolishment. Maybe the semantics could
be restricted to a removal of the clauses while keeping all other properties.
This is tantamount to allowing redefinition with all the other properties
being kept the same. This is implemented in release 5.0.
Erase_module
Recompiling a module: Currently erase+compile. This leaves all the referencing
descriptors around and they are updated when the module is recompiled and
the predicate reappears.
Time of visibility resolution
What about metacall access and non-call access (property lookup etc.)? In
other words, should the first metacall to an (unambiguously, lazily) imported
predicate fix the import link, or should it be resolved afresh on every
metacall? Since metacalls contribute to the semantics, they should freeze
the visibility, but that is not necessarily desirable if the interactive
toplevel is to remain easy to use. In 5.0, the first metacall freezes the
visibility, i.e. metacalls behave like compiled calls.
Procedure Descriptors
Current pri descriptor fields and their usage:
-
code
  -
  wam code address, or emulator builtin index
-
next_proc
  -
  finding visible pred in a module
  -
  dict-gc
  -
  to copy the fields to all with a certain mod_ref
  -
  unlinking in erase_module
  -
  lazy import
-
next_in_mod
  -
  to import all exported ones to free in erase_module
  -
  current_predicate
-
mod_def
  -
  the module where the descriptor belongs
-
mod_ref
  -
  the module that the descriptor refers to, i.e. where the definition is
-
did
  -
  functor of the predicate (name/arity)
-
flags
-
mode
  -
  3 bits per argument in a 32 bit word, modes for higher arguments ignored
-
trans_function
  -
  did of the transformation predicate (inlining)
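
The layout of the mode field (3 bits per argument in a 32 bit word) implies that only the first 10 arguments can carry a mode. The following Python sketch illustrates the packing. The bit order, the numeric mode codes and the default value 0 for higher arguments are assumptions made for the example, not the kernel's actual encoding.

```python
BITS_PER_ARG = 3
WORD_BITS = 32
MAX_ARGS = WORD_BITS // BITS_PER_ARG   # 10 arguments fit, 2 bits stay unused
ARG_MASK = (1 << BITS_PER_ARG) - 1


def pack_modes(modes):
    """Pack per-argument mode codes (0..7) into one 32 bit word."""
    word = 0
    for i, mode in enumerate(modes[:MAX_ARGS]):   # higher arguments ignored
        if not 0 <= mode <= ARG_MASK:
            raise ValueError(f"mode code out of range: {mode}")
        word |= mode << (i * BITS_PER_ARG)
    return word


def mode_of(word, arg_index, default=0):
    """Return the mode code of the argument at arg_index (0-based)."""
    if arg_index >= MAX_ARGS:
        return default                            # no mode information stored
    return (word >> (arg_index * BITS_PER_ARG)) & ARG_MASK


word = pack_modes([1, 5, 7])
```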
Descriptor types
Defined in that module:
  LOCAL
  EXPORT
Defined elsewhere (exported or reexported there):
  IMPORT
  IMPEXP
Unknown:
  DEFAULT
Qualified access (exported or reexported elsewhere):
  QUALI
Descriptor states:
scope   | module_ref    | TO_EXPORT | NOREFERENCE | CODE_DEFINED | other properties
DEFAULT | D_UNKNOWN     | 0         | 0/1         | 0            | any
LOCAL   | == module_def | 0/1       | 0/1         | 0/1          | any
EXPORT  | == module_def | 0         | 0/1         | 0/1          | any
IMPEXP  | home module   | 0         | 0/1         | 0/1          | any
IMPORT  | home module   | 0         | 0/1         | 0/1          | any
QUALI   | home module   | 0         | 0           | 0/1          | any
State changes - previous situation
from\to | LOCAL | EXPORT | GLOBAL | IMPORT | DEFAULT
DEFAULT | ok    | ok     | ok     | ok     | -
LOCAL   | -     | ok     | ok     | error  |
EXPORT  | ok    | -      | ok     | error  |
GLOBAL  | ok    | ok     | -      | ok     |
IMPORT  | error | error  | error  |        |
State changes - new behaviour
We accept repeated (or weaker) declarations silently
from\to | LOCAL | EXPORT | IMPEXP | IMPORT | DEFAULT
DEFAULT | ok    | ok     | ok     | ok     |
LOCAL   | nop   | ok     | error  | error  |
EXPORT  | nop   | nop    | error  | error  |
IMPEXP  | error | error  | nop(s) | nop(s) |
IMPORT  | error | error  | ok(s)  | nop(s) |
(s) - if imported from same module as before
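
The new state-change table can be expressed as a small transition function, sketched here in Python. The table entries are transcribed from above. The function name declare is hypothetical, and treating an (s) entry as an error when the module differs from the previous one is an assumption.

```python
NEW_TRANSITIONS = {
    "DEFAULT": {"LOCAL": "ok", "EXPORT": "ok", "IMPEXP": "ok", "IMPORT": "ok"},
    "LOCAL": {"LOCAL": "nop", "EXPORT": "ok",
              "IMPEXP": "error", "IMPORT": "error"},
    "EXPORT": {"LOCAL": "nop", "EXPORT": "nop",
               "IMPEXP": "error", "IMPORT": "error"},
    "IMPEXP": {"LOCAL": "error", "EXPORT": "error",
               "IMPEXP": "nop(s)", "IMPORT": "nop(s)"},
    "IMPORT": {"LOCAL": "error", "EXPORT": "error",
               "IMPEXP": "ok(s)", "IMPORT": "nop(s)"},
}


def declare(scope, home, new_scope, new_home):
    """Return the (scope, home) pair after a visibility declaration."""
    action = NEW_TRANSITIONS[scope][new_scope]
    if action.endswith("(s)"):
        # Only valid if imported from the same module as before (assumption:
        # a different module is an error).
        if new_home != home:
            raise ValueError(f"{scope} from {home} clashes with {new_home}")
        action = action[:-3]
    if action == "error":
        raise ValueError(f"cannot change {scope} to {new_scope}")
    if action == "nop":
        return scope, home       # repeated or weaker declaration: ignored
    return new_scope, new_home
```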
Note on reexport
reexport could be handled by inlined indirection:
    :- reexport p/3 from m1.
is functionally equivalent to
    :- export p/3.
    p(A,B,C) :- m1:p(A,B,C).
which can be made efficient by adding inlining
    :- inline(p/3,t_p/2).
    t_p(P3, m1:P3).
This is also related to having use_module in a module interface, which
is similar to re-exporting. The difference between that and reexport is
the definition module of the indirectly imported predicate.
Removal of the module_interface section
Pre-5.0, modules could be partitioned into a module_interface and a module
body (begin_module). This static sectioning has been dropped. Without the
module_interface section, the following queries effectively comprise a
module's interface:
These directives record themselves as the interface of the module that
contains them. They do not have to appear in any particular section.
For backward compatibility, we interpret certain directives in an old-style
module_interface by transformation into an equivalent export/reexport/global
directive. Unfortunately, this is not exactly possible for occurrences
of use_module,lib,import in module_interfaces: they almost map to the new
reexport directive, but the semantics is subtly different. We therefore
support having use_module and import in recorded interfaces, although it
is only possible to create them by using obsolete features.
Directive in old :- module_interface | Occurs in recorded interface as
op(A,B,C)                            | export op(A,B,C)
set_chtab(A,B)                       | export chtab(A,B)
define_macro(A,B,C)                  | export macro(A,B,C)
set_flag(syntax_option, X)           | export syntax_option(X)
meta_attribute(A,B)                  | global meta_attribute(A,B)
use_module(M)                        | use_module(M) - almost reexport(M)
lib(M)                               | use_module(library(M)) - almost reexport(library(M))
import(Preds from M)                 | import(Preds from M) - almost reexport(Preds from M)
import(M)                            | import(M) - almost reexport(M)
For any other directives in a module_interface, we issue a warning.
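
The backward-compatibility mapping can be sketched as a Python function over directive terms. This is a model of the table above, not the actual implementation: directives are represented as a name plus a tuple of arguments, the function name recorded_form is hypothetical, and returning None stands for the case in which a warning is issued.

```python
EXPORTED_AS = {("op", 3): "op", ("set_chtab", 2): "chtab",
               ("define_macro", 3): "macro"}
KEPT_AS_IS = {("use_module", 1), ("import", 1)}


def recorded_form(name, args):
    """Map an old module_interface directive to its recorded interface form.

    Returns a (directive, argument) pair, or None if the directive has no
    interface equivalent (the caller would issue a warning).
    """
    key = (name, len(args))
    if key in EXPORTED_AS:
        return ("export", (EXPORTED_AS[key], args))
    if key == ("set_flag", 2) and args[0] == "syntax_option":
        return ("export", ("syntax_option", args[1:]))
    if key == ("meta_attribute", 2):
        return ("global", ("meta_attribute", args))
    if key == ("lib", 1):
        return ("use_module", (("library", args),))
    if key in KEPT_AS_IS:
        return (name, args)      # recorded unchanged, "almost" a reexport
    return None
```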
The source-processor problem
How do we solve the source-processor problem? We execute certain export/local
directives, i.e. the ones that affect the syntax:
-
op/3
-
struct/1
-
macro/3
-
chtab/2
-
syntax_option/1
-
meta_attribute/2
Apart from that, we also have to do all imports (since they may define
necessary syntax). For more details, see library(source_processor).
Autoload
The existing autoload feature is messy because
-
It requires the autoloaded predicates to be global
-
It creates the module where the autoloaded predicates are defined
Only one of these two things should be done, and globality should not be
used at all. I think there are two conceptually different features that
could be called "autoloading":
-
A development environment tool that comes into action when an undefined
predicate is called. It could find (possibly multiple) libraries that define
the missing predicate and offer to load and import one of them. This might
not work when a call is already compiled (calling convention).
-
A runtime mechanism that lazily loads bulky libraries (or individual
predicates) only when called. Here, the programmer has clearly specified
what definition is wanted.