Module System Cleanup

Author: Joachim Schimpf
Last update: 02/2003

This note describes changes to the module system that were implemented in release 5.0.

Topics

Simplify the implementation and semantics of tools
Sort out some problems related to "lazy import" and clashes
Implement reexport
Implement better semantics for :/2, i.e. specifying the lookup module rather than the definition module
A related change is to get rid of the null-module descriptors that are currently used for :/2
Allow [m1,m2]:Goal as a shorthand for m1:Goal, m2:Goal
Possibly split procedure descriptors into:

definition descriptor (at most 1 per module)
visibility descriptor (at most 1 per module)
qualified reference descriptor (possibly many per module)

Get rid of protected-property, declared-flag
Simplify the module system by removing the interface/body separation
Change default visibility for containers (record,setval) to local
Towards a multi-language system
Get rid of "global" visibility

Differences for the user

:/2 specifies the visibility module, and accepts a list of modules on the left hand side
Visibility changes are not allowed (except local->export and import->reexport)
export in the module interface and in the body are no longer treated differently; in fact the interface/body separation is made obsolete
Lazy import does not cause visibility clashes. These occur only when an ambiguously imported name is actually used. Ambiguity is resolved via explicit import.
Globals are deprecated and replaced by exports that are imported automatically when used
Setting a handler now automatically exports the handler
Definition before use: When an exported predicate is being used, it (or the whole module that defines it) has to be imported first (via import, use_module, reexport or import/from).

The implementation of :/2

Originally, :/2 was called call_explicit and defined as referring to the definition module. This does not fit well into the overall scheme since otherwise everything is based on the notion of visibility.
The original implementation used non-standard procedure descriptors that did not belong to a module ("null" module) and were never deallocated. Several bugs were due to these "null"-module descriptors.

The main problem with letting :/2 specify the lookup module instead of the definition module is that this could potentially create lookup chains, which are costly to implement. However, when only exported predicates can be accessed this way, the visible predicate is the same as the defined one, so the problem of chains does not occur. It is also possible, and potentially useful, to apply the exported/global restriction only to compiled calls while allowing unlimited access for metacalls: this creates no problems with chains (since visibility is resolved at call time) and simplifies precise metaprogramming using :/2 and @/2.

New implementation: The "null" descriptors are replaced by qualified-access-descriptors. They refer from a use in one module to a definition in another module. This is similar to an import-descriptor, but while there can be only one import descriptor for a particular name (specifying the visible one), there may be many qualified-access-descriptors.

Simplifying consistency/redefinition

The original implementation allowed certain dynamic redefinitions, e.g. tool->nontool. The requirement for a redefinition to be allowed is that a call compiled under the original assumption is still valid in the redefined case. As a fortunate side effect, this policy also solved the following problem that occurs when recompiling a module that exports a predicate to which a call has already been compiled elsewhere:

:- module_interface(m).

:- export p/1.

% at this point p/1 is exported but not yet known as a tool

:- tool(p/1, p/2).

:- begin_module(m).

p(X,Y) :- ...

Assume m was compiled, and then a call to p/1 was compiled with the tool-calling convention. When module m is now recompiled, the export directive exports a non-tool, which fortunately is compatible with the tool call, but this is just a lucky special case.

Inter-module consistency checks

To solve the above problem, the actual export (i.e. the updating of the corresponding import descriptors and the consistency check) can be delayed. Actual export is done:

at export time only if already code-defined
at code-definition time otherwise

Note that we introduce a new descriptor state here (descriptor exported but corresponding imports not checked or updated) that didn't exist before. An additional flag TO_EXPORT is introduced to indicate this state, which is halfway between LOCAL and EXPORT.
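
The delayed export can be pictured with a small Python model. This is only a sketch of the rule stated above (export immediately if code-defined, otherwise at code-definition time). The class and method names are hypothetical and do not correspond to the actual ECLiPSe kernel data structures, and the "consistency check" is reduced to recording which importers were looked at.

```python
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    """Toy stand-in for a procedure descriptor (illustrative names only)."""
    name: str
    scope: str = "LOCAL"        # LOCAL or EXPORT
    to_export: bool = False     # models the TO_EXPORT flag from the text
    code_defined: bool = False
    checked_imports: list = field(default_factory=list)

    def _actual_export(self, importers):
        # Stand-in for "update the import descriptors and check consistency".
        self.checked_imports = list(importers)
        self.scope = "EXPORT"
        self.to_export = False

    def export(self, importers):
        if self.code_defined:
            # Already code-defined: the actual export happens at export time.
            self._actual_export(importers)
        else:
            # Not yet defined: only remember the request (halfway state).
            self.to_export = True

    def define_code(self, importers):
        self.code_defined = True
        if self.to_export:
            # A pending export is completed at code-definition time.
            self._actual_export(importers)
```

With this model, exporting p/1 before its clauses are seen leaves the descriptor LOCAL with TO_EXPORT set, and the import descriptors are only touched once the code arrives.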

Intra-module consistency checks

Once the local descriptor has been referenced, every single declaration must make a consistent change. The following table indicates what changes are allowed.

Predicate property	Change when already referenced (call compiled)	Change when already code defined
code	yes	yes
modes & uniftype	no	no
inline trans	no	yes
adding tool property	no	no
debugged	yes	no
spy,trace,skip,start	yes	yes
parallel	yes	no
demon	yes	no
waking prio	yes	yes
calling convention	no	no
dynamic	yes	no
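
The table can be read as a lookup, sketched here in Python. The property names are copied from the table. The function name and the way the two columns are combined (a change is rejected if either applicable column says "no", and anything is allowed on a descriptor that is neither referenced nor code-defined) are assumptions made for illustration, not a description of the kernel code.

```python
# property -> (allowed when already referenced, allowed when already code-defined)
ALLOWED = {
    "code": (True, True),
    "modes & uniftype": (False, False),
    "inline trans": (False, True),
    "adding tool property": (False, False),
    "debugged": (True, False),
    "spy,trace,skip,start": (True, True),
    "parallel": (True, False),
    "demon": (True, False),
    "waking prio": (True, True),
    "calling convention": (False, False),
    "dynamic": (True, False),
}


def change_allowed(prop, referenced, code_defined):
    """Return True if changing `prop` is consistent with the table."""
    ok_when_referenced, ok_when_defined = ALLOWED[prop]
    if referenced and not ok_when_referenced:
        return False
    if code_defined and not ok_when_defined:
        return False
    return True
```

For example, an inline transformation may still be added to a predicate that already has code but no compiled callers, whereas the calling convention is fixed as soon as either condition holds.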

Protected procedures

In the previous implementation, the protect-mechanism was used to enforce that redefinitions of predicates that were treated specially by the compiler were made beforehand, i.e. before any calls had been compiled. This should now be taken care of by the general mechanism, i.e. the restrictions on changes of the calling convention when calls have already been compiled. The protected-property has therefore been removed.
For a small number of control constructs, i.e.

,/2 ;/2 ->/2 -?-> :/2 true/0

it would be good to forbid redefinition altogether; otherwise all code that analyses goals (the compiler, tr_goals/3 and the like) would have to check whether they have been redefined. It simply means that these goals can be relied upon even without explicit sepia_kernel qualification.

Import/lazy import

It probably makes sense to have both lazy and immediate import, with the following meaning:

When a module is (lazily) imported (use_module(m) or import(m)) no checks are done, the fact is just memorized. In particular, it is not an error (or warning) when two imported modules export the same names.
Only when the name is referenced for the first time does the import link get established: it is checked whether there is a unique import, and ambiguity is an error.
Explicit import (e.g. import p/3 from m) immediately creates the import link.
Import links could previously be removed by abolish. This is no longer necessary to resolve ambiguous imports, which makes it possible to restrict the functionality of abolish and get rid of unwanted dynamicity in the interfaces.
When a predicate is referenced before having been imported in any way:

We assume a default calling convention and default properties for the predicate. If a later import proves not to be compatible with these assumptions, this is an error.
Consequently, imports should always textually precede any call of a predicate.
The same applies to a use->local sequence (to allow one-pass processing).
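
The lazy versus explicit import behaviour described above can be modelled in a few lines of Python. This is a toy sketch: the class, its method names and the use of a None result for "not imported" are illustrative assumptions, and the compatibility check against default calling conventions is left out.

```python
class AmbiguousImport(Exception):
    """Raised when a lazily imported name has more than one exporter."""


class Module:
    def __init__(self, name, exports=()):
        self.name = name
        self.exports = set(exports)   # names such as "p/3"
        self.lazy_imports = []        # modules memorised by use_module/import
        self.links = {}               # name -> home Module, once established

    def use_module(self, other):
        # Lazy import: no checks are done, the fact is just memorised.
        if other not in self.lazy_imports:
            self.lazy_imports.append(other)

    def import_from(self, name, other):
        # Explicit import: the import link is created immediately.
        self.links[name] = other

    def resolve(self, name):
        # First reference: establish the link, or report the ambiguity.
        if name not in self.links:
            homes = [m for m in self.lazy_imports if name in m.exports]
            if len(homes) > 1:
                raise AmbiguousImport(name)
            if not homes:
                return None           # not imported in any way (yet)
            self.links[name] = homes[0]
        return self.links[name]
```

Importing two modules that both export p/3 is silent in this model. The error appears only when p/3 is first referenced, and an explicit import_from beforehand removes the ambiguity.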

Necessary changes:

When a module interface is imported, exports must no longer be mapped into import-froms, i.e. get rid of the distinction between exporting in the interface and exporting in the body.
Ambiguity must be reported when it happens during lazy import (make/will_lazy_import), but not already at the import-module directive.
The former notion of global can be treated as lazy import from sepia_kernel (actually now eclipse_language)!

Getting rid of globals

In ECLiPSe prior to 5.0, predicates could be declared global. This visibility class has been removed in order to simplify things. The main use of global predicates was for the ECLiPSe built-in predicates, which were automatically visible everywhere (unless hidden by a local or imported definition - this was used in compatibility packages). The new scheme is as follows:

By default, the module eclipse_language is implicitly (lazily) imported into every new module. This is the way the built-ins are provided.
The set of predicates reexported from eclipse_language defines the set of builtins (previously this was the set of global declarations).
Local redefinitions still hide this (lazy) import.
Compatibility packages that provide alternative implementations of builtins: the import from eclipse_language and the import from a compatibility module are now equivalent. Possible solutions:

The importer has to resolve the conflict
The compatibility library has to be used instead of the kernel (via module/3, create_module/3)
The kernel-exports are somehow declared "weak", the library exports "strong" - but this approach doesn't nest
Partial ordering: if something is visible via one import and hidden via another, it should be hidden (see algorithm below)

Resolution of some import-ambiguities (not implemented):

vis := next_imported()
while (clash := next_imported())
{
    if (vis visible in definition module of clash)    % hidden
        vis := clash
    else if (clash visible in definition module of vis)
        ;
    else
        error(ambiguous_import)
}
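
The pseudocode above can be turned into a runnable Python sketch. The source does not define "visible in the definition module" precisely, so the sketch assumes it means that the definition module of one candidate imports the home module of the other. The function name, the `imports` dictionary and the exception type are all hypothetical.

```python
class AmbiguousImportError(Exception):
    """No partial order exists between two candidate definitions."""


def resolve_import(candidates, imports):
    """Pick the definition that hides all others, or raise.

    candidates: home modules that export the name, in import order.
    imports:    dict mapping a module to the set of modules it imports
                (assumed meaning of "visible in the definition module").
    """
    if not candidates:
        raise ValueError("no candidate definitions")

    def visible_in(candidate, module):
        return candidate in imports.get(module, set())

    remaining = iter(candidates)
    vis = next(remaining)
    for clash in remaining:
        if visible_in(vis, clash):
            # clash's module saw vis and redefined it: vis is hidden
            vis = clash
        elif visible_in(clash, vis):
            # vis's module saw clash and redefined it: keep vis
            pass
        else:
            raise AmbiguousImportError((vis, clash))
    return vis
```

For a compatibility module that imports eclipse_language and redefines a built-in, the compatibility definition wins regardless of import order, while two unrelated exporters still produce an error.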

Tools

Tools are predicates which get a caller module argument added when called.
The current implementation allows all kinds of redefinitions, which is probably excessive.

Suggested changes, not all done in release 5.0 yet:

Disallow tool/1 and always require the tool body to be specified when declaring a tool. This makes it possible to do the tool->body mapping at compile time, so that the calls can be inlined properly.

% tool definition in module tm:

:- tool(t/3, tb/4).

% Called in module cm:

t(a,b,c)      ----->     tm:tb(a,b,c,cm)

t(a,b,c)@xm   ----->     tm:tb(a,b,c,xm)

The qualification with :/2 is necessary because the tool body might not be visible in module cm.
Note that this transformation can, for compiled calls, be done by inlining:

:- inline(t/3, t_t/3).

t_t(t(A,B,C), tm:tb(A,B,C,M), M).

For metacalls, the same transformation must be done, probably in the emulator.
For delay/waking, the mapping should happen at delay time (i.e. the tool body is delayed instead of the original call), so the waking code does not have to deal with this complication and can be simpler and more efficient.
Also, the compiler does not have to deal with tool calls, they are all removed (replaced by body calls) by inlining.
All this means that tool interfaces can never get called, and therefore do not need any code generated. The code field in the procedure descriptor can be used to hold a pointer to the descriptor of the tool body.
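
The tool-to-body rewriting described in this section can be sketched in Python as a plain goal transformation. The representation of goals as (name, args) tuples, the `tools` table and the function name are assumptions made for illustration; the sketch only mirrors the two mappings shown above (an implicit caller module, or an explicit context module given with @/2).

```python
def expand_tool_call(name, args, caller, tools, context=None):
    """Map a tool interface call to its module-qualified body call.

    tools:   dict (name, arity) -> (home_module, body_name), a stand-in
             for the tool->body link held in the procedure descriptor.
    caller:  module in which the call textually occurs.
    context: module given via Goal@Module, if any.
    Returns (module_or_None, (name, args)).
    """
    key = (name, len(args))
    if key not in tools:
        # Not a tool: the call is left unqualified and unchanged.
        return None, (name, tuple(args))
    home, body = tools[key]
    module_arg = context if context is not None else caller
    # The body takes one extra argument: the caller (or context) module.
    return home, (body, tuple(args) + (module_arg,))
```

Usage, following the example in the text: with tools = {("t", 3): ("tm", "tb")}, a call t(a,b,c) in module cm becomes tm:tb(a,b,c,cm), and t(a,b,c)@xm becomes tm:tb(a,b,c,xm).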

To do: define restrictions on visibility of tool body.

Does it have to be defined in the same module as the tool interface, or just visible? (just visible from an implementation standpoint, but the restriction may make sense for programming discipline?)
Is it automatically exported/imported with the interface? (exported probably yes, imported probably no).
Most (all?) flag settings on the interface should be propagated to the body.
The tool-to-body link cannot be changed after the tool has been referenced, i.e. the tool interface cannot be redefined, although the body can.
For debugging etc, it would be nice if there was a one-to-one mapping between tool interface and body, so that body calls could be mapped back to interface calls, and the body predicates would almost never be visible.

Restrictions on redefinition

In the previous implementation lots of dynamic redefinitions were allowed. This is problematic when calls are compiled and properties of the callee are used in the process, like

modes
tool property
inlining
external calling convention

It is safer not to allow redefinitions and require forward declarations for everything:

Within a module, forward declarations can be avoided by going to multipass compilation, at least within a single file.
For interactive, incremental compilation it might be enough to have a simple default assumption (prolog convention, local, general mode, no tool, no inlining, etc).
For references across module boundaries, the exporting module interface must provide all the information and must be known at compile time.

Abolish

That leaves the problem of the abolish-primitive: There is no way to inform the possibly inlined calls of the abolishment. Maybe the semantics could be restricted to a removal of the clauses while keeping all other properties. This is tantamount to allowing redefinition with all the other properties being kept the same. This is implemented in release 5.0.

Erase_module

Recompiling a module: Currently erase+compile. This leaves all the referencing descriptors around and they are updated when the module is recompiled and the predicate reappears.

Time of visibility resolution

What about metacall access and non-call access (property lookup etc.)? In other words, should the first metacall to an (unambiguously, lazily) imported predicate fix the import link, or should it be resolved afresh on every metacall? Since metacalls contribute to the semantics, they should freeze the visibility, but that is not necessarily desirable if the interactive toplevel is to remain easy to use. In 5.0, the first metacall freezes the visibility, i.e. metacalls behave like compiled calls.

Procedure Descriptors

Current pri descriptor fields and their usage:

code

wam code address, or emulator builtin index

next_proc

finding visible pred in a module
dict-gc
to copy the fields to all with a certain mod_ref
unlinking in erase_module
lazy import

next_in_mod

to import all exported ones
to free in erase_module
current_predicate

mod_def

the module where the descriptor belongs

mod_ref

the module that the descriptor refers to, i.e. where the definition is

functor of the predicate (name/arity)

flags

various property flags

mode

3 bits per argument in a 32 bit word, modes for higher arguments ignored

trans_function

did of the transformation predicate (inlining)
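
The mode field described above (3 bits per argument in a 32-bit word) can be illustrated with a small packing sketch in Python. The bit layout (argument 1 in the lowest bits) and the numeric mode codes are assumptions, since the text does not give them. The sketch only shows why modes for arguments beyond the tenth are ignored: 10 arguments of 3 bits each use 30 of the 32 bits.

```python
BITS_PER_ARG = 3
MAX_ARGS = 32 // BITS_PER_ARG      # 10 arguments fit into one 32-bit word
ARG_MASK = (1 << BITS_PER_ARG) - 1


def set_mode(word, arg, mode):
    """Store a 3-bit mode code for the 1-based argument position `arg`."""
    if arg > MAX_ARGS:
        return word                 # modes for higher arguments are ignored
    shift = (arg - 1) * BITS_PER_ARG
    cleared = word & ~(ARG_MASK << shift)
    return (cleared | ((mode & ARG_MASK) << shift)) & 0xFFFFFFFF


def get_mode(word, arg, default=0):
    """Read back the mode code, or `default` for arguments that do not fit."""
    if arg > MAX_ARGS:
        return default
    return (word >> ((arg - 1) * BITS_PER_ARG)) & ARG_MASK
```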

Descriptor types

Defined in that module:

LOCAL
EXPORT

Defined elsewhere (exported or reexported there):

IMPORT
IMPEXP

Unknown

DEFAULT

Qualified access (exported or reexported elsewhere)

QUALI

Descriptor states:

scope	module_ref	TO_EXPORT	NOREFERENCE	CODE_DEFINED	other properties
DEFAULT	D_UNKNOWN	0	0/1	0	any
LOCAL	== module_def	0/1	0/1	0/1	any
EXPORT	== module_def	0	0/1	0/1	any
IMPEXP	home module	0	0/1	0/1	any
IMPORT	home module	0	0/1	0/1	any
QUALI	home module	0	0	0/1	any

State changes - previous situation

from\to	LOCAL	EXPORT	GLOBAL	IMPORT	DEFAULT
DEFAULT	ok	ok	ok	ok	-
LOCAL	-	ok	ok	error
EXPORT	ok	-	ok	error
GLOBAL	ok	ok	-	ok
IMPORT	error	error	error

State changes - new behaviour

We accept repeated (or weaker) declarations silently

from\to	LOCAL	EXPORT	IMPEXP	IMPORT	DEFAULT
DEFAULT	ok	ok	ok	ok
LOCAL	nop	ok	error	error
EXPORT	nop	nop	error	error
IMPEXP	error	error	nop(s)	nop(s)
IMPORT	error	error	ok(s)	nop(s)

(s) - if imported from same module as before
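
The new-behaviour table can be encoded directly as a transition map, sketched here in Python. The entries are copied from the table. The function name, the use of ValueError, and the treatment of an (s) entry as an error when the home module differs are assumptions, since the text does not say what happens in that case.

```python
TRANSITIONS = {
    "DEFAULT": {"LOCAL": "ok", "EXPORT": "ok", "IMPEXP": "ok", "IMPORT": "ok"},
    "LOCAL": {"LOCAL": "nop", "EXPORT": "ok", "IMPEXP": "error", "IMPORT": "error"},
    "EXPORT": {"LOCAL": "nop", "EXPORT": "nop", "IMPEXP": "error", "IMPORT": "error"},
    "IMPEXP": {"LOCAL": "error", "EXPORT": "error", "IMPEXP": "nop(s)", "IMPORT": "nop(s)"},
    "IMPORT": {"LOCAL": "error", "EXPORT": "error", "IMPEXP": "ok(s)", "IMPORT": "nop(s)"},
}


def declare(current, requested, same_home=True):
    """Return the scope after a declaration, or raise ValueError."""
    action = TRANSITIONS[current][requested]
    if action.endswith("(s)"):
        # (s): only valid if imported from the same module as before.
        if not same_home:
            raise ValueError(f"{current} -> {requested}: different home module")
        action = action[: -len("(s)")]
    if action == "error":
        raise ValueError(f"{current} -> {requested} is not allowed")
    # "ok" changes the scope; "nop" silently keeps the stronger declaration.
    return requested if action == "ok" else current
```

For instance, declare("EXPORT", "LOCAL") returns "EXPORT": the weaker repeated declaration is accepted silently and changes nothing.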

Note on reexport

reexport could be handled by inlined indirection:

:- reexport p/3 from m1.

is functionally equivalent to

:- export p/3.

p(A,B,C) :- m1:p(A,B,C).

which can be made efficient by adding inlining

:- inline(p/3,t_p/2).

t_p(P3, m1:P3).

This is also related to having use_module in a module interface, which is similar to re-exporting. The difference between that and reexport is the definition module of the indirectly imported predicate.

Removal of the module_interface section

Pre-5.0, modules could be partitioned into a module_interface and a module body (begin_module). This static sectioning has been dropped. Without the module_interface section, the following directives effectively comprise a module's interface:

:- export
:- reexport

These directives record themselves as the interface of the module that contains them. They do not have to appear in any particular section.

For backward compatibility, we interpret certain directives in an old-style module_interface by transforming them into an equivalent export/reexport/global directive. Unfortunately, this is not exactly possible for occurrences of use_module, lib and import in module_interfaces: they almost map to the new reexport directive, but the semantics are subtly different. We therefore support having use_module and import in recorded interfaces, although it is only possible to create them by using obsolete features.

Directive in old :- module_interface	Occurs in recorded interface as
op(A,B,C)	export op(A,B,C)
set_chtab(A,B)	export chtab(A,B)
define_macro(A,B,C)	export macro(A,B,C)
set_flag(syntax_option, X)	export syntax_option(X)
meta_attribute(A,B)	global meta_attribute(A,B)
use_module(M)	use_module(M) - almost reexport(M)
lib(M)	use_module(library(M)) - almost reexport(library(M))
import(Preds from M)	import(Preds from M) - almost reexport(Preds from M)
import(M)	import(M) - almost reexport(M)

For any other directives in a module_interface, we issue a warning.

The source-processor problem

How do we solve the source-processor problem? We execute certain export/local directives, i.e. the ones that affect the syntax:

op/3
struct/1
macro/3
chtab/2
syntax_option/1
meta_attribute/2

Apart from that, we also have to do all imports (since they may define necessary syntax). For more details, see library(source_processor).

Autoload

The existing autoload feature is messy because

It requires the autoloaded predicates to be global
It creates the module where the autoloaded predicates are defined

Only one of these two things should be done, and globality should not be used at all. I think there are two conceptually different features that could be called "autoloading":

A development environment tool that comes into action when an undefined predicate is called. It could find (possibly multiple) libraries that define the missing predicate and offer to load and import one of them. This might not work when a call is already compiled (calling convention).
A runtime mechanism that lazily loads bulky libraries (or individual predicates) only when called. Here, the programmer has clearly specified what definition is wanted.
