In this text we are going to explain how the specification in the
report
was translated into our prototypical ML implementation.
Implementation of Specification Modules
Roughly speaking, each module of the specification is implemented as
an ML structure.
If a module of the specification has a parameter that was not implemented
ad hoc,
then this specification module is implemented as an
ML functor
instead of an ML structure.
E.g., the specification modules
ANCHOR
and
HD
are
implemented as functors,
whereas the following are implemented as structures:
LINK,
MO,
PAGE,
HMD.
Additionally to
this text there is an
alphabetical list of all implementation modules
and you can use the following dependency graph
to navigate into this list.
Note that doubled-line boxes indicate ML functors, while
simple box indicate ML structures.
In the
postscript format, the
parameter signatures have dotted boxes.
The modules ATT_ANCHOR, ATT_LINK, ATT_PAGE, ATT_HD
ATT_HMD, ATT_FSD, ATT_FSD and
ATT_SITE
are all implemented as a single module.
As an ML functor cannot have
more than one
parameter,
the parameter ENTRY had to be duplicated into,
on the one hand,
the ML signature
extended_equal_p
for single occurrence and, on the other hand,
the ML signature domainandrangep
for double occurrence;
moreover, we had to add functions for
equality testing and string conversion to these parameters.
The equality tests that were implicit in the specification
are explicitly implemented for all types.
The standard name for the equality test is "equal" and you
usually have to prefix it with the structure name and a dot because
overloading is not possible in ML, which required to suffix
some equality test names with an underscore and type name.
The name of the string conversion which we added for
most of the types is usually "tostring".
In some modules the tostring function gets some extra integer first
first argument describing the style of the string.
If this argument is zero, the result should read exactly as in
the specification. Otherwise, a more readable string is produced.
The string conversion is used to implement functions
named "displaystrict" and "display" that print out their argument.
If both display functions occur in a module, "displaystrict" corresponds
to "tostring 0" while "display" produces a more readable result.
The following modules were not implemented until now:
PAGE_SYMBOLS, HMD_ADDR,
CHAPTER_SYMBOLS, CHAPTER, FSD_ADDR,
BOOK_SYMBOLS, BOOK, SITE_ADDR.
Data Types and Abstract Types
The constructors of visible sorts were implemented as ML data types, whereas
the constructors of hidden sorts were implemented as ML abstract types.
Moreover, we have added a constructor for
variables to all sorts
that are in the focus of our specification in order to be able
to have some undefined parts of a data type for lazy definition
in a top-down construction of data objects.
The details will be explained somewhere else.
For the logical intention of the variables cf. the papers
"On Notions of Inductive Validity for First-Order Equational Clauses"
and
"Full First-Order Sequent and Tableau Calculi With Preservation of Solutions and the Liberalized delta-Rule but Without Skolemization".
Implementation Style
Note that we have curried all functions
(except data type constructors) for simplicity and
compatibility with other functional programming languages like
Haskell.
Thus the declaration for the
equality test in our current top module HMD
has the declaration
equal: hmd -> hmd -> bool
and not
equal: hmd * hmd -> bool
Problems with ML
One of our reasons to choose ML as implementation language was our
hope to realize each specification module as a single ML module
in a very straightforward fashion. The following problems arose
during the implementation. If you think that you can give us some
advice on this, please send e-mail to
Claus-Peter Wirth.
-
One problem was that we were not able to import the functor
ANCHOR into the functor HD
without binding the formal parameter with signature
DOCUMENT_P
of ANCHOR.
We solved this by binding this formal parameter
to the formal parameter with signature DOCUMENT_P
of HD. Slightly simplified, this looks like:
functor Hd (Document_p : DOCUMENT_P): HD
=
struct
structure Anchor_in_Hd = Anchor (Document_p);
open Anchor_in_Hd;
...
end(*struct functor Hd*);
The price
we had to pay for this was that we had to open
ANCHOR in HD
because this was the only way to get access to the proper
instance of ANCHOR without a very ugly
double prefixing.
The name conflicts that arouse had to be removed by hand,
so that some names of the implementation of ANCHOR
now differ
from those of the specification by the suffix "_anchor".
E.g., while the equality test for HMD has the
intended name "Hmd.equal", the name of the equality test for anchors
is "Hmd.equal_anchor" instead of "Anchor.equal".
Similarly, "Hmd.get_att_anchor" is used instead of "Anchor.get_att".
As you can see from the dependency graph,
the module Anchortable
is actually between the above collision of name spaces.
Therefore, a function that normally would have the name say
"Anchortable.apply" actually has the name "Hmd.apply_anchortable".
-
The following problem was more serious:
In a first step we implemented
sets as a polymorphic ML type.
But we have sets
(like "link set")
whose elements are not ML equality types,
and in ML there it is neither possible to
overload user defined functions
nor to extend the overloading of built-in overloaded functions
nor to
write a function call like
'a.equal
in the body of a function
insert : 'a -> 'a set -> 'a set.
Note that in
Haskell
we would have both possibilities of overloading.
In ML, however, we had to provide the equality test as an additional
argument to most functions on sets.
This procedure is ugly and error-prone.
When we then came to the implementation of string conversion for sets,
we had to accept that in ML the only proper way to implement sets
is to implement sets as an ML functor.
In the current implementation, only the functor implementation
of sets is used.
We are not happy with this solution for the following reasons:
-
We did not want to have set as a functor
but as a polymorphic type.
In our intention, the difference between
polymorphic types and functors
is the same as the difference between structural and non-structural modules
discussed in the draft
"Improving ASF+",
namely that each instantiation of a functor
(even with the same actual parameter)
provides us with a new copy, while a structural module (polymorphic type)
shares all its occurrences referring to the same actual parameter.
-
We had to split several specification modules into a multitude
of ML structures or functors because (when Set is a functor) we cannot
define the type of sets of elements in the same structure where the type of
these elements themselves is defined.
Moreover, we cannot define the type of sets of elements in the
module using elements (as was the case in our specification)
because the use of this kind of sets in different modules
would then become incompatible.
E.g., we had to split the specification module LINK
into separate ML structures
for links,
for specifiers,
and
for sets of specifiers.
Similarly, we had to split the parameterized specification module HD
into three modules, namely
one for the functor for
hyperdocuments,
one for the functor for
functions from anchor identifiers to anchors,
and
one for the structure for
sets of links.
-
The worst consequence of all is that the functor for
function objects is unable to
return its domain or range or similar data in form of a set
but must return it in form of a list instead.
This is because a set of integers returned from a function object
from integers to integers would be of different type than any
set of integers outside the functor and thus useless.