Some Private Remarks on the Boyer-Moore (BM for short) Prover
Author: Claus-Peter Wirth
Date: Dec 19, 1991.

Unattributed references always refer to: Boyer, Robert S.; Moore, J Strother: A Computational Logic, Academic Press 1979, Menlo Park, California. Page references to the `new' book refer to: Boyer, Robert S.; Moore, J Strother: A Computational Logic Handbook, Academic Press 1988. Further references: [Stevens] Andrew Stevens: A Rational Reconstruction of Boyer and Moore's Technique for Constructing Induction Formulas. ECAI 88, pp. 565-570.

I. REMARKS ON CHAPTER XIV

1. INTERNAL CLAUSAL FORM

The process of deductive proving is mainly driven by exhaustive contextual rewriting on the basis of the internal clausal form and by the rippling-out of IFs using the meta-rewrite rule (f ... (IF test pos neg) ...) = (IF test (f ... pos ...) (f ... neg ...)), which becomes necessary because of the form of the BM definitions (with conditional rewrite rules this rippling would no longer be needed). It would therefore be more beautiful to show these clauses to the user and to allow him to specify his theorems and definitions in a similar form. The clauses, however, are presented to the user in the form (IMPLIES (AND ...) concl). BM say (page 220 of the new book) that the user must not think in terms of this expression but has to think in clausal form. Their example for this is that a rewrite rule with left-hand side (AND ...) would never apply to the clause above. (Note, however, that such a rule may apply during simplification of the original theorem term, or in the case of a real AND function symbol that has survived due to the special treatment explained in `Answers of Ulrich, C.b' below.) We had three open questions:
A. How does the BM-prover translate the internal clausal form into the external implicational form for presentation to the user?
B. Does this translation function have any influence on the behaviour of the BM-prover; i. e.
is the distinguished `concl' a literal of major interest to the BM-prover?
C. Does the BM-prover make any use of how the user chooses and structures his logical connectives for expressing a theorem?

Answers of Ulrich:
A. Clauses are represented internally as lists. The last element of such a list is the conclusion `concl'.
B. The list representing a clause is simplified in the order of the list; hence `concl' is simplified last (see (A)). This becomes important for cross-fertilization, where the literals of the form (NOT (EQUAL s t)) are tried for replacement in the order of the internal list until one applies (see page 149).
C. The structure of the logical connectives has two influences:
a. The ordering of the internal lists that represent the clauses may be influenced.
b. (Page 90, footnote 14.) If more than one of the clauses obtained by clausifying cannot be shown by simplification and it comes to an induction, then the BM-prover prefers to induct upon the single input clause rather than upon several ones. This is done by not splitting up normalized AND-connectives. The reason behind this is: If the user specifies an inductive theorem that yields several clauses requiring inductive proof, he has probably done so because these clauses cannot be shown separately.

2. THE MACHINE OF A DEFINED FUNCTION

The so-called `machine' (see page 165) of a function of the BM-prover is conceptually nothing but the set of the recursive ones among the conditional rewrite rules that correspond to the definition scheme of the function. (Actually, they differ by the flaw described in (3.) below and by the omission of the non-logical superterms of the recursive calls. Note: The twofold use of `machines' in BM (for termination on the one hand and for the suggestion of appropriate induction schemes on the other) mirrors the similar uses of conditional equations.
For the termination purpose the BM-approach allows one to throw away the non-logical superterms because there is never more than one function in the process of definition (henceforth called the `one at a time fashion' of definition). For the suggestion of appropriate induction schemes the non-logical superterms may be of no use --- so one could argue that the machines are more appropriate here than conditional equations. However, the whole induction heuristics of BM have little justification (except: if there is an easy induction (and all functions are defined in the one at a time fashion), then it is likely to go the BM-way), and a more complicated induction scheme (which need not be precomputed) might also make use of the non-logical superterms.) Therefore we think it would be nicer to have the user specify the function by means of conditional equations instead.

3. MACHINES FOR TERMINATION PROOFS, PARTIALLY DEFINED FUNCTIONS

The BM-prover has no concepts for partially defined functions. This property becomes relevant even for totally defined functions during the definition process. Take a rule of the form f t = g f s <-- goodnaturedp t = true, f u = 0. Now suppose the call `f s' to be smaller than `f t' if `f u = 0', but not necessarily otherwise. Additionally suppose `f u' to be smaller than `f t' (if `goodnaturedp t = true'). While the definition is fine now (because it is left-right-compatible), the BM-prover is unable to recognize this, because he is forced to show `f s' to be smaller than `f t' under `goodnaturedp t = true' alone, without the necessary additional assumption `f u = 0'. (See page 165.)

4. REVISED MACHINES (i. e. machines with weakened cases)

BM say that they do the induction not according to the `machine' but according to a modified version of it: They weaken the cases of the machine (i. e. the conditions of the equations) by taking the implied conditions of the so-called `induction-lemmas' (which are used for proving termination) instead.
This weakening seems to be a rather stupid thing to do. Of course, it might save some proof computation that has already been done for the termination proof before. But weaker is weaker. BM give two reasons for their procedure:
A. It eliminates weak base subgoals.
B. Because the same induction-lemmas (which tend to be very few) are used, it arranges for different terms to suggest identical inductions.
Now, (A) seems to be nonsense because it is not the base subgoals that make the real trouble. However, (A) is said to be important in [Stevens] (p. 566). Most interesting is the following discovery of ours: While the book of 1979 says the weak base subgoal to be bad, the current version of the BM-prover proves the very same example of the book of 1979 exactly by the weak base subgoal. However, the base subgoal is not proved by induction (as claimed in the book of 1979), but by a simple case distinction. For a commented version of this proof see the file `arithconj.boot' for the definitions and `arithconj.session' for the proof. Note: The functions ZEROP, LESSP, PLUS were redefined as MY-ZEROP, MY-LESSP, PLUS-1 to avoid built-in arithmetic. The second redefinition of PLUS-1, from a function definition to two defining rewrite rules, became necessary to make the BM-prover allow tracing on the rewriting of (PLUS I J). The only difference this last redefinition makes for the proof of lemma P177 seems to be the non-detection of (NUMBERP (PLUS I J)), which would avoid the last two steps in the proof. For the complete tracing session see the rather large file `arithconj.session.total'.
(B) does not seem to make any sense either, because the heuristic comparison between different induction schemes depends only on the variables to induct upon and not on the case distinctions being rather simple. In the new book there are no induction-lemmas anymore, but it is not described whether the machines are still revised.
We guess BM came to the same conclusion as we did and therefore removed induction-lemmas and revised machines, but we are not quite sure about this.

5. UNMEASURED POSITIONS

BM speak of `a measured subset of variables' instead of measured positions, but we do not do so because the term `variable' is misleading here. What is the meaning of unmeasured positions?
A. The unmeasured positions of a function f are irrelevant for the definition of f to be compatible with the reduction ordering.
B. The unmeasured positions of a function f are those that are not put to pieces by this function f in its recursive cases. By `putting to pieces' we mean the application of a destructor that has a somehow inverse constructor. Such destructors are sometimes called selectors. (E. g.: CAR has the constructor CONS, SUB1 has the constructor ADD1; but DELETE has no inverse constructor.)
(Note that this misshaped double use of the same concept is typical for BM's AI approach to the problem, according to the motto: ``Better a misshaped concept than one that one does not know how to work with.'')
A rather nice example on which the two concepts (A) and (B) differ is given by BM on page 176: There the function LESSP allows either its first or its second position to form a measured subset. This is according to (A), because (A) is BM's means to get the concept working. However, for the induction proof it is more important to know which positions are taken to pieces in the sense of (B). Hence, they are actually misled into doing an induction proof on the second position because it forms a measured subset on its own. Luckily, this misleading does not turn out to be harmful, because the suggestion is right although the argumentation is wrong. Their lemma is (NOT (LESSP (PLUS I J) I)). So induction on I is right because this is suggested by PLUS, which recurses on the first argument. However, consider (NOT (LESSP (PLUS J I) I)).
This needs induction on J, and the BM-prover really discovers this (see file `exampleforfive'), because the induction on I is `flawed'. (Note: This is a very nice example of the BM-prover finding a necessary lemma by generalization.) So it is not so easy to fool him, even after some effort (file `secondexampleforfive'). Of course one can fool him (file `modifiedexampleforfive'), but only by mean tricks.
Conclusion: The conceptual difference between (A) and (B) does not seem to be very important for simple examples in practice, but it should not be ignored. An example where the attempt to realize (B) by (A) fails is very likely to be corrected by `flaw' detection. (A) is of importance for the guidance of inductive proofs: Instead of just trying to ripple out the wave-fronts, one is advised to store the wave-fronts into the unmeasured positions. Compared to the more general approach of an enlargeable reduction ordering, this is a rather primitive way of arguing --- however, it has the advantage of giving a hint that can lead to an operationalization of the argumentation, which is not provided by the more general approach. (B) is most clearly exhibited when one uses the constructor fashion of definition instead of the destructor fashion. So if we use the constructor fashion, we get the information most easily without the concept of measured positions.

6. MEASURED POSITIONS: CHANGING OR UNCHANGING

The measured positions, i. e. the members of a set of measured positions, are separated into `changing' and `unchanging' ones. Position i of an n-ary function f with $fx_{0}...x_{n-1}$ as left-hand side of its definition is `unchanging' :iff for all subterms of the right-hand side of the definition of f that have the form $ft_{0}...t_{n-1}$ we have $t_{i}=x_{i}$ syntactically. What is the use of this separation? The technical aim is to allow additional induction schemes by weakening the restrictions for the unchanging variables.
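The syntactic `unchanging' test just defined is easy to state as code. The following sketch is ours (variables as strings, applications as tuples), not BM's internal representation:

```python
def unchanging_positions(formals, recursive_calls):
    # Position i is `unchanging' iff every recursive call on the
    # right-hand side carries the i-th formal through syntactically
    # unchanged.
    return {i for i, x in enumerate(formals)
            if all(call[1 + i] == x for call in recursive_calls)}

# PLUS-style definition: the recursive call (PLUS (SUB1 x) y)
# changes position 0 and leaves position 1 unchanged.
print(unchanging_positions(('x', 'y'), [('PLUS', ('SUB1', 'x'), 'y')]))
# -> {1}
```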
In terms of the two separated concepts (A) and (B) of (5.) the reason can be described in the following way: The (measured) unchanging positions are relevant for the induction ordering; hence they do not belong to (A). However, they are not put to pieces in the function's recursive cases and therefore do belong to (B). Therefore this distinction can be thought of as a step from the operationalizable complement of (A) to the intended complement of (B). However, unchanging measured positions are not the only kind of positions that are needed for the induction ordering (i. e. do not belong to (A)) but are not put to pieces in the recursive cases of the function (i. e. do belong to (B)). Therefore, speaking in terms of my DA, one really should use (B) for choosing covering sets of substitutions, and separately use (A) for guiding the proof towards induction hypotheses that are smaller than the conclusion.

II. REMARKS ON CHAPTER XV (FROM INDUCTION TEMPLATES TO INDUCTION SCHEMES)

What was computed for each defined function symbol f in the previous chapter, BM (and we) will from now on call a set of `induction templates' (page 183). By an induction template we mean a pair consisting of
i. something that induces a trichotomy of the positions of f into unmeasured, unchanging measured, and changing measured ones;
ii. the revised machine for the measure leading to (i.).
Now the general idea of generating induction schemes from these sets of induction templates is a three-step process explained in this chapter:
1. Generate, for each subterm of the conjecture and for each induction template of this subterm's top symbol, an induction scheme if possible.
2. Remove each induction scheme `subsumed' by another, and replace each pair of induction schemes of (1.) by their `merge' if possible.
3. Heuristically choose one of the induction schemes of (2.) for the proof attempt for the conjecture.
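The three-step process just listed can be sketched as the following control skeleton. All the helper parameters (make_scheme, subsumes, merge, score) are placeholders of ours for the procedures described in the rest of this chapter, not BM's actual routines:

```python
def choose_induction_scheme(subterm_template_pairs,
                            make_scheme, subsumes, merge, score):
    # Step 1: one candidate scheme per applicable (subterm, template) pair;
    # make_scheme returns None where the template does not apply.
    schemes = [s for pair in subterm_template_pairs
               if (s := make_scheme(*pair)) is not None]
    # Step 2a: throw away every scheme subsumed by another one.
    schemes = [s for s in schemes
               if not any(t is not s and subsumes(t, s) for t in schemes)]
    # Step 2b: repeatedly replace a mergeable pair by its merge.
    merged = True
    while merged:
        merged = False
        for s1 in schemes:
            for s2 in schemes:
                if s1 is not s2 and (m := merge(s1, s2)) is not None:
                    schemes = [x for x in schemes if x not in (s1, s2)] + [m]
                    merged = True
                    break
            if merged:
                break
    # Step 3: heuristic choice among the survivors.
    return max(schemes, key=score)

# Toy instantiation: schemes as variable sets, subsumption as proper
# inclusion, merging as union of overlapping sets, score as size.
print(choose_induction_scheme(
    [(frozenset({'i'}),), (frozenset({'i', 'j'}),), (frozenset({'k'}),)],
    make_scheme=lambda s: s,
    subsumes=lambda t, s: s < t,
    merge=lambda a, b: (a | b) if (a & b) else None,
    score=len))
```

On this toy input, {'i'} is subsumed by {'i', 'j'}, nothing merges, and {'i', 'j'} wins on size.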
The conceptual approach of BM is again rather ugly (but there does not seem to be a more beautiful way): Instead of dealing with induction over variables, which is the most natural and theoretically necessary way to tackle the problem, BM actually think in terms of positions instead of variables and then afterwards correct all the discrepancies between the two concepts. (In this correction process they forget one discrepancy, exhibited by the funny example of the file `bmmishit'.)

HOW BM TACKLE (1.) (GENERATION OF INDUCTION SCHEMES)

An induction scheme for a fixed subterm TERM of a conjecture CONJ and a fixed induction template of this subterm's top symbol f is obtained in the following way: Instantiate the formals of the function f according to the actuals (i. e. the direct subterms of the fixed subterm TERM of the conjecture CONJ) by a substitution $\gamma$. For each case CASE of the revised machine of the fixed induction template generate a proof subgoal of the form ( ( CASE$\gamma$ $\wedge$ CONJ$\sigma_{0}$ $\wedge$ ... $\wedge$ CONJ$\sigma_{m}$ ) --> CONJ ). CONJ is the conjecture one wants to prove. $\gamma$ is the substitution that maps the (naturally distinct) formals of the function f to its actuals in the current subterm. Let the case CASE guard (m+1) calls to the function f, namely $ft_{0,0}...t_{n-1,0}$; ...; $ft_{0,m}...t_{n-1,m}$. For each j < (m+1): $\sigma_{j}$ is a substitution whose domain consists of those actuals which are variables. How to construct $\sigma_{j}$ from $\gamma$ and $ft_{0,j}...t_{n-1,j}$ is the central problem of the whole process of (1.). In the sequel we describe BM's procedure for a fixed j of a fixed CASE (of a fixed induction template for a fixed subterm TERM of the conjecture CONJ). Note, however, that the induction scheme consists of all the inductive subgoals of the template plus the base subgoals, which say that the conjunction of the negated cases [CASE] of the inductive subgoals implies the conjecture.
Let $\mu$ be the substitution that maps each of the formals $x_{0},...,x_{n-1}$ of $fx_{0}...x_{n-1}$ to the term that occupies its position in $ft_{0,j}...t_{n-1,j}$; i. e. $x_{i}\mu := t_{i,j}$. Now $\sigma_{j}$ is given by mapping (for each formal $x_{i}$ for which $x_{i}\gamma$ is a variable): $x_{i}\gamma$ to $x_{i}\mu\gamma$. If all actuals are distinct variables, everything is fine. But two problems can occur:
A. The actual $x_{i}\gamma$ of a formal $x_{i}$ is not a variable.
B. The actual is a variable which is an actual of another formal, too.
TO A: Three cases:
a. i is a changing measured position: BM say that the induction template does not apply to the subterm TERM. This is reasonable because we do not know how to put $x_{i}\gamma$ to pieces in this case, and that is exactly what we should do when the position i is in the complement of (B) of (5.) of (I.). So instead of pushing down a selector onto something it is usually better to have this something send up the constructor. For non-selector destructors (e.g. DELETE), however, it might be necessary to send the destructor down, which the BM-prover cannot do. Hence assume in the sequel that for each changing measured i, $x_{i}\gamma$ is a variable.
b. i is an unchanging measured position: Two cases:
alpha. $x_{i}\gamma$ contains a variable of the form $x_{k}\gamma$ where k is a changing measured position: BM say that the induction template does not apply to the subterm TERM. This is forced by their way of founding the induction, because the changing measured k must change in some case CASE' of the revised machine of the induction template: In this case CASE' the foundation of the induction will be destroyed.
Even when using a more general means of foundation of induction, it is not a bad heuristic not to try an induction here: One should generalize on the occurrence of $x_{k}\gamma$ in $x_{i}\gamma$ before an induction attempt, or try to prove a lemma stating that $x_{i}\gamma$ is insensitive to the change of $x_{k}\gamma$ to $x_{k}\mu\gamma$.
beta. $x_{i}\gamma$ contains no variable of the form $x_{k}\gamma$ where k is a changing measured position: This is allowed by BM ($\sigma_{j}$ is then undefined on $x_{i}\gamma$), but it possibly changes $\sigma_{j}$ in the following way: If k is an unmeasured position and $x_{k}\gamma$ is a variable that occurs in $x_{i}\gamma$, then $x_{k}\gamma\sigma_{j}$ is $x_{k}\gamma$ instead of $x_{k}\mu\gamma$. This is heuristically good, but one is tempted not to precompute an induction scheme here (and in other cases).
c. i is an unmeasured position: This is allowed by BM. (But $\sigma_{j}$ is of course undefined on $x_{i}\gamma$.)
TO B: Let i and k be positions of f with $x_{i}\gamma = x_{k}\gamma$ a variable. Six cases:
a. i and k are changing measured positions: BM say that the induction template is not appropriate. This is stupid, as exhibited by the example of the file `bmmishit'. The conceptually fitting way is to say that the induction template is appropriate here iff $x_{i}\mu\gamma = x_{k}\mu\gamma$.
b. i and k are unchanging measured positions: Because of $x_{i}\mu\gamma = x_{i}\gamma = x_{k}\gamma = x_{k}\mu\gamma$ this is no problem at all.
c. i and k are unmeasured positions: BM arbitrarily choose one of $x_{i}\mu\gamma$, $x_{k}\mu\gamma$ as the value of $x_{i}\gamma$ under $\sigma_{j}$.
d. unmeasured / changing-measured: The value of $x_{i}\gamma$ under $\sigma_{j}$ is determined by the changing measured position only.
e. unchanging-measured / changing-measured: In this case BM say that the template does not apply. The problem is exactly the one of A.b.alpha.
f. unchanging-measured / unmeasured: Treated analogously to A.b.beta.
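For the unproblematic core of the construction (the actual of every measured position is a variable), $\sigma_{j}$ can be computed from $\gamma$ and the j-th recursive call as sketched below. The representation (variables as strings, applications as tuples) is ours; a shared variable actual is simply overwritten by the later position, which is one way to realize the arbitrary choice of case B.c:

```python
def apply_subst(term, subst):
    # Apply a substitution (a dict from variables to terms) to a term.
    if isinstance(term, str):                     # term is a variable
        return subst.get(term, term)
    return (term[0],) + tuple(apply_subst(a, subst) for a in term[1:])

def sigma_j(formals, gamma, recursive_call_args):
    # mu maps the formals to the arguments of the j-th recursive call;
    # sigma_j maps each variable actual x_i*gamma to x_i*mu*gamma.
    mu = dict(zip(formals, recursive_call_args))
    sigma = {}
    for x in formals:
        actual = gamma.get(x, x)
        if isinstance(actual, str):               # skip non-variable actuals
            sigma[actual] = apply_subst(mu[x], gamma)
    return sigma

# Subterm (PLUS i j) of the conjecture, recursive call (PLUS (SUB1 x) y):
print(sigma_j(('x', 'y'), {'x': 'i', 'y': 'j'}, (('SUB1', 'x'), 'y')))
# -> {'i': ('SUB1', 'i'), 'j': 'j'}
```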
HOW BM TACKLE (2.) (SUBSUMPTION AND MERGING OF SCHEMES)

BM start with the introduction of further notation: For an induction scheme $s$ generated as described above the following sets are defined: ${\rm CHANGING}(s)$ is the union, over all inductive subgoals and over all j, of the domains of the $\sigma_{j}$. ${\rm UNCHANGING}(s) := {\rm Var}({\rm TERM})\setminus{\rm CHANGING}(s)$.

1. SUBSUMPTION OF SCHEMES

If an induction scheme is `subsumed' by another, it is thrown away and this other scheme is given additional credit. BM try to explain what they mean by $s_{1}$ being `scheme-subsumed' by $s_{2}$, but we are not sure we understand it; it is something like the conjunction of:
(0) $s_{1}\not=s_{2}$.
(1) ${\rm CHANGING}(s_{1})\ \subseteq\ {\rm CHANGING}(s_{2})$
(2) ${\rm UNCHANGING}(s_{1})\ \subseteq\ {\rm UNCHANGING}(s_{2})$
(3) Each inductive subgoal of $s_{1}$ is subgoal-subsumed by an inductive subgoal of $s_{2}$.
(4) No inductive subgoal of $s_{2}$ subgoal-subsumes more than one inductive subgoal of $s_{1}$.
(Aside: We cannot see any heuristic reason for (4)! Do You?)
An inductive subgoal $g_{1}$ is subgoal-subsumed by an inductive subgoal $g_{2}$ :iff
(1) The terms in the conjunction CASE$\gamma$ of $g_{1}$ are a subset of those of $g_{2}$.
(2) Each substitution $\sigma_{j}$ of $g_{1}$ is substitution-subsumed by some such substitution of $g_{2}$.
(3) No substitution $\sigma_{j}$ of $g_{2}$ substitution-subsumes more than one such substitution of $g_{1}$.
(Aside: We cannot see any heuristic reason for (3)! Do You?)
The following definition is wrong because it thinks in terms of constructors substituted into the conclusion instead of destructors substituted into the hypotheses, but it is easier to state it this way: A substitution $\sigma$ is substitution-subsumed by a substitution $\sigma'$ :(bugged!)iff there is a substitution $\kappa$ with $\sigma\kappa = \sigma'$.
Note that this last definition is just the reverse of what one usually expects from subsumption. A proper definition is: A substitution $\sigma$ is substitution-subsumed by a substitution $\sigma'$ :iff for each term $t'$ in the range of $\sigma'$ there is a set of positions P such that
(a) for each variable position v in $t'$ there are p, q with v = pq and $p\in P$;
(b) for each $p\in P$ there is a term $t$ in the range of $\sigma$ such that $t'/p = t$ syntactically.
[Stevens] understand BM in a different way: They require $\sigma=\sigma'$ here but allow `repeated' schemes in place of the subsumee: A repeated scheme R for a function f is either an induction scheme for the function f or the result of expanding a machine M for the function f on one of the subgoals of a repeated scheme R' for the function f. Expanding a machine M on a subgoal $g$ is done by instantiating the formals of M according to some ${\rm TERM}\sigma_{k}$ of $g$ and then replacing $g$ by a new subgoal for each case CASE of M, which adds this CASE to that of $g$ and takes over all $\sigma_{j}$ from $g$ except for $\sigma_{k}$, which is changed according to the recursive calls of CASE.

2. MERGING OF SCHEMES

Because BM precompute the induction scheme, they must be very sophisticated in guessing the right scheme. One step towards intelligent guessing is the `merging' of schemes, which replaces two schemes (one mergeable into the other) in the set of schemes under choice by their merge. Roughly speaking, two induction schemes merge if their CHANGINGs overlap in a compatible way, in the sense that the suggested changes in the intersection of the CHANGINGs are identical and the remaining CHANGING variables are not among the other scheme's UNCHANGINGs.
Scheme $s_{1}$ is mergeable into scheme $s_{2}$ :iff
(1) ${\rm CHANGING}(s_{1})\cap{\rm CHANGING}(s_{2})\not=\emptyset$
(2) ${\rm UNCHANGING}(s_{1})\cap{\rm CHANGING}(s_{2}) =\emptyset$
(3) ${\rm CHANGING}(s_{1})\cap{\rm UNCHANGING}(s_{2}) =\emptyset$
(4) There is an injective mapping INJG from the inductive subgoals of $s_{1}$ into the inductive subgoals of $s_{2}$ such that each inductive subgoal $g$ of $s_{1}$ is subgoal-mergeable into ${\rm INJG}(g)$.
(Aside: We do not see a reason for the injectivity in (4)! Do You?)
Inductive subgoal $g$ is subgoal-mergeable into inductive subgoal $g'$ :iff there is some injective mapping INJS from the substitutions $\sigma_{j}$ of $g$ into those substitutions of $g'$ such that each substitution $\sigma_{j}$ of $g$ and ${\rm INJS}(\sigma_{j})$ are substitution-mergeable.
(Aside: We see no reason for the injectivity! Do You? Note however: Even if injectivity is not needed at all, BM cannot arrive at a symmetric kind of merging, where the unmerged subgoals of both induction schemes are taken over unchanged into the merged induction scheme (and so are the unmerged substitutions). This is because one of the old schemes must lay the foundation for the induction by its measure, and it is not enough to decrease w.r.t. one measure or another.)
Two substitutions are substitution-mergeable :iff
(1) their restrictions to the intersection of the variable sets of their terms TERM are identical;
(2) the intersection of their domains is not empty.
The result $s''$ of the merging is similar to $s_{2}$, but with the following exceptions:
(1) ${\rm CHANGING}(s'') := {\rm CHANGING}(s_{1}) \cup {\rm CHANGING}(s_{2})$
(2) ${\rm UNCHANGING}(s'') := {\rm UNCHANGING}(s_{1}) \cup {\rm UNCHANGING}(s_{2})$
Note that CHANGING and UNCHANGING still stay disjoint.
(3) For each (i. e.
not only for the ones suggested by INJG) inductive subgoal $g$ of $s_{1}$ that is subgoal-mergeable into a subgoal $g'$ of $s_{2}$ we merge this subgoal $g$ into $g'$, yielding a new subgoal $g''$ which differs from $g'$ as follows:
(a) ${\rm CASE}(g'') := {\rm CASE}(g)\ \wedge\ {\rm CASE}(g')$
(b) Each substitution $\sigma$ of $g$ is merged into each substitution $\sigma'$ of $g'$ into which it is substitution-mergeable, yielding (for each merge) a substitution $\sigma''$ of $g''$ which is the union of $\sigma$ and $\sigma'$.
(4) (Surprisingly found on page 196.) If the CASEs of two subgoals now consist of the same conjuncted terms, the two subgoals are merged into one that conjoins all conjuncted terms of the premises of the two subgoals in its premise.
(5) The base subgoals are reconstructed from the new inductive subgoals.
Now $s''$ is sound for the same reason as $s_{2}$:
- Completeness of the case distinction is given because of the reconstruction of the base subgoals.
- Making CASE stronger cannot do any harm to the foundation of the induction.
- The possibly additional substitutions occur only outside of the TERM of $s_{2}$, because of (1) of the definition of substitution-mergeable and (2), (3) of the definition of mergeable.
[Stevens] recognized BM's approach to be too simple in not allowing repeated schemes in place of the scheme $s_{1}$. They give a rather nice example for this. Furthermore, their approach allows one to treat subsumption and merging in a uniform way (for which they have to throw away the injectivities (as suggested by us)).
Note: Having read this, one might assume that the merging of schemes is of no use when one does not want to precompute the induction scheme. However, the next section about the FLAWED heuristic makes sense even when not precomputing induction schemes, and it rests on the assumption that all subsumption has been removed and all merging has been done.

HOW BM TACKLE (3.) (HEURISTIC SELECTION OF AN INDUCTION SCHEME)

1.
FLAWED INDUCTION SCHEMES

Some schemes are said to be `flawed'. Unflawed schemes are preferred to flawed ones. Roughly speaking, we guess that a flawed scheme should be one whose CHANGING variables overlap with another scheme's CHANGING or UNCHANGING variables in a mode that does not allow merging, because of a clash on the CHANGING variables or an overlapping of the CHANGING variables into the other's UNCHANGING variables; BM's definition, however, is a little different and a lot obscure. A scheme $s$ is flawed :iff there is another scheme $s'$ with $({\rm CHANGING}(s') \cup {\rm UNCHANGING}(s')) \cap {\rm IND-VAR}(s) \not= \emptyset$; where the set of `induction variables' of $s$ is defined by ${\rm IND-VAR}(s) := ???$. ??? seems to be just the set of actuals of the changing-measured positions that have led to the generation of $s$. However, we do not know whether we understand this correctly, nor how to find the induction variables of merged schemes. For this to make sense it is necessary that the merging has been completed before, i. e. that $s$ and $s'$ are not mergeable. [Stevens] try to explain what is good about BM's flawing concept and what is bad. Their treatment, however, is not enlightening. We keep thinking that our initial (roughly speaking) formulation of the problem is the adequate one: On the one hand, the more sophisticated treatments are not convincing. On the other, there are shortcomings in the concept (i. e. certain possibilities for induction are not considered at all) which should be removed before introducing further sophistication.

2. SCORES

If telling apart the flawed from the unflawed schemes does not settle the matter, then among the remaining induction schemes one is chosen by an attribute called SCORE, which is constructed along with the scheme construction (in a manner we do not understand) and manipulated when merging (the new score is the sum of the old ones) and when subsuming (the subsumer's score is increased by the subsumee's). The highest score wins.
If there are several winners: Do the induction on the scheme derived from the template whose term TERM contains the biggest number of `nasty' functions in the conjecture. This tends to remove many nasty functions after simplification of the inductive subgoal: Because something like TERM tends to appear in premise and conclusion, it is apt to be generalized or cross-fertilized away. A function is said to be `nasty' if, in its recursive cases, it does nothing to its arguments beyond leaving them unchanged or applying a selector. If all this does not yield a single winner, BM claim the conjecture to be most likely symmetric (whatever this means) and choose arbitrarily.
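Under our reading of the `nasty' criterion (a function is nasty iff each argument of each of its recursive calls is the matching formal left unchanged or a selector applied to it), the tie-break count can be sketched as follows. The SELECTORS set and the definition table are illustrative assumptions of ours, not BM's internals:

```python
SELECTORS = {'CAR', 'CDR', 'SUB1'}

def is_nasty(formals, recursive_calls):
    # Nasty: every argument of every recursive call is the matching
    # formal unchanged, or a selector applied to the matching formal.
    for call in recursive_calls:
        for x, arg in zip(formals, call[1:]):
            unchanged = arg == x
            selected = (isinstance(arg, tuple)
                        and arg[0] in SELECTORS and arg[1] == x)
            if not (unchanged or selected):
                return False
    return True

def count_nasty(term, defs):
    # Count occurrences of nasty defined functions in a term.
    if isinstance(term, str):                      # a variable
        return 0
    here = 1 if term[0] in defs and is_nasty(*defs[term[0]]) else 0
    return here + sum(count_nasty(a, defs) for a in term[1:])

# PLUS recurses by SUB1 on its first argument (nasty under this reading);
# the hypothetical OCC recurses through a non-selector DEL (not nasty).
defs = {'PLUS': (('x', 'y'), [('PLUS', ('SUB1', 'x'), 'y')]),
        'OCC': (('x', 'l'), [('OCC', 'x', ('DEL', 'x', 'l'))])}
print(count_nasty(('PLUS', 'i', ('OCC', 'a', 'b')), defs))
# -> 1
```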