Some Private Remarks on the Boyer-Moore Prover (BM for short) and on Induction Heuristics in General. Author: Claus-Peter Wirth. Date: Jan 10, 1992. Unattributed page references always refer to Boyer, Robert S.; Moore, J Strother: A Computational Logic, Academic Press, Menlo Park, California, 1979. Page references to the `new' book refer to Boyer, Robert S.; Moore, J Strother: A Computational Logic Handbook, Academic Press, 1988. Further references: [Stevens] Andrew Stevens: A Rational Reconstruction of Boyer and Moore's Technique for Constructing Induction Formulas. ECAI 88, pp. 565-570. [Walther1] Christoph Walther: Computing Induction Axioms. CADE-11 (?). [Walther2] Christoph Walther: Computing Induction Axioms by Machine. LICS-92 (?). I. REMARKS ON CHAPTER XIV 1. INTERNAL CLAUSAL FORM The process of deductive proving is chiefly driven by exhaustive contextual rewriting on the basis of the internal clause form and by the rippling-out of IFs via the meta-rewrite rule (f ... (IF test pos neg) ...) = (IF test (f ... pos ...) (f ... neg ...)), which becomes necessary because of the form of the BM definitions (with conditional rewrite rules this rippling would no longer be needed). It would therefore be more elegant to show these clauses to the user and to allow him to state his theorems and definitions in a similar form. While the clauses are presented to the user in the form (IMPLIES (AND ...) concl), BM say (page 220 of the new book) that the user must not think in terms of this expression but has to think in clausal form. Their example for this is that a rewrite rule with left-hand side (AND ...) would never apply to the clause above. (Note, however, that such a rule may apply during simplification of the original theorem term, or in case of a real AND function symbol that has survived due to the special treatment explained in `Answers of Ulrich, C.b' below.) We have had three open questions: A. 
How does the BM-prover translate the internal clausal form into the external implicational form for presentation to the user? B. Does this translation function have any influence on the behaviour of the BM-prover, i.e. is the distinguished `concl' a literal of major interest to the BM-prover? C. Does the BM-prover make any use of how the user chooses and structures his logical connectives when expressing a theorem? Answers of Ulrich: A. Clauses are represented internally as lists. The last element of such a list is the conclusion `concl'. B. The list representing a clause is simplified in the order of the list; hence `concl' is simplified last (see (A)). This becomes important for cross-fertilization, where the literals of the form (NOT (EQUAL s t)) are tried for replacement in the order of the internal list until one applies (see page 149). C. The structure of the logical connectives has two influences: a. The ordering of the internal lists that represent the clauses may be affected. b. (page 90, footnote 14) If more than one of the clauses obtained by clausifying cannot be shown by simplification and it comes to an induction, then the BM-prover prefers to induct upon the single input clause rather than upon several ones. This is achieved by not splitting up normalized AND-connectives. The reason behind this is: if the user states an inductive theorem that yields several clauses requiring inductive proof, he has probably done so because it is impossible to show these clauses separately. 2. THE MACHINE OF A DEFINED FUNCTION The so-called `machine' (see page 165) of a function of the BM-prover is conceptually nothing but the set of those conditional rewrite rules corresponding to the definition scheme of the function that are recursive. (Actually, they differ by the flaw described in (3.) below and by the omission of the non-logical superterms of the recursive calls. 
Note: The twofold use of `machines' in BM (for termination on the one hand and for the suggestion of appropriate induction schemes on the other) mirrors the similar uses of conditional equations. For the termination purpose the BM approach allows one to throw away the non-logical superterms because there is never more than one function in the process of definition (henceforth called the `one per time fashion' of definition). For the suggestion of appropriate induction schemes the non-logical superterms may be of no use --- so one could argue that the machines are more appropriate here than conditional equations. However, the whole induction heuristics of BM have little justification (except: if there is an easy induction (and all functions are defined in the one per time fashion), then it is likely to go the BM way), and a more complicated induction scheme (which need not be precomputed) might also make use of the non-logical superterms.) Therefore we think it would be nicer to have the user specify the function by means of conditional equations instead. 3. MACHINES FOR TERMINATION PROOFS, PARTIALLY DEFINED FUNCTIONS The BM-prover has no concepts for partially defined functions. This property becomes relevant even for totally defined functions during the definition process. Take a rule of the form f t = g f s <-- goodnaturedp t = true, f u = 0. Now suppose the call `f s' to be smaller than `f t' if `f u = 0', but not necessarily otherwise. Additionally suppose `f u' to be smaller than `f t' (if `goodnaturedp t = true'). While the definition is fine now (because it is left-right-compatible), the BM-prover is unable to recognize this, because it is forced to show `f s' to be smaller than `f t' under `goodnaturedp t = true' alone, without the necessary additional assumption `f u = 0'. (See page 165.) 4. REVISED MACHINES (i.e. machines with weakened cases) BM say that they do the induction not according to the `machine' but according to a modified version of it: They weaken the cases of the machine (i.e. 
the conditions of the equations) by taking instead the implied conditions of the so-called `induction lemmas' (which are used for proving termination). This weakening seems to be a rather stupid thing to do. Of course, it might save some proof computation that has already been done for the termination proof before. But weaker is weaker. BM give two reasons for their procedure: A. It eliminates weak base subgoals. B. Because the same induction lemmas (which tend to be very few) are used, it arranges for different terms to suggest identical inductions, by allowing subsumptions and mergings that would not be possible otherwise. Now, (A) seems to be nonsense because it is not the base subgoals that make the real trouble. However, (A) is said to be important in [Stevens] (p. 566). [Walther1], contrariwise, recognize (correctly!) that the revision of the machines makes the proofs harder because (in the absence of merging or subsumption) the thrown-away assertions must be regenerated (by case distinction) for the proof to succeed. [Walther1] even show that this can cause the induction to fail (see: Separated Relation Descriptions). Most interesting is our following discovery: While the book of 1979 says the weak base subgoal to be bad, the current version of the BM-prover proves the very same example of the book of 1979 exactly via the weak base subgoal. However, the base subgoal is not proved by induction (as claimed in the book of 1979), but by a simple case distinction. For a commented version of this proof see the file `arithconj.boot' for the definitions and `arithconj.session' for the proof. Note: The functions ZEROP, LESSP, PLUS were redefined to MY-ZEROP, MY-LESSP, PLUS-1 in order to avoid built-in arithmetic. The second redefinition of PLUS-1, from a function definition to two defining rewrite rules, became necessary for the BM-prover to allow tracing of the rewriting of (PLUS I J). 
The only difference this last redefinition makes for the proof of lemma P177 seems to be the non-detection of (NUMBERP (PLUS I J)), which would avoid the last two steps in the proof. For the complete tracing session see the rather large file `arithconj.session.total'. (B) was not understood by us until we read [Walther1], who give a straightforward explanation using their concept of relational descriptions. They call the revision process `range generalization'. In the new book there are no induction lemmas anymore, but it is not described whether the machines are still revised. We suppose that BM have also discovered that (A) is nonsense while (B) is fine, and therefore use their revised machines only for subsumption and merging of the induction schemes, whereas they use the unrevised machine when no subsumption or merging took place on the chosen induction scheme. 5. UNMEASURED POSITIONS BM speak of `a measured subset of variables' instead of measured positions, but we do not follow them here because the term `variable' is misleading. What is the meaning of unmeasured positions? A. The unmeasured positions of a function f are irrelevant for the definition of f to be compatible with the reduction ordering. B. The unmeasured positions of a function f are those that are not put to pieces by this function f in its recursive cases. By `putting to pieces' we mean the application of a destructor that has a somehow inverse constructor. Such destructors are sometimes called selectors. (E.g.: CAR has the constructor CONS, SUB1 has the constructor ADD1; but DELETE has no inverse constructor.) 
(Note that this misshapen double use of the same concept is typical for BM's AI approach to the problem, according to the motto: ``Better a misshapen concept than one that one does not know how to work with.'' A rather nice example on which the two concepts (A) and (B) differ is given by BM on page 176: There the function LESSP allows either its first or its second position to be a measured subset. This is according to (A), because (A) is BM's means to get the concept working. However, for the induction proof it is more important to know which positions are taken to pieces in the sense of (B). Hence, they are actually misled to do an induction proof on the second position because it forms a measured subset on its own. Luckily, this does not turn out to be harmful, because the suggestion is right although the argumentation is wrong. Their lemma is (NOT (LESSP (PLUS I J) I)). Induction on I is right here because it is suggested by PLUS, which recurses on its first argument. However, consider (NOT (LESSP (PLUS J I) I)). This needs induction on J, and the BM-prover really discovers this (see the file `exampleforfive'), because the induction on I is `flawed'. (Note: This is a very nice example of the BM-prover finding a necessary lemma by generalization.) So it is not so easy to fool it, even after some effort (file `secondexampleforfive'). Of course one can fool it (file `modifiedexampleforfive'), but only by mean tricks. Conclusion: The conceptual difference between (A) and (B) does not seem to be very important for simple examples in practice, but it should not be ignored. An example where the attempt to realize (B) by (A) fails is very likely to be corrected by `flaw' detection.) (A) is of importance for the guidance of inductive proofs: Instead of just trying to ripple out the wave-fronts, one is advised to store the wave-fronts into the unmeasured positions. 
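The two lemmas discussed above can be spot-checked with a small Python model of the recursive definitions. The function bodies below mirror the usual BM-style definitions of PLUS and LESSP on the naturals, but they are our own sketch, not BM code:

```python
def plus(i, j):
    # (PLUS I J): recursion on the first argument only
    return j if i == 0 else 1 + plus(i - 1, j)

def lessp(i, j):
    # (LESSP I J): recursion takes both arguments to pieces simultaneously
    if j == 0:
        return False
    if i == 0:
        return True
    return lessp(i - 1, j - 1)

# (NOT (LESSP (PLUS I J) I)) -- induction on I, as suggested by PLUS
assert all(not lessp(plus(i, j), i) for i in range(20) for j in range(20))
# (NOT (LESSP (PLUS J I) I)) -- needs induction on J instead
assert all(not lessp(plus(j, i), i) for i in range(20) for j in range(20))
```

Of course such a finite check says nothing about provability; it merely confirms that both variants of the lemma are plausible before the induction heuristics are asked to find a proof.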
Compared to the more general approach of an enlargeable reduction ordering this is a rather primitive way of arguing; however, it has the advantage of giving a hint that can lead to an operationalization of the argumentation, which is not provided by the more general approach. (B) is most clearly exhibited when one uses the constructor fashion of definition instead of the destructor fashion. So if we use the constructor fashion, we get the information most easily without the concept of measured positions. 6. MEASURED POSITIONS: CHANGING OR UNCHANGING The measured positions, i.e. the members of a set of measured positions, are separated into `changing' and `unchanging' ones. Position i of an n-ary function f with $fx_{0}\ldots x_{n-1}$ as the left-hand side of its definition is `unchanging' :iff for all subterms of the right-hand side of the definition of f that have the form $ft_{0}\ldots t_{n-1}$ we have $t_{i}=x_{i}$ syntactically. What is the use of this separation? The technical aim is to allow additional induction schemes by weakening the restrictions on the unchanging variables. In terms of the two separated concepts (A) and (B) of (5.) the reason can be described in the following way: The (measured) unchanging positions are relevant for the induction ordering. Hence they do not belong to (A). However, they are not put to pieces in the function's recursive cases and therefore do belong to (B). Therefore this distinction can be thought of as a step from the operationalizable complement of (A) towards the intended complement of (B). However, unchanging measured positions are not the only kind of positions that are needed for the induction ordering (i.e. do not belong to (A)) but are not put to pieces in the recursive cases of the function (i.e. do belong to (B)). 
Therefore, speaking in terms of my DA, one really should use (B) for choosing covering sets of substitutions, and separately use (A) for guiding the proof towards induction hypotheses that are smaller than the conclusion. II. REMARKS ON CHAPTER XV (FROM INDUCTION TEMPLATES TO INDUCTION SCHEMES) What was computed for each defined function symbol f in the previous chapter, BM and we will now call a set of `induction templates' (page 183). By an induction template we mean a pair consisting of i. something that partitions the positions of f into unmeasured, unchanging measured, and changing measured ones; ii. the revised machine for the measure leading to (i.). Now the general idea of generating induction schemes from these sets of induction templates is a three-step process explained in this chapter: 1. For each subterm of the conjecture and for each induction template of this subterm's top symbol, generate an induction scheme if possible. 2. Remove each induction scheme `subsumed' by another, and replace each pair of induction schemes of (1.) by their `merge' if possible. 3. Heuristically choose one of the induction schemes of (2.) for the proof attempt for the conjecture. The conceptual approach of BM is again rather ugly (but there does not seem to be a more beautiful way): Instead of dealing with induction over variables, which is the most natural and theoretically necessary way to tackle the problem, BM actually think in terms of positions instead of variables and afterwards correct all the discrepancies between the two concepts. (In this correction process they overlook one discrepancy, exhibited by the funny example of the file `bmmishit'.) HOW BM TACKLE (1.) (GENERATION OF INDUCTION SCHEMES) An induction scheme for a fixed subterm TERM of a conjecture CONJ and a fixed induction template of this subterm's top symbol f is obtained in the following way: Instantiate the formals of the function f according to the actuals (i.e. 
the direct subterms of the fixed subterm TERM of the conjecture CONJ) by a substitution $\gamma$. For each case CASE of the revised machine of the fixed induction template generate a proof subgoal of the form ( ( CASE$\gamma$ $\wedge$ CONJ$\sigma_{0}$ $\wedge$ $\ldots$ $\wedge$ CONJ$\sigma_{m}$ ) --> CONJ ). CONJ is the conjecture one wants to prove. $\gamma$ is the substitution that maps the (naturally distinct) formals of the function f to its actuals in the current subterm. Let the case CASE guard (m+1) calls of the function f, namely: $ft_{0,0}\ldots t_{n-1,0}$; ... ; $ft_{0,m}\ldots t_{n-1,m}$. For each j < (m+1): $\sigma_{j}$ is a substitution whose domain consists of those actuals which are variables. How to construct $\sigma_{j}$ from $\gamma$ and $ft_{0,j}\ldots t_{n-1,j}$ is the central problem of the whole process of (1.). In the sequel we describe BM's procedure for a fixed j of a fixed CASE (of a fixed induction template for a fixed subterm TERM of the conjecture CONJ). Note, however, that the induction scheme consists of all the inductive subgoals of the template plus the base subgoals, which say that the conjunction of the negated cases [CASE] of the inductive subgoals implies the conjecture. Let $\mu$ be the substitution that maps each of the formals $x_{0},\ldots,x_{n-1}$ of $fx_{0}\ldots x_{n-1}$ to the term that occupies its position in $ft_{0,j}\ldots t_{n-1,j}$; i.e. $x_{i}\mu := t_{i,j}$. Now $\sigma_{j}$ is given by mapping (for each formal $x_{i}$ for which $x_{i}\gamma$ is a variable) $x_{i}\gamma$ to $x_{i}\mu\gamma$. If all actuals are distinct variables, everything is fine. But two problems can occur: A. The actual $x_{i}\gamma$ of a formal $x_{i}$ is not a variable. B. The actual is a variable which is the actual of another formal, too. TO A: Three cases: a. i is a changing measured position: BM say that the induction template does not apply to the subterm TERM. 
This is reasonable because we do not know how to put $x_{i}\gamma$ to pieces in this case, and that is exactly what we should do when the position i is in the complement of (B) of (5.) of (I.). So instead of pushing down a selector onto something, it is usually better to have this something send up the constructor. For non-selector destructors (e.g. DELETE), however, it might be necessary to send the destructor down, which the BM-prover cannot do. Hence assume in the sequel, for each changing measured i, $x_{i}\gamma$ to be a variable. b. i is an unchanging measured position: Two cases: alpha. $x_{i}\gamma$ contains a variable of the form $x_{k}\gamma$ where k is a changing measured position: BM say that the induction template does not apply to the subterm TERM. This is forced by their way of founding the induction, because the changing measured k must change in some case CASE' of the revised machine of the induction template: In this case CASE' the foundation of the induction would be destroyed. Even when using a more general means of founding the induction, it is not a bad heuristic not to try an induction here: One should generalize on the occurrence of $x_{k}\gamma$ in $x_{i}\gamma$ before an induction attempt, or try to prove a lemma stating that $x_{i}\gamma$ is insensitive to the change of $x_{k}\gamma$ to $x_{k}\mu\gamma$. beta. $x_{i}\gamma$ contains no variable of the form $x_{k}\gamma$ where k is a changing measured position: This is allowed by BM ($\sigma_{j}$ is then undefined on $x_{i}\gamma$), but it possibly changes $\sigma_{j}$ in the following way: If k is an unmeasured position and $x_{k}\gamma$ is a variable that occurs in $x_{i}\gamma$, then $x_{k}\gamma\sigma_{j}$ is $x_{k}\gamma$ instead of $x_{k}\mu\gamma$. This is heuristically good, but one is tempted not to precompute an induction scheme here (and in other cases). c. i is an unmeasured position: This is allowed by BM. (But $\sigma_{j}$ is, of course, undefined on $x_{i}\gamma$.) 
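The core construction of $\sigma_{j}$ from $\gamma$ and $\mu$, including the rejection in case A.a, can be sketched in Python. The term encoding (strings for variables, tuples for applications) and the function name are our own; cases B.c/B.d are resolved crudely by last-one-wins, which is a simplification of BM's treatment:

```python
def build_sigma(formals, actuals, rec_args, changing_measured):
    """Sketch of the construction of one induction substitution sigma_j.

    formals, actuals, rec_args are parallel lists; a term is a string
    (variable) or a nested tuple like ('SUB1', 'X0').  Returns None when
    the template does not apply, i.e. when a changing measured actual is
    not a variable (case A.a)."""
    gamma = dict(zip(formals, actuals))      # formal -> actual in TERM
    mu = dict(zip(formals, rec_args))        # formal -> term in the recursive call

    def apply_gamma(t):                      # instantiate a term by gamma
        if isinstance(t, tuple):
            return (t[0],) + tuple(apply_gamma(s) for s in t[1:])
        return gamma.get(t, t)

    sigma = {}
    for i, x in enumerate(formals):
        if not isinstance(gamma[x], str):    # actual is not a variable
            if i in changing_measured:
                return None                  # case A.a: template rejected
            continue                         # cases A.b/A.c: sigma undefined here
        sigma[gamma[x]] = apply_gamma(mu[x])  # x_i gamma  ->  x_i mu gamma
    return sigma

# (PLUS I J): PLUS recurses with (SUB1 X0), X1; position 0 is changing measured
print(build_sigma(['X0', 'X1'], ['I', 'J'], [('SUB1', 'X0'), 'X1'], {0}))
# -> {'I': ('SUB1', 'I'), 'J': 'J'}
# (PLUS (PLUS A B) J): the actual at the changing measured position is no variable
print(build_sigma(['X0', 'X1'], [('PLUS', 'A', 'B'), 'J'], [('SUB1', 'X0'), 'X1'], {0}))
# -> None
```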
TO B: Let i and k be positions of f with $x_{i}\gamma = x_{k}\gamma$ a variable. Six cases: a. i and k are changing measured positions: BM say the induction template is not appropriate. This is stupid, as exhibited by the example of the file `bmmishit'. The conceptually fitting way is to say the induction template is appropriate here iff $x_{i}\mu\gamma = x_{k}\mu\gamma$. b. i and k are unchanging measured positions: Because of $x_{i}\mu\gamma = x_{i}\gamma = x_{k}\gamma = x_{k}\mu\gamma$ this is no problem at all. c. i and k are unmeasured positions: BM arbitrarily choose one of $x_{i}\mu\gamma$, $x_{k}\mu\gamma$ as the value of $x_{i}\gamma$ under $\sigma_{j}$. A better idea would be to define two substitutions instead of the single $\sigma_{j}$, one for each suggestion. d. unmeasured / changing measured: The value of $x_{i}\gamma$ under $\sigma_{j}$ is determined by the changing measured position only. e. unchanging measured / changing measured: In this case BM say that the template does not apply. The problem is exactly the one of A.b.alpha. f. unchanging measured / unmeasured: Treated analogously to A.b.beta. HOW BM TACKLE (2.) (SUBSUMPTION AND MERGING OF SCHEMES) BM start with the introduction of further notation: For an induction scheme $s$ generated as described above the following sets are defined: ${\rm CHANGING}(s)$ is the union over all inductive subgoals of the union over all j of the domains of the $\sigma_{j}$. ${\rm UNCHANGING}(s) := {\rm Var}({\rm TERM})\setminus{\rm CHANGING}(s)$. 1. SUBSUMPTION OF SCHEMES If an induction scheme is `subsumed' by another, it is thrown away and this other scheme is given additional credit. 
BM try to explain what they mean by $s_{1}$ being `scheme-subsumed' by $s_{2}$, but we are not sure that we understand it; it is something like the conjunction of: (1) ${\rm CHANGING}(s_{1}) \ \subseteq\ {\rm CHANGING}(s_{2})$ (2) ${\rm UNCHANGING}(s_{1})\ \subseteq\ {\rm UNCHANGING}(s_{2})$ (3) Each inductive subgoal of $s_{1}$ is subgoal-subsumed by an inductive subgoal of $s_{2}$. (4) No inductive subgoal of $s_{2}$ subgoal-subsumes more than one inductive subgoal of $s_{1}$. (Aside: We cannot see any heuristic reason for (4)! Do you? Furthermore, (4) can only occur after revising the machine, which we do not think should be done.) An inductive subgoal $g_{1}$ is subgoal-subsumed by an inductive subgoal $g_{2}$ :iff (1) The terms in the conjunction CASE$\gamma$ of $g_{1}$ are a subset of those of $g_{2}$. (2) Each substitution $\sigma_{j}$ of $g_{1}$ is substitution-subsumed by some such substitution of $g_{2}$. (3) No substitution $\sigma_{j}$ of $g_{2}$ substitution-subsumes more than one such substitution of $g_{1}$. (Asides: 1. We cannot see any heuristic reason for (3)! Do you? 2. Funny enough, [Walther1] treat (1) just the other way round, and what is more, have very good reasons for this!!!) Now what is `substitution-subsumption'? In our opinion it should be something like equality on the induction variables of the subsumer. BM and [Stevens] see this differently, and if you want to know their opinion you have to read the following three complicated paragraphs: The following definition is wrong because it thinks in terms of constructors substituted into the conclusion instead of destructors substituted into the hypotheses, but it is easier to state it this way: A substitution $\sigma$ is substitution-subsumed by a substitution $\sigma'$ :(bugged!)iff there is a substitution $\kappa$ with $\sigma\kappa = \sigma'$. Note that this last definition is just the reverse of what one usually expects of subsumption. 
A proper definition is: A substitution $\sigma$ is substitution-subsumed by a substitution $\sigma'$ :iff for each term $t'$ in the range of $\sigma'$ there is a set of positions P such that (a) for each variable position v in $t'$ there are p, q with v = pq and $p\in P$; (b) for each $p\in P$ there is a term $t$ in the range of $\sigma$ such that $t'/p = t$ syntactically. [Stevens] understand BM in a different way: They require $\sigma=\sigma'$ here but allow `repeated' schemes on the subsumee's side: A repeated scheme for a function f is either an induction scheme for the function f or the result of expanding a machine M for the function f on one of the subgoals of a repeated scheme R for the function f. Expanding a machine M on a subgoal $g$ is done by instantiating the formals of M according to some ${\rm TERM}\sigma_{k}$ of $g$ and then replacing $g$ by a new subgoal for each case CASE of M, which adds this CASE to that of $g$ and takes over all $\sigma_{j}$ from $g$ except for $\sigma_{k}$, which is changed according to the recursive calls of CASE. [Walther1] (Related Work) understand BM's explanation differently again (all $t$ in the range of $\sigma$ occur as subterms in the range of $\sigma'$). In their own work on the subject [Walther1], however, require $\sigma=\sigma'$ on the induction variables of $s_{2}$. Their treatment allows the conclusion that the subsumption by `repeated machines' is rubbish anyway (see chapter III). So do not be sorry for not having understood the last paragraphs. 2. MERGING OF SCHEMES Because BM precompute the induction scheme, they must be very sophisticated in guessing the right scheme. One step towards intelligent guessing is the `merging' of schemes, which replaces two schemes (one mergeable into the other) in the set of schemes under choice by their merge. 
Roughly speaking, two induction schemes merge if their CHANGINGs overlap in a compatible way, in the sense that the suggested changes in the intersection of the CHANGINGs are identical and the remaining CHANGING variables are not among the other scheme's UNCHANGINGs. Scheme $s_{1}$ is mergeable into scheme $s_{2}$ :iff (1) ${\rm CHANGING} (s_{1})\cap{\rm CHANGING} (s_{2})\not=\emptyset$ (2) ${\rm UNCHANGING}(s_{1})\cap{\rm CHANGING} (s_{2}) =\emptyset$ (3) ${\rm CHANGING} (s_{1})\cap{\rm UNCHANGING}(s_{2}) =\emptyset$ (4) There is an injective mapping INJG from the inductive subgoals of $s_{1}$ into the inductive subgoals of $s_{2}$ such that each inductive subgoal $g$ of $s_{1}$ is subgoal-mergeable into ${\rm INJG}(g)$. (Aside: We do not see a reason for the injectivity in (4)! Do you?) Inductive subgoal $g$ is subgoal-mergeable into inductive subgoal $g'$ :iff there is some injective mapping INJS from the substitutions $\sigma_{j}$ of $g$ into those substitutions of $g'$ such that each substitution $\sigma_{j}$ of $g$ and ${\rm INJS}(\sigma_{j})$ are substitution-mergeable. (Aside: We see no reason for the injectivity! Do you? Note, however: Even if injectivity is not needed at all, BM cannot arrive at a symmetric kind of merging, where the unmerged subgoals of both induction schemes are taken over unchanged into the merged induction scheme (and so are the unmerged substitutions). This is because one of the old schemes must lay the foundation for the induction by its measure, and it is not enough to decrease w.r.t. one measure or the other.) Two substitutions are substitution-mergeable :iff (1) Their restrictions to the intersection of the variable sets of their terms TERM are identical. (2) The intersection of their domains is not empty. 
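On the level of the CHANGING/UNCHANGING sets and of single substitutions, the conditions above can be sketched in Python. Sets and dicts stand in for the variable sets and substitutions, and in the substitution test the common domain stands in for the intersection of the variable sets of the two TERMs; the encoding is ours, not BM's:

```python
def schemes_mergeable(ch1, un1, ch2, un2):
    # conditions (1)-(3): the CHANGINGs overlap, and neither scheme's
    # CHANGING variables fall into the other scheme's UNCHANGINGs
    return bool(ch1 & ch2) and not (un1 & ch2) and not (ch1 & un2)

def substitutions_mergeable(s1, s2):
    # the two substitutions agree on their common domain,
    # and that common domain is not empty
    common = set(s1) & set(s2)
    return bool(common) and all(s1[v] == s2[v] for v in common)

print(schemes_mergeable({'I'}, set(), {'I', 'J'}, set()))              # CHANGINGs overlap
print(substitutions_mergeable({'I': '(SUB1 I)'},
                              {'I': '(SUB1 I)', 'J': '(SUB1 J)'}))     # agree on I
```

The subgoal-level condition (the injective mapping INJS) is omitted here; it would be a straightforward matching on top of these two tests.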
The result $s''$ of the merging is similar to $s_{2}$, but with the following exceptions: (1) ${\rm CHANGING} (s'') := {\rm CHANGING} (s_{1}) \cup {\rm CHANGING} (s_{2})$ (2) ${\rm UNCHANGING}(s'') := {\rm UNCHANGING}(s_{1}) \cup {\rm UNCHANGING}(s_{2})$ Note that CHANGING and UNCHANGING still stay disjoint. (3) Each (i.e. not only the ones suggested by INJG) inductive subgoal $g$ of $s_{1}$ that is subgoal-mergeable into a subgoal $g'$ of $s_{2}$ is merged into $g'$, yielding a new subgoal $g''$ which differs from $g'$ by: (a) ${\rm CASE}(g'') := {\rm CASE}(g)\ \wedge\ {\rm CASE}(g')$ (b) Each substitution $\sigma$ of $g$ is merged into each substitution $\sigma'$ of $g'$ into which it is substitution-mergeable, yielding (for each merge) a substitution $\sigma''$ of $g''$ which is the union of $\sigma$ and $\sigma'$. (4) (Surprisingly found on page 196.) If now the CASEs of two subgoals consist of the same conjuncted terms, the two subgoals are merged into one whose premise conjuncts all conjuncted terms of the premises of the two subgoals. This indeed becomes necessary. For an example and a more generally applicable solution of this problem see [Walther1] (Separated Relation Descriptions). (5) The base subgoals are reconstructed from the new inductive subgoals. Now $s''$ is sound for the same reason as $s_{2}$: - Completeness of the case distinction is given because of the reconstruction of the base subgoals. - Making CASE stronger cannot do any harm to the foundation of the induction. - The possibly additional substitutions operate only outside of the TERM of $s_{2}$, because of (1) of the definition of substitution-mergeable and (2), (3) of the definition of mergeable. [Stevens] recognized BM's approach to be too simple in not allowing repeated schemes in place of the scheme $s_{1}$. They give a rather nice example for this. 
Furthermore, their approach allows subsumption and merging to be treated in a uniform way (for which they have to throw away the injectivities (as suggested by us)). Note: Having read this, one might assume that the merging of schemes is of no use when one does not want to precompute the induction scheme. However, the next section about the FLAWED heuristic makes sense even when not precomputing induction schemes, and it is strongly influenced by the assumption that all subsumption has been removed and all merging been done. HOW BM TACKLE (3.) (HEURISTIC SELECTION OF AN INDUCTION SCHEME) 1. FLAWED INDUCTION SCHEMES Some schemes are said to be `flawed'. Unflawed schemes are preferred to flawed ones. Roughly speaking, we guess a flawed scheme should be one that overlaps with its CHANGING variables into another scheme's CHANGING or UNCHANGING variables in a mode that does not allow merging, because of a clash on the CHANGING variables or an overlapping of the CHANGING variables into the other's UNCHANGING variables; whereas BM's definition is a little different and a lot obscure. A scheme $s$ is flawed :iff there is another scheme $s'$ with $({\rm CHANGING}(s') \cup {\rm UNCHANGING}(s')) \cap {\rm IND-VAR}(s) \not= \emptyset$; where the set of `induction variables' of $s$ is defined by ${\rm IND-VAR}(s) := ???$. ??? seems to be just the set of actuals of the changing measured positions that have led to the generation of $s$. However, we do not know whether we understand this correctly, nor how to find the induction variables of merged schemes. For this to make sense it is necessary that the merging has been completed before, i.e. that $s$ and $s'$ are not mergeable. [Stevens] try to explain what is good about BM's flawing concept and what is bad. Their treatment, however, is not enlightening. We keep thinking our initial (roughly speaking) formulation of the problem to be the adequate one: On the one hand the more sophisticated treatments are not convincing. On the other there are shortcomings (i.e. 
certain possibilities for induction are not considered at all) in the concept, which should be removed before introducing further sophistication. 2. SCORES If telling apart the flawed from the unflawed schemes does not settle the score, then among the remaining induction schemes one is chosen by an attribute called SCORE, which is constructed along with the scheme construction (in a manner we do not understand) and manipulated when merging (the new score is the sum of the old ones) and when subsuming (the subsumer's score is increased by the subsumee's). The highest score wins. If there are several winners: Do the induction on the scheme derived from the template whose term TERM contains the biggest number of `nasty' functions of the conjecture. This tends to remove many nasty functions after simplification of the inductive subgoal: Because something like TERM tends to appear in premise and conclusion, it is apt to be generalized or cross-fertilized away. A function is said to be `non-nasty' :iff in its recursive cases it does nothing to its positions or applies selectors only. If all this does not yield a single winner, BM claim the conjecture to be most likely symmetric (whatever this means) and choose arbitrarily. III. SOME REMARKS ON [Walther1] AND [Walther2] [Walther1, Walther2] also try to tackle the problems described here. They describe the inductive subgoals of induction schemes by so-called `atomic relational descriptions' and the whole induction scheme by a `(compound) relational description', which is the set of its inductive subgoals; these are not required to have disjoint cases CASE$\gamma$. However, for the following reduction to orderings described by these compound relational descriptions they have to require disjointness, and they achieve it by what they call `separation', which has to deal with (4) in the description of BM's merging process. Their general idea is to reduce the problems of subsumption and merging to problems of an ordering defined by the compound relational description. 
This reduction is a good idea for the simplification of the problems --- even though it throws away the following kind of information: For an induction to succeed it is not only necessary to have an induction scheme that allows sufficiently strong induction hypotheses (the strength being measured by all constructor ground instantiations of the scheme) (note: this means that their strength is something inductive instead of deductive), but also to supply syntactic (deductive) means to use this strength in the induction proof. Therefore they had better have defined their strength on terms instead of on the objects of their model. This, however, does not fit into their world, and their approach may be sufficient for practical purposes because the means to compare the strengths is the theorem prover again. (This means: If one induction scheme being stronger than another does not allow the conclusion that it is more appropriate for induction, then this being stronger is most unlikely to be detected.) As this last flaw seems to be difficult to understand, we restate it now: Aim: Throw out of a set of induction schemes those induction schemes for which there is another induction scheme in this set which is guaranteed to work at least as well. s `working at least as well' as s' means: If s' is deductively valid, then s is deductively valid, too. Maybe [Walther1] see this a little differently, with `deductively valid' replaced by `provable by our theorem prover'. But this theorem prover is a first-order theorem prover (at least on the first level, until it decides to induct again). [Walther1] differ from BM by doing the subsumption and the merging on induction schemes that do not determine special values for the unmeasured variables but give don't-care variables instead, which are considered to be renamed for each appearance. By this they correct an up to now undetected flaw in BM's recursion analysis. The insight into this flaw comes from their problem reduction process. 
So while [Walther1] have the distinction between measured and unmeasured positions, they do not need the distinction between changing and unchanging variables of the induction scheme. This approach makes their subsumption a very straightforward heuristic, compared to which the approach of BM is really obscure. Ulrich, however, drew our attention to the following problem: [Walther] have their unmeasured variables universally quantified in the premise of the induction scheme. Pulling these quantifiers out, they turn into existential quantifiers. Now, while INKA of [Walther] allows existential quantifiers, BM do not. So before BM can prove the formula chosen by the heuristic of [Walther], each of the existentially quantified variables must be replaced by some term with globally universally quantified variables. These terms must be chosen the way BM choose the terms for the unmeasured variables when generating their induction schemes from their induction templates. But even when limited to the choice of induction schemes, [Walther] means a possible improvement over BM: if one must throw away information one is really interested in keeping, one should do it as late as possible. So first choose (using the more adequate version of [Walther], including existential quantifiers) and then throw away some information to fit the schemes to the BM-prover, which does not know existential quantifiers. A prover for inductive truths, however, had better be able to include this very restricted form of existential quantifiers --- but it is not clear how to do this. Furthermore, [Walther1] give a straightforward explanation (under the headline `Range Generalization') for the revised machines of BM. [Walther1] do not treat the concept of repeated machines of BM or [Stevens]. E.g. on (NOT (LESSP I (HALF I))) they cannot tell
(NOT (ZEROP I)) {{I-->(SUB1 I)}}
to be worse than
((NOT (ZEROP I)) (NOT (ZEROP (SUB1 I)))) {{I-->(SUB1 (SUB1 I))}}
by subsumption, as BM and [Stevens] can.
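For concreteness, the running conjecture (NOT (LESSP I (HALF I))) can be checked over an initial segment of the naturals. The following is only our own Python modelling of the usual BM shell functions (HALF recursing by two SUB1 steps, LESSP the ordering on naturals), not part of either system:

```python
def half(i):
    # Our modelling of (HALF I): (HALF 0) = (HALF 1) = 0,
    # and HALF of n+2 is 1 + HALF of n.
    return 0 if i < 2 else 1 + half(i - 2)

def lessp(i, j):
    # Our modelling of (LESSP I J) on the naturals.
    return i < j

# The conjecture (NOT (LESSP I (HALF I))) holds on an initial segment:
assert all(not lessp(i, half(i)) for i in range(500))
```

Of course such testing proves nothing; it merely fixes the intended meaning of the example before we discuss which induction scheme proves it.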
Knowing that the second one works well and that the first does not, one might be disappointed by this result. Contrariwise, we are happy with it, because we always thought that the two above should be merged into
((NOT (ZEROP I)) (NOT (ZEROP (SUB1 I)))) {{I-->(SUB1 (SUB1 I))},{I-->(SUB1 I)}}
which is surely better than both of the above. Hence we would like it to subsume the former ones. So let us see what the other authors' subsumption concepts achieve on this: BM's approach recognizes both subsumptions, which is not surprising because BM (while not knowing any reason for it) tend to work well with simple examples. [Walther1], however, fail on the subsumption of the first. That is because their subsumption concept (which has much more reason behind it than BM's) is not strong enough: it lacks a complete case distinction over the subsumee's case. Instead of being sorry that the case of the first induction scheme does not imply the case of the last, [Walther1] should use the case distinction:
(NOT (ZEROP (SUB1 I)))
(NOT (NOT (ZEROP (SUB1 I)))).
For the first case they can then establish their subsumption condition at once. For the second case [Walther1] get (EQUAL I 1), which means that this is no inductive subgoal anymore and need not be tested for. Further problems not explained in [Walther1] are:
the distinction changing/unchanging for measured positions (not to be confused with the omission of changing/unchanging variables of induction schemes, which is done on purpose),
the merging of induction schemes,
flawed induction schemes.
In [Walther2] there is a kind of merging given, called `separated union'.
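That the second case of this case distinction collapses to (EQUAL I 1) is easy to check mechanically over the naturals. The Python modelling of ZEROP and SUB1 below is our own sketch, with SUB1 of 0 being 0 as usual:

```python
def zerop(i):
    # Our modelling of (ZEROP I): true iff I is zero.
    return i == 0

def sub1(i):
    # Our modelling of (SUB1 I): predecessor on the naturals, (SUB1 0) = 0.
    return max(i - 1, 0)

# The second case: (AND (NOT (ZEROP I)) (NOT (NOT (ZEROP (SUB1 I))))),
# i.e. I is non-zero but its predecessor is zero.
satisfying = [i for i in range(1000) if not zerop(i) and zerop(sub1(i))]
# Only I = 1 satisfies it, so the case amounts to (EQUAL I 1).
```

So the case is no inductive subgoal, but a single base value.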
For our two example schemes
(NOT (ZEROP I)) {{I-->(SUB1 I)}}
((NOT (ZEROP I)) (NOT (ZEROP (SUB1 I)))) {{I-->(SUB1 (SUB1 I))}}
this separated union is
((NOT (ZEROP I)) (NOT (ZEROP (SUB1 I)))) {{I-->(SUB1 (SUB1 I))},{I-->(SUB1 I)}}
((NOT (ZEROP I)) (ZEROP (SUB1 I))) {{I-->(SUB1 I)}}
If we reconsider our subsumption problem now, we see that the case distinction still must be added to the subsumption of [Walther1], whereas the second (artificially) inductive subgoal frees us from the problem of recognizing that (AND (NOT (ZEROP I)) (ZEROP (SUB1 I))) is no inductive case anymore. For this kind of merging, however, we have to do a termination proof, which is not difficult here because the one relation is in the transitive closure of the other. In general this termination proof is difficult. [Walther2] still leaves open the concepts of unchanging measured positions, which we believe can easily be included in their theory, and of flawed induction schemes. The property of being flawed serves two purposes in BM: firstly, it corrects the misleading hints of their bad kind of merging and subsumption; this is not needed anymore. Secondly, it establishes a kind of measure of how troublesome an induction will get by instantiating variables of terms not in the focus of this recursion analysis. This heuristically important measure has yet to be established by [Walther]. We suppose they have not done it yet because it has nothing to do with the ordering aspect of inductive proof. But now that the fog has cleared in front of the flaw-heuristic that prevented understanding and improvement of the BM heuristics for more than a decade, one can hope to get further on this last subject.
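That the two inductive subgoals of this separated union really form a disjoint and complete case analysis of the non-base values of I can again be checked by brute force over the naturals. As before, the Python modelling of the BM functions is our own sketch:

```python
def zerop(i):
    # Our modelling of (ZEROP I).
    return i == 0

def sub1(i):
    # Our modelling of (SUB1 I), with (SUB1 0) = 0.
    return max(i - 1, 0)

def case1(i):
    # ((NOT (ZEROP I)) (NOT (ZEROP (SUB1 I)))),
    # with hypotheses I-->(SUB1 (SUB1 I)) and I-->(SUB1 I).
    return not zerop(i) and not zerop(sub1(i))

def case2(i):
    # ((NOT (ZEROP I)) (ZEROP (SUB1 I))), with hypothesis I-->(SUB1 I).
    return not zerop(i) and zerop(sub1(i))

# On every non-zero natural exactly one of the two cases applies,
# and neither applies to the base value 0:
assert all(case1(i) != case2(i) for i in range(1, 1000))
assert not case1(0) and not case2(0)
```

This is exactly the disjointness that `separation' is meant to guarantee; the price, as said above, is the extra termination proof for the combined substitutions.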