Aesop
This is a work-in-progress proof search tactic for Lean 4.
Building
With elan installed, lake build
should suffice.
Adding Aesop to Your Project
To use Aesop in a Lean 4 project, first add this package as a dependency. In
your lakefile.lean, add
dependencies := #[
{ name := `aesop
src := Source.git "https://github.com/JLimperg/aesop" "" }
]Now run lake configure. Unless you use the exact same Lean 4 nightly as this
project (see our lean-toolchain), you'll get a warning. If you use a later
nightly, you'll probably be fine and Aesop will compile anyway. If not, please
open an issue and we'll update the tactic.
You should now be able to use the aesop tactic:
import Aesop
example : α → α :=
by aesopTODO Document (and, indeed, find out) how to use Aesop as a plugin.
Usage
This section is intended for prospective Aesop users. Aesop is very work in progress and this text may be out of date, so if you have questions, please ping me (Jannis Limperg) on the Lean Zulip.
Quickstart
To get you started, I'll explain Aesop's major concepts with a series of examples. A more thorough, reference-style discussion of follows in the next section.
We first define our own version of lists (so as not to clash with the standard
library) and an append function:
inductive MyList (α : Type _)
| nil
| cons (hd : α) (tl : MyList α)
namespace MyList
protected def append : (_ _ : MyList α) → MyList α
| nil, ys => ys
| cons x xs, ys => cons x (MyList.append xs ys)
instance : Append (MyList α) :=
⟨MyList.append⟩We also tell simp to unfold applications of append. aesop uses the default
simp set when it normalises a goal, so it will automatically pick up these rules
as well.
@[simp]
theorem nil_append : nil ++ xs = xs := rfl
@[simp]
theorem cons_append : cons x xs ++ ys = cons x (xs ++ ys) := rflNow we define the NonEmpty predicate on MyList:
declare_aesop_rule_sets [MyList.NonEmpty]
@[aesop safe (rule_sets [MyList.NonEmpty]) [constructors, cases]]
inductive NonEmpty : MyList α → Prop
| cons : NonEmpty (cons x xs)Here we see the first Aesop features. The declare_aesop_rule_sets command is
used to, well, declare Aesop rule sets. These are collections of rules,
which roughly correspond to tactics. When Aesop searches for a proof, it
systematically applies each available rule, then recursively searches for proofs
of the subgoals generated by the rule, and so on. A goal is proved when Aesop
applies a rule that generates no subgoals.
The @[aesop] attribute adds two rules for NonEmpty to our new rule set:
- A
constructorsrule. This rule tries to apply each constructor ofNonEmptywhenever a goal has targetNonEmpty _. - A
casesrule. This rule searches for hypothesesh : NonEmpty _and performs case analysis on them (like thecasestactic).
constructors and cases are examples of rule builders, which are
functions that turn a declaration into a rule.
Both rules above are added as safe rules. When a safe rule succeeds on a goal encountered during the proof search, it is applied and the goal is never revisited thereafter. In other words, the search does not backtrack safe rules. We will later see unsafe rules, which can backtrack.
With these rules, we can prove a theorem about NonEmpty and append:
@[aesop safe apply]
theorem nonEmpty_append {xs : MyList α} ys :
NonEmpty xs → NonEmpty (xs ++ ys) := by
aesop (rule_sets [MyList.NonEmpty])Here we call Aesop with the MyList.NonEmpty rule set, plus some default rule
sets which are implicitly enabled (but can be disabled). Aesop finds a proof by
introducing the hypothesis h : NonEmpty xs, performing case analysis on h
and applying the NonEmpty.cons constructor.
If you want to see how Aesop proves your goal (or more likely, why it doesn't prove your goal or why it takes too long to do so), you can enable tracing:
set_option trace.aesop.steps trueThis makes Aesop print out the steps it takes while searching for a proof. There
are also other tracing options which autocompletion should show you if you type
set_option trace.aesop.
The @[aesop] attribute on nonEmpty_append adds this lemma as a safe rule to
the default rule set. For this rule we use the apply rule builder, which
generates a rule that tries to apply nonEmpty_append whenever the target is of
the form NonEmpty (_ ++ _).
After adding nonEmpty_append, Aesop can now prove consequences of this lemma:
example {α : Type u} {xs : MyList α} ys zs :
NonEmpty xs → NonEmpty (xs ++ ys ++ zs) := by
aesopNext, we prove another simple theorem about NonEmpty:
theorem nil_not_NonEmpty (xs : MyList α) : xs = nil → ¬ NonEmpty xs := by
aesop (add unsafe 10% cases MyList, norm unfold Not)
(rule_sets [MyList.NonEmpty])Here, we add two rules in an add clause. These rules are not part of a
rule set but are added for this Aesop call only. They demonstrate two new
features:
-
Unsafe rules are rules which can backtrack. Here we add
cases MyListas an unsafe rule because when we performcaseson a hypothesisxs : MyList α, we gethd : αandtl : MyList α. If the rule was safe, we could easily get into an infinite loop where we callcasesontlagain, and so on. Unsafe rules, by contrast, are applied 'in parallel', so while we might still end up exploring a proof attempt that performscases20 times, other possible rules will also be considered and will hopefully lead us down a more fruitful path.To prioritise these different proof attempts, every unsafe rules is annotated with a success probability, here 10%. This should be an estimate of how likely it is that the rule will to lead to a successful proof. This estimate is used to prioritise goals: the initial goal starts with a priority of 100% and whenever we apply an unsafe rule, the priority of its subgoals is the priority of its parent goal times the success probability of the applied rule. So applying our
cases MyListrule repeatedly would give us goals with priority 10%, 1%, 0.1% etc. Of course, the success probability of a rule is only a ballpark figure; it is not supposed to be precise in any sense. -
Norm rules are yet another class of rules. They are used to normalise the goal before any other rules are applied. As part of this normalisation process, we run a variant of
simp_allwith the globalsimpset plus Aesop-specific simp lemmas. Theunfoldbuilder adds such an Aesop-specific simp lemma which unfolds theNotdefinition. (This is only necessary because the builtin rules currently do not handle negation very well.)
Here are some other examples where normalisation comes in handy:
@[aesop norm]
theorem nil_append (xs : MyList α) : nil ++ xs = xs := rfl
@[aesop norm]
theorem cons_append (x : α) xs ys : cons x xs ++ ys = cons x (xs ++ ys) := rfl
@[aesop norm]
theorem append_nil {xs : MyList α} :
xs ++ nil = xs := by
induction xs <;> aesop
theorem append_assoc {xs ys zs : MyList α} :
(xs ++ ys) ++ zs = xs ++ (ys ++ zs) := by
induction xs <;> aesopAfter we've added unfolding lemmas for append, Aesop can prove theorems about
this function more or less by itself. (For these examples, simp would already
suffice.) However, we still need to perform induction explicitly. This is a
deliberate design choice: while there are some techniques for automating
induction, they are complex and not entirely reliable, so I don't think it makes
sense to implement them in a system where users can easily perform induction
themselves.
More examples may be found in the tests folder of this repository.
Reference
This section contains a systematic and fairly comprehensive account of how Aesop operates.
Rules
A rule is a tactic plus some associated metadata. Rules come in three flavours:
-
Normalisation rules (keyword
norm) must generate zero or one subgoal. (Zero means that the rule closed the goal). Each normalisation rule is associated with an integer penalty (default 1). Normalisation rules are applied in a fixpoint loop in order of penalty, lowest first. For rules with equal penalties, the order is unspecified. See below for details on the normalisation algorithm.Normalisation rules can also be simp lemmas. These are constructed with the
unfoldorsimpbuilder. They are used by a specialsimpcall during the normalisation process. -
Safe rules (keyword
safe) are applied after normalisation but before any unsafe rules. When a safe rule is successfully applied to a goal, the goal becomes inactive, meaning no other rules are considered for it. Like normalisation rules, safe rules are associated with a penalty (default 1) which determines the order in which the rules are tried.Safe rules should be provability-preserving, meaning that if a goal is provable and we apply a safe rule to it, the generated subgoals should still be provable. This is a less precise notion than it may appear since what is provable depends on the entire Aesop rule set.
-
Unsafe rules (keyword
unsafe) are tried only if all available safe rules have failed on a goal. When an unsafe rule is applied to a goal, the goal is not marked as inactive, so other (unsafe) rules may be applied to it. These rule applications are considered independently until one of them proves the goal (or until we've exhausted all available rules and determine that the goal is not provable with the current rule set).Each unsafe rule has a success probability between 0% and 100%. These probabilities are used to determine the priority of a goal. The initial goal has priority 100% and whenever we apply an unsafe rule, the priorities of its subgoals are the priority of the rule's parent goal times the rule's success probability. Safe rules are treated as having 100% success probability.
Rules can also be multi-rules. These are rules which 'nondeterministically'
apply multiple tactics to the goal. Each of these tactics is then considered as
one possible way to solve the goal. For example, registering the constructors of
the Or type will generate a multi-rule that, given a goal with target A ∨ B,
generates one rule application with goal A and one with goal B.
Search Tree
Aesop's central data structure is a search tree. This tree alternates between two kinds of nodes:
- Goal nodes: these nodes store a goal, plus metadata relevant to the search. The parent and children of a goal node are rule application nodes. In particular, each goal node has a priority between 0% and 100%.
- Rule application ('rapp') nodes: these goals store a rule (plus metadata).
The parent and children of a rapp node are goal nodes. When the search tree
contains a rapp node with rule
r, parentpand childrenc₁, ..., cₙ, this means that the tactic of rulerwas applied to the goal ofp, generating the subgoals of thecᵢ.
When a goal node has multiple child rapp nodes, we have a choice of how to solve the goals. This makes the tree an AND-OR tree: to prove a rapp, all its child goals must be proved; to prove a goal, any of its child rapps must be proved.
Search
We start with a search tree containing a single goal node. This node's goal is the goal which Aesop is supposed to solve. Then we perform the following steps in a loop, stopping if (a) the root goal has been proved; (b) the root goal becomes unprovable; or (c) one of Aesop's rule limits has been reached. (There are configurable limits on, e.g., the total number of rules applied or the search depth.)
-
Pick the highest-priority active goal node
G. Roughly speaking, a goal node is active if it is not proved and we haven't yet applied all possible rules to it. -
If the goal of
Ghas not been normalised yet, normalise it. That means we run the following normalisation loop:- Run the normalisation rules with negative penalty (lowest penalty first). If any of these rules is successful, restart the normalisation loop with the goal produced by the rule.
- Run
simpon all hypotheses and the target, using the global simp set (i.e. lemmas tagged@[simp]) plus Aesop'ssimprules. - Run the normalisation rules with positive penalty (lowest penalty first). If any of these rules is successful, restart the normalisation loop.
The loop ends when all normalisation rules fail. It destructively updates the goal of
G(and may prove it outright). -
If we haven't tried to apply the safe rules to the goal of
Gyet, try to apply each safe rule (lowest penalty first). As soon as a rule succeeds, add the corresponding rapp and child goals to the tree and markGas inactive. The child goals receive the same priority asG. -
Otherwise there is at least one unsafe rule that hasn't been tried on
Gyet (or elseGwould have been inactive). Try the unsafe rule with the highest success probability and if it succeeds, add the corresponding rapp and child goals to the tree. The child goals receive the priority ofGtimes the success probability of the applied rule.
A goal is unprovable if we have applied all possible rules to it and all resulting child rapps are unprovable. A rapp is unprovable if any of its subgoals is unprovable.
During the search, a goal or rapp can also become irrelevant, meaning its provability has no impact on the success of the overall proof search. More formally, irrelevance is characterised by the following conditions:
- A goal is irrelevant if its parent rapp is unprovable. (This means that a sibling of the goal is already unprovable, in which case we already know that the parent rapp will never be proved.)
- A rapp is irrelevant if its parent goal is proved. (This means that a sibling of the rapp is already proved, and we only need one proof.)
- A goal or rapp is irrelevant if any of its ancestors is irrelevant.
Rule Builders
A rule builder is a metaprogram that turns a declaration or hypothesis into an Aesop rule. Currently available builders are:
-
apply: generates a rule which tries to apply the given declaration or hypothesisxto the target. The rule acts like the tacticapply x. -
forward: when applied to a declaration or hypothesis of typeA₁ → ... Aₙ → B, generates a rule which looks for hypothesesh₁ : A₁, ...,hₙ : Aₙin the goal and, if they are found, adds a new hypothesish : B. As an example, consider the lemmaeven_or_odd:even_or_odd : ∀ (n : Nat), Even n ∨ Odd n
Registering this as a forward rule will cause the goal
n : Nat m : Nat ⊢ T
to be transformed into this:
n : Nat hn : Even n ∨ Odd n m : Nat hm : Even m ∨ Odd m ⊢ T
The forward builder may also be given a list of immediate names:
(forward (immediate := [n])) even_or_oddThe immediate names, here
n, refer to the arguments ofeven_or_odd. When Aesop applies a forward rule with explicit immediate names, it only matches the corresponding arguments to hypotheses. (Here,even_or_oddhas only one argument, so there is no difference.)When no immediate names are given, Aesop considers every argument immediate, except for instance arguments and dependent arguments (i.e. those that can be inferred from the types of later arguments).
When a forward rule has been successfully applied, it will not be tried again when processing its subgoals (and their subgoals, etc.). Without this limit, many forward rules would be applied infinitely often.
-
elim: works likeforward, but after the rule has been applied, hypotheses that were used as immediate arguments are cleared. As the name suggests, this is useful when you want to eliminate a hypothesis. E.g. the rule@[aesop norm elim] theorem and_elim_right : α ∧ β → α := ...will cause the goal
h₁ : (α ∧ β) ∧ γ h₂ : δ ∧ εto be transformed into
h₁ : α h₂ : δUnlike with
forwardrules, when anelimrule is successfully applied, it may be applied again to the resulting subgoals (and their subgoals, etc.). There is less danger of infinite cycles because the original hypothesis is cleared.However, if the hypothesis or hypotheses to which the
elimrule is applied have dependencies, they are not cleared. In this case, you'll probably get an infinite cycle. (TODO fix this.) -
constructors: when applied to an inductive type or structureT, generates a rule which tries to apply each constructor ofTto the target. This is a multi-rule, so if multiple constructors apply, they are considered in parallel. If you use this constructor to build an unsafe rule, each constructor application receives the same success probability; if this is not what you want, add separateapplyrules for the constructors. -
cases: when applied to an inductive type or structureT, generates a rule that performs case analysis on every hypothesish : Tin the context. The rule recurses into subgoals, socases Orwill generate 6 goals when applied to a goal with hypothesesh₁ : A ∨ B ∨ Candh₂ : D ∨ E. However, ifTis a recursive type (e.g.List), we only perform case analysis once on each hypothesis. Otherwise we would loop infinitely.The
patternsoption can be used to apply the rule only on hypotheses of a certain shape. E.g. the rule(cases (patterns := [Fin 0])) Finwill perform case analysis only on hypotheses of typeFin 0. Patterns can contain underscores, e.g.0 ≤ _. Multiple patterns can be given (separated by commas); the rule is then applied whenever at least one of the patterns matches a hypothesis. -
simp: when applied to an equationeq : A₁ → ... Aₙ → x = y, registerseqas a simp lemma for the builtin simp pass during normalisation. As such, this builder can only build normalisation rules. -
unfold: when applied to a definition orlethypothesisf, registersfto be unfolded (i.e. replaced with its definition) by the builtin simp pass during normalisation. As such, this builder can only build normalisation rules. -
tactic: takes a tactic and directly turns it into a rule. The given declaration (the builder does not work for hypotheses) must have typeTacticM Unit,Aesop.SimpleRuleTacorAesop.RuleTac. The latter are Aesop data types which associate a tactic with additional metadata; using them may allow the rule to operate somewhat more efficiently.The builder may be given an option
uses_branch_state :=(default true). This indicates whether the given tactic uses the branch state; see below.Rule tactics should not be 'no-ops': if a rule tactic is not applicable to a goal, it should fail rather than return the goal unchanged. All no-op rules waste time and no-op
normrules will send normalisation into an infinite loop.Normalisation rules may not assign metavariables (other than the goal metavariable) or introduce new metavariables (other than the new goal metavariable). This can be a problem because some Lean tactics, e.g.
cases, do so even in cases where you probably would not expect them to. I'm afraid there is currently no good solution for this. -
default: The default builder. This is the builder used when you register a rule without specifying a builder, but you can also use it explicitly. Depending on the rule's phase, the default builder tries different builders, using the first one that works. These builders are:- For
safeandunsaferules:constructors,tactic,apply. - For
normrules:constructors,tactic,simp,apply.
- For
Rule Sets
Rule sets are declared with the command
declare_aesop_rule_sets [r₁, ..., rₙ]where the rᵢ are arbitrary names. To avoid clashes, pick names in the
namespace of your package.
Within a rule set, rules are identified by their name, builder and phase (safe/unsafe/norm). This means you can add the same declaration as multiple rules with different builders or in different phases, but not with different priorities or different builder options (if the rule's builder has any options).
Rules can appear in multiple rule sets, but in this case you should make sure that they have the same priority and use the same builder options. Otherwise, Aesop will consider these rules the same and arbitrarily pick one.
The @[aesop] Attribute
Declarations can be added to rule sets by annotating them with the @[aesop]
attribute.
In most cases, you'll want to add one rule for the declaration. The syntax for this is
@[aesop ? ? ? ?] where
-
issafe,normorunsafe. Cannot be omitted except under the conditions in the next bullet. -
is:- For
safeandnormrules, an integer penalty. Can be omitted, in which case the penalty defaults to 1. - For
unsaferules, a percentage between 0% and 100%. Cannot be omitted. You may omit theunsafephase specification when giving a percentage.
- For
-
is one of the builders given above. If you want to pass options to a builder, write it like this (with mandatory parentheses):(tactic (uses_branch_state := true))If no builder is specified, the default builder for the given phase is used.
-
is a clause of the form(rule_sets [r₁, ..., rₙ])where the
rᵢare declared rule sets. (Parentheses are mandatory.) The rule is added exactly to the specified rule sets. If this clause is omitted, it defaults to(rule_sets [default]).
It is occasionally useful to add multiple rules for a single declaration, e.g.
a cases and a constructors rule for the same inductive type. In this case,
you can write for example
@[aesop unsafe [constructors 75%, cases 90%]]
inductive T ...
@[aesop apply [safe (rule_sets [A]), 70% (rule_sets [B])]]
def foo ...
@[aesop [80% apply, safe 5 (forward (immediate := x))]]
def bar (x : T) ...In the first example, two unsafe rules for T are registered, one with success
probability 75% and one with 90%.
In the second example, two rules are registered for foo. Both use the apply
builder. The first, a safe rule with default penalty, is added to rule set
A. The second, an unsafe rule with 70% success probability, is added to
rule set B.
In the third example, two rules are registered for bar: an unsafe rule with
80% success probability using the apply builder and a safe rule with penalty
5 using the forward builder.
In general, the grammar for the @[aesop] attribute is
attr ::= @[aesop ]
| @[aesop [] ]
rule_expr ::= feature
| feature
| feature [] where feature is a phase, priority, builder or rule_sets clause. This
grammar yields one or more trees of features and each branch of these trees
specifies one rule. (A branch is a list of features.)
Erasing Rules
There are two ways to erase rules. Usually it suffices to remove the @[aesop]
attribute:
attribute [-aesop] fooThis will remove all rules associated with the declaration foo from all rule
sets.
When you want to remove only certain rules, you can use a command:
erase_aesop_rules [safe apply foo, bar (rule_sets [A])]This will remove:
- all safe rules for
foowith theapplybuilder from all rule sets (but not other, for example, unsafe rules orforwardrules); - all rules for
barfrom rule setA.
In general, the syntax is
erase_aesop_rules [] i.e. rules are specified in the same way as for the @[aesop] attribute.
However, each rule must also specify the name of the declaration whose rules
should be erased. The rule_expr grammar is therefore extended such that a
feature can also be the name of a declaration.
Note that a rule added with one of the default builders (safe_default,
norm_default, unsafe_default) will be registered under the name of the
builder that is ultimately used, e.g. apply or simp. So if you want to erase
such a rule, you may have to specify that builder instead of the default
builder.
The aesop Tactic
In its most basic form, you can call the Aesop tactic just by writing
example : α → α := by
aesopThis will use the rules in the default rule set (i.e. those added via the
attribute with no explicit rule set specified) and the rules in the builtin
rule set (i.e. those provided by Aesop itself).
The tactic's behaviour can also be customised with various options. A more involved Aesop call might look like this:
aesop
(add safe foo, 10% cases Or, safe cases Empty)
(erase A [cases, constructors], baz)
(rule_sets [A, B])
(options := { maxRuleApplicationDepth := 10 })
Here we add some rules with an add clause, erase other rules with an erase
clause, limit the used rule sets and set some options. Each of these clauses
is discussed in more detail below.
Rules can be added to an Aesop call with an add clause. This won't affect any
declared rule sets. The syntax of the add clause is
(add )
i.e. rules can be specified in the same way as for the @[aesop] attribute.
As with the erase_aesop_rules command, each rule must specify the name of
declaration from which the rule should be built; for example
(add safe [foo 1, bar 5])
will add the declaration foo as a safe rule with penalty 1 and bar as a safe
rule with penalty 5.
The rule names can also refer to hypotheses in the goal context, but not all builders support this.
Erasing RulesRules can be removed from an Aesop call with an erase clause. Again, this
affects only the current Aesop call and not the declared rule sets. The syntax
of the erase clause is
(erase )
and it works exactly like the erase_aesop_rules command.
By default, Aesop uses the default and builtin rule sets. A rule_sets
clause can be given to include additional rule sets, e.g.
(rule_sets [A, B])
This will use rule sets A, B, default and builtin. Rule sets can also
be disabled with rule_sets [-default, -builtin].
Various options can be set with an options clause, whose syntax is:
(options )
The term is an arbitrary Lean expression of type Aesop.Options; see there for
details.
Builtin Rules
The set of builtin rules (those in the builtin rule set) is currently quite
unstable, so for now I won't document them in detail. See
Aesop/BuiltinRules.lean and Aesop/BuiltinRules/*.lean
Tracing
To see how Aesop proves a goal -- or why it doesn't prove a goal, or why it's slow to prove a goal -- it is useful to see what it's doing. To that end, you can enable various tracing options. These use the usual syntax, e.g.
set_option trace.aesop.steps trueThe main options are:
trace.aesop.steps: print a step-by-step log of which goals Aesop tried to solve, which rules it tried to apply (successfully or unsuccessfully), etc. You can customise the output by setting various sub-options, e.g.trace.aesop.steps.normalizationwill show which normalisation rules were applied. When you autocompleteset_option trace.aesop.steps., you should get a full list of available sub-options.trace.aesop.ruleSet: print the rule set used for an Aesop call.trace.aesop.profile: print some information about where Aesop spent its time.trace.aesop.proof: if Aesop is successful, print the proof that was generated (as a Lean term).
Checking Internal Invariants
If you encounter behaviour that looks like an internal error in Aesop, it may
help to set the option aesop.check.all (or the more fine-grained
aesop.check.* options). This makes Aesop check various invariants while the
tactic is running. These checks are somewhat expensive, so remember to unset the
option after you've reported the bug.
Advanced Features of the Search Algorithm
Duplication of Local HypothesesWhen you use an add clause to register a local hypothesis as a rule, the
hypothesis is duplicated to ensure that it will remain available regardless of
which other rules are applied to the goal. E.g. when you have a hypothesis h : A ∧ B which you register as an apply rule, and you also have a cases rule
for And, we want to make sure that h is not destroyed by the cases rule.
To do this, we create a new hypothesis _local.h : A ∧ B and register that
hypothesis as an apply rule. We also mark _local.h as an auxDecl to make
sure that most tactics will ignore it; e.g. it will not be found by
assumption.
TODO
MetavariablesTODO