Aesop
This is a work-in-progress proof search tactic for Lean 4.
Building
With elan installed, lake build
should suffice.
Adding Aesop to Your Project
To use Aesop in a Lean 4 project, first add this package as a dependency. In
your lakefile.lean
, add
dependencies := #[
{ name := `aesop
src := Source.git "https://github.com/JLimperg/aesop" "" }
]
Now run lake configure
. Unless you use the exact same Lean 4 nightly as this
project (see our lean-toolchain
), you'll get a warning. If you use a later
nightly, you'll probably be fine and Aesop will compile anyway. If not, please
open an issue and we'll update the tactic.
You should now be able to use the aesop
tactic:
import Aesop
example : α → α :=
by aesop
TODO Document (and, indeed, find out) how to use Aesop as a plugin.
Usage
This section is intended for prospective Aesop users. Aesop is very work in progress and this text may be out of date, so if you have questions, please ping me (Jannis Limperg) on the Lean Zulip.
Quickstart
To get you started, I'll explain Aesop's major concepts with a series of examples. A more thorough, reference-style discussion of follows in the next section.
We first define our own version of lists (so as not to clash with the standard
library) and an append
function:
inductive MyList (α : Type _)
| nil
| cons (hd : α) (tl : MyList α)
namespace MyList
protected def append : (_ _ : MyList α) → MyList α
| nil, ys => ys
| cons x xs, ys => cons x (MyList.append xs ys)
instance : Append (MyList α) :=
⟨MyList.append⟩
We also tell simp
to unfold applications of append
. aesop
uses the default
simp set when it normalises a goal, so it will automatically pick up these rules
as well.
@[simp]
theorem nil_append : nil ++ xs = xs := rfl
@[simp]
theorem cons_append : cons x xs ++ ys = cons x (xs ++ ys) := rfl
Now we define the NonEmpty
predicate on MyList
:
declare_aesop_rule_sets [MyList.NonEmpty]
@[aesop safe (rule_sets [MyList.NonEmpty]) [constructors, cases]]
inductive NonEmpty : MyList α → Prop
| cons : NonEmpty (cons x xs)
Here we see the first Aesop features. The declare_aesop_rule_sets
command is
used to, well, declare Aesop rule sets. These are collections of rules,
which roughly correspond to tactics. When Aesop searches for a proof, it
systematically applies each available rule, then recursively searches for proofs
of the subgoals generated by the rule, and so on. A goal is proved when Aesop
applies a rule that generates no subgoals.
The @[aesop]
attribute adds two rules for NonEmpty
to our new rule set:
- A
constructors
rule. This rule tries to apply each constructor ofNonEmpty
whenever a goal has targetNonEmpty _
. - A
cases
rule. This rule searches for hypothesesh : NonEmpty _
and performs case analysis on them (like thecases
tactic).
constructors
and cases
are examples of rule builders, which are
functions that turn a declaration into a rule.
Both rules above are added as safe rules. When a safe rule succeeds on a goal encountered during the proof search, it is applied and the goal is never revisited thereafter. In other words, the search does not backtrack safe rules. We will later see unsafe rules, which can backtrack.
With these rules, we can prove a theorem about NonEmpty
and append
:
@[aesop safe apply]
theorem nonEmpty_append {xs : MyList α} ys :
NonEmpty xs → NonEmpty (xs ++ ys) := by
aesop (rule_sets [MyList.NonEmpty])
Here we call Aesop with the MyList.NonEmpty
rule set, plus some default rule
sets which are implicitly enabled (but can be disabled). Aesop finds a proof by
introducing the hypothesis h : NonEmpty xs
, performing case analysis on h
and applying the NonEmpty.cons
constructor.
If you want to see how Aesop proves your goal (or more likely, why it doesn't prove your goal or why it takes too long to do so), you can enable tracing:
set_option trace.aesop.steps true
This makes Aesop print out the steps it takes while searching for a proof. There
are also other tracing options which autocompletion should show you if you type
set_option trace.aesop
.
The @[aesop]
attribute on nonEmpty_append
adds this lemma as a safe rule to
the default rule set. For this rule we use the apply
rule builder, which
generates a rule that tries to apply nonEmpty_append
whenever the target is of
the form NonEmpty (_ ++ _)
.
After adding nonEmpty_append
, Aesop can now prove consequences of this lemma:
example {α : Type u} {xs : MyList α} ys zs :
NonEmpty xs → NonEmpty (xs ++ ys ++ zs) := by
aesop
Next, we prove another simple theorem about NonEmpty
:
theorem nil_not_NonEmpty (xs : MyList α) : xs = nil → ¬ NonEmpty xs := by
aesop (add unsafe 10% cases MyList, norm unfold Not)
(rule_sets [MyList.NonEmpty])
Here, we add two rules in an add
clause. These rules are not part of a
rule set but are added for this Aesop call only. They demonstrate two new
features:
-
Unsafe rules are rules which can backtrack. Here we add
cases MyList
as an unsafe rule because when we performcases
on a hypothesisxs : MyList α
, we gethd : α
andtl : MyList α
. If the rule was safe, we could easily get into an infinite loop where we callcases
ontl
again, and so on. Unsafe rules, by contrast, are applied 'in parallel', so while we might still end up exploring a proof attempt that performscases
20 times, other possible rules will also be considered and will hopefully lead us down a more fruitful path.To prioritise these different proof attempts, every unsafe rules is annotated with a success probability, here 10%. This should be an estimate of how likely it is that the rule will to lead to a successful proof. This estimate is used to prioritise goals: the initial goal starts with a priority of 100% and whenever we apply an unsafe rule, the priority of its subgoals is the priority of its parent goal times the success probability of the applied rule. So applying our
cases MyList
rule repeatedly would give us goals with priority 10%, 1%, 0.1% etc. Of course, the success probability of a rule is only a ballpark figure; it is not supposed to be precise in any sense. -
Norm rules are yet another class of rules. They are used to normalise the goal before any other rules are applied. As part of this normalisation process, we run a variant of
simp_all
with the globalsimp
set plus Aesop-specific simp lemmas. Theunfold
builder adds such an Aesop-specific simp lemma which unfolds theNot
definition. (This is only necessary because the builtin rules currently do not handle negation very well.)
Here are some other examples where normalisation comes in handy:
@[aesop norm]
theorem nil_append (xs : MyList α) : nil ++ xs = xs := rfl
@[aesop norm]
theorem cons_append (x : α) xs ys : cons x xs ++ ys = cons x (xs ++ ys) := rfl
@[aesop norm]
theorem append_nil {xs : MyList α} :
xs ++ nil = xs := by
induction xs <;> aesop
theorem append_assoc {xs ys zs : MyList α} :
(xs ++ ys) ++ zs = xs ++ (ys ++ zs) := by
induction xs <;> aesop
After we've added unfolding lemmas for append
, Aesop can prove theorems about
this function more or less by itself. (For these examples, simp
would already
suffice.) However, we still need to perform induction explicitly. This is a
deliberate design choice: while there are some techniques for automating
induction, they are complex and not entirely reliable, so I don't think it makes
sense to implement them in a system where users can easily perform induction
themselves.
More examples may be found in the tests
folder of this repository.
Reference
This section contains a systematic and fairly comprehensive account of how Aesop operates.
Rules
A rule is a tactic plus some associated metadata. Rules come in three flavours:
-
Normalisation rules (keyword
norm
) must generate zero or one subgoal. (Zero means that the rule closed the goal). Each normalisation rule is associated with an integer penalty (default 1). Normalisation rules are applied in a fixpoint loop in order of penalty, lowest first. For rules with equal penalties, the order is unspecified. See below for details on the normalisation algorithm.Normalisation rules can also be simp lemmas. These are constructed with the
unfold
orsimp
builder. They are used by a specialsimp
call during the normalisation process. -
Safe rules (keyword
safe
) are applied after normalisation but before any unsafe rules. When a safe rule is successfully applied to a goal, the goal becomes inactive, meaning no other rules are considered for it. Like normalisation rules, safe rules are associated with a penalty (default 1) which determines the order in which the rules are tried.Safe rules should be provability-preserving, meaning that if a goal is provable and we apply a safe rule to it, the generated subgoals should still be provable. This is a less precise notion than it may appear since what is provable depends on the entire Aesop rule set.
-
Unsafe rules (keyword
unsafe
) are tried only if all available safe rules have failed on a goal. When an unsafe rule is applied to a goal, the goal is not marked as inactive, so other (unsafe) rules may be applied to it. These rule applications are considered independently until one of them proves the goal (or until we've exhausted all available rules and determine that the goal is not provable with the current rule set).Each unsafe rule has a success probability between 0% and 100%. These probabilities are used to determine the priority of a goal. The initial goal has priority 100% and whenever we apply an unsafe rule, the priorities of its subgoals are the priority of the rule's parent goal times the rule's success probability. Safe rules are treated as having 100% success probability.
Rules can also be multi-rules. These are rules which 'nondeterministically'
apply multiple tactics to the goal. Each of these tactics is then considered as
one possible way to solve the goal. For example, registering the constructors of
the Or
type will generate a multi-rule that, given a goal with target A ∨ B
,
generates one rule application with goal A
and one with goal B
.
Search Tree
Aesop's central data structure is a search tree. This tree alternates between two kinds of nodes:
- Goal nodes: these nodes store a goal, plus metadata relevant to the search. The parent and children of a goal node are rule application nodes. In particular, each goal node has a priority between 0% and 100%.
- Rule application ('rapp') nodes: these goals store a rule (plus metadata).
The parent and children of a rapp node are goal nodes. When the search tree
contains a rapp node with rule
r
, parentp
and childrenc₁, ..., cₙ
, this means that the tactic of ruler
was applied to the goal ofp
, generating the subgoals of thecᵢ
.
When a goal node has multiple child rapp nodes, we have a choice of how to solve the goals. This makes the tree an AND-OR tree: to prove a rapp, all its child goals must be proved; to prove a goal, any of its child rapps must be proved.
Search
We start with a search tree containing a single goal node. This node's goal is the goal which Aesop is supposed to solve. Then we perform the following steps in a loop, stopping if (a) the root goal has been proved; (b) the root goal becomes unprovable; or (c) one of Aesop's rule limits has been reached. (There are configurable limits on, e.g., the total number of rules applied or the search depth.)
-
Pick the highest-priority active goal node
G
. Roughly speaking, a goal node is active if it is not proved and we haven't yet applied all possible rules to it. -
If the goal of
G
has not been normalised yet, normalise it. That means we run the following normalisation loop:- Run the normalisation rules with negative penalty (lowest penalty first). If any of these rules is successful, restart the normalisation loop with the goal produced by the rule.
- Run
simp
on all hypotheses and the target, using the global simp set (i.e. lemmas tagged@[simp]
) plus Aesop'ssimp
rules. - Run the normalisation rules with positive penalty (lowest penalty first). If any of these rules is successful, restart the normalisation loop.
The loop ends when all normalisation rules fail. It destructively updates the goal of
G
(and may prove it outright). -
If we haven't tried to apply the safe rules to the goal of
G
yet, try to apply each safe rule (lowest penalty first). As soon as a rule succeeds, add the corresponding rapp and child goals to the tree and markG
as inactive. The child goals receive the same priority asG
. -
Otherwise there is at least one unsafe rule that hasn't been tried on
G
yet (or elseG
would have been inactive). Try the unsafe rule with the highest success probability and if it succeeds, add the corresponding rapp and child goals to the tree. The child goals receive the priority ofG
times the success probability of the applied rule.
A goal is unprovable if we have applied all possible rules to it and all resulting child rapps are unprovable. A rapp is unprovable if any of its subgoals is unprovable.
During the search, a goal or rapp can also become irrelevant, meaning its provability has no impact on the success of the overall proof search. More formally, irrelevance is characterised by the following conditions:
- A goal is irrelevant if its parent rapp is unprovable. (This means that a sibling of the goal is already unprovable, in which case we already know that the parent rapp will never be proved.)
- A rapp is irrelevant if its parent goal is proved. (This means that a sibling of the rapp is already proved, and we only need one proof.)
- A goal or rapp is irrelevant if any of its ancestors is irrelevant.
Rule Builders
A rule builder is a metaprogram that turns a declaration or hypothesis into an Aesop rule. Currently available builders are:
-
apply
: generates a rule which tries to apply the given declaration or hypothesisx
to the target. The rule acts like the tacticapply x
. -
forward
: when applied to a declaration or hypothesis of typeA₁ → ... Aₙ → B
, generates a rule which looks for hypothesesh₁ : A₁
, ...,hₙ : Aₙ
in the goal and, if they are found, adds a new hypothesish : B
. As an example, consider the lemmaeven_or_odd
:even_or_odd : ∀ (n : Nat), Even n ∨ Odd n
Registering this as a forward rule will cause the goal
n : Nat m : Nat ⊢ T
to be transformed into this:
n : Nat hn : Even n ∨ Odd n m : Nat hm : Even m ∨ Odd m ⊢ T
The forward builder may also be given a list of immediate names:
(forward (immediate := [n])) even_or_odd
The immediate names, here
n
, refer to the arguments ofeven_or_odd
. When Aesop applies a forward rule with explicit immediate names, it only matches the corresponding arguments to hypotheses. (Here,even_or_odd
has only one argument, so there is no difference.)When no immediate names are given, Aesop considers every argument immediate, except for instance arguments and dependent arguments (i.e. those that can be inferred from the types of later arguments).
When a forward rule has been successfully applied, it will not be tried again when processing its subgoals (and their subgoals, etc.). Without this limit, many forward rules would be applied infinitely often.
-
elim
: works likeforward
, but after the rule has been applied, hypotheses that were used as immediate arguments are cleared. As the name suggests, this is useful when you want to eliminate a hypothesis. E.g. the rule@[aesop norm elim] theorem and_elim_right : α ∧ β → α := ...
will cause the goal
h₁ : (α ∧ β) ∧ γ h₂ : δ ∧ ε
to be transformed into
h₁ : α h₂ : δ
Unlike with
forward
rules, when anelim
rule is successfully applied, it may be applied again to the resulting subgoals (and their subgoals, etc.). There is less danger of infinite cycles because the original hypothesis is cleared.However, if the hypothesis or hypotheses to which the
elim
rule is applied have dependencies, they are not cleared. In this case, you'll probably get an infinite cycle. (TODO fix this.) -
constructors
: when applied to an inductive type or structureT
, generates a rule which tries to apply each constructor ofT
to the target. This is a multi-rule, so if multiple constructors apply, they are considered in parallel. If you use this constructor to build an unsafe rule, each constructor application receives the same success probability; if this is not what you want, add separateapply
rules for the constructors. -
cases
: when applied to an inductive type or structureT
, generates a rule that performs case analysis on every hypothesish : T
in the context. The rule recurses into subgoals, socases Or
will generate 6 goals when applied to a goal with hypothesesh₁ : A ∨ B ∨ C
andh₂ : D ∨ E
. However, ifT
is a recursive type (e.g.List
), we only perform case analysis once on each hypothesis. Otherwise we would loop infinitely.The
patterns
option can be used to apply the rule only on hypotheses of a certain shape. E.g. the rule(cases (patterns := [Fin 0])) Fin
will perform case analysis only on hypotheses of typeFin 0
. Patterns can contain underscores, e.g.0 ≤ _
. Multiple patterns can be given (separated by commas); the rule is then applied whenever at least one of the patterns matches a hypothesis. -
simp
: when applied to an equationeq : A₁ → ... Aₙ → x = y
, registerseq
as a simp lemma for the builtin simp pass during normalisation. As such, this builder can only build normalisation rules. -
unfold
: when applied to a definition orlet
hypothesisf
, registersf
to be unfolded (i.e. replaced with its definition) by the builtin simp pass during normalisation. As such, this builder can only build normalisation rules. -
tactic
: takes a tactic and directly turns it into a rule. The given declaration (the builder does not work for hypotheses) must have typeTacticM Unit
,Aesop.SimpleRuleTac
orAesop.RuleTac
. The latter are Aesop data types which associate a tactic with additional metadata; using them may allow the rule to operate somewhat more efficiently.The builder may be given an option
uses_branch_state :=
(default true). This indicates whether the given tactic uses the branch state; see below.Rule tactics should not be 'no-ops': if a rule tactic is not applicable to a goal, it should fail rather than return the goal unchanged. All no-op rules waste time and no-op
norm
rules will send normalisation into an infinite loop.Normalisation rules may not assign metavariables (other than the goal metavariable) or introduce new metavariables (other than the new goal metavariable). This can be a problem because some Lean tactics, e.g.
cases
, do so even in cases where you probably would not expect them to. I'm afraid there is currently no good solution for this. -
default
: The default builder. This is the builder used when you register a rule without specifying a builder, but you can also use it explicitly. Depending on the rule's phase, the default builder tries different builders, using the first one that works. These builders are:- For
safe
andunsafe
rules:constructors
,tactic
,apply
. - For
norm
rules:constructors
,tactic
,simp
,apply
.
- For
Rule Sets
Rule sets are declared with the command
declare_aesop_rule_sets [r₁, ..., rₙ]
where the rᵢ
are arbitrary names. To avoid clashes, pick names in the
namespace of your package.
Within a rule set, rules are identified by their name, builder and phase (safe/unsafe/norm). This means you can add the same declaration as multiple rules with different builders or in different phases, but not with different priorities or different builder options (if the rule's builder has any options).
Rules can appear in multiple rule sets, but in this case you should make sure that they have the same priority and use the same builder options. Otherwise, Aesop will consider these rules the same and arbitrarily pick one.
The @[aesop]
Attribute
Declarations can be added to rule sets by annotating them with the @[aesop]
attribute.
In most cases, you'll want to add one rule for the declaration. The syntax for this is
@[aesop ? ? ? ?]
where
-
issafe
,norm
orunsafe
. Cannot be omitted except under the conditions in the next bullet. -
is:- For
safe
andnorm
rules, an integer penalty. Can be omitted, in which case the penalty defaults to 1. - For
unsafe
rules, a percentage between 0% and 100%. Cannot be omitted. You may omit theunsafe
phase specification when giving a percentage.
- For
-
is one of the builders given above. If you want to pass options to a builder, write it like this (with mandatory parentheses):(tactic (uses_branch_state := true))
If no builder is specified, the default builder for the given phase is used.
-
is a clause of the form(rule_sets [r₁, ..., rₙ])
where the
rᵢ
are declared rule sets. (Parentheses are mandatory.) The rule is added exactly to the specified rule sets. If this clause is omitted, it defaults to(rule_sets [default])
.
It is occasionally useful to add multiple rules for a single declaration, e.g.
a cases
and a constructors
rule for the same inductive type. In this case,
you can write for example
@[aesop unsafe [constructors 75%, cases 90%]]
inductive T ...
@[aesop apply [safe (rule_sets [A]), 70% (rule_sets [B])]]
def foo ...
@[aesop [80% apply, safe 5 (forward (immediate := x))]]
def bar (x : T) ...
In the first example, two unsafe rules for T
are registered, one with success
probability 75% and one with 90%.
In the second example, two rules are registered for foo
. Both use the apply
builder. The first, a safe
rule with default penalty, is added to rule set
A
. The second, an unsafe
rule with 70% success probability, is added to
rule set B
.
In the third example, two rules are registered for bar
: an unsafe
rule with
80% success probability using the apply
builder and a safe
rule with penalty
5 using the forward
builder.
In general, the grammar for the @[aesop]
attribute is
attr ::= @[aesop ]
| @[aesop [] ]
rule_expr ::= feature
| feature
| feature []
where feature
is a phase, priority, builder or rule_sets
clause. This
grammar yields one or more trees of features and each branch of these trees
specifies one rule. (A branch is a list of features.)
Erasing Rules
There are two ways to erase rules. Usually it suffices to remove the @[aesop]
attribute:
attribute [-aesop] foo
This will remove all rules associated with the declaration foo
from all rule
sets.
When you want to remove only certain rules, you can use a command:
erase_aesop_rules [safe apply foo, bar (rule_sets [A])]
This will remove:
- all safe rules for
foo
with theapply
builder from all rule sets (but not other, for example, unsafe rules orforward
rules); - all rules for
bar
from rule setA
.
In general, the syntax is
erase_aesop_rules []
i.e. rules are specified in the same way as for the @[aesop]
attribute.
However, each rule must also specify the name of the declaration whose rules
should be erased. The rule_expr
grammar is therefore extended such that a
feature
can also be the name of a declaration.
Note that a rule added with one of the default builders (safe_default
,
norm_default
, unsafe_default
) will be registered under the name of the
builder that is ultimately used, e.g. apply
or simp
. So if you want to erase
such a rule, you may have to specify that builder instead of the default
builder.
The aesop
Tactic
In its most basic form, you can call the Aesop tactic just by writing
example : α → α := by
aesop
This will use the rules in the default
rule set (i.e. those added via the
attribute with no explicit rule set specified) and the rules in the builtin
rule set (i.e. those provided by Aesop itself).
The tactic's behaviour can also be customised with various options. A more involved Aesop call might look like this:
aesop
(add safe foo, 10% cases Or, safe cases Empty)
(erase A [cases, constructors], baz)
(rule_sets [A, B])
(options := { maxRuleApplicationDepth := 10 })
Here we add some rules with an add
clause, erase other rules with an erase
clause, limit the used rule sets and set some options. Each of these clauses
is discussed in more detail below.
Rules can be added to an Aesop call with an add
clause. This won't affect any
declared rule sets. The syntax of the add
clause is
(add )
i.e. rules can be specified in the same way as for the @[aesop]
attribute.
As with the erase_aesop_rules
command, each rule must specify the name of
declaration from which the rule should be built; for example
(add safe [foo 1, bar 5])
will add the declaration foo
as a safe rule with penalty 1 and bar
as a safe
rule with penalty 5.
The rule names can also refer to hypotheses in the goal context, but not all builders support this.
Erasing RulesRules can be removed from an Aesop call with an erase
clause. Again, this
affects only the current Aesop call and not the declared rule sets. The syntax
of the erase
clause is
(erase )
and it works exactly like the erase_aesop_rules
command.
By default, Aesop uses the default
and builtin
rule sets. A rule_sets
clause can be given to include additional rule sets, e.g.
(rule_sets [A, B])
This will use rule sets A
, B
, default
and builtin
. Rule sets can also
be disabled with rule_sets [-default, -builtin]
.
Various options can be set with an options
clause, whose syntax is:
(options )
The term is an arbitrary Lean expression of type Aesop.Options
; see there for
details.
Builtin Rules
The set of builtin rules (those in the builtin
rule set) is currently quite
unstable, so for now I won't document them in detail. See
Aesop/BuiltinRules.lean
and Aesop/BuiltinRules/*.lean
Tracing
To see how Aesop proves a goal -- or why it doesn't prove a goal, or why it's slow to prove a goal -- it is useful to see what it's doing. To that end, you can enable various tracing options. These use the usual syntax, e.g.
set_option trace.aesop.steps true
The main options are:
trace.aesop.steps
: print a step-by-step log of which goals Aesop tried to solve, which rules it tried to apply (successfully or unsuccessfully), etc. You can customise the output by setting various sub-options, e.g.trace.aesop.steps.normalization
will show which normalisation rules were applied. When you autocompleteset_option trace.aesop.steps.
, you should get a full list of available sub-options.trace.aesop.ruleSet
: print the rule set used for an Aesop call.trace.aesop.profile
: print some information about where Aesop spent its time.trace.aesop.proof
: if Aesop is successful, print the proof that was generated (as a Lean term).
Checking Internal Invariants
If you encounter behaviour that looks like an internal error in Aesop, it may
help to set the option aesop.check.all
(or the more fine-grained
aesop.check.*
options). This makes Aesop check various invariants while the
tactic is running. These checks are somewhat expensive, so remember to unset the
option after you've reported the bug.
Advanced Features of the Search Algorithm
Duplication of Local HypothesesWhen you use an add
clause to register a local hypothesis as a rule, the
hypothesis is duplicated to ensure that it will remain available regardless of
which other rules are applied to the goal. E.g. when you have a hypothesis h : A ∧ B
which you register as an apply
rule, and you also have a cases
rule
for And
, we want to make sure that h
is not destroyed by the cases
rule.
To do this, we create a new hypothesis _local.h : A ∧ B
and register that
hypothesis as an apply
rule. We also mark _local.h
as an auxDecl
to make
sure that most tactics will ignore it; e.g. it will not be found by
assumption
.
TODO
MetavariablesTODO