[isabelle] [Sledgehammer · Archive Mirror: Isabelle Users Mailing List

Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] [Sledgehammer

Email Gateway (Aug 22 2022 at 17:11):

From: Paulo Emílio de Vilhena <pevilhena2@gmail.com>
Dear Jasmin,

Again, many thanks for your reply.

I've attached two screenshots to visualize this situation exactly as it
arrives to me.

In the first one, you can see that the solution proposed by Sledgehammer
doesn't finish the proof ("by fastforce" is in purple and Isabelle
continues to show the goal).

The second screenshot is the situation, I described before.

The code for this theory is open source and can be found on its github
repository inside Cycles.thy theory:
https://github.com/DeVilhena-Paulo/GaloisCVC4/blob/master/Cycles.thy. The
line numbers for these situations can be read in the screenshots.

To repeat the first situation, click on apply Sledgehammer right after
?thesis. To repeat the second, write "using less" right after ?thesis and
apply Sledgehammer.

Cheers,

Paulo.
Screenshot1 from 2018-05-02 12-47-17.png
Screenshot2 from 2018-05-02 12-47-52.png

Email Gateway (Aug 22 2022 at 17:17):

From: Paulo Emílio de Vilhena <pevilhena2@gmail.com>
Hi,

I found a problem with the usage of sledgehammer's proposed solutions:
sometimes, even if it states "Proof found" and it manages to output an
one-line proof without the "timed-out" assertion, the proposed solution
does not work. This problem arrives to me with a certain frequency. Today,
for example, It happened two times in the same proof.

I am available to help solving this problem.

Thanks,

Paulo Emílio de Vilhena.

Email Gateway (Aug 22 2022 at 17:21):

From: Paulo Emílio de Vilhena <pevilhena2@gmail.com>
Dear Jasmin,

Thank you for your reply.

Indeed, Sledgehammer is of great help and even not succeeding it helps
finding relevant facts as you mentioned.

It just feels awkward that, for me, most of the time when this behavior
appears is in very simple situations where the argument of
performance wouldn't work at an intuitive level (I don't know the inner
functioning of Sledgehammer, I'm simply based on the size of the one-line
proof output as an user naively does). For example, there was a situation
where all the provers outputed "by blast" and Isabelle wasn't able to
finish the proof with this command. Another example, to make things
concrete, was the situation of yesterday: if I write "using less" to give
Sledgehammer a hint, all the provers but one proposed "empty by auto" as
solution, which failed, and the last proposed "cycle_decomp.simps by auto"
and also failed. What is bizarre is that if I don't write "using less"
Sledgehammer manages to find solutions that work.

I just repeated the experience today and the same behavior appeared. I also
noticed that it was really fast to propose these faulty solutions so I
don't think it is a performance problem (again, based on naive
observations).

To finish, I would like to suggest a modification to avoid this kind of
situation, which is really annoying: I think it would be nice if
Sledgehammer asks Isabelle to verify if the solution works instead of the
user doing it by hand. If the solution doesn't work, it could just write an
assertion as in the case of "Timed out". I think it would be less
frustrating.

Cheers,

Paulo.

Last updated: Oct 08 2025 at 20:22 UTC