An Isabelle RPC library is provided, allowing Isabelle/ML code to call procedures written in Python (or other languages in the future) on a (remote) host. Not fully tested, but I'm actively using it in my Isabelle AI projects.
The RPC communication is over TCP and MessagePack, so the implementation of a host on other languages should be straightforward.
https://github.com/xqyww123/Isabelle_RPC
Is there a specific reason to do this in Isabelle/ML? The standard way would be a Isabelle/Scala implementation (e.g. there is already a JSON RPC implementation for Isabelle/VSCode, see Tools/VSCode/src/channel.scala) that is simply called from Isabelle/ML.
This is easier to learn (for myself, and no need to learn Scala), through faster protocol (MessagePack vs XML), and involves fewer dependencies (no dependency on Scala, so easier to maintain (for myself))
Also, no need to recompile the Scala plugin, so easier to redistribute (as a standard Isabelle session)
Also, it supports multiple languages. The PIDE protocol ASAIK, is heavy and implemented only on Scala. But you know, most of machine learning infrastructures are written in Python. So in the end, an AI infrastructure needs both the bridge between Poly/ML <--> Scala and the bridge between Scala <--> Python through some RPC protocol.
But now, we can directly bridge Poly/ML <--> Python through the standard TCP and MessagePack. Note that MessagePack is supported by 50+ languages (according to official site). So we can similarly write the RPC hosts in many other languages without modifying the Isabelle end. Indeed, I've written an OCaml client before.
Given the benefits above, the implementation is also slim. You see, it takes only 351 lines of ML code 288 lines of Python. Sure it might not be very mature for now, but the whole tech stack is standard (TCP and MessagePack).
Qiyuan Xu said:
This is easier to learn (for myself, and no need to learn Scala), through faster protocol (MessagePack vs XML), and involves fewer dependencies (no dependency on Scala, so easier to maintain (for myself))
Well, I was asking because it would be much easier to maintain since most of what you'd need is already there, though of course not knowing Isabelle/Scala is a bit of a barrier. Note that there will likely be a Isabelle/Scala tutorial in the next Isabelle workshop at FLOC.
Is performance of these messages actually a bottleneck for your application? It rarely is except when dealing with huge term structures e.g. for proof terms. And this is only for the internal communication between Isabelle/ML and Isabelle/Scala (which uses YXML, which already is a space-efficient representation of XML trees); for communication with the outside, one could easily go to even more efficient formats such as protobuf.
Qiyuan Xu said:
Also, no need to recompile the Scala plugin, so easier to redistribute (as a standard Isabelle session)
This seems to be a misconception, no need to do that at any point (a Isabelle component is a s easy to redistribute as a session). The system takes care of recompilation itself where needed.
Qiyuan Xu said:
Also, it supports multiple languages. The PIDE protocol ASAIK, is heavy and implemented only on Scala. But you know, most of machine learning infrastructures are written in Python. So in the end, an AI infrastructure needs both the bridge between Poly/ML <--> Scala and the bridge between Scala <--> Python through some RPC protocol.
I don't quite understand what you are saying here. To clarify:
Those two processes communicate via internal messages (all the time). When talking to external infrastructure -- independent on how you do that, e.g. via MessagePack -- you would normally talk to the Isabelle/Scala process only. Internally, that might talk to the Isabelle/ML process if needed.
Indeed, any new feature introduces additional maintenance costs compared to existing ones. This argument applies to any new feature. However, people do need a simple and easy-to-learn way to enable communication between Poly/ML and the external environment; they don't want to learn complex software system architectures. I knocked out this framework in one night, in hundreds of lines of code, and it doesn't need to be merged into main. May let users pick whichever one they want.
Last updated: Feb 04 2026 at 02:22 UTC