Stream: Isabelle/ML

Topic: Lexicons for Scanning


view this post on Zulip Hanna Lachnitt (Jun 02 2024 at 22:44):

Hi everyone,

could someone explain to me how lexicons from src/Pure/General/scan.ML work please? I suspect they might be what I need but I am not sure since I have not quite figured them out.

val test_keywords = Scan.make_lexicon (map raw_explode ["echo","echos","kuckuck"])

val keyword_scanner = Scan.literal test_keywords

I have a list of keywords and I want to do different actions depending whether a keyword is detected. So I made a lexicon out of the list (with raw_explode since I have seen that elsewhere in the Isabelle codebase but that might be wrong). Then, I tried to use Scan.literal to write a keyword scanner.

Let's say my overall scanner looks like this:

val scan_command =
  $$ "("
  |-- Scan.many (curry (op =) " ")

Then, this is of type ML: string list -> string list * string list. My keyword scanner expects something of the type (string * string) list. What is the second element of the tuples for? I am looking at the implementation of Scan.literal but don't understand it, it seems to take the fst of the tuples and use that only? From what I understand is that it looks up a character (as a singleton string) in the lexicon (which is a smart data-structure returning only paths that are still viable) and if it finds it it recursively searches for the rest with the amended lexicon. In case it runs out of search string characters but there are still open possibilities in the lexicon on what the input could be it gives the MORE exception (as in my "echo" vs "echos" test).

I know how to accomplish my goal in another way but if there is a built-in solution I want to use that.

Testing out my scanner

val x = keyword_scanner  [("e","this is not used?"),("c",""),("h",""),("o",""),("s","")]
val _ = @{print}("x",x)

If I delete the last element ("s","") it fails. So my questions are:

  1. Are lexicons the right approach for my aim?
  2. What does Scan.literal intuitively do and what does it need as an input (especially what should the second element of the tuple be?)

view this post on Zulip Hanna Lachnitt (Jun 02 2024 at 22:44):

To add to this, I searched through the Isabelle code and found a use like this:

fun keyword_scanner2 s = keyword_scanner :|--
  (fn xs => if map fst xs = raw_explode s then Scan.succeed xs else Scan.fail);

So basically a keyword is given to the scanner and only if it is in the lexicon the Scan succeeds?

view this post on Zulip Lukas Stevens (Jun 03 2024 at 10:58):

To understand how parsing works in Isabelle, it makes sense to read the chapter in the ML cookbook. There, it is explained why a parser always returns a tuple.

view this post on Zulip Hanna Lachnitt (Jun 03 2024 at 14:41):

Thanks a lot, that is super helpful also for other questions I had. I did not know that existed. I'll read that chapter. I do understand why it returns a tuple for all the other scanner functions just not how to use this.


Last updated: Dec 21 2024 at 16:20 UTC