From: "C. Diekmann" <diekmann@in.tum.de>
Dear list,
I have an Isabelle project that runs some tests on large data sets
during build time. The tests are all executed with value[code]. I
observed a huge increase in build time for a certain commit. What is
causing this huge difference? Unfortunately, I cannot reproduce this
issue outside of my project. Code below.
The code that is shared by the two versions:
fun extract_IPSets_generic0 :: "(simple_match ⇒ 32 word × nat) ⇒
simple_rule list ⇒ (32 wordinterval) list" where
"extract_IPSets_generic0 _ [] = []" |
"extract_IPSets_generic0 sel ((SimpleRule m _)#ss) =
(ipv4_cidr_tuple_to_interval (sel m)) #
(extract_IPSets_generic0 sel ss)"
fun extract_src_dst_ips :: "simple_rule list ⇒ (32 wordinterval) list
⇒ (32 wordinterval) list" where
"extract_src_dst_ips [] ts = ts" |
"extract_src_dst_ips ((SimpleRule m _)#ss) ts = extract_src_dst_ips ss
((ipv4_cidr_tuple_to_interval (src m)) #
((ipv4_cidr_tuple_to_interval (dst m))#ts))"
Version 1 defines
definition extract_IPSets :: "simple_rule list ⇒ (32 wordinterval) list" where
"extract_IPSets rs ≡ extract_src_dst_ips rs []"
To build with test cases, it takes over 10 hours:
10:10:49 elapsed time, 38:41:17 cpu time, factor 3.80
Version 2 defines
fun extract_IPSets :: "simple_rule list ⇒ (32 wordinterval) list" where
"extract_IPSets rs = (extract_IPSets_generic0 src rs) @
(extract_IPSets_generic0 dst rs)"
To build with test cases, it takes less than 4 hours:
3:39:56 elapsed time, 21:08:34 cpu time, factor 5.76
W.r.t. a correctness theorem, both versions are equal.
Version 2 is parallelizing a bit better, but this seems not to be the
main cause of the build time difference. What is causing these huge
differences?
When I benchmark on my laptop, The runtime of both implementations is
about equal:
value[code] "let x = (let rs = replicate 1000000 (SimpleRule
simple_match_any simple_action.Accept)
in extract_src_dst_ips rs []) in ()"
value[code] "let x = (let rs = replicate 1000000 (SimpleRule
simple_match_any simple_action.Accept)
in (extract_IPSets_generic0 src rs) @
(extract_IPSets_generic0 dst rs)) in ()"
Interestingly, export_code extract_IPSets in SML
fails for both
versions with the error
"Dependency "int" :: "semiring_div_parity" -> "int" ::
"semiring_parity" would result in module dependency cycle"
What does value[code] do in this case? Does it still compile to SML?
In the file that runs all the value[code] tests, export_code works
(probably because I'm importing Code_Target_Nat and Code_Target_Int
again).
The diff between version1 and version2 is on github. The only change
is the definition of extract_IPSets. More detailed runtime
information in the github commit comments.
Slow (version1):
https://github.com/diekmann/Iptables_Semantics/commit/924358476a9e59faa37484d5547a7775e6d2d90f
Fast (version2):
https://github.com/diekmann/Iptables_Semantics/commit/a05fa38b3daf444030a211aca39d680575dfc870
I would be happy for any hint what is responsible for version2 being so slow.
Best Regards,
Cornelius
From: "C. Diekmann" <diekmann@in.tum.de>
Dear list,
Please don't waste your time on this question. Version1 and Version2
return the result in a different order (which does not affect the
correctness theorem but the algorithms which continue to work with the
result). I will first check if this is responsible; I will come back
if this is not the main cause. I should have thought about this one
earlier, sorry for the spam.
Have a nice weekend,
Cornelius
Last updated: Nov 21 2024 at 12:39 UTC