improving inclusion checks
I'm creating this issue to keep track of ideas that we could use to improve contains()
as I risk to forget about this otherwise.
@pierreganty sent the following example where contains_forq()
is much more efficient than the default complementation-based implementation of the inclusion check.
The tests to run on the above example are:
- (A⊆B def.)
SPOT_CONTAINMENT_CHECK=default autfilt planning_robotic_robustness_3600_A.bs.baccmin --included-in=planning_robotic_robustness_3600_B.bs -c
- (A⊆B forq)
SPOT_CONTAINMENT_CHECK=forq autfilt planning_robotic_robustness_3600_A.bs.baccmin --included-in=planning_robotic_robustness_3600_B.bs -c
In the following, we refer to the two automata as A and B.
Times measured by Pierre & myself using the development version of Spot on our computers are given below for reference
operation | Pierre | Alexandre |
---|---|---|
A⊆B def. | 3530 s | Out-of-mem (>17GB) |
A⊆B forq | 62 s | 25 s |
Those automata are quire large, so not easy to inspect. However one can notice they use 5 atomic propositions, but they don't use all possible valuations of those. In particular A uses only 8 disjoint labels:
% sed -n 's/^\[\(.*\)\].*/\1/p' \
planning_robotic_robustness_3600_A.bs.baccmin | sort -u
!0&!1&!2&!3&4
!0&!1&!2&3&!4
!0&!1&2&!3&!4
!0&1&!2&!3&!4
0&!1&!2&!3&4
0&!1&!2&3&!4
0&!1&2&!3&!4
0&1&!2&!3&!4
% sed -n 's/^\[\(.*\)\].*/\1/p' \
planning_robotic_robustness_3600_B.bs| sort -u
!0
!0&!1 | !0&2 | !0&3 | !0&4
!0&1 | !0&!2 | !0&3 | !0&4
!0&1 | !0&2 | !0&!3 | !0&4
!0&1 | !0&2 | !0&3 | !0&!4
!0&!1&!2&!3&4
!0&!1&!2&3&!4
!0&!1&2&!3&!4
!0&1&!2&!3&!4
0&!1&!2&!3&4
0&!1&!2&3&!4
0&!1&2&!3&!4
0&1&!2&!3&!4
!1&!2&!3&4
!1&!2&3&!4
!1&2&!3&!4
1&!2&!3&!4
So one idea would be to relabel these two automata with only 3 atomic propositions (enough to represent the 2^3=8 labels of A, and restricting the labels of B to only those compatible with that) before checking the inclusion.
Another thing we can notice is that in A, atomic propositions {1,2,3,4} are mutually exclusive. So in this case we can restirct B to only use labels where those propositions are mutually exclusive as follows. Let's call the new automaton C.
% autfilt --exclusive-ap=l1,l2,l3,l4 \
planning_robotic_robustness_3600_B.bs > \
planning_robotic_robustness_3600_C.bs
% sed -n 's/^\[\(.*\)\].*/\1/p' \
planning_robotic_robustness_3600_C.bs | sort -u
!0&!1&!2&!3 | !0&!1&!2&!4 | !0&!1&!3&!4
!0&!1&!2&!3 | !0&!1&!2&!4 | !0&!1&!3&!4 | !0&!2&!3&!4
!0&!1&!2&!3 | !0&!1&!2&!4 | !0&!2&!3&!4
!0&!1&!2&!3 | !0&!1&!3&!4 | !0&!2&!3&!4
!0&!1&!2&!3&4
!0&!1&!2&3&!4
!0&!1&2&!3&!4
!0&1&!2&!3&!4
0&!1&!2&!3&4
0&!1&!2&3&!4
0&!1&2&!3&!4
0&1&!2&!3&!4
!0&!1&!2&!4 | !0&!1&!3&!4 | !0&!2&!3&!4
!1&!2&!3&4
!1&!2&3&!4
!1&2&!3&!4
1&!2&!3&!4
Or we can also try to simplify those labels based on the fact that we know that the proposition are mutually exclusive in A. Let D be that new automaton:
% autfilt --exclusive-ap=l1,l2,l3,l4 \
--simplify-exclusive-ap \
planning_robotic_robustness_3600_B.bs > \
planning_robotic_robustness_3600_D.bs
% sed -n 's/^\[\(.*\)\].*/\1/p' \
planning_robotic_robustness_3600_D.bs | sort -u
!0
!0&!1
!0&1
0&1
!0&!2
!0&2
0&2
!0&!3
!0&3
0&3
!0&!4
!0&4
0&4
1
2
3
4
Here are some measurements with the new automata
command | Alexandre |
---|---|
A⊆C def. | 509 s |
A⊆C forq | 30 s |
A⊆D def. | Out-of-mem |
A⊆D forq | 30 s |
So restricting the labels of B (as in C) does improve the default approach. It worsens the forq approach, and I'm not sure why. Another remark is that in the case of A⊆C
, where the labels used in both automata are all minterms, contains_forq()
does not need to use bdd_implies
to check compatibility of labels, a constant-time equality check would be faster.
Pierre points out that https://github.com/ravenbeutner/automata-benchmarks-from-hyperproperties/tree/main/planning has smaller instances which are supposed to be generated from nusmv models in https://github.com/HyperQB/HyperQB/tree/master/cases_bmc/
tacas_robotic. Compared to the original, the files included in the current issue have just been simplified with autfilt --small
, and A has been preprocessed to reduce the number of accepting transitions (an important factor in the forq approach). So probably we can learn more by working on smaller instances.