improve determinisation of automata matching string-like patterns
The following NBA for formula (!(G({(a)} |=> {(b)[*10]})))
is derived from #443 (closed).
It can be represented by a DBA with as many states, where the states essentially keep track of the number of !a & b
seen since the last a
.
However the WDBA-minimization code used to obtain the above is suboptimal. Currently it:
- calls the powerset construction to build a 2048-state automaton without looking at the acceptance condition,
- fixes the accepting states of the DBA by comparing it with the NBA (using a product to locate the accepting SCCs),
- minimize it to get the above 12 states.
Changing the formula from *10
to something larger will led to very large intermediate automata. (The original formula used in #443 (closed) was with *32
...)
However in this case most of these steps seem overkill.
-
The powerset construction could be taught about the notion of "accepting sink": if a powerstate contains an accepting sink, it can be reduced to this sink. Doing so (I have a patch already) reduces the intermediate automaton to 1025 states. -
Since we have a terminal NBA as input, the output should be terminal as well, and we could avoid running the product in step 2. -
Finally, I suspect there should be a way to build this DBA from the input NBA right away, but it not yet very clear how.