Well, that is the norm if you use acronyms that are not well-known to the target audience. The paper you are looking at is a submission to SAT'26, a conference dedicated to the SAT problem and related questions. For most people there, especially those interested in this paper, OBDD (Ordered Binary Decision Diagram), FPT (Fixed-Parameter Tractable), DNNF (Decomposable Negation Normal Form) and CNF (Conjunctive Normal Form) are standard acronyms that you do not redefine in every paper. Plus, the conference format forces you to fit everything into a bounded number of pages, so you have to choose where to save space.
That said, Knowledge Compilation is one of the worst subfields of computer science when it comes to acronyms, so I understand the feeling...
There is a branch of computer science (close to the SAT/constraint-solving communities) that studies data structures for representing Boolean functions succinctly, yet in a way that still lets you do something with them. Like quickly finding how many models you have where x1=true and x2=false. Of course, if you list every model, that is easy but not succinct. So we are looking for a trade-off between succinctness and tractability.
OBDDs are one of the earliest such data structures. You can see them in many different ways depending on your taste/background (automata for finite languages, decision trees with sharing, flowcharts, nested if-then-else...), but they are a very natural way of representing a Boolean function. Plus, they have nice properties. One of them is that you can minimize any OBDD in linear time into a canonical form, that is, a given Boolean function has exactly one minimal canonical OBDD (just as every regular language has a canonical minimal DFA; this is actually the same result).
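To give an idea of why minimization is so easy, here is a toy sketch of the two classical OBDD reduction rules in Python. This is my own illustrative encoding, not a real BDD library: nodes are plain tuples, so identical subdiagrams compare equal, and redundant tests are dropped at construction time.

```python
# Toy reduced OBDD: True/False are terminals; ("x", lo, hi) tests variable x
# and follows lo on 0, hi on 1. Two reduction rules give canonicity:
#   1. drop a node whose two branches are equal (redundant test);
#   2. merge identical subdiagrams, which plain tuple equality does for us.

def mk(var, lo, hi):
    return lo if lo == hi else (var, lo, hi)

# Two different-looking constructions of the same function f(x1, x2) = x2:
a = mk("x1", mk("x2", False, True), mk("x2", False, True))  # test x1, then x2
b = mk("x2", False, True)                                   # test x2 directly
assert a == b  # both reduce to the same canonical diagram
```

Applying `mk` bottom-up is exactly the linear-time minimization: once every node is built this way, two diagrams compute the same function iff they are equal.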
The problem with OBDDs is that they are not the most succinct-yet-tractable thing you can come up with, and Knowledge Compilation has studied many interesting, more succinct generalizations of them. But almost all such generalizations lose canonicity, that is, there is no natural way of defining a minimal and unique representation of the Boolean function with these data structures, nor any way of "minimizing" them. You get better succinctness, but you lose something elsewhere.
There are SDDs, which are also canonical in some sense, but the canonical version may be way bigger than the smallest SDD, which is not satisfactory (though they are interesting in practice).
TDDs, introduced in this paper, give such a generalization of OBDDs where you can still minimize toward a canonical form. The idea is to go from "testing along a path" to "testing along a tree", which allows computing more compact circuits. For example, one big limitation of OBDDs is that they cannot efficiently represent Cartesian products, that is, functions of the form f(X,Y) = f1(X) AND f2(Y) with X and Y disjoint. You can do this kind of thing with TDDs.
That said, they are less succinct than SDDs and other generalizations, so canonicity is not free. The main advantage of having canonicity and minimization is to unlock the following algorithm (used in practice for OBDDs) to transform a CNF formula (the standard input of SAT solvers) into a TDD:
Let F = C1 AND ... AND Cn, where each Ci is a clause:
- Build a small TDD Ti for each Ci (that is not too hard: clauses are just disjunctions of variables or negated variables).
- Combine T1 and T2 into a new TDD T' and minimize.
- Iteratively combine T' with T3, ..., Tn, minimizing at each step.
In the end, you have a TDD computing F. Being able to minimize at each step helps keep the intermediate diagrams reasonably small (they will blow up eventually though; we are solving something way harder than SAT), and it may give you interesting practical performance.
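To make the loop concrete, here is the same combine-and-minimize scheme sketched for plain OBDDs in Python (the actual TDD operations are in the paper; this toy stand-in uses structural tuples, which gives the reduced canonical form for free), with a model counter to check the result:

```python
# Clause-by-clause compilation with a toy reduced OBDD in place of a TDD.
# Nodes: True/False terminals, or (var, lo, hi); `order` maps var -> level.

def mk(var, lo, hi):
    return lo if lo == hi else (var, lo, hi)  # drop redundant tests

def conj(u, v, order):
    """AND of two reduced OBDDs (the classical 'apply' operation)."""
    if u is True: return v
    if v is True: return u
    if u is False or v is False: return False
    iu, iv = order[u[0]], order[v[0]]
    if iu == iv:
        return mk(u[0], conj(u[1], v[1], order), conj(u[2], v[2], order))
    if iu < iv:
        return mk(u[0], conj(u[1], v, order), conj(u[2], v, order))
    return mk(v[0], conj(u, v[1], order), conj(u, v[2], order))

def clause_obdd(lits, order):
    """OBDD of a clause given as {var: polarity} (True = positive literal)."""
    node = False
    for v in sorted(lits, key=order.get, reverse=True):
        node = (v, node, True) if lits[v] else (v, True, node)
    return node

def compile_cnf(clauses, variables):
    order = {v: i for i, v in enumerate(variables)}
    obdd = True
    for c in clauses:                      # T' := T' AND T_i, minimized
        obdd = conj(obdd, clause_obdd(c, order), order)
    return obdd, order

def count_models(node, order, nvars):
    """Number of satisfying assignments over `nvars` variables."""
    level = lambda n: nvars if isinstance(n, bool) else order[n[0]]
    def rec(n):
        if isinstance(n, bool):
            return int(n)
        lo = rec(n[1]) * 2 ** (level(n[1]) - level(n) - 1)
        hi = rec(n[2]) * 2 ** (level(n[2]) - level(n) - 1)
        return lo + hi
    return rec(node) * 2 ** level(node)

# (x1 or x2) and (not x1 or x3): 4 models out of 8
obdd, order = compile_cnf(
    [{"x1": True, "x2": True}, {"x1": False, "x3": True}],
    ["x1", "x2", "x3"])
assert count_models(obdd, order, 3) == 4
```

The key point the sketch illustrates: because the result of each `conj` is immediately in reduced form, the intermediate diagrams never carry redundant structure from one step to the next.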
Thanks, I'll have to take your word that this TDD minimization is useful in practice, the way SAT solvers are useful despite SAT being NP-hard. The general problem of Boolean function minimization is of course horrendously intractable; as you say, way harder than SAT.
Out of curiosity: when you talk about SDD, are you referring to Hierarchical Set Decision Diagrams or Sentential Decision Diagrams? I did my Ph.D. on the former, hence the curiosity :-)
I did not know about Ternary Decision Diagrams, sorry for the name clash. I had a look. You can always re-encode Ternary Decision Diagrams into binary ones by encoding each variable x with two bits x^1 and x^2, so Ordered Ternary Decision Diagrams are the same (modulo encoding) as OBDDs, hence less succinct than our Tree Decision Diagrams. If you consider Read-Once Ternary Decision Diagrams, then you get something roughly equivalent to FBDDs (modulo encoding). So this is incomparable with our TreeDDs (that is, some functions are easy for TreeDD and hard for RO-TernaryDD, like a low-treewidth/high-pathwidth CNF formula, and some functions are easy for RO-TernaryDD and hard for TreeDD; take anything separating FBDD from OBDD, for example).
Hi, author here! Also positively surprised to see this on HN haha
We (well, mainly Guy, if he's around) are working on an implementation, which will be made open source at some point (we are still rounding off the edges a bit). We have very encouraging preliminary results: it compares well with SDD and CUDD. There are still some ideas we would like to try, specifically for model counting.
I agree, though you can still work toward understanding using an LLM (and take it from a skeptic), e.g., by using it as a challenger to your ideas.
That said, I think it requires a lot of self-discipline and should be complemented with other methods and sources of information to be useful. As a teacher, I really try to prevent my undergraduate students from taking the easy road of using LLMs to solve every easy-ish problem I give them, problems that are there to make them *learn*. Sure, they did the homework, but most of them did not learn anything while doing it, and they finish their first year without having learnt anything actionable regarding computer science. (I use a different approach with students from other areas, though I still think it is good for them to spend a few days without relying on LLMs.)
I often use a sport analogy to land my point, which works with them, so let me share it here. If you want to learn how to run a marathon and you drive 42 km every day, then you certainly (hopefully?) become a better driver, but you are nowhere near running a marathon. (Fortunately, no one has yet challenged me with the fact that running a marathon is way less useful for getting a job than driving.)
To efficiently encode an XOR constraint you will necessarily need extra variables (as you proposed), because we know that no bounded-depth, polynomial-size Boolean circuit can compute parity ("parity is not in AC0").
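For illustration, here is a sketch of the standard chain encoding with fresh auxiliary variables (a Tseitin-style construction; the DIMACS-like integer literals and helper names are mine), brute-force checked on a small instance:

```python
from itertools import product

def xor3(a, b, c):
    """CNF clauses forcing a XOR b == c (a negative int is a negated literal)."""
    return [[a, b, -c], [-a, -b, -c], [-a, b, c], [a, -b, c]]

def encode_odd_parity(xs, fresh):
    """CNF for xs[0] ^ ... ^ xs[-1] = 1, chaining through fresh auxiliary vars."""
    clauses, acc = [], xs[0]
    for x in xs[1:]:
        clauses += xor3(acc, x, fresh)   # fresh holds the running parity
        acc, fresh = fresh, fresh + 1
    clauses.append([acc])                # the final parity bit must be true
    return clauses, fresh                # fresh is now the first unused variable

def satisfies(assignment, clauses):
    return all(any((l > 0) == assignment[abs(l)] for l in cl) for cl in clauses)

# Brute-force check on x1 ^ x2 ^ x3 = 1 (auxiliary variables start at 4):
clauses, top = encode_odd_parity([1, 2, 3], 4)
models = sum(
    satisfies({i + 1: b for i, b in enumerate(bits)}, clauses)
    for bits in product([False, True], repeat=top - 1))
assert models == 4  # the 4 odd-parity assignments of x1..x3, each extended uniquely
```

Each 3-ary XOR costs only 4 clauses, so the whole constraint is linear in the number of variables, whereas a direct CNF encoding of an n-variable XOR without auxiliaries needs 2^(n-1) clauses.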
The problem with this kind of constraint is that it is not local and really disturbs CDCL-based SAT solvers, because you cannot satisfy and discard the constraint until you have fixed all its variables. Now, cryptominisat has a specific focus on this kind of constraint and implements ad-hoc routines to exploit this structure. cryptominisat is hence geared toward applications where such constraints naturally appear, be it circuit synthesis or cryptography.
Now, a plain CDCL solver is another beast. Gaussian elimination / Gröbner bases are not implemented in these solvers simply because it is "too costly". One way of understanding why is to reframe what we call a SAT solver. It is not really a tool for solving SAT (hell, they can't even deal with the pigeonhole principle without preprocessing); the purpose of a SAT solver is to quickly solve instances coming from natural encodings of constraint systems. In this framework, unit propagation and clause learning are extremely efficient because they uncover hidden structure in the way we humans encode constraint problems. Hence (CDCL) SAT solvers are highly efficient in many applications, but it is easy to kill one with a few variables simply by going outside this framework. For example, they are killers at solving package dependencies because the structure of these problems matches this approach; CDCL solvers are then just an extremely refined way of brute-forcing your way through the search space.
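As a minimal illustration of the unit propagation mentioned above (a toy implementation over clause lists; real solvers use watched literals and far cleverer data structures):

```python
def unit_propagate(clauses, assignment):
    """Repeatedly assign the last unassigned literal of clauses with exactly
    one literal left. Returns the extended assignment, or None on conflict."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue                         # clause already satisfied
            unassigned = [l for l in clause if abs(l) not in assignment]
            if not unassigned:
                return None                      # conflict: clause falsified
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0   # forced assignment
                changed = True
    return assignment

# (x1) and (not x1 or x2) and (not x2 or x3): propagation fixes everything
result = unit_propagate([[1], [-1, 2], [-2, 3]], {})
assert result == {1: True, 2: True, 3: True}
```

On naturally encoded problems, chains of such forced assignments do most of the work; clause learning then records why a conflict happened so the same dead end is never explored twice.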
Now, if your application heavily relies on XOR constraints, then a (CDCL) SAT solver is likely not the right approach. For circuit synthesis, some people use BDD-based approaches (where parity can be encoded efficiently), but sometimes you will simply have to develop a dedicated solver, or use solvers for other NP-complete problems that have a different area of specialization (as already observed in many comments).
I would also add that #SAT solvers, which aim at counting the number of solutions, often implicitly solve ALL-SAT more efficiently than modified CDCL solvers would, because they use other caching and decomposability techniques. Check the knowledge compiler d4, for example (https://github.com/crillab/d4v2), which can build a kind of circuit representing every solution in a factorized yet tractable way.
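A toy sketch of the decomposability idea: when two parts of a CNF share no variables, their model counts simply multiply, so a counter can split the formula into connected components instead of enumerating one big search space. Real counters like d4 combine this with caching and propagation; this version just brute-forces each component.

```python
from itertools import product

def components(clauses):
    """Split a CNF into clause sets sharing no variables (connected components)."""
    comps = []
    for clause in clauses:
        vs = {abs(l) for l in clause}
        merged = ([clause], set(vs))
        for cls, cvs in [c for c in comps if c[1] & vs]:
            merged = (merged[0] + cls, merged[1] | cvs)
            comps.remove((cls, cvs))
        comps.append(merged)
    return comps

def brute_count(clauses, variables):
    """Enumerate assignments of `variables` and count those satisfying all clauses."""
    vs = sorted(variables)
    total = 0
    for bits in product([False, True], repeat=len(vs)):
        a = dict(zip(vs, bits))
        total += all(any(a[abs(l)] == (l > 0) for l in cl) for cl in clauses)
    return total

def count(clauses):
    """Product over independent components instead of one global enumeration."""
    result = 1
    for cls, cvs in components(clauses):
        result *= brute_count(cls, cvs)
    return result

# (x1 or x2) over {1,2} times (x3 or x4) over {3,4}: 3 * 3 = 9 models
assert count([[1, 2], [3, 4]]) == 9
```

Here the split turns 2^4 assignments into two independent 2^2 enumerations; on formulas that keep decomposing under branching, this is where the exponential savings of #SAT solvers come from.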
I am late to the party, but let me add my grain of salt. I think animations could be a killer feature when preparing talks. This is something I already do with a modified version of graphviz (I explain my workflow below), and d2 seems to have almost all the functionality I need to adapt it, except one thing (the one I had to add to graphviz): hardcoding custom HTML attributes into the generated SVG for animation purposes.
Let me explain: I have been using revealjs to make slides for a while now. The way "animation" is handled there is really interesting: HTML blocks with the "fragment" class are visited one after the other, receiving the classes "current-fragment" and "fragment-visited". One can then drive the animation with CSS, e.g., .fragment is not displayed, but .fragment.current-fragment and .fragment.fragment-visited are. Hence, moving through the slides reveals new content. Nothing prevents this from working on inlined SVG, so I can use d2 to create diagrams whose nodes/edges carry the fragment class, and revealjs will make them appear incrementally.
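For concreteness, the CSS side of this looks something like the following sketch (class names as described above; the assumption is that the diagram's SVG is inlined directly in the slide so its elements are styleable):

```css
/* Hidden until revealjs tags the element as current or already visited */
svg .fragment {
  opacity: 0;
  transition: opacity 0.3s;
}
svg .fragment.current-fragment,
svg .fragment.fragment-visited {
  opacity: 1;
}
```

Using opacity rather than display keeps the diagram's layout stable while elements fade in.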
I am, however, missing something here. Revealjs visits fragments by walking the DOM, so the animation depends on the order in which the diagram is generated. Moreover, it is impossible to make several nodes appear together (unless you put them in the same sub-diagram). Revealjs has a nice workaround: you can add a data-fragment-index attribute to your tag, and nodes will then be visited in fragment-index order. Being able to just add these indices to fragments in d2 would be enough to completely animate the diagram with revealjs, without any overhead, by simply inlining the generated SVG into the slides.
I actually do this with graphviz already: I slightly modified the source so that I can add a data-fragment-index to nodes/edges, which lets me animate diagrams; see [1] for a recent animation I did for a talk.
I guess I could do this in d2 too, but since you are expressing an interest in animation, let me tell you that just being able to specify an order in which blocks are visited, with the transitions performed via CSS modification, is an extremely powerful approach: you can not only make things appear/disappear, you can also change colors for highlighting, or anything else doable in CSS (changing sizes is a bit of an adventure though, as you may end up with an inconsistent drawing). Moreover, if you use the fragment/data-fragment-index terminology, you would be directly compatible with revealjs, which could be a really nice combination. (There are some shortcomings in revealjs regarding fragment indices, which is why I now use a home-cooked, minimal version [2] of the same idea; for instance, allowing more than one index per block, or not being forced to use numbers as indices, where 1a, 1b, 1c would make it possible to "insert" a transition between two existing ones without renumbering.)
Let me just conclude with an example. I would love to be able to write something like this in d2, to make the "CONNECT" edge and the "info" node appear at the first transition:
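Here is my own hedged sketch of what that could look like (the `class` keyword exists in d2 today; the commented attribute is the hypothetical missing piece, and the names are mine, not an existing d2 feature):

```d2
client -> server: CONNECT {
  class: fragment
  # hypothetical: hardcode a custom attribute into the generated SVG,
  # e.g. data-fragment-index: 1, so revealjs controls the reveal order
}
info: Connection established {
  class: fragment
  # data-fragment-index: 1  (same index, so both appear together)
}
```

With something like this, the generated SVG could be dropped into a revealjs slide as-is and animated purely through CSS, with no post-processing step.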
You may use XSLT for this kind of operation (https://en.wikipedia.org/wiki/XSLT), but it is often overkill. I would love a language that makes it easy to write transductions of HTML documents, but I think combining simplicity and expressiveness for this kind of task is a bit painful. I usually end up using pandoc filters or a dedicated JavaScript script that transforms the DOM and outputs a new HTML file.