Haroldbot is a tool that checks the equivalence of arithmetic expressions with 32-bit bitvector semantics, running entirely on the client. This task is mostly accomplished by encoding the expression as an array of 32 binary decision diagrams, one for each bit. Since for a given variable ordering a BDD of a boolean function is canonical, this representation offers and easy equivalence test, at the cost of having to build the BDD first.

Some types of expressions, most notably expressions that involve the product of two variables, cannot be encoded as a small BDD. For those expressions, haroldbot can fall back to using a SAT solver, by encoding the whole expression as a boolean circuit first and then using the Tseytin transformation to convert the problem to CNF.

Haroldbot can also (sometimes) find step-by-step proofs for true statements, this is done by applying expression rewrite rules from both ends of proof until a match is found that "connects" the two.

BDDs, specifically reduced ordered binary decision diagrams (ROBDD) with a fixed order and with "complement edges", form the backbone of haroldbot and have been part of it the longest, with the SAT fallback and proof finder being added during later stages of development.

A BDD is a directed acyclic graph consisting of internal nodes with exactly two children that represent testing a boolean variable and taking either the "low" or the "high" edge, and two leaf nodes that represent the constant results 0 and 1. The "reduced" part of ROBDD means that a node that refers to the same child the same way twice (representing a "choice" between two equal things) is skipped and that any equivalent nodes are deduplicated.

Skipping useless nodes with two equal children is very easy, since the node-creating function can simply return the child instead of a new node, but keeping nodes deduplicated is more interesting. In abstract terms, the typical solution is to maintain a dictionary `Node -> Node`, where looking up a node gives you the canonical node (or nothing, in which case a new node must be created). In haroldbot this is implemented in the node storage itself, by treating the storage arrays as a hash table with open addressing.

Nodes are then referred to by their index in the table, or by the complement of the index to represent a complement edge. The indexes are non-negative, so complement edges are always negative, making them easy to distinguish. To ensure canonicity, only the low edge of a node may be a complement edge. If a node with a complement edge as its high edge is requested, the low edge, high edge, and the index of this node when it is returned are all complemented. The "bottom" sink is represented by 0, there is no "top" sink, instead a complement edge to the "bottom" sink is used.

This representation has a side effect of making some computations with nodes trivial. Any boolean function applied on sinks can be written with the equivalent bitwise operations and it will have the same effect. More interestingly, even in some special cases that do *not* just involve constants, computations can be done directly on node indices without going through the slower BDD-arithmetic algorithms. So the function that computes the XOR between two BDDs starts with the lines

if (f == 0 || g == 0 || f == -1 || g == -1 || f == g || f == ~g) return f ^ g;which takes care of all base cases of the recursive function that computes the XOR of

The full process of computing the XOR (for example) of two BDDs is described well in The Art of Computer Programming Volume 4A, in chapter 7.1.4 *Binary decision diagrams*.

In many parts of haroldbot functions are represented as arrays of 32 BDDs. Below is an example, visualized one bit at the time. Low and high edges are indicated by dashed and solid edges respectively, normal edges and complement edges are indicated by blue and red respectively. An extra root node is visualized, it is not part of the BDD but is necessary here to indicate whether the actual root node of the BDD is entered through a complement edge or through a normal edge. To prevent the visualization from blowing up it is limited to 1000 nodes, so big BDDs are only visualized partially.

This visualization uses the vis.js library, with custom node-coordinates because the built-in layout algorithms did not seem to react well to typical BDD shapes.

BDDs can easily and efficiently be combined according to a logical operation with a recursive algorithm, though the result may have as many as n*m nodes where n and m are the number of nodes in the BDDs that were combined. The potential blow-up of nodes at every bit-operation is not always a problem, for example the addition function (as by default shown above) between two n-bit variables where n>1 needs only 3n+1 nodes.

The product of two n-bit number does suffer from the blow-up of the number of nodes, so many formulas that involve multiplication cannot be handled by the BDD part of haroldbot. Originally haroldbot would just give up on functions when it ran out of space in the pre-allocated BDD storage arrays, but it now falls back to using a SAT solver to try to handle those cases, which is often succesful.

A BDD representation of a boolean formula can be used to easily count for how many inputs it is true or false in time linear in the size of the BDD. This makes it feasible to count the number of solutions (or counter examples) even if it is very high.

The Boolean Satisfiability Problem (SAT) is perhaps the best known NP-complete problem. Nevertheless, modern Conflict-Driven Clause Learning (CDCL) solvers can solve many instances that are of practical interest. Haroldbot uses MiniSat compiled to JavaScript with Emscripten.

SAT by definition only handles boolean formulas, and solvers typically need the formula in conjunctive normal form (CNF). In order to convert formulas over 32bit integers to CNF, haroldbot first builds a boolean circuit and then converts that to CNF by working backwards from the outputs and applying the Tseytin transformation to the backwards-reachable gates. Gates can become disconnected from any output while the circuit is built up. Disconnected gates are not deallocated, instead they are ignored during the conversion to CNF.

Haroldbot prevents the creation of many trivially-equivalent gates by maintaining a fixed-size hash table of gates. Hash collisions are resolved by simply overwriting the old entry, the table does not need to be complete because it is not required for correctness, it is only used to prevent some unnecessary duplication of gates.

The types of gate used in haroldbot are the 2-input OR and XOR gates, and a 3-input "carry" gate which outputs true iff 2 or 3 inputs are true.

Similar to in the BDD, both normal edges and complement edges are used. Complement edges are simpler to handle in the conversion to CNF than explicit negation gates, because they simply correspond to using the complement of the output-variable of the referenced gate and so don't require any CNF clauses. They also create more opportunities for gate-deduplication.

Before converting to CNF, functions are represented by an array of 32 gate-references. Below is an example (visualized one bit at the time), normal edges and complement edges are indicated by blue and red edges respectively. The output (at the top) and inputs (leaf nodes) are not gates, they only exist in the visualization. As with the BDD visualization, there is a limit of 1000 nodes.

This visualization uses the vis.js library, again with custom layout. Because the circuits tend to be deeper than wide (they're not meant for hardware, so depth isn't much of a concern), edges are often drawn

Automatic proof generation in haroldbot is based on a simple bidirectional exploration of the search space. The search space consists of expressions that are equal to either the left hand side of the top level equality operator or equal to the right hand side. The neighbours of a node in the search space are all the expressions that are one rewrite rule away from it.

Below is a visualization of the process, though only nodes that have been taken out of the priority queues are drawn, not all nodes that have ever been generated. It was actually meant to visualize the latter (that would more accurately reflect the process), but for interesting proofs so many nodes are generated that visualizing them is hopelessly slow.

The expression must have '==' as root operator.

Naturally, this is an exponential time (and space) algorithm. This is made worse by the fact that many rewrites increase the size of the expression. In haroldbot that is prevented a bit by preferentially expanding small expressions. Small expressions tend to make the proof nicer (though sometimes longer) and are more likely to go in the "right direction".

In order to rewrite and determine structural equivalences efficiently, expressions are hashed in two different ways. A local hash, which depends only on the the top level operator and its direct children, is used to select likely-applicable rewrite rules. A total hash, which depends on the entire expression, is used to detect duplicate expressions within each expression cloud and to detect a match that connects both clouds with a path (which represents a proof).

Thanks to the local hash, at most rewrite rules are tried per node in an expression, instead of all three hundred rules. Usually fewer, typically around 5. Most rules operate on a fixed top level operator and so are generally not applied to an expression node that represents a different operator, except when the hashes collide.

After the expression-steps of a proof have been found, the explanations are then generated separately. This allows more steps to have a "forward" explanation, which is often nicer than a "backwards" explanation. As part of this process, the node-ids of the expression in the explanation itself are rewritten to the ids found in the actual expression. The front-end wraps each node of an expression in a `span`-tag and uses the id to calculate a `class`-attribute which is used to highlight corresponding expressions on hover.