Accelerating Penumbra's Merkle Tree by up to 4,000,000x

Penumbra is designed from the ground up to be private by default, and implementing a zero-knowledge blockchain doesn’t just mean using fancy cryptographic primitives; it requires rethinking the protocol’s entire dataflow. We want Penumbra users to experience lightning-fast transactions and an instantly available view of their wallet balance. To do that, we needed to rethink the status quo for shielded synchronization, developing the novel Tiered Commitment Tree to support fast-forwarding, parallelizable client sync! But first, some background:

Unique Challenges of Shielded Synchronization

On a transparent chain like the Cosmos Hub, each full node stores a Merkle tree that publicly records, in cleartext, the entire transaction history and the current balance of every address that has ever been used; verifying light clients check Merkle proofs of inclusion to ensure that the information provided by whichever node they’re talking to has not been altered. By contrast, Penumbra shields all user activity and balances, so it can’t use this architecture. Instead, each light client maintains a fragment of the state of the global shielded pool’s “note commitment tree” (whichever fragment of that tree is relevant to the client’s spending authority), and full nodes only maintain a fixed-size “frontier” of this decentralized structure.

In order to spend a note, a Penumbra client forms a zero-knowledge proof of inclusion for the note it wishes to spend, and the full node checks this proof against the Merkle root of the shielded pool. This is exactly the inverse of the data flow on a transparent chain: on Penumbra, clients send proofs to full nodes, not the other way around. Making this architecture just as efficient as a transparent design presents novel challenges which we’ve had the pleasure of solving.
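
To make the direction of that data flow concrete, here’s a toy Rust sketch of an ordinary (binary, non-zero-knowledge) Merkle inclusion check, with names invented for the example: the client supplies the leaf and its authentication path, and the node verifies it against nothing but the root it already holds. Penumbra’s actual proofs are zero-knowledge and its tree is quaternary, but the shape of the interaction is the same.

    // Toy illustration only: a plain (binary, non-ZK) Merkle inclusion check.
    // All names here are invented for the sketch.

    /// Stand-in for a real hash function (Penumbra uses Poseidon).
    fn hash_pair(left: u64, right: u64) -> u64 {
        left.wrapping_mul(0x9e37_79b9_7f4a_7c15).rotate_left(31) ^ right
    }

    /// What the client sends: the leaf plus its authentication path, where each
    /// element is a sibling hash and whether our node is the right-hand child.
    struct InclusionProof {
        leaf: u64,
        path: Vec<(u64, bool)>,
    }

    /// What the full node does: fold the path back up to a root and compare it
    /// against the root it already holds. No per-client state is needed.
    fn verify_inclusion(root: u64, proof: &InclusionProof) -> bool {
        let computed = proof.path.iter().fold(proof.leaf, |acc, &(sibling, is_right)| {
            if is_right { hash_pair(sibling, acc) } else { hash_pair(acc, sibling) }
        });
        computed == root
    }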

To bring usable financial privacy to everyone, our goal is for light clients to be able to synchronize their wallet state at (or even faster than) the speed at which it’s possible to download a sequence of minimized block data from a node over a very fast network link, even on comparatively resource-constrained devices like smartphones. The performance of the note commitment tree is an essential component of delivering that speed.

As we began to scale up our tests of Penumbra, we found that the designs of existing Merkle trees were not sufficient to support the lightning-fast synchronization we wanted for the Penumbra client experience. We use the ZK-friendly Poseidon hash function for our note commitment tree, which is efficient in zero-knowledge circuits but much slower than an ordinary hash function to compute in software. This means that every operation on our note commitment tree costs us many hundreds of times what the same operation would cost in a transparent Merkle tree. We wanted a way to reduce the number of hash operations incurred by each insertion into the tree, but more importantly, we wondered if there was a way to avoid performing the bulk of tree insertions altogether. After all, most notes are not spendable by any given client, so any given client immediately “forgets about” most things it inserts into its tree: it’s only when spending a note that a client needs access to the corresponding part of the note commitment tree. If we could find a way to “fast forward” through chunks of the chain’s history that contained no new notes for a given client, we’d be able to save thousands, or even millions, of hash operations.
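
As a rough, illustrative back-of-the-envelope sketch of those savings (the block size and per-insertion cost below are assumptions, not protocol constants): skipping an uninteresting block avoids one insertion per commitment in that block, each of which would otherwise touch on the order of one Poseidon hash per tree level.

    // Rough, illustrative numbers only; the block size is a made-up example.
    const LEVELS_PER_TIER: u64 = 8; // 4^8 = 65,536 leaves per tier
    const COMMITMENTS_IN_BLOCK: u64 = 1_000; // hypothetical block

    fn main() {
        // Inserting every commitment costs roughly one hash per level touched...
        let per_block_naive = COMMITMENTS_IN_BLOCK * LEVELS_PER_TIER;
        // ...whereas fast-forwarding inserts a single block summary hash instead.
        let per_block_fast_forward = LEVELS_PER_TIER;
        println!(
            "hashes avoided for one uninteresting block: ~{}",
            per_block_naive - per_block_fast_forward
        );
        // Skipping an uninteresting epoch (up to 65,536 such blocks) compounds
        // this into the millions of hash operations mentioned above.
    }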

A Way Forward

Enter the Tiered Commitment Tree (TCT): a de novo, bespoke Merkle tree designed to do precisely that, and more. The tiered commitment tree is:

Quaternary: Each node of the tiered commitment tree has four children. This means that the tree can be much shallower while having the same leaf capacity, which reduces the number of hash operations required for each insertion. The academic research surrounding the Poseidon hash recommends a quaternary tree specifically for its balance between tree depth and proof size: a binary tree means too many expensive hash operations (more levels to hash), whereas an octal tree means proofs that are too large (more siblings per level).

Sparse: A client’s version of the tree need only represent the note commitments pertinent to that client’s state, with all other “forgotten” commitments, and the internal hashes above them, pruned down to summary hashes.

Semi-Lazy: The internal hashes along the frontier of the tree are only computed on demand when providing a proof of inclusion or computing the root hash of the tree, which means a theoretical performance boost of (amortized) 12x during scanning.
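
Taken together, these three properties suggest a node shape roughly like the following sketch. The names are invented and this is not Penumbra’s actual implementation; it just shows four-way branching, pruned subtrees collapsing to a single summary hash, and a frontier hash that is cached and computed only when asked for.

    // Illustrative sketch only (not the real TCT types): a sparse, semi-lazy
    // quaternary node.
    use std::cell::Cell;

    type Hash = [u8; 32];

    enum Node {
        /// A wholly "forgotten" subtree, collapsed to one summary hash.
        Pruned(Hash),
        /// An internal node we still care about, with up to four children.
        Internal {
            children: [Option<Box<Node>>; 4],
            /// Cached hash for frontier nodes: `None` until a root or an
            /// inclusion proof is actually requested (the semi-lazy part).
            cached_hash: Cell<Option<Hash>>,
        },
        /// A leaf holding a note commitment relevant to this client.
        Commitment(Hash),
    }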

The novel feature of the Tiered Commitment Tree, however, is that it is:

Tiered: the structure of the tree is actually a triply-nested tree-of-trees-of-trees. The global, top-level tree contains at its leaves up to 65,536 epoch trees, each of which contains at its own leaves up to 65,536 block trees, each of which contains at its own leaves up to 65,536 note commitments. It is this structure that enables a client to avoid ever performing the hashes to insert the note commitments in blocks or epochs when it knows it didn’t receive any notes. When a client detects that an entire block, or an entire epoch, contained nothing of interest for it, it doesn’t even need to construct that span of the commitment tree: it merely inserts the singular summary hash for that block or epoch, which is provided by the chain itself inline with the stream of blocks.

Eternity┃           ╱╲ ◀───────────── Anchor
    Tree┃          ╱││╲               = Global Tree Root
        ┃         * ** *           ╮
        ┃      *   *  *   *        │ 8 levels
        ┃   *     *    *     *     ╯
        ┃  ╱╲    ╱╲    ╱╲    ╱╲
        ┃ ╱││╲  ╱││╲  ╱││╲  ╱││╲ ◀─── Global Tree Leaf
                        ▲             = Epoch Root
                     ┌──┘
                     │
                     │
   Epoch┃           ╱╲ ◀───────────── Epoch Root
    Tree┃          ╱││╲
        ┃         * ** *           ╮
        ┃      *   *  *   *        │ 8 levels
        ┃   *     *    *     *     ╯
        ┃  ╱╲    ╱╲    ╱╲    ╱╲
        ┃ ╱││╲  ╱││╲  ╱││╲  ╱││╲ ◀─── Epoch Leaf
                 ▲                    = Block Root
                 └───┐
                     │
                     │
   Block┃           ╱╲ ◀───────────── Block Root
    Tree┃          ╱││╲
        ┃         * ** *           ╮
        ┃      *   *  *   *        │ 8 levels
        ┃   *     *    *     *     ╯
        ┃  ╱╲    ╱╲    ╱╲    ╱╲
        ┃ ╱││╲  ╱││╲  ╱││╲  ╱││╲ ◀─── Block Leaf
                                      = Note Commitment

The tiered structure of the TCT provides a huge scaling solution for Penumbra. Without it, as the network grows into greater adoption and success, blocks would become fuller, and clients would have to expend proportionately more computational effort synchronizing other people’s state, only to immediately throw it away! An architecture based on a non-tiered tree would punish light clients for the success of their own network, but with a tiered tree, clients need only pay computational effort proportional to the amount of activity they themselves perform.
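
To make this concrete, a client’s per-block logic might look roughly like the following sketch, where the type and method names are hypothetical rather than the real TCT API: when trial decryption finds nothing of interest in a block, the client inserts only that block’s summary root (served by the chain alongside the compact block data); otherwise it inserts the block’s commitments, keeping witnesses only for its own.

    // Hypothetical sketch of client-side "fast-forwarding"; not the real API.
    type Commitment = [u8; 32];
    type Hash = [u8; 32];

    struct CompactBlock {
        block_root: Hash,
        commitments: Vec<Commitment>,
    }

    struct TieredTree {/* ... */}

    impl TieredTree {
        /// Fast-forward: one summary hash stands in for the whole block subtree.
        fn insert_block_root(&mut self, _root: Hash) {}
        /// Full insertion; `keep` decides whether we retain a witness for it.
        fn insert_commitment(&mut self, _c: Commitment, _keep: bool) {}
    }

    fn process_compact_block(tree: &mut TieredTree, block: &CompactBlock, mine: &[Commitment]) {
        if mine.is_empty() {
            // Nothing here is spendable by us: skip all the Poseidon hashing.
            tree.insert_block_root(block.block_root);
        } else {
            for commitment in &block.commitments {
                let keep = mine.contains(commitment);
                tree.insert_commitment(*commitment, keep);
            }
        }
    }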

Another important feature of the TCT is that we incrementally serialize it to disk. While performing synchronization, we want to persist the client’s current state as we go, so that a power failure or other interruption won’t cause us to have to start over. However, a data structure not built for incremental serialization would need to be serialized as a whole each time we saved our state. In practice, this previously meant writing hundreds of gigabytes to the disk, even during the synchronization of a week-old Penumbra testnet. Not anymore—we take advantage of the structural immutability of the tiered commitment tree to incrementally serialize to disk only the portions that are changed between each save point, reducing writes by a factor of 100 or more.
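
A minimal sketch of that idea, with invented names rather than the real serialization code: the tree remembers which positions changed since the last save point, and the save step writes only that delta to the backing store.

    // Invented sketch of incremental serialization; not the real implementation.
    use std::collections::BTreeMap;

    type Position = u64;
    type Hash = [u8; 32];

    #[derive(Default)]
    struct Tree {
        nodes: BTreeMap<Position, Hash>,
        /// Positions modified since the last save point.
        dirty: Vec<Position>,
    }

    impl Tree {
        fn insert(&mut self, position: Position, hash: Hash) {
            self.nodes.insert(position, hash);
            self.dirty.push(position);
        }

        /// Persist only the nodes changed since the last save point, then reset,
        /// instead of rewriting the entire (mostly unchanged) tree every time.
        /// `store` stands in for the on-disk storage.
        fn save_delta(&mut self, store: &mut BTreeMap<Position, Hash>) {
            for position in std::mem::take(&mut self.dirty) {
                if let Some(hash) = self.nodes.get(&position) {
                    store.insert(position, *hash);
                }
            }
        }
    }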

Using a variety of rather fun Rust type-system tricks, all of the structural properties of the tree are enforced at compile time, so that any violation of a structural invariant within the tree’s own implementation would be a compile-time error. This design, in addition to increasing our confidence in its correctness (see also our property-test suite, still in progress towards a full executable specification), shifts some runtime checks to compile time, making the tree even faster.
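
To give a flavor of the trick, here’s a simplified sketch with invented types (not the actual TCT internals): each tier’s height is written into the types themselves, so handing a block-height tree where an epoch-height tree is expected is rejected by the compiler rather than caught (or missed) at runtime.

    // Simplified sketch, not the real internals: tree height encoded in types.

    /// Stand-in for a note commitment.
    struct Leaf(u64);

    /// A node one level above its `Child` type, with four-way branching.
    struct Node<Child> {
        children: [Option<Box<Child>>; 4],
    }

    /// Eight levels of quaternary branching: 4^8 = 65,536 leaves per tier.
    type Tier<L> = Node<Node<Node<Node<Node<Node<Node<Node<L>>>>>>>>;

    // The three nested tiers: commitments in blocks, blocks in epochs,
    // epochs in the global tree.
    type BlockTree = Tier<Leaf>;
    type EpochTree = Tier<BlockTree>;
    type GlobalTree = Tier<EpochTree>;

    // Passing a `BlockTree` where an `EpochTree` is expected (or building a
    // tier of the wrong height) is a compile-time type error, not a runtime check.
    fn insert_epoch(_global: &mut GlobalTree, _epoch: EpochTree) { /* ... */ }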

Into The Future

At present, we haven’t fully exploited the capabilities of the TCT: note commitment tree synchronization is no longer our client speed bottleneck, so we’re not even tapping into all its power yet. But in the future, the design of the structure can be further leveraged to fast-forward even faster, efficiently utilizing all the cores available to the client to compute hashes in parallel. We’ll do that once we’ve made the other parts of synchronization fast enough that we’re back to bottlenecking on the speed of TCT synchronization. I can’t wait to see how fast we are at that point ;)
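
To give a flavor of what that could look like, here’s an illustrative sketch assuming the rayon crate, not shipped code: each of a node’s four children is an independent subtree, so their hashes can be computed on separate cores and only then combined into the parent hash.

    // Illustrative only: hashing independent subtrees in parallel with rayon.
    use rayon::prelude::*;

    type Hash = u64;

    /// Stand-in for recursively hashing one subtree with Poseidon.
    fn hash_subtree(subtree: &[u64]) -> Hash {
        subtree.iter().fold(0, |acc, x| acc.rotate_left(5) ^ x)
    }

    /// Hash a node's children (four of them, in the TCT) on separate cores,
    /// then combine the results into the parent hash.
    fn hash_node(children: &[Vec<u64>]) -> Hash {
        let child_hashes: Vec<Hash> = children.par_iter().map(|c| hash_subtree(c)).collect();
        child_hashes.iter().fold(0, |acc, h| acc.rotate_left(7) ^ h)
    }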

When paired with the fuzzy transaction detection capability on our roadmap, this would allow a client to synchronize their private state dramatically faster than even the speed of merely downloading the chain’s history of blocks from a node. A back-of-the-envelope calculation about the future to come: a client with a few transactions a month could, on commodity hardware like a MacBook Air, fully synchronize their wallet starting from genesis in minutes, even decades after we launch Penumbra.

More Resources

You can watch a live talk from zkSummit 8 in Berlin that goes into more detail about the exact construction of our tree and how it fits into our architecture. Or if you're hungry for more, join our core engineer @plaidfinch TODAY, November 15, on Twitter Spaces, where they'll talk with Josh Cincinnati of Radiant Commons about privacy, the Tiered Commitment Tree, and more!

See you in the Interchain! 🚀