Testnet 38: Kalyke

We've been building Penumbra in public, with weekly testnet releases named after the moons of Jupiter. All of our engineering discussion happens in Discord, where we've been summarizing the changes each week, and now that the protocol is maturing, we'll also be writing up progress updates here on the blog.

Today, we released our 38th testnet, codenamed Kalyke, which contains major improvements that make it much simpler to write clients for Penumbra and lay the groundwork for exciting future work. To test it out, check out the guide documentation on how to use the command-line client, pcli (pronounced "pickle-y").

Tokenized Unbonding

As described in one of our first posts on shielded staking, Penumbra records delegations using delegation tokens, which represent a share of a particular validator's delegation pool. Delegation to a validator is a protocol-native exchange of staking tokens for delegation tokens, and undelegation exchanges delegation tokens back to staking tokens. Rather than distributing staking rewards, the chain prices them into each delegation token's exchange rate. This means that delegations can be recorded in Penumbra's multi-asset shielded pool like any other asset, and allows us to maintain transparency and accountability for validators, while having privacy for delegators.
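
To make the exchange-rate idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Penumbra's implementation: the `DelegationPool` class, its method names, and the flat 10% reward are all hypothetical, and exact fractions stand in for whatever fixed-point arithmetic a real chain would use.

```python
from fractions import Fraction


class DelegationPool:
    """Toy model of one validator's delegation pool (hypothetical names)."""

    def __init__(self):
        # Staking tokens needed to buy one delegation token; starts at 1.
        self.exchange_rate = Fraction(1)

    def delegate(self, staking_amount):
        """Exchange staking tokens for delegation tokens at the current rate."""
        return Fraction(staking_amount) / self.exchange_rate

    def undelegate(self, delegation_amount):
        """Exchange delegation tokens back to staking tokens at the current rate."""
        return Fraction(delegation_amount) * self.exchange_rate

    def apply_reward(self, reward_rate):
        """Price a reward into the rate instead of paying it out to holders."""
        self.exchange_rate *= 1 + Fraction(reward_rate)


pool = DelegationPool()
early = pool.delegate(100)          # bought at rate 1
pool.apply_reward(Fraction(1, 10))  # an assumed 10% reward moves the rate to 11/10
late = pool.delegate(110)           # the same number of tokens now costs more
```

No balance is ever touched when a reward accrues: the early delegator still holds the same number of delegation tokens, which are simply worth more staking tokens on the way out.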

However, handling unbonding is a challenge. While we can neatly handle slashing, by marking down the exchange rate to price in a slashing penalty, we need to maintain an unbonding period, so that a malicious validator cannot misbehave, then immediately withdraw stake before they're slashed. This means we need a way to freeze the results of an undelegation until the end of the unbonding period, and apply any slashing penalties that happen in the meantime.

Our initial approach to unbonding involved "quarantining". Because all transactions are private by default, and all value is recorded in the same shielded pool, we built a system that would place the outputs from a transaction with an undelegation in a special quarantine state, and only add them to the shielded pool after unbonding if no slashings had occurred, or otherwise roll back the effects of the transaction.

While this sounds simple, it turned out to be extremely complex. To start, preserving the ability to unapply particular state transitions is intricate and error-prone. Even worse, that complexity multiplies across every other part of the system. A key thing to realize about shielded blockchains is that, fundamentally, they achieve privacy by moving execution off-chain, out to the client device at the "edge" of the network; the role of the ZK proofs is to certify that the client's execution was done correctly. That meant that not only did the fullnode have to pay the complexity of separately maintaining quarantined transactions and the ability to roll them back, so did every single client. Moreover, because outputs are shielded, we had to quarantine every output of a transaction, meaning that unless an undelegation was carefully constructed with exact change, it could accidentally lock a user's other funds.

This was definitely suboptimal. To build privacy without compromise, we need to make it as easy to build clients for Penumbra as it is to build them for a transparent chain. That's a hard enough problem to start, but this design decision made it much worse.

Instead, we followed a design principle that's emerged in other contexts as we've built Penumbra: any time a token can be in a different state, it should be a different token. In this case, our problems arise from having to maintain a different state for staking tokens that are still unbonding, rather than having a distinct "unbonding token" that represents that state.

In our new design, that's exactly what we do. Rather than exchanging delegation tokens for staking tokens, undelegation is now a two-step process. First, the Undelegate action exchanges delegation tokens (parameterized by validator) for unbonding tokens (parameterized by validator and unbonding period), with an exchange rate that prices in staking rewards. Then, the UndelegateClaim action exchanges unbonding tokens for staking tokens, with a penalty rate that prices in slashings over the unbonding period. The chain validates that the unbonding period is over, and that the penalty rate is correct. In the happy path, the penalty is 0, and unbonding tokens convert to staking tokens 1:1.
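
The two-step flow can be sketched in a few lines of Python. This is a simplified model under assumptions that are not from the post: the `Chain` and `UnbondingToken` names, the epoch-based unbonding window of 3, and the rule that slashing penalties compound multiplicatively are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from fractions import Fraction

UNBONDING_EPOCHS = 3  # hypothetical unbonding period length


@dataclass(frozen=True)
class UnbondingToken:
    """Asset type for stake that is unbonding (fields are assumptions)."""
    validator: str
    start_epoch: int


class Chain:
    def __init__(self):
        self.epoch = 0
        self.exchange_rate = {}  # validator -> staking tokens per delegation token
        self.slashes = []        # (validator, epoch, penalty fraction)

    def undelegate(self, validator, delegation_amount):
        """Step 1: delegation tokens -> unbonding tokens at the current rate."""
        amount = Fraction(delegation_amount) * self.exchange_rate[validator]
        return UnbondingToken(validator, self.epoch), amount

    def penalty(self, token, end_epoch):
        """Compound every slash of this validator during the unbonding window."""
        kept = Fraction(1)
        for validator, epoch, fraction in self.slashes:
            if validator == token.validator and token.start_epoch <= epoch < end_epoch:
                kept *= 1 - fraction
        return 1 - kept

    def undelegate_claim(self, token, amount, claimed_penalty):
        """Step 2: unbonding tokens -> staking tokens, after chain-side checks."""
        end_epoch = token.start_epoch + UNBONDING_EPOCHS
        if self.epoch < end_epoch:
            raise ValueError("unbonding period is not over")
        if claimed_penalty != self.penalty(token, end_epoch):
            raise ValueError("incorrect penalty rate")
        return amount * (1 - claimed_penalty)


chain = Chain()
chain.exchange_rate["v1"] = Fraction(11, 10)
token, amount = chain.undelegate("v1", 100)  # rewards are priced in here
chain.epoch = UNBONDING_EPOCHS
staking = chain.undelegate_claim(token, amount, Fraction(0))  # happy path, 1:1
```

Nothing in this model ever rolls back: a slash only changes the penalty that a later claim must use, which is why a client only has to move forwards through time.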

This change was a massive simplification for client implementations, which no longer need to unapply transactions, and only have to update forwards through time.

Personal Rollups

Another feature we shipped this week is minimal support for personal rollups, so that transactions can just post opaque state commitments and ZK proofs to the chain, and omit the other transaction contents.

One way to understand the basic design of Zcash-like systems such as Penumbra is that every transaction contains both a micro-rollup, with opaque commitments to the notes consumed and produced by the transaction and ZK proofs that those notes were well-formed, and also the rolled-up data itself, encrypted to the sender and/or receiver, who can scan the blockchain to learn about their transactions and decrypt the note contents.

Rather than simply recording notes, Penumbra's commitment tree now records commitments to arbitrary state fragments, with shielded notes just one special case. This means that we maintain forward compatibility with expansions to the kinds of state fragments we record, or changes to the format of existing state fragments. And, rather than requiring that the encrypted payload contents are always present, we now support bare state commitments without payloads, representing state fragments rolled-up off-chain. This also provides scalability benefits, because every other client saves the effort that would be required to scan those state commitments' encrypted payloads.
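
A rough Python sketch of that shape follows. Everything in it is a stand-in chosen for illustration: the `StatePayload` and `StateCommitmentTree` names are hypothetical, a plain list replaces the Merkle tree, SHA-256 replaces a hiding, SNARK-friendly commitment, and the "encryption" is a toy prefix check.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def commit(kind: str, plaintext: bytes) -> bytes:
    """Stand-in commitment: a domain-separated hash (not hiding, not SNARK-friendly)."""
    return hashlib.sha256(kind.encode() + b"\x00" + plaintext).digest()


@dataclass
class StatePayload:
    commitment: bytes
    ciphertext: Optional[bytes] = None  # None means rolled up: commitment only


class StateCommitmentTree:
    """Append-only list standing in for the Merkle tree of state commitments."""

    def __init__(self):
        self.leaves = []

    def insert(self, payload: StatePayload) -> int:
        self.leaves.append(payload.commitment)
        return len(self.leaves) - 1  # position of the new commitment


def scan(payloads, try_decrypt):
    """Client-side scan: only payloads carrying ciphertext need trial decryption."""
    found = []
    for payload in payloads:
        if payload.ciphertext is None:
            continue  # bare commitment: nothing to decrypt
        plaintext = try_decrypt(payload.ciphertext)
        if plaintext is not None:
            found.append(plaintext)
    return found


def toy_decrypt(ciphertext: bytes):
    """Toy 'decryption': succeeds only for payloads addressed to alice."""
    prefix = b"alice:"
    return ciphertext[len(prefix):] if ciphertext.startswith(prefix) else None


block = [
    StatePayload(commit("note", b"n1"), b"alice:n1"),  # encrypted note
    StatePayload(commit("swap", b"s1"), b"bob:s1"),    # encrypted swap
    StatePayload(commit("note", b"n2")),               # bare commitment
]
tree = StateCommitmentTree()
positions = [tree.insert(p) for p in block]
mine = scan(block, toy_decrypt)
```

The tree treats all three entries identically, while the scanning client does decryption work for only two of them.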

Currently, we only use this to implement swaps, as described in the next section, but future work building on this foundation will allow users like market makers to maintain their state off-chain, unlocking significant performance and scalability gains. Operationalizing this poses some interesting challenges; we've been thinking about them and are excited to share our ideas soon.

Simplified Swaps

Earlier this fall, we described our initial implementation of shielded swaps, which allow Penumbra users to swap assets from one type to another without leaving the shielded pool. To do this, we need a way for clients to execute state updates asynchronously, because the clearing price for the batch only becomes known after the transaction is submitted. Our insight was that we could model Futures (or Promises) on-chain, pausing execution at an .await point by creating a SNARK-friendly Merklization of all of the intermediate execution state, and then resuming execution in a later transaction by opening the commitment to the intermediate execution state. For instance, in a swap, the initial transaction's Swap action commits to the user's input amounts, trading pair, and claim address, and once the batch prices are published, the SwapClaim action privately mints their pro-rata share of the batch, proving consistency between the public prices and the private input data.
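
The pause-and-resume pattern can be sketched in Python as follows. This is a simplified illustration, not the actual circuit: the function names and the dictionary layout are hypothetical, SHA-256 over JSON stands in for a SNARK-friendly commitment, the batch is modeled as a single one-directional total, and the opening is checked in the clear rather than in zero knowledge.

```python
import hashlib
import json
from fractions import Fraction


def commit_swap(swap: dict) -> str:
    """Stand-in for the commitment to the paused execution state."""
    encoded = json.dumps(swap, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def swap_claim(swap, commitment, batch_total_in, batch_total_out):
    """Resume the paused swap: open the commitment, then take a pro-rata share."""
    if commit_swap(swap) != commitment:
        raise ValueError("swap plaintext does not open the commitment")
    share = Fraction(swap["input_amount"], batch_total_in)
    return {"address": swap["claim_address"], "amount": share * batch_total_out}


# Phase 1 (Swap): the inputs are fixed before the clearing price is known.
swap = {"input_amount": 25, "pair": ["A", "B"], "claim_address": "addr1"}
commitment = commit_swap(swap)

# Phase 2 (SwapClaim): assume the batch traded 100 A for 400 B in total.
output = swap_claim(swap, commitment, batch_total_in=100, batch_total_out=400)
```

Because the claim address is part of the committed state, the second phase carries no new intent: anyone holding the swap plaintext can finish the computation, and the result can only go to the address fixed in the first phase.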

In our initial implementation, we recorded each user's execution state by creating a "swap NFT", 1 unit of an asset whose asset ID was a commitment to their swap inputs, and recording it in a shielded note like any other asset. This was a convenient way to get the system working, but we realized it had a few ill-fitting parts:

We wanted it to be possible to submit the SwapClaim in the second phase automatically, without separate signing. Requiring a signature would be bad UX, and the SwapClaim doesn't represent any new intent: it just finishes a computation already started by the Swap, which specifies the claim address up front.
A shielded note represents a (typed) value, plus the capability to spend that value, similarly to a Bitcoin P2PK UTXO. This means that every note inherently has a spend capability. How does that capability relate to the claim process? What happens if someone sends someone else a swap NFT, which doesn't affect the pre-specified claim address?
Because the SwapClaim shouldn't require authorization, we ended up building a special-case proof statement that allowed ignoring the spend capability when spending a note, and instead proving that the note recorded a swap NFT, and that the output notes were minted to the correct claim address. This felt wrong: if the capability is going to be ignored, it shouldn't be recorded in the first place!
To establish that the SwapClaim used the correct output prices, we proved that the prices used as the public input to the SwapClaim proof matched the height at which the note recording the swap NFT was created. But this is definitely wrong, because the height at which the note recording the NFT was created is not necessarily the same as the height at which the swap NFT itself was created. Since the swap NFT is just another asset in the shielded pool, a user can send their swap NFT to themselves, effectively getting a free option on any later clearing price!
Because the SwapClaim creates output notes, the claimer needs to be able to learn about them, in order to have effective control over the output funds. But while the note commitment is checked in the circuit, the encrypted note payload isn't, so a malicious client could correctly claim a user's output funds, but encrypt garbage data in the payload, effectively burning the money.

However, after the design changes described in the last section, all the ingredients for a much simpler solution that avoids these issues were in place.

Once we generalized our commitment tree to arbitrary state fragments, we could commit directly to users' swap inputs: we recast the note commitment tree as a state commitment tree, define both note commitments and swap commitments as different kinds of state commitments, and insert the swap commitments directly into the state commitment tree rather than creating "swap NFTs". On the client side, we expand the set of possible state payloads to add encrypted swaps as well as encrypted notes, allowing clients to detect their own swaps while scanning.

Because the swap commitment is inserted into the tree by the chain, and can't be relocated after the fact, we can use its height in the SwapClaim proof, which just has to prove knowledge of an inclusion path for the swap commitment, and that the inclusion path passes through the claimed height. To prevent the DoS attack, we make the output notes deterministically derivable from the original swap plaintext and the public clearing prices -- but at that point, why include the output notes at all? Now that we have support for rolled-up state commitments, we can skip including the output notes on-chain, saving scanning work for every other client.
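
A small Python sketch shows why deterministic outputs close both holes. It rests on assumptions that are not from the post: the `rseed` field, the way the output blinding is derived from it, the single scalar clearing price, and every function name are hypothetical, and the checks a real SwapClaim proof would enforce in zero knowledge are done here in the clear.

```python
import hashlib
from fractions import Fraction


def h(*parts) -> str:
    """Stand-in hash for every commitment here (not SNARK-friendly)."""
    return hashlib.sha256("|".join(str(p) for p in parts).encode()).hexdigest()


def swap_commitment(swap) -> str:
    return h("swap", swap["input_amount"], swap["claim_address"], swap["rseed"])


def derive_output_note(swap, clearing_price):
    """The output note is a pure function of swap plaintext and public price."""
    return {
        "address": swap["claim_address"],
        "amount": Fraction(swap["input_amount"]) * clearing_price,
        "blinding": h("output-blinding", swap["rseed"]),  # derived, not fresh
    }


def note_commitment(note) -> str:
    return h("note", note["address"], note["amount"], note["blinding"])


def verify_swap_claim(swap_heights, prices, swap, claimed_height, output_commitment):
    """The checks a SwapClaim proof would enforce, done here in the clear."""
    # The chain inserted the swap commitment, so its height cannot be moved.
    if swap_heights.get(swap_commitment(swap)) != claimed_height:
        return False
    expected = derive_output_note(swap, prices[claimed_height])
    return note_commitment(expected) == output_commitment


swap = {"input_amount": 25, "claim_address": "addr1", "rseed": "r0"}
swap_heights = {swap_commitment(swap): 7}   # recorded by the chain at height 7
prices = {7: Fraction(4), 8: Fraction(5)}   # public clearing prices per height

# Anyone can submit the claim, and the owner can recompute the note offline.
note = derive_output_note(swap, prices[7])
ok = verify_swap_claim(swap_heights, prices, swap, 7, note_commitment(note))

# Claiming against a later, better price fails: the height is pinned.
better = derive_output_note(swap, prices[8])
free_option = verify_swap_claim(swap_heights, prices, swap, 8, note_commitment(better))
```

Since the owner can rebuild the output note from data they already hold, there is no encrypted payload for a third-party claimer to corrupt, and nothing for other clients to scan.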

In addition to fixing the security problems in the initial proof of concept, these changes streamline the client-side implementation of synchronization and scanning, by removing special cases and features-jammed-into-other-features in favor of a cleaner, more extensible system for client state.

Other progress

We've also been pushing forward on other fronts, with the following work in progress but not yet ready to land in a user-visible way:

We implemented a ZK version of our Spend proof; in an upcoming release, we'll begin replacing our mock transparent proofs with ZK versions as we finalize Penumbra's functionality.
We implemented ICS23-compatible non-existence proofs for the Jellyfish Merkle Tree we use to store Penumbra's public state. This gives us full compatibility with ICS23-verifying IBC chains, once they have merged a branch that contains a small change to allow non-existence proofs for sparse Merkle trees like our JMT or Celestia's SMT. (Existence proofs are already supported without any code changes, and we used them this summer to open an IBC connection to the Cosmos Hub.)
We migrated our testnet deployment infrastructure to use a Kubernetes cluster, building on the excellent work by Strangelove. Upcoming work will focus on making Penumbra easier to deploy and maintain for those running nodes and validators. If you're interested, stop by the #deployments channel in Discord.

Stay tuned for more news on what we're building soon!