Penumbra Winter 2022 Update

Penumbra is building the first shielded DEX, allowing on-chain trading and marketmaking with private strategies. Since our summer update, we've continued to realize this vision, building and breaking in public, shipping weekly testnets, and growing our community. As 2022 draws to a close, here's a comprehensive update on what we've built since the summer, and what we're building next.

What we've built

Since our last comprehensive update this summer, we've made huge progress on a number of fronts:

Shielded Swaps

We've run through two design iterations on Penumbra's shielded swap mechanism, which allows Penumbra users to privately swap tokens without leaving the shielded pool. On Penumbra, all swaps in each block are executed in a single batch, with a common clearing price, eliminating frontrunning by eliminating the entire concept of intra-block ordering.

Shielded swaps reveal an important technical challenge: providing private interaction with public shared state. We want individual users' trades and account balances to stay private, while retaining a public view of the aggregate market state, like clearing prices, available liquidity, and trading volume. We think this is the key problem to solve to make blockchain privacy mainstream, and that focusing on solving one concrete use-case first will point the way to better solutions for blockchain privacy generally. For more on this aspect, check out Henry's talks at DevCon Bogota or at ZKSummit 8.

To solve it, we designed a new, asynchronous ZK execution model that separates the private, per-user state from the public, aggregate state, and explicitly models communication between the two using message-passing. Our insight was that we could model Futures (or Promises) on-chain, pausing execution at each .await point by creating a SNARK-friendly commitment to all of the intermediate execution state, and then resuming execution in a later transaction by opening the commitment to the intermediate execution state.

For instance, in a swap, the initial transaction's Swap action can't know the clearing price of the batch, as it hasn't happened yet, so instead, it commits to the intermediate execution state – the user's input amounts, trading pair, and claim address – and once the batch prices are published, a followup transaction's SwapClaim action can open that commitment, and privately mint the user's pro-rata share of the batch, proving consistency between the public prices and the private input data. Importantly, the followup transaction doesn't require any additional signing, so it can be submitted automatically by a wallet.
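To make the commit-then-open flow concrete, here is a minimal sketch in Rust. It is not Penumbra's implementation: the struct layouts, field names, and the `commit` and `claim` functions are hypothetical, the real protocol uses a SNARK-friendly commitment checked inside a ZK proof, and the standard library's `DefaultHasher` stands in for it here purely for illustration (it is not cryptographically secure). The sketch also assumes the whole batch is filled.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Intermediate execution state saved at the "await point" (hypothetical layout).
#[derive(Clone, Debug, Hash, PartialEq, Eq)]
struct SwapPlaintext {
    trading_pair: (u32, u32), // illustrative asset ids
    delta_1: u64,             // user's input of asset 1
    delta_2: u64,             // user's input of asset 2
    claim_address: [u8; 8],   // illustrative address bytes
    blinding: u64,            // hides the contents of the commitment
}

/// Public per-batch totals published once the batch has executed (hypothetical).
#[derive(Clone, Copy, Debug)]
struct BatchSwapOutput {
    delta_1: u64,  // total input of asset 1 across the batch
    delta_2: u64,  // total input of asset 2 across the batch
    lambda_1: u64, // total output of asset 1, paid to asset-2 inputs
    lambda_2: u64, // total output of asset 2, paid to asset-1 inputs
}

/// Stand-in commitment: a plain (non-cryptographic) hash of the plaintext.
fn commit(plaintext: &SwapPlaintext) -> u64 {
    let mut hasher = DefaultHasher::new();
    plaintext.hash(&mut hasher);
    hasher.finish()
}

/// SwapClaim: open the commitment and compute the pro-rata outputs (out_1, out_2).
/// Returns None if the opening does not match the commitment.
fn claim(commitment: u64, opening: &SwapPlaintext, batch: &BatchSwapOutput) -> Option<(u64, u64)> {
    if commit(opening) != commitment {
        return None;
    }
    let pro_rata = |input: u64, total_in: u64, total_out: u64| -> u64 {
        if total_in == 0 {
            0
        } else {
            ((input as u128 * total_out as u128) / total_in as u128) as u64
        }
    };
    // Inputs of asset 1 are paid out in asset 2, and vice versa.
    let out_2 = pro_rata(opening.delta_1, batch.delta_1, batch.lambda_2);
    let out_1 = pro_rata(opening.delta_2, batch.delta_2, batch.lambda_1);
    Some((out_1, out_2))
}

fn main() {
    let swap = SwapPlaintext {
        trading_pair: (1, 2),
        delta_1: 100,
        delta_2: 0,
        claim_address: [7; 8],
        blinding: 42,
    };
    // Published by the Swap action before the clearing price is known.
    let commitment = commit(&swap);
    // Published by the chain after the batch executes.
    let batch = BatchSwapOutput { delta_1: 1_000, delta_2: 0, lambda_1: 0, lambda_2: 2_000 };
    println!("{:?}", claim(commitment, &swap, &batch));
}
```

Nothing in `claim` needs a signature: it only needs the opening and the public batch data, which is why a wallet can submit the followup automatically.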

In September, we shipped the first version of ZSwap, our mechanism for sealed-bid batch swaps; our announcement post has details on our initial design. That initial design had some rough edges, however, and while thinking through some other aspects of the system design in November, we landed on a second design iteration which simplifies the design and fixes a security flaw. The design simplification and bugfix are described in our announcement post for Testnet 38, which we shipped in December.

Transaction Plans, Perspectives, and Views

It should be as easy to interact with the shielded chain state on Penumbra as it is to interact with transparent chain state on a transparent chain. But solving on-chain privacy creates much more complex problems about legibility: now that transaction data is private by default, how do users understand it? On a transparent chain, everyone has complete visibility into every transaction, and this assumption permeates nearly every aspect of existing blockchain tooling. For instance, now that transactions are opaque, how does a user get the information required to decide whether to sign them? How does a user's wallet understand and show the effects of their transactions? How does it understand and show the effects of transactions where it has incomplete visibility?

We realized that to make it as easy to build interfaces and tooling for Penumbra as it is for transparent chains, we needed to address legibility of shielded transactions in a reusable, first-class way, solving the problem once, rather than forcing every interface or tool to build its own solution just to get started. So, we now specify data formats modeling the entire lifecycle of a shielded transaction, both before and after its creation.

Before a transaction is created, it's specified in a TransactionPlan, which contains a complete, plaintext description of all data that will be used to build the transaction. This allows a custodian to review the exact effects of a transaction before authorizing it, so Penumbra users can have confidence that the transaction will do what they intended it to. And because the TransactionPlan is specified as a Protobuf, it can be serialized, sent across the network, and parsed by tooling in any language.

Once the TransactionPlan is assembled into a Transaction, the cleartext data in the plan is turned into encrypted payload data, or omitted entirely and replaced by opaque commitments. Different parties may have different levels of visibility into the transaction's encrypted payloads. For instance, in a payment, the sender can view the entire transaction contents, while the receiver can only view the encrypted memo and the outputs sent to them, but not the sender's spends and change outputs.

We capture the notion that there are different vantage points on the same transaction data with a TransactionPerspective: each TransactionView is generated by viewing a Transaction from some particular TransactionPerspective.

The TransactionView models a decrypted, interpreted view of a transaction, and is intended to be consumed by ecosystem tools and interfaces. For instance, while the Spend action of a Transaction only has opaque commitment data and ZK proofs, the corresponding SpendView in a TransactionView is an enum, either SpendView::Opaque, or SpendView::Visible with the plaintext note that was spent. This means that rather than requiring every ecosystem tool to implement its own logic to decrypt and understand the effects of shielded transactions, tools can consume TransactionViews, without having to support Penumbra-specific cryptography. And, like the TransactionPlan, the TransactionView is also specified as a Protobuf, so it can be serialized and parsed by tooling in any language.
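As a rough illustration of how a view is derived from a perspective, the sketch below models only spends. The `SpendView::Opaque` and `SpendView::Visible` variant names come from the description above; every field, the `u64` stand-in for a commitment, and the `view_from_perspective` function are hypothetical simplifications rather than the actual Protobuf schema.

```rust
use std::collections::HashMap;

/// Plaintext note (hypothetical fields).
#[derive(Clone, Debug, PartialEq)]
struct Note {
    amount: u64,
    asset: String,
}

/// On-chain spend: only opaque data (a stand-in commitment; proofs omitted).
struct Spend {
    note_commitment: u64,
}

struct Transaction {
    spends: Vec<Spend>,
}

/// Per-transaction openings available from one vantage point (hypothetical layout).
struct TransactionPerspective {
    openings: HashMap<u64, Note>,
}

#[derive(Debug, PartialEq)]
enum SpendView {
    /// The perspective can open this spend, so the plaintext note is shown.
    Visible { note: Note },
    /// The perspective has no opening, so only the opaque data is shown.
    Opaque { note_commitment: u64 },
}

struct TransactionView {
    spends: Vec<SpendView>,
}

/// Interpret a transaction from one perspective.
fn view_from_perspective(tx: &Transaction, perspective: &TransactionPerspective) -> TransactionView {
    let spends = tx
        .spends
        .iter()
        .map(|spend| match perspective.openings.get(&spend.note_commitment) {
            Some(note) => SpendView::Visible { note: note.clone() },
            None => SpendView::Opaque { note_commitment: spend.note_commitment },
        })
        .collect();
    TransactionView { spends }
}

fn main() {
    let tx = Transaction {
        spends: vec![Spend { note_commitment: 1 }, Spend { note_commitment: 2 }],
    };
    let mut openings = HashMap::new();
    openings.insert(1, Note { amount: 50, asset: "penumbra".to_string() });
    let view = view_from_perspective(&tx, &TransactionPerspective { openings });
    for spend in &view.spends {
        println!("{:?}", spend);
    }
}
```

A tool consuming the resulting view only has to match on the enum; it never touches key material or decryption logic.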

Internally, a TransactionPerspective is a bundle of per-transaction key material and commitment openings, generated by applying an account's long-term full viewing key to the transaction data. The transaction perspective can be thought of as a per-transaction viewing key, but we decided to avoid using that term, because referring to a "transaction viewing key" could give the impression that each transaction has a unique viewing key, rather than multiple possible perspectives.

Beyond developer tooling and ergonomics, this design also improves Penumbra's security and auditability properties. Because the transaction perspective is a self-contained, serializable object, we now have the ability to selectively disclose information about specific transactions, rather than just disclosing an account's long-term full viewing keys, which give access to all past and future activity. This gives users the ability to disclose fine-grained selections from their transaction graph for accounting, auditing, or compliance purposes. It also means that we can design wallets or frontend interfaces that don't require access to long-term key material and allow users to revoke access.

Improved State and Execution Model

The chain state is the backbone of the Penumbra protocol: it's what the network comes to consensus on, it's how the network records data, it's how nodes implement application logic, and is thus intertwined with the chain's execution model, and so on. So it's critical to get right. As one of the first projects building a Tendermint chain in pure Rust, we didn't have an existing state layer and application framework to use, which has been both a curse and a blessing: a curse, because designing and building one is a lot of work, and a blessing, because the one we've ended up building is extremely powerful.

Our first iteration was Postgres-based, and only useful as an MVP. Our second iteration moved all of the chain state into a Jellyfish Merkle Tree forked from the Diem project, and used it to build a component-based node framework. This worked well, but further iteration revealed some major limitations. This fall, we addressed those limitations by landing the third iteration of Penumbra's state model, which we think is close to its final form, and will be generally useful for Rust-based blockchains other than Penumbra.

Penumbra's new storage system provides a versioned, verifiable key-value store with lightweight, copy-on-write forks. A single Storage instance records a sequence of versioned States, each a lightweight snapshot of a particular version of the chain state. Each State instance can also be used as a copy-on-write fork to build up changes to the chain state before committing them to persistent storage, and those changes to the State can be built up in transactionally-applied groups.

Each State consists of two data stores:

  • A verifiable key-value store, with UTF-8 keys and byte values, backed by the Jellyfish Merkle Tree. The JMT is a sparse merkle tree that records hashed keys, so we also record an index of the keys themselves to allow range queries on keys rather than key hashes. This index, however, is not part of the verifiable consensus state.
  • A secondary, non-verifiable key-value store with byte keys and byte values, backed directly by RocksDB. This is intended for use building application-specific indexes of the verifiable consensus state – and in particular, indexes of liquidity positions used to accelerate our DEX engine.
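The versioned, copy-on-write behavior described above can be sketched with an in-memory toy. This is not the penumbra-storage API: the `Snapshot` and `StateDelta` names and all method signatures are assumptions made for illustration, the Merkle tree and RocksDB layers are omitted, and `commit` clones the whole map where a real store would share structure between versions.

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

type Map = BTreeMap<String, Vec<u8>>;

/// Immutable view of one version of the state; cheap to clone.
#[derive(Clone)]
struct Snapshot {
    version: u64,
    data: Arc<Map>,
}

/// Copy-on-write fork: buffered writes layered over a snapshot.
/// `None` marks a deleted key.
struct StateDelta {
    base: Snapshot,
    writes: BTreeMap<String, Option<Vec<u8>>>,
}

/// Records the sequence of committed versions.
struct Storage {
    versions: Vec<Snapshot>,
}

impl Snapshot {
    fn get(&self, key: &str) -> Option<&Vec<u8>> {
        self.data.get(key)
    }
    fn fork(&self) -> StateDelta {
        StateDelta { base: self.clone(), writes: BTreeMap::new() }
    }
}

impl StateDelta {
    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.writes.insert(key.to_string(), Some(value));
    }
    fn delete(&mut self, key: &str) {
        self.writes.insert(key.to_string(), None);
    }
    /// Reads see buffered writes first, then fall through to the base snapshot.
    fn get(&self, key: &str) -> Option<&Vec<u8>> {
        match self.writes.get(key) {
            Some(buffered) => buffered.as_ref(),
            None => self.base.get(key),
        }
    }
}

impl Storage {
    fn new() -> Self {
        Storage { versions: vec![Snapshot { version: 0, data: Arc::new(Map::new()) }] }
    }
    fn latest(&self) -> Snapshot {
        self.versions.last().expect("version 0 always exists").clone()
    }
    /// Apply a fork's writes as the next version; older snapshots are untouched.
    fn commit(&mut self, delta: StateDelta) -> u64 {
        assert_eq!(delta.base.version, self.latest().version, "fork is stale");
        let version = delta.base.version + 1;
        let mut data: Map = (*delta.base.data).clone(); // simplification: full copy
        for (key, value) in delta.writes {
            match value {
                Some(v) => { data.insert(key, v); }
                None => { data.remove(&key); }
            }
        }
        self.versions.push(Snapshot { version, data: Arc::new(data) });
        version
    }
}

fn main() {
    let mut storage = Storage::new();
    let before = storage.latest();
    let mut fork = before.fork();
    fork.put("dex/position/1", vec![1, 2, 3]);
    let version = storage.commit(fork);
    println!("committed version {}", version);
    println!("old snapshot sees key: {}", before.get("dex/position/1").is_some());
}
```

Dropping a `StateDelta` without committing it discards its writes, which is what makes speculative execution against a snapshot cheap.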

The penumbra-storage crate provides a generic, byte-oriented key-value store. To use it, we layer on behavior with extension traits in our penumbra-proto crate that transform it into an object store with Protobuf-encoded data. This means that every piece of data in the Penumbra chain state has a documented and standardized schema, because the chain state is a kind of public API, and should be treated as such. As a nice side effect, this design means that code inspecting the Penumbra chain state can be written in any language with a Protobuf compiler.

Because each State snapshot is independent of all others, there's no need for a global state mutex, as in the Cosmos SDK. For instance, pd creates a new State snapshot at the beginning of processing each RPC request, allowing that request to be processed against a consistent version of the chain state without requiring synchronization with the consensus worker. And, because those snapshots can be mutated without committing them to the chain state, it's easy to implement extremely powerful simulation capability. For instance, a fullnode could run simulations of different batch trades against the current on-chain liquidity, processing each member of the ensemble in parallel.

Our execution model also takes advantage of these capabilities. On Penumbra, transaction verification and execution is modeled with the ActionHandler trait and splits into three phases:

  1. check_stateless, which performs all stateless checks, like signature or proof verification;
  2. check_stateful, which performs stateful checks against the chain state prior to execution, like consistency checks between the public inputs to proofs and the chain state;
  3. execute, which performs checks while writing changes to the chain state.

All three phases must succeed for a transaction's changes to be applied to the state, but segmenting them allows the implementation to maximize parallelism and concurrency: all check_stateless calls within a block can be run concurrently, and all check_stateful calls within a transaction can be run concurrently, leaving only the writes in execute on the critical path. This means that the bulk of the compute-intensive processing (in check_stateless) and the bulk of the state reads (in check_stateful) can be run in parallel, using Tokio to schedule the task graph over the available execution resources.
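The three-phase split can be sketched as follows. The method names mirror the phases listed above, but the signatures are assumptions: the sketch is synchronous and uses scoped threads from the standard library instead of Tokio, the `State` alias and the `Transfer` action are toy stand-ins, and it handles a single transaction rather than a whole block.

```rust
use std::collections::BTreeMap;
use std::thread;

/// Toy chain state: account name -> balance.
type State = BTreeMap<String, u64>;

/// Hypothetical, synchronous version of the three-phase handler.
trait ActionHandler: Sync {
    fn check_stateless(&self) -> Result<(), String>;
    fn check_stateful(&self, state: &State) -> Result<(), String>;
    fn execute(&self, state: &mut State) -> Result<(), String>;
}

/// Toy action used only for illustration.
struct Transfer {
    from: String,
    to: String,
    amount: u64,
    signature_ok: bool, // stands in for signature or proof verification
}

impl ActionHandler for Transfer {
    fn check_stateless(&self) -> Result<(), String> {
        if self.signature_ok && self.amount > 0 {
            Ok(())
        } else {
            Err("stateless check failed".to_string())
        }
    }
    fn check_stateful(&self, state: &State) -> Result<(), String> {
        if state.contains_key(&self.from) {
            Ok(())
        } else {
            Err("unknown sender".to_string())
        }
    }
    fn execute(&self, state: &mut State) -> Result<(), String> {
        let balance = state.get(&self.from).copied().unwrap_or(0);
        let remaining = balance.checked_sub(self.amount).ok_or("insufficient funds")?;
        state.insert(self.from.clone(), remaining);
        *state.entry(self.to.clone()).or_insert(0) += self.amount;
        Ok(())
    }
}

fn process_transaction<A: ActionHandler>(actions: &[A], state: &mut State) -> Result<(), String> {
    // Phases 1 and 2 only read, so every check can run concurrently.
    let snapshot: &State = state;
    let checks: Result<(), String> = thread::scope(|s| {
        let mut handles = Vec::new();
        for action in actions {
            handles.push(s.spawn(move || action.check_stateless()));
            handles.push(s.spawn(move || action.check_stateful(snapshot)));
        }
        handles
            .into_iter()
            .try_for_each(|handle| handle.join().expect("check thread panicked"))
    });
    checks?;
    // Phase 3 writes sequentially, on a copy, so a failure leaves `state` untouched.
    let mut fork = state.clone();
    for action in actions {
        action.execute(&mut fork)?;
    }
    *state = fork;
    Ok(())
}

fn main() {
    let mut state = State::new();
    state.insert("alice".to_string(), 10);
    let tx = vec![Transfer { from: "alice".into(), to: "bob".into(), amount: 4, signature_ok: true }];
    println!("{:?}", process_transaction(&tx, &mut state));
    println!("{:?}", state);
}
```

Only the loop over `execute` is sequential; everything before it is read-only and can be scheduled freely.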

This architecture is still relatively Penumbra-specific, since our efforts are focused on getting Penumbra to mainnet, but we expect that by the time Penumbra is ready, we'll be able to extract it into a general-purpose application framework for building blockchains in Rust. Already, the penumbra-storage crate is free of Penumbra-specific dependencies and is intended for general use.

Client Sync Improvements

Penumbra clients need to synchronize with the network to learn about their state and create transactions -- unlike a transparent chain, they can't just ask a fullnode for their account balance, because a fullnode doesn't have that data, and therefore can't leak it. To handle this, Penumbra includes a client protocol that allows clients to synchronize their local state with the chain. Making this protocol simple and efficient translates directly into faster and better client experiences.

In our summer update, we described optimization work we did to accelerate the sync process, and dove into more detail in our post on our Tiered Commitment Tree. However, while we'd made synchronization fast, some accumulated mistakes from previous design iterations meant it was more complex than it needed to be, making it more difficult to write clients for Penumbra. Meanwhile, further design iterations on other parts of the system gave us new insight on how we could simplify it. We combined these ideas into improvements to the unbonding mechanism and to the swap claim mechanism, and shipped them in December; full details are written up in the announcement post for Testnet 38.

Protocol Governance MVP

In August, we shipped an MVP of Penumbra's governance component, with support for validator voting. Penumbra's governance component is similar to the Cosmos SDK's governance module, with the biggest difference being that because Penumbra provides delegation privacy, on Penumbra, individual delegators' votes are private, while validator votes are transparent. This provides privacy for delegators and accountability for validators, who cast default votes on behalf of their delegation pool (as in the Cosmos SDK).

We also make a few other changes, based on lessons learned from the Cosmos ecosystem. For instance, validator votes are signed not with the validator's long-term identity key, but with a governance-specific subkey. This allows validators to participate in governance without having to constantly pull their long-term key material out of cold storage, or even to delegate their participation to other entities, opening up interesting possibilities (e.g., a validator that specializes in operating infrastructure could delegate their governance voting to a DAO).

ZK Proof Implementation

In order to iterate on Penumbra's cryptography and system design, we've built our current testnets using mock "transparent proofs". Everywhere we intend to have a ZK proof, we instead encode the witness data transparently as a mock proof, and check the proof statements directly in software. This gives us an opaque blob of data with the exact same verification interface as a real ZK proof, while allowing us to iterate much more quickly than we could if we were doing circuit programming.
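A toy version of the idea, under loud assumptions: the statement ("I know an amount and blinding factor that open this commitment"), the `Witness` and `TransparentProof` types, and the hash-based `commit` function are all hypothetical, and the hash is the standard library's non-cryptographic `DefaultHasher`. The point is only the shape of the interface: the "proof" is the serialized witness, and verification re-checks the statement directly.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in commitment (not cryptographically secure).
fn commit(amount: u64, blinding: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    (amount, blinding).hash(&mut hasher);
    hasher.finish()
}

/// Private witness for the toy statement.
struct Witness {
    amount: u64,
    blinding: u64,
}

/// Mock proof: an opaque byte blob that is really just the encoded witness.
struct TransparentProof(Vec<u8>);

impl TransparentProof {
    /// "Proving" is just serializing the witness.
    fn prove(witness: &Witness) -> Self {
        let mut bytes = Vec::with_capacity(16);
        bytes.extend_from_slice(&witness.amount.to_le_bytes());
        bytes.extend_from_slice(&witness.blinding.to_le_bytes());
        TransparentProof(bytes)
    }

    /// Same shape as a real verifier: public input in, accept or reject out.
    /// Internally it decodes the witness and checks the statement in software.
    fn verify(&self, public_commitment: u64) -> Result<(), &'static str> {
        if self.0.len() != 16 {
            return Err("malformed proof");
        }
        let mut amount = [0u8; 8];
        let mut blinding = [0u8; 8];
        amount.copy_from_slice(&self.0[..8]);
        blinding.copy_from_slice(&self.0[8..]);
        let amount = u64::from_le_bytes(amount);
        let blinding = u64::from_le_bytes(blinding);
        if commit(amount, blinding) == public_commitment {
            Ok(())
        } else {
            Err("statement does not hold")
        }
    }
}

fn main() {
    let witness = Witness { amount: 5, blinding: 7 };
    let public_commitment = commit(witness.amount, witness.blinding);
    let proof = TransparentProof::prove(&witness);
    println!("{:?}", proof.verify(public_commitment));
}
```

Because callers only see `prove` and `verify`, swapping in a real proof system later changes the internals of those two functions and nothing else. A mock proof like this one hides nothing, of course, since it contains the witness.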

This fall, we started the process of replacing our mock proofs with ZK proofs, and wrote circuit implementations of our Spend and Output proofs, which we expect to integrate into testnets in the new year.

Tendermint 0.35/0.34 Migration

Last year, we started building Penumbra against Tendermint 0.35, which at the time was newly released. However, our testnets, as well as those of Celestia and Vega, revealed that the 0.35 release of Tendermint had serious stability problems when deployed in real network environments. At one point during the summer, for instance, we were unable to keep Penumbra testnets running for more than a day or two, not because of any problem with the Penumbra application, but because validators would mysteriously be unable to communicate with each other, causing chain halts unless we ensured that a single validator had a supermajority of voting power (allowing it to continue producing blocks even when unable to communicate with any other validators).

As a result of these stability issues, the Tendermint 0.35 release was discontinued by the Tendermint team, along with the unreleased work on Tendermint 0.36. Instead, Tendermint 0.37 will be based on the stable 0.34 code currently used in production.

Migration from 0.35 to 0.34 was a somewhat painful process. We interface to Tendermint via ABCI, and in particular through the ABCI domain types we contributed upstream to the tendermint-rs crate. However, because we had written that code against the 0.35 branch of tendermint-rs, which had to be rolled back, we had to backport the interface definitions, and then propagate those changes through our dependency tree, including ibc-rs. However, in collaboration with the Informal Systems team who maintain the tendermint-rs and ibc-rs crates, we were able to get through it, and have had no network stability issues since migrating to Tendermint 0.34.

Unfortunately, this change means that ABCI++, the upgraded interface to Tendermint that allows application logic to hook into consensus and implement advanced features like threshold cryptography, will not ship in its full form until Tendermint 0.38 at the earliest. In addition, based on our experiences so far, we're hesitant to rely on features that affect the network layer (like vote extensions, which allow validators to gossip threshold shares) without testing them in real-world conditions.

These factors led us to decide to defer flow encryption to a future network upgrade, rather than including it in the initial mainnet launch scope. This means that Penumbra won't conceal users' contributions to batch totals at launch. While this is unfortunate, the protocol is still designed to enable encrypting them in the future, and we feel that even without flow encryption, Penumbra's privacy guarantees are still a huge advance over the status quo.

Better Tooling, Deployments and Integration Testing

Finally, we also took the time to invest in improved tooling. We're now using Buf to host our Protobuf definitions, and auto-generate Go and Typescript packages that can work with any Penumbra data types or communicate with pd's RPC. We added native grpc-web support to pd, so every fullnode can serve GRPC to web clients without requiring a proxy, and started working on integrating automatic HTTPS support so every fullnode can serve RPC requests securely without any extra configuration beyond a domain name. Finally, we added a GRPC proxy to pd that provides access to the Tendermint RPC, so that Penumbra clients only need to speak to one RPC endpoint.

In collaboration with the Strangelove team, we also built Kubernetes-based testnet deployment infrastructure, which we'll be using for testnet deployments going forward, and started work on integrating Penumbra into their IBC test framework to do automatic conformance testing between Penumbra and Cosmos SDK testnets.

What we're building next

We're aiming for a mainnet launch next year, and closed out 2022 by planning the major components left to build before then. We'll be diving into more detail on each of these projects in future posts, but for now, here are brief summaries:

Penumbra DEX Engine

The last major component of the Penumbra ledger is the DEX engine, which executes batch swaps against the active liquidity. Penumbra's batched execution model has interesting implications for the DEX engine: because the DEX only executes once per block, not once per transaction, we can afford to make the DEX engine considerably more sophisticated, as the execution cost is amortized over all transactions in the block.

Penumbra uses a hybrid, order-book-like AMM with automatic routing. Liquidity on Penumbra is recorded as many individual concentrated liquidity positions, akin to an order book. Each liquidity position is its own AMM, with its own fee tier, and that AMM has the simplest possible form, a constant-sum (fixed-price) market maker. These component AMMs are synthesized into a global AMM by the DEX engine, which optimally routes trades across the entire liquidity graph.
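To illustrate what a constant-sum position looks like, here is a small sketch with integer arithmetic. The `Position` layout, the basis-point fee, and the `fill_best_first` helper are assumptions for this example. In particular, the greedy best-price-first fill over a single pair is a deliberately simple stand-in, not the routing algorithm the DEX engine uses across the liquidity graph.

```rust
/// One liquidity position: a fixed-price (constant-sum) market maker.
/// Ignoring fees, a trade keeps p * r1 + r2 constant, where p = p_num / p_den
/// is the price in units of asset 2 per unit of asset 1.
#[derive(Debug, Clone, PartialEq)]
struct Position {
    p_num: u128,
    p_den: u128,
    fee_bps: u128, // fee tier in basis points; assumed < 10_000
    r1: u128,      // reserves of asset 1
    r2: u128,      // reserves of asset 2
}

impl Position {
    /// Effective rate after fees, as a fraction (numerator, denominator).
    fn rate(&self) -> (u128, u128) {
        (self.p_num * (10_000 - self.fee_bps), self.p_den * 10_000)
    }

    /// Sell up to `input_1` of asset 1 into this position.
    /// Returns (unfilled input of asset 1, output of asset 2).
    fn fill(&mut self, input_1: u128) -> (u128, u128) {
        let (rate_num, rate_den) = self.rate();
        let wanted = input_1 * rate_num / rate_den;
        let (consumed, output) = if wanted <= self.r2 {
            (input_1, wanted)
        } else {
            // Reserves run out: take only the input needed to drain r2 (rounded up).
            let needed = (self.r2 * rate_den + rate_num - 1) / rate_num;
            (needed, self.r2)
        };
        self.r1 += consumed;
        self.r2 -= output;
        (input_1 - consumed, output)
    }
}

/// Greedy fill across positions on one pair, best effective price first.
fn fill_best_first(positions: &mut [Position], mut input_1: u128) -> (u128, u128) {
    positions.sort_by(|a, b| {
        let (a_num, a_den) = a.rate();
        let (b_num, b_den) = b.rate();
        // Compare rates by cross-multiplication; the higher rate sorts first.
        (b_num * a_den).cmp(&(a_num * b_den))
    });
    let mut output_2 = 0;
    for position in positions.iter_mut() {
        if input_1 == 0 {
            break;
        }
        let (unfilled, out) = position.fill(input_1);
        input_1 = unfilled;
        output_2 += out;
    }
    (input_1, output_2)
}

fn main() {
    let mut positions = vec![
        Position { p_num: 1, p_den: 1, fee_bps: 0, r1: 0, r2: 1_000 },
        Position { p_num: 2, p_den: 1, fee_bps: 0, r1: 0, r2: 100 },
    ];
    let (unfilled, output) = fill_best_first(&mut positions, 80);
    println!("unfilled = {}, output = {}", unfilled, output);
}
```

Stacking many such fixed-price positions at different prices and sizes is what gives the aggregate market its order-book-like shape.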

Penumbra has no intra-block trade ordering, so DEX execution operates at the end of the block in four phases:

  1. All newly opened liquidity positions are added to the active set.
  2. Trades are batched by liquidity pair and executed against the active liquidity.
  3. The chain arbitrages all active positions against each other and burns the arbitrage profits.
  4. All newly closed positions are removed from the active set.
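The batching step in phase 2 can be sketched as a simple aggregation. The `Swap` and `BatchInputs` types and the canonical pair ordering (lower asset id first) are assumptions for illustration; the other three phases are not modeled here.

```rust
use std::collections::BTreeMap;

type AssetId = u32;

/// One user's swap intent (hypothetical layout).
struct Swap {
    asset_in: AssetId,
    asset_out: AssetId,
    amount_in: u128,
}

/// Aggregate inputs for one trading pair within a block.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct BatchInputs {
    delta_1: u128, // total input of the pair's first asset
    delta_2: u128, // total input of the pair's second asset
}

/// Group every swap in the block by trading pair and sum the inputs in each direction.
/// Order within the block does not matter: the result depends only on the set of swaps.
fn batch_by_pair(swaps: &[Swap]) -> BTreeMap<(AssetId, AssetId), BatchInputs> {
    let mut batches = BTreeMap::new();
    for swap in swaps {
        if swap.asset_in == swap.asset_out {
            continue; // degenerate swap, ignored in this sketch
        }
        // Canonical pair ordering: lower asset id first (an assumption of this sketch).
        let pair = (swap.asset_in.min(swap.asset_out), swap.asset_in.max(swap.asset_out));
        let batch: &mut BatchInputs = batches.entry(pair).or_default();
        if swap.asset_in == pair.0 {
            batch.delta_1 += swap.amount_in;
        } else {
            batch.delta_2 += swap.amount_in;
        }
    }
    batches
}

fn main() {
    let swaps = vec![
        Swap { asset_in: 1, asset_out: 2, amount_in: 10 },
        Swap { asset_in: 2, asset_out: 1, amount_in: 5 },
        Swap { asset_in: 1, asset_out: 2, amount_in: 7 },
    ];
    for (pair, batch) in batch_by_pair(&swaps) {
        println!("{:?}: {:?}", pair, batch);
    }
}
```

Only these per-pair totals need to be executed against the active liquidity, which is why the engine's cost is amortized over the whole block.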

In this model, intra-chain arbitrage (making prices consistent within the chain) is performed automatically by the chain, while inter-chain arbitrage (making prices on Penumbra consistent with other markets) is performed by arbitrageurs. This arbitrage game has interesting pro-rata dynamics, which you can read about in this research paper we collaborated on with the Bain Capital Crypto team.

This model has other useful properties: market-makers can create fill-or-kill positions with prices valid for exactly one block without having to compete for ordering within that block, by opening and then closing a liquidity position in the same transaction. Or, traders can execute on the market-maker side, by creating a fill-or-kill position whose price crosses the spread, and waiting for the chain to fill it through arbitrage. Because trades are optimally routed, the chain can compose cheap stableswaps between different IBC paths on either end of a trade, making available liquidity independent of the specific bridge path of an asset. Finally, passive market-makers can replicate the payoff curves of other AMM trading functions, like UniV2's xy=k, though they may be outcompeted by active LPs.

Our new state model is key to implementing this design efficiently, as it allows us to maintain application-specific indexes (like the liquidity positions on each trading pair) with transactional, copy-on-write semantics, so we can speculatively mutate state during routing and choose whether or not to apply the results.

Web Interfaces

We've been laying the groundwork for web interfaces to Penumbra, with our modular client design, the TransactionPerspective/TransactionView design, and the Protobuf tooling that allows accessing Penumbra data structures from Typescript. We've identified a few key goals for interfaces to Penumbra:

We want to be able to support a wide variety of interfaces to Penumbra.

While it’s important that there be at least one first-class wallet experience at launch, Penumbra’s capabilities are multifaceted, and users will probably be best served by specialized interfaces: e.g., one for basic transfers or swaps, one for governance, one for power users to manage liquidity or examine the liquidity graph, etc.

Supporting a wide variety of interfaces is also important for decentralization: no one entity should “own” the Penumbra userbase via control of a single frontend, or be at risk of becoming a chokepoint for control of those users. This also means that it should be possible to build frontend interfaces to Penumbra that do not require custom backend infrastructure beyond an ordinary pd full node.

We want those interfaces to be as easy to build as interfaces to transparent chains.

Historically, interfaces to shielded chains have been more difficult to build than interfaces to transparent chains, because they require the application developer to manage synchronization of users’ private state, unlike a transparent chain, where user state is accessible via RPC. Penumbra is about privacy without compromise, so we want to make it as easy to build those interfaces for Penumbra as it is for a transparent chain.

We want our users to have security when interacting with those interfaces.

In order for users to benefit from the availability of a wide variety of third-party interfaces, users need to be confident they can use them without risk of losing funds or being hacked. We need to ensure that only the user can authorize a transaction, and that they can understand exactly what actions they’re authorizing when they do so.

Web Architecture

To accomplish these goals, we're building implementations of our view and custody services, and bundling them into a browser extension that can communicate with web content. All long-term key material is kept inside of the browser extension, so web content only has access to per-transaction data, and all transactions must be approved through the browser extension, which provides a secure display path for the user to review a proposed transaction. This allows a user to selectively grant viewing or spending permissions to relatively untrusted web interfaces, and to revoke those permissions later. The resulting architecture is shown below:

                 ┌───────────┐
                 │ Extension │
             ╭   │ ┌───────┐ │ custody
    spending │   │ │custody│ │ protocol
  capability │   │ │service│◀┼─────────────────┐
             ╰   │ └───────┘ │                 │
                 │           │                 │
                 │           │                 │
                 │           │                 │
             ╭   │ ┌───────┐ │ view            │
full viewing │   │ │view   │ │ protocol        │
  capability │   │ │service│◀┼──────┐          │
             ╰   │ └───────┘ │      │          │
                 │   ▲       │      │      ┌───┼────────────┐
                 └───┼───────┘      │      │   │ Web Content│
                     │              │      │   ▼            │
             ╭       │              │      │ ┌───────┐      │
 transaction │       │              └──────┼▶│wallet │      │
perspectives │       │                     │ │ logic │      │
             ╰       │                     │ └───────┘      │
                     │                     │   ▲ │          │
                     │ specific/oblivious  └───┼─┼──────────┘
                     │ client protocols        │ │
                     ├─────────────────────────┘ │ tx
                     │┌──────────────────────────┘ broadcast                     .───.
                     ││                                                        ,'     `.
             ╭   ┌───┼┼─────────────────────────────────────┐             .───;         :
      public │   │   ││                    Penumbra Fullnode│            ;              │
       chain │   │   ││ grpc/grpc-web                       │          .─┤              ├──.
        data │   │   ▼▼                                     │        ,'                     `.
             │   │ ┌────┐     tm rpc proxy     ┌──────────┐ │       ;               Penumbra  :
             │   │ │    │◀────────────────────▶│          │ │       : ┌───────────▶ Network   ;
             │   │ │ pd │◀────────────────────▶│tendermint│◀┼─────────┘                      ╱
             │   │ └────┘       abci app       └──────────┘ │         `.     `.     `.     ,'
             ╰   └──────────────────────────────────────────┘           `───'  `───'  `───'

We're working with the Zpoken team to build this system and feed back improvements to Penumbra's RPCs along the way, and we're excited to be able to show the results when they're ready.

Narsil: Sharded Custody and Personal Rollup

Because Penumbra uses custom cryptography to achieve privacy, we can't easily make use of existing custody tooling like HSMs or hardware wallets without extending them to add support for our cryptography. However, secure custody is a critical operational requirement. Our plan to address it is to build a threshold signing implementation using FROST, providing off-chain multisignatures indistinguishable from any other transactions.

This tool will be called Narsil, and it will act as a custodian and audit log for a single Penumbra account, sharing control of that account's spending authority across multiple Narsil shards.

While FROST solves the cryptographic part of threshold signatures, it doesn't solve the engineering problem of building a way for the participants to reliably communicate with each other and agree on what messages they should be signing. That problem is also subtle and difficult to solve, but we realized we already have a tool that allows a set of participants to come to consensus on what messages have occurred: Tendermint consensus.

Using Tendermint internally gives us fault-tolerant state replication with an off-the-shelf tool, but it also has another important advantage. Because the Narsil cluster has fault-tolerant replication of the audit log of every transaction the account has authorized, by implication, it has fault-tolerant replication of the entire account state. This means that users can choose to maintain their account state off-chain, using their Narsil cluster as a personal rollup, and only post state commitments, nullifiers, and proofs to the L1.

This seems particularly useful for active market makers, who can cheaply perform frequent updates to their own state (e.g., updating their liquidity positions potentially every block), while saving on gas and saving other users from scanning irrelevant state updates.

We're excited about these possibilities and we'll have more details to share on the Narsil design in the new year. For now, here's a rough idea of how Narsil will be structured, and how it integrates with other Penumbra software:

       ┌────────────────────────┐
       │ pcli (or other client) │
       └────────────────────────┘
            ▲               ▲
            │ custody       │ view
            │ protocol      │ protocol
            ▼               ▼
┌──────┬────────┬──────┬────────┬───────────────────────┐     ┌────────┐
│      │Custody │      │  View  │     client protocol   │     │Penumbra│
│      │Service │      │Service │◀──────────────────────┼─────│Fullnode│
│      └────────┘      └────────┘                       │     └────────┘
│         │ ▲               ▲                           │
│         │ │          View │                           │
│    Auth │ │ Auth   Advice │               absent for  │
│ Request │ │ Data          │                replicas   │
│         │ │          ┌────────┐          ┌ ─ ─ ─ ─ ─  │
│         │ │          │ Narsil │ FROST       Narsil  │ │
│         │ └──────────│ Ledger │─────────▶│  Shard     │
│         │            └────────┘           ─ ─ ─ ─ ─ ┘ │
│         │                 ▲     FROST          │      │
│         └─────────────┐   │  ┌─────────────────┘      │
│                       │ ABCI │                        │
│narsild                │   │  │                        │
└───────────────────────┼───┼──┼────────────────────────┘
                        │   │  │
                        ▼   │  ▼
                      ┌──────────┐
                      │Tendermint│
                      └──────────┘

Upgrade Design

Our current weekly testnets start from a new genesis state, and don't preserve any history, which has allowed us to iterate quickly. But before we go to mainnet, we need to build and exercise a real chain upgrade path. Upgrades to a shielded chain like Penumbra are tricky, because unlike a transparent chain, where all state is on-chain and can be snapshotted and migrated, state on Penumbra splits into two categories: public state, recorded on-chain, and private state, recorded off-chain.

While we can perform migrations on the public state, changes to the private state must be accretive, preserving all existing private state and allowing users to roll over their data on their own schedule. We're working on the design of this upgrade mechanism, and plan to exercise it on our testnets prior to mainnet.

ZK Proofs

We've implemented ZK circuits for two proof statements used in Penumbra. In the new year, we'll be integrating those ZK proofs into the rest of the system, replacing the other mock proofs with ZK implementations, now that the system design has matured.

Refinement, Testing, and Assurance

Finally, after finishing the remaining on-chain implementation tasks, we'll be changing gears from an expansive, iterative process of building out the system, exploring the design space, and moving as quickly as possible, to a process of careful refinement of the system we've built, cleaning up our code to make it easier to audit and understand, focusing on testing and assurance, and carefully checking every detail before we're ready to go to mainnet.

To stay up to date with the latest progress along the way, follow us on Twitter, join our Discord and subscribe to the #announcements channel, and check out our weekly testnets!