Penumbra is a fully private proof-of-stake network providing privacy to the Cosmos ecosystem.

Penumbra integrates privacy with proof-of-stake through a novel private delegation mechanism that provides staking derivatives, tax-efficient staking, and on-chain governance with private voting. Penumbra connects to the Cosmos ecosystem via IBC, acting as an ecosystem-wide shielded pool and allowing private transactions in any IBC-compatible asset. Users can also swap these assets using ZSwap, a private decentralized exchange supporting sealed-bid batch auctions and Uniswap-v3-style concentrated liquidity. Sealed-bid batch auctions prevent frontrunning, provide better execution, and reveal only the net flow across a pair of assets in each block, and liquidity positions are created anonymously, allowing traders to approximate their desired trading function without revealing their individual beliefs about prices.

This website is a living document containing unfinished public research, and is subject to revision. The updates section has a list of design updates.

If you’re interested in technical discussion about these ideas, why not

Slides from a presentation at the ZKValidator Cosmos Privacy Showcase can be found here.

# Overview

Penumbra is a fully private proof-of-stake network providing privacy to the Cosmos ecosystem.

Penumbra integrates privacy with proof-of-stake through a novel private delegation mechanism that provides staking derivatives, tax-efficient staking, and on-chain governance with private voting. Penumbra connects to the Cosmos ecosystem via IBC, acting as an ecosystem-wide shielded pool and allowing private transactions in any IBC-compatible asset. Users can also swap these assets using ZSwap, a private decentralized exchange supporting sealed-bid batch auctions and Uniswap-v3-style concentrated liquidity. Sealed-bid batch auctions prevent frontrunning, provide better execution, and reveal only the net flow across a pair of assets in each block, and liquidity positions are created anonymously, allowing traders to approximate their desired trading function without revealing their individual beliefs about prices.

## Private Transactions

Penumbra records all value in a single multi-asset shielded pool based on the Zcash Sapling design, but allows private transactions in any kind of IBC asset. Inbound IBC transfers shield value as it moves into the zone, while outbound IBC transfers unshield value.

Unlike Zcash, Penumbra has no notion of transparent transactions or a transparent value pool; instead, inbound IBC transfers are analogous to t2z Zcash transactions, outbound IBC transfers are analogous to z2t Zcash transactions, and the entire Cosmos ecosystem functions analogously to Zcash’s transparent pool.

Unlike the Cosmos Hub or many other chains built on the Cosmos SDK, Penumbra has no notion of accounts. Only validators have any kind of long-term identity, and this identity is only used (in the context of transactions) for spending the validator’s commission.

## Private Staking

In a proof-of-stake system like the Cosmos Hub, stakeholders delegate staking tokens by bonding them to validators. Validators participate in Tendermint consensus with voting power determined by their delegation size, and delegators receive staking rewards in exchange for taking on the risk of being penalized for validator misbehavior (slashing).

Integrating privacy and proof of stake poses significant challenges. If delegations are public, holders of the staking token must choose between privacy on the one hand and staking rewards and participation in consensus on the other hand. Because the majority of the stake will be bonded to validators, privacy becomes an uncommon, opt-in case. But if delegations are private, issuing staking rewards becomes very difficult, because the chain no longer knows the amount and duration of each address’ delegations.

Penumbra sidesteps this problem using a new mechanism that eliminates staking rewards entirely, treating unbonded and bonded stake as separate assets, with an epoch-varying exchange rate that prices in what would be a staking reward in other systems. This mechanism ensures that all delegations to a particular validator are fungible, and can be represented by a single token representing a share of that validator’s delegation pool, in effect a first-class staking derivative. Although delegation fungibility is key to enabling privacy, as a side effect, delegators do not realize any income while their stake is bonded, only a capital gain (or loss) on unbonding.

The total amount of stake bonded to each validator is part of the public chain state and determines consensus weight, but the bonded stake itself is just another token to be recorded in a multi-asset shielded pool. This provides accountability for validators and privacy and flexibility for delegators, who can trade and transact with their bonded stake just like they can with any other token.

It also provides an alternate perspective on the debate between fixed-supply and inflation-based rewards. Choosing the unbonded token as the numéraire, delegators are rewarded by inflation for taking on the risk of validator misbehavior, and the token supply grows over time. Choosing the bonded token as the numéraire, non-delegators are punished by depreciation for not taking on any risk of misbehavior, and the token supply is fixed.

## Private Governance

Like the Cosmos Hub, Penumbra supports on-chain governance with delegated voting. Unlike the Cosmos Hub, Penumbra’s governance mechanism supports secret ballots. Penumbra users can anonymously propose votes by escrowing a deposit of bonded stake. Stakeholders vote by proving ownership of their bonded stake prior to the beginning of the voting period and encrypting their votes to a threshold key controlled by the validator set. Validators sum encrypted votes and decrypt only the per-epoch totals.

## Private DEX

Penumbra provides private, sealed-bid batch swaps using ZSwap. ZSwap allows users to privately swap between any pair of assets. Individual swaps do not reveal trade amounts. Instead, all swaps in each block are executed in a single batch. Only the total amount in each batch is revealed, and only after the batch has been finalized. This prevents front-running and provides better execution, but also provides long-term privacy for individual swaps. Users can also provide liquidity by anonymously creating Uniswap-v3-style concentrated liquidity positions. These positions reveal the amount of liquidity and the bounds in which it is concentrated, but are not otherwise linked to any identity, so that (with some care) users can privately approximate arbitrary trading functions without revealing their specific views about prices.

# Concepts and Mechanisms

This section provides an overview of the concepts involved in Penumbra’s mechanism design.

# Epochs and Threshold Decryption

Penumbra organizes blocks into epochs. The number of blocks in each epoch is chosen relative to the block interval so that epochs last approximately 24 hours. For instance, an 8-second block interval would imply blocks per epoch.

Most state changes (e.g., transfers from one address to another) are applied at block boundaries. However, state changes related to delegation and consensus are generally applied only at epoch boundaries.

The validator set and its consensus weighting is determined on a per-epoch basis, and only changes at epoch boundaries, except in the case of slashing, which causes a validator to be removed from the validator set immediately.

Penumbra requires a homomorphic encryption scheme operating on i64 values that supports threshold decryption and distributed key generation. This scheme is used to allow transactions to include encrypted values that can be aggregated and then decrypted, with the validators revealing only the aggregate value, not the value from any individual transaction.

The choice to restrict validator changes to epoch boundaries plays a key role in Penumbra’s design, in two ways:

1. it allows the validators to share control of a single threshold key over the entire epoch, allowing aggregation of values in different blocks;

2. it provides a time interval over which changes to validators’ delegations can be aggregated, enhancing delegator privacy.

## Homomorphic Threshold Decryption

This encryption scheme only needs to work on i64 values, not arbitrary data, such as an entire transaction. Penumbra does not use threshold decryption to unseal entire encrypted transactions, because Penumbra transactions are constructed not to reveal any unnecessary information.

At the beginning of each epoch, the validator set performs distributed key generation for a homomorphic encryption scheme to produce a decryption key collectively controlled by the validators (on an equal basis, not a stake-weighted basis) and includes the encryption key in the first block of the epoch.

Because this key is only available after the first block of each epoch, some transactions cannot occur in the first block itself. Assuming a block interval similar to the Cosmos Hub, this implies an ~8-second processing delay once per day, a reasonable tradeoff against the complexity of phased setup procedures.

# Notes, Nullifiers, and Trees

Penumbra’s transaction model is derived from the Sapling shielded transaction design, with modifications to support multiple asset types, and several extensions to support additional functionality. The Sapling model is in turn derived from Bitcoin’s unspent transaction output (UTXO) model, in which value is recorded in transaction outputs that record the conditions under which they can be spent.

In Penumbra, value is recorded in notes, which function similarly to UTXOs. Each note specifies (either directly or indirectly) a type of value, an amount of value of that type, a spending key that authorizes spending the note’s value, and a unique nullifier derived from the note’s contents.

However, unlike UTXOs, notes are not recorded as part of the public chain state. Instead, the chain contains a note commitment tree, an incremental Merkle tree containing (public) commitments to (private) notes. Creating a note involves creating a new note commitment, proving that it is well-formed, . Spending a note involves proving that the spent note was previously included in the note commitment tree, using the spending key to demonstrate spend authorization, and revealing the nullifier, which prevents the same note from being spent twice.

Asset types are uniquely identified by an ADR001 name for compatibility with IBC, and Penumbra supports IBC transfers to transfer value into and out of of the zone, allowing Penumbra’s shielded pool to privately record any asset in the Cosmos ecosystem. ADR001 precisely specifies the entire pathway to the origin of any token, but when this precision is not required, we denote the Penumbra bearer tokens for asset XYZ as pXYZ; e.g., transferring ATOM from the Hub to Penumbra results in pATOM.

# Transactions

Transactions describe an atomic collection of changes to the ledger state. Each transaction consist of a sequence of descriptions for various actions. Each description adds or subtracts (typed) value from the transaction’s value balance, which must net to zero. Penumbra uses Sapling’s spend description, which spends a note and adds to the transaction’s value balance, and output description, which creates a new note and subtracts from the transaction’s value balance, and adds many new descriptions to support additional functionality:

• FeeDescriptions declare and burn transaction fees;

• TransferDescriptions transfer value out of Penumbra by IBC, consuming value from the transaction’s value balance, and producing an ICS20 FungibleTokenPacketData;

• DelegateDescriptions convert unbonded stake to bonded stake, consuming unbonded stake from the transaction’s value balance and producing new notes recording bonded stake;

• UndelegateDescriptions convert bonded stake to unbonded stake, consuming bonded stake from the transaction’s value balance and producing new notes recording unbonded stake;

• CommissionDescriptions are used by validators to sweep commission on staking rewards into shielded notes, adding unbonded stake to the transaction’s value balance;

• ProposalDescriptions are used to propose measures for on-chain governance and supply a deposit, consuming bonded stake from the transaction’s value balance and producing a new note that holds the deposit in escrow;

• VoteDescriptions perform private voting for on-chain governance and declare a vote, consuming bonded stake from the transaction’s value balance and producing a new note with the same amount of bonded stake;

• SwapDescriptions consume and burn tokens of one type from a transaction’s value balance and produce a swap commitment that permits the user to later mint tokens of a different type;

• SweepDescriptions allow a user who burned tokens of one type to mint tokens of a different type at a chain-specified clearing price, and adds the new tokens to a transaction’s value balance.

• DepositDescriptions deposit funds into the liquidity pool for a trading pair, and produce liquidity pool shares, consuming value of the traded types and producing value of the shares’ type;

• WithdrawDescriptions redeem liquidity pool shares and withdraw funds from a liquidity pool, consuming the shares’ type and producing value of the traded types;

# Staking and Delegation

As described in the overview, integrating privacy with proof-of-stake poses significant challenges. Penumbra sidesteps these challenges using a novel mechanism that eliminates staking rewards entirely, treating unbonded and bonded stake as separate assets, with an epoch-varying exchange rate that prices in what would be a staking reward in other systems. This mechanism ensures that all delegations to a particular validator are fungible, and can be represented by a single token representing a share of that validator’s delegation pool, in effect a first-class staking derivative.

This section describes the staking and delegation mechanism in detail:

# Staking Tokens

Penumbra’s staking token, denoted PEN, represents units of unbonded stake.

Penumbra treats stake bonded with a particular validator as a distinct asset, denoted PENb, with an epoch-varying exchange rate between PEN and PENb that prices in what would be a staking reward in other systems. This ensures that all delegations to a particular validator are fungible, and can be represented by a single token, in effect a first-class staking derivative that represents fractional ownership of that validator’s delegation pool.

Stake bonded with different validators is not fungible, as different validators may have different commission rates and different risk of misbehavior. Hence PENb is a shorthand for a class of assets (one per validator), rather than a single asset. PENb bonded to a specific validator can be denoted PENb(v) when it is necessary to be precise.

Each flavor of PENb is its own first-class token, and like any other token can be transferred between addresses, traded, sent over IBC, etc. Penumbra itself does not attempt to pool risk across validators, but nothing prevents third parties from building stake pools composed of these assets.

The base reward rate for bonded stake is a parameter indexed by epoch. This parameter can be thought of as a “Layer 1 Base Operating Rate”, or “L1BOR”, in that it serves as a reference rate for the entire chain. Its value is set on a per-epoch basis by a formula involving the ratio of bonded and unbonded stake, increasing when there is relatively less bonded stake and decreasing when there is relatively more. This formula should be decided and adjusted by governance.

Each validator declares a commission percentage , also indexed by epoch, which is subtracted from the base reward rate to get a validator-specific reward rate

The base exchange rate between PEN and PENb is given by the function which measures the cumulative depreciation of unbonded PEN relative to bonded PENb from genesis up to epoch . However, because PENb is not a single asset but a family of per-validator assets, this is only a base rate.

The actual exchange rate between PEN and PENb(v) bonded to validator accounts for commissions by substituting the validator-specific rate in place of the base rate to get

Delegating unbonded PEN to validator at epoch results in PENb(v). Undelegating PENb(v) from validator at epoch results in PEN. Thus, delegating at epoch and undelegating at epoch results in a return of i.e., the staking reward compounded only over the period during which the stake was bonded.

Discounting newly bonded stake by the cumulative depreciation of unbonded stake since genesis means that all bonded stake can be treated as if it had been bonded since genesis, which allows newly unbonded stake to always be inflated by the cumulative appreciation since genesis. This mechanism avoids the need to track the age of any particular delegation to compute its rewards, and makes all shares of each validator’s delegation pool fungible.

# Validator Rewards and Fees

Validators declare a commission percentage , which determines the spread between the base reward rate and the reward rate for their delegators , or equivalently .

Validator rewards are distributed in the first block of each epoch. In epoch , a validator whose delegation pool has size PENb receives a commission of size PEN, issued to the validator’s address.

To see why this is the reward amount, suppose validator has a delegation pool of size PENb. In epoch , the value of the pool is PEN. In epoch , the base reward rate causes the value of the pool to increase to Splitting as , this becomes

The value in the first term, , corresponds to the portion, and accrues to the delegators. Since , this is exactly , the new PEN-denominated value of the delegation pool.

The value in the second term, , corresponds to the portion, and accrues to the validator as commission. Validators can self-delegate the resulting PEN or use it to fund their operating expenses.

Transaction fees are denominated in PEN and are burned, so that the value of the fees accrues equally to all stake holders.

## TODO

• allow transaction fees in PENb with appropriate discounting, but only in transactions (e.g., undelegations, voting) that otherwise reveal the flavor of PENb.

# Voting Power

The size of each validator’s delegation pool (i.e., the amount of PENb of that validator’s flavor) is public information, and determines the validator’s voting power. However, these sizes cannot be used directly, because they are based on validator-specific conversion rates and are therefore incommensurate.

Voting power is calculated using the adjustment function , so that a validator whose delegation pool has PENb in epoch has voting power .

The validator-specific reward rate adjusts the base reward rate to account for the validator’s commission. Since and the adjustment function accounts for the compounded effect of the validator’s commission on the size of the delegation pool.

# Delegation

The delegation process bonds stake to a validator, converting unbonded PEN to bonded PENb. Delegations may be performed in any block, but only take effect in the next epoch.

Delegations are accomplished by creating a transaction with a DelegateDescription. The delegate description specifies a validator , consumes PEN from the transaction’s balance, produces a new shielded note with PENb associated to that validator, and includes an encryption of the delegation amount to the validators’ shared decryption key . Here is the index of the next epoch, when the delegation will take effect.

In the last block of epoch , the validators sum the encrypted bonding amounts from all delegate descriptions for validator in the entire epoch to obtain an encryption of the total bonding amount and decrypt to obtain the total bonding amount without revealing any individual transaction’s bonding amount . These total amounts are used to update the size of each validator’s delegation pool for the next epoch.

Revealing only the total amount of newly bonded stake in each epoch helps avoid linkability. For instance, if the size of each delegation were revealed directly, a delegation of size followed by an undelegation of size could be correlated if an observer notices that there are some epochs so that

This risk is still present when only the total amount -- the minimum disclosure required for consensus -- is revealed, because there may be few (or no) other delegations to the same validator in the same epoch. Some care should be taken in client implementations and user behavior to mitigate the effects of this information disclosure, e.g., by splitting delegations (and undelegations and votes) into multiple transactions in different epochs involving randomized sub-portions of the stake. However, the best mitigation would simply be to have many users.

## TODO

• specify the exact data in a delegate description

# Undelegation

The undelegation process unbonds stake from a validator, converting bonded PENb to unbonded PEN. Undelegations may be performed in any block, but only settle after the undelegation has exited the unbonding queue.

The unbonding queue is a FIFO queue allowing only a limited amount of stake to be unbonded in each epoch, according to an unbonding rate selected by governance. Undelegations are inserted into the unbonding queue in FIFO order. Unlike delegations, where only the total amount of newly bonded stake is revealed, undelegations reveal the precise amount of newly unbonded stake, allowing the unbonding queue to function.

Undelegations are accomplished by creating a transaction with a UndelegateDescription. Undelegation descriptions have different behaviour depending on whether or not the validator was slashed.

In the unslashed case, the undelegate description spends a note with value PENb, reveals , and produces PEN for the transaction’s balance, where is the index of the current epoch. The nullifiers revealed by undelegate descriptions are not immediately included in the nullifier set, and new notes created by a transaction containing an undelegate description are not immediately included in the note commitment tree. Instead, the transaction is placed into the unbonding queue to be applied later. In the first block of each epoch, transactions are applied if the corresponding validator remains unslashed, until the unbonding limit is reached.

If a validator is slashed, any undelegate transactions currently in the unbonding queue are discarded. Because the nullifiers for the notes those transactions spent were not included in the nullifier set, the notes remain spendable, allowing a user to create a new undelegation description.

Undelegations from a slashed validator are settled immediately. The undelegate description spends a note with value PENb and produces PEN, where is the slashing penalty and is the epoch at which the validator was slashed. The remaining value, , is burned.

Because pending undelegations from a slashed validator are discarded without applying their nullifiers, those notes can be spent again in a post-slashing undelegation description. This causes linkability between the discarded undelegations and the post-slashing undelegations, but this is not a concern because slashing is a rare and unplanned event which already imposes worse losses on delegators.

# Example Staking Dynamics

To illustrate the dynamics of this system, consider a toy scenario with three delegators, Alice, Bob, and Charlie, and two validators, Victoria and William. Tendermint consensus requires at least four validators and no one party controlling more than of the stake, but this example uses only a few parties just to illustrate the dynamics.

For simplicity, the the base reward rates and commission rates are fixed over all epochs at and , . The PEN and PENb holdings of participant are denoted by , , etc., respectively.

Alice starts with PENb bonded to Victoria, Bob starts with PENb bonded to William, and Charlie starts with unbonded PEN.

• At genesis, Alice, Bob, and Charlie respectively have fractions , , and of the total stake, and fractions , , of the total voting power.

• At epoch , Alice, Bob, and Charlie’s holdings remain unchanged, but their unrealized notional values have changed.

• Victoria charges zero commission, so . Alice’s PENb(v) is now worth PEN.
• William charges commission, so . Bob’s PENb(w) is now worth , and William receives PEN.
• William can use the commission to cover expenses, or self-delegate. In this example, we assume that validators self-delegate their entire commission, to illustrate the staking dynamics.
• William self-delegates PEN, to get PENb in the next epoch, epoch .
• At epoch :

• Alice’s PENb(v) is now worth PEN.
• Bob’s PENb(w) is now worth PEN.
• William’s self-delegation of accumulated commission has resulted in PENb(w).
• Victoria’s delegation pool remains at size PENb(v). William’s delegation pool has increased to PENb(w). However, their respective adjustment factors are now and , so the voting powers of their delegation pools are respectively and .
• The slight loss of voting power for William’s delegation pool occurs because William self-delegates rewards with a one epoch delay, thus missing one epoch of compounding.
• Charlie’s unbonded PEN remains unchanged, but its value relative to Alice and Bob’s stake has declined.
• William’s commission transfers stake from Bob, whose voting power has slightly declined relative to Alice’s.
• The distribution of stake between Alice, Bob, Charlie, and William is now , , , respectively. The distribution of voting power is , , , respectively.
• Charlie decides to bond his stake, split evenly between Victoria and William, to get PENb(v) and PENb(w).
• At epoch :

• Charlie now has PENb(v) and PENb(w), worth PEN.
• For the same amount of unbonded stake, Charlie gets more PENb(w) than PENb(v), because the exchange rate prices in the cumulative effect of commission since genesis, but Charlie isn’t charged for commission during the time he didn’t delegate to William.
• William’s commission for this epoch is now PEN, up from PEN in the previous epoch.
• The distribution of stake between Alice, Bob, Charlie, and William is now , , , respectively. Because all stake is now bonded, except William’s commission for this epoch, which is insignificant, the distribution of voting power is identical to the distribution of stake.
• At epoch :

• Alice’s PENb(v) is now worth PEN.
• Bob’s PENb(w) is now worth PEN.
• Charlies’s PENb(v) is now worth PEN, and his PENb(w) is now worth PEN.
• William’s self-delegation of accumulated commission has resulted in PENb(w), worth PEN.
• The distribution of stake and voting power between Alice, Bob, Charlie, and William is now , , , respectively.

This scenario was generated with a model in this Google Sheet.

# Transfers into Penumbra

IBC transfer mechanics are specified in ICS20. The FungibleTokenPacketData packet describes the transfer:

FungibleTokenPacketData {
denomination: string,
amount: uint256,
sender: string,
}


The sender and receiver fields are used to specify the sending account on the source chain and the receiving account on the destination chain. However, for inbound transfers, the destination chain is Penumbra, which has no accounts. Instead, token transfers into Penumbra create an OutputDescription describing a new shielded note with the given amount and denomination, and insert an encoding of the description itself into the receiver field.

# Governance

Like the Cosmos Hub, Penumbra supports on-chain governance with delegated voting. Unlike the Cosmos Hub, Penumbra’s governance mechanism supports secret ballots. Penumbra users can anonymously propose votes by escrowing a deposit of bonded stake. Stakeholders vote by proving ownership of their bonded stake prior to the beginning of the voting period and encrypting their votes to a threshold key controlled by the validator set. Validators sum encrypted votes and decrypt only the per-epoch totals.

## TODO

• emergency vs normal proposals, where emergency proposals are approved as soon as 2/3 validators vote yes
• is “liquid democracy”-style voting possible? currently the only supported delegation is to validators, but it could be useful to support delegation to third parties. one way to do this would be to allow the proxy voter to create votes on behalf of their voting delegators, but this would require that the proxy spend and create their notes, which is a huge power. this could be minimized by creating a special “voting key” capability, but nothing ensures that the proxy voter actually creates valid notes when spending them, so they could still destroy their voting delegators’ power. another

# Proposal Creation

Penumbra users can propose votes by escrowing a minimum amount of any flavor of PENb. They do this by creating a transaction with a ProposalDescription, which consumes some amount of PENb from the transaction’s balance, and creates a new escrow note with the same amount. The escrow note is not included in the note commitment tree until voting completes. The voting period begins in the next epoch and lasts for a fixed number of epochs (e.g., 7).

## TODO

• work out details enabling joint escrow (proposal crowdfunding)
• need proposition selection (only one proposition can be voted on at a time)

# Private Voting

Votes are the same as on the Cosmos Hub: Yes, No, NoWithVeto, and Abstain. NoWithVeto is the same as No but also votes that the proposer should lose their deposit. The intended cultural norm is that No should be used to indicate disagreement with good-faith proposals and NoWithVeto should be used to deter spam proposals.

• this isn’t exactly the same as on the hub, where there’s a distinct veto mechanic; is that mechanic good to copy?

By default, stakeholders’ votes are delegated to the validator their stake is bonded to. Validators’ votes are public, signed by the validator’s key, and act as a default vote for their entire delegation pool.

Stakeholders vote by proving ownership of some amount of bonded stake (their voting power) prior to the beginning of the voting period.

To do this, they create a transaction with a VoteDescription. This description reveals the validator and one of the four votes above, consumes PENb(v) from the transaction’s balance, produces a new note with PENb, and includes , an encryption of the voting power to the validators’ decryption key.

Descriptions that spend notes contain a proof-of-inclusion for the note they spend. This establishes that a commitment to the spent note was previously included in the note commitment tree, an incremental Merkle tree whose current root (”anchor”) is included in each block. Normally, these proofs are created and validated with respect to a recent anchor. However, transactions containing VoteDescriptions are always validated with respect to the anchor in the last block before the voting period begins. This prevents double-voting, by ensuring that only notes created before voting began can be used to vote.

## TODO

• vote acceptance thresholds / veto behavior
• encrypt voting power for each vote choice so that the vote does not reveal it

At the end of each epoch, validators collect the encrypted votes from each delegation pool, aggregate the encrypted votes into encrypted tallies and decrypt the tallies. At the end of the voting period, these per-epoch, per-validator tallies are combined as follows. For each validator , the per-epoch tallies from that validator’s delegation pool are summed to obtain voting power subtotals for each vote option. Then, these subtotals are summed to determine the amount of the delegation pool that voted. The validator’s vote acts as a default vote for the rest of the delegation pool. Finally, these per-validator subtotals are multiplied by the voting power adjustment function to obtain the final vote totals.

If the vote was not vetoed, the escrowed note from the ProposalDescription is included in the note commitment tree, so that it can be spent by the proposer. Otherwise, it is not, and the funds are burned.

# ZSwap

Penumbra provides private, sealed-bid batch swaps using ZSwap. ZSwap allows users to privately swap between any pair of assets. Individual swaps do not reveal trade amounts. Instead, all swaps in each block are executed in a single batch. Only the total amount in each batch is revealed, and only after the batch has been finalized. This prevents front-running and provides better execution, but also provides long-term privacy for individual swaps. Users can also provide liquidity by anonymously creating Uniswap-v3-style concentrated liquidity positions. These positions reveal the amount of liquidity and the bounds in which it is concentrated, but are not otherwise linked to any identity, so that (with some care) users can privately approximate arbitrary trading functions without revealing their specific views about prices.

## Frequent batch auctions as a market design response

Budish, Cramton, and Shim (2015) analyze trading in traditional financial markets using the predominant continuous-time limit order book market design, and find that high-frequency trading arises as a response to mechanical arbitrage opportunities created by flawed market design:

These findings suggest that while there is an arms race in speed, the arms race does not actually affect the size of the arbitrage prize; rather, it just continually raises the bar for how fast one has to be to capture a piece of the prize... Overall, our analysis suggests that the mechanical arbitrage opportunities and resulting arms race should be thought of as a constant of the market design, rather than as an inefficiency that is competed away over time.

— Eric Budish, Peter Cramton, John Shim, The High-Frequency Trading Arms Race: Frequent Batch Auctions as a Market Design Response

Because these mechanical arbitrage opportunities arise from the market design even in the presence of symmetrically observed public information, they do not improve prices or produce value, but create arbitrage rents that increase the cost of liquidity provision1. Instead, they suggest changing from a continuous-time model to a discrete-time model and performing frequent batch auctions, executing all orders that arrive in the same discrete time step in a single batch with a uniform price.

In the blockchain context, while projects like Uniswap have demonstrated the power and success of CFMMs for decentralized liquidity provision, they have also highlighted the mechanical arbitrage opportunities created by the mismatch between continuous-time market designs and the state update model of the underlying blockchain, which operates in discrete batches (blocks). Each prospective state update is broadcast to all participants to be queued in the mempool, but only committed as part of the next block, and while it is queued for inclusion in the consensus state, other participants can bid to manipulate its ordering relative to other state updates (for instance, front-running a trade).

This manipulation is enabled by two features:

• Although trades are always committed in a batch (a block), they are performed individually, making them dependent on miners’ choices of the ordering of trades within a block;

• Because trades disclose the trade amount in advance of execution, all other participants have the information necessary to manipulate them.

ZSwap addresses the first problem by executing all swaps in each block in a single batch, first aggregating the amounts in each swap and then executing it against the CFMM as a single trade.

ZSwap addresses the second problem by having users encrypt their swap amounts using a homomorphic threshold decryption key controlled by the validators, who aggregate the encrypted amounts and decrypt only the batch trade. This prevents front-running prior to block inclusion, and provides privacy for individual trades (up to the size of the batch) afterwards.

Users do not experience additional trading latency from the batch auction design, because the batch auctions occur in every block, which is already the minimum latency for finalized state updates. A longer batch latency could help privacy for market-takers by increasing the number of swaps in each batch, but would impair other trading and impose a worse user experience.

## Private, sealed-bid batch auctions

A key challenge in the design of any private swap mechanism is that zero-knowledge proofs only allow privacy for user-specific state, not for global state, because they don’t let you prove statements about things that you don’t know. While users can prove that their user-specific state was updated correctly without revealing it, they cannot do so for other users’ state.

Instead of solving this problem, ZSwap sidesteps the need for users to do so. At a high level, swaps work as follows: users privately burn funds of one kind in a coordinated way that allows the chain to compute a per-block clearing price, and mint or burn liquidity pool reserves. Later, users privately mint funds of the other kind, proving that they previously burned funds and that the minted amount is consistent with the computed price and the burned amount. No interaction or transfer of funds between users or the liquidity pool reserves is required. Users do not transact with each other. Instead, the chain permits them to transmute one asset type to another, provably updating their private state without interacting with any other users’ private state.

This mechanism is described in more detail in the Sealed-Bid Batch Auctions section.

## Concentrated Liquidity

ZSwap uses the concentrated liquidity mechanism introduced in Uniswap v3 to make liquidity provision more capital-efficient. Uniswap v3’s insight is that that existing constant-product market makers like Uniswap v2 allocate liquidity inefficiently, spreading it uniformly over the entire range of possible prices for a trading pair. Instead, allowing liquidity providers (LPs) to restrict their liquidity to a price range of their choosing provides a mechanism for market allocation of liquidity, concentrating it into the range of prices that the assets in the pair actually trade.

Liquidity providers create positions that tie a quantity of liquidity to a specific price range. Within that price range, the position acts as a constant-product market maker with larger “virtual” reserves. At each price, the pool aggregates liquidity from all positions that contain that price, and tracks which positions remain (or become) active as the price moves. By creating multiple positions, LPs can approximate arbitrary trading functions.

In Uniswap v3, all positions are public and tied to the identity of the LP who created them. In ZSwap, positions are also public, but they are opened and closed anonymously. This means that while the aggregate of all LPs’ trading functions is public, the trading function of each individual LP is not, so LPs can approximate their desired trading functions without revealing their specific views about prices, as long as they take care to avoid linking their positions (e.g., by timing or amount).

Handling of concentrated liquidity is described in more detail in the Concentrated Liquidity section.

## Liquidity Mining and Fees

ZSwap supports liquidity mining, described in more detail in the Liquidity Mining section.

Because ZSwap positions are created anonymously, fees and liquidity mining rewards cannot be paid out to the position’s owner, as in Uniswap v3. Instead, fees and rewards accrue to the position, and are claimed when the position is closed. This process is described in the Closing Positions section.

1

on HFT, N.B. their footnote 5:

A point of clarification: our claim is not that markets are less liquid today than before the rise of electronic trading and HFT; the empirical record is clear that trading costs are lower today than in the pre-HFT era, though most of the benefits appear to have been realized in the late 1990s and early 2000s... Rather, our claim is that markets are less liquid today than they would be under an alternative market design that eliminated sniping.

# Sealed-Bid Batch Auctions

ZSwap’s sealed-bid batch auctions conceptually decompose into two parts: the market-making function itself, whose design is inspired by Uniswap v3, and the batching procedure, which allows multiple users’ swaps to be batched into a single trade executed against the trading function. This section focuses on the batching procedure, leaving the mechanics of the trading function for later.

A key challenge in the design of any private swap mechanism is that zero-knowledge proofs only allow privacy for user-specific state, not for global state, because [they don’t let you prove statements about things that you don’t know][barrywhitehat-private-uniswap]. While users can prove that their user-specific state was updated correctly without revealing it, they cannot do so for other users’ state.

Instead of solving this problem, ZSwap sidesteps the need for users to do so. Rather than have users transact with each other, the chain permits them to transmute one asset type to another, provably updating their private state without interacting with any other users’ private state. This works as follows.

1. Users create transactions with SwapDescriptions that privately burn their input assets and encrypt the amounts to the validators. This description identifies the trading pair , consumes of types from the transaction balance, and contains an encryption of the trade inputs and a swap commitment (analogous to a note commitment) that binds to and a nullifier. Rational traders will choose either or , i.e., trade one asset type for the other type, but the description provides two inputs so that different swap directions cannot be distinguished.

2. Validators sum the encrypted amounts of all swaps in the batch to obtain an encryption of the combined inputs , then decrypt to obtain the batch input without revealing any individual transaction’s input . Then they execute against the trading pool, updating the pool state and obtaining the effective (inclusive of fees) clearing prices and .

3. In a future block, users who created SwapDescriptions obtain assets of the new types by creating a SweepDescription. This description reveals the block height containing the swap description, the trading pair , and the nullifier of a swap commitment. Then, it proves inclusion of a swap commitment with nullifier and trade inputs in the note commitment tree for that block, and proves that and . Finally, it adds of types to the transaction’s balance (e.g., to be consumed by an OutputDescription that creates a new note).

Although sweep descriptions do not reveal the amounts, or which swap’s outputs they claim, they do reveal the block and trading pair, so their anonymity set is considerably smaller than an ordinary shielded value transfer. For this reason, client software should create and submit a transaction with a sweep description immediately after observing that its transaction with a swap description was included in a block, rather than waiting for some future use of the new assets. This ensures that future shielded transactions involving the new assets are not trivially linkable to the swap.

This design reveals only the net flow across a trading pair in each batch, not the amounts of any individual swap. However, this provides no protection if the batch contains a single swap, and limited protection when there are only a few other swaps. This is likely to be an especially severe problem until the protocol has a significant base of active users, so it is worth examining the impact of amount disclosure and potential mitigations.

Assuming that all amounts are disclosed, an attacker could attempt to deanonymize parts of the transaction graph by tracing amounts, using strategies similar to those in An Empirical Analysis of Anonymity in Zcash. That research attempted to deanonymize transactions by analysing movement between Zcash’s transparent and shielded pools, with some notable successes (e.g., identifying transactions associated to the sale of stolen NSA documents). Unlike Zcash, where opt-in privacy means the bulk of the transaction graph is exposed, Penumbra does not have a transparent pool, and the bulk of the transaction graph is hidden, but there are several potential places to try to correlate amounts:

• IBC transfers into Penumbra are analogous to t2z transactions and disclose types and amounts and accounts on the source chain;
• IBC transfers out of Penumbra are analogous to z2t transactions and disclose types and amounts and accounts on the destination chain;
• Each unbonding discloses the precise amount of newly unbonded stake and the validator;
• Each epoch discloses the net amount of newly bonded stake for each validator;
• Liquidity pool deposits disclose the precise type and amount of newly deposited reserves;
• Liquidity pool deposits disclose the precise type and amount of newly withdrawn reserves;

The existence of the swap mechanism potentially makes correlation by amount more difficult, by expanding the search space from all amounts of one type to all combinations of all amounts of any tradeable type and all historic clearing prices. However, assuming rational trades may cut this search space considerably.1

## TODO

• consider the effect of multi-asset support on correlations more deeply.
1

Thanks to Guillermo Angeris for this observation.

# Cryptography

This section describes the cryptographic design of Penumbra and the primitives it uses.

# Primitives

Penumbra uses the following cryptographic primitives, described in the following sections:

• The Proof System section describes the choice of proving curve (BLS12-377) and proof system (Groth16, and potentially PLONK in the future);

• The decaf377 section describes decaf377, a parameterization of the Decaf construction defined over the BLS12-377 scalar field, providing a prime-order group that can be used inside or outside of a circuit;

• The Poseidon for BLS12-377 section describes parameter selection for an instantiation of Poseidon, a SNARK-friendly sponge construction, over the BLS12-377 scalar field;

• The zk555 section describes zk555, an instantiation of the STROBE protocol framework for use inside (and outside) of circuits.

# Proof System

Penumbra needs SNARK proofs. Because the choice of proving system and proving curve can’t really be cleanly separated from the rest of the system choices (e.g., the native field of the proving system informs what embedded curve is available, and how circuit programming is done), large parts of the rest of the system design block on making a choice of proving system.

## Goals

1. Near-term implementation availability. Creating fast and high-quality implementations of elliptic curves and proof systems is fun, but I don’t want the project to block on it.
2. High performance for fixed functionality. Penumbra currently intends to support fixed functionality; programmability is a good future goal but isn’t a near-term objective. The fixed functionality should have as high performance as possible.
3. Longer-term flexibility. The choice should ideally not preclude many future choices for later functionality. More precisely, it should not impose high switching costs on future choices.
4. Recursion capability. Penumbra doesn’t currently make use of recursion, but there are a lot of cool things it could enable in the future, and it would be able to do those in the future.

Setup ceremonies are beneficial to avoid for operational reasons, but not for security reasons. A decentralized setup procedure is sufficient for security.

## Options

Proof systems:

• Groth16:
• Pros: high performance, very small proofs, mature system
• Cons: requires a setup for each proof statement
• PLONK:
• Pros: universal setup, still fairly compact proofs, seems to be a point of convergence with useful extensions (plookup, SHPLONK, etc)
• Cons: bigger proofs, worse constants than Groth16
• Halo 2
• Pros: no setup, arbitrary depth recursion
• Cons: bigger proof sizes, only useful with the Pallas/Vesta curves which don’t support pairings

Curve choices:

• BLS12-381:

• Pros: very mature, used by Sapling already
• Cons: no easy recursion
• BLS12-377:

• Pros: constructed as part of Zexe to support depth 1 recursion using a bigger parent curve, deployed in Celo, to be deployed in Zexe
• Cons: ?
• Pallas/Vesta:

• Pros: none other than support for Halo 2’s arbitrary recursion
• Cons: no pairings mean they cannot be used for any pairing-based SNARK

## Thoughts

Although the choice of proof system (Groth16, Plonk, Halo, Pickles, ...) is not completely separable from the choice of proving curve (e.g., pairing-based SNARKs require pairing-friendly curves), to the extent that it is, the choice of the proof system is relatively less important than the choice of proving curve, because it is easier to encapsulate.

The choice of proving curve determines the scalar field of the arithmetic circuit, which determines which curves are efficient to implement in the circuit, which determines which cryptographic constructions can be performed in the circuit, which determines what kind of key material the system uses, which propagates all the way upwards to user-visible details like the address format. While swapping out a proof system using the same proving curve can be encapsulated within an update to a client library, swapping out the proving curve is extremely disruptive and essentially requires all users to generate new addresses and migrate funds.

This means that, in terms of proof system flexibility, the Pallas/Vesta curves are relatively disadvantaged compared to pairing-friendly curves like BLS12-381 or BLS12-377, because they cannot be used with any pairing-based SNARK, or any other pairing-based construction. Realistically, choosing them is committing to using Halo 2.

Choosing BLS12-377 instead of BLS12-381 opens the possibility to do depth-1 recursion later, without meaningfully restricting the near-term proving choices. For this reason, BLS12-377 seems like the best choice of proving curve.

Penumbra’s approach is to first create a useful set of fixed functionality, and generalize to custom, programmable functionality only later. Compared to Sapling, there is more functionality (not just Spend and Output but Delegate, Undelegate, Vote, ...), meaning that there are more proof statements. Using Groth16 means that each of these statements needs to have its own proving and verification key, generated through a decentralized setup.

So the advantage of a universal setup (as in PLONK) over per-statement setup (as in Groth16) would be:

1. The setup can be used for additional fixed functionality later;
2. Client software does not need to maintain distinct proving/verification keys for each statement.

(2) is a definite downside, but the impact is a little unclear. As a point of reference, the Sapling spend and output parameters are 48MB and 3.5MB respectively. The size of the spend circuit could be improved using a snark-friendly hash function.

With regard to (1), if functionality were being developed in many independent pieces, doing many setups would impose a large operational cost. But doing a decentralized setup for a dozen proof statements simultaneously does not seem substantially worse than doing a decentralized setup for a single proof statement. So the operational concern is related to the frequency of groups of new statements, not the number of statements in a group. Adding a later group of functionality is easy if the first group used a universal setup. But if it didn’t, the choice of per-statement setup initially doesn’t prevent the use of a universal setup later, as long as the new proof system can be implemented using the same curve.

Because Penumbra plans to have an initial set of fixed functionality, and performance is a concern, Groth16 seems like a good choice, and leaves the door open for a future universal SNARK. Using BLS12-377 opens the door to future recursion, albeit only of depth 1.

# The decaf377 group

Penumbra, like many other zero-knowledge protocols, requires a cryptographic group that can be used inside of an arithmetic circuit. This is accomplished by defining an “embedded” elliptic curve whose base field is the scalar field of the proving curve used by the proof system.

The Zexe paper, which defined BLS12-377, also defined (but did not name) a cofactor-4 Edwards curve defined over the BLS12-377 scalar field for exactly this purpose. However, non-prime-order groups are a leaky abstraction, forcing all downstream constructions to pay attention to correct handling of the cofactor. Although it is usually possible to do so safely, it requires additional care, and the optimal technique for handling the cofactor is different inside and outside of a circuit.

Instead, applying the Decaf construction to this curve gives decaf377, a clean abstraction that provides a prime-order group complete with hash-to-group functionality and whose encoding and decoding functions integrate validation. Although it imposes a modest additional cost in the circuit context, as discussed in Costs and Alternatives, the construction works the same way inside and outside of a circuit and imposes no costs for lightweight, software-only applications, making it a good choice for general-purpose applications.

## Implementation

A work-in-progress implementation of decaf377 can be found here.

# Costs and Alternatives

Arithmetic circuits have a different cost model than software. In the software cost model, software executes machine instructions, but in the circuit cost model, relations are certified by constraints. Unfortunately, while Decaf is a clearly superior choice in the software context, in the circuit context it imposes some additional costs, which must be weighed against its benefits.

At a high level, Decaf implements a prime-order group using a non-prime-order curve by constructing a group quotient. Internally, group elements are represented by curve points, with a custom equality check so that equivalent representatives are considered equal, an encoding function that encodes equivalent representatives as identical bitstrings, and a decoding function that only accepts canonical encodings of valid representatives.

To do this, the construction defines a canonical encoding on a Jacobi quartic curve mod its 2-torsion (a subgroup of size 4) by making two independent sign choices. Then, it uses an isogeny to transport this encoding from the Jacobi quartic to a target curve that will be used to actually implement the group operations. This target curve can be an Edwards curve or a Montgomery curve. The isogenies are only used for deriving the construction. In implementations, all of these steps are collapsed into a single set of formulas that perform encoding and decoding on the target curve.

In other words, one way to think about the Decaf construction is as some machinery that transforms two sign choices into selection of a canonical representative. Ristretto adds extra machinery to handle cofactor 8 by making an additional sign choice.

In the software cost model, where software executes machine instructions, this construction is essentially free, because the cost of both the Decaf and conventional Edwards encodings are dominated by the cost of computing an inverse or an inverse square root, and the cost of the sign checks is insignificant.

However, in the circuit cost model, where relations are certified by various constraints, this is no longer the case. On the one hand, certifying a square root or an inverse just requires checking that or that , which is much cheaper than actually computing or . On the other hand, performing a sign check involves bit-constraining a field element, requiring hundreds of constraints.

## Sign checks

The definition of which finite field elements are considered nonnegative is essentially arbitrary. The Decaf paper suggests three possibilities:

• using the least significant bit, defining to be nonnegative if the least absolute residue for is even;

• using the most significant bit, defining to be nonnegative if the least absolute residue for is in the range ;

• for fields where , using the Legendre symbol, which distinguishes between square and nonsquare elements.

Using the Legendre symbol is very appealing in the circuit context, since it has an algebraic definition and, at least in the case of square elements, very efficient certification. For instance, if square elements are chosen to be nonnegative, then certifying that is nonnegative requires only one constraint, . However, the reason for the restriction to fields is that and should have different signs, which can only be the case if is nonsquare. Unfortunately, many SNARK-friendly curves, including BLS12-377, are specifically chosen so that for as large a power as possible (e.g., in the case of BLS12-377).

This leaves us with either the LSB or MSB choices. The least significant bit is potentially simpler for implementations, since it is actually the low bit of the encoding of , while the most significant bit isn’t, because it measures from , not a bit position , so it seems to require a comparison or range check to evaluate. However, these choices are basically equivalent, in the following sense:

###### Lemma.1

The most significant bit of is if and only if the least significant bit of is .

###### Proof.

The MSB of is if and only if , but this means that , which is even, is the least absolute residue, so the LSB of is also . On the other hand, the MSB of is if and only if , i.e., if , i.e., if . This means that the least absolute residue of is ; since is even and is odd, this is odd and has LSB .

This means that transforming an LSB check to an MSB check or vice versa requires multiplication by or , which costs at most one constraint.

Checking the MSB requires checking whether a value is in the range . Using Daira Hopwood’s optimized range constraints, the range check costs 2. However, the input to the range check is a bit-constrained unpacking of a field element, not a field element itself. This unpacking costs .

Checking the LSB is no less expensive, because although the check only examines one bit, the circuit must certify that the bit-encoding is canonical. This requires checking that the value is in the range , which also costs , and as before, the unpacking costs .

In other words, checking the sign of a field element costs , or in the case where the field element is already bit-encoded for other reasons. These checks are the dominant cost for encoding and decoding, which both require two sign checks. Decoding from bits costs c. , decoding from a field element costs c. , and encoding costs c. regardless of whether the output is encoded as bits or as a field element.

## Alternative approaches to handling cofactors

Decaf constructs a prime-order group whose encoding and decoding methods perform validation. A more conventional alternative approach is to use the underlying elliptic curve directly, restrict to its prime-order subgroup, and do subgroup validation separately from encoding and decoding. If this validation is done correctly, it provides a prime-order group. However, because validation is an additional step, rather than an integrated part of the encoding and decoding methods, this approach is necessarily more brittle, because each implementation must be sure to do both steps.

In the software cost model, there is no reason to use subgroup validation, because it is both more expensive and more brittle than Decaf or Ristretto. However, in the circuit cost model, there are cheaper alternatives, previously analyzed by Daira Hopwood in the context of Ristretto for JubJub (1, 2).

###### Multiplication by the group order.

The first validation method is to do a scalar multiplication and check that . Because the prime order is fixed, this scalar multiplication can be performed more efficiently using a hardcoded sequence of additions and doublings.

###### Cofactor preimage.

The second validation method provides a preimage in affine coordinates . Because the image of is the prime-order subgroup, checking that satisfies the curve equation and that checks that is in the prime-order subgroup.

In the software context, computing and computing cost about the same, although both are an order of magnitude more expensive than decoding. But in the circuit context, the prover can compute outside of the circuit and use only a few constraints to check the curve equation and two doublings. These costs round to zero compared to sign checks, so the validation is almost free.

The standard “compressed Edwards y” format encodes a point by the -coordinate and a sign bit indicating whether is nonnegative. In software, the cost of encoding and decoding are about the same, and dominated by taking an inverse square root. In circuits, the costs of encoding and decoding are also about the same, but they are instead dominated by a sign check that matches the sign of the recovered -coordinate with the supplied sign bit. This costs c. as above.

## Comparison and discussion

This table considers only approximate costs.

Operationdecaf377Compressed Ed + Preimage
Decode (from bits)400C400C
Decode (from )750C325C
Encode (to bits)750C750C
Encode (to )750C325C

When decoding from or encoding to field elements, the marginal cost of Decaf compared to the compressed Edwards + cofactor preimage is an extra bit-unpacking and range check. While this effectively doubles the number of constraints, the marginal cost of c. is still small relative to other operations like a scalar multiplication, which at 6 constraints per bit is approximately .

When decoding from or encoding to bits, the marginal cost of Decaf disappears. When the input is already bit-constrained, Decaf’s first sign check can reuse the bit constraints, saving c. , but the compressed Edwards encoding must range-check the bits (which Decaf already does), costing c. extra. Similarly, in encoding, Decaf’s second sign check produces bit-constrained variables for free, while the compressed Edwards encoding spends c. bit-constraining and range-checking them.

However, in the software context, the prime-order validation check costs approximately 10x more than the cost of either encoding. Many applications require use of the embedded group both inside and outside of the circuit, and uses outside of the circuit may have additional resource constraints (for instance, a hardware token creating a signature authorizing delegated proving, etc.).

Performing validation as an additional, optional step also poses additional risks. While a specification may require it to be performed, implementations that skip the check will appear to work fine, and the history of invalid-point attacks (where implementations should, but don’t, check that point coordinates satisfy the curve equation) suggests that structuring validation as an integral part of encoding and decoding is a safer design. This may not be a concern for a specific application with a single, high-quality implementation that doesn’t make mistakes, but it’s less desirable for a general-purpose construction.

In summary, Decaf provides a construction that works the same way inside and outside of a circuit and integrates validation with the encoding, imposing only a modest cost for use in circuits and no costs for lightweight, software-only applications, making it a good choice for general-purpose constructions.

1

I have no idea whether this is common knowledge; I learned of this fact from its use in Mike Hamburg’s Ed448-Goldilocks implementation.

2

The value 73 is computed as:

from itertools import groupby

def cost(k):
return min(k-1, 2)

def constraints(bound):
costs = [cost(len(list(g))+1) for (c, g) in groupby(bound.bits()[:-1]) if c == 1]
return sum(costs)

constraints(ZZ((q-1)/2))


as here.

# Inverse Square Roots

As in the [internet-draft], the decaf377 functions are defined in terms of the following function, which computes the square root of a ratio of field elements, with the special behavior that if the input is nonsquare, it returns the square root of a related field element, to allow reuse of the computation in the hash-to-group setting.

• TODO: actually specify this procedure.

Fix as 2841681278031794617739547238867782961338435681360110683443920362658525667816. Then is a nonsquare -th root of unity.

• TODO: possibly change this value to be convenient for implementations, rather than something that’s convenient for SAGE;

Define sqrt_ratio_zeta(u,v) as:

• (True, ) if and are nonzero, and is square;
• (True, ) if is zero;
• (False, ) if is zero;
• (False, ) if and are nonzero, and is nonsquare.

Since is nonsquare, if is nonsquare, is square. Note that unlike the similar function in the ristretto255/decaf448 [internet-draft], this function does not make any claims about the sign of its output.

• TODO: describe efficient implementation using 2020/1407

# Decoding

Decoding to works as follows:

1. Decode s_bytes to a field element , rejecting if the encoding is non-canonical. (If the input is already a field element in the circuit case, skip this step).

2. Check that is nonnegative, or reject (sign check 1).

3. .

4. .

5. (was_square, v) = sqrt_ratio_zeta(1, u_2 * u_1^2), rejecting if was_square is false.

6. if is negative (sign check 2).

7. .

The resulting coordinates are the affine Edwards coordinates of an internal representative of the group element.

• simplify formulas using numerator instead of 1

# Encoding

Given a representative in extended coordinates , encoding works as follows.

1. .

2. (_ignored, v) = sqrt_ratio_zeta(1, u_1 * (-1 - d) * x^2).

3. (sign check 1).

4. .

5. .

6. Set s_bytes to be the canonical little-endian encoding of .

# Group Hash

• specify, need to match the choice of quadratic nonresidue with used in invsqrt.

# Test Vectors

## Small generator multiples

This table has hex-encodings of small multiples of the generator :

ElementHex encoding
0000000000000000000000000000000000000000000000000000000000000000
0800000000000000000000000000000000000000000000000000000000000000
b2ecf9b9082d6306538be73b0d6ee741141f3222152da78685d6596efc8c1506
2ebd42dd3a2307083c834e79fb9e787e352dd33e0d719f86ae4adb02fe382409
6acd327d70f9588fac373d165f4d9d5300510274dffdfdf2bf0955acd78da50d
460f913e516441c286d95dd30b0a2d2bf14264f325528b06455d7cb93ba13a0b
ec8798bcbb3bf29329549d769f89cf7993e15e2c68ec7aa2a956edf5ec62ae07
48b01e513dd37d94c3b48940dc133b92ccba7f546e99d3fc2e602d284f609f00
a4e85dddd19c80ecf5ef10b9d27b6626ac1a4f90bd10d263c717ecce4da6570a
1a8fea8cbfbc91236d8c7924e3e7e617f9dd544b710ee83827737fe8dc63ae00
0a0f86eaac0c1af30eb138467c49381edb2808904c81a4b81d2b02a2d7816006
588125a8f4e2bab8d16affc4ca60c5f64b50d38d2bb053148021631f72e99b06
f43f4cefbe7326eaab1584722b1b4860de554b23a14490a03f3fd63a089add0b
76c739a33ffd15cf6554a8e705dc573f26490b64de0c5bd4e4ac75ed5af8e60b
200136952d18d3f6c70347032ba3fef4f60c240d706be2950b4f42f1a7087705
bcb0f922df1c7aa9579394020187a2e19e2d8073452c6ab9b0c4b052aa50f505

# Fuzzy Message Detection

By design, privacy-preserving blockchains like Penumbra don’t reveal metadata about the sender or receiver of a transaction. However, this means that users must scan the entire chain to determine which transactions relate to their addresses. This imposes large bandwidth and latency costs on users who do not maintain online replicas of the chain state, as they must “catch up” each time they come online by scanning all transactions that have occurred since their last activity.

Alternatively, users could delegate scanning to a third party, who would monitor updates to the chain state on their behalf and forward only a subset of those updates to the user. This is possible using viewing keys, as in Zcash, but viewing keys represent the capability to view all activity related to a particular address, so they can only be delegated to trusted third parties.

Instead, it would be useful to be able to delegate only a probabilistic detection capability. Analogous to a Bloom filter, this would allow a detector to identify all transactions related to a particular address (no false negatives), while also identifying unrelated transactions with some false positive probability. Unlike viewing capability, detection capability would not include the ability to view the details of a transaction, only a probabilistic association with a particular address. This is the problem of fuzzy message detection (FMD), analyzed by Beck, Len, Miers, and Green in their paper Fuzzy Message Detection, which proposes a cryptographic definition of fuzzy message detection and three potential constructions.

This section explores how Penumbra could make use of fuzzy message detection:

• In Sender and Receiver FMD, we propose a generalization of the original definition where the false positive probability is set by the sender instead of the receiver, and discusses why this is useful.

• In Constructing S-FMD, we realize the new definition using a variant of one of the original FMD constructions, and extend it in two ways:

1. to support diversified detection, allowing multiple, publicly unlinkable addresses to be scanned by a single detection key;
2. to support arbitrarily precise detection with compact, constant-size keys.

The resulting construction can then be integrated into the Sapling key heirarchy.

• In Parameter Considerations, we discuss how the false positive rates should be chosen.

## Acknowledgements

Thanks to George Tankersley and Sarah Jamie Lewis for discussions on this topic (and each independently suggesting the modifications to realize S-FMD), to Gabrielle Beck for discussions about the paper and ideas about statistical attacks, and to Guillermo Angeris for pointers on analyzing information disclosure.

The goal of detection capability is to be able to filter the global stream of state updates into a local stream of state updates that includes all updates related to a particular address, without identifying precisely which updates those are. The rate of updates on this filtered stream should be bounded below, to ensure that there is a large enough anonymity set, and bounded above, so that users processing the stream have a constant and manageable amount of work to process it and catch up with the current chain state. This means that the detection precision must be adaptive to the global message rates: if the false positive rate is too low, the filtered stream will have too few messages, and if it is too high, it will have too many messages.

However, this isn’t possible using the paper’s original definition of fuzzy message detection, because the false positive probability is chosen by the receiver, who has no way to know in advance what probability will produce a correctly sized stream of messages. One way to address this problem is to rename the original FMD definition as Receiver FMD (R-FMD), and tweak it to obtain Sender FMD (S-FMD), in which the sender chooses the detection probability.

The paper’s original definition of fuzzy message detection is as a tuple of algorithms (KeyGen, Flag, Extract, Test). The receiver uses KeyGen to generate a root key and a flag key1. A sender uses the receiver’s flag key as input to Flag to produce a flag ciphertext. The Extract algorithm takes the root key and a false positive rate (chosen from some set of supported rates), and produces a detection key. This detection key can be applied to a flag ciphertext using Test to produce a detection result.

This scheme should satisfy certain properties, formalizations of which can be found in the paper:

###### Correctness.

Valid matches must always be detected by Test; i.e., there are no false negatives.

###### Fuzziness.

Invalid matches should produce false positives with probability approximately , as long as the flag ciphertexts and detection keys were honestly generated.

###### Detection Ambiguity.

An adversarial detector must be unable to distinguish between a true positive and a false positive, as long as the flag ciphertexts and detection keys were honestly generated.

In this original definition, the receiver has control over how much detection precision they delegate to a third party, because they choose the false positive probability when they extract a detection key from their root key. This fits with the idea of attenuating credentials, and intuitively, it seems correct that the receiver should control how much information they reveal to their detector. But the information revealed to their detector is determined by both the false positive probability and the amount of other messages that can function as cover traffic. Without knowing the extent of other activity on the system, the receiver has no way to make a principled choice of the detection precision to delegate.

## Sender FMD

To address this problem, we generalize the original definition (now Receiver FMD) to Sender FMD, in which the false positive probability is chosen by the sender.

Like R-FMD, S-FMD consists of a tuple of algorithms (KeyGen, Flag, Extract, Test). Like R-FMD, KeyGen generates a root key and a flag key, and Test takes a detection key and a flag ciphertext and produces a detection result. Unlike R-FMD, Extract produces a detection key directly from the root key, and Flag takes the false positive rate and the receiver’s flag key to produce a flag ciphertext. As discussed below, S-FMD can be realized using tweaks to either of the R-FMD constructions in the original paper.

In R-FMD, flag ciphertexts are universal with respect to the false positive rate, which is applied to the detection key; in S-FMD, the false positive rate is applied to the flag ciphertext and the detection key is universal. (This means there is no meaningful difference in capability between the root key and the detection key, so the distinction is maintained just for ease of comparison between the two variants).

Unlike R-FMD, S-FMD allows detection precision to be adaptive, by having senders use a (consensus-determined) false positive parameter. This parameter should vary as the global message rates vary, so that filtered message streams have a bounded rate, and it should be the same for all users, so that messages cannot be distinguished by their false positive rate.

1

The paper calls these the secret and public keys respectively, but we avoid this in favor of capability-based terminology that names keys according to the precise capability they allow.

# Constructing S-FMD

The original FMD paper provides three constructions of R-FMD. The first two realize functionality for restricted false-positive probabilities of the form ; the third supports arbitrary fractional probabilities using a much more complex and expensive construction.

The first two schemes, R-FMD1 and R-FMD2, are constructed similarly to a Bloom filter: the Flag procedure encrypts a number of 1 bits, and the Test procedure uses information in a detection key to check whether some subset of them are 1, returning true if so and false otherwise. The false positive probability is controlled by extracting only a subset of the information in the root key into the detection key, so that it can only check a subset of the bits encoded in the flag ciphertext. We focus on R-FMD2, which provides more compact ciphertexts and chosen-ciphertext security.

In this section, we:

• recall the construction of R-FMD2, changing notation from the paper;
• extend S-FMD2 to support diversified detection, allowing multiple, unlinkable flag keys to be detected by a single detection key;
• extend S-FMD2 to use deterministic derivation, allowing 32-byte flag and detection keys to support arbitrary-precision false positive probabilities;
• summarize the resulting construction and describe how it can be integrated with the Sapling key heirarchy.

## The R-FMD2 construction

First, we recall the paper’s construction of R-FMD2, changing from multiplicative to additive notation and adjusting variable names to be consistent with future extensions.

The construction supports restricted false positive values for , a global parameter determining the minimum false positive rate. is a group of prime order and and are hash functions.

###### R-FMD2/KeyGen

Choose a generator . For , choose and compute . Return the root key and the flag key .

###### R-FMD2/Extract

On input and root key , parse and return as the detection key.

###### R-FMD2/Flag

On input , first parse , then proceed as follows:

1. Choose and compute .
2. Choose and compute .
3. For each , compute
1. a key bit ;
2. a ciphertext bit .
4. Compute .
5. Compute .

Return the flag ciphertext .

###### R-FMD2/Test

On input , , first parse , , then proceed as follows:

1. Compute .
2. Recompute as .
3. For each , compute
1. a key bit ;
2. a plaintext bit .

If all plaintext bits , return (match); otherwise, return .

To see that the Test procedure detects true positives, first note that since and , so the detector will recompute the used by the sender; then, since , the detector’s is the same as the sender’s .

On the other hand, if the ciphertext was created with a different flag key, the resulting will be random bits, giving the desired false positive probability.

One way to understand the components of the construction is as follows: the message consists of bits of a Bloom filter, each encrypted using hashed ElGamal with a common nonce and nonce commitment . The points are included in the ElGamal hash, ensuring that any change to either will result in a random bit on decryption. The pair act as a public key and signature for a one-time signature on the ciphertext as follows: the sender constructs the point as the output of a chameleon hash with basis on input , then uses the hash trapdoor (i.e., their knowledge of the discrete log relation ) to compute a collision involving , a hash of the rest of the ciphertext. The collision acts as a one-time signature with public key .

The fact that the generator was selected at random in each key generation is not used by the construction and doesn’t seem to play a role in the security analysis; in fact, the security proof in Appendix F has the security game operate with a common generator. In what follows, we instead assume that is a global parameter.

## From R-FMD2 to S-FMD2

Changing this construction to the S-FMD model is fairly straightforward: rather than having the sender encrypt bits of the Bloom filter and only allow the detector to decrypt of them, we have the sender only encrypt bits of the Bloom filter and allow the detector to decrypt all potential bits. As noted in the previous section, this means that in the S-FMD context, there is no separation of capability between the root key and the detection key. For this reason, we skip the Extract algorithm and (conceptually) merge the root key into the detection key.

The S-FMD2 construction then works as follows:

###### S-FMD2/KeyGen

For , choose and compute . Return the detection key and the flag key .

###### S-FMD2/Flag

On input flag key , first parse , then proceed as follows:

1. Choose and compute .
2. Choose and compute .
3. For each , compute
1. a key bit ;
2. a ciphertext bit .
4. Compute .
5. Compute .

Return the flag ciphertext .

###### S-FMD2/Test

On input detection key and flag ciphertext , first parse and , then proceed as follows:

1. Compute .
2. Recompute as .
3. For each , compute
1. a key bit ;
2. a plaintext bit .

If all plaintext bits , return (match); otherwise, return .

The value of is included in the ciphertext hash to ensure that the encoding of the ciphertext bits is non-malleable. Otherwise, the construction is identical.

## Diversified detection: from S-FMD2 to S-FMD2-d

The Sapling design provides viewing keys that support diversified addresses: a collection of publicly unlinkable addresses whose activity can be scanned by a common viewing key. For integration with Sapling, as well as other applications, it would be useful to support diversified detection: a collection of publicly unlinkable diversified flag keys that share a common detection key. This collection is indexed by data called a diversifier; each choice of diversifier gives a different diversified flag key corresponding to the same detection key.

In the Sapling design, diversified addresses are implemented by selecting the basepoint of the key agreement protocol as the output of a group-valued hash function whose input is the diversifier. Because the key agreement mechanism only makes use of the secret key (the incoming viewing key) and the counterparty’s ephemeral public key, decryption is independent of the basepoint, and hence the choice of diversifier, allowing a common viewing key to decrypt messages sent to any member of its family of diversified transmission (public) keys.

A similar approach can be applied to S-FMD2, but some modifications to the tag mechanism are required to avoid use of the basepoint in the Test procedure.

Let be the set of diversifiers, and let be a group-valued hash function. At a high level, our goal is to replace use of a common basepoint with a diversified basepoint for some .

Our goal is to have diversified flag keys, so we first replace by in KeyGen, so that and the components of the flag key are parameterized by the diversifier . The sender will compute the key bits as and the receiver computes them as If instead of , then and the middle component will match.

But then the sender needs to construct as in order to have a known discrete log relation between and . This is a problem, because is recomputed by the receiver, but if the receiver must recompute it as , detection is bound to the choice of diversifier, and we need detection to be independent of the choice of diversifier.

The root of this problem is that the chameleon hash uses basis , so the receiver’s computation involves the basepoint . To avoid it, we can use a basis , where . is used in place of , but is computed as , i.e., as the chameleon hash . The sender’s collision then becomes , allowing the detector to recompute as The point must be included in the flag ciphertext, increasing its size, but detection no longer involves use of a basepoint (diversified or otherwise).

Concretely, this mechanism works as follows, splitting out generation of flag keys (diversifier-dependent) from generation of detection keys:

###### S-FMD2-d/KeyGen

For , choose , and return the detection key .

###### S-FMD2-d/Diversify

On input detection key and diversifier , first parse . Then compute the diversified base , and use it to compute . Return the diversified flag key .

###### S-FMD2-d/Flag

On input diversified flag key and diversifier , first parse , then proceed as follows:

1. Compute the diversified base .
2. Choose and compute .
3. Choose and compute .
4. Choose and compute .
5. For each , compute
1. a key bit ;
2. a ciphertext bit .
6. Compute .
7. Compute .

Return the flag ciphertext .

###### S-FMD2-d/Test

On input detection key and flag ciphertext , first parse and , then proceed as follows:

1. Compute .
2. Recompute as .
3. For each , compute
1. a key bit ;
2. a plaintext bit .

If all plaintext bits , return (match); otherwise, return .

## Deterministic flag key generation

One remaining obstacle is the size of the flag keys. A flag key supporting false positive probabilities down to requires group elements; for instance, supporting probabilities down to with a 256-bit group requires 448-byte flag keys. It would be much more convenient to have smaller keys.

One way to do this is to use a deterministic key derivation mechanism similar to BIP32 to derive a sequence of child keypairs from a single parent keypair, and use that sequence as the components of a flag / detection keypair. To do this, we define a hash function . Then given a parent keypair , child keys can be derived as with . Unlike BIP32, there is only one sequence of keypairs, so there is no need for a chain code to namespace different levels of derivation, although of course the hash function should be domain-separated.

Applying this to S-FMD2, the scalar becomes the compact detection key, and the point becomes the compact flag key. The full detection key can be derived from the compact detection key , and the full flag key can be derived from the compact flag key . In addition, it is no longer necessary to select the minimum false positive probability as a global parameter prior to key generation, as the compact keys can be extended to arbitrary length (and thus arbitrary false positive probabilities).

Deterministic derivation is only possible in the S-FMD setting, not the R-FMD setting, because S-FMD does not require any separation of capability between the component keys used for each bit of the Bloom filter. Public deterministic derivation does not allow separation of capability: given and any child secret , it’s possible to recover the parent secret and therefore the entire subtree of secret keys. Thus, it cannot be used for R-FMD, where the detection key is created by disclosing only some of the bits of the root key.

Unfortunately, this optimization does not seem to be possible to combine with diversified detection. The detection key components must be derived from the parent secret and (the hash of) public data, but diversified detection requires that different public data (flag keys) have the same detection key components.

• Is there some clever way to sidestep this issue? seems somewhat fundamental.

## Sapling Integration

• Pick diversified detection XOR deterministic flag key generation
• Addresses include a flag key / compact flag key
• 32-byte compact flag keys are probably workable, 320-448 byte full flag keys are probably not.

# Parameter Considerations

• How should the false positive rate be determined? In some epoch, let be the false positive rate, be the total number of messages, be the number of true positives for some detection key, and be the number of detections for that detection key. Then and ideally should be chosen so that:

1. is bounded above;
2. When is within the range of “normal use”, is close enough to that it’s difficult for a detector to distinguish (what does this mean exactly?);
• The notion of detection ambiguity only requires that true and false positives be ambiguous in isolation. In practice, however, a detector has additional context: the total number of messages, the number of detected messages, and the false positive probability. What’s the right notion in this context?

• What happens when an adversary manipulates (diluting the global message stream) or (by sending extra messages to a target address)? There is some analogy here to flashlight attacks, although with the critical difference that flashlight attacks on decoy systems degrade privacy of the transactions themselves, whereas here the scope is limited to transaction detection.

# Homomorphic Threshold Decryption

A sketch of one construction is as follows.

Write a value in radix with signed coefficients, i.e., as with . Encode each coefficient as and use ElGamal encryption to form the ciphertext Each ElGamal ciphertext consists of two group elements; if group elements can be encoded in 32 bytes, this gives a 384-byte ciphertext. To decrypt , use ElGamal decryption to obtain the group elements , and then use a lookup table to recover from , or fail if the value is unknown.

This can in principle be done inside of a zkSNARK circuit if the underlying group is an embedded elliptic curve, together with certification that the ciphertext was correctly formed with in-range coefficients.

Addition and subtraction of ciphertexts are done componentwise, using the homomorphic property of ElGamal encryptions, and the fact that .

Adding ciphertexts of values whose coefficients were bounded as results in a sum whose coefficients are bounded as . Choosing and accounting for sign means that a lookup table of size is sufficient to decrypt sums of up to 128 well-formed ciphertexts. Sums of more than 128 ciphertexts can be handled by summing blocks of 128, decrypting the block sum, and summing the plaintexts.

Unfortunately, this method reveals slightly more information about a sum of encrypted summands than would be ideal. Ideally, it would reveal only the sum of the encrypted summands, but in fact it reveals the sum of each radix- chunk, without carrying between them. Carrying collapses information about the summands, but that information is revealed by this scheme. This seems unlikely to be a problem in practice, but it is worth quantifying.

## TODO

• the bounds above are a ballpark estimation; refine them and make them precise
• work out integration with ABCI++ protocol phases

These updates describe design progress on Penumbra:

# 2020-06-07: ZSwap

Replaces the previous sketch DEX design with ZSwap, an new design that integrates Uniswap v3’s concentrated liquidity mechanism.

# 2020-05-11: Cryptography design

Creates a new Primitives section for the cryptographic primitives Penumbra will use.

Adds a rationale for the choice of proof system.

Adds an overview of the decaf377 construction with an overview of the relative cost compared to other encodings, sketches the encoding and decoding functions and adds test vectors. Specifying the square root algorithm and hash-to-group map go together.

Also puts in placeholders for a Poseidon instance, and a STROBE-with-Poseidon construction.

# 2020-04-20: Project update log and issue tracker

Created this project update section of the notes, backfilled it, and published the repo to use as an issue tracker for design details.

# 2020-04-16: ABCI domain types

The first design iteration of tower-abci used raw protobuf-generated types from the tendermint-proto crate. However, because these types are generated from protobuf definitions, they’re much less ergonomic: they don’t follow Rust naming conventions, they can’t be well-organized into modules, they can’t be easily documented, etc. More importantly, because protobufs require that all fields be optional and have default values, using raw proto types means that validation checks are scattered throughout the codebase.

To fix this, tendermint-rs PR #862 adds a complete model for the ABCI protocol as a set of Rust domain types for each request and response method, and provides validating conversions between the domain types and the raw proto types.

Using this code in tower-abci makes the interface significantly better, with the bulk of the kvstore example application fitting on one screen of code.

# 2020-04-12: tower-abci

Penumbra will use ABCI to talk to Tendermint. ABCI is the interface between Tendermint (a consensus engine for BFT replication of a state machine), and an arbitrary application (the state machine to be replicated). The ABCI interface consists of a set of requests and responses the consensus engine makes to drive the application state.

This means that Penumbra needs an ABCI interface. Existing Rust interfaces to ABCI require application developers to write their own ad-hoc synchronization logic to handle concurrent ABCI requests. This makes it difficult to get up-and-running, and difficult to correctly handle concurrent requests.

To address this, tower-abci is a new design for an asynchronous ABCI interface using Tower. Tower is a library of modular components for building networking clients and servers. Tower defines a core abstraction, the Service trait, which represents an asynchronous function with backpressure, and then provides combinators that allow generic composition of additional behavior, e.g., timeouts, buffering, load-shedding, rate-limiting, instrumentation, etc.

The tower-abci crate has two parts:

1. An ABCI server, which listens for connections and forwards ABCI requests to one of four user-provided Services, each responsible for processing one category of requests (consensus, mempool, info, or snapshot).

2. Middleware that splits a single Service implementing all of ABCI into four cloneable component services, each implementing one category of requests. The component services use message-passing to share access to the main service, which processes requests with the following category-based prioritization:

1. ConsensusRequests sent to the Consensus service;
2. MempoolRequests sent to the Mempool service;
3. SnapshotRequests sent to the Snapshot service;
4. InfoRequests sent to the Info service.

Because the ABCI server takes one service per category, users can apply Tower layers to the services they pass to the ABCI Server to add category-specific behavior, such as load-shedding, buffering, etc.

These parts can be combined in different ways to provide different points on the tradeoff curve between implementation complexity and performance:

1. At the lowest level of complexity, application developers can implement an ABCI application entirely synchronously. To do this, they implement Service<Request> so that Service::call performs request processing and returns a ready future. Then they use split::service to create four component services that share access to their application, and use those to construct the ABCI Server. The application developer does not need to manage synchronization of shared state between different clones of their application, because there is only one copy of their application.

2. At the next level of complexity, application developers can implement an ABCI application partially synchronously. As before, they implement Service<Request> to create a single ABCI application, but instead of processing all requests in the body of Service::call, they can defer processing of some requests by immediately returning a future that will be executed on the caller’s task. Although all requests are still received by the application task, not all request processing needs to happen on the application task.

3. At the highest level of complexity, application developers can implement multiple distinct Services and manually control synchronization of shared state between them, then use these to construct the ABCI Server.

Because these use the same interfaces in different ways, application developers can move gradually along this curve according to their performance requirements, starting with a synchronous application, then refactoring it to do some processing asynchronously, then doing more processing asynchronously, then splitting out one standalone service, then using entirely distinct services, etc.

# 2020-03-26: Design presentation

Initial publication of these notes, and presentation at the ZKValidator Cosmos Privacy+ZKP showcase.

The slides for the presentation can be found here, and a recording is published here.