Double spending bug in Polygon’s Plasma bridge
I thought I was out of the security game for a while now and that my interests have moved on to other fields. Last week I came back from an extended holiday, and refreshed I looked around on CT on what had happened. Somehow I stumble over the Polygon announcement that they launched their bug bounty program on Immunefi. I have worked on fraud-proof systems before, specifically on Plasma. They are tricky to implement securely, and there is a lot of complexity around the exit games and withdrawing funds from L2 to L1. Taking a quick look at the bounty page, I skim through the published contracts to find the contract that locks up the funds on L1 when users move funds to and from the Polygon network. The contract is called DepositManagerProxy, and it holds tokens valued at close to one billion USD at the time of writing. I am intrigued and down I fall that rabbit hole.
Polygon’s Plasma overview
Before analyzing the bug, it’s essential to understand how Plasma works, at least on a high level. The below diagram shows the flow of user assets through the Polygon network (source). A user deposits funds into the bridge contract on L1. The tokens are locked on L1 and made available on the Plasma network so users can transfer them. An aggregator called the child chain bundles the Plasma transactions into blocks and submits checkpoints to L1. This function allows users to confirm that transactions have been successfully processed on the child chain and to detect misbehavior. When a user decides to exit the funds back to L1, the tokens need to be burned on the Plasma chain. The user presents the receipt of the burn transaction to the Plasma bridge as proof that the tokens were burnt. After a challenge period of 7 days, the funds can be withdrawn back to the user on L1. Such a mechanic is typical for fraud-proof systems as they assume certain actions are valid and only revert it if another user can prove that it was invalid. The exit game scenarios are fairly complex in Plasma; if you want to learn more, you can check out EthHub. After the challenge period, the exit can be processed and the tokens are sent back to the user from the bridge contract.
The bug
An exit can be started by calling the startExitWithBurntTokens function from the ERC20PredicateBurnOnly contract. The following parameters are among what is parsed by the ExitPayloadReader library from the transaction data.
- block receipt root: receipts root hash of the block
- receipt: a receipt containing all essential information of a transaction such as sender, receiver, amount, and what token is sent
- receipt proof: Merkle proof of the receipt
- branch mask: the traversal path of where the receipt hash is located within the receipt proof
These values are passed into the MerklePatriciaProof.verify function in line 115, which checks that a receipt is included in a tree of receipts by traversing the receipt proof based on the branch mask. The checkBlockMembershipInCheckpoint function checks that the receipt root was included in a checkpoint, which proves that the receipt was included in a previous block. If both checks pass, then the exit id is created from the timestamp and the block number of the checkpoint and the branch mask see line 158. The exit id is returned and stored in a priority queue that orders all exits based on their age. When an exit has been in the priority queue for at least 7 days since the burn transaction it refers to, it can be processed and the funds are returned to the user on L1.
An important security property that must always hold when a user starts an exit is that one exit id can only refer to one burn transaction. The exit id is composed partly of the branch mask that comes directly from user input, which immediately felt like a soft spot when I looked at the code for the first time. If it were possible to create an alternative branch mask that somehow passes the receipt inclusion proof, then we would find a way to create more than one exit id referring to the same burn transaction. A closer look at the verify function in the MerklePatricaProof library reveals that something happens to the path before the proof is traversed in a loop from line 40. In line 35, the branch mask is passed into the _getNibbleArray function.
Reviewing the _getNibbleArray function shows that the branch mask (function argument encodedPath) is Hex Prefix encoded. The Ethereum Yellow Paper defines this type of encoding in the following way: Hex-prefix encoding is an efficient method of encoding an arbitrary number of nibbles as a byte array. The _getNibbleArray function considers the path length an odd number if it starts with 1 or 3. It discards the first nibble and expands every consecutive nibble to a byte when it is decoded. If the path length is even and the first nibble is not 1 or 3, it discards the entire first byte. Let’s take an example and assume that the branch mask is 0x00819c. The first byte is discarded as the path length is an even number, and the rest of the nibbles are expanded to 0x0801090c.
Checking a few exits from the ERC20PredicateBurnOnly contract seems to indicate that the decoded branch mask is always four, so an even number. Almost an entire byte is discarded in this case and it creates a lot of waste or, for our purposes, ways to encode the same path in different ways; see the list below. In total, it is possible to create 14x16 = 224 different encodings for the same raw path. A malicious user can leverage the issue to create alternative exits for the same burn transaction and perform double spends on the Polygon network.
The impact
There are capital requirements for a successful attack, but they are minimal compared to a malicious user’s potential reward. Let’s take an example and understand how much a malicious user could have gained compared to the funds required for an attack. A malicious user
- deposits 200,000 USD worth of tokens to the DepositManager contract
- burns the tokens with a burn transaction on Polygon network
- starts the exit
- waits the seven days challenge period
- processes the exit and gets the initial funds back
Then the attack is launched, and 223 alternative exit payloads are generated with the technique described above, and exits are initiated for each one of them. All exits get a unique id and are added to the exit queue. Their age is already older than the challenge period since the burn transaction has been aggregated into a Plasma block, so the funds can be released on L1. After all exits have been started and processed, the malicious user gains 223 times the amount on top of the initial deposit or tokens valued at 44.6 million USD.
A malicious user can increase the reward by either increasing the deposit amount or depositing and exiting funds multiple times.
The fix
The vulnerability has been fixed by rejecting any encoding that does not start with 0x00 (see here). It’s not very elegant, but it fixes the double-spending bug by hard coding the encoding meta character.
TL;DR
If I had to guess why the bug happened, I would say it might be due to using someone else’s code and not having a 100% understanding of what it does. The library that verifies the Merkle Patrica proofs has been written by a 3rd party, and it can be found in other Github projects. The assumption that the proof path is unique and that it can be wrapped into the exit id in its encoded form seems to support the theory that the code in the library was not well understood. It’s OK to use exiting building blocks when you write smart contracts, but you must understand all implications of doing so. At the end of the day, it’s your code; it does not matter if you or someone else wrote it.