00

7 min readDec 14, 2021

What is the funniest number in cryptography (Episode 2 )? 0 . The reason is that ∀x, x ∗ 0 = 0, i.e., the equation is always satisfied no matter what x is. We’ll use zero to attack zero-knowledge proof (ZKP). In particular, we’ll discuss a critical issue in a cutting-edge ZKP PLONK [1] C++ implementation which allows an attacker to create a forged proof that all verifiers will accept. We’ll show how theory guides the attack’s direction. In practice, the attack works like a charm and we’ll show how the attack falls through a chain of perfectly aligned software cracks.

In the same codebase, there is an independent critical ECDSA bug where (r, s) = (0, 0) is a valid signature for arbitrary keys and messages, but we won’t discuss it further because it’s a known ECDSA attack vector in the Google Wycheproof project that I worked on a few years ago.

All bugs have been responsibly disclosed through the vendor’s bug bounty program with total reward ~ $15,000 (thank you).

If you prefer a pdf version, please take a look at https://github.com/cryptosubtlety/00

How theory guides the attack’s direction?

In any zero knowledge proof (ZKP) system, there is a prover and a verifier. The prover has to convince the verifier that it knows a witness of a certain statement. The verifier has to check whether a certain equation is satisfied or not. PLONK uses polynomial commitment and pairing. For the purpose of this article, you don’t have to know what polynomial commitment or pairing e is. All you need to know is that the pairing e(P₁, P₂) maps 2 points P₁, P₂ to a finite field and G₁, G₂ are 2 base points on the elliptic curves.

One distinctive notation in PLONK is the following: [x]₁= x.G₁, [x]₂= x.G₂. What it means is that when [x]₁ is published, the attacker does not know x, the attacker only knows the product x.G₁and without breaking discrete log problems, x remains secret. However, the attacker can manipulate [x]₁. We’ll use this observation in the attack.

The final step in the verifier is to verify following equation

e([Wz]₁+ u.[Wzω]₁,[x]₂)* e(-(z.[Wz]₁+ uzω.[Wzω]₁+ [F]₁— [E]₁), [1]₂) ?= 1

where

The formulas look scary, don’t they? Just count how many parameters there are.

To simplify it, we’ll denote P[1] = [Wz]₁+ u.[Wzω]₁, P[0] = -(z.[Wz]₁+ uzω.[Wzω]₁+ [F]₁ -[E]₁) and the equation becomes

e(P[1], [x]₂).e(P[0], [1]₂) ?= 1

Now, it’s simple. I’m kidding.

When dealing with such complexity, it’s important to find the weakest and easiest place to attack. Recall that we want a full verification bypass, i.e., the attacker doesn’t know any witness, but wants to convince the verifier to accept the proof. Therefore, I was looking for 2 things:

Parameters that an attacker can manipulate.
The least effort way to manipulate parameters.

while forcing the verification equation satisfied.

To give you an brief idea:

[Wz]₁, [Wzω]₁are under the attacker’s control. To be clear, the attacker can manipulate [Wz]₁, [Wzω]₁but the attacker does not know the inside true values Wz, Wzω.
u = hash(transcript) where hash acts as a random oracle, so it’s technically outside the attacker’s control.
x is part of a trusted setup that no one (including the prover and verifier) is assumed to know.
F and E are computed by the verifier (not attacker) in a complicated multi-steps process, so let’s ignore them.

All right, it seems that [Wz]₁, [Wzω]₁are natural targets to attack. What if we use ([Wz]₁= 0, [Wzω]₁= 0) where 0 is an identity (infinity) point in the elliptic curve?

i/ P[1] = [Wz]₁+ u.[Wzω]₁= 0 + u0 = 0 independent of the transcript’s hash u (aka Fiat-Shamir transform).

Basically, we neutralize the role of Fiat-Shamir transform.

Now, there is some hope!

ii/ e(P[1], [x]₂) = e(0, [x]₂) = 1 because e(0, R) = e(R, 0) = 1 for all R.

iii/ We have

e(P[0], [1]₂) = e(-(z.[Wz]₁+ uzω.[Wzω]₁+ [F]₁-[E]₁), [1]₂)

= e(-(z.0 + uzω.0 + [F]₁— [E]₁), [1]₂)

= e(-(0 + 0 + [F]₁ — [E]₁), [1]₂)

= e(-([F]₁— [E]₁), [1]₂)

≠ 1

because [F]₁ — [E]₁ ≠ 0.

Therefore, e(P[1], [x]₂) e(P[0],[1]₂) ≠ 1. So, the attack doesn’t work?

Why does the attack work in practice?

Fortunately, I’m not a theoretical cryptographer. I don’t believe in theory’s security, at least not completely. I trust the program, the binary that runs. Whatever the program output tells me, that’s the truth. So I just input [Wz]₁= 0, [Wzω]₁= 0 to the PLONK’s verifier and see what would happen. The verifier computes e(P[1],[x]₂).e(P[0],[1]₂) = 1 and returns true, so the attack completely bypasses the verifier. Woo-hoo!

Note that it’s not strange at all in the attack process where the attacker observes unexplainable behavior. It’s pretty normal and common. Sometimes, it’s the start of a surprise attack. The investigation showed that the attack falls through a chain of perfectly aligned software cracks.

The root cause

Before we move on, a technical software implementation detail is that points in the elliptic curve are often represented in 3 forms: byte array (on the wire or storage), affine coordinate P = (x, y) or projective coordinate P = (x, y, z).

Recall that in our attack vector, we use [Wz]₁= 0 = (0, 0), [Wzω]₁= 0 = (0, 0) or (P[0] ≠ 0, P[1] = 0), where 0 = (0, 0) means its affine coordinate (x, y) = (0, 0). As a reminder, the attackers don’t control P[0], P[1] directly, they have to manipulate them through [Wz]₁, [Wzω]₁. If you’re lazy, just use a zero byte array for the whole proof (which includes [Wz]₁,[Wzω]₁) and it will bypass all verifiers.

The verifier checks whether [Wz]₁, [Wzω]₁ are on the elliptic curve or not. [Wz]₁ = 0, [Wzω]₁ = 0 are not valid points on the curve. However, the verifier does not stop immediately when it sees invalid points. It continues the execution, but it excludes the invalid 0 points in some further computations. The amazing thing is that while [Wz]₁ = 0, [Wzω]₁, P[1] = 0 are excluded in some further computations, they’re included in the crucial computation with pairing which allows the attack to work. If the program returns false immediately once it sees invalid points, the attack would fail.
In the elliptic curve code, there is another check to reject the infinity point. However, according to the code, P[1] = 0 is not infinity. The is_point_at_infinity method checks whether the most significant bit of the P[1] is 1, but P[1] = 0’s most significant bit is 0. Hence P[1] = 0 bypasses the infinity point check.
In the computation process, there is a method that computes the inverse of 0 mod p. The method doesn’t check for 0 input. It uses Fermat’s little theorem, i.e., it uses the equation x^(p — 1) = 1 mod p or x^(p — 2). x = 1 mod p or x^(p — 2) is the inverse of x mod p. However, when x = 0, x^(p — 2) = 0 which means that the inverse of 0 is 0. This isn’t even correct mathematically because the inverse of 0 mod p shouldn’t exist. If the inverse method rejects 0, the attack would again fail.
Now, the array (P[0], P[1]) = (P[0] ≠ 0, P[1] = 0) are in projective coordinates (x, y, z). The projective coordinate is often just an intermediate representation for optimization purposes. After finishing the computation, the projective coordinates go through a process called normalization to eliminate z (i.e. to make z = 1) and goes back to affine coordinate (x, y) ~ (x, y, 1). The code does not normalize points individually, instead it batch-normalizes an array of points together where P[1].z = 0 will affect P[0]. The vulnerable code outputs (P[0], P[1]) = (0, 0), i.e., it turns non-zero point P[0] into a 0 point. For instance, here is the output from the program

“Before batch_normalize

P[0]: {0x12270675066dbf202e8766f5fa48648f95032fbff46996a08e05e427ed0fffb9,0x2cce89ca786bd0a3db55776a24aa3253bce3b8ef689849f93596b5b26afec90f,0x04ae1f4cd5f84a484acc4ba115fbd02a879d2e30b8cd97e18f3865887213823b }

P[1]: {0x0000000000000000000000000000000000000000000000000000000000000000,0x0000000000000000000000000000000000000000000000000000000000000000,0x0000000000000000000000000000000000000000000000000000000000000000 }

After batch_normalize

P[0]: {0x0000000000000000000000000000000000000000000000000000000000000000,0x0000000000000000000000000000000000000000000000000000000000000000,0x0000000000000000000000000000000000000000000000000000000000000001 }

It’s worth noting in projective coordinates (x, y, z), we should reject z = 0 for regular points, but the code doesn’t. If the code rejects z = 0 in projective coordinates, the attack would again fail. As a small note, it’s worth noting that if each point is normalized independently then after normalization P[1] would be different from zero causing the attack to fail.

As a consequence the final verifier’s computation becomes e(P[1], [x]₂) e(P[0], [1]₂) = e(0, [x]₂) e(0, [1]₂).

5. Lastly, while P[1] = 0 is not on the curve and P[1] = 0 is not infinity according to step 2, the pairing code considers P[1] = 0 as infinity, in the sense that e(0, R) = 1 for all R. Without these contradicting views of the same input 0 in step 2 and step 5, the attack would fail.

Acknowledgements

We thank Ariel Gabizon for the feedback on this article.

[1]Ariel Gabizon, Zachary J. Williamson, Oana Ciobotaru. PLONK: Permutations over Lagrange-bases for Oecumenical Noninteractive arguments of Knowledge.

00

How theory guides the attack’s direction?

Why does the attack work in practice?

The root cause

Acknowledgements

Written by Quan Thoi Minh Nguyen