ZK is a multi-billion-dollar industry today, and it is the key to a scalable, private future. But how can you tell whether the "zero-knowledge" you're hearing about is actually private? The vast majority of the ZK systems deployed today, including Starkware, Succinct, and RiscZero, tap only into the (succinct) verifiability property of these systems and haven't reached privacy applications. With stablecoin and RWA adoption growing, ZK for privacy will be the only way to bring them mainstream while meeting compliance and security requirements. Here is the tl;dr on what it means for a system to have the "succinct verifiability" property versus the "privacy" property, and how to tell which systems deliver the critical privacy property.
All proof systems, from the mathematical proofs taught in high school onward, have to satisfy two basic properties:
- Correctness (often called completeness): If the statement is true, the prover can generate a proof that the verifier accepts
- Soundness: If the statement is false, the verifier rejects, no matter how the prover generated the proof
Here is an unwritten property that we tend to ignore in proof systems but that is just as important: it must be easy to verify a proof. If verifying the proof is not efficient, the verifier might as well redo the computation itself. As an example, consider verifying a proof that the 100th Fibonacci number is 354224848179261915075. If checking the proof takes longer than computing the number, the proof system is useless.
Succinct verification addresses precisely this conundrum. A proof system with succinct verification allows one to verify such a computation in time significantly less than the time required to compute it.
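To see the cost argument concretely, here is a minimal sketch (my illustration, not any deployed system) of the useless baseline: without a succinct proof, the only way to check the claimed 100th Fibonacci number is to recompute it, so "verification" costs exactly as much as the original computation.

```python
def fib(n: int) -> int:
    """Iteratively compute the n-th Fibonacci number (F(0) = 0, F(1) = 1)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def naive_verify(n: int, claimed: int) -> bool:
    # No shortcut here: the verifier redoes the prover's entire computation,
    # which is precisely what succinct verification is meant to avoid.
    return fib(n) == claimed

print(naive_verify(100, 354224848179261915075))  # True
```

A succinctly verifiable system would instead let the verifier accept or reject in time far smaller than the `n` steps of the loop above.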
What does it mean for a proof to be zero-knowledge?
So far, we have only talked about the integrity of the computation. But what about privacy, a cornerstone of cryptography? In general, proof systems do not guarantee privacy: they may reveal information about the statement beyond the mere fact that it is true. In the worst case, from a privacy standpoint, they could be no better than sending all the data in the clear.
Their main advantage is efficiency: they enable the verifier to check the validity of huge statements (such as a transaction log or medical records) without examining all the data.
So, when do proof systems guarantee privacy, and how can we capture the leakage, if any, with mathematical precision? This is where the concept of zero-knowledge (ZK) comes in. A zero-knowledge proof ensures that the verifier learns nothing beyond the bare fact that a particular statement about the data is true, along with whatever information the verifier already knows.
Let's look more closely at what this means. Every statement being proven has two parts:
- Public state – information already known to the verifier (e.g., the rules of the Fibonacci sequence)
- Private state – sensitive information known only to the prover (e.g., secret account balances)
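In code, you can picture a statement as exactly this pair, plus a relation that says when the private state makes the public claim true. The names below (`Statement`, `relation`) are purely illustrative, not from any particular ZK library:

```python
from dataclasses import dataclass

@dataclass
class Statement:
    public_threshold: int   # known to the verifier (e.g., "more than 5 ETH")
    private_balance: int    # known only to the prover

def relation(stmt: Statement) -> bool:
    # This is what the prover must convince the verifier of,
    # without revealing private_balance itself.
    return stmt.private_balance > stmt.public_threshold
```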
A proof system is said to be zero-knowledge if there exists an efficient algorithm, called a simulator, that can generate something that looks identical to the proof without access to the prover’s private information. The essential requirement is that the simulated proof is indistinguishable from the one produced by the prover with the private inputs. Here is the central thesis of the zero-knowledge property: if the verifier cannot distinguish the simulated proof from the real one, it implies that the verifier learns nothing beyond the validity of the statement itself.
Let’s call the simulator Sim. Suppose there is a verifier that claims it learns something from the proof and public inputs. Call this learning algorithm Learn, so
Learn(public input, proof) = Learn(public input, Prover(private input, public input))
captures what it learns. We argue that Learn did not gain any new information, since it could have obtained the same result by running
Learn(public input, Sim(public input))
Why? Because Sim produces outputs that cannot be distinguished from actual proofs, Learn cannot tell the difference between the two inputs, so it must produce the same outcome in both cases. And since the second run uses only public information, Learn gained nothing from the proof. The crucial point here is that Sim runs efficiently; if Sim required exponential time, this conclusion would not hold.
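To make the simulator idea concrete, here is a runnable toy sketch of the honest-verifier simulator for the classic Schnorr protocol. Schnorr is my choice of example, not something the text above commits to, and the tiny group parameters are insecure toy values chosen only so the arithmetic is easy to follow:

```python
import random

# Toy group: the order-11 subgroup of Z_23* (INSECURE toy parameters).
p, q, g = 23, 11, 2            # g = 2 has order q = 11 modulo p = 23

x = random.randrange(1, q)     # prover's PRIVATE key
y = pow(g, x, p)               # PUBLIC key: y = g^x mod p

def honest_transcript():
    """Real prover: needs the private x to answer the challenge."""
    r = random.randrange(q)
    t = pow(g, r, p)           # commitment
    c = random.randrange(q)    # honest verifier's random challenge
    s = (r + c * x) % q        # response (this is where x is used)
    return t, c, s

def simulated_transcript():
    """Simulator Sim: never touches x, only the public y."""
    c = random.randrange(q)
    s = random.randrange(q)
    # Pick the commitment last, solving for the t that makes the check pass.
    # pow(y, -c, p) is a modular inverse (Python 3.8+).
    t = (pow(g, s, p) * pow(y, -c, p)) % p
    return t, c, s

def verify(t, c, s):
    return pow(g, s, p) == (t * pow(y, c, p)) % p
```

Both `honest_transcript()` and `simulated_transcript()` always pass `verify`, and their outputs are identically distributed, so a Learn algorithm fed one cannot behave differently when fed the other. That is exactly the argument in the text.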
Here is the bottom line: to prove that a system is zero-knowledge, there is only one way we know of: construct an efficient simulator.
As a concrete example, suppose I want to prove that I hold more than 5 ETH. The proof will convince you that my assets exceed 5 ETH, but it will reveal nothing about whether I hold 10, 100, or 10,000 ETH. Conversely, you will know I do not hold 3 ETH, because that already follows from the public claim. Zero-knowledge guarantees that the verifier learns exactly this and nothing more. This is a subtle point. The verifier learns something from the fact that the statement about the data is true, which it did not know before. So we cannot claim the verifier gained no knowledge. We can only claim it did not learn anything more than what it can conclude, assuming the statement about the data is true.
Designing such a simulator is a subtle and non-trivial task. Even when a proof appears to be “hiding” because the revealed information looks perfectly random, and may even be informally described as “perfect” zero-knowledge, true privacy requires a careful analysis. To rigorously guarantee zero-knowledge, everything revealed in the proof must be reproducible in an indistinguishable way by an efficient algorithm that operates without any access to the prover’s private state.
The important takeaway: whenever someone advertises a “zero-knowledge” system, the litmus test is whether they can exhibit an efficient simulator that reproduces the proofs without access to the private inputs.