Update October 2021: I’ve implemented a proof of concept of the zero-knowledge proof this post describes here.
In “FixedID: Building a Better SSO,” I described a system for building a proof-of-personhood protocol with built-in account recovery. In that post, I mentioned the possibility of using the system for SSO while maintaining user privacy. This post will elaborate a specific system for achieving that.
Existing web2 SSO standards (“sign in via Google, Facebook, Twitter,” etc.) are highly centralized and vulnerable to censorship. They also leak data in two directions: the SSO provider can see a list of all apps you’ve signed into, and the app can see details of your public profile on the service. However, many web2 and even web3 apps rely on these centralized providers anyway, because (1) they’re a weak form of sybil-resistance, (2) they’re more convenient for end users, and (3) they move some of the burden of account recovery off of each app and onto the SSO provider.
On the other hand, the Ethereum ecosystem has standardized on “sign in with Ethereum” as a replacement for username/password combinations. This still isn’t ideal, however. Importantly, losing your private key now means you lose all your funds and your accounts! This may be acceptable for crypto pioneers, but it’s unrealistic to expect the wider world to never lose or share their private keys. Account recovery needs to be a first-class property of any sign-in system.
A Naive Implementation of SSO via FixedID
FixedIDs are designed to be fully recoverable, which solves the largest UX issue with “sign in with Ethereum.” Building on that foundation, a naive SSO flow might work as follows:
1. The user generates an access token signed by the private key of the wallet associated with their FixedID.
2. The user sends the token to the app they’re signing into.
3. The app consults the mapping of wallet->FixedID publicly readable in the FixedID contract, and determines which FixedID the sign-in request corresponds to.
4. The app responds to the user with a session token tied to the user’s FixedID, valid for a few days.
5. If the user loses their private key and has to recover their account, they just go through the FixedID recovery flow to assign a new private key. This also automatically restores access to all FixedID-backed SSO accounts, since app user accounts are tied to FixedIDs, instead of specific addresses.
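The app-side half of this naive flow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `NaiveSsoApp`, `wallet_to_fixed_id` (a plain dict standing in for the on-chain wallet->FixedID mapping) and `verify_signature` (a caller-supplied function standing in for real Ethereum signature verification) are all hypothetical names, not part of FixedID itself.

```python
import secrets
import time

SESSION_TTL_SECONDS = 3 * 24 * 60 * 60  # "valid for a few days" (assumed: 3 days)


class NaiveSsoApp:
    """App-side half of the naive flow; accounts are keyed by FixedID."""

    def __init__(self, wallet_to_fixed_id, verify_signature):
        # wallet_to_fixed_id: dict standing in for the public contract mapping.
        # verify_signature: hypothetical callable
        #   (address, message, signature) -> bool.
        self.wallet_to_fixed_id = wallet_to_fixed_id
        self.verify_signature = verify_signature
        self.sessions = {}  # session token -> (FixedID, expiry timestamp)

    def sign_in(self, address, message, signature):
        if not self.verify_signature(address, message, signature):
            raise PermissionError("bad signature")
        fixed_id = self.wallet_to_fixed_id.get(address)
        if fixed_id is None:
            raise PermissionError("address is not bound to a FixedID")
        session = secrets.token_urlsafe(32)
        self.sessions[session] = (fixed_id, time.time() + SESSION_TTL_SECONDS)
        return session

    def account_for(self, session):
        fixed_id, expires_at = self.sessions[session]
        if time.time() > expires_at:
            raise PermissionError("session expired")
        return fixed_id
```

Because the account key is the FixedID rather than the address, rebinding the FixedID to a new wallet in `wallet_to_fixed_id` is enough for the next sign-in to land on the same account.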
This implementation is great for account recovery, but it’s bad news from a privacy point of view. Since each FixedID must be tied to a verifiable real-world identity, every app would know the real identity of all of their users, which is unacceptable for many use cases.
Luckily, with the help of zero-knowledge proofs we can do better.
Private SSO with FixedID
Starting from the naive approach above, we need to modify steps 1-3 so a user doesn’t have to share their FixedID (or the address associated with their FixedID) with an app to verify they’re a unique human. Here’s one way to do that:
1. Once a week, the FixedID contract generates a Merkle tree [1] spanning all active FixedIDs, where each leaf is a (fixedID, public_key) tuple. The root of the tree is published on-chain and publicly verifiable.
2. Each app chooses a static AppId used to generate sign-in requests. Typically, this will be the app’s domain name.
3. When a user wants to sign into an app, the app provides its AppId along with a large random nonce, and the user responds with two pieces of information:
   - An app token, of the form AppToken = Hash("$FixedID,$AppId")
   - A proof that the AppToken was generated correctly. (I’ll explain this proof in detail below.)
4. The app validates the proof to ensure the AppToken was generated correctly. It then saves the token to its database, and returns a session token tied to the AppToken, valid for a few days.
5. If the user loses their private key and has to recover their account, they go through the FixedID recovery flow to assign their FixedID to a new address. Once that is done, they can just repeat step (3) to get a new session token for the app. Since the AppToken doesn’t depend on the user’s private key in any way, only the FixedID and AppId which never change, recovery works totally transparently from an app’s perspective.
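To make the AppToken construction concrete, here is a minimal sketch. The post does not name a hash function, so SHA-256 is an assumption, and the FixedID and AppId values are made up for illustration.

```python
import hashlib


def app_token(fixed_id: str, app_id: str) -> str:
    """AppToken = Hash("$FixedID,$AppId"), with SHA-256 as an assumed hash."""
    return hashlib.sha256(f"{fixed_id},{app_id}".encode("utf-8")).hexdigest()


# The same FixedID produces a different token at each app.
forum_token = app_token("fixedid-42", "forum.example")
shop_token = app_token("fixedid-42", "shop.example")

# Key rotation changes neither input, so the token after recovery is identical.
token_after_recovery = app_token("fixedid-42", "forum.example")
```

Nothing in `app_token` touches a key, which is why an app sees the same user before and after a recovery.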
A zero-knowledge proof can be modeled as a program run with a set of inputs chosen by a “prover.” The prover discloses the output of the function as well as some subset of the inputs to a “verifier,” without leaking any information about the undisclosed inputs.
As a user, we want to prove to an app that we generated a valid AppToken, without revealing the actual FixedID used to create it. That can be done with a program written as follows:
Public inputs:
- The Merkle root of the current set of FixedID users
- The app’s public AppId
- The app’s provided nonce
- The generated AppToken
Private inputs:
- The user’s FixedID
- The public key associated with the FixedID
- The private key associated with the FixedID
- The path through the Merkle tree to the user’s (fixedID, public_key) leaf
The proof program then verifies the following properties:
- the private key matches the public key
- the provided path through the Merkle tree is valid, and leads to the provided (fixedID, public_key) leaf
- the provided AppToken string matches the output of Hash("$FixedID,$AppId")
If any of the above checks fail, the program returns 0. Otherwise, it returns the provided nonce.
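The checks themselves are easy to write down as ordinary code. The sketch below is plain Python, not a zero-knowledge circuit: it shows only the logic a circuit would enforce, and running it directly hides nothing. Several pieces are assumptions made for illustration: SHA-256 as the hash, `derive_public_key` as a toy stand-in for real keypair derivation, the leaf encoding, and a Merkle path given as (sibling hash, sibling-is-left) pairs from leaf to root.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def derive_public_key(private_key: bytes) -> bytes:
    # Toy stand-in: a real circuit would check an actual signature keypair.
    return h(b"pk:" + private_key)


def leaf_hash(fixed_id: str, public_key: bytes) -> bytes:
    # Assumed encoding of the (fixedID, public_key) leaf.
    return h(b"leaf:" + fixed_id.encode("utf-8") + b"," + public_key)


def root_from_path(leaf: bytes, path) -> bytes:
    # path: list of (sibling_hash, sibling_is_left) pairs, leaf to root.
    node = leaf
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node


def proof_program(merkle_root, app_id, nonce, app_token,       # public inputs
                  fixed_id, public_key, private_key, path):    # private inputs
    if derive_public_key(private_key) != public_key:
        return 0
    if root_from_path(leaf_hash(fixed_id, public_key), path) != merkle_root:
        return 0
    expected = hashlib.sha256(f"{fixed_id},{app_id}".encode("utf-8")).hexdigest()
    if app_token != expected:
        return 0
    return nonce


# Two-user tree: Alice is the left leaf, Bob the right one.
alice_sk, bob_sk = b"alice-secret", b"bob-secret"
alice_pk, bob_pk = derive_public_key(alice_sk), derive_public_key(bob_sk)
alice_leaf = leaf_hash("fixedid-1", alice_pk)
bob_leaf = leaf_hash("fixedid-2", bob_pk)
root = h(alice_leaf + bob_leaf)

token = hashlib.sha256(b"fixedid-1,forum.example").hexdigest()
result = proof_program(root, "forum.example", 123456, token,
                       "fixedid-1", alice_pk, alice_sk, [(bob_leaf, False)])
```

Since 0 signals failure, the nonce the app issues should never be 0.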
Once the app receives the finished proof, it has to check that all of the provided public inputs match their expected values, and that the proof returned the provided nonce. Assuming that is the case, the provided AppToken must be valid.
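On the app side, those checks might look roughly like the following. `SignInResponse` and `accept_sign_in` are hypothetical names, and `verify_proof` is a caller-supplied stand-in for whatever verifier the proof system provides. Treating each nonce as single-use is an added assumption, not something the protocol above spells out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignInResponse:
    merkle_root: str    # public input claimed by the user
    app_id: str         # public input claimed by the user
    nonce: int          # public input claimed by the user
    app_token: str      # public input claimed by the user
    proof_output: int   # value the proof program returned
    proof: bytes        # opaque proof blob


def accept_sign_in(resp, current_root, my_app_id, issued_nonces, verify_proof):
    """Return the AppToken if the response is acceptable, else None."""
    # Public inputs must match what the app itself expects.
    if resp.merkle_root != current_root or resp.app_id != my_app_id:
        return None
    if resp.nonce not in issued_nonces:
        return None
    # The proof program returns the nonce on success and 0 on failure.
    if resp.proof_output != resp.nonce:
        return None
    # Hypothetical verifier: checks the proof against inputs and output.
    if not verify_proof(resp):
        return None
    issued_nonces.discard(resp.nonce)  # assumed: each nonce is single-use
    return resp.app_token
```

The returned AppToken is what the app would store as the account key and tie the session token to.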
Addendum: Cracking Hashes
Unfortunately, there’s one remaining issue. The provided AppToken hash is very vulnerable to cracking, since the AppId is static and there will be at most a few billion FixedIDs to try. We need to add more entropy to the string that gets hashed!
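The attack is short enough to write out. This sketch assumes SHA-256 and, purely for illustration, that FixedIDs are small decimal integers; the real ID format is not specified here, and the search space is kept tiny so the demo runs quickly.

```python
import hashlib


def app_token(fixed_id: str, app_id: str) -> str:
    return hashlib.sha256(f"{fixed_id},{app_id}".encode("utf-8")).hexdigest()


def crack_app_token(target: str, app_id: str, candidate_ids):
    """Return the FixedID whose token matches the target, or None."""
    for fixed_id in candidate_ids:
        if app_token(fixed_id, app_id) == target:
            return fixed_id
    return None


# Hypothetical ID format: decimal integers below 100,000.
leaked = app_token("73421", "forum.example")
recovered = crack_app_token(
    leaked, "forum.example", (str(i) for i in range(100_000))
)
```

Any app can run this loop against its own user table, because it already knows its AppId and every stored AppToken.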
Given the length of this post, I’ll save a full treatment of possible solutions and their tradeoffs for another time. But one simple solution is to require users to remember a static “password”, and save a hash of that password on the blockchain. That password hash could be present in the Merkle tree, and the password could be added as an input to the AppToken hash. Compromising the password would be bad (since sites could de-anonymize their users), but not nearly as bad as losing your private key.
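One possible reading of the password idea, again as a sketch only: append the password to the hashed string. The "$FixedID,$AppId,$password" layout, the use of SHA-256, and the sample values are all assumptions; the exact construction is left open above.

```python
import hashlib


def app_token_with_password(fixed_id: str, app_id: str, password: str) -> str:
    # Assumed layout: the original "$FixedID,$AppId" string plus the password.
    material = f"{fixed_id},{app_id},{password}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


stored = app_token_with_password(
    "73421", "forum.example", "correct horse battery staple"
)

# Enumerating FixedIDs alone no longer reproduces the stored token.
id_only_guesses = (
    hashlib.sha256(f"{i},forum.example".encode("utf-8")).hexdigest()
    for i in range(100_000)
)
id_only_attack_succeeds = any(guess == stored for guess in id_only_guesses)
```

An attacker now has to guess the password along with the FixedID, so the protection is only as strong as the password.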
Zero-knowledge proofs make it possible to build a decentralized single-sign-on system that is fully recoverable, fully private and sybil-resistant. This set of properties has never before been combined in a practical system.
[1] Actually, an RSA accumulator or similar set digest is likely a better data structure here, because additions/updates are much cheaper than in a Merkle tree and proofs can use much less data. But Merkle trees are more familiar and work fine to illustrate the protocol.