🦋 Glasswing #4 - We Need a New Turing Test
AI breaks CAPTCHA. A new identity system is needed.
CAPTCHA: The Mainstay of Bot Prevention
CAPTCHA is annoying. Yet, it is the mainstay of ensuring you are not a robot.
CAPTCHA is an acronym for “Completely Automated Public Turing test to tell Computers and Humans Apart”. It works by presenting a challenge that is easy for humans to solve but difficult for bots to complete.
Examples of CAPTCHA systems that you have probably come across include “click the stoplights”, “match the two trucks”, “solve a puzzle”, “listen to audio”, or “solve a math problem”.
If you have ever watched the movie The Imitation Game with Benedict Cumberbatch, the movie’s title is a reference to Alan Turing’s seminal research paper “Computing Machinery and Intelligence”, in which he introduces the Imitation Game, or what we now refer to as the Turing Test.
The Turing Test is a cognitive evaluation method in which an interrogator engages in a text-based conversation with two players, one of whom is human and the other is a machine. The goal is to determine if the machine can generate responses that are indistinguishable from those of the human player. If the interrogator is unable to consistently identify which player is the machine, the machine is deemed to have exhibited intelligent behaviour equivalent to that of a human.
CAPTCHA takes the key principles of the Turing Test and attempts to automate the test for all websites, to determine whether a given user is a human or not. This technology has evolved over time and has taken different forms, including distorted text and image recognition challenges, audio challenges, and more recently, behavioural biometrics, which use machine learning algorithms to analyze user behaviour patterns such as mouse movement and typing speed.
Among the top 100k visited sites on the internet today, ~35k currently use some form of CAPTCHA technology to prevent bots. The most popular CAPTCHA tool today is Google’s reCAPTCHA API, which for larger companies is not cheap.
Any website that is reliant on CAPTCHA for the health of its platform is subject to severe risk of attack from bots as CAPTCHA becomes easier to break with AI. There are already many studies that show how different types of modern machine learning techniques can be used to attack CAPTCHA, and the problem is only going to get worse.
What happens when all CAPTCHA breaks?
If a system is unable to distinguish a human from a bot, the amount of non-human-generated content that is going to proliferate across our social platforms is going to be uncontrollable.
With the advent of generative foundation models (e.g., ChatGPT, GPT-4, DALL-E 2), we are going to see more bots generating forged content in a very persuasive way. This is going to degrade user experience and engagement. Users of broad-form social media platforms like Instagram, Facebook, and Twitter are going to become frustrated with the influx of spam and fake accounts on the platform.
With an increased amount of deception, there is going to be a decrease in trust in the authenticity of the information we view, reducing customer confidence in the platform.
The impacts of this are negative not just for humans but also for AI models. The more forged content bots publish, the more the data that AI is trained on is going to include data from non-human sources, meaning that the model’s distribution starts to diverge from that of natural human language.
How to Fix CAPTCHA?
Turing’s original Imitation Game made an assumption of having no prior information about the players involved in the game. The probability of being a human was determined by independent trials. CAPTCHA may now need to take on a more Bayesian approach to identity verification.
This means that instead of a verification system trying to evaluate whether or not you are a human at the time step you are trying to get access to a resource, the verification itself will have to incorporate some prior information about you.
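As a minimal sketch of what "incorporating prior information" could look like, the snippet below applies Bayes' rule to a belief that a user is human. The function names (`update_posterior`, `verify`) and all probabilities are illustrative assumptions, not measurements and not part of any existing CAPTCHA product.

```python
from typing import Iterable, Tuple


def update_posterior(prior_human: float,
                     p_obs_given_human: float,
                     p_obs_given_bot: float) -> float:
    """Apply Bayes' rule once and return P(human | observation)."""
    numerator = p_obs_given_human * prior_human
    denominator = numerator + p_obs_given_bot * (1.0 - prior_human)
    if denominator == 0.0:
        # The observation is impossible under both hypotheses; keep the prior.
        return prior_human
    return numerator / denominator


def verify(prior_human: float,
           evidence: Iterable[Tuple[float, float]]) -> float:
    """Fold a sequence of observations into the belief that the user is human.

    Each item in `evidence` is a pair (P(obs | human), P(obs | bot)).
    These likelihoods are hypothetical inputs that a real system would
    have to estimate from data.
    """
    posterior = prior_human
    for p_human, p_bot in evidence:
        posterior = update_posterior(posterior, p_human, p_bot)
    return posterior


# Assumed likelihoods for "passed an image CAPTCHA" once AI solvers are good:
# humans pass 90% of the time, bots 60% of the time.
passed_captcha = (0.9, 0.6)

# No prior information about the user (the Imitation Game setting).
no_history = verify(0.5, [passed_captcha])

# Same challenge, but the user arrives with a strong prior,
# for example from earlier attestations linked to their identifier.
with_history = verify(0.95, [passed_captcha])
```

Under these assumed numbers, a single passed challenge moves an uninformed prior of 0.5 to roughly 0.6, while the same challenge combined with a prior of 0.95 yields about 0.97. The challenge on its own carries little signal once bots can solve it, so most of the confidence has to come from the prior.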
An existing instantiation of this today is the crypto wallet. A crypto wallet is an identifier that lets you control the linking of verifiable attestations of your interactions across the decentralized internet to a single identifier. It can contain proof-of-personhood attestations, your social profiles, and, if you choose to link it, part of your transaction history.
Privacy is a large concern for crypto wallets today, but as I have posted before, there are many startups working on private implementations of identity attestations that can be mapped to your identifier with zero-knowledge proof (ZKP) technology.
Allow me to preface by stating that verifiable attestations can be achieved without relying on a blockchain. However, it is worth noting that the most commonly utilized tools for enabling individuals to carry their identities with them are currently found in the Web3 ecosystem.
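To make the idea of an attestation bound to an identifier concrete without any blockchain, here is a rough sketch using only the Python standard library. It is a simplification under stated assumptions: real systems use public-key signatures so that anyone can verify an attestation, whereas this sketch uses an HMAC with a shared secret as a stand-in. The function names, the `wallet:0xabc` identifier, and the claim string are all hypothetical.

```python
import hashlib
import hmac
import json


def _canonical_bytes(payload: dict) -> bytes:
    """Serialize the payload deterministically so tags are reproducible."""
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def issue_attestation(issuer_key: bytes, subject_id: str, claim: str) -> dict:
    """Issuer binds a claim to a subject identifier and tags the result.

    HMAC-SHA256 is a stand-in for a real digital signature (assumption).
    """
    payload = {"subject": subject_id, "claim": claim}
    tag = hmac.new(issuer_key, _canonical_bytes(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_attestation(issuer_key: bytes, attestation: dict, subject_id: str) -> bool:
    """Check that the attestation is untampered and belongs to subject_id."""
    payload = attestation["payload"]
    if payload.get("subject") != subject_id:
        return False
    expected = hmac.new(issuer_key, _canonical_bytes(payload), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, attestation["tag"])


# Hypothetical usage: an issuer attests that an identifier passed a
# proof-of-personhood check, and a platform later verifies that attestation.
demo_key = b"demo-issuer-key"
demo_attestation = issue_attestation(demo_key, "wallet:0xabc", "proof-of-personhood")
demo_is_valid = verify_attestation(demo_key, demo_attestation, "wallet:0xabc")
```

The design point is that the attestation travels with the identifier, so a second platform can check it without repeating the original verification. With a shared-secret HMAC only parties holding the key can verify; swapping in an asymmetric signature scheme would remove that limitation.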
As generative foundation models continue to dismantle our Web2 platforms, it will become necessary for these platforms to adopt comparable identity and reputation frameworks. In fact, we are already witnessing a move in this direction, with Meta's recent announcement regarding the launch of a decentralized version of Instagram, and Twitter's support for Bluesky's efforts to construct the AT protocol.
I think Balaji has put it quite well here. We are entering the phase of the internet where verification and authenticity are scarce.
As AI improves, our reliance on cryptographic primitives heightens. As cryptographic primitives improve to allow for interoperable data across platforms, AI can help us make sense of this data. This symbiotic relationship is not well discussed.
I have a new paper being released on March 20th, and I would much appreciate it if my subscribers gave it a read and shared comments when it drops. I will be sure to make a post.
"However, it is worth noting that the most commonly utilized tools for enabling individuals to carry their identities with them are currently found in the Web3 ecosystem."
"Sign in with Google" would like to have a word with you. Scan your passport to link your account to your identity and Google can provide the ZKP-equivalent verification to a third-party that you're a person without sharing your passport. Are there some downsides compared to solutions like Worldcoin? Yes, but there are also advantages. Based on your own metric of usage, Google is used by about 1000x more people and has solid structural moats to maintain that.
"Based on your own metric of usage, Google is used by about 1000x more people and has solid structural moats to maintain that." -> Sign in with Google proves that I have a Google account, nothing more.
"Scan your passport to link your account to your identity and Google can provide the ZKP-equivalent verification to a third-party that you're a person without sharing your passport." -> They don't today
"'Sign in with Google' would like to have a word with you." -> Email is email@example.com
Maybe I am missing something here, but I don't understand your point. My argument was that interoperable identity today is seen in an elegant way with crypto wallets, which let you attest to the authenticity of your social interactions from one platform on another. Google could surely do this, but to my knowledge they do not today.