CAPTCHA, I Swear I'm Not a Robot


From optical interpretation to ReCAPTCHA: the evolution of CAPTCHA technology, its functioning, and the contribution of artificial intelligence.



“There’s something uniquely disheartening about being asked to identify a fire hydrant and struggling to do so,” wrote journalist Josh Dzieza. And who can blame him?

Anyone who surfs the web will surely have encountered CAPTCHAs, those security tests created to distinguish human users from bots, involving the recognition of numbers, letters, and sounds, or a simple click to declare, “I’m not a robot.”


What Does CAPTCHA Mean?


CAPTCHA might bring to mind “gotcha,” an informal English term meaning “I caught you.” The association isn’t coincidental: these tests are designed to “catch” and stop bots and automated programs.

Of course, CAPTCHA has a more technical meaning:

Completely Automated Public Turing test to tell Computers and Humans Apart. The name is a reference to the Turing test.

Conceived by Alan Turing, a pioneer in 20th-century cryptography, the Turing test is a method to evaluate a machine’s intelligence by testing its ability to hold a conversation with a human. Throughout the dialogue, the human participant doesn’t know whether they’re interacting with an artificial intelligence or another human being. If they can’t tell the two apart, the machine is considered to have passed the test.


From Letter Recognition to Google


CAPTCHAs are now an essential part of web security. Their history runs from the simple recognition of distorted letters to the complexity of modern Google reCAPTCHA.

In the ’90s and early 2000s, the surge in hackers exploiting automated programs on the internet led computer experts from various organizations, including AltaVista and Sanctum, to join forces against the new threat. The first CAPTCHAs, based on the optical interpretation of distorted letters, provided a bulwark against automated access to websites. Users were challenged to prove they were “not a robot” by reading the displayed text and typing it out correctly.

As CAPTCHAs spread, spam attempts and automated attacks became more sophisticated. Over time, further improvements were introduced, among them audio-based tests, initially designed to make websites more accessible.


ReCAPTCHA’s “I’m Not a Robot”


2007 marked the rise of the reCAPTCHA system, which, born from a research project at Carnegie Mellon University, revolutionized the very concept of CAPTCHA.

The basic idea was to create a system that not only protected websites from spammers and bots but also contributed to digitizing books and texts that computers couldn’t read. In other words, by completing the CAPTCHA, users were supporting the online digitization and accessibility of books and documents.
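To make the idea concrete, here is a minimal sketch of how a single challenge could serve both purposes. It assumes the commonly described design in which each challenge pairs a "control" word the system already knows with a word that OCR software could not read. The class name `WordPairChallenge`, the `min_votes` threshold, and the simple majority rule are illustrative assumptions, not the actual reCAPTCHA implementation.

```python
from collections import Counter
from typing import Optional


def normalize(word: str) -> str:
    """Compare answers case-insensitively and ignore surrounding spaces."""
    return word.strip().lower()


class WordPairChallenge:
    """Hypothetical two-word challenge: one known word, one unread word."""

    def __init__(self, control_answer: str, min_votes: int = 3):
        self.control_answer = normalize(control_answer)
        self.min_votes = min_votes      # assumed agreement threshold
        self.votes = Counter()          # guesses for the unread word

    def submit(self, control_guess: str, unknown_guess: str) -> bool:
        """Return True if the user passes; record their vote if so."""
        if normalize(control_guess) != self.control_answer:
            return False                # failed the part we can check
        self.votes[normalize(unknown_guess)] += 1
        return True

    def transcription(self) -> Optional[str]:
        """Return the agreed reading once enough users concur."""
        if not self.votes:
            return None
        word, count = self.votes.most_common(1)[0]
        return word if count >= self.min_votes else None
```

In this sketch, only users who get the control word right are trusted to vote on the unread word, and the word counts as digitized once enough of them agree on the same reading.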

After acquiring reCAPTCHA in 2009, Google launched the “No CAPTCHA reCAPTCHA” system in 2014, appreciated mainly for its ease of use: a simple click or tap is enough to prove that “I’m not a robot.” Using technologies such as machine learning and user behavior analysis, the U.S. giant’s system estimates whether the visitor is a human or a bot.


How Does a CAPTCHA Work?


It depends on the type of CAPTCHA.

  • I’m Not a Robot: The test is based on the user’s behavior before and after the click. It takes into account mouse movements, subsequent actions, previously visited sites, and similar variables.
  • Distorted Text: Users are challenged to decipher a distorted image and type out the characters it contains, demonstrating their ability to interpret intricate visual inputs. Variants may include rotated, warped, or overlapping letters, along with a background that further complicates interpretation. There’s also an audio version of this CAPTCHA.
  • Images: While we humans can easily distinguish objects in a photo, such as cars or traffic lights, it’s a more challenging task for a computer. It must interpret the request, examine the image, and identify where the objects are located. This makes image-based CAPTCHAs more secure than text-based ones, but also more complex.
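Whatever the challenge type, the server-side flow tends to look similar: issue a challenge, remember the expected answer, and check the reply once. The sketch below illustrates that flow for a distorted-text CAPTCHA. It is only an assumption-laden example: the `TextCaptchaStore` class, its in-memory dictionary, and the 120-second lifetime are hypothetical, and rendering the text as a distorted image is deliberately left out.

```python
import hmac
import secrets
import time

# Skips look-alike characters such as O/0 and I/1/L.
ALPHABET = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"


class TextCaptchaStore:
    """Hypothetical in-memory store for pending text challenges."""

    def __init__(self, ttl_seconds: float = 120.0):
        self.ttl_seconds = ttl_seconds
        self._pending = {}  # challenge_id -> (expected_text, expiry_time)

    def issue(self, length: int = 6):
        """Create a challenge; the text would then be drawn as an image."""
        text = "".join(secrets.choice(ALPHABET) for _ in range(length))
        challenge_id = secrets.token_urlsafe(16)
        expiry = time.monotonic() + self.ttl_seconds
        self._pending[challenge_id] = (text, expiry)
        return challenge_id, text

    def verify(self, challenge_id: str, answer: str) -> bool:
        """Check an answer; each challenge can be tried only once."""
        entry = self._pending.pop(challenge_id, None)
        if entry is None:
            return False                 # unknown or already used
        expected, expiry = entry
        if time.monotonic() > expiry:
            return False                 # answered too late
        given = answer.strip().upper().encode()
        # Constant-time comparison avoids leaking how many characters match.
        return hmac.compare_digest(given, expected.encode())
```

A challenge is consumed on the first attempt, whether right or wrong, so a bot cannot keep guessing against the same image.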

The Downside of CAPTCHAs


Despite its many advantages, CAPTCHA technology also has some drawbacks.

  • User Experience: CAPTCHA tests make online procedures longer, which annoys many users. As the tests grow more complex, solving them becomes a frustrating experience, as highlighted by a study conducted at Stanford University in 2010.
  • Low Conversion Rates: Negative user experiences and CAPTCHA inaccessibility can lead to a decline in website conversions. According to a 2009 study involving 50 websites, CAPTCHA usage resulted in a 3.2% decrease in legitimate conversions.
  • Privacy Concerns: Users and researchers view AI-driven CAPTCHAs such as reCAPTCHA v3 critically because they use scripts and cookies to monitor user activity across websites. This raises questions about transparency and about how the collected data is used.

Dogs Smiling and Other Oddities


The web is filled with strange CAPTCHAs. Some users have been asked to spot the happiest dogs among images, while others have been tasked with identifying clouds resembling horses. Seems like an easy task? Not when elephant-clouds are thrown into the mix.

Then there are CAPTCHAs that challenge with graffiti of different artistic styles, asking you to distinguish the code’s letters amidst the drawings or to identify a geometric shape that seems to defy the laws of physics.

And if you’re wondering if there are even more unusual ones, the answer is yes. In 2014, Kevin Gimpel and his team created a CAPTCHA that pits muffin photos against seal pups. The task requires selecting only a certain category of images from a selection of deceptive and distorted representations.

Good luck!

Our Savior: Artificial Intelligence?


In the context of CAPTCHA, AI is mainly used to improve online security and streamline the authentication process. It is also used to test CAPTCHAs: its growing ability to solve them pushes developers to keep improving their solutions.

But all that glitters is not gold.

If CAPTCHAs become too complex to be easily solved by human users, there’s a risk of compromising the overall browsing experience. Faced with a daunting task, users may become discouraged and abandon the process, causing losses for the service or website owner.

So are there other ways to distinguish between humans and machines? Replacing a widely used system, even one disliked by many, remains a challenge. As computer science professor Jason Polakis said back in 2019: “We’re at a point where making things harder for software ends up making them too hard for most humans. Alternatives are needed.”