How Jumio Uses AI for Automatic Recognition of ID Documents


Jumio recently announced the beta launch of Jumio Go, a fully automated identity verification solution. Jumio Go combines AI, OCR and certified liveness detection technologies to automatically extract ID document data and validate the user’s digital identity in real time.

Jumio Go, along with the rest of Jumio’s suite of identity verification and authentication solutions, uses AI to run the checks needed to verify an identity. A user only needs a smartphone or webcam to submit a picture of a government-issued ID and a corroborating selfie, and Jumio returns a definitive verification decision.

One of the first steps of the identity verification process is to automatically recognize the type of ID submitted by the user. Our AI models automatically recognize:

  • Issuing country
  • Document type (e.g., ID card, driver’s license)
  • ID subtype (e.g., California Real ID driver’s license)

Figure 1: Example of Jumio’s Automatic Document Classification

Jumio uses supervised machine learning to automatically recognize and classify ID documents. Once its models are properly trained, this technology lets computers reliably identify the contents of digital images.

Why We Need to Classify ID Documents

Proper ID classification is an important prerequisite for optical character recognition (OCR), data extraction and security checks. Jumio’s identity verification solutions support more than 3,500 ID types. By identifying the type of ID, we can then determine the relative position of the data fields for automatic data extraction and for the security checks required to automatically validate the ID.

The content and position of the data on an ID depend on the layout of the document itself, which varies by country and document type.

So, in order to successfully extract data from an ID document, we first need to classify it. The goal of automatic ID document classification is to assign three labels to each ID picture: country, type and subtype.
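
To make the link between classification and data extraction concrete, here is a minimal sketch of how a predicted label triple could select a layout template. The class name `DocumentClass`, the `LAYOUTS` table, the function `field_boxes` and every coordinate are hypothetical and chosen only for illustration; they do not describe Jumio’s actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentClass:
    """The three labels assigned to each ID picture (hypothetical type)."""
    country: str
    doc_type: str
    subtype: str


# Hypothetical layout templates. Each field maps to a box given as
# (left, top, right, bottom) fractions of the image width and height.
# The numbers are invented for this example.
LAYOUTS = {
    DocumentClass("USA", "DRIVERS_LICENSE", "CA_REAL_ID"): {
        "name": (0.35, 0.30, 0.95, 0.40),
        "date_of_birth": (0.35, 0.55, 0.70, 0.63),
    },
}


def field_boxes(doc_class, width, height):
    """Convert the relative field positions for doc_class into pixel boxes."""
    template = LAYOUTS.get(doc_class)
    if template is None:
        raise KeyError(f"no layout template for {doc_class}")
    return {
        name: (round(left * width), round(top * height),
               round(right * width), round(bottom * height))
        for name, (left, top, right, bottom) in template.items()
    }


# Once the document is classified, the template says where to run OCR.
doc = DocumentClass("USA", "DRIVERS_LICENSE", "CA_REAL_ID")
boxes = field_boxes(doc, width=1000, height=600)
print(boxes["name"])
```

Storing positions as fractions rather than pixels keeps one template usable across images of different resolutions, which matters when pictures come from many different phones and webcams.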

Figure 2: Reasons for ID Document Classification

The Importance of Supervised Machine Learning

In order to automatically recognize the ID document, Jumio implements a supervised machine learning flow. This requires three basic ingredients:

1. Tagged data

Jumio has tagged millions of ID document images to fine-tune the machine learning flow.

2. Machine learning model

Deep neural networks are a family of machine learning algorithms and methods that have been successfully applied to computer vision and image recognition problems across academia and industry. By tuning the network parameters on an initial data set, or “training set,” the model learns to make decisions. The Jumio product team has identified more than 3,500 ID classes, based on the most commonly used documents from the main geographies of Jumio’s customer base.

3. Training the model on the tagged data

Jumio’s engineering team has collected about 3,000 tagged images for each ID class, resulting in a machine learning training set of millions of tagged images. Jumio’s AI models are trained on this tagged data.
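
As a rough check on those figures, the sketch below multiplies the two numbers quoted in this article and shows one way tagged samples could be represented and audited for classes with too few images. The sample records, file names and the helper `underrepresented` are hypothetical; only the two constants come from the text.

```python
from collections import Counter

NUM_CLASSES = 3_500        # ID classes cited in this article
IMAGES_PER_CLASS = 3_000   # approximate tagged images per class

approx_total = NUM_CLASSES * IMAGES_PER_CLASS
print(f"{approx_total:,}")  # prints 10,500,000

# Hypothetical tagged samples: (image file, (country, type, subtype)).
tagged_samples = [
    ("img_0001.jpg", ("USA", "DRIVERS_LICENSE", "CA_REAL_ID")),
    ("img_0002.jpg", ("USA", "DRIVERS_LICENSE", "CA_REAL_ID")),
    ("img_0003.jpg", ("DEU", "ID_CARD", "STANDARD")),
]


def underrepresented(samples, minimum):
    """Return the labels that have fewer than `minimum` tagged images."""
    counts = Counter(label for _, label in samples)
    return sorted(label for label, n in counts.items() if n < minimum)


sparse = underrepresented(tagged_samples, minimum=2)
print(sparse)
```

A per-class count like this is a common sanity check before training, since a class with very few examples is hard for any supervised model to learn.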

Figure 3: Example of Machine Learning Training

Through training, the model parameters are tuned and optimized so that the AI can make decisions. As a result, Jumio’s identity verification solutions can rely on a classifier, trained on millions of ID document images, that distinguishes among more than 3,500 classes and, given an image of a government-issued ID, automatically recognizes its country, type and subtype.
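
The idea of tuning parameters on tagged data can be illustrated with a toy example. The sketch below is not a deep neural network and is not Jumio’s model: it is a plain linear softmax classifier trained by gradient descent on made-up two-dimensional feature vectors, with three hypothetical classes standing in for the 3,500 real ones. It only shows the supervised pattern: predict, compare with the tag, adjust the parameters, then decode the winning class into country, type and subtype.

```python
import math
import random

# Hypothetical classes; each one is a (country, type, subtype) triple.
CLASSES = [
    ("USA", "DRIVERS_LICENSE", "CA_REAL_ID"),
    ("DEU", "ID_CARD", "STANDARD"),
    ("FRA", "PASSPORT", "STANDARD"),
]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, biases, x):
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)


def train(samples, num_classes, num_features, epochs=200, lr=0.5):
    """Fit weights and biases by stochastic gradient descent on cross-entropy."""
    weights = [[0.0] * num_features for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(epochs):
        for x, label in samples:
            probs = predict_proba(weights, biases, x)
            for k in range(num_classes):
                # Gradient of cross-entropy with respect to score k.
                grad = probs[k] - (1.0 if k == label else 0.0)
                for j in range(num_features):
                    weights[k][j] -= lr * grad * x[j]
                biases[k] -= lr * grad
    return weights, biases


def classify(weights, biases, x):
    """Return the most likely (country, type, subtype) and its probability."""
    probs = predict_proba(weights, biases, x)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]


# Toy "tagged data": feature vectors clustered around one center per class.
random.seed(0)
CENTERS = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
samples = [
    ([cx + random.gauss(0, 0.1), cy + random.gauss(0, 0.1)], label)
    for label, (cx, cy) in enumerate(CENTERS)
    for _ in range(30)
]
random.shuffle(samples)

weights, biases = train(samples, num_classes=len(CLASSES), num_features=2)
label, confidence = classify(weights, biases, [0.9, 0.1])
print(label, round(confidence, 3))
```

In a real image classifier the two-number feature vector would be replaced by pixel data and the single linear layer by many layers, but the training loop follows the same predict, compare and adjust cycle.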

Figure 4: Jumio Go Machine Learning Flow

Figure 4 shows the Jumio machine learning flow for ID document classification. In the final step, Jumio’s AI models are integrated and deployed to production at scale.

The Jumio Advantage

One of the key advantages of Jumio’s AI and machine learning algorithms is the ability to leverage the huge amount of data amassed during the process of verifying more than 200 million identities.

Automatic recognition and classification of government-issued IDs is the first in a series of AI models that allow Jumio to automatically extract ID document data and validate a user’s digital identity in real time.

By building other AI models on top of this initial step, Jumio can then remove a large amount of friction from the digital onboarding process while still fighting online identity fraud and meeting AML and KYC compliance mandates.

Want to learn more? Check out our new on-demand webinar, How AI and Machine Learning Can Simplify ID Recognition and Classification.


