Where have all the data entry candidates gone?

May 16, 2022

If you are struggling to hire data entry roles to help extract data from documents, please take comfort in the fact that you are not alone. Businesses and institutions of all sizes, even the IRS, are challenged by an acute labor shortage:

What Happened?

The above article was published in April 2022. To better understand the reason for this shortage, you will need to read another article from the same month:

With the Great Resignation, there has been a drought of technical talent available for entry-level tech positions. To combat this, technology companies have responded by hiring people from lower-paid, minimum-wage roles and providing them with extensive training and support to perform these entry-level tasks.

According to The Wall Street Journal article above, “many employers from IBM to CVS now say they are happy to help relatively inexperienced new hires get trained up in coding, cybersecurity and healthcare technology to fill positions.”

The problem with this approach is it places the burden of training on operations team managers and executives, diverting their attention from their higher-level responsibilities.

So What Now?

The bottleneck created by this lack of experienced data entry resources significantly impacts productivity and results in revenue being left on the table.

As a result, operations executives face these challenges:

“I cannot hire more people.”
“When someone leaves, my productivity takes a hit.”
“I need to prepare for more people leaving this role over time.”

Let’s look at each of these aspects in a little more detail and identify potential solutions.

Challenge 1: I cannot hire more people

Solution: Increase work capacity without adding more people

If you remember your middle school algebra, you may recall working on labor problems. They sounded something like, “If one person can paint ⅓ of a room in one day, how many people will it take to paint a house with 5 rooms in 7 days?” You have a similar math challenge here, but in this new world, instead of asking “How many people would you need?”, you should ask, “If Iron Man were to do this work, how long would he take?”

Iron Man is one of my favorite superheroes. I am not sure if you have ever thought about this, but of all the superheroes out there, Iron Man is the only one that you can aspire to become. You would have to be born on a different planet to become Superman, or be bitten by a fictional radioactive spider to become Spider-Man. But with access to the right engineering talent and capital, you can become Iron Man.

Anyway, I digress.

Coming back to our math riddle, Iron Man has access to an exoskeleton that helps him increase his basic human capabilities. Without the iron suit, he can lift 20 lbs. of weight. With it, he can lift 200 lbs. If he can jump more than three feet without the suit, he can leap 30+ feet with it. You need something similar for your data entry team. If a data entry team member can process 40 documents an hour, just imagine what they could do with an iron suit.

Intelligent Document Processing (IDP) systems are the iron suits for your team. IDP refers to automating your document capture, classification and extraction process with the help of AI-based technologies such as NLP, machine learning, computer vision and deep learning. With IDP, your team no longer processes every document from scratch. Instead, they review the work of the algorithms and correct it where needed, which can multiply the number of documents each person gets through in a day. Because IDP systems are automated, they also scale faster. When your business volumes pick up, it generally takes a significant amount of time to increase the size of your team, train the new staff and wait for them to become efficient. Scaling an IDP platform simply requires more infrastructure, which is relatively inexpensive and rapidly available.
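To make the iron-suit math concrete, here is a rough back-of-the-envelope sketch. Only the baseline of 40 documents an hour comes from the example above. The 20-second review time and the 10% of documents that need a full manual re-key are illustrative assumptions, not measured figures, and the function name is hypothetical.

```python
def hourly_capacity(manual_rate, review_seconds, fix_fraction):
    """Documents one person clears per hour when reviewing automated output.

    manual_rate    -- documents per hour when keying everything by hand
    review_seconds -- assumed time to check one pre-extracted document
    fix_fraction   -- assumed share of documents that must be re-keyed by hand
    """
    manual_seconds = 3600 / manual_rate
    # Every document costs a review; a fraction also costs a full manual pass.
    avg_seconds = review_seconds + fix_fraction * manual_seconds
    return 3600 / avg_seconds


baseline = 40  # documents per hour without the suit (from the example above)
with_idp = hourly_capacity(manual_rate=baseline, review_seconds=20, fix_fraction=0.1)
print(f"Manual: {baseline}/hour, reviewing IDP output: {round(with_idp)}/hour")
```

Under these assumed numbers, the same person clears roughly three times as many documents. Plug in your own review time and correction rate to see what the suit would do for your team.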

Challenge 2: When someone leaves, my productivity takes a hit

Solution: Retain the knowledge when a person leaves

While we are talking about superheroes, let us linger a little longer in the fantasy world and take a look at another interesting artifact — the pensieve from Harry Potter:

You may recall this from the movies or the books — a pensieve is a bowl where one can store their memories with all the details. You can share your memories with someone else who can enter the pensieve and relive them.

What if you had the means to automatically save someone’s knowledge as they were leaving your data entry team? Wouldn’t it be magical to have someone start by reliving these memories and not spend countless weeks training new staff?

Machine Learning is the pensieve that can help you retain knowledge from your old staff by automatically learning from their every action on a data entry job.

Machine learning algorithms enable IDP systems to learn from training data and from the corrections your team makes each time a document is processed. These self-learning systems rely heavily on the data they are fed, interpreting and learning from past data sets, so extraction results tend to improve each time you provide feedback or corrections. The more you engage with a machine learning system, the better it performs. That is because machine learning systems use statistical methods to learn from examples and then predict, extract and classify the high-value data within a larger set of structured and, more often, semi-structured or unstructured data.

Challenge 3: I need to prepare for more people leaving this role over time

Solution: Get on an efficiency increase treadmill

With an iron suit and a pensieve at your disposal, what else can you add to your repertoire to deal with the challenge of ever increasing demand for efficiency gain? This time, you will find the answer in yet another corner of fantasyland — inside The Matrix.

In the second part of the Matrix trilogy, Agent Smith figures out that he cannot deal with Neo on his own. He needs far too many agents to stand a fighting chance. This makes him figure out an agent replication algorithm that can create thousands of agents on the fly.

A feedback-based continuous learning loop is the replication algorithm for data entry work. A feedback loop is a mechanism that takes the current predictions of the machine learning models in an IDP system, together with the feedback and corrections from your team, and uses them to retrain the models so that future predictions are more accurate. This mechanism keeps the IDP system continuously trained as it matures. Accuracy may reach 90% or more within a few months of operation. The feedback loop works best when corrections are integrated with extraction, so that every fix a reviewer makes is captured and fed back into the models.
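The loop described above can be sketched in a few lines. This is a toy illustration, not how any particular IDP product works: a simple lookup table stands in for a trained model, "retraining" just means folding reviewer corrections back into that table, and the function name and sample values are made up for the example.

```python
def run_feedback_loop(batches):
    """Process batches of (raw_extraction, reviewer_approved_value) pairs.

    Returns the accuracy measured on each batch before that batch's
    corrections are fed back in.
    """
    learned = {}  # raw extraction -> value a reviewer approved earlier
    accuracy_per_batch = []
    for batch in batches:
        correct = 0
        corrections = []
        for raw, approved in batch:
            prediction = learned.get(raw, raw)  # fall back to the raw value
            if prediction == approved:
                correct += 1
            else:
                corrections.append((raw, approved))
        accuracy_per_batch.append(correct / len(batch))
        # "Retrain": fold this batch's corrections in before the next batch.
        learned.update(corrections)
    return accuracy_per_batch


batches = [
    [("1O0.00", "100.00"), ("ACME C0RP", "ACME CORP"),
     ("2022-05-16", "2022-05-16"), ("lnvoice", "Invoice")],
    [("1O0.00", "100.00"), ("ACME C0RP", "ACME CORP"),
     ("2022-05-17", "2022-05-17"), ("T0tal", "Total")],
    [("1O0.00", "100.00"), ("lnvoice", "Invoice"),
     ("T0tal", "Total"), ("2022-05-18", "2022-05-18")],
]
print(run_feedback_loop(batches))  # [0.25, 0.75, 1.0]
```

Each batch scores higher than the last because mistakes corrected once are not repeated. A real system generalizes from corrections rather than memorizing them, but the shape of the loop is the same: predict, review, correct, retrain.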

Do not become an automation casualty

We are going through a transitional phase in automation. The world and businesses will change dramatically over the next several years, and we will see an increasing number of manual tasks performed by AI. However, this is not the first time we have gone through a sea change like this. The Industrial Revolution put us through a similar transformation centuries ago. Business teams that recognize this change and are willing to adapt will realize significant improvements in operations, resulting in much happier customers.

They say, ‘never waste a good crisis’. The labor crisis of today offers a unique window of opportunity to deploy AI-enabled automation. Seize this opportunity today and let it help you deal with the labor shortage. It will dramatically benefit your operations tomorrow.

Originally published at https://www.infrrd.ai.