Rules-Based Data Transformation with Intelligent Document Processing

Oct 15, 2022


Document rules (also known as business rules) are the icing on the cake for an Intelligent Document Processing (IDP) solution. You can enjoy a delicious cake without icing, but a baker can turn a simple cake into a multi-layered piece of art by adding this sweet, adaptable medium. It is the baker's playground for being creative and making the cake unique. Similarly, an IDP solution can give you good extraction results, but if you want to customize or transform the extracted data to fit your process needs, you need document rules.

What is data transformation?

Data transformation is the process of transforming raw data into a clean, consumable format.

When processing documents, an IDP solution extracts the required data. Pre-processing logic then needs to be applied to ensure the output data is in a standard format and ready to be consumed by your downstream processes.

Say you have two teams, one in Asia and another in Europe, processing documents using the same IDP solution. Here, document rules can standardize the results by transforming the extracted data to match pre-defined global settings, such as converting time or currency values to a common format. Essentially, document rules make sure the output from the IDP solution is consumable.
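To make the two-team scenario concrete, here is a minimal Python sketch of a date standardization rule. The region names, the input formats, and the `to_iso_date` helper are all assumptions made for illustration; they are not taken from any particular IDP product.

```python
from datetime import datetime

# Hypothetical per-region input formats; a real deployment would define its own.
REGION_DATE_FORMATS = {
    "europe": "%d/%m/%Y",
    "asia": "%Y/%m/%d",
}


def to_iso_date(raw: str, region: str) -> str:
    """Parse a date using the region's format and return it as YYYY-MM-DD."""
    parsed = datetime.strptime(raw.strip(), REGION_DATE_FORMATS[region])
    return parsed.date().isoformat()


# Both teams' extractions end up in the same global format.
european_date = to_iso_date("15/10/2022", "europe")
asian_date = to_iso_date("2022/10/15", "asia")
```

Whatever format each team's documents use, the downstream process only ever sees one.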

Challenges of data transformation with IDP

One of the key challenges with most IDP solutions on the market is configuring document rules for data transformation. Businesses typically need their IT or tech team, or the vendor, to step in and configure the rules in the backend. This dependency can cause delays, as well as quality issues if the team doing the configuration does not clearly understand your business process and the data formats you need.

This is where self-configurable document rules become a game-changer: you get the final output faster and avoid the hassle of routing the request through an external team.

Document rules enable you to validate and transform extracted data. In other words, they let IDP users or business teams configure and apply business or functional logic to the IDP workflow without any external dependencies.

What can document rules do?

There are a number of areas where document rules add value to the IDP workflow; which ones matter most depends on your business needs. In modern IDP solutions, configurable rules are usually available in the user interface to address some of the most common business needs, such as:

* Transform date format: This rule converts dates to the format you prefer. For example, if you want dates displayed uniformly or want to avoid misinterpretation between mm/dd and dd/mm formats, this rule takes care of it.
* Transform currency format: This is a common data transformation when dealing with multinational companies. For example, you may want to list all amounts along with their ISO currency codes, such as USD or EUR.
* Change string case: Another common data transformation requirement is to bring uniformity to the casing. For example, you may want all merchant names extracted in lowercase, or all in uppercase.
* Transform amount format: This rule defines how amounts should be extracted. For example, if you have customers in the US, Europe, and Asia but want to see the total amount on invoices only in dollars, this rule takes care of it.
* Remove irrelevant characters: If the extracted value contains irrelevant or junk characters, which often happens because of poor document quality, you can set up a document rule that strips these unwanted characters from the final value.
* Defuse text value: This is a high-value document rule that many businesses use. It applies when the extracted value is a combination of multiple values, for example, name_mobilenumber. The rule splits the combined text into its separate values, here the name and the mobile number.
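A few of the rules above can be sketched as small Python functions. This is only an illustration of the idea under stated assumptions: the function names, the allowed-character set, and the `_` separator are hypothetical, and a real IDP solution would expose these as settings in its user interface rather than as code.

```python
import re
from decimal import Decimal


def change_case(value: str, mode: str = "lower") -> str:
    """Bring a text value to a uniform case ('lower' or 'upper')."""
    return value.lower() if mode == "lower" else value.upper()


def remove_irrelevant_characters(value: str, allowed: str = r"A-Za-z0-9 .,\-") -> str:
    """Drop every character outside the allowed set, then trim whitespace."""
    return re.sub(f"[^{allowed}]", "", value).strip()


def defuse_text_value(value: str, separator: str = "_") -> dict:
    """Split a combined value such as name_mobilenumber into its two parts."""
    name, mobile = value.split(separator, 1)
    return {"name": name, "mobile_number": mobile}


def normalize_amount(value: str, decimal_sep: str = ".") -> Decimal:
    """Convert '1,234.56' or, with decimal_sep=',', '1.234,56' to a Decimal."""
    thousands_sep = "," if decimal_sep == "." else "."
    cleaned = value.replace(thousands_sep, "").replace(decimal_sep, ".")
    return Decimal(cleaned)


merchant = change_case(remove_irrelevant_characters("  ACME# Corp.~ "))
contact = defuse_text_value("Jane Doe_5550100")
total = normalize_amount("1.234,56", decimal_sep=",")
```

Each function takes a raw extracted value and returns a cleaned one, so rules like these can be chained in whatever order a field requires.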

Another common business need is to validate the extracted data against a third-party data source, maybe a database or an ERP solution. Document rules are an effective tool to do this validation.
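As a rough sketch of such a validation rule, the Python snippet below checks an extracted vendor name against a list of known vendors. The in-memory list stands in for a database or ERP lookup, and the function name and sample data are hypothetical.

```python
from typing import Iterable


def validate_against_source(value: str, known_values: Iterable[str]) -> bool:
    """Return True if the extracted value matches a known value, ignoring case and padding."""
    normalized = value.strip().casefold()
    return normalized in {v.strip().casefold() for v in known_values}


# Stand-in for vendor names fetched from a database or an ERP system.
erp_vendors = ["Acme Corp", "Globex Ltd"]

is_known_vendor = validate_against_source(" acme corp ", erp_vendors)
is_unknown_vendor = validate_against_source("Initech", erp_vendors)
```

A failed check would typically flag the field for manual review rather than silently passing the value downstream.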

The unique configurable nature of the document rules leaves room for businesses to configure them in many ways to fit their specific needs. They can even create multiple document rules to ensure standardization of not only one, but a multitude of fields.
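One way to picture multiple rules applied across multiple fields is a simple mapping from field names to ordered lists of rules. The sketch below is an assumption about how such a configuration might look in Python; the field names and the `apply_rules` helper are made up for illustration.

```python
from typing import Callable, Dict, List

Rule = Callable[[str], str]

# Hypothetical configuration: each field name maps to an ordered list of rules.
FIELD_RULES: Dict[str, List[Rule]] = {
    "merchant_name": [str.strip, str.lower],
    "invoice_number": [str.strip, str.upper],
}


def apply_rules(extracted: Dict[str, str], field_rules: Dict[str, List[Rule]]) -> Dict[str, str]:
    """Run each field's rules in order; fields without rules pass through unchanged."""
    result = {}
    for field, value in extracted.items():
        for rule in field_rules.get(field, []):
            value = rule(value)
        result[field] = value
    return result


standardized = apply_rules(
    {"merchant_name": "  ACME Corp ", "invoice_number": " inv-001", "notes": "as is"},
    FIELD_RULES,
)
```

Because the configuration is just data, adding a rule for a new field means adding one entry, not changing the processing logic.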

What are the benefits of document rules?

Once correctly identified and configured, document rules can significantly improve the efficiency of your document processing. Some of the key business benefits are:

* Minimal vendor dependency: Document rules allow you to reduce dependency on IDP vendors. For example, if you want to defuse text values, where you may be getting a combination of values from two or more fields, you may have to wait until the IDP partner or your tech team customizes it for you, which could take a few hours or days. But when you can configure this rule from the user interface, you eliminate this dependency and save a ton of effort and time.
* Self-configuration: You know your business process and its data flow better than anyone. When you have the power to configure your own document rules, you significantly reduce the chance of errors compared with relying on a less knowledgeable external team.
* Process improvement: The most important benefit of having document rules in the user interface is that they make your business smarter by offering a faster turnaround time. You configure them only once, but they make data consumable every single time for every single document.

In essence, document rules help apply business logic to extracted data. Self-configurable rules make your business smarter and more efficient because your team can add business logic to the existing extraction workflow with little or no dependency on external resources.

Even though document rules are a high-value feature, many IDP users overlook them when evaluating an IDP solution because they do not understand their tremendous potential. Make sure the IDP solution you choose offers self-configurable document rules for data transformation.

Originally published at




Infrrd has been offering AI as a Service since inception. Their focus is on developing faster Enterprise AI platform using AI, ML & NLP-