Transcription Accuracy and Automation

Accuracy is the proportion of data where machine predictions are correct, that is, where the predictions match the actual value that the data represents. To determine the true value of a piece of data, the machine uses Quality Assurance (QA) tasks. When QA is enabled, humans can give feedback to the machine and improve accuracy over time.

Transcription accuracy and automation

Transcription accuracy is measured on a field level. For example, if you have a Social Security number that is “123-45-6789” and the machine transcribes the SSN as “123-45-678”, the machine will be scored as 0% accurate, even though 8 of the 9 digits are correctly transcribed. Transcription is considered to be accurate only if the full field is correctly transcribed. For the machine to have 100% transcription accuracy, all 9 digits of the SSN need to be transcribed correctly.
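As a rough illustration of field-level scoring, the sketch below counts a field as correct only when the whole transcription matches the true value. The function name `field_accuracy` and its list-based inputs are assumptions made for this example, not part of the product.

```python
def field_accuracy(predictions, ground_truth):
    """Return the share of fields whose transcription matches exactly.

    Hypothetical helper: a field scores as correct only on a full match,
    so a partially correct transcription counts as an error.
    """
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not predictions:
        raise ValueError("at least one field is required")
    correct = sum(1 for pred, truth in zip(predictions, ground_truth) if pred == truth)
    return correct / len(predictions)


# The SSN example from the text: one missing digit makes the field wrong.
print(field_accuracy(["123-45-678"], ["123-45-6789"]))  # 0.0
```

No partial credit is given for the eight digits that were read correctly, which mirrors the all-or-nothing rule described above.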

Accuracy and automation

Automation is defined as the proportion of data where the machine is confident enough to make a prediction without human supervision. An automated field is a field that a human does not need to review or transcribe on their own. If 90% of the fields are automated, you only need human effort to process 10% of the fields.

Accuracy and automation are related. With a higher accuracy target, the machine sends more fields to humans for transcription or review. In other words, you can trade lower automation for higher accuracy, and vice versa.
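The trade-off can be illustrated with a small sketch. It assumes each QA record is a `(confidence, is_correct)` pair and that a field is automated when its confidence is at or above a threshold. The data, the function name, and the threshold rule are invented for this example and are not taken from the product.

```python
def automation_and_accuracy(records, threshold):
    """Return (automation, accuracy of automated fields) at a threshold.

    records: list of (confidence, is_correct) pairs (hypothetical QA data).
    accuracy is None when no field is automated at the given threshold.
    """
    if not records:
        raise ValueError("at least one record is required")
    automated = [is_correct for confidence, is_correct in records if confidence >= threshold]
    automation = len(automated) / len(records)
    accuracy = sum(automated) / len(automated) if automated else None
    return automation, accuracy


# Made-up QA results, for illustration only.
qa = [(0.99, True), (0.97, True), (0.95, True), (0.90, False), (0.85, True),
      (0.80, True), (0.70, False), (0.60, True), (0.50, False), (0.40, False)]

for threshold in (0.50, 0.95):
    automation, accuracy = automation_and_accuracy(qa, threshold)
    print(f"threshold={threshold:.2f} automation={automation:.0%} accuracy={accuracy:.1%}")
```

With this sample data, raising the threshold from 0.50 to 0.95 lowers automation from 90% to 30% while the accuracy of the automated fields rises.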

Accuracy is continuously measured, and the machine adapts to customer data every day. We will take a look at how you can improve both transcription accuracy and automation in the sections below.

Accuracy targets

Hyperscience provides the ability to set accuracy targets for document transcription once certain conditions are met. You can also set accuracy targets for individual fields in Structured documents (v40 and later) and for individual fields and table columns in Semi-structured documents (v40.1 and later). These targets will direct both the confidence thresholds and automation levels.

The benefits of setting accuracy targets are the following:

Accuracy targets enable both fine-tuning and auto-thresholding. Read more about fine-tuning and auto-thresholding in the sections below.
Accuracy targets lead to increased automation as the model improves its decisions about which fields to send to Supervision.

Each field type can have a different accuracy target. The minimum number of QA records required for each field type is:

For Structured text - 5000 fields
For Semi-structured text - 2000 fields
For checkboxes - 2000 fields
For signatures - 2000 fields
For table text - 2000 table cells
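The minimums listed above can be written as a simple lookup. This is only a sketch: the dictionary keys and the helper `has_enough_qa_records` are hypothetical names chosen for the example, while the numbers come from the list above.

```python
# Minimum QA records per field type, as listed above.
# The key names are hypothetical labels chosen for this example.
MIN_QA_RECORDS = {
    "structured_text": 5000,
    "semi_structured_text": 2000,
    "checkbox": 2000,
    "signature": 2000,
    "table_text": 2000,  # counted in table cells
}


def has_enough_qa_records(field_type, record_count):
    """Return True if record_count meets the minimum for field_type."""
    return record_count >= MIN_QA_RECORDS[field_type]


print(has_enough_qa_records("structured_text", 4200))  # False
print(has_enough_qa_records("checkbox", 2000))         # True
```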

The accuracy targets are all configurable on a flow level. You can access these settings by clicking on Flows in the left-hand sidebar and clicking on the name of a flow. You can set accuracy targets for Structured or Semi-structured text only if the Transcription Automation Training setting is enabled for that document type. To learn more about Transcription Automation Training settings, see the “Structured Document Transcription” and “Semi-structured Document Transcription” sections in Flow Settings.

Period of records to use is a flow setting that applies to all field types, with separate Period of records to use settings for Structured and Semi-structured documents. With it, you decide how many days in the past to pull QA data from. You need to maintain the minimum number of QA records for each field during the period you set.
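The idea of a look-back period can be sketched as a date filter. The example assumes each QA record carries a completion date and that the period is counted back in whole days from today, inclusive. Both are assumptions made for illustration, and `records_in_period` is a hypothetical helper.

```python
from datetime import date, timedelta


def records_in_period(record_dates, period_days, today):
    """Keep the record dates that fall within the last period_days days.

    Hypothetical helper: the window runs from (today - period_days)
    through today, both ends included.
    """
    cutoff = today - timedelta(days=period_days)
    return [d for d in record_dates if cutoff <= d <= today]


dates = [date(2023, 12, 31), date(2024, 1, 1), date(2024, 1, 15)]
recent = records_in_period(dates, period_days=30, today=date(2024, 1, 31))
print(len(recent))  # 2
```

The count of records left after filtering is what would need to stay at or above the minimum for the field type.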

Targets for transcriptions of specific fields or table columns

This section describes a feature that is available for Structured fields in v40.0 and later and for Semi-structured fields and table columns in v40.1 and later.

If you have certain fields or table columns (a.k.a. “entries”) that require a higher or lower level of accuracy than the rest of the entries in a flow’s release, you can set individual accuracy targets for each of those entries. For example, if you’re processing documents that contain names and Social Security numbers, and their transcriptions need to be as accurate as possible, you can set accuracy targets for these fields that are higher than those set for your documents, overall. Doing so eliminates the need for higher accuracy standards to be applied to the remaining entries, which may reduce the number of Transcription Supervision tasks that are generated.

Setting a particular entry’s Target Accuracy to 99%, for example, does not guarantee that 99% accuracy will always be reached. The feature uses the accuracy thresholds of the entire Transcription Fine-tuning model, which apply across all entries and are not available per entry. Even so, setting an entry’s Target Accuracy to 99% guarantees that the entry will have a higher accuracy threshold than it would under the Structured Text Target Accuracy or Semi-structured Text Target Accuracy set for all fields.

No additional QA tasks are generated if you set accuracy targets at the entry level, and the sampling rates specified in flows’ settings still apply. The Period of records to use flow setting also applies to calculations of entry-level accuracy, as it does to calculations of transcription accuracy at the document level.

Editing targets in the Field Dictionary

You can set field-level accuracy targets for Structured documents in the Field Dictionary. These targets are applied anywhere the field appears in your documents. If you would like the accuracy target to apply only when the field is processed by a particular flow, you can set the target on the flow level. A field-level accuracy target set at the flow level overrides one set in the Field Dictionary, if any. As you set accuracy targets for fields at the flow level, a list of available fields appears, along with any accuracy targets entered in the Field Dictionary. To learn how to set field-level accuracy targets in the Field Dictionary and at the flow level, see Editing Fields in the Field Dictionary and Flow Settings, respectively.
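The override order described above can be sketched as a lookup with fallbacks. The function `effective_target`, the dictionaries, and the field names are hypothetical. Falling back to a document-wide target when no field-level target exists is an assumption of this example.

```python
def effective_target(field, flow_targets, dictionary_targets, default_target):
    """Resolve the accuracy target that applies to a field in one flow.

    Hypothetical resolution order:
      1. field-level target set at the flow level,
      2. field-level target set in the Field Dictionary,
      3. the document-wide target (assumed fallback).
    """
    if field in flow_targets:
        return flow_targets[field]
    if field in dictionary_targets:
        return dictionary_targets[field]
    return default_target


# Illustrative values only.
dictionary_targets = {"ssn": 0.99, "last_name": 0.98}
flow_targets = {"ssn": 0.995}

print(effective_target("ssn", flow_targets, dictionary_targets, 0.95))        # 0.995
print(effective_target("last_name", flow_targets, dictionary_targets, 0.95))  # 0.98
print(effective_target("city", flow_targets, dictionary_targets, 0.95))       # 0.95
```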

Machine confidence

Machine confidence is an internal non-configurable number. The machine confidence numbers only have meaning inside the specific machine learning models. Note that machine confidence is different from accuracy and probability. The relationship between accuracy and confidence is not straightforward. For example, fields with machine confidence of 0.8726 are not 87% accurate or 87% likely to be accurate, and 0.8726 does not mean 87% confidence.

The confidence scores and confidence thresholds are automatically adjusted over time.
You control the accuracy requirement – you indicate how accurate you want the transcription to be. Hyperscience adjusts all the other metrics, so you achieve the desired accuracy.
Let’s say you require 99.5% accuracy. This accuracy target may lead to 75% automation. When the machine starts learning from QA results, the automation may increase to 94% while maintaining the same level of accuracy. You can learn more about improving automation while keeping the desired level of accuracy in the section below.

Confidence distribution from QA data

Results from QA tasks are used to create a confidence distribution. In the example below, you can see such a confidence distribution:

As mentioned in the previous section, confidence values have no meaning outside of the particular machine learning models. If a field is transcribed with 0.96 confidence, this does not mean that the model is 96% sure that the data was transcribed correctly. As observed in the chart above, 0.96 in this particular confidence distribution corresponds to an accuracy of about 99.74% (100% - 0.26%). Therefore, if you set a 99.5% overall accuracy target, which leaves an error budget of 0.5%, the model picks a threshold at which the number of errors divided by the total number of fields is 0.5%.
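One way to picture threshold selection from QA data is sketched below. It is not the product's actual algorithm. It assumes QA records are `(confidence, is_correct)` pairs, measures accuracy over the fields at or above a candidate threshold, and returns the lowest threshold that still meets the target, which keeps automation as high as possible. The name `pick_threshold` and the sample data are hypothetical.

```python
def pick_threshold(records, target_accuracy):
    """Return the lowest confidence threshold whose automated fields meet the target.

    records: list of (confidence, is_correct) pairs from QA (hypothetical data).
    Returns None if no threshold reaches the target accuracy.
    """
    ordered = sorted(records, key=lambda r: r[0], reverse=True)
    best = None
    correct = 0
    for count, (confidence, is_correct) in enumerate(ordered, start=1):
        if is_correct:
            correct += 1
        # Evaluate a threshold only once all records sharing this confidence are counted.
        last_of_tie = count == len(ordered) or ordered[count][0] != confidence
        if last_of_tie and correct / count >= target_accuracy:
            best = confidence
    return best


# Made-up QA results, for illustration only.
qa_records = [(0.99, True), (0.97, True), (0.95, True), (0.90, False), (0.85, True),
              (0.80, True), (0.70, False), (0.60, True), (0.50, False), (0.40, False)]

print(pick_threshold(qa_records, 0.80))  # 0.8
print(pick_threshold(qa_records, 0.99))  # 0.95
```

A stricter target moves the threshold up, so fewer fields are automated and more are sent for human review.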

Fine-tuning and auto-thresholding

QA tasks allow the confidence distribution to further adapt to the data the model sees in your instance. To enforce verified results with very little ground truth error, we use consensus. To learn more about consensus, see Scoring Transcription Accuracy.

Fine-tuning is the process of Transcription Automation Training. During fine-tuning, machine learning models use verified QA tasks to gain new fine-tuned confidence, based on the observed ground truth and the results from current and previous QA tasks. Ground truth is used to fine-tune the models by:

Decreasing confidence on high-confidence errors - cases where the machine incorrectly transcribed fields with high confidence. The machine learns from these high-confidence errors and improves accuracy.
Increasing confidence on low-confidence correct extractions - cases where the machine correctly transcribed fields but the fields were still sent to Supervision. The machine learns from these low-confidence correct extractions and improves automation.
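The two cases above can be identified in QA data with a simple filter. The sketch only finds the cases; it does not show how the model updates its confidence, which the text does not describe. It assumes `(confidence, is_correct)` records and a single threshold that separates automated fields from those sent to Supervision. `split_fine_tuning_cases` is a hypothetical name.

```python
def split_fine_tuning_cases(records, threshold):
    """Return (high_confidence_errors, low_confidence_correct) for a threshold.

    records: list of (confidence, is_correct) pairs (hypothetical QA data).
    """
    high_confidence_errors = [
        r for r in records if r[0] >= threshold and not r[1]
    ]
    low_confidence_correct = [
        r for r in records if r[0] < threshold and r[1]
    ]
    return high_confidence_errors, low_confidence_correct


# Made-up QA results, for illustration only.
sample = [(0.98, True), (0.96, False), (0.70, True), (0.40, False)]
errors, missed = split_fine_tuning_cases(sample, threshold=0.90)
print(errors)  # [(0.96, False)]
print(missed)  # [(0.7, True)]
```

Records in the first list hurt accuracy, while records in the second list represent fields that could have been automated.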

The original confidence is the confidence the model has in reading fields correctly. The fine-tuned confidence incorporates both the original confidence as well as all QA data, thus making the fine-tuned confidence more accurate.

Using fine-tuned confidence, the model creates a new confidence threshold in a process called auto-thresholding. Auto-thresholding is one way our machine learns and improves over time. Auto-thresholding is scheduled to run nightly and whenever you edit a flow’s Structured Document Transcription settings or Semi-structured Document Transcription settings.

With learnings from QA tasks, machine accuracy can be measured across the confidence spectrum.
Thresholds are automatically set, based on the desired level of accuracy.
To reach the desired level of accuracy, the model adjusts the thresholds automatically.