Measuring the accuracy of trained models is crucial to the success of your use case. In this article, you will learn what accuracy is, the different types of accuracy, and how accuracy is determined.
What is accuracy?
Accuracy helps you understand how often the system correctly predicts values compared to the actual values that reached consensus during QA. It measures the effectiveness of your model as the proportion of correct predictions out of all predictions made.
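As a rough illustration, accuracy can be thought of as the share of predictions that match the values that reached consensus. The snippet below is a minimal sketch of that calculation; the field names and sample values are hypothetical, not part of the product.

```python
# Minimal sketch: accuracy as the proportion of predictions that match
# the value that reached consensus during QA. Sample data is hypothetical.
predictions = {"invoice_total": "410.00", "invoice_date": "2024-03-01", "po_number": "8821"}
consensus   = {"invoice_total": "410.00", "invoice_date": "2024-03-01", "po_number": "8812"}

correct = sum(1 for field, value in predictions.items() if consensus.get(field) == value)
accuracy = correct / len(predictions)
print(f"Accuracy: {accuracy:.0%}")  # 2 of 3 predictions correct -> 67%
```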
Accuracy may be impacted by factors like imbalanced datasets (e.g., insufficient examples of documents with similar visual layouts) or inconsistencies in annotations. That’s why accuracy measurement relies on Quality Assurance (QA) tasks. When QA tasks are enabled, humans can provide feedback to the machine, allowing it to improve accuracy over time by learning the true content of each piece of data. Learn more in our What is Quality Assurance? article.
Types of accuracy
Machine Accuracy
Machine accuracy indicates how accurately a specific model predicts the correct value for a given task. This metric varies depending on the type of task the model is performing:
Classification - reflects the model’s capability to correctly predict the layout to which a page belongs.
Identification - assesses the model’s ability to predict the precise positions of tables or fields within a document.
Transcription - measures the model’s ability to accurately transcribe text.
Machine Accuracy is computed only on confident predictions that have been sampled for QA and for which consensus on the correct value has been reached.
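The sketch below shows one way to picture that restriction, assuming a simple record per prediction; the record structure (confident, sampled_for_qa, consensus_value) and the sample data are hypothetical.

```python
# Sketch: Machine Accuracy counts only confident predictions that were
# sampled for QA and for which a consensus value was reached.
predictions = [
    {"machine_value": "ACME Corp",  "confident": True,  "sampled_for_qa": True,  "consensus_value": "ACME Corp"},
    {"machine_value": "2024-01-15", "confident": True,  "sampled_for_qa": True,  "consensus_value": "2024-01-16"},
    {"machine_value": "77.10",      "confident": True,  "sampled_for_qa": False, "consensus_value": None},  # not sampled -> excluded
    {"machine_value": "N/A",        "confident": False, "sampled_for_qa": True,  "consensus_value": "12"},  # not confident -> excluded
]

eligible = [p for p in predictions
            if p["confident"] and p["sampled_for_qa"] and p["consensus_value"] is not None]
correct = sum(1 for p in eligible if p["machine_value"] == p["consensus_value"])
machine_accuracy = correct / len(eligible) if eligible else None
print(machine_accuracy)  # 0.5 -> 1 of the 2 eligible predictions matched consensus
```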
Manual Accuracy
Manual accuracy refers to the accuracy of a task that relies on human input. It involves assessing the correctness of human-generated decisions in comparison to the ground truth of your data.
Classification - reflects the percentage of data keyers’ correct decisions about the layout to which a page belongs.
Identification - assesses data keyers’ precision in determining the positions of tables or fields within a document.
Transcription - measures data keyers’ accuracy when transcribing text.
Manual Accuracy reflects data keyers’ input from both the submission itself and the QA task.
Determining accuracy
To determine the accuracy of the machine or a data keyer, the system requires QA consensus. Consensus means that two identified locations or transcriptions for a field or table must match. If the machine’s prediction or the data keyer’s value matches the consensus value reached in QA, it is considered accurate. Learn more about consensus in Scoring Field Identification Accuracy and Scoring Transcription Accuracy.
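To make the consensus rule concrete, here is a small sketch. It assumes consensus simply means two independent answers for the same field agree exactly; the real matching rules (for example, for locations) may be more tolerant, and the function and values below are hypothetical.

```python
# Sketch: consensus requires two independent answers for a field to match.
# A machine prediction (or a keyer's value) is scored as accurate when it
# matches the consensus value. Exact string matching is an assumption here.
def reach_consensus(answer_a: str, answer_b: str):
    """Return the consensus value if the two answers match, otherwise None."""
    return answer_a if answer_a == answer_b else None

consensus = reach_consensus("1,250.00", "1,250.00")  # -> "1,250.00"
machine_prediction = "1,250.00"
is_accurate = consensus is not None and machine_prediction == consensus
print(is_accurate)  # True
```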
Accuracy and automation are a tradeoff: when accuracy increases, automation decreases. To achieve higher accuracy, the model requires a higher level of certainty before relying entirely on machine transcription, so more fields, even those with high confidence, still need human checking.
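One way to picture this tradeoff is as a confidence threshold: raising the threshold sends more fields to human review, which lowers automation. The confidence scores and threshold values in the sketch below are hypothetical illustrative numbers.

```python
# Sketch: a higher confidence threshold means fewer fields are automated
# (accepted without human review).
confidences = [0.99, 0.97, 0.93, 0.88, 0.81, 0.74]

def automation_rate(scores, threshold):
    automated = [s for s in scores if s >= threshold]
    return len(automated) / len(scores)

print(automation_rate(confidences, 0.80))  # ~0.83 -> 5 of 6 fields automated
print(automation_rate(confidences, 0.95))  # ~0.33 -> only 2 of 6 automated; the rest go to humans
```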
Accuracy reports
Machine Accuracy vs Manual Accuracy report
The chart comparing Manual Accuracy and Machine Accuracy shows metrics for both data keyers and the machine over a chosen period. It's important to note that the chart displays accuracy based on occurrences rather than individual fields.
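In other words, if the same field appears several times across documents, each occurrence is counted separately rather than once per unique field. A hypothetical sketch:

```python
# Sketch: occurrence-level accuracy counts every occurrence of a field,
# not each unique field once. Data below is hypothetical.
occurrences = [
    ("invoice_total", True),   # correct occurrence
    ("invoice_total", False),  # incorrect occurrence of the same field
    ("invoice_total", True),
    ("invoice_date",  True),
]
occurrence_accuracy = sum(ok for _, ok in occurrences) / len(occurrences)
print(occurrence_accuracy)  # 0.75 -> 3 of 4 occurrences correct
```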
Learn more about the report in Manual Accuracy vs Machine Accuracy.
Document Output Accuracy report
Document Output Accuracy is determined by the final transcription of a specific field or cell extracted during submission processing, regardless of whether the transcription was performed by a human or a machine. This report focuses on the correctness of the transcribed content.
In Structured documents, this value represents the Transcription Accuracy of the output.
In Semi-structured documents:
If a field/cell is sampled for ID QA, and it's determined that the location differs from the one extracted during submission, it won't be included in the report.
However, if a field/cell is sampled for Transcription QA, and it's found that the transcription differs from the one extracted during submission, it will be included in the report.
For fields, the chart shows accuracy on an occurrence level.
Below the chart, you'll see the Field Accuracy percentage and the Table Cell Accuracy percentage, which reflect the average accuracy for the chosen date range. Learn more in Document Output Accuracy.
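As a rough sketch of the inclusion rules described above for Semi-structured documents, assume each sampled field/cell record carries its QA type and whether the QA result differed from the original extraction; all record names and data below are hypothetical.

```python
# Sketch: fields/cells whose *location* was corrected in ID QA are excluded
# from the report, while those whose *transcription* was corrected in
# Transcription QA are included (and count as incorrect).
samples = [
    {"kind": "field", "qa": "identification", "location_changed": True,  "transcription_correct": None},   # excluded
    {"kind": "field", "qa": "transcription",  "location_changed": False, "transcription_correct": False},  # included, incorrect
    {"kind": "field", "qa": "transcription",  "location_changed": False, "transcription_correct": True},   # included, correct
    {"kind": "cell",  "qa": "transcription",  "location_changed": False, "transcription_correct": True},   # included, correct
]

included = [s for s in samples if not (s["qa"] == "identification" and s["location_changed"])]

def output_accuracy(records, kind):
    scored = [r for r in records if r["kind"] == kind]
    return sum(r["transcription_correct"] for r in scored) / len(scored) if scored else None

print(output_accuracy(included, "field"))  # 0.5 -> Field Accuracy
print(output_accuracy(included, "cell"))   # 1.0 -> Table Cell Accuracy
```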