Detecting and Correcting Anomalies in Field Annotations

Even if all your keyers go through the same training in annotating fields for a given use case, they may not always annotate the same fields consistently. Inconsistencies in annotations can impact model performance, and finding the cause of the reduced performance can take considerable time and effort. With Labeling Anomaly Detection, the system finds potential anomalies in field annotations and highlights them for review. You can then review these potential anomalies and either mark them as correct or make any necessary adjustments to them.

Limitations of Labeling Anomaly Detection in v37

  • Anomalies can be detected only for fields with a single occurrence and a single bounding box. Anomalies in annotations for fields with multiple occurrences or multiple bounding boxes are not flagged, so those annotations may still contain inconsistencies.

  • Even if you use Labeling Anomaly Detection to find and correct anomalies, we still recommend that you complete model validation tasks (MVTs) for your Field Identification models.

  • You can run Labeling Anomaly Detection for up to 5,000 pages at a time.
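To make the first and third limitations concrete, here is a minimal Python sketch for planning a run outside the application. It is not part of the product: the FieldAnnotation and Document records, the is_covered and batch_by_page_limit helpers, and the idea of dividing documents into separate runs are all assumptions made for illustration. Only the 5,000-page limit and the single-occurrence, single-bounding-box rule come from the list above.

```python
from dataclasses import dataclass

MAX_PAGES_PER_RUN = 5000  # page limit stated in the limitations list


@dataclass
class FieldAnnotation:
    """Hypothetical record of how one field is annotated in one document."""
    field_name: str
    occurrences: int      # times the field is annotated in the document
    bounding_boxes: int   # total bounding boxes drawn for the field


@dataclass
class Document:
    """Hypothetical training document with a known page count."""
    doc_id: str
    page_count: int


def is_covered(annotation):
    """True if the field meets the single-occurrence, single-box rule."""
    return annotation.occurrences == 1 and annotation.bounding_boxes == 1


def batch_by_page_limit(documents, max_pages=MAX_PAGES_PER_RUN):
    """Group whole documents into batches that stay within the page limit.

    Assumption: documents are never split across batches.
    """
    batches, current, pages = [], [], 0
    for doc in documents:
        if doc.page_count > max_pages:
            raise ValueError(f"{doc.doc_id} alone exceeds {max_pages} pages")
        if current and pages + doc.page_count > max_pages:
            batches.append(current)   # close the full batch
            current, pages = [], 0
        current.append(doc)
        pages += doc.page_count
    if current:
        batches.append(current)
    return batches
```

For example, documents of 3,000, 2,500, and 2,000 pages would be planned as two runs: the first document alone, then the other two together. A field annotated once with one bounding box is covered, while a field with three occurrences, or one occurrence spanning two boxes, is not.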

Before using Labeling Anomaly Detection, make sure that you've:

  • uploaded the required number of training documents,

  • analyzed the data to create groups of similar documents, and

  • annotated the documents to identify the fields whose data you want to extract.

Using Labeling Anomaly Detection involves two main steps:

  1. Detect potential anomalies.

  2. Review the potential anomalies and make any necessary corrections.

1. Detect potential anomalies

  1. Go to Library > Models, and select Identification Models from the drop-down list at the top of the page.

  2. Find the Field ID model you want to check for anomalies, and click its name.

  3. In the Training Data Analysis card, click Analyze Data.

If anomalies were detected during the analysis:

  • a Documents with Potential Anomalies card appears in the upper-right corner of the page, indicating how many documents have potential anomalies, and

  • each document containing potential anomalies is highlighted with a yellow bar on the left side of its entry in the Training Documents table.

[Screenshot: Documents with Potential Anomalies card]

2. Review potential anomalies and make any necessary corrections

  1. Above the Training Documents card, click Filters, and then select a group from the Group drop-down list.

  2. Click Apply Filters.

  3. Click the Edit Annotations link for a document highlighted as having potential anomalies.

  4. Review one of the annotations highlighted as being a potential anomaly.

    • If the annotation is incorrect and you know how the same field was annotated in other documents in the group, correct the field's annotation, and click Save Changes.

    • If the annotation is correct, click on the annotation, click Field appears correct, and then click Save Changes.

[Screenshot: reviewing an annotation flagged as a potential anomaly]

    • If you're not sure why the field's annotation doesn't match annotations in other documents, click the X in the upper-right corner of the page, and review other documents in the group to see what typical annotations of the field look like. When you've determined how the field should be annotated, return to the document highlighted as having potential anomalies. Correct the field's annotation or mark it as correct, and click Save Changes.

  5. Repeat step 4 for each field in the document highlighted as being a potential anomaly.

  6. Repeat steps 3-5 for each document in the group that is highlighted as having potential anomalies.

  7. Repeat steps 1-6 for each group of documents.
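The repeat instructions above amount to a nested loop: every group, then every flagged document in the group, then every flagged field in the document. The Python sketch below is one way to keep a checklist of that order while you work through the review. It is only an illustration: the flagged dictionary, the review_order and remaining helpers, and the sample group, document, and field names are hypothetical and do not correspond to anything the application exports.

```python
def review_order(flagged):
    """Yield (group, document, field) in the order the review visits them.

    flagged is a hypothetical mapping: {group: {doc_id: [field_name, ...]}}
    """
    for group, documents in flagged.items():       # each group of documents
        for doc_id, fields in documents.items():   # each flagged document
            for field_name in fields:              # each flagged field
                yield (group, doc_id, field_name)


def remaining(flagged, reviewed):
    """Return the flagged items that have not been reviewed yet."""
    return [item for item in review_order(flagged) if item not in reviewed]


# Hypothetical example data, for illustration only.
flagged = {
    "Group A": {"doc-1": ["total", "date"], "doc-2": ["total"]},
    "Group B": {"doc-3": ["vendor"]},
}
reviewed = {("Group A", "doc-1", "total")}
todo = remaining(flagged, reviewed)
```

With this sample data, todo holds the three items still to review, in group, document, and field order. An item counts as reviewed whether its annotation was corrected or marked as correct.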

After completing these two main steps, we recommend clicking Analyze Data again to make sure all anomalies have been corrected and that no new anomalies have been introduced.

Note that fields marked as correctly annotated may still be flagged as anomalies during future analyses of training data.