Improving Layout Performance

Layout performance reporting allows users to review machine errors on pages matched to a given layout version for a Structured document. This can help users understand whether changes should be made to field definitions to improve automation and transcription performance.

For example, consider a field with a Numeric data type but whose values commonly contain letters. If an iteration of that field is sampled for QA, the QA process will likely score the machine’s transcription of that field as a machine error.

By comparing the machine’s response, the correct response as per QA, the field configuration, and the field crop, a user may observe that the data type chosen for that field is a poor fit for its likely values and edit the field definition. This will help to improve the machine’s accuracy and increase automation for future pages.
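The data type check described above can be approximated with a short script. The following is a minimal sketch in Python, not part of the product: it assumes the QA-corrected values for a field have been collected into a plain list of strings, and the function name `share_with_letters` and the sample values are hypothetical.

```python
def share_with_letters(qa_values):
    """Return the fraction of QA-corrected values that contain at least one letter."""
    if not qa_values:
        return 0.0
    with_letters = sum(1 for value in qa_values if any(ch.isalpha() for ch in value))
    return with_letters / len(qa_values)


# Hypothetical QA-corrected values for a field configured as Numeric.
qa_values = ["A1023", "48213", "B7781", "C0045"]
ratio = share_with_letters(qa_values)
print(f"{ratio:.0%} of sampled values contain letters")
```

A high share of values with letters suggests that the Numeric data type is a poor fit for the field and that a data type allowing letters may be more appropriate.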

Example findings and improvements

Below are a few other example findings from layout performance reporting, along with the improvement each one suggests:

  • As described in the section overview, consider changing a field’s data type if machine errors often involve missing or incorrectly transcribed characters. For example, if a field is configured to use the Numeric data type, but errors show the machine transcribing an “S” as “5”, consider changing the data type to one that allows letters.

  • If pre-printed slashes, dashes, dots, and so on are often left out by the machine but included by human keyers, ensure the dropout setting is disabled for the field.

  • For errors on larger fields, review the size and appearance of the field crop and ensure that the multiline setting is enabled if multiple lines of text are often present.

  • The recommended document resolution for scanning is 200 dpi. Pages scanned below 200 dpi may still achieve high levels of accuracy and automation, but results are generally more accurate and more automated at the recommended resolution.

  • If characters at the end of a value are often missing from the machine’s transcription but included by human keyers, the field’s bounding box may need to be enlarged to accommodate the full value.
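Several of the findings above follow recognizable patterns when the machine’s response is compared with the correct response from QA. The following is a minimal sketch in Python of how such pairs could be sorted into coarse categories. It assumes the pairs have been collected as plain strings; the function name `classify_error`, the category labels, and the sample pairs are all hypothetical, and the heuristics do not cover the multiline or resolution findings, which require inspecting the field crop and the scanned image.

```python
from collections import Counter


def _digit_letter_swap(m, q):
    """True if one character is a digit and the other is a letter."""
    return (m.isdigit() and q.isalpha()) or (m.isalpha() and q.isdigit())


def classify_error(machine, qa):
    """Map one machine/QA pair to a coarse, hypothetical finding category."""
    if machine == qa:
        return "no_error"
    # Separators such as slashes or dashes were dropped by the machine:
    # removing them from the QA value reproduces the machine value.
    if "".join(ch for ch in qa if ch.isalnum() or ch.isspace()) == machine:
        return "missing_punctuation"
    # The machine value is a proper prefix of the QA value,
    # so characters at the end of the value are missing.
    if machine and qa.startswith(machine):
        return "truncated_end"
    # Same length, and every differing position swaps a digit for a letter
    # (for example "5" for "S"), which hints at a data type mismatch.
    if len(machine) == len(qa) and all(
        m == q or _digit_letter_swap(m, q) for m, q in zip(machine, qa)
    ):
        return "letter_digit_confusion"
    return "other"


# Hypothetical (machine response, QA response) pairs.
pairs = [
    ("5MITH", "SMITH"),
    ("12312020", "12/31/2020"),
    ("Springfi", "Springfield"),
    ("cat", "dog"),
]
counts = Counter(classify_error(machine, qa) for machine, qa in pairs)
for category, count in counts.most_common():
    print(category, count)
```

A category that dominates the counts for a field points to the setting worth reviewing first: the data type for letter and digit confusion, the dropout setting for missing punctuation, and the bounding box for truncated endings.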