Scoring Transcription Accuracy

Similar to Field ID QA, the system measures transcription accuracy by establishing Quality Assurance (QA) consensus: two entries for a given field must have matching normalized transcriptions. When a transcription from the machine (at high confidence) or from a data keyer matches the QA consensus, that entry is marked as accurate.
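The consensus rule above can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: the source does not define how transcriptions are normalized, so the normalize function below (trim, collapse whitespace, ignore case) is an assumption, as are the names find_consensus and is_accurate.

```python
from collections import Counter
from typing import List, Optional


def normalize(text: str) -> str:
    """Hypothetical normalization: collapse whitespace and ignore case.

    The real normalization rules are not described in this document.
    """
    return " ".join(text.split()).casefold()


def find_consensus(entries: List[str]) -> Optional[str]:
    """Return the normalized value shared by at least two entries, else None."""
    counts = Counter(normalize(entry) for entry in entries)
    if not counts:
        return None
    value, count = counts.most_common(1)[0]
    return value if count >= 2 else None


def is_accurate(entry: str, consensus: str) -> bool:
    """An entry is accurate when its normalized form equals the consensus."""
    return normalize(entry) == consensus
```

Under these assumptions, two entries that differ only in spacing or letter case would still establish consensus, while "William Zamora ODS" would be marked inaccurate against a consensus of "William Zamora DDS".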

With Release 25.0.0, the system scores every human entry for Field ID. However, as in previous releases, there are specific instances where an entry is not scored:

  • The machine is scored only when its confidence is above the Transcription threshold. If the confidence is below the threshold (that is, the field was sent to Supervision), the machine is not scored for accuracy. However, its response may still be used to establish consensus and measure human accuracy.

  • When a field requires human intervention during processing (e.g., Supervision transcription), the machine entry is not scored. This means that if a field is marked as Consensus required, the machine entry will not be scored, even if it is used to establish consensus.
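As a rough sketch of how these rules combine, the following Python labels each entry once consensus is known. The Entry class, the score_entries function, and the numeric threshold are all hypothetical names and values chosen for illustration. The sketch assumes entries are already normalized and treats a confidence exactly equal to the threshold as not scored, which the text above does not specify.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    source: str                         # "machine" or "human" (hypothetical labels)
    text: str                           # assumed to be already normalized
    confidence: Optional[float] = None  # only meaningful for machine entries


def score_entries(entries: List[Entry], consensus: str, threshold: float) -> List[str]:
    """Label each entry as Accurate, Inaccurate, or Not Measured."""
    labels = []
    for entry in entries:
        if entry.source == "machine" and not (
            entry.confidence is not None and entry.confidence > threshold
        ):
            # Low-confidence machine entries were sent to Supervision,
            # so the machine is not scored for this field.
            labels.append("Not Measured")
        elif entry.text == consensus:
            labels.append("Accurate")
        else:
            labels.append("Inaccurate")
    return labels
```

A low-confidence machine entry is labeled Not Measured regardless of its text, while every human entry is compared against the consensus value.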

Scoring transcription

Here are a few example scenarios:

Example 1

Extraction source | Entry              | Accuracy
----------------- | ------------------ | --------
Machine           | William Zamora DDS | Accurate
QA (Keyer 1)      | William Zamora DDS | Accurate
Consensus         | William Zamora DDS | --

Example 2

Extraction source                   | Entry                               | Accuracy
----------------------------------- | ----------------------------------- | ------------
Machine                             | William Zamora OOS (low confidence) | Not Measured
Supervision Transcription (Keyer 1) | William Zamora DDS                  | Accurate
QA (Keyer 2)                        | William Zamora ODS                  | Inaccurate
QA (Keyer 3)                        | William Zamora DDS                  | Accurate
Consensus                           | William Zamora DDS                  | --

Example 3

Extraction source     | Entry                               | Accuracy
--------------------- | ----------------------------------- | ------------
Machine               | William Zamora 00S (low confidence) | Not Measured
Supervision (Keyer 1) | William Zamora DDS                  | Accurate
QA (Keyer 2)          | William Zamora DDS                  | Accurate
Consensus             | William Zamora DDS                  | --

QA consensus can be established without generating additional tasks if the previous Supervision entries established consensus. Here is an example of that scenario:

Example 4

Extraction source     | Entry                               | Accuracy
--------------------- | ----------------------------------- | ------------
Machine               | William Zamora DDS (low confidence) | Not Measured
Supervision (Keyer 1) | William Zamora DDS                  | Accurate
Consensus             | William Zamora DDS                  | --

Abandoned fields

There are two situations in which a field may be abandoned, so that its entries are not counted toward accuracy reporting:

  1. A QA field is abandoned if no consensus is reached after 3 human entries. If the field was sent to Supervision, the Supervision entry counts toward the total of 3.

    • Although an abandoned QA field is not counted toward accuracy metrics, the completed QA entries do count toward a keyer’s productivity.

  2. A QA field is abandoned if two keyers mark the entry as illegible during the Transcription QA task.
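The abandonment rules above can be sketched as a small status check. This is an illustrative sketch under stated assumptions, not the product's logic: the function name qa_field_status, the use of None to mark an illegible entry, and the status strings are all hypothetical. It also assumes that entries are already normalized and that the machine entry is always eligible to help establish consensus.

```python
from collections import Counter
from typing import List, Optional

MAX_HUMAN_ENTRIES = 3   # from the text: abandoned after 3 human entries
MAX_ILLEGIBLE_MARKS = 2  # from the text: abandoned after two illegible marks


def qa_field_status(machine_entry: Optional[str],
                    human_entries: List[Optional[str]]) -> str:
    """Return "abandoned", "consensus", or "pending" for one QA field.

    human_entries lists Supervision and QA entries in order; None marks
    an entry that a keyer flagged as illegible (hypothetical convention).
    """
    illegible = sum(1 for entry in human_entries if entry is None)
    if illegible >= MAX_ILLEGIBLE_MARKS:
        return "abandoned"

    # Count every legible entry, machine included, toward consensus.
    legible = [e for e in [machine_entry, *human_entries] if e is not None]
    counts = Counter(legible)
    if counts and counts.most_common(1)[0][1] >= 2:
        return "consensus"

    if len(human_entries) >= MAX_HUMAN_ENTRIES:
        return "abandoned"
    return "pending"
```

A field with fewer than 3 human entries and no matching pair stays pending, which in practice would mean another QA task is needed.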

Example 1

Extraction source | Entry              | Accuracy
----------------- | ------------------ | ---------
Machine           | William Zamora 00S | Abandoned
QA (Keyer 1)      | William Zamora DDS | Abandoned
QA (Keyer 2)      | William Zamora OOS | Abandoned
QA (Keyer 3)      | *illegible*        | Abandoned
Consensus         | --                 | --

Example 2

Extraction source | Entry              | Accuracy
----------------- | ------------------ | ---------
Machine           | William Zamora 00S | Abandoned
QA (Keyer 1)      | *illegible*        | Abandoned
QA (Keyer 2)      | *illegible*        | Abandoned
Consensus         | --                 | --