Document Classification

The Document Classification Supervision task is the first step in Extraction Supervision. It is used to categorize and combine pages that were not classified by the machine.

To enable Document Classification, select the Manual Classification Supervision setting in your flow. To learn more about flow settings, see Flow Settings.

Document Classification task

Document Classification allows you to provide additional structure to submissions as part of the Page Sorting phase of processing. 

Specifically, Document Classification allows a user to do the following:

  • Add uncategorized pages to grouped documents.

  • Reorder pages in grouped documents.

  • Remove pages from grouped documents.

  • Classify (apply layouts to) manually grouped documents.

  • Reclassify (apply different layouts to) machine-misclassified documents.

This task supports customer workflows that require documents within a submission to be associated with one another for downstream processing. For example, if a particular Semi-structured form must be accompanied by additional materials (e.g., a paystub or check) in order to be considered valid, Document Classification allows you to associate the form with its additional materials in the submission and reflect this association in the submission output.

Document Classification task’s interface

The Document Classification task’s interface consists of three vertical panels: left, middle, and right.

Left (uncategorized) panel

The left panel contains all uncategorized pages. To provide additional context for the keyer, the pages are displayed in the order in which they were submitted. A thumbnail of each uncategorized page is shown in the left panel, and a larger view of the page in focus is shown in the preview panel on the right. 

Selecting pages from the left panel, using either the mouse or keyboard shortcuts, allows you to add these pages to a new or existing document in the middle panel.

To submit the Document Classification task, all pages from the left panel must be categorized. If some pages do not have a matching layout available, you can select the “Other” category for them from the layout drop-down list. Pages assigned the “Other” layout variation are sent to the No Layout Variation Found section (Submissions > No Layout Variation Found).

Middle (grouped documents) panel

The middle panel contains all grouped documents in a submission. In this panel, you can add, remove, and reorder pages in grouped documents.

All pages in the left and middle panels have submission page numbers assigned to them. These numbers represent the order in which these pages were submitted into the system. 

Grouped documents in the middle panel are ordered by the submission page number of the first page in each document. When you manually create a new document in the middle panel, its pages are automatically reordered by submission page number, and no layout is selected. You need to manually select a layout.
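
The ordering rules above can be sketched in a few lines of Python. This is an illustrative model only, not the product's implementation: the Document class and the create_document and order_documents helpers are hypothetical names introduced for this example, and the layout name "Invoice" is made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Document:
    # Submission page numbers of the pages in this document (hypothetical model).
    pages: List[int]
    # No layout is selected until the keyer picks one manually.
    layout: Optional[str] = None


def create_document(selected_pages: List[int]) -> Document:
    """Model a manually created document: pages sorted, layout left unset."""
    return Document(pages=sorted(selected_pages))


def order_documents(documents: List[Document]) -> List[Document]:
    """Order documents by the submission page number of their first page."""
    return sorted(documents, key=lambda doc: doc.pages[0])


invoice = Document(pages=[4, 5], layout="Invoice")
new_doc = create_document([3, 1, 2])   # pages selected in arbitrary order
ordered = order_documents([invoice, new_doc])
print([doc.pages for doc in ordered])  # prints [[1, 2, 3], [4, 5]]
print(new_doc.layout)                  # prints None
```

In this sketch the new document appears first because its first page has the lowest submission page number, and its layout stays empty until one is assigned.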

To modify the layout of a machine-classified Structured document, you need to:

  1. Remove all pages from the machine-classified Structured document.

  2. Create a new group of pages from the removed uncategorized pages.

  3. Assign a Structured layout variation to the newly-created group of pages.

For more details, see the “Grouping pages into documents” section of this article.

Once you remove all pages from a document, the system automatically deletes it. You can make a document empty by moving all its pages to other documents or to the left panel. 

You can open multiple documents at the same time, but only a single document can be focused. When you close a document, the page selection within this document is cleared.

To submit the Document Classification task, you need to assign all documents to layout variations. If an uncategorized page does not belong to any existing layout variation, you can assign this page to one of the following categories from the layout drop-down list:

  • Blank Page

  • Additional Form Page

  • Other
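
The submission conditions described for the left and middle panels can be summarized as a simple check. The Python sketch below is a hedged illustration rather than the product's actual validation logic: can_submit, the document IDs, and the "Invoice" layout name are assumptions made for the example.

```python
from typing import Dict, List, Optional


def can_submit(
    uncategorized_pages: List[int],
    document_layouts: Dict[str, Optional[str]],
) -> bool:
    """Return True only when no pages remain in the left panel
    and every grouped document has a layout variation assigned."""
    if uncategorized_pages:
        return False
    return all(layout is not None for layout in document_layouts.values())


# A page is still uncategorized, so the task cannot be submitted.
print(can_submit([7], {"doc-1": "Invoice"}))                     # prints False
# A document has no layout yet.
print(can_submit([], {"doc-1": "Invoice", "doc-2": None}))       # prints False
# Catch-all categories such as "Other" count as an assigned layout.
print(can_submit([], {"doc-1": "Invoice", "doc-2": "Other"}))    # prints True
```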

Right (preview) panel

The right panel shows a zoomable full-page view of your selected page.

Completing the Document Classification task

The following options are available to you during Document Classification tasks.

Grouping pages into documents

During Document Classification, you can group multiple pages into documents. For example, several pages that are part of a single invoice can be categorized as invoice pages and combined into a single multi-page invoice document.

For all pages in the left panel, you need to:

  • click the Create New button at the bottom of the panel to create new documents for these pages, or

  • use drag-and-drop or keyboard shortcuts to move these pages to existing documents in the middle panel.

For each grouped document in the middle panel, you need to select a layout variation. The options include all live layouts for the given release, as well as "Blank Page," "Additional Form Page," and "Other." 

Modifying the layout of machine-classified Structured documents is restricted. You can manually classify Structured documents by creating new documents from uncategorized pages and assigning Structured layout variations to these groups of pages. As part of the grouping process, you can remove pages from grouped documents and then add these pages to other documents, using either drag-and-drop or keyboard shortcuts.   

Reordering pages

After you've grouped a set of pages together into a document, you can change the order of that document's pages. To do so, use drag-and-drop or the "Reorder" keyboard shortcuts listed in the Keyboard Shortcuts section of this article.

Adding and removing pages from documents

In the sections below, you can learn about reclassification of Structured and Semi-structured documents.

Adding and removing pages from Structured documents

Reclassification of Structured documents relates to:

  • reclassifying (applying different layouts to) machine-misclassified Structured documents, 

  • adding pages to machine-classified Structured documents with missing pages, and

  • removing pages from machine-classified Structured documents. 

For example, when you select exactly one page in the left panel, you can add it to a machine-classified Structured document that has at least one missing page. To do so, select the page index within the Structured document where you want to place the page. The selected page goes through ad-hoc registration, which verifies whether the page matches the given Structured layout image:

  • If registration succeeds, the page is added to the Structured document at the selected index, and its metadata will be automatically extracted downstream by the machine.

  • If registration fails, the page is appended to the end of the Structured document, and it will require manual extraction during Flexible Extraction.
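
The two registration outcomes can be modeled as follows. This Python sketch only illustrates the branching described above: add_page_to_structured and the registers callback are hypothetical names, the real registration step compares page images and is not reproduced here, and leaving the missing slot empty after a failed registration is an assumption.

```python
from typing import Callable, List, Optional


def add_page_to_structured(
    pages: List[Optional[int]],
    page: int,
    index: int,
    registers: Callable[[int, int], bool],
) -> bool:
    """Try to place `page` in the missing slot at `index`.

    `pages` holds submission page numbers, with None marking a missing page.
    `registers` stands in for ad-hoc registration (hypothetical callback).
    Returns True if registration succeeded, False otherwise.
    """
    if pages[index] is not None:
        raise ValueError("selected index is not a missing page")
    if registers(page, index):
        pages[index] = page   # success: page fills the selected slot
        return True
    pages.append(page)        # failure: page goes to the end of the document
    return False


matched = [10, None, 12]
add_page_to_structured(matched, 11, 1, lambda p, i: True)
print(matched)    # prints [10, 11, 12]

unmatched = [10, None, 12]
add_page_to_structured(unmatched, 99, 1, lambda p, i: False)
print(unmatched)  # prints [10, None, 12, 99]
```

In the failure case, the appended page is the one that would need manual extraction during Flexible Extraction.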

Note that pages of machine-classified Structured documents cannot be reordered, and the drag-and-drop functionality is intentionally restricted for these documents.

Adding and removing pages from Semi-structured documents

Reclassification of Semi-structured documents relates to adding and removing pages from machine-classified Semi-structured documents. For example, when you select a page or a set of pages in the left panel, you can add them to a machine-classified Semi-structured document by using the drag-and-drop functionality or keyboard shortcuts. 

Manual Rotation

Sometimes the sampled documents may go to Manual Classification with an incorrect orientation. Learn more in Manual Rotation.

An improperly rotated document may be impossible to read or extract data from. That’s why in v38 we’re introducing the Manual Rotation feature, which allows you to correct the orientation of a page within a document.

The Manual Rotation feature is available in the Manual Classification task. 

You can rotate all selected uncategorized pages 90° clockwise.

To rotate a page:

  1. Right-click the page you need to adjust.

  2. Click Rotate Page 90° Clockwise.

You can also use the shortcut Alt/Option + R or click the rotate button on the right-hand side of the screen.

During Machine classification, the page image may be adjusted in order to obtain a match. Uncheck it to reset the image to its submission state. 

 

Reprocessing of misclassified documents

When initiated from other Supervision tasks, the Document Classification task allows users to manually classify machine-misclassified documents. As a result, users can reclassify all pages of the submission and submit them for reprocessing. To learn more, see Reprocessing.

Keyboard Shortcuts

Navigation

Task                            Mac Shortcut          Windows Shortcut
Focus uncategorized panel       1                     1
Focus grouped documents panel   2                     2
Focus pages                     ⬅, ⬆, ⬇, or ➡         ⬅, ⬆, ⬇, or ➡
Select page                     Space                 Space
Select multiple pages           Command + Space       Control + Space
Navigate documents              Tab or Shift + Tab    Tab or Shift + Tab
Open/close document             Option + 0            Alt + 0
Open/close all documents        Option + Shift + 0    Alt + Shift + 0

Document management

Task                             Mac Shortcut    Windows Shortcut
Create new document              Option + N      Alt + N
Add to document                  A or Z          A or Z
Remove from document             Delete          Backspace
Restore submission order         R               R
Reorder page up                  Option + ⬆      Alt + ⬆
Reorder page down                Option + ⬇      Alt + ⬇
Reorder page left                Option + ⬅      Alt + ⬅
Reorder page right               Option + ➡      Alt + ➡
Insert missing page at index     Return          Enter
Remove focus from missing page   Esc             Esc

All tasks

Task                 Mac Shortcut            Windows Shortcut
Keyboard shortcuts   F2                      F2
Zoom in              Option + +              Alt + +
Zoom out             Option + -              Alt + -
Close task           Command + Option + X    Control + Alt + X
Submit task          Command + Return        Control + Enter