Model Management

Overview

In Hyperscience, model management takes place within Library > Models. From here you can see a table that displays a list of Semi-structured layouts and their associated models.

In this table, you can see the state of the models for the current system version as well as for future versions.

You can also view Classification models in the instance by clicking the Locator Models drop-down list and then clicking Classification Models.

Clicking a layout’s name from the table redirects you to the Model Details page. Read more about the Model Details page in the section below.

Model Details page

The Model Details page for Locator models has three tabs:

Field Identification tab
Table Identification tab
System Upgrade Evaluation tab

In the Field Identification and Table Identification tabs, you can perform the following actions:

Download a model.
Download training data.
Upload training documents.
Upload a new model.
Train a model.
Delete a model.
Analyze training data.
See performance metrics related to a model.
Add and edit annotations of training documents.

In the System Upgrade Evaluation tab, you can perform the following actions:

View the state of each model for future versions of the system.
Upload a new model.
Train a model with a future trainer.
Download a model.
Delete a model.

The Model Details page for Classification models has only one tab where you can perform the following actions:

Download a model.
Download training data.
Upload training documents.
Upload a new model.
Train a model.
Check model activity.
Check model compatibility.

Locator models

Field Identification and Table Identification tabs

The Field Identification tab is visible only if the respective model’s layout contains at least one field. The Table Identification tab is visible only if the respective model’s layout contains at least one table column. In the sections below, you can learn more about what is present in the Field Identification and Table Identification tabs of the Model Details page.

Upload and download training documents

To upload and download training documents, use Upload Training Documents and Download Training Documents, located in the upper-right corner of the page. To learn more, see Importing and Exporting Training Data.

Field Identification Model and Table Identification Model cards

You can view metrics for the current Field Locator and Table Locator models as well as candidate models. Based on your assessment of the anticipated model performance, you can choose to deploy the candidate models or keep the current models.

Note that if you decide to adjust target accuracy on this screen, those changes will apply to the whole system. These changes will not affect submissions currently being processed, only upcoming submissions.

To learn more about the performance metrics for models, see Evaluating Model Training Results.

You can also use the Actions dropdown on the current and candidate models to run training, upload a model, download a model, or delete a model. To learn more about training a new model, see:

The Field-level Automation and Table-level Automation tables allow you to view and edit annotations of fields and table columns. To learn more about these tables, see Training Data Management.

Test target accuracy

With the “Test target accuracy” feature, you can view the projected automation of your model with the target accuracy you specify. By entering test values, you can determine the target accuracy value that best meets your needs before changing your settings.

To test a target accuracy:

Click Edit next to the current target accuracy value.
In the dialog box that appears, enter a test target accuracy.
Click Save.

After you’ve decided on a target accuracy to use for your submissions, you need to apply that target accuracy to your individual flows before it can take effect. To do so, edit the Field Identification Target Accuracy value in your flow’s settings to match the target accuracy you chose. Any changes you make to a flow’s target accuracy will affect only upcoming submissions, not submissions currently being processed.

To learn more about editing a flow’s settings, see Flow Settings.

Training Data Analysis

To reduce the number of annotation errors and make the process of training Field Locator and Table Locator models faster and less complex, you can use the Training Data Analysis card to run data analysis. The analysis assigns training documents to groups and suggests how you can improve each group’s model performance by either adding training documents to the group or removing them from it. To learn more about training data analysis, see Training Data Analysis and Guided Data Labeling.

Training Documents

In v35 and above, the Training Data Management tools are located in the Training Documents card at the bottom of the Model Details page. To learn more about Training Data Management, see Training Data Management.

System Upgrade Evaluation tab

On the System Upgrade Evaluation tab, you will see the state of each model for future versions of the system. In the Actions dropdown, you can run training, upload a model, download a model, or delete a model.

You don’t need to attach a future version trainer and train the new version model during the upgrade process, but make sure that the model you are currently using is still supported post-upgrade.

Flows and models from up to two versions back can now automatically be used after upgrading. Old models are no longer removed from the system. For example, you can use v35 and v36 flows and their models in v37. For more information, see Forward-Compatible Models.
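The "up to two versions back" rule can be expressed as a small check. The following is a minimal sketch, assuming versions can be compared as whole numbers such as 35, 36, and 37; the function name is hypothetical and is not part of Hyperscience. The authoritative rules are described in Forward-Compatible Models.

```python
# Hypothetical helper: encodes only the rule stated in this section.
MAX_VERSIONS_BACK = 2  # "up to two versions back"


def is_usable_after_upgrade(model_version: int, system_version: int) -> bool:
    """Return True if a model from model_version falls within the stated window.

    Assumption: versions are whole numbers (for example, 35 for v35), and a
    model from a newer version than the system is treated as not usable.
    """
    versions_back = system_version - model_version
    return 0 <= versions_back <= MAX_VERSIONS_BACK


if __name__ == "__main__":
    for model_version in (34, 35, 36, 37):
        print(model_version, is_usable_after_upgrade(model_version, 37))
```

With a system version of 37, the check accepts v35 and v36 models, which matches the example above, and rejects a v34 model.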

Classification models

Classification tab

The Classification tab allows you to manage your Classification models. In the sections below, you can learn more about what is present in the Classification tab of the Model Details page.

Downloading and importing Classification models and training data

When downloading a Classification model, you can also download the training data for that model. By downloading both, you can migrate your models and training data in a single file, allowing you to use the model in multiple instances without needing to retrain the model in each instance.

To download a model and its training data, click Actions on the Model Details page, and then click Download Classification Model and Data. The downloaded file includes both the model and the pages used to train the model. All pages used to train the model, whether they were manually uploaded or came from processed submissions, are included. These pages are contained in a training_data subfolder in the downloaded ZIP file. This folder contains a subfolder for each label applied to the pages’ images.
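Before importing the file elsewhere, it can be useful to confirm how many training pages it contains for each label. The following is a minimal sketch that uses only the Python standard library. It assumes that each file under training_data/<label>/ is one page; the exact nesting and file names inside the ZIP file are assumptions, and the function name is hypothetical.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_pages_per_label(zip_path: str) -> Counter:
    """Count files found under training_data/<label>/ in a downloaded ZIP file."""
    counts: Counter = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, count files only
            parts = PurePosixPath(info.filename).parts
            if "training_data" not in parts:
                continue  # for example, the model file itself
            idx = parts.index("training_data")
            # Assumption: the label is the folder directly below training_data,
            # so a page needs at least training_data/<label>/<file>.
            if len(parts) > idx + 2:
                counts[parts[idx + 1]] += 1
    return counts
```

Calling count_pages_per_label with the path to the downloaded file returns a mapping from each label to its page count, which you can compare with what you expect from the source instance.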

After you download the file, you can import it to another instance, just as you would any other exported model. The pages appear as manually uploaded pages in the destination instance, even if they originally came from processed submissions. Any pages already in the destination instance are not imported. The PII Deletion policy in the destination instance applies to the imported training data, not the PII Deletion policy in the data’s original instance.

Model Overview

The Model Overview card provides you with information about the following:

Model Name – name of the Classification model
Last training – last training date and time of the Classification model
Last deployment – last deployment date and time of the Classification model

If the Classification model is live, the Model Overview card also has a Projected Automation chart that provides you with information about how your model will perform based on different target accuracies.

Model Training

The Model Training card allows you to upload, download, and delete training documents for your layout variations. You need to upload a minimum of 10 training document pages for each layout variation, but we recommend uploading at least 120 pages for optimal performance.

Note that HTML documents cannot be used to train Classification models.
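The page-count guidance for this card can be checked before you upload. The following is a minimal sketch that compares page counts per layout variation against the minimum of 10 and the recommended 120 stated above. The function name, the status wording, and the input format are illustrative assumptions, not part of the product.

```python
from __future__ import annotations

MIN_PAGES = 10           # minimum per layout variation, as stated above
RECOMMENDED_PAGES = 120  # recommended per layout variation, as stated above


def assess_training_volume(pages_per_variation: dict[str, int]) -> dict[str, str]:
    """Label each layout variation by how its page count compares to the guidance."""
    result: dict[str, str] = {}
    for variation, pages in pages_per_variation.items():
        if pages < MIN_PAGES:
            result[variation] = "below minimum"
        elif pages < RECOMMENDED_PAGES:
            result[variation] = "meets minimum, below recommended"
        else:
            result[variation] = "meets recommendation"
    return result


if __name__ == "__main__":
    # Hypothetical layout variation names and counts.
    print(assess_training_volume({"Invoice A": 8, "Invoice B": 45, "Receipt": 150}))
```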

Model Activity

The Model Activity card has a table that provides you with information about your Classification model’s trainings and deployments. This table has the following columns:

Training Started – start date and time of model training
Status – status of the model training
Actions – available actions

Model Compatibility

The Model Compatibility card has a table that provides you with information about which releases the Classification model is compatible with. This table has the following columns:

Release Name – name of the release
Created On – creation date of the release
Status – status of the release

Continuous Model Training

If the Continuous Field Locator model improvement setting, the Continuous Classification model improvement setting, or both are enabled and you import a model from another environment, you might end up with lower automation rates. Models use training data only from their current environment, so if the new environment does not have enough training data, the imported model will be overwritten by a lower-performing one. For optimal performance, we recommend that you train models manually and disable the Continuous Field Locator model improvement and Continuous Classification model improvement settings. Only enable these settings if instructed to do so by a Hyperscience representative.