Training Models

  • Updated on Mar 31, 2025
  • Published on Oct 3, 2024
The trainer is a separate machine dedicated to handling resource-heavy tasks like training Identification models. Running it on the same machine as the main application would slow down submission processing. Instead, the trainer operates independ...
  • Updated on Mar 31, 2025
  • Published on Nov 7, 2024
Text Segmentation is the process of partitioning an image into regions containing text, that is, into meaningful and distinct pieces or blocks of text. It is the first step of downstream processing tasks such as classification, text transcription, fields, t...
  • Updated on Mar 31, 2025
  • Published on Oct 10, 2024
The Model Management page allows you to see a list of all models trained on this instance. In this article, you'll learn how to navigate the pages for different types of models. To access the Model Management page, go to Library > Models...
  • Updated on Mar 11, 2025
  • Published on Oct 10, 2024
Training Data Management (formerly Keyer Data Management) allows you to improve and supervise models by working directly with the training data (“ground truth”) obtained from each document in the training set. You can group documents, see incompa...
  • Updated on Mar 31, 2025
  • Published on Oct 10, 2024
Having a diverse, representative training set is crucial for a high-quality identification model. The Hyperscience application allows you to train a model with fewer annotations with minimal impact on performance.
How data is curated
The Tra...
  • Updated on Mar 31, 2025
  • Published on Oct 10, 2024
Document Eligibility Filtering indicates whether a document is eligible for training, based on internal checks in the application and our machine learning logic. It provides additional information about documents that were excluded from the training...
  • Updated on Mar 31, 2025
  • Published on Oct 10, 2024
A high-quality model requires consistent annotations. That's why identifying potential discrepancies in the training sets before model training is crucial. To help with this effort, we've included a tool called Labeling Anomaly Detection in Training...
  • Updated on Mar 31, 2025
  • Published on Oct 10, 2024
Adding new data to your training set or making minor changes to its annotations may require several iterations of model re-training. Incremental Training helps you build upon your existing identification model without losing previously acquired info...
  • Updated on Mar 31, 2025
  • Published on Oct 10, 2024
In this article, you’ll learn how to navigate through and use Training Data Management for Identification models. Learn more about each feature in Training Data Management.
Accessing TDM for Identification models
Each identification mod...
  • Updated on Mar 31, 2025
  • Published on Oct 10, 2024
Hyperscience extracts data from documents and converts them into a machine-readable format. We support Structured, Semi-structured, and Additional documents. To learn how to differentiate between the document types, see Understand Document Types...
  • Updated on Apr 1, 2025
  • Published on Mar 31, 2025
There are two ways to train a Field ID model. To manually train and deploy models, go to the Model Details page, and follow the instructions in this article. To automatically train and deploy Field ID models, you can enable the Continuous Fie...
  • Updated on Mar 31, 2025
  • Published on Oct 10, 2024
A trained Table ID model enables cell-level predictions and automatic table processing. A Table ID model can be trained to automatically identify both gridded and non-gridded tables. A standard grid format refers to tables where data falls neat...
  • Updated on Mar 31, 2025
  • Published on Oct 10, 2024
Classification models are a crucial part of document processing, as they help the system determine which layout should be used to process each page you upload. Training Data Management for Classification allows you to add, remove, and update train...
  • Updated on Mar 28, 2025
  • Published on Oct 3, 2024
In previous versions of Hyperscience, if you wanted to retrain a model after upgrading to a new version of the application, you needed to upgrade the model’s flow, along with all of the flow’s other models. Beginning in v40, you can retrain a model...
  • Updated on Mar 31, 2025
  • Published on Oct 10, 2024
If you create a new Semi-structured layout version, there will be no models immediately available. For optimal layout performance, train a model on the newest layout version. Recall that Identification models are trained at the layout level. I...
  • Published on Oct 10, 2024
Overview
Once a new Field ID or Table ID model is trained or uploaded, and before deploying the model, you can evaluate the projected automation based on the Field Identification Target Accuracy setting for Field ID models and Table Identification...
  • Published on Oct 3, 2024
To achieve better automation rates for document classification, a classification model must be trained for each Semi-structured and Additional layout.
How to Initially Train the Classification Model
To train a new model, navigate to the Classif...
  • Updated on Mar 31, 2025
  • Published on Oct 3, 2024
Transcription models are collections of fine-tuning models. Select Transcription models from the drop-down menu at the top of the Models page to view all available fine-tuning models in your instance. In the Transcription Models ta...
  • Updated on Mar 31, 2025
  • Published on Oct 10, 2024
To meet the specific automation needs of your various lines of business, you can configure transcription, or fine-tuning, models at the flow level. This flexibility allows you to: enter dedicated transcription automation and accuracy setting...
  • Published on Oct 10, 2024
Each model is only compatible with one Semi-structured layout, but a model is not necessarily compatible with every version of a layout. Compatibility logic is determined by comparing fields that the model was trained on (e.g. fields in the live la...
  • Updated on Mar 28, 2025
  • Published on Oct 3, 2024
In v40, models for flows created in v38 and above are forward compatible, meaning that you can use them in v40 without having to retrain them during the upgrade process. As a result, forward-compatible models allow you to: upgrade v38 or v39...
  • Published on Oct 10, 2024
To ensure that you do not lose any training data during application upgrades and model setups, you can move your training data between environments. The ability to export your models’ training data from production to lower environments can also hel...
  • Updated on Mar 31, 2025
  • Published on Oct 10, 2024
Retraining a Field or Table ID model is not necessary if one, or some combination, of the following properties was changed in an existing field’s settings: Field name Output name Supervision Required Identification Supervisi...
  • Published on Oct 10, 2024
To take action on Field Locator model training, or any training job for that matter, navigate to Administration > Trainer. To learn more about the Trainer application, see What is the Trainer?
Canceling a training job
Follow these ste...