Adding new data to your training set or making minor changes to its annotations may require several iterations of model re-training. Incremental Training helps you build upon your existing identification model without losing previously acquired information.
The system recommends one of the following options, based on internal dataset analysis. The recommended option is automatically selected.
You can override the selected option, but the recommendation will remain the same. Note that this doesn’t apply to imported models; see Incremental Training and imported models below.
Train from scratch
This option is recommended if your current model’s existing knowledge of your documents is not sufficient. It starts the training process from the beginning, using all eligible documents. Learn more about document eligibility in Document Eligibility Filtering.
Examples:
You have a new field in the layout or the multiline option is checked. In this case, the model doesn’t recognize the field. You need to train from scratch, as everything the model has learned up to that point could potentially be invalidated by the new field.
A warning message appears on the model details page, indicating that the ground truth does not include newly added fields and you have to run a model training.
Note that if you change the data type of a field, the system will recommend training from scratch.
Your annotations are inconsistent. The performance of your model is low, and the anomalies after training data analysis indicate that you’ve incorrectly labeled a field (i.e., the location of that field is different throughout most of your training set). You need to correct these anomalies to achieve better results. In that case, you’re changing a crucial part of your training data, and you need to start over.
You’ve uploaded a large number of new documents to your training set. In these situations, you need to train the model on the newly added data so that it has a stable knowledge base.
Train from last training
This option is recommended when you want to enrich the training data of your existing model by adding more examples or if you’ve addressed anomalies after the last training iteration.
Do NOT use this option if the model performance is low due to inconsistent annotations or poorly represented data. Because this option builds on the existing knowledge of your model, using it in these cases results in longer training times or worse-performing models.
Examples:
The performance of your model is high but you have several anomalies. After you’ve corrected annotations, you need to retrain. Instead of starting from the beginning, the system uses the existing knowledge of your model and builds upon it, without losing the current progress.
You’ve added some new examples to enrich your training set. You’ve annotated the new documents and need to re-train. The system uses the current progress of your model and enriches its existing knowledge, providing higher performance.
Incremental Training isn’t available if you’re training a brand-new model that’s never been trained before.
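The guidance above can be summarized as a small decision rule. The following Python sketch is illustrative only: the system's actual dataset analysis is internal and isn't described here, and every name in it (TrainingContext, suggest_training_option, and the individual flags) is a hypothetical stand-in for the conditions listed in this article, not part of any product API.

```python
from dataclasses import dataclass

TRAIN_FROM_SCRATCH = "Train from scratch"
TRAIN_FROM_LAST = "Train from last training"


@dataclass
class TrainingContext:
    """Hypothetical summary of the conditions described in this article."""
    previously_trained: bool = True           # False for a brand-new model
    fields_added_or_changed: bool = False     # new field, multiline toggled, data type changed
    annotations_inconsistent: bool = False    # low performance caused by mislabeled fields
    many_new_documents: bool = False          # a large batch was added to the training set


def suggest_training_option(ctx: TrainingContext) -> str:
    """Return the option this article's guidance points to (an assumption,
    not the system's real recommendation logic)."""
    # Incremental Training isn't available for a model that has never been trained.
    if not ctx.previously_trained:
        return TRAIN_FROM_SCRATCH
    # Any of these conditions can invalidate what the model has already learned.
    if (ctx.fields_added_or_changed
            or ctx.annotations_inconsistent
            or ctx.many_new_documents):
        return TRAIN_FROM_SCRATCH
    # Otherwise, build on the existing model (a few new examples, corrected anomalies).
    return TRAIN_FROM_LAST


if __name__ == "__main__":
    print(suggest_training_option(TrainingContext(fields_added_or_changed=True)))
    print(suggest_training_option(TrainingContext()))
```

As in the product, the result is only a recommendation: you can still choose the other option in the re-training dialog box.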
Using Incremental Training
To choose one of the available training options:
Go to Library > Models.
Click on the name of a layout to access its Model Management page.
Click on the Field Model or Table Model tab, depending on the type of model you want to train.
Select your candidate model and click Deploy Model.
A dialog box appears, asking if you want to deploy the model.
Click Confirm.
Analyze your data and address any potential anomalies. Learn more in Detecting and Correcting Anomalies in Field Annotations and Detecting and Correcting Anomalies in Table Annotations.
Click Re-train.
A dialog box appears, showing the options for re-training.
Click Run training.
Keep Current Model
After your training is completed, you’ll see the Current model and the Candidate one, along with the details for each. Learn more about model results in Evaluating Model Training Results.
Based on internal analysis and the last date your training ground truth data was modified, the system will automatically recommend one of the two options described above.
If you want to keep your current model, then:
Select Current.
Click Keep Model.
A dialog box appears, indicating that the candidate model will be deleted.
Click Keep.
If you still need to improve your model, follow steps 6-9 under Using Incremental Training.
Incremental Training and imported models
You won’t be able to use imported models as a starting point for Incremental Training because the system won’t know what training data was used. In such cases, you should train your model from scratch to achieve optimal results. That’s why, when you import a model and its training data to a production instance, you’ll have to run at least one training iteration before you can use Incremental Training.
Incremental Training for Identification models is available ONLY to models trained on v37 or later.
If you’re upgrading from v36 or earlier and you have live models, the Train from last training option will be grayed out.
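The availability rules in this section can be expressed as a single check. The sketch below is a minimal illustration under stated assumptions: the function name, its parameters, and the use of a plain integer for the version are all hypothetical, and the product does not expose such a function.

```python
MIN_INCREMENTAL_VERSION = 37  # per this article: v37 or later


def can_train_from_last_training(trained_on_version: int,
                                 imported: bool = False,
                                 retrained_after_import: bool = False) -> bool:
    """Hypothetical check for whether Train from last training would be selectable.

    trained_on_version: major version on which the model was last trained.
    imported: True if the model was imported into this instance.
    retrained_after_import: True once at least one training iteration
        has been run on this instance after the import.
    """
    # An imported model can't be a starting point until it has been
    # trained at least once on this instance.
    if imported and not retrained_after_import:
        return False
    # Models trained on v36 or earlier see the option grayed out.
    return trained_on_version >= MIN_INCREMENTAL_VERSION


if __name__ == "__main__":
    print(can_train_from_last_training(36))                  # False
    print(can_train_from_last_training(37))                  # True
    print(can_train_from_last_training(37, imported=True))   # False
```

In both False cases, the way forward described in this article is the same: run one Train from scratch iteration on the current version, after which Train from last training becomes available.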