Upgrade Instructions

Before upgrading, review Upgrade Considerations and Known Issues and Upgrade Best Practices to ensure your upgrade goes smoothly.

Upgrade to a major version

You can upgrade to the next major version of Hyperscience from any minor or hotfix version unless otherwise noted.

Prerequisites

Using compressed tables in an Oracle database

During an upgrade, the Hyperscience application issues DDL statements (CREATE, DROP, ALTER, TRUNCATE) that are not supported on compressed tables in an Oracle database. Decompress any compressed tables before upgrading Hyperscience.

Compression tends to increase the database’s CPU usage and can cause the query plan used for Hyperscience queries to change. These changes can affect the performance of the Hyperscience application.
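
As a rough sketch of the decompression step, the helper below turns a list of table names into Oracle statements that rewrite each table without compression. The function name is hypothetical, and the sketch assumes non-partitioned tables that are visible in Oracle's USER_TABLES view. Moving a table can leave its indexes unusable until they are rebuilt, so review the generated statements with your database administrator before running them.

```shell
#!/bin/sh
# Hypothetical helper: reads table names (one per line) on stdin and prints an
# Oracle statement that rewrites each table without compression.
# It only prints SQL; it does not connect to a database.
print_decompress_sql() {
    while IFS= read -r table; do
        # Skip blank lines.
        [ -n "$table" ] || continue
        printf 'ALTER TABLE "%s" MOVE NOCOMPRESS;\n' "$table"
    done
}

# The table names could come from a query such as:
#   SELECT table_name FROM user_tables WHERE compression = 'ENABLED';
```

For example, `printf 'MY_TABLE\n' | print_decompress_sql` prints one ALTER TABLE statement for the placeholder table MY_TABLE.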

The upgrade process has four main steps, which are described below.

Step 1: Install the new trainer

To maintain automation performance between versions, install the next version of the trainer and attach it to your current application. For example, if you are running v30 and upgrading to v31, attach the v31 trainer to your v30 application.

The new trainer version should run side-by-side with the existing trainer, whether on a separate machine or on the same machine. To install it on the same machine as your current trainer, see Install a trainer on your current trainer’s machine below.

If the trainers are on the same machine, the application will only allow one of the two parallel trainers to process a job at any given time. Therefore, while your trainers can share a machine, we recommend that each version of the trainer has its own dedicated machine. This way, the trainers will run independently of each other and performance will not be affected.

Step 2: Wait for artifact jobs to finish

Artifact jobs are jobs that a trainer completes with data from an older version of the application (e.g., the jobs completed by a v32 trainer on a v30 application). A trainer can use data from the application version it is bundled with, along with the previous two application versions.
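
The compatibility rule above can be expressed as a small check. This is a minimal sketch: the function name is hypothetical, and it assumes major versions are consecutive integers (v30, v31, v32).

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when a trainer with major
# version $1 can use data from an application with major version $2.
trainer_supports_app() {
    trainer_major=$1
    app_major=$2
    diff=$((trainer_major - app_major))
    # Supported: the bundled version (diff 0) and the previous two (diff 1 or 2).
    [ "$diff" -ge 0 ] && [ "$diff" -le 2 ]
}

trainer_supports_app 32 30 && echo "v32 trainer can use v30 data"
trainer_supports_app 32 29 || echo "v32 trainer cannot use v29 data"
```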

The new trainer must successfully finish all of these jobs before you upgrade the application. Expect the jobs to take 24-36 hours to complete.

The following jobs are included:

Model-retraining jobs for all Semi-structured layouts (e.g., Classification models, Field Locator models, Table Locator models):
- Classification models
  - These jobs use Document Classification tasks that you’ve completed to categorize and combine unmatched pages, and Model Validation Tasks.
  - A minimum of 10 documents per layout is required. For better performance, we recommend using at least 120 documents per layout.
- Field Locator models
  - These jobs use QA tasks, Supervision tasks, Model Validation Tasks, and Ground Truth Correction tasks.
  - A minimum of 400 pages is required.
  - A maximum of 5,000 pages can be used to train each model.
- Table Locator models
  - These jobs use QA tasks, (v31 and earlier) Supervision tasks, Model Validation Tasks, and Ground Truth Correction tasks.
  - A minimum of 400 pages is required.
  - A maximum of 5,000 pages can be used to train each model.
If Transcription Automation Training is enabled (Administration > Settings), these jobs are also included:
- Finetuning
  - These jobs use QA tasks.
  - A minimum of 5,000 field transcriptions, 2,000 signature transcriptions, and 2,000 checkbox transcriptions is required.
  - A maximum of 20,000 transcriptions per type can be used to train each model.
- Auto-thresholding
  - These jobs use QA tasks.
  - A minimum of 5,000 field transcriptions, 2,000 signature transcriptions, and 2,000 checkbox transcriptions is required.
  - A maximum of 20,000 transcriptions per type can be used to train each model.
Recalibration records

The above-mentioned requirements are not explicitly stated, but training will fail if they are not met. If training fails, reach out to your Hyperscience representative.
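
Because these limits are easy to overlook, it can help to compare your own counts against them before attaching the new trainer. The sketch below is illustrative only: the function name is hypothetical, the example counts are made up, and you need to supply the real counts yourself.

```shell
#!/bin/sh
# Hypothetical helper: compares a count that you supply against a minimum and
# a maximum from the list above. Returns non-zero when the minimum is not met.
check_volume() {
    label=$1
    count=$2
    min=$3
    max=$4
    if [ "$count" -lt "$min" ]; then
        echo "$label: $count is below the minimum of $min"
        return 1
    fi
    if [ "$count" -gt "$max" ]; then
        echo "$label: $count available, at most $max can be used"
    else
        echo "$label: $count meets the minimum of $min"
    fi
}

# Example counts are made up for illustration.
check_volume "Field Locator pages" 350 400 5000 || true
check_volume "Finetuning field transcriptions" 26000 5000 20000
```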

Under normal conditions, when a new trainer is attached, training will be triggered on all models automatically, and all you need to do is monitor their completion in the application as follows:

When you go to Administration > Trainer, there should not be any failed entries under “Completed” for the new version of the trainer:
Look for "Finetuning," "Auto Thresholding," and "OICR Recalibration" in the Finished section, and ensure that an entry exists for the next version of the application. The “Completed On” date does not matter as long as there is at least one entry here. There could be multiple entries.
There should not be any entries under “Queue,” in either “Running” or “Queued.”

The following point only applies to customers who use Semi-structured or Additional layouts. Otherwise, you can skip this verification.

Once there are no more entries under “Running” or “Queued”:
- If you go to Library > Models, the column for the new version should say "Finished" for each layout that is in use. These models are “Field Locator Models” and “Table Locator Models” and apply only if you have Semi-structured layouts.
- In the upper-left portion of the same page, a drop-down list allows you to select either Locator Models or Classification Models.
  - Click Classification Models. The page reloads, and a Filter by release drop-down list appears in the upper-right part of the page.
  - Select your active layout release or all layout releases that may potentially be used, and ensure that all resulting rows have “Finished” entries in the next version’s column. This guideline applies to any layout release that has at least one Semi-structured or Additional layout.

Step 3: Deploy the new application

After all of the artifact jobs have finished and all of your models have been trained on the new trainer, you can deploy the new version of the application.

Before deploying the new application, make a backup of your database, if you haven't already. Doing so prevents data loss, and you can use the backup if you need to revert your changes.

For more information on deploying Hyperscience, see the articles in Installing Hyperscience and Configuring Hyperscience. Follow any guidelines given for the version of Hyperscience that follows your current one. For example, if you are running v30, follow the guidelines for v31.

(v30 and later) If you have SELinux enabled on your application machines, you need to give your application containers access to the ZIP files that contain your flows' blocks. For more information, see the "If using SELinux" section of Application Installation (Production).

Note that any machines not running init will not be functional during the deployment, and you will not be able to process submissions on those machines as you migrate to the new version of Hyperscience.

Step 4: Stop the old trainer

After completing the application upgrade, you should stop running the previous version of the trainer. To stop the old trainer, do one of the following:

If your trainers are on different machines, turn off the old trainer’s machine. It is no longer needed and can be repurposed.
If your trainers are on the same machine, refer to Remove the older trainer after the application upgrade below.

Upgrade to a minor or hotfix version

In most cases, models are compatible across a major version's minor and hotfix versions. When upgrading to a minor or hotfix version, unless otherwise noted, you do not need to run the target version's trainer side-by-side with the existing trainer. Therefore, you can skip "Step 2: Wait for artifact jobs to finish" in the process above, and your upgrade steps are:

Install the new trainer
Deploy the new application
- You can deploy the new application right after installing the new trainer. You do not need to wait for your models to be trained on the new trainer.
Stop the old trainer
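
The difference between the two upgrade paths can be summarized in a short sketch. It assumes version strings of the form major.minor.hotfix (for example, 31.0.12), uses a hypothetical function name, and does not account for versions where the release notes state otherwise.

```shell
#!/bin/sh
# Hypothetical helper: prints the upgrade steps for a move from version $1 to
# version $2. A leading "v" is accepted and ignored.
print_upgrade_steps() {
    current=${1#v}
    target=${2#v}
    current_major=${current%%.*}
    target_major=${target%%.*}
    echo "Install the new trainer"
    # Artifact jobs only need to finish first when the major version changes.
    if [ "$current_major" != "$target_major" ]; then
        echo "Wait for artifact jobs to finish"
    fi
    echo "Deploy the new application"
    echo "Stop the old trainer"
}

print_upgrade_steps v31.0.4 v31.0.12
```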

Using trainers on the same machine

If both of your trainers will be on the same machine, follow the instructions below to install the new trainer and remove the older one.

Install a trainer on your current trainer’s machine

Download the new bundle and extract it while keeping the existing bundle directory.
Within the old bundle, there should be a subdirectory named trainer-x.x.x, where x.x.x indicates the version number. Copy this subdirectory into the new bundle directory.
There should be two subdirectories within the new bundle directory now: one for the previous version and one for the current version.
Copy the previous “.env” file into the new bundle directory.
In v31.0.12 and later v31 versions, and in v32.0.3 and later, the trainer needs a local directory called “trainer_media” under HS_PATH. If you’re upgrading to one of these versions for the first time, create the “trainer_media” directory as follows:
```
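# /mnt/hs is the HS_PATH used in this example; substitute your own HS_PATH if it differs.
mkdir -p /mnt/hs/trainer_media
# Make UID 1000 and GID 1000 the owner of the directory.
chown 1000:1000 /mnt/hs/trainer_media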
```
Start the new trainer by running the following command:
```
sudo bash run.sh trainer <application_url> <token>
```

Once the command finishes, verify that there are two trainers by going to Administration > Trainer and looking for the new trainer in the “Connections” section.
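
The copy steps and the version condition above can be sketched as two small helpers. Both function names are hypothetical. The sketch assumes that both bundles are already extracted, that the old bundle contains a trainer-x.x.x subdirectory and a .env file, and that version strings have the form major.minor.hotfix. It does not start the trainer.

```shell
#!/bin/sh
# Hypothetical helper: copies the old trainer-x.x.x subdirectory and the .env
# file from the old bundle directory ($1) into the new bundle directory ($2).
stage_old_trainer() {
    old_bundle=$1
    new_bundle=$2
    for dir in "$old_bundle"/trainer-*; do
        # Skip if the glob matched nothing or matched a regular file.
        [ -d "$dir" ] || continue
        cp -R "$dir" "$new_bundle"/
    done
    cp "$old_bundle/.env" "$new_bundle/.env"
}

# Hypothetical helper: succeeds when version $1 is v31.0.12 or a later v31
# version, or v32.0.3 or later, which are the versions that need trainer_media.
needs_trainer_media() {
    version=${1#v}
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    hotfix=${rest#*.}
    # Major versions after v32 count as "later" than v32.0.3.
    if [ "$major" -ge 33 ]; then
        return 0
    fi
    case $major in
        31) min_hotfix=12 ;;
        32) min_hotfix=3 ;;
        *) return 1 ;;
    esac
    [ "$minor" -gt 0 ] || [ "$hotfix" -ge "$min_hotfix" ]
}

# Example usage (paths are placeholders):
#   stage_old_trainer /path/to/old_bundle /path/to/new_bundle
#   needs_trainer_media v31.0.12 && echo "create trainer_media first"
```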

Remove the older trainer after the application upgrade

Once the application is successfully upgraded, you no longer need the previous trainer version.

To remove it:

Run the following command within the new bundle directory:
```
rm -rf trainer-x.x.x
```
where x.x.x is the old trainer version.
Restart the current trainer by running the following command:
```
sudo bash run.sh trainer <application_url> <token>
```

To verify that the trainer was successfully removed, go to Administration > Trainer and check the “Connections” section to ensure that only one trainer is listed.
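
Because rm -rf is unforgiving, you may prefer to guard the removal step. The sketch below uses a hypothetical function name and assumes the bundle directory layout described above. It refuses to delete a trainer directory that does not exist or that is the only one left, and it does not restart the trainer.

```shell
#!/bin/sh
# Hypothetical helper: removes trainer-<version> ($2) from the bundle
# directory ($1) only if it exists and is not the only trainer directory.
remove_old_trainer() {
    bundle=$1
    old_version=$2
    target="$bundle/trainer-$old_version"
    if [ ! -d "$target" ]; then
        echo "not found: $target" >&2
        return 1
    fi
    # Count the trainer-* directories currently in the bundle.
    count=0
    for dir in "$bundle"/trainer-*; do
        if [ -d "$dir" ]; then
            count=$((count + 1))
        fi
    done
    if [ "$count" -lt 2 ]; then
        echo "refusing to remove the only trainer directory" >&2
        return 1
    fi
    rm -rf -- "$target"
}

# Example usage (path and version are placeholders):
#   remove_old_trainer /path/to/new_bundle x.x.x
```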