Upgrading to v32

This article explains the v32 upgrade process and lets you know what you can expect after the upgrade is complete.

Preparing to upgrade

  • Review what’s new — Before upgrading to v32, we recommend reading the release notes for v32 and any versions between your current version and v32. These release notes describe the changes and new features we’ve introduced in each version of the application. For links to these release notes, see the Release Notes section of our site.

  • Review our upgrade documentation — For an overview of the upgrade process, review our upgrade documentation:

  • Make sure your infrastructure meets the requirements for v32 — Over the past few versions, we’ve made the following changes to our infrastructure requirements:

    • Databases:

      • Deprecated support for PostgreSQL 9.5 — Beginning with v30, we’re no longer supporting the use of PostgreSQL 9.5 databases.

      • Deprecated support for Oracle 12.1 — Beginning with v32, we’re no longer supporting the use of Oracle 12.1 databases.

    • Trainer VM CPU cores: Beginning with v31, we strongly recommend having 16 cores for each CPU in a trainer VM if you are processing Semi-structured documents. If you have only 8 cores for these CPUs, you can expect 60-70% longer training times and an increased risk of crashes during training, particularly on datasets with longer, denser documents.

    • Trainer RAM: Beginning with v31, we strongly recommend that your trainer have 64GB of RAM, which will maximize the performance of the 16-core CPUs described above.

Beyond these requirements, your Hyperscience representative can recommend specific infrastructure improvements, based on your anticipated use of Hyperscience v32.
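The requirements above can be turned into a quick pre-upgrade check. The following is a minimal sketch, not an official Hyperscience tool: the function name, its parameters, and the warning wording are assumptions made for illustration, and the thresholds are the ones listed in this article.

```python
# Recommended trainer resources beginning with v31 (from the list above).
RECOMMENDED_CORES = 16
RECOMMENDED_RAM_GB = 64

# (database engine, major.minor version) -> first application version
# that no longer supports it (from the list above).
UNSUPPORTED_DATABASES = {
    ("postgresql", "9.5"): 30,
    ("oracle", "12.1"): 32,
}


def check_requirements(cores, ram_gb, db_engine, db_version, target_version=32):
    """Return a list of warnings; an empty list means no issue was found.

    Hypothetical helper: db_version is expected as a "major.minor" string.
    """
    warnings = []
    if cores < RECOMMENDED_CORES:
        warnings.append(
            f"Trainer has {cores} cores; {RECOMMENDED_CORES} are recommended."
        )
    if ram_gb < RECOMMENDED_RAM_GB:
        warnings.append(
            f"Trainer has {ram_gb}GB of RAM; {RECOMMENDED_RAM_GB}GB is recommended."
        )
    dropped_in = UNSUPPORTED_DATABASES.get((db_engine.lower(), db_version))
    if dropped_in is not None and target_version >= dropped_in:
        warnings.append(
            f"{db_engine} {db_version} is not supported beginning with v{dropped_in}."
        )
    return warnings


for warning in check_requirements(8, 32, "Oracle", "12.1"):
    print(warning)
```

The core and RAM values would need to come from your own inventory of the trainer VM; the sketch only compares them against the recommendations.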

The upgrade process

The steps involved in upgrading to v32 depend on what version of Hyperscience you're currently using. 

Upgrading from v28

If you are upgrading from v28, you cannot upgrade directly to v32, as you cannot train a v32 trainer on the v28 application. Instead, you need to upgrade to either v30 or v31 and then upgrade to v32. For simplicity, we recommend upgrading to v31 rather than v30, as doing so eliminates the need for a v30 bundle.

To learn more about upgrading to v31, see Upgrading from v28 to v31. The key infrastructure and feature considerations explained in the article also apply to v32. Following the outlined steps will ensure that your overall upgrade process from v28 to v32 is successful.

You can upgrade to the latest versions of v30 and v31 before upgrading to v32. However, make sure the versions you upgrade to are compatible with each other:

  • V30.0.13 or later can only be upgraded to v31.0.6 or later.

  • V31.0.6 or later can only be upgraded to v32.0.1 or later.
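The two compatibility rules above can be checked before you download a bundle. This is a minimal sketch under stated assumptions: the function names are hypothetical, version strings are assumed to look like "v31.0.6", and the check covers only the two rules listed here, not the overall upgrade path.

```python
def parse_version(text):
    """Turn a string such as 'v31.0.6' or 'V31.0.6' into the tuple (31, 0, 6)."""
    return tuple(int(part) for part in text.lstrip("vV").split("."))


# Major version -> (threshold on the current version, minimum allowed target).
# Both rows come from the compatibility rules listed above.
PATCH_RULES = {
    30: ((30, 0, 13), (31, 0, 6)),
    31: ((31, 0, 6), (32, 0, 1)),
}


def meets_minimum_target(current, target):
    """Return False if upgrading current -> target breaks one of the two rules.

    True only means that neither rule is violated; it does not validate
    anything else about the upgrade.
    """
    cur, tgt = parse_version(current), parse_version(target)
    rule = PATCH_RULES.get(cur[0])
    if rule is None:
        return True  # this article states no rule for that major version
    threshold, minimum_target = rule
    if cur >= threshold:
        return tgt >= minimum_target
    return True


print(meets_minimum_target("v30.0.13", "v31.0.5"))
print(meets_minimum_target("v31.0.8", "v32.0.1"))
```

Comparing versions as integer tuples avoids the pitfalls of string comparison, where "v31.0.10" would sort before "v31.0.6".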

Draining submissions

Before upgrading to v32, all submissions created in v28 or earlier must be completed. You can complete these submissions in v28 before beginning the upgrade process, or you can complete them in v30 or v31. There is no filter in the application to determine which submissions were created in v28 or earlier, but you can check for these submissions in one of the following ways: 

  • Use the submission date and the dates in your upgrade history as a guide — Submissions created in v28 or earlier have a Submission Date that predates  your upgrade to v30 or v31. 

  • Check Administration > Jobs — In the Legacy Jobs view of the Jobs table, look for jobs with Created dates that predate your upgrade to v30 or v31 and contain submission tasks.

Ensure these submissions have a Completed status before upgrading to v32. If they are not completed when you upgrade to v32, they will likely be halted, and you will need to resubmit or recreate them in order for them to be processed. 
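If you keep an export of your submissions, the date-based check described above could be scripted. The sketch below is illustrative only: the record layout, the field names ("id", "submission_date", "status"), the "Processing" status, and all sample values are made up for the example and do not reflect an actual Hyperscience export format.

```python
from datetime import date


def find_undrained(submissions, upgrade_date):
    """Return IDs of submissions created before upgrade_date that are not Completed."""
    return [
        s["id"]
        for s in submissions
        if s["submission_date"] < upgrade_date and s["status"] != "Completed"
    ]


# Made-up sample records, for illustration only.
submissions = [
    {"id": 101, "submission_date": date(2022, 1, 10), "status": "Completed"},
    {"id": 102, "submission_date": date(2022, 1, 12), "status": "Processing"},
    {"id": 103, "submission_date": date(2022, 3, 2), "status": "Processing"},
]

# Hypothetical date on which the instance was upgraded from v28 to v30 or v31.
upgrade_date = date(2022, 2, 1)

pending = find_undrained(submissions, upgrade_date)
print(pending)  # [102]
```

Any ID the function returns belongs to a submission that should reach a Completed status before you upgrade to v32.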

Upgrading from v30

Before upgrading to v32, follow the steps in Draining submissions to make sure that all submissions in your system that were created in v28 or earlier have been completed.

When upgrading from v30, you should upgrade to v31 before upgrading to v32. When doing so, you can follow our standard process for non-sequential upgrades:

  1. Train a v32 trainer on your v30 application.

  2. When all the training tasks have finished, upgrade your application to v31 on at least one machine running the application.

  3. As soon as possible, upgrade your application to v32 on all machines running the application. In order to avoid a possible reduction in model performance, we recommend minimizing the amount of time your instance runs v31.

You can upgrade to the latest version of v31 before upgrading to v32. However, make sure your current version of Hyperscience and the versions you upgrade to are compatible with each other:

  • V30.0.13 or later can only be upgraded to v31.0.6 or later.

  • V31.0.6 or later can only be upgraded to v32.0.1 or later.

Upgrading from v31

Before upgrading to v32, follow the steps in Draining submissions to make sure that all submissions in your system that were created in v28 or earlier have been completed.

To upgrade from v31 to v32, you can follow our standard process for sequential upgrades, which we've outlined in Upgrade Process Overview.

Note that v31.0.6 or later can only be upgraded to v32.0.1 or later.
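The upgrade paths described in this section can be summarized as a small lookup. This is a sketch for planning purposes only; the function name and the table are assumptions made for illustration. For v28, it encodes the recommended route through v31, although v30 is also a possible intermediate stop.

```python
# Current major version -> ordered list of versions to upgrade to.
UPGRADE_PATHS = {
    28: [31, 32],  # recommended route; going through v30 first is also possible
    30: [31, 32],
    31: [32],
}


def plan_upgrade(current_major):
    """Return the ordered upgrade steps to reach v32 from the given major version."""
    if current_major not in UPGRADE_PATHS:
        raise ValueError(
            f"No upgrade path to v32 is described here for v{current_major}"
        )
    return [f"v{major}" for major in UPGRADE_PATHS[current_major]]


print(plan_upgrade(28))  # ['v31', 'v32']
```

The sketch covers only the order of versions; draining submissions and the patch-level compatibility rules still apply at each step.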

What to expect after upgrading

After upgrading to v32, you may notice some changes in your Flow Library and slowness in processing Structured documents. You may also need to complete some additional steps to restore full functionality to your instance, as described in the sections below.

Machine Classification

After upgrading, your first submissions with Structured documents might take up to a few hours to complete. The Machine Classification Block uses precomputed data to classify Structured documents. After upgrading, this precomputed data is invalidated. The system regenerates this data the first time a submission goes through Machine Classification after upgrading. 

If processing submissions with Structured documents takes longer than expected, you should check the logs from the Activity Log section of the Submission Output page. Verify that the Machine Classification task is the one that takes more time than expected.


Flows

When upgrading from v30 or v31, the system will automatically move your flows to v32 and add new ones for v32. No changes will be made to your pre-v32 flows as part of this process. Submissions will continue to be processed through your default flow (either "Document Processing" or "Document Processing (V31)" by default). This flow will be active and deployed in v32.

Flows and Custom Code Blocks created in v30 or v31 will continue to work in v32.

System-generated flows

The system-generated flows that appear in your v32 instance depend on which version you're upgrading from.

 

System-generated flows by upgrade path:

  • From v28 to v30 to v31 to v32:

    • "Document Processing"

    • "Document Processing (V31)"

    • "Document Processing (V32)"

  • From v28 to v31 to v32:

    • "Document Processing (V31)"

    • "Document Processing (V32)"

  • From v30 to v31 to v32:

    • "Document Processing"

    • "Document Processing (V31)"

    • "Document Processing (V32)"

  • From v31 to v32:

    • "Document Processing" (if you used v30)

    • "Document Processing (V31)"

    • "Document Processing (V32)"

Document Processing (V32)

A new, uncustomized "Document Processing (V32)" flow will be added and will be disabled. This flow is a modified version of the default "Document Processing" and "Document Processing (V31)" flows and contains features introduced in v32. You can work with your Hyperscience representative to determine if this updated flow would be beneficial for your business. If you decide to use the new flow, you will need to add Input Blocks and Output Blocks to it, as they are not included by default.  

Document Processing Notifications

The system will also move your "Document Processing Notifications" flows, which contain Notification Blocks that execute mid-flow, to v32. Each of these flows remains unchanged during the upgrade and is not shown in the application by default. In addition, a new “Document Processing Notification (V32)” flow is created during the upgrade process. Contact your Hyperscience representative if you need to enable or disable connections in these flows after upgrading to v32.

Flow-specific transcription models

If you are processing Structured documents, you have the option of training flow-specific transcription, or finetuning, models in v32. By default, flows created in previous versions use the system-level transcription model.

If you do not run recalibration with a v32 trainer on your v31 application during the upgrade process, you may experience delays in processing submissions. Recalibration runs on any flows that are live when the upgrade occurs. We recommend completing this recalibration as a best practice for upgrades, even if you are not planning to use flow-specific transcription models.

Message Queue Listener connections

Starting in v32, the Hyperscience application includes an updated version of the Message Queue Listener connector. This version fixes several performance and reliability issues found in previous versions. It also changes the behavior of the connector when processing malformed or erroneous input, delegating the job of surfacing errors to the message broker. To ensure the message broker is ready to handle these errors, set up and configure a Dead Letter Queue (DLQ) so that messages the input connector can't process are removed from the queue and moved to a different queue for automatic or manual supervision.
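As an illustration of what a DLQ policy can look like, the sketch below builds a redrive policy for Amazon SQS. It assumes your message broker is SQS, which this article does not state; other brokers configure dead-lettering differently. The queue ARN and the receive count are placeholder values, and the helper name is hypothetical.

```python
import json


def build_redrive_policy(dlq_arn, max_receive_count=5):
    """Build the JSON string SQS expects in a queue's RedrivePolicy attribute.

    After a message has been received max_receive_count times without being
    deleted, SQS moves it to the dead-letter queue identified by dlq_arn.
    """
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    return json.dumps(
        {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
    )


# Placeholder ARN, for illustration only.
policy = build_redrive_policy(
    "arn:aws:sqs:us-east-1:123456789012:example-input-dlq",
    max_receive_count=3,
)
print(policy)

# With boto3 installed, the policy could then be attached to the source queue:
#   boto3.client("sqs").set_queue_attributes(
#       QueueUrl=source_queue_url,  # hypothetical variable
#       Attributes={"RedrivePolicy": policy},
#   )
```

A low receive count moves bad messages out of the way quickly, while a higher one gives transient failures more chances to succeed; the right value depends on your workload.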

Details on how to set up a DLQ and create a policy for it can be found in these articles:

S3 Submission Retrieval Store 

If you:

  • are using an S3 bucket as your submission retrieval store,

  • are not authenticating through IAM roles, and

  • are planning to use the "Document Processing (V32)" flow, 

you can use the S3 Submission Retrieval Store field of the Submission Initialization Block to enter the values for your AWS access key ID and secret access key.

If these credentials are missing or incorrect, you can edit them directly in the Submission Initialization Block. You do not need to edit your “.env” file.

If you are not using an S3 bucket for your submission retrieval store, or if you are authenticating to S3 through IAM roles, this field should contain an empty object (i.e., {}).
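The rules for this field can be sketched as a small decision function. The key names "aws_access_key_id" and "aws_secret_access_key" are hypothetical and may not match the names the Submission Initialization Block expects; check the field in your instance for the exact structure. Only the empty-object case is taken directly from this article.

```python
def retrieval_store_field(uses_s3, uses_iam_roles,
                          access_key_id=None, secret_access_key=None):
    """Return the object to enter in the S3 Submission Retrieval Store field.

    An empty dict corresponds to the empty object ({}) described above.
    """
    if not uses_s3 or uses_iam_roles:
        return {}
    if not access_key_id or not secret_access_key:
        raise ValueError("An access key ID and a secret access key are both required")
    return {
        "aws_access_key_id": access_key_id,          # hypothetical key name
        "aws_secret_access_key": secret_access_key,  # hypothetical key name
    }


print(retrieval_store_field(uses_s3=True, uses_iam_roles=True))  # {}
```

Raising an error when only one of the two credentials is supplied surfaces the "missing or incorrect" case early, before a submission fails at retrieval time.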

S3 Submission Retrieval Endpoint URL

If you:

  • use a submission retrieval store that is not in the public cloud (i.e., its URL does not point to s3.amazonaws.com; for example, a government cloud or an S3-compatible internal setup), and

  • are planning to use the "Document Processing (V32)" flow, 

make sure you can see your S3 submission retrieval endpoint URL in the Submission Initialization Block. It should appear in the S3 Submission Retrieval Endpoint URL field.

If this value is missing or incorrect, you can edit it directly in the Submission Initialization Block. You do not need to edit your “.env” file.

If you are not using an S3-like internal setup for your submission retrieval store, this field should be blank.
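To decide what belongs in this field, you can check whether your endpoint URL points to s3.amazonaws.com. The following is a minimal sketch: the function name is hypothetical, the URL is assumed to include a scheme such as https://, and the host-matching rule is an interpretation of the description above rather than the application's own logic. The example hostnames are placeholders.

```python
from urllib.parse import urlparse


def endpoint_field_value(endpoint_url):
    """Return the URL for non-public endpoints, or '' when the field should be blank."""
    if not endpoint_url:
        return ""
    host = (urlparse(endpoint_url).hostname or "").lower()
    # Treat s3.amazonaws.com and bucket-style subdomains of it as public cloud.
    if host == "s3.amazonaws.com" or host.endswith(".s3.amazonaws.com"):
        return ""
    return endpoint_url


print(endpoint_field_value("https://s3.amazonaws.com"))
print(endpoint_field_value("https://storage.internal.example:9000"))
```

Under this rule, a government-cloud endpoint such as https://s3.us-gov-west-1.amazonaws.com is kept in the field, because its host is not s3.amazonaws.com itself.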