Upgrading to v33

Versions 33.1.14, 33.1.15, 33.1.16, 33.1.17, and 33.1.18 have issues with using v31 flows after an upgrade. We recommend upgrading to these versions only if you plan to use v32 or v33 flows. 

This article explains the v33 upgrade process and lets you know what you can expect after upgrading.

Preparing to upgrade

  • Review what’s new — Before upgrading to v33, we recommend reading the release notes for v33 and any versions between your current version and v33. These release notes describe the changes and new features we’ve introduced in each version of the application. For links to these release notes, see the Release Notes section of our site.

  • Review our upgrade documentation — For an overview of the upgrade process, review our upgrade documentation:

  • Make sure your infrastructure meets the requirements for v33 — Over the past few versions, we’ve made the following changes to our infrastructure requirements:

    • Databases:

      • Deprecated support for PostgreSQL 9.5 — Beginning with v30, we’re no longer supporting the use of PostgreSQL 9.5 databases.

      • Deprecated support for Oracle 12.1 — Beginning with v32, we’re no longer supporting the use of Oracle 12.1 databases.

    • Trainer VM CPU cores: Beginning with v31, we strongly recommend having 16 cores for each CPU in a trainer VM if you are processing Semi-structured documents. If you have only 8 cores for these CPUs, you can expect 60-70% longer training times and an increased risk of crashes during training, particularly on datasets with longer, denser documents.

    • Trainer RAM: Beginning with v31, we strongly recommend that your trainer have 64GB of RAM, which will maximize the performance of the 16-core CPUs described above.

    • Beyond these requirements, your Hyperscience representative can recommend specific infrastructure improvements, based on your anticipated use of Hyperscience v33.

    • Increased upgrade times: In v33, we introduced various performance improvements that required creating more database indexes and migrating data stored in the application’s database.

      On-premise customers may see an increase in the time it takes to upgrade their application services to v33 and should plan up to an extra day in their upgrade window. The impact of these changes is greater for customers with larger databases.

      Our v33 database migrations are executed after running ./run.sh init for the first time within the new bundle, and migrations must be completed before the Hyperscience application can be used. To reduce the size of your database and the time required to upgrade, you can configure a shorter Submission Record Deletion period in Settings.
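The infrastructure requirements listed above can be turned into a quick pre-upgrade check. The following is a minimal sketch, not a Hyperscience tool: the function name, its parameters, and the lowercase engine labels are assumptions made for illustration, while the thresholds and unsupported database versions come from the list above.

```python
# Thresholds copied from the requirements above.
RECOMMENDED_TRAINER_CORES = 16
RECOMMENDED_TRAINER_RAM_GB = 64
# (engine, version) pairs the article lists as no longer supported.
UNSUPPORTED_DATABASES = {("postgresql", "9.5"), ("oracle", "12.1")}


def infrastructure_warnings(db_engine, db_version, trainer_cores, trainer_ram_gb):
    """Return a list of warnings for an environment about to move to v33.

    Hypothetical helper: you supply the values yourself; nothing is probed.
    """
    warnings = []
    if (db_engine.lower(), db_version) in UNSUPPORTED_DATABASES:
        warnings.append(f"{db_engine} {db_version} is no longer supported.")
    if trainer_cores < RECOMMENDED_TRAINER_CORES:
        warnings.append(
            f"{trainer_cores} trainer cores is below the recommended "
            f"{RECOMMENDED_TRAINER_CORES}; expect longer training times."
        )
    if trainer_ram_gb < RECOMMENDED_TRAINER_RAM_GB:
        warnings.append(
            f"{trainer_ram_gb} GB of trainer RAM is below the recommended "
            f"{RECOMMENDED_TRAINER_RAM_GB} GB."
        )
    return warnings
```

An empty list means none of the conditions described above were triggered. It does not replace a review with your Hyperscience representative.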

The upgrade process

The steps involved in upgrading to v33 depend on what version of Hyperscience you're currently using. 

Upgrading from v28 

If you are upgrading from v28, you cannot upgrade directly to v33, as you cannot train a v33 trainer on the v28 application. Instead, you need to use the following upgrade path:

  1. Upgrade to v31.

  2. Upgrade to v32.

  3. Upgrade to v33. 

You also need to drain your instance of submissions created in v28 or earlier before upgrading to v32. See Draining submissions for more information.

Follow these steps to upgrade to v33:

  1. Train a v31 trainer on your v28 application.

  2. When all the training tasks have finished, upgrade your application to v31 on all machines running the application. You do not need to upgrade to v30 before upgrading to v31.

  3. Train a v33 trainer on your v31 application.

  4. When all the training tasks have finished, upgrade your application to v32 on at least one machine running the application.

  5. As soon as possible, upgrade your application to v33 on all machines running the application. In order to avoid a possible reduction in model performance, we recommend minimizing the amount of time your instance runs v32.

To learn more about upgrading to v31, see Upgrading from v28 to v31. The key infrastructure and feature considerations explained in the article also apply to v33. Following the outlined steps will establish a foundation for a successful overall upgrade process from v28 to v33.

For best results, we recommend upgrading to the latest versions of v31 and v32 before upgrading to v33. 

  • Your v32 version needs to be v32.0.8+.

Also, make sure the versions you upgrade to are compatible with each other:

Compatibility between V31 and V32 versions

V31.0.5 and earlier ➜ V32.0.0 and later

V31.0.6 to V31.0.11 ➜ V32.0.1 and later

V31.0.12 and later ➜ V32.0.3 and later
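The compatibility table above can be expressed as a small lookup. This is an illustrative sketch only: the function names are hypothetical, and it assumes version strings of the form "31.0.6" or "v31.0.6".

```python
def parse_version(text):
    """Turn a version string such as 'v31.0.6' or '31.0.6' into (31, 0, 6)."""
    return tuple(int(part) for part in text.lstrip("vV").split("."))


def min_v32_for(v31_version):
    """Return the lowest compatible v32 version, per the table above."""
    version = parse_version(v31_version)
    if version[0] != 31:
        raise ValueError(f"expected a v31 version, got {v31_version!r}")
    if version <= (31, 0, 5):
        return (32, 0, 0)
    if version <= (31, 0, 11):
        return (32, 0, 1)
    return (32, 0, 3)


def is_compatible(v31_version, v32_version):
    """True when the v32 version meets the minimum for the given v31 version."""
    return parse_version(v32_version) >= min_v32_for(v31_version)
```

For example, `is_compatible("31.0.12", "32.0.2")` is false because V31.0.12 requires V32.0.3 or later. The sketch does not check the separate requirement that your v32 version be v32.0.8+.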

Draining submissions

Before upgrading to v32, all submissions created in v28 or earlier must be completed. You can complete these submissions in v28 before beginning the upgrade process, or you can complete them in v31. There is no filter in the application to determine which submissions were created in v28 or earlier, but you can check for these submissions in one of the following ways: 

  • Use the submission date and the dates in your upgrade history as a guide — Submissions created in v28 or earlier have a Submission Date that predates your upgrade to v31. 

  • Check Administration > Jobs — In the Legacy Jobs view of the Jobs table, look for jobs with Created dates that predate your upgrade to v31 and contain submission tasks.

Ensure these submissions have a Completed status before upgrading to v32. If they are not completed when you upgrade to v32, they will likely be halted, and you will need to resubmit or recreate them in order for them to be processed. 
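If you keep your own export of submission data, the date-based check described above can be sketched as follows. The `Submission` record shape and the function name are assumptions for illustration; they do not correspond to a Hyperscience API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Submission:
    # Hypothetical record shape; adapt it to however you export submissions.
    submission_id: int
    submission_date: date
    status: str


def undrained(submissions, v31_upgrade_date):
    """Return submissions created before the v31 upgrade that are not Completed."""
    return [
        s
        for s in submissions
        if s.submission_date < v31_upgrade_date and s.status != "Completed"
    ]
```

An empty result suggests that no submissions created in v28 or earlier are still in progress, based on the dates you supplied.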

Upgrading from v30

Before upgrading to v32, follow the steps in Draining submissions to make sure that all submissions in your system that were created in v28 or earlier have been completed. 

When upgrading from v30, you should upgrade to v31 and v32 before upgrading to v33. You can either:

  • do a sequential upgrade to v31, followed by a non-sequential upgrade to v33, or 

  • do a non-sequential upgrade to v32, followed by a sequential upgrade to v33.

For example, if you choose the second option, you would follow these steps to upgrade to v33:

  1. Train a v32 trainer on your v30 application.

  2. When all the training tasks have finished, upgrade your application to v31 on at least one machine running the application.

  3. As soon as possible, upgrade your application to v32 on all machines running the application. In order to avoid a possible reduction in model performance, we recommend minimizing the amount of time your instance runs v31.

  4. Train a v33 trainer on your v32 application.

  5. When all the training tasks have finished, upgrade your application to v33 on all machines running the application. 

For best results, we recommend upgrading to the latest versions of v31 and v32 before upgrading to v33. 

  • Your v32 version needs to be v32.0.8+.

Make sure the versions you upgrade to are compatible with each other:

Compatibility between V30 and V31 versions

V30.0.14 and earlier ➜ V31.0.0 and later

V30.0.15 and later ➜ V31.0.12 and later

Compatibility between V31 and V32 versions

V31.0.5 and earlier ➜ V32.0.0 and later

V31.0.6 to V31.0.11 ➜ V32.0.1 and later

V31.0.12 and later ➜ V32.0.3 and later
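When planning both hops from v30, it can help to validate the whole chain of intended versions at once. The sketch below encodes the two compatibility tables above as data; the names `RULES`, `hop_is_compatible`, and `incompatible_hops` are hypothetical, and the plan is assumed to contain only a v30, a v31, and a v32 version.

```python
# Each rule is (highest source version covered, minimum target version).
# None as the upper bound means "and later".
RULES = {
    30: [((30, 0, 14), (31, 0, 0)), (None, (31, 0, 12))],
    31: [((31, 0, 5), (32, 0, 0)), ((31, 0, 11), (32, 0, 1)), (None, (32, 0, 3))],
}


def parse_version(text):
    """Turn '30.0.15' or 'v30.0.15' into (30, 0, 15)."""
    return tuple(int(part) for part in text.lstrip("vV").split("."))


def hop_is_compatible(source, target):
    """Check one hop (v30 to v31, or v31 to v32) against the tables."""
    src, dst = parse_version(source), parse_version(target)
    for upper, minimum in RULES[src[0]]:
        if upper is None or src <= upper:
            return dst[0] == src[0] + 1 and dst >= minimum
    return False  # not reached while every table ends with an "and later" row


def incompatible_hops(plan):
    """Return the (source, target) pairs in the plan that break the tables."""
    return [(a, b) for a, b in zip(plan, plan[1:]) if not hop_is_compatible(a, b)]
```

For instance, `incompatible_hops(["30.0.15", "31.0.6", "32.0.8"])` reports the first hop, because V30.0.15 and later require V31.0.12 or later.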

Upgrading from v31

Before upgrading to v32, follow the steps in Draining submissions to make sure that all submissions in your system that were created in v28 or earlier have been completed. 

Follow these steps to upgrade from v31 to v33:

  1. Train a v33 trainer on your v31 application.

  2. When all the training tasks have finished, upgrade your application to v32 on at least one machine running the application.

  3. As soon as possible, upgrade your application to v33 on all machines running the application. In order to avoid a possible reduction in model performance, we recommend minimizing the amount of time your instance runs v32.

For best results, we recommend upgrading to the latest version of v32 before upgrading to v33. Make sure the versions you upgrade to are compatible with each other:

Compatibility between V31 and V32 versions

V31.0.5 and earlier ➜ V32.0.0 and later

V31.0.6 to V31.0.11 ➜ V32.0.1 and later

V31.0.12 and later ➜ V32.0.3 and later

Upgrading from v32

To upgrade from v32 to v33, you can follow our standard process for sequential upgrades, which we've outlined in Upgrade Process Overview.
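The version-dependent paths described in this section can be summarized in a small lookup. This is only a sketch of the rules stated above. The names are hypothetical, and it does not capture the trainer steps or the choice between sequential and non-sequential upgrades.

```python
# Major versions to pass through on the way to v33, keyed by starting version.
UPGRADE_PATHS = {
    28: [31, 32, 33],
    30: [31, 32, 33],
    31: [32, 33],
    32: [33],
}


def upgrade_path(current_major):
    """Return the ordered list of major versions to upgrade through."""
    try:
        return list(UPGRADE_PATHS[current_major])
    except KeyError:
        raise ValueError(
            f"no upgrade path described for v{current_major}"
        ) from None


def must_drain_first(current_major):
    """True when submissions created in v28 or earlier must be completed
    before the v32 step, which applies to every path that starts below v32."""
    return current_major < 32
```

For example, `upgrade_path(28)` returns `[31, 32, 33]`, matching the three-step path for v28.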

What to expect after upgrading

After upgrading to v33, you may notice some changes in your Flow Library and slowness in processing Structured documents. You may also need to complete some additional steps to restore full functionality to your instance, as described in the sections below.

Machine Classification

After upgrading, your first submissions with Structured documents might take up to a few hours to complete. The Machine Classification Block uses precomputed data to classify Structured documents. After upgrading, this precomputed data is invalidated. The system regenerates this data the first time a submission goes through Machine Classification after upgrading. 

If processing submissions with Structured documents takes longer than expected, you should check the logs from the Activity Log section of the Submission Output page. Verify that the Machine Classification task is the one that takes more time than expected.


Flows

When upgrading from v30, v31, or v32, the system will automatically move your flows to v33 and add new ones for v33. No changes will be made to your pre-v33 flows as part of this process. Submissions will continue to be processed through your default flow (likely "Document Processing," "Document Processing (V31)," or “Document Processing (V32)”). This flow will be active and deployed in v33.

Any flows and Custom Code Blocks created in v31 or v32 will continue to work in v33. However, we cannot guarantee that flows created in v30, including the “Document Processing” and “Document Processing Notifications” flows, will work in v33, and we do not officially support the use of these flows. 

  • If you are actively using custom flows created in v30, work with your Hyperscience representative to recreate these flows in v33.

  • If you are using “Document Processing” as your default flow, ask your Hyperscience representative to change your default flow to “Document Processing (V33).”

  • Similarly, if you are using the “Document Processing Notifications” flow, work with Hyperscience to set up “Document Processing Notifications (V33)” to meet your needs.

To migrate your processing to “Document Processing (V33)”:

  1. Go to your lower environment.

  2. Duplicate the newly-created “Document Processing” flow.

  3. Configure the duplicated flow with the configuration settings and notifications of the “Document Processing” flow you are using to process submissions in your production environment.

  4. Re-train any Identification, Classification, and Transcription models associated with the previous version's flows by clicking the Run training button on the model details page for each model. Doing so ensures that those models are trained on the latest version of the application and are compatible with the new version's flows.

  5. Assign a release to the duplicated “Document Processing” flow.

  6. Deploy the duplicated “Document Processing” flow.

  7. To test the duplicated “Document Processing” flow, manually upload and process a few submissions through this flow.

  8. If you use an integration to upload submissions, change your integration’s target flow UUID to the UUID of the duplicated “Document Processing” flow. You can copy the UUID of your duplicated “Document Processing” flow by clicking the Copy link at the bottom of the Flow Settings sidebar on the left-hand side of the Flow Studio.

  9. Disable the old “Document Processing” flow.

  10. Repeat steps 2-7 in your production environment, or export the newly created flows and models from your lower environment and import them to your higher environment.

System-generated flows

The system-generated flows that appear in your v33 instance depend on which version you're upgrading from.

Upgrade path

System-generated flows

From v28 to v31 to v32 to v33

  • "Document Processing (V31)" 

  • "Document Processing (V32)"

  • “Document Processing (V33)”

From v30 to v31 to v32 to v33

  • "Document Processing"

  • "Document Processing (V31)" 

  • "Document Processing (V32)"

  • “Document Processing (V33)”

From v31 to v32 to v33

  • "Document Processing" (if you used v30) 

  • "Document Processing (V31)"

  • "Document Processing (V32)"

  • “Document Processing (V33)”

From v32 to v33

  • "Document Processing" (if you used v30) 

  • "Document Processing (V31)" (if you used v31)

  • "Document Processing (V32)"

  • “Document Processing (V33)”

Document Processing (V33)

A new, uncustomized "Document Processing (V33)" flow will be added and will be disabled. This flow is a modified version of the default "Document Processing" and “Document Processing (V3x)” flows and contains features introduced in v33. You can work with your Hyperscience representative to determine if this updated flow would be beneficial for your business. 

If you decide to use the new flow, you will need to add Input Blocks and Output Blocks to it, as they are not included by default.  

Document Processing Notifications

The system will also move your "Document Processing Notifications" flows, which contain Notification Blocks that execute mid-flow, to v33. Each of these flows remains unchanged during the upgrade and is not shown in the application by default. In addition, a new “Document Processing Notifications (V33)” flow is created during the upgrade process. Contact your Hyperscience representative if you need to enable or disable connections in these flows after upgrading to v33.

Importing flows

In v33, flows are exported as ZIP files. These ZIP files contain the flow’s JSON file and the Python files for any Custom Code Blocks in the flow. 

In the unlikely event that you need to import a v33 flow to an instance using a previous version of Hyperscience, you need to manually upload your flow’s Python files after importing your flow.
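To see which Python files you would need to upload manually, you can inspect the exported ZIP before importing it. The sketch below uses Python's standard `zipfile` module. The function name is hypothetical, and it assumes only what is stated above: the archive contains a JSON file for the flow and Python files for any Custom Code Blocks.

```python
import zipfile


def split_flow_archive(path_or_file):
    """Return (json_files, python_files) found in an exported flow ZIP.

    Accepts a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        names = archive.namelist()
    json_files = [n for n in names if n.lower().endswith(".json")]
    python_files = [n for n in names if n.lower().endswith(".py")]
    return json_files, python_files
```

The second list names the files to upload manually after importing the flow into an instance running a previous version.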

Submission Initialization Block

In v32 and v33, we added settings that allow you to configure your submission retrieval store in the Flow Studio.

S3 Submission Retrieval Store 

If:

  • you are using an S3 bucket as your submission retrieval store,

  • you are not authenticating through IAM roles, and

  • you are planning to use the "Document Processing (V32)" or “Document Processing (V33)” flow after upgrading, 

you can use the S3 Submission Retrieval Store field of the Submission Initialization Block to enter the values for AWS access key ID and secret access key. You do not need to edit your “.env” file.

If you are not using an S3 bucket for your submission retrieval store, or if you are authenticating to S3 through IAM roles, this field should contain an empty object (i.e., {}). 

S3 Submission Retrieval Endpoint URL

If:

  • you use a submission retrieval store that is not in the public cloud (i.e., its URL does not point to s3.amazonaws.com — for example, a government cloud or an S3-compatible internal setup), and 

  • you are planning to use the "Document Processing (V32)" or “Document Processing (V33)” flow, 

make sure you can see your S3 submission retrieval endpoint URL in the Submission Initialization Block. It should appear in the S3 Submission Retrieval Endpoint URL field.

If this value is missing or incorrect, you can edit it directly in the Submission Initialization Block. You do not need to edit your “.env” file.

If the bucket you’re using as your submission retrieval store is in a public cloud (as opposed to a government cloud or an S3-compatible internal setup), this field should be blank. 

OCS Configuration

If:

  • you use an OCS instance as your submission retrieval store, and

  • you are planning to use the “Document Processing (V33)” flow, 

you can use the OCS Configuration field of the Submission Initialization Block to enter the host URL, username, password, and SSL certification information for your OCS instance. You do not need to edit your “.env” file.

Generic Web Storage (HTTP/HTTPS) Configuration

If:

  • you use a generic web storage solution as your submission retrieval store, and

  • you are planning to use the “Document Processing (V33)” flow, 

you can use the Generic Web Storage Configuration field of the Submission Initialization Block to enter the username, password, and SSL certification information for your storage service. You do not need to edit your “.env” file.
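The conditions in the four subsections above can be condensed into one decision helper. This is an illustrative sketch: the function name, its parameters, and the store labels "s3", "ocs", and "web" are assumptions, while the returned strings are the field names used in this article.

```python
def fields_to_review(store_type, flow_version, uses_iam_roles=False, public_cloud=True):
    """Return the Submission Initialization Block fields to review.

    store_type:   "s3", "ocs", or "web" (hypothetical labels)
    flow_version: 32 or 33, for "Document Processing (V32)" or "(V33)"
    """
    fields = []
    if store_type == "s3" and flow_version in (32, 33):
        if not uses_iam_roles:
            # Access key ID and secret access key go in this field.
            fields.append("S3 Submission Retrieval Store")
        if not public_cloud:
            # Government cloud or S3-compatible internal setup.
            fields.append("S3 Submission Retrieval Endpoint URL")
    elif store_type == "ocs" and flow_version == 33:
        fields.append("OCS Configuration")
    elif store_type == "web" and flow_version == 33:
        fields.append("Generic Web Storage Configuration")
    return fields
```

For example, an S3 store in a public cloud with IAM role authentication yields an empty list, which matches the guidance above to leave the store field as an empty object and the endpoint URL field blank.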

Additional considerations when upgrading from v28, v30, or v31

If you’re not currently using v32, you should be aware of the following changes that we made in v32, as they can affect your use of v33.

Flow-specific transcription models

If you are processing Structured documents, you have the option of training flow-specific transcription, or finetuning, models in v32. By default, flows created in previous versions use the system-level transcription model.

If you do not run recalibration with a v32 trainer on your v31 application during the upgrade process, you may experience delays in processing submissions. Recalibration runs on any flows that are live when the upgrade occurs. We recommend completing this recalibration as a best practice for upgrades, even if you are not planning to use flow-specific transcription models.

Message Queue Listener connections

Starting in v32, the Hyperscience application includes an updated version of the Message Queue Listener connector. This version fixes several performance and reliability issues found in previous versions. It also changes the behavior of the connector when processing malformed or erroneous input, delegating the job of surfacing errors to the message broker. To ensure the message broker is ready to handle these errors, a Dead Letter Queue (DLQ) should be set up and configured so that messages the input connector can’t process are removed from the queue and moved to a different queue for automatic or manual supervision.

Details on how to set up a DLQ and create a policy for it can be found in these articles: