V40 Release Notes

Versions v40.1.x are available to SaaS customers only.

40.1.2 (18 Nov 2024)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

40.1.1 (15 Nov 2024)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

40.1.0 (12 Nov 2024)

Internationalization

Updated

Support for German translations – You can now provide interface-text translations for the de-DE locale (German, Germany).

To learn how to upload translation files, see Providing a Translated User Interface.

Submission Pre-processing

Updated

New Rotation Correction model – We’ve implemented a new page-level rotation-detection model, which improves performance for text-sparse documents, documents with noisy backgrounds (e.g., SSN cards, driver's licenses, and birth certificates), and mixed-language documents.

Training

Updated

Deterministic Training Recovery – In v40, we introduced the Trainer Resiliency feature, which allows you to resume interrupted training tasks from their last saved checkpoint. Deterministic Training Recovery enhances Trainer Resiliency by ensuring that models' automation rates are unaffected by the use of saved checkpoint data in the completion of training tasks.

Training Data Management

Updated

Importing and exporting Classification models – With the updates included in v40.1, users can download trained Classification models and import them to other instances.

Note that these updates apply only to Classification models and not to their training data.

For more information, see TDM for Classification.

Accuracy

New

Field-level accuracy targets for transcriptions in Semi-structured documents – We've extended the field-level accuracy targets introduced in v40 to include the transcription of fields and table columns in Semi-structured documents. With this update, you can specify accuracy targets for the transcription of particular fields and table columns in a flow's settings.

Note that accuracy targets apply only to the transcription of fields and table columns and not to their identification. Field-level accuracy targets for identification will be available in an upcoming version.

For more information on setting field-level accuracy targets for Semi-structured documents, see Transcription Accuracy and Automation.

Flow Blocks

New

Google Cloud Platform Integration Blocks – We’ve introduced five new Google Cloud Platform (GCP) Blocks, which enable automation solutions requiring the usage of large language models (LLMs) or vision language models (VLMs). These blocks use Retrieval Augmented Generation (RAG) techniques to provide ground-truth data to LLMs and create high-accuracy responses:

  • Vertex AI Block (VLM/LLM)

  • BigQuery Block (BigQuery querying Block)

  • Vertex AI Embeddings Block (Embedding model)

  • Vector Search Create Block (Creating vector indexes)

  • Vector Search Query Block (Querying vector indexes)

Each of these blocks helps you build automation workflows that leverage Generative AI and RAG on GCP. The use of RAG techniques reduces the likelihood of LLM hallucinations in the blocks’ output, which helps the blocks return relevant and accurate information.

For more information about these blocks, and for assistance in implementing them, contact your Hyperscience representative.

File Storage

Updated

Enhanced Google Cloud Storage (GCS) integration – If you are using a GCS bucket as your file store, you can take advantage of our enhanced GCS integration, which supports the use of Object Versioning and Application Default Credentials (ADC).

To learn more about this enhanced integration and how to configure it, see Google Cloud Storage.

Known Issues

Task Queue

Completing selected tasks from the Perform Tasks action – If a keyer selects the checkboxes for individual tasks in the Task Queue, clicks Actions, and then clicks Perform Tasks, no tasks appear in the table. A fix for this issue will be available in an upcoming patch version.

Reporting

Access to Reporting pages for users with custom roles – We’re fixing an issue that prevents users from accessing the entire Reporting section of the application if they do not have permission to access the Reporting Overview page (Reporting > Overview). This issue affects only those users who are in custom permission groups that have permission to access other pages in the Reporting section.

40.0.10 (21 Nov 2024)

Large Language Model (LLM) Blocks

Fixed

Execution of LLM Install Flow – We've fixed an issue that caused the execution of the Hyperscience-provided LLM Install Flow to fail with the error ModuleNotFoundError: No module named 'authlib'.

40.0.9 (6 Nov 2024)

Internationalization

Updated

Support for German translations – You can now provide interface-text translations for the de-DE locale (German, Germany).

To learn how to upload translation files, see Providing a Translated User Interface.

40.0.8 (25 Oct 2024)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

40.0.7 (15 Oct 2024)

Layouts and Models

Fixed

Messaging about latest layout and model versions not being live – We've resolved a version-comparison issue that caused incorrect "Latest version is not live" warning messages to appear on the details pages for layouts and models.

40.0.6 (7 Oct 2024)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

40.0.5 (3 Oct 2024)

Connections

Updated

Specifying AWS regions for S3 Notifier connections – We've added an AWS Region setting to S3 Notifier Output Blocks, which allows you to specify the region of the S3 bucket that notifications are being sent to (e.g., us-west-2). Specifying a region helps to prevent location-constraint errors from occurring when attempting to connect to the notifications' S3 bucket.

Infrastructure

Updated

Refactoring of docker-compose files – We've refactored our docker-compose files by merging frontend and backend files into docker-compose.forms.<type>.yml. This update allows the starting of containers to be controlled from within Docker.

Upgrades

Fixed

Upgrading to v40 – We've fixed a data-migration issue that caused upgrades to v40 to fail with unsupported operand type(s) errors. This issue affected instances that had run v35 within the past year, regardless of what version was being run immediately before the upgrade.

40.0.4 (26 Sept 2024)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

40.0.3 (19 Sept 2024)

Versions 40.0.0-40.0.2 were not released and are not supported.

User Experience

New

Internationalization of the user interface – If your organization needs to provide a translated user interface (UI) for your keyers, a System Admin can upload translations of the interface's text in the languages of your choice. After a CSV containing the translations has been uploaded, a drop-down menu appears in the upper-right corner of the application, allowing any user to change the language of the UI text. By translating the application's text into your users' native languages, you can increase keyer productivity and expand the use of Hyperscience at your organization.

You can provide translations for up to three additional locales. Translations in languages read from right to left are not supported. In v40, only interfaces used by keyers (e.g., Supervision) can be translated with this feature. We will make more UI text available for translation in future versions of Hyperscience.

Note that Hyperscience does not offer translations of UI text, nor does the platform include a translation-management interface.

To learn how to provide translations of UI text, see Providing a Translated User Interface.

Enforcement of license packages – To ensure that you can use only those features we can support in your license package, any features not included in your license package are no longer available for use. This change will take effect upon entering a license key created under Hyperscience's Hypercell pricing plan. Instances with license keys created under non-Hypercell pricing are not affected and include all available Hyperscience features.

More details about the enforcement of license packages can be found in License Packages and Feature Availability.

For more information about our license packages and what is included in each one, contact your Hyperscience representative.

Layouts

New

Document Drift Management – Document Drift Management (also known as Layout Triage) helps you manage and classify unmatched pages more effectively. It automatically groups similar pages based on their visual patterns, allowing you to easily organize and create new layouts for documents that don’t match existing ones.

This feature replaces the previous "Find Potential Layouts" and "No Layout Found" processes, offering a more streamlined and effective way to handle unmatched pages.

Document Drift Management simplifies handling unmatched pages by letting you manually adjust groups and create accurate layouts directly from these pages. It works best with Structured documents, where the visual similarity between the pages is consistent.

To learn more, see Document Drift Management (Layout Triage).

Training

New

Trainer Resiliency – With Trainer Resiliency, interrupted training tasks can be resumed from their last saved checkpoint. This feature helps to reduce the time lost when model training is interrupted by network or other infrastructure failures.

You can save checkpoint data for the following types of models:

  • Field Identification

  • Table Identification

  • Long-form Extraction

By default, checkpoint data is saved every 30 minutes, but you can choose to save data at a different frequency. This data is saved in the /var/www/forms/forms/media directory. Note that Trainer Resiliency requires 6 GB of additional server capacity to save checkpoint data.

This feature is not enabled by default. You can enable it by adding variables to the ".env" file, or you can ask your Hyperscience representative for assistance.

To learn more about Trainer Resiliency, see Trainer Resiliency.

Models

New

Extended Model Compatibility – In v39 and earlier, models could be used only with flows that were created in the same version of Hyperscience in which the models were trained. While the application could run flows created in the previous two versions of Hyperscience, any retraining of those flows' models required that the flows be updated as well. This limited compatibility delayed the deployment of newer, more performant models and lengthened the overall time-to-value for Hyperscience in many use cases.

In v40 and later, a model does not need to be trained in the same version in which its flow was created in order to be used in submission processing. For example, in v40 of the application, if you have a model trained in v38 and a flow created in v38, but you need to retrain the model, you can retrain the model in v40 and continue to use it with the v38 flow.

For more details on compatibility among application, flow, and model versions, as well as the impact of this update on the v40 upgrade process, see Compatibility Across Application, Flow, and Model Versions.

Updated

Training Data Management enhancements for Identification models – We’ve enhanced Training Data Management for Identification models. In v40, you can see all models associated with the currently supported model versions in the Model History card. This enhancement provides a view of the entire model history, enabling you to revert to previously trained models. In addition, models that are rejected or undeployed no longer disappear from the history view. Instead, they remain visible as part of the comprehensive model history, giving you better visibility and control over your models.

Accuracy

New

Field-level accuracy targets in Structured documents – In many use cases, the transcriptions of some fields, such as names or Social Security Numbers, require higher levels of accuracy than others. With the updates included in v40, you can set accuracy targets that are specific to each of these fields in Structured documents. By allowing you to tailor accuracy requirements at the field level, this feature eliminates the need to create separate flows for critical transcriptions. It also prevents keyers from having to complete Transcription Supervision tasks for lower-value fields due to high accuracy targets at the flow level.

You can set field-level accuracy targets in the field dictionary or in a flow's settings. Any targets set at the flow level for a field override any targets for that field that have been set in the field dictionary. The accuracy achieved for each field you've set a specific target for can be found on the Accuracy page of the application (Reporting > Accuracy).
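The override rule above can be illustrated with a minimal sketch. It is not Hyperscience code: the function name, the dictionaries, and the fallback to a general flow-wide target are assumptions made for illustration only.

```python
def effective_accuracy_target(
    field_name: str,
    flow_field_targets: dict[str, float],
    dictionary_targets: dict[str, float],
    flow_wide_target: float,
) -> float:
    """Pick the accuracy target that applies to one field (illustrative only)."""
    # A target set for the field in the flow's settings wins.
    if field_name in flow_field_targets:
        return flow_field_targets[field_name]
    # Otherwise, use the target from the field dictionary, if any.
    if field_name in dictionary_targets:
        return dictionary_targets[field_name]
    # Assumption: fields without a specific target use the flow-wide target.
    return flow_wide_target


# Hypothetical field names and target values.
flow_field_targets = {"ssn": 0.999}
dictionary_targets = {"ssn": 0.99, "last_name": 0.98}

ssn_target = effective_accuracy_target("ssn", flow_field_targets, dictionary_targets, 0.95)
name_target = effective_accuracy_target("last_name", flow_field_targets, dictionary_targets, 0.95)
other_target = effective_accuracy_target("notes", flow_field_targets, dictionary_targets, 0.95)
```

In this sketch, "ssn" resolves to the flow-level value because it is set in both places, while "last_name" falls back to its field-dictionary value.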

Note that setting field-level accuracy targets has no effect on the number of QA tasks that the system creates.

To learn how to set field-level accuracy targets, see Transcription Accuracy and Automation.

Flows

New

Document Renderer Block – The Document Renderer Block allows you to download documents as PDFs after they go through Classification (either Machine or Manual). You can configure the block in Document Processing Subflow V40. In the block settings, you can specify the page size (in inches or millimeters) and adjust the image quality. Note that higher quality results in larger file sizes; the default quality of 50% balances size and clarity. Once the block is configured, you can find a download URL in each submission’s JSON output. Paste this URL after the core URL of your instance to initiate the PDF download (e.g., example.hyperscience.com/api/<URL>).
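As a rough sketch of how the download URL might be assembled in a script, the helper below joins an instance host with the URL taken from a submission's JSON output, following the example.hyperscience.com/api/<URL> pattern above. The function name, the https scheme, and the sample path are assumptions for illustration; the actual shape of the URL in the JSON output is not specified here.

```python
def build_pdf_download_url(instance_host: str, url_from_output: str) -> str:
    """Join the instance's core URL and the URL found in the JSON output."""
    base = instance_host.rstrip("/")
    # Assumption: the instance is served over HTTPS when no scheme is given.
    if not base.startswith(("http://", "https://")):
        base = "https://" + base
    # Follows the "<host>/api/<URL>" pattern from the example in the text.
    return f"{base}/api/{url_from_output.lstrip('/')}"


# "rendered/123.pdf" is a made-up placeholder, not a real output value.
download_url = build_pdf_download_url("example.hyperscience.com", "/rendered/123.pdf")
```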

For more information, see Flow Blocks.

Updated

Packaging of Python 3.9 runtime – In preparation for Python 3.9's end-of-life in October 2025, we've supported both Python 3.9 and Python 3.11 in the past several versions of Hyperscience. To continue to support both versions of Python in v40, we’ve included the Python 3.9 runtime as a package that is automatically deployed if you are still using flows that depend on Python 3.9. Doing so gives your organization additional time to upgrade any Code Blocks or external packages that use Python 3.9 before support ends for that version of Python.

More information on deploying the Python 3.9 runtime in v40 can be found in Developing Flows.

Flows versioning and UX updates – To provide a better overall user experience, we've made the following enhancements to the Flows user interface and versioning:

  • All flows now on a single page — The lists of top-level flows and all flows in the instance no longer appear on separate pages. The Top-level Flows page has been removed, and all flows, both top-level and subflows, can be found on the Flows page of the application.

  • Behavior of the back button (<) on flow-related pages — We've updated the behavior of the Back (<) button on certain flow-related pages (e.g., pages in Flow Studio, flow-run pages) to create a more consistent user experience. We've also removed the button from some of these pages.

  • Versioning of On-Error and Notification subflows — Because their functionality does not change from version to version, we are no longer versioning the On-Error and Notification subflows that are included in each version of Hyperscience. These subflows are still included in v40, but they do not include any version designation in their names.

Default values for subflows and blocks – Default values declared in the manifest or input elements in the JSON files of flows are now taken into account when running those flows as subflows. By making the flow-development process more intuitive, this update enhances the development experience and makes the behavior of blocks and subflows more predictable.

Note that values entered specifically on the subflow-calling block take precedence over any default values in the subflow’s definition. After that, values in the input element of the top-level flow's JSON are used before those specified in that file's manifest element.

This update may change the current behavior of any flows invoked as subflows if they don’t have explicit default values in their manifest or input elements.
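The precedence order described above (values on the subflow-calling block, then the input element, then the manifest element) can be sketched with Python's collections.ChainMap, which returns the value from the first mapping that contains a key. This is an illustrative model only; the function and the setting names are hypothetical and do not reflect how Hyperscience resolves values internally.

```python
from collections import ChainMap


def resolve_subflow_inputs(
    block_values: dict,
    input_defaults: dict,
    manifest_defaults: dict,
) -> dict:
    """Merge three sources of values, highest precedence first."""
    # ChainMap searches the mappings in order, so earlier ones win.
    return dict(ChainMap(block_values, input_defaults, manifest_defaults))


# Hypothetical setting names, used only to show the ordering.
resolved = resolve_subflow_inputs(
    block_values={"threshold": 0.9},
    input_defaults={"threshold": 0.5, "mode": "fast"},
    manifest_defaults={"threshold": 0.1, "mode": "slow", "retries": 3},
)
```

Here "threshold" comes from the calling block, "mode" from the input element, and "retries" from the manifest element, since each is the highest-precedence source that defines the value.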

Classification

Updated

Layout ID Matching for Structured Documents – Layout identifiers help the system improve classification accuracy by using distinguishing features from documents. With the flow-level settings introduced in v40, you can specify how layout identifiers are used during the Classification process:

  • Classify using Layout ID — When this setting is enabled, the system checks for a matching layout identifier in the document. If the identifier matches the expected one in the layout variation, the document is classified accordingly. If it doesn't match, the document is either sent for further review or to Document Drift Management, preventing misclassification.

  • Bypass Classification by Layout ID — This setting bypasses validation by layout identifier if the matched layout variation doesn’t have an identifier specified.

Pre-computing data for the classification of Structured documents – To increase the overall efficiency of the classification process, we've updated the system to pre-compute data for the classification of Structured documents in each release. This pre-computation occurs when the first submission enters a flow that is using a release that contains Structured documents.

Unstructured Extraction

Updated

Multiple Occurrences for Unstructured Extraction – We’ve enhanced the Unstructured Extraction (UNLP) model to support the extraction of multiple occurrences of fields. With this update, you can now capture and extract multiple instances of the same field within a document that contains long segments of text. You can also select the engine type from the UI.

For more information, see Training a New Field Identification Model.

Reporting

New

Infrastructure metrics in the Usage Report – In v40, we’ve added key infrastructure information to the Usage Report:

  • Database version

  • Operating system details

  • Kubernetes version

  • Helm chart version

To learn more, see Usage Report.

Authentication

Updated

Support for SAML_STAFF_PERMISSION_ROLES – We've removed support for the SAML_STAFF_PERMISSION_ROLES ".env" file variable. The variable’s functionality was never enabled in the application, so there is no action required on your part as a result.

For more information on configuring SAML, see SAML.

Additional OIDC configuration options – We've added the following ".env" variables for OIDC configurations:

  • HS_OIDC_VERIFY_KID — Indicates whether the OIDC client verifies the kid field in JWT tokens. Defaults to true.

  • HS_OIDC_VERIFY_SSL — Indicates whether the OIDC client verifies the SSL certificates of the OpenID provider's responses. This value can be a boolean value or a path to a certificate bundle. Defaults to true.

For more information about these and other OIDC configuration options, see OpenID Connect (OIDC).
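Because HS_OIDC_VERIFY_SSL accepts either a boolean or a path to a certificate bundle, a script that reads the variable needs to tell the two apart. The sketch below shows one way this could be done. The accepted boolean spellings ("true", "false", and so on) and the function name are assumptions, not the application's actual parsing rules.

```python
from typing import Optional, Union


def parse_verify_ssl(raw: Optional[str]) -> Union[bool, str]:
    """Interpret a verify-SSL setting as a boolean or a bundle path."""
    # An unset or empty value falls back to the documented default of true.
    if raw is None or not raw.strip():
        return True
    value = raw.strip()
    lowered = value.lower()
    # Assumption: these spellings are treated as booleans.
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    # Anything else is treated as a path to a certificate bundle.
    return value


default_setting = parse_verify_ssl(None)
disabled_setting = parse_verify_ssl("false")
# Hypothetical bundle path for illustration.
bundle_setting = parse_verify_ssl("/etc/ssl/certs/ca-bundle.crt")
```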

Fixed

Resetting passwords for locked-out users in SaaS self-service user management – We've fixed an issue that prevented passwords from being reset for locked-out users. The issue affected SaaS deployments with self-service user management enabled.

Infrastructure

New

Logging NGINX-related issues – To enable you to debug NGINX-related issues more effectively, we've introduced the NGINX_ERROR_LOG_LEVEL ".env" file variable. You can use this variable to specify the minimum severity level an issue must have in order for it to be logged by the syslog utility. The default value is info.

More details on NGINX_ERROR_LOG_LEVEL and its possible values can be found in Security.
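To illustrate what a minimum severity level means in practice, the sketch below orders the severity levels used by NGINX's error_log directive and checks whether a message at a given level would be logged. It assumes that NGINX_ERROR_LOG_LEVEL accepts the same level names as that directive; the function name is hypothetical.

```python
# Severity levels of NGINX's error_log directive, least to most severe.
NGINX_LEVELS = ["debug", "info", "notice", "warn", "error", "crit", "alert", "emerg"]


def is_logged(message_level: str, minimum_level: str = "info") -> bool:
    """Return True if a message meets the configured minimum severity."""
    # The default mirrors the documented default value of "info".
    return NGINX_LEVELS.index(message_level) >= NGINX_LEVELS.index(minimum_level)


debug_logged_by_default = is_logged("debug")
error_logged_by_default = is_logged("error")
notice_logged_at_warn = is_logged("notice", minimum_level="warn")
```

Under these assumptions, raising the variable to warn would suppress info and notice messages, while lowering it to debug would log everything.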

Updated

Support for Ubuntu 16 and RHEL 7 – From v40 onward, we no longer support the use of Ubuntu 16 or RHEL 7 with Hyperscience.

To learn more about our supported operating systems, see Infrastructure Requirements.

Support for PostgreSQL 12 – Beginning in v40, the Hyperscience application no longer supports PostgreSQL 12.

For more information about our supported databases, see Infrastructure Requirements.