V39 Release Notes

Versions v39.1.x and v39.2.x are available to SaaS customers only.

39.2.19 (23 Apr 2025)

Classification

Updated

Font rendering consistency for Microsoft font-based documents - We’ve updated the support for Microsoft fonts to ensure consistent rendering of documents, such as .docx, across environments. In version 40.0.19, certain .docx files displayed unexpected layout differences when converted to images, which impacted structured classification accuracy in downstream processes. This was caused by missing Microsoft fonts that were previously available as part of the LibreOffice library, but were removed in later versions of the library. This update restores alignment with the behavior observed in version 40.0.08 and improves reliability in classification workflows dependent on layout structure.

NOTE: You may notice differences in how .txt files are rendered across environments. This is due to a change in LibreOffice behavior introduced in their March 2025 release, which affects text-to-image conversion when exporting to PDF. To learn more, see LibreOffice documentation.

39.2.18 (15 Apr 2025)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.2.17 (6 Mar 2025)

Training Data Management

Fixed

Previewing pages in Training Data Management (TDM) for Classification models — We've fixed an issue that caused delays in loading preview images of pages in TDM for Classification models.

39.2.16 (19 Feb 2025)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.2.15 (3 Feb 2025)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.2.14 (17 Jan 2025)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.2.13 (6 Dec 2024)

Training Data Management

Fixed

Migration of data to Training Data Management – We've fixed a data-migration issue that caused database deadlocks to occur when training data was sent to Training Data Management. This issue affected data coming from completed submissions that contained more than 500 pages.

Flow Blocks

Updated

"Scope" setting for HTTP REST Blocks – We've added a Scope setting to HTTP REST Blocks, which allows you to specify a scope for requests authorized with OAuth 2.0. This setting is available only if the block's Authorization Type is set to OAuth 2.0 Client Credentials.

39.2.12 (21 Nov 2024)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.2.11 (19 Nov 2024)

Large Language Model (LLM) Blocks

Fixed

Execution of LLM Install Flow – We've fixed an issue that caused the execution of the Hyperscience-provided LLM Install Flow to fail with the error ModuleNotFoundError: No module named 'authlib'.

39.2.10 (6 Nov 2024)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.2.9 (25 Oct 2024)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.2.8 (11 Oct 2024)

Submission Pre-processing

Fixed

Processing of email attachments ingested through the Email Listener – We've fixed a pagination issue in the Submission Initialization Block that prevented email attachments from being ingested through Email Listener connections in some situations.

Layouts and Models

Fixed

Messaging about latest layout and model versions not being live – We've resolved a version-comparison issue that caused incorrect "Latest version is not live" warning messages to appear on the details pages for layouts and models.

39.2.7 (4 Oct 2024)

Connections

Updated

Specifying AWS regions for S3 Notifier connections – We've added an AWS Region setting to S3 Notifier Output Blocks, which allows you to specify the region of the S3 bucket that notifications are being sent to (e.g., us-west-2). Specifying a region helps to prevent location-constraint errors from occurring when attempting to connect to the notifications' S3 bucket.

39.2.6 (26 Sept 2024)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.2.5 (13 Sept 2024)

Training Data Management

Updated

Contents of Training Data tab for Classification models – To enhance the user experience, we've made the following updates to the Training Data tab for Classification models:

  • The Summary card provides the following additional data:

    • Total uploaded pages

    • The number of pages required and recommended for the model

    • Total excluded pages

    • The number of excluded pages recommended for the model

  • We've changed the text of the add documents links to add training data to more accurately reflect their purpose.

Fixed

Classification training data in exports – We've fixed an issue that caused some Classification training data to be incorrectly assigned to excluded layouts when training data was exported.

39.2.4 (28 Aug 2024)

Training Data Management

Fixed

Responsiveness of Training Data Management (TDM) for Classification after upgrade – We've fixed a data-migration issue that caused deadlocks and delays in TDM for Classification upon upgrading to previous versions of v39.2.

"Download Classification Model and Data" action or TDM for Classification – Because Classification training data cannot be imported to v39.2 of the application, we have removed the Download Classification Model and Data option from the menu in the upper-right corner of the Training Data page for Classification models.

Selecting rows in the tables on the Training Data page for Classification models – We've fixed an issue that prevented users from selecting all of the rows in the Training Data and Excluded Training Data tables. As part of this update, you can now choose to select all rows on the current page of the table or all rows on all pages of the table. In previous versions, only the rows visible in the table could be selected, not rows on other pages of that table.

39.2.3 (19 Aug 2024)

Training Data Management

Updated

Uploading training data for Classification models – We've reduced the amount of time required to upload training for Classification models. To make this optimization possible, the system now performs pre-processing calculations after the upload process is complete and before training begins.

Contents of Training Data tab for Classification models – To enhance the user experience, we've made the following updates to the Training Data tab for Classification models:

  • The Summary card provides the following additional data:

    • Total uploaded pages

    • The number of pages required and recommended for the model

    • Total excluded pages

    • The number of excluded pages recommended for the model

  • We've changed the text of the add documents links to add training data to more accurately reflect their purpose.

Reporting

Fixed

Generating Field Exception Reports for one-month periods – We've fixed an issue that caused out-of-memory errors to occur when Field Exception Reports were generated for a one-month period in some instances. As part of this update, these reports are now exported as CSV files rather than ZIP files.

39.2.2 (1 Aug 2024)

Models

Updated

Version information on the model details pages for Field Identification models – We've removed the Version column from the Model History table on the model details page.

Custom Supervision

Fixed

Viewing full page images during Custom Supervision – We've fixed a CSS issue that caused portions of page images to be hidden in Custom Supervision. This issue prevented keyers from finding the information needed to complete Custom Supervision tasks in some situations.

Connections

Fixed

CURL_CA_BUNDLE and ActiveMQ connections – We've fixed an issue that caused the ActiveMQ Message Queue Listener and Notifier Output Blocks to fail when the CURL_CA_BUNDLE ".env" file variable did not have a value.

Security

Fixed

Addressing security vulnerabilities in basepython packages – To ensure security, we've updated the following packages to their latest versions:

  • file

  • libmagic-mgc

  • libmagic1

  • libnghttp2-14

  • unzip

39.2.1 (17 Jul 2024)

V39.2.x Known Issue

Models

[Addressed in v39.2.2] “Verion” in Model History card – In v39.2.0 and v39.2.1, the Version shown in the Model History card on the model details page does not match the trainer version the model was trained on. We are working to address this inconsistency, and a fix will be included in a future version of v39.2.

Machine Identification

Fixed

Detecting text in Semi-structured documents – We've fixed an issue that prevented the machine from both detecting text and from generating Identification Supervision tasks in certain situations. Instead, submissions would halt in the flow's Machine Identification Block.

LLM Blocks

Updated

"Completion Parameters" setting for OpenAI (ChatGPT) Block – We've added the Completion Parameters setting to the OpenAI Block, which allows you to add parameters for OpenAI’s /v1/chat/completions endpoint to your requests (e.g., {"response_format": {"type":"json_object"}}).

Audit Log

Updated

Changes to activity names – We've edited the names of some of activities for consistency and improved readability.

Authentication

Fixed

Restarting after entering SAML certificate information in “.env” file – We've fixed an issue that caused application restarts to fail after entering values for SAML_METADATA_URL and SAML_METADATA_CERT_PATH in the ".env" file. The issue affected environments with SAML configured as the primary identity provider.

Security

Fixed

Addressing security vulnerabilities – To ensure security, we've updated:

  • authlib to 1.3.1,

  • urllib3 to 1.26.19/2.2.2,

  • djangorestframework to 3.15.2.

  • cryptography to 42.0.8, and

  • certifi to 2024.7.4.

39.2.0 (9 Jul 2024)

Training Data Management

Updated

Reanalyze data – We’ve implemented a new logic for training data analysis. You can now choose one of the following two options:

  • Reanalyze without ignored anomalies — Any anomalies that were previously ignored during the annotation process will not reappear.

  • Reanalyze from scratch — Any applicable anomalies, including any that have been ignored, will be shown.

Ability to ignore anomalies – We’ve enhanced the Anomaly Detection feature by adding an Ignore Anomaly button. If a field is annotated correctly but has been flagged as an anomaly, you can click this button to prevent it from being considered an anomaly during training.

For more information, see Labeling Anomaly Detection.

Model Management page for Identification models – We’ve added more information in Training Data Management (TDM) for Identification models. You can now see the following:

  • Model Summary — Insights on the live and candidate models. The card displays the status, projected automation, and number of documents used for training. The Model Summary also includes fields and columns available in the layout and used for model training.

  • Projected Automation chart — You can see the performance of your live model directly on the Model Management page. The chart displays how the test target accuracy would affect the automation: the lower the accuracy, the higher the automation, and vice-versa.

  • Identification Report — Available only for Field Identification models, the identification report on the Model Management page displays the following details:  

    • Number of manually and machine-identified fields

    • Field identification accuracy

    • Field-level automation

  • Field/Table Level Automation — Indicates the automation percentage of the fields or columns your model was trained on.

  • Model History — Follow the history of model training for this layout in the Model History table. You can see valuable insights on each model version and determine which one is likely to perform the best, depending on your use case.

Learn more in TDM for Identification Models.

Projected Automation chart – We’ve re-enabled the Projected Automation chart in TDM for Classification. You can now see the projected percentage of automation based on your target accuracy for Classification models.

Note that the Projected Automation chart appears only if a trained or imported model is available.

Audit Log

Updated

Enhancements to activity records – We’ve expanded and improved the records of activities covered in the Audit Log.

To learn more, see Audit Log (v39.1 and later).

Flows

Updated

Merging of Top-level Flows and All Flows pages – To streamline your access to all available information about your flows, we've combined the Top-level Flows and All Flows pages into a single Flows page. The new Flows page does not contain cards for each of your top-level flows. Instead, all of your flows—whether they be top-level flows and subflows—are listed in the All Flows table. You can filter the table's contents by flow status and tags, and you can sort the flows by flow name, the date and time of last save, status, and deployment date and time.

Flows SDK

New

Exporting multiple related flows as a single ZIP bundle – We’ve created a command-line interface (CLI) tool for the Flows SDK that facilitates the management of flows created with the SDK.

With this new tool, you can export multiple flows defined in Python into a single ZIP bundle, without having to export each one separately as a JSON file. For example, you can export a top-level flow together with all of its subflows, and you can also export all the flows required to implement a complex use case. The ZIP bundle contains a JSON file for each exported flow, along with any uploaded files (e.g., CSVs) that the flows may reference via the File data type. The format of the ZIP bundle is the same as the ZIP file that the system creates when you click Export All Flows for a top-level flow in a Hyperscience instance.

Optionally, Python functions that are used by Code Blocks can also be extracted as separate Python files inside the ZIP archive instead of being incorporated into the flows' JSON files as inline code. This option makes it easier to review the Python code and reuse it as flows are edited and created.

After you've created the ZIP bundle, you can then import it from the Flows page of a Hyperscience instance. Doing so imports all the flows inside it.

Authentication

New

Machine credentialsOur API now supports parts of OAuth 2.0, the industry-standard authorization protocol, for better security. In addition, we have developed a user-friendly interface for managing these credentials. Machine credentials are designed to replace API accounts for most types of programmatic access

This feature will be enabled by default in a future version of v39.2. To learn more about machine credentials, or to enable the feature in the meantime, contact your Hyperscience representative.

Infrastructure

Fixed

Upgrading Django and its dependencies – To increase the functionality and security of your system, we've upgraded Django to 4.2.12. We've also upgraded its dependencies to the following versions:

  • botocore-stubs to 1.34.84

  • matplotlib-inline to 0.1.7

  • pyzmq to 26.0.0

  • types-awscrt to 0.20.7

  • types-s3transfer to 0.10.1

Databases

Fixed

Django and inserting rows in Oracle databases – We've fixed a Django-related issue that prevented the django.db.models.functions.Now() function from inserting rows in the databases.

Submission Retrieval Store

New

Support for Microsoft Azure Blob Storage – You can now use Azure Blob Storage as a submission retrieval store. When connected to Azure Blob Storage, the system receives file URLs from Azure, which it then uses to download the files and process them as submissions. You can configure individual flows to ingest data from a blob by editing the Submission Bootstrap settings in each flow.

To learn more about using Azure Blob Storage as a submission retrieval store, see Flow Blocks.

SaaS

New

Monitoring cloud service status – With our new cloud service status page at https://status.hyperscience.net/, you can check the health of your SaaS deployment of Hyperscience. The ability to monitor the status of your deployment in near-real time allows you to take swift action to ensure your organization meets its SLAs.

Note that the status page provides general health information for all production deployments of our SaaS offering; it does not include information about the status of specific production deployments or particular system components (e.g., databases, file stores).

To learn more about the cloud service status page, see SaaS Service and Support.

39.1.5 (1 Aug 2024)

Machine Identification

Fixed

Detecting text in Semi-structured documents – We've fixed an issue that prevented the machine from both detecting text and from generating Identification Supervision tasks in certain situations. Instead, submissions would halt in the flow's Machine Identification Block.

Custom Supervision

Fixed

Viewing full page images during Custom Supervision – We've resolved a CSS issue that caused portions of page images to be hidden in Custom Supervision. This issue prevented keyers from finding the information needed to complete Custom Supervision tasks in some situations.

LLM Blocks

Updated

"Completion Parameters" setting for OpenAI (ChatGPT) Block – We've added the Completion Parameters setting to the OpenAI Block, which allows you to add parameters for OpenAI’s /v1/chat/completions endpoint to your requests (e.g., {"response_format": {"type":"json_object"}}).

Connections

Fixed

CURL_CA_BUNDLE and ActiveMQ connections – We've fixed an issue that caused the ActiveMQ Message Queue Listener and Notifier Output Blocks to fail when the CURL_CA_BUNDLE ".env" file variable did not have a value.

Informative error messages from UiPath Notifier Output Blocks – We've resolved an exception-handling issue In UiPath authentication that made it more difficult to troubleshoot failures in the UiPath Notifier Output Block.

Authentication

Fixed

Restarting after entering SAML certificate information in “.env” file – We've fixed an issue that caused application restarts to fail after entering values for SAML_METADATA_URL and SAML_METADATA_CERT_PATH in the ".env" file. The issue affected environments with SAML configured as the primary identity provider.

39.1.4 (3 Jul 2024)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.1.3 (20 Jun 2024)

Models

Fixed

Using undeployed Identification models – We've fixed an issue that resulted in the continued use of undeployed Identification models in submission processing in some circumstances.

Training Data Management

Updated

Projected Automation chart – We’ve added the Projected Automation chart in TDM for Classification. You can now see the projected percentage of automation based on your target accuracy for Classification models.

Note that the Projected Automation chart is available only if a trained or imported model is available.

Flexible Extraction

Fixed

Transcribing fields in manually reclassified Structured documents – We've resolved an issue that prevented fields from appearing in Flexible Extraction tasks for Structured documents that had been manually reclassified.

Security

Fixed

Addressing security vulnerabilities – To increase the functionality and security of your system, we've upgraded:

  • requests to 2.32.2,

  • docker to 7.1.0,

  • types-requests to 2.31.0.6, and

  • idna to 3.7.

39.1.2 (6 Jun 2024)

Models

Fixed

Memory consumption by flows containing several models – We've fixed a memory-consumption issue that caused runtime errors in the Machine Identification Block when processing submissions in a flow for the first time. The issue occurred in flows containing several models, particularly models for nested tables.

Flow Blocks

Updated

Reprocessing Block optimizations – We've updated the implementation of the Reprocessing Block to allow it to both gather available input and generate tasks dynamically, increasing its overall efficiency.

Connections

Fixed

Filename output of attachments ingested through the Email Listener – We've fixed a character-decoding issue that caused filenames of email attachments to appear incorrectly in the output of the Email Listener. The issue occurred when the filenames contained Unicode characters.

SaaS

Fixed

Training of Table Identification models – We've resolved a memory-leak issue that caused the training of Table Identification models to take longer than expected and ultimately fail. The issue primarily affected models for tables containing large amounts of data (e.g., nested tables, tables with a large number of columns).

39.1.1 (23 May 2024)

Training Data Management

Fixed

Showing anomalies in table annotations – We've fixed an issue that prevented detected anomalies in table annotations from being shown in the application in some situations.

Flows

Fixed

Steps in importing flows – We've resolved an issue that resulted in 400 errors at various points in the flow-import process in some instances. The issue was caused by a mismatch between the next step in the process and the step associated with the passed transaction ID.

Machine Classification

Fixed

Image Correction for documents with large, dark areas – We've fixed an issue that prevented Image Correction from detecting the incorrect orientation of documents that contained large, dark areas (e.g., images of checks on a dark background).

Machine Identification

Fixed

Asterisk as custom character for splitting segments – We've resolved an issue with text-segment detection that resulted in IndexError: list index out of range during Machine Identification. The issue occurred when custom_char_for_splitting_segments was set to * in /admin.

Connections

Fixed

Testing for Apex classes in Salesforce Listener connections – We've fixed an issue where test connections for the Salesforce Listener did not verify the presence of Apex classes permissions. The issue resulted in successful test connections for the Salesforce Listener, even when the Listener was not able to ingest submissions.

Authentication

Fixed

Authenticating through LDAP – We've addressed a race condition that prevented users from authenticating through LDAP in some situations.

Kubernetes

Fixed

Supporting multiple Python versions – We've fixed an issue related to our support of multiple Python issues that prevented Kubernetes deployments from starting.

SaaS

Fixed

Passing credentials to authentication functions – We've resolved an issue that prevented authentication functions from being called with a username, password or both. This issue resulted in session-token failures.

39.1.0 (7 May 2024)

Submission Processing

Updated

Using Ghostscript to process PDFs – To reduce the time required to process PDF files in submissions, we've updated the system to use Ghostscript to process PDFs by default rather than Mutool. Mutool is now used as a fallback option only. If necessary, you can reverse the order in which these tools are used by editing the PDF_PAGINATION_LIBS variable in your ".env" file. In previous versions, the system preferred Mutool over Ghostscript, and there was no option to change that preference.

For more information about these updates, see Processing PDFs.

Training Data Management

Updated

Document Anomaly filter - We’ve introduced a Document Anomaly filter in Training Data Management (TDM), which allows you to find documents with anomalies or Model Validation Tasks (MVTs). By activating the filter, you can focus solely on affected documents. Anomalies remain visible until you re-analyze your data, so your team members can review the anomalies at their pace. These improvements help you efficiently address anomalies and improve overall data quality.

Fixed

Page links in warning messages for tables - We've addressed an issue in table-annotation pages in TDM where clicking a page link in a warning message didn't change the focus of the middle panel to the selected page. With this update, you can directly access the mentioned page by clicking its link in the right-hand panel while annotating tables.

Flows

New

Support for Python 3.11 – In preparation for Python 3.9's end-of-life in October 2025, we now support the use of Python 3.11 in custom code, custom flows, and Python packages, as well as the use of Python 3.9. This support ensures the continued security and reliability of your operations as you upgrade your code and flows to use Python 3.11.

The default flows included in Hyperscience have been updated to use Python 3.11. While you can use either or both Python versions in Hyperscience v39.1, we recommend upgrading the entirety of your flows' code to use Python 3.11 as soon as possible.

More information about Python 3.11 support can be found in the PythonBlock section of the Flow SDK’s Source Documentation.

Multiple Python package versions – Hyperscience now supports multiple versions of Python to be used by code blocks in order to ease the transition between upgrades of the Python runtime. On the Python Packages page (Flows > Python Packages), you can see lists of installed packages for Python 3.9 and for Python 3.11. The page also provides a summary of all installed packages and indicates their compatibility with Python 3.9 and 3.11. ​​With these updates, you can see exactly which Python packages need to be updated to Python 3.11 while maintaining your existing flows and packages.

To learn more about managing Python packages, see Developing Flows.

Flows SDK

Updated

Flows SDK for v39.1 – We’ve released an updated version of our Flows SDK for v39.1, which includes several enhancements to subflows to make them easier to use.

More information about the new Flows SDK version can be found in our Flows SDK documentation.

Custom Supervision

Updated

Mandatory transcription fields – You can now mark transcription fields as mandatory when configuring Custom Supervision tasks. Keyers cannot complete tasks with these fields until they provide transcriptions for them.

Note that checkbox and signature fields cannot be mandatory fields in Custom Supervision.To learn more about configuring Custom Supervision tasks, see our Flows SDK documentation.

Audit Log

Updated

Increased logging of actions in the Audit Log - Audit logging offers enhanced visibility, allowing you to track and monitor actions in your system more effectively. This tracking helps your organization to ensure adherence to security and compliance measures. We’ve expanded the coverage of our Audit Log, starting with SaaS customers, by recording more activities, such as events related to Training Data Management, settings changes, and access to PII data.

To view the Audit Log, go to the Audit Log page (Administration > Audit Log). You can filter the list of activities by date range, activity name, operator (human or machine), and username. You can also download a CSV file of the filtered list, which contains all of the information shown on the page.

For detailed descriptions of all activities in the Audit Log, see Audit Log (v39.1).

Security

New

Security restrictions on lambda functions – To increase the security and reliability of flows, we've added restrictions to the operations which can be executed in lambda functions.

39.0.29 (25 Apr 2025)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.0.28 (11 Apr 2025)

Version 39.0.27 was not released and is not supported.

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.0.26 (26 Mar 2025)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.0.25 (6 Mar 2025)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.0.24 (18 Feb 2025)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.0.23 (3 Feb 2025)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.0.22 (17 Jan 2025)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.0.21 (5 Dec 2024)

Flow Blocks

Updated

"Scope" setting for HTTP REST Blocks – We've added a Scope setting to HTTP REST Blocks, which allows you to specify a scope for requests authorized with OAuth 2.0. This setting is available only if the block's Authorization Type is set to OAuth 2.0 Client Credentials.

39.0.20 (20 Nov 2024)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.0.19 (24 Oct 2024)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.0.18 (9 Oct 2024)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

39.0.17 (24 Sept 2024)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

By default, TVEs (POCs) of v39.0.17 can be run on IPv4 hosts only. To run a TVE of v39.0.17 on an IPv6 host, update the value of FORMS_DB_HOST in the instance's “.env” file as follows:

FORMS_DB_HOST=::1

39.0.16 (13 Sept 2024)

Training Data Management

Fixed

Classification training data in exports – We've fixed an issue that caused some Classification training data to be incorrectly assigned to excluded layouts when training data was exported.

39.0.15 (27 Aug 2024)

Training Data Management

Fixed

Responsiveness of Training Data Management (TDM) for Classification after upgrade – We've fixed a data-migration issue that caused deadlocks and delays in TDM for Classification upon upgrading to previous versions of v39.2.

"Download Classification Model and Data" action or TDM for Classification – Because Classification training data cannot be imported to v39.2 of the application, we have removed the Download Classification Model and Data option from the menu in the upper-right corner of the Training Data page for Classification models.

Selecting rows in the tables on the Training Data page for Classification models – We've fixed an issue that prevented users from selecting all of the rows in the Training Data and Excluded Training Data tables. As part of this update, you can now choose to select all rows on the current page of the table or all rows on all pages of the table. In previous versions, only the rows visible in the table could be selected, not rows on other pages of that table.

39.0.14 (16 Aug 2024)

Training Data Management

Updated

Uploading training data for Classification models – We've reduced the amount of time required to upload training for Classification models. To make this optimization possible, the system now performs pre-processing calculations after the upload process is complete and before training begins.

Contents of Training Data tab for Classification models – To enhance the user experience, we've made the following updates to the Training Data tab for Classification models:

  • The Summary card provides the following additional data:

    • Total uploaded pages

    • The number of pages required and recommended for the model

    • Total excluded pages

    • The number of excluded pages recommended for the model

  • We've changed the text of the add documents links to add training data to more accurately reflect their purpose.

Reporting

Fixed

Generating Field Exception Reports for one-month periods – We've fixed an issue that caused out-of-memory errors to occur when Field Exception Reports were generated for a one-month period in some instances. As part of this update, these reports are now exported as CSV files rather than ZIP files.

39.0.13 (1 Aug 2024)

Connections

Fixed

CURL_CA_BUNDLE and ActiveMQ connections – We've fixed an issue that caused the ActiveMQ Message Queue Listener and Notifier Output Blocks to fail when the CURL_CA_BUNDLE ".env" file variable did not have a value.

Informative error messages from UiPath Notifier Output Blocks – We've resolved an exception-handling issue In UiPath authentication that made it more difficult to troubleshoot failures in the UiPath Notifier Output Block.

39.0.12 (17 Jul 2024)

Machine Identification

Fixed

Detecting text in Semi-structured documents – We've fixed an issue that prevented the machine from both detecting text and from generating Identification Supervision tasks in certain situations. Instead, submissions would halt in the flow's Machine Identification Block.

Authentication

Fixed

Restarting after entering SAML certificate information in “.env” file – We've fixed an issue that caused application restarts to fail after entering values for SAML_METADATA_URL and SAML_METADATA_CERT_PATH in the ".env" file. The issue affected environments with SAML configured as the primary identity provider.

39.0.11 (3 Jul 2024)

Submission Processing

Updated

Using Ghostscript to process PDFs – To reduce the time required to process PDF files in submissions, we've updated the system to use Ghostscript to process PDFs by default rather than Mutool. Mutool is now used as a fallback option only. If necessary, you can reverse the order in which these tools are used by editing the PDF_PAGINATION_LIBS variable in your ".env" file. In previous versions, the system preferred Mutool over Ghostscript, and there was no option to change that preference.

For more information about these updates, see Processing PDFs.

Models

Fixed

Using undeployed Identification models – We've fixed an issue that resulted in the continued use of undeployed Identification models in submission processing in some circumstances.

39.0.10 (20 Jun 2024)

Training Data Management

Updated

Projected Automation chart – We’ve added the Projected Automation chart in TDM for Classification. You can now see the projected percentage of automation based on your target accuracy for Classification models.

Note that the Projected Automation chart is available only if a trained or imported model is available.

Flexible Extraction

Fixed

Transcribing fields in manually reclassified Structured documents – We've resolved an issue that prevented fields from appearing in Flexible Extraction tasks for Structured documents that had been manually reclassified.

Flows SDK

Fixed

Transcribing text from non-Latin language families with the Full Page Transcription Block – We've fixed an issue where the Full Page Transcription Block did not recognize submissions' language-family information, which prevented it from transcribing text from non-Latin language families.

Security

Fixed

Addressing security vulnerabilities – To increase the functionality and security of your system, we've upgraded:

  • requests to 2.32.2,

  • docker to 7.1.0,

  • types-requests to 2.31.0.6,

  • transformers to 4.39.2, and

  • idna to 3.7.

39.0.9 (6 Jun 2024)

Operating Systems

Updated

RHEL 9 support – We now support the use of RHEL 9.

Flow Blocks

Updated

Reprocessing Block optimizations – We've updated the implementation of the Reprocessing Block to allow it to both gather available input and generate tasks dynamically, increasing its overall efficiency.

Machine Classification

Fixed

Image Correction for documents with large, dark areas – We've fixed an issue that prevented Image Correction from detecting the incorrect orientation of documents that contained large, dark areas (e.g., images of checks on a dark background).

Connections

Fixed

Filename output of attachments ingested through the Email Listener – We've fixed a character-decoding issue that caused filenames of email attachments to appear incorrectly in the output of the Email Listener. The issue occurred when the filenames contained Unicode characters.

39.0.8 (23 May 2024)

Models

Fixed

Compatibility of v39.1 Latin Table Identification models with v39 – We've resolved an issue where Latin Table Identification models created in v39.1 could not be loaded into instances running v39.

Training Data Management

Fixed

Showing anomalies in table annotations – We've fixed an issue that prevented detected anomalies in table annotations from being shown in the application in some situations.

Flows

Fixed

Steps in importing flows – We've resolved an issue that resulted in 400 errors at various points in the flow-import process in some instances. The issue was caused by a mismatch between the next step in the process and the step associated with the passed transaction ID.

Manual Classification

Fixed

Rotating images in flows created in v36 – We've fixed an issue that caused submissions in v36 flows to halt when any of their page images were rotated during Manual Classification.

Kubernetes

Fixed

Supporting multiple Python versions – We've resolved an issue related to our support of multiple Python issues that prevented Kubernetes deployments from starting.

SaaS

Fixed

Passing credentials to authentication functions – We've fixed an issue that prevented authentication functions from being called with a username, password or both. This issue resulted in session-token failures.

39.0.6 (19 Apr 2024)

User Experience

Fixed

Month labels for calendars in date filters – We’ve addressed an issue where incorrect month labels for calendars in date filters were shown to users in time zones that were different from the application’s time zone. This issue was caused by inconsistencies in timezone handling.

PII Data Deletion

Fixed

PII data deletion for QA records after upgrading from v37 or earlier – We’ve resolved an issue where field QA records were deleted during the PII-deletion process, while QA records for table cells were not. This issue occurred after upgrading from v37 or earlier to v39.0.1-v39.0.4. In v39.0.5 and later, both field and table-cell QA records are included in the PII-deletion process after upgrading.

39.0.5 (16 Apr 2024)

Flows

Fixed

Processing submissions in v37 flows in v39 instances – We’ve fixed a model-retrieval issue that caused submissions to halt when they were processed in a v37 flow in a v39 instance. With this update, we’ve ensured the forward compatibility of models created in the two versions prior to the application version.

Llama Blocks

Updated

Mistral 7B Instruct model – We've updated the Llama Block to use the Mistral 7B Instruct model rather than Llama2.

PII Data Deletion

Fixed

PII data deletion for QA records – We’ve resolved an issue where field QA records were deleted during the PII-deletion process, while QA records for table cells were not. In 39.0.5 and later, both field and table-cell QA records are included in the PII-deletion process.

Authentication

Fixed

Logging in to instances with LDAP authentication – We've fixed an issue that prevented users from logging in to instances with LDAP authentication in some situations.

Kubernetes

Fixed

Frontend replicas in on-premise Kubernetes deployments – We've resolved an issue that caused on-premise instances that were running Kubernetes and using two or more frontend replicas to become unstable in some situations.

39.0.4 (11 Apr 2024)

This version contains a critical issue and is not supported.

Transcription Models

Fixed

Setting thresholds for transcription models – We've fixed an issue that prevented users from modifying threshold settings for transcription types, even when the Transcription models for those transcription types were disabled. With this update, after running training and disabling Transcription models, you can freely adjust thresholds for the models’ transcription types in flow settings as expected.

Flows

Fixed

Deploying subflows after upgrade – We've resolved an issue that prevented subflows from being deployed after an upgrade. In these situations, only the main flow was deployed after an upgrade, while its attached subflows remained undeployed. This issue caused inconsistencies in records of deployed flow versions when changes were made to subflows. In v39.0.4 and later, any updates to subflows flows are automatically deployed alongside the main flow upon saving, ensuring all components are properly deployed after upgrades.

File Storage

Fixed

Database-batch refresh times and instances with S3 file stores – We’ve resolved an issue related to prolonged database-batch refresh times that caused errors instances with S3 file stores. Additionally, minor logging enhancements have been made for the check_s3_consistency command, including tracking the command run-time, adjusting log levels for ignored inconsistencies, and providing "time since creation" for inconsistent objects, aiding in the identification of false positives.

39.0.3 (8 Apr 2024)

This version contains a critical issue and is not supported.

Languages

Fixed

Machine transcription of Korean multiline fields – We’ve fixed an issue that caused machine transcription to fail on multiline Korean fields. Examples of incorrect transcriptions and their corrected versions appear below.

  • Incorrect: 원전세배당소득 | Correct:  이자.배당소득 원천세

  • Incorrect: 이장소백당소득 | Correct: 이자.배당소득 지방소득세

File Storage

Fixed

Folder creation and inode depletion – We’ve fixed an issue where the file store created folders unnecessarily, which depleted inodes and caused disk-space errors. We've divided files into leaf folders for easier management, improving the file system's structure, preventing overcrowding, and ensuring a more balanced file distribution. As a result, this update minimizes the risk of performance issues.

Databases

New

File-version management for time zone files in Oracle databases – To ensure compatibility between server and client versions, we’ve implemented file-version management for Oracle time zone files. Oracle requires matching time zone file versions on both server and client. If the versions differ, the application may fail to start with error ORA-01805: possible error in date/time operation.

39.0.2 (29 Mar 2024)

This version contains a critical issue and is not supported.

Flows

Updated

​​Floating-point values for SDM_BLOCKS_TASK_POLL_INTERVAL and HYPERFLOW_ENGINE_TASKS_POLL_INTERVAL_SECONDS – In addition to integer values, you can also enter values of type float for the SDM_BLOCKS_TASK_POLL_INTERVAL and HYPERFLOW_ENGINE_TASKS_POLL_INTERVAL_SECONDS ".env" file variables. This update gives you flexibility when customizing your submission-processing latency, particularly if low latency levels are desired.

Quality Assurance

Fixed

Logic for automatic QA sampling rates – We’ve fixed a dereferencing issue that caused silent failures for flows with specific default settings. As part of this update, the logic accurately handles dereferencing, ensuring proper handling of affected flows. The issue affected flows created in Hyperscience versions that preceded the application version.

Audit Log

Fixed

Filtering by multiple users or activities – We’ve fixed an issue where filtering the Audit Log by multiple users or activities showed zero results, despite the presence of matching records. With this update, the Audit Log’s filters correctly display results when multiple users or activities are selected.

39.0.1 (13 Mar 2024)

This version contains a critical issue and is not supported.

Version 39.0.0 was not released and is not supported.

Platform Bundles

Updated

New download link – For v39.0.1 and later, contact your Hyperscience representative or our Support team to receive an up-to-date, account-specific download link for the latest version of Hyperscience. Because download bandwidth is metered, we recommend downloading the bundle once and then installing it on each individual machine in your instance.

Licenses

New

Licenses for on-premise instances – Beginning in v39, we require an instance-specific license key to be entered on the System Health page (Administration > System Health) of each on-premise instance. When you request a license key from your Hyperscience representative, you need to indicate which environment the license key will be applied to (e.g., production, UAT), and you need to provide the instance's URL. Each key has an expiration date and grace period associated with it.

You must enter a license key when deploying a new instance or upgrading an existing instance to v39. After you enter your license key in the application, you can check its status (e.g., active, expiring, expired) at any time on the System Health page. Users in the System Admin permission group are also informed of expiring, extended (in grade period), and expired licenses upon logging in to the application.

When an instance's license has expired, the application becomes unusable, and sending requests to the Hyperscience API results in 403 errors. The only action that can be performed in those instances is entering a new license key. Upon entering a new key, the application's functionality is restored.

For more information about license keys, see License Keys.

Languages

New

Support for Simplified Chinese Semi-structured documents – We now support data identification and extraction for Semi-Structured documents with text printed in Simplified Chinese. With this update, users can now annotate printed documents of any type written in Simplified Chinese and apply our full-page-transcription capabilities to these documents.

Updated

Enhancements to Korean-English text extraction – We’ve optimized the Korean-English extraction of medical data. Specifically, we’ve improved the transcription accuracy of text written in Batang, Gulim, and Gothik fonts.

To learn more about Korean-English and our supported languages, see Supported Languages.

Submissions

Updated

Support for EML files and their attachments – You can now extract data from EML files and their attachments. When an EML file is ingested, the system creates a PDF file from the email's body and processes each of the file's attachments as a separate document in the submission.

More information about supported file types can be found in our What is a Submission? article.

Submission Processing

Fixed

Recognizing long strings of text with no spaces – We've fixed an issue that prevented the system from recognizing entire strings of text when the strings contained up to 70 characters with no spaces.

Training

Fixed

Performance of Table Identification models and select Field Identification models – We've resolved an issue that caused the performance of Table Identification models and Multiple Occurrence and Generic Freeform Text models for Field Identification to decrease. As part of this update, a feature that stopped training when the system determined that additional training would not improve model performance has been disabled by default.

Training Data Management

New

Incremental training for Identification models – Incremental training increases the speed of the training process between iterations as your team addresses anomalies or adds more data to training sets. This feature allows you to retrain your models using the latest version of the model as a starting point. Hyperscience recommends one of the following options based on the dataset analysis and automatically selects it.

  • Train from scratch — This option requires more time to retrain your model. Selecting it restarts the training from the very beginning using all eligible documents. Learn more about document eligibility in Document Eligibility Filtering. Choose this option if:

    • you have uploaded new, diverse documents to your training set, or

    • you’ve made changes during the annotation process, such as adding new fields or updating how existing ones are annotated. Note that consistency throughout the training set is crucial to creating a high-performance model.

  • Train from last training — This option is recommended if you want to improve your model performance in one of the following ways:

    • addressing anomalies after training data analysis

    • enriching your current training set by adding more examples of your documents. The training starts from the last active version of your model.

      • Using this option when you’ve made significant changes to your training data will result in a poor model performance.

Note that this feature can be used only for models trained on v37 and above.

More information about incremental training can be found in Retraining Existing Models.

Training Data Management for Classification Models – We’ve introduced Training Data Management for Classification. This feature enables clients to add, remove, and update documents used to train Classification models to achieve more accurate results.

For more details, see Training Data Management for Classification.

Flows

Updated

More submission data in output of on-error flows – If an on-error flow is run during the processing of a submission, the output of that flow contains the ID of the halted submission, along with details on why the submission halted. Exposing this information enables flow developers to automate remedial actions, and it allows for faster identification of halted submissions.

To learn more about on-error flows, see On-Error Flows.

Flow Blocks

New

Entity Recognition Block – We've combined the functionality of the Named Entity Recognition (NER) and Custom Entity Detection (CED) Blocks into the new Entity Recognition Block. This block can serve as the foundation for both entity-extraction and entity-redaction capable flows.

  • You can configure the block to detect specific data points (e.g., personally identifiable information), ensuring that submissions' output meets the requirements established by FOIA, GDPR, HIPAA, and more. When you combine the Entity Recognition Block’s output with Hyperscience’s data-redaction capabilities, you can use raw submission output for purposes like marketing, product development, and fraud prevention without needing to sanitize the data downstream.

  • The Entity Recognition Block can also extract specific types of data from submissions, even  in cases where customers require more training data for their Identification models. This feature makes the Entity Recognition Block ideal in situations where submissions must be processed as quickly as possible after implementation.

Custom Supervision

New

Freeform text fields in Custom Supervision – Keyers can now add freeform-text metadata for individual documents during Custom Supervision tasks. For example, if each of a flow's documents should be merged to a specific case, you can add a "Merge to Case" field to your Custom Supervision interface. Then, keyers can enter a case number as a value for that field. That case number can be used by other blocks or by downstream systems to complete the processing of the document.

Note that only one value can be added for each freeform text field. Each field name can contain up to 300 characters, and each value can contain up to 500 characters.

To learn how to add freeform text fields to your Custom Supervision tasks, see our Flows SDK documentation.

Reporting

New

Automated usage and settings transmission for on-premise instances – Hyperscience requires customers to send the Usage Report for their instances. Starting in v39, if your on-premise instance is connected to the internet, it will send us the Usage Report automatically.

The Usage Report that Hyperscience receives has sensitive information redacted (e.g., proxy credentials). Report-transmission details can be found in the Audit Log.

You can use ".env" file variables to specify a proxy URL, username, and password. You may also need to add IP addresses to your system's list of allowed addresses.

Note that Usage Reports automatically transmitted to Hyperscience do not contain the additional user-interaction metrics described below.

More details about automatic transmissions of usage and settings information can be found in Automatic Transmissions of the Usage Report.

Updated

User-interaction metrics in the Usage Report – We’ve enhanced the metrics of the Usage Report (Reporting > Usage) to include an overview of users’ interactions with key features in the application.

The new metrics can be found in the following files, which are in the product_analytics folder in the downloaded bundle for the report:

  • block_counts.csv

  • connector_counts.csv

  • db_entity_counts.csv

For more information about the Usage Report and its contents, see Usage Report.

Connections

New

S3 Listener – We've added an S3 Listener option to Input Blocks, which allows you to ingest files and their metadata directly from an S3 bucket. This update reduces implementation time for S3 file retrieval, as it eliminates the need to configure an Amazon SQS connection in a Message Queue Listener or send API requests.

The connection supports the use of AWS Identity and Access Management (IAM) credentials to access the bucket, and it offers a variety of configuration options. For example, you can configure the Listener to retrieve files with specific extensions or files that have not been modified for a certain length of time. You can also specify how often files should be retrieved from the bucket.

As the S3 Listener retrieves submission files, it automatically moves ingested files to an archive bucket.

To learn more about the S3 Listener and how to configure it, see S3 Listener.

S3 Notifier Output Block – With the addition of the S3 Notifier Output Block, you can send submission data to the S3 bucket and folder of your choosing.

The connection allows you to use AWS IAM credentials to access the bucket. The Notifier can create a single JSON for all of a submission's processed documents, individual JSON files for each processed document, or individual JSON files for each document matched to a layout and a JSON file for each unmatched page. You can also choose whether to send all of a submission's data or only high-level data.

For more information about the S3 Notifier and its configuration options, see S3 Notifier.

API

New

Retrieving license key information – You can retrieve data about your instance's license key via our API.

Updated

correlation_id query parameter for Listing Submissions – We've added a correlation_id query parameter to the Listing Submissions endpoint. This parameter allows you to filter for the Submission that was processed in Flow Runs with the specified correlation_id.