V35 Release Notes


35.0.29 (21 Nov 2023)

Reporting

Fixed

Accounting for differences between browser and server times – We've updated our task-completion-time calculations to account for the difference between the server's timestamp and the browser's timestamp.
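As a rough illustration of this kind of correction (a sketch only; the function names and the sampling approach are assumptions, not the product's actual code), a browser timestamp can be shifted onto the server clock before a duration is computed:

```python
from datetime import datetime, timedelta, timezone


def clock_offset(server_now: datetime, browser_now: datetime) -> timedelta:
    """Amount to add to a browser timestamp to express it in server time."""
    return server_now - browser_now


def completion_seconds(started_server: datetime,
                       finished_browser: datetime,
                       offset: timedelta) -> float:
    """Task duration with the browser timestamp shifted onto the server clock."""
    return ((finished_browser + offset) - started_server).total_seconds()


# Hypothetical readings taken at the same instant: the browser runs 30 s fast.
server_now = datetime(2023, 11, 21, 12, 0, 0, tzinfo=timezone.utc)
browser_now = datetime(2023, 11, 21, 12, 0, 30, tzinfo=timezone.utc)
offset = clock_offset(server_now, browser_now)
```

Without the offset, a task started at 12:00:00 (server time) and finished at 12:05:30 (browser time) would be reported as 330 seconds rather than 300.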

35.0.28 (11 Oct 2023)

Submissions

Fixed

Opening and uploading submissions – We've resolved an image-processing issue that caused the system to run unnecessary queries on Semi-structured document pages. This issue resulted in delays in opening and uploading submissions in some instances.

Security

Fixed

Addressing security vulnerabilities – To ensure security, we've updated:

  • sentry-sdk to 1.14.0,

  • Pygments to 2.15.0, and

  • cryptography to 41.0.3.

35.0.27 (30 Aug 2023)

Data Types

New

Capitalized Names – We've added a Capitalized Names data type that expects names that have the first letter of each name (e.g., first name and last name) capitalized.

Training

Fixed

Field Identification training and bounding-box coordinates – We've fixed an issue that caused Field Identification training to fail if a document's bounding boxes had certain coordinates.

Releases

Fixed

Loading of Releases page – We've optimized the loading of the Releases page (Library > Releases), resolving an issue that prevented the page from being displayed in some instances.

UiPath Notifier

Fixed

Default authentication method – We've fixed an issue that caused OAuth to be the default authentication method for UiPath Notifier connections. The issue caused flows that used Basic Authentication for these connections to fail.

Reporting

Updated

Definition of dt_started – We've changed dt_started from the time the task was first assigned to the time when the task was opened. This update creates a more accurate measurement of the time taken to complete tasks.

Security

Fixed

Updating Django – To address security vulnerabilities, we've updated Django to 3.2.20.

35.0.26 (10 Aug 2023)

Submission Processing

Fixed

Duplicate submission-processing tasks – We've fixed a race condition in our task-synchronization manager that sometimes caused internal tasks to be executed more than once for a submission, resulting in data corruption.

Flows

Fixed

Subprocesses from pagination – Previously, pagination tasks sometimes created subprocesses that wouldn't time out if they couldn't be completed. To resolve this issue, we've added timeouts to these subprocesses.
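The general technique can be sketched with Python's standard library (an illustration only; run_with_timeout is a hypothetical helper, and the release notes do not describe how the product implements its timeouts):

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd; return its exit code, or None if it had to be stopped."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # if the command does not finish within timeout_seconds.
        completed = subprocess.run(cmd, timeout=timeout_seconds,
                                   capture_output=True)
        return completed.returncode
    except subprocess.TimeoutExpired:
        return None


# A command that would otherwise run for 5 seconds is stopped after 0.5.
result = run_with_timeout(
    [sys.executable, "-c", "import time; time.sleep(5)"], 0.5)
```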

Machine Transcription

Updated

Optimizations for English text in OCR-A font – We've made some enhancements that improve transcriptions of English text typed in the OCR-A font.

Security

Fixed

Addressing security vulnerabilities – To ensure security, we've updated certifi to 2023.7.22 and pyJWT to 2.7.0.

35.0.25 (28 Jul 2023)

Trainer 

Fixed

Displayed task statuses for completed tasks – We’ve fixed a UI issue that displayed the trainer’s task status as “Running” after completion on the Trainer page (Administration > Trainer).

Document Classification

Updated

Displaying Submission ID – We’ve added the Submission ID to the top of the page for easier traceability of your uploads (“Document Classification: Submission ”).

Transcription Supervision

Fixed

ResizeObserver loop exceeded error in Chrome on Macs – We've fixed an issue that, in some instances, caused ResizeObserver loop exceeded errors during Transcription Supervision in Chrome on Macs.

Databases

Fixed

Notifications and deadlocks – We've resolved an issue that caused database deadlocks to occur if the user and system made changes to notifications at the same time.

Security

Fixed

Addressing security vulnerabilities – To ensure security, we've updated paddlepaddle to 2.4.2 and pillow to 9.5.0.

35.0.24 (22 Jun 2023)

Training

Fixed

Field Identification training times for large datasets – We've cached the grouping predictions that the trainer generates during Field Identification training, helping to minimize training times for instances with large datasets.

File Storage

Fixed

Sanitizing filename headers – We've fixed a data-sanitization issue in HTTP filename headers that prevented files from being downloaded to the file store.

Running run.sh init as root – We've fixed an issue that caused HS_PATH/media to be created as root when run.sh init was run as root and the HS_PATH/media directory was missing. This issue prevented the application from initializing the file store.

As part of this update, the system verifies the presence and ownership of HS_PATH/media when run.sh init is executed. To learn more, see File Storage Overview.

Security

Fixed

Updating requests – To address security vulnerabilities, we've updated requests to 2.31.0.

35.0.23 (5 Jun 2023)

Manual Classification

Updated

Unrecognized pages – We've added validations to Manual Classification tasks to prevent the error 'NoneType' object has no attribute 'uuid', which caused submissions to halt. The error was caused by a page-matching issue in which images were considered missing as the submission moved through the flow.

Output Blocks

Updated

OAuth support for UiPath Notifier Output Blocks – We've added support for OAuth connections in UiPath Notifier Output Blocks. To configure OAuth as an authentication method, you need to create an application grant in UiPath and enter the application's ID and secret in the block's settings.

For more information, see UiPath Notifier.

Security

Fixed

Addressing security vulnerabilities – To ensure security, we've updated:

  • sqlparse to 0.4.4 and

  • the version of Golang used to compile Filebeat to 1.20.4.

35.0.22 (16 May 2023)

Flow Blocks

Updated

Generating checksums for individual blocks – We now generate checksums of each block's command file, which are used to identify the blocks in the database and prevent duplicate blocks from being uploaded.
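A minimal sketch of checksum-based deduplication follows. The hash algorithm (SHA-256) and the BlockRegistry class are assumptions made for illustration; the release notes do not say which algorithm or storage layout the product uses.

```python
import hashlib


def command_checksum(command_bytes: bytes) -> str:
    """Hex digest of a block's command file (SHA-256 is an assumption)."""
    return hashlib.sha256(command_bytes).hexdigest()


class BlockRegistry:
    """Hypothetical in-memory stand-in for the blocks table."""

    def __init__(self):
        self._by_checksum = {}

    def upload(self, name: str, command_bytes: bytes) -> bool:
        """Store the block unless an identical command file is already known."""
        checksum = command_checksum(command_bytes)
        if checksum in self._by_checksum:
            return False  # duplicate: same command file already uploaded
        self._by_checksum[checksum] = name
        return True


registry = BlockRegistry()
first = registry.upload("my_block", b"print('hello')")
second = registry.upload("my_block_copy", b"print('hello')")
```

Here the second upload is rejected because its command file hashes to the same value as the first.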

35.0.21 (26 Apr 2023)

Submissions

Updated

Maximum number of files per upload – We've increased the default maximum number of files per upload from 100 to 1000. This value can be customized with the DATA_UPLOAD_MAX_NUMBER_FILES ".env" file variable.

Security

Fixed

Addressing security vulnerabilities – To ensure security, we've updated:

  • com.fasterxml.jackson.core:jackson-databind to 2.14.2,

  • json5 to 2.2.3,

  • esplint to 0.10.1,

  • mocha to 10.2.0, and

  • webpack to 5.77.0.

35.0.20 (4 Apr 2023)

Classification Supervision

Fixed

User interface for Classification Supervision tasks – We've made the following fixes to the Classification Supervision user interface:

  • We've widened the right-hand panel, enlarging the image of the page being categorized.

  • We've fixed an issue that caused the screen to flicker each time a keyer clicked on a thumbnail in the left-hand panel.

  • We've resolved an issue that caused the right-hand panel to be hidden when a keyer clicked on a page group in the middle panel.

Reporting

Fixed

Reporting time spent on Classification Supervision tasks – We've fixed an issue that prevented time spent on Classification Supervision tasks from being reported in Document Classification Supervision Time Spent (Seconds). This metric appears in the KeyerPerformance.csv file in the Keyer Projection Report and previously had a value of 0.

Security

Fixed

Updating com.fasterxml.jackson.core:jackson-databind – To address security vulnerabilities, we've updated com.fasterxml.jackson.core:jackson-databind to 2.14.2.

35.0.19 (31 Mar 2023)

Submission Processing

Fixed

Splitting a page's text into segments when "0" is a segment's only character – We've fixed an issue that prevented a page's text from being split correctly into text segments when segments contained only the "0" character. This issue caused processing delays and excessive memory usage.

Machine Classification 

Fixed

Storing pre-calculations for classifying Structured documents – We've resolved an issue that caused invalid memory alloc request errors when the system attempted to store pre-calculated values for the release's Structured layout variations in the database. The issue affected instances with PostgreSQL databases.

SaaS

Updated

Giving users access to /admin in deployments without AWS ALB – You can now give /admin access to users whose email addresses are from a specific domain. To do so, add the HS_ADMIN_EMAIL_DOMAIN variable to your ".env" file, and set its value to the email domain you want to give /admin access to (HS_ADMIN_EMAIL_DOMAIN="example.com"). This update applies only to deployments without AWS ALB authentication.
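Conceptually, a domain-based check of this kind might look like the sketch below. Only the HS_ADMIN_EMAIL_DOMAIN variable name comes from these notes; the has_admin_access function and its matching rules (exact, case-insensitive domain match) are assumptions for illustration.

```python
import os


def has_admin_access(email: str, environ=os.environ) -> bool:
    """True if the email's domain equals HS_ADMIN_EMAIL_DOMAIN (assumed rule)."""
    domain = environ.get("HS_ADMIN_EMAIL_DOMAIN", "").strip().lower()
    if not domain:
        return False  # variable not set: no domain-based access
    local, sep, email_domain = email.rpartition("@")
    # Require a real "local@domain" shape before comparing domains.
    return bool(local) and bool(sep) and email_domain.lower() == domain


env = {"HS_ADMIN_EMAIL_DOMAIN": "example.com"}
allowed = has_admin_access("admin@example.com", env)
denied = has_admin_access("admin@other.com", env)
```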

35.0.18 (22 Mar 2023)

Classification Supervision

Updated 

Enhancements to the user experience – We've made the following improvements to the Classification Supervision user experience:

  • We've added a Select All link to the top of the left-hand sidebar.

  • We've added an Ungroup All Documents option to the menu at the top of the middle column.

  • We've changed the title of the Create New button at the bottom of the left-hand sidebar to Create New Doc.

  • We've added a page count under the “Uncategorized” header in the left-hand sidebar.

  • We've added a document count under the “Grouped Documents” header in the middle column.

SaaS

Fixed

"API Access" tab in the Users section for deployments without AWS ALB – We've fixed an issue that caused the API Access tab to appear in the Users section of the application in deployments that did not use AWS ALB authentication.

35.0.17 (17 Mar 2023)

Security

Fixed

Updating paddlepaddle – To address security vulnerabilities, we've updated paddlepaddle to 2.3.2.

35.0.16 (15 Mar 2023)

Training

Fixed

Training Field Identification models on documents with one text segment – We've fixed an issue that caused training to fail in some situations where:

  • at least one document contained only a single text segment, and

  • that segment was part of a field's value.

35.0.15 (13 Mar 2023)

Classification Supervision

Fixed

"Perform Tasks" link in Submissions table – We've fixed an issue that prevented the Perform Tasks link from appearing in the Submissions table for submissions with Classification Supervision tasks. The issue affected submissions whose first page was classified by the machine.

SaaS

Fixed 

Authentication and SaaS features when AWS ALB is not used – We've resolved an issue that prevented users from authenticating in some situations when a method other than AWS ALB was used. This issue also caused some SaaS-specific features to be disabled in affected instances.

35.0.14 (3 Mar 2023)

Keyer Data Management

Fixed

Duplicate pages after training – We've fixed an issue that caused pages to be duplicated after their documents were used for training. The issue affected documents that contained at least one empty page.

Artifacts

Updated

Logging of artifact-export events – We've changed the severity of the following events from exceptions to warnings in the logs:

  • Missing artifacts list

  • Missing storage type

  • Missing destination

Permissions

Fixed

Logging in without assigned user groups or permissions – Previously, if a user was not assigned to a user group in an identity provider (IdP), or if they were assigned to an IdP user group that did not have any permissions, they could log in to the application, but they could not log out. There was also no messaging to let the user know what they needed to do to resolve the issue. A fix for these issues is included in v35.0.14.

35.0.13 (22 Feb 2023)

SaaS

Fixed

Updating users' permission groups in environments with federated authentication – We've resolved an issue that prevented users' permission groups from being updated after they were changed in the identity provider's settings. The issue affected SaaS deployments that used AWS ALB authentication.

API

Fixed

Checking files in Submission Creation requests – We've fixed an issue where sending a Submission Creation request with an empty files parameter caused a 500 (Internal Server) error to be thrown rather than a 400 (Bad Request) error. The updated response also includes more details on what caused the error.
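The pattern behind this fix, validating input up front so that a client mistake yields a 400 instead of an unhandled 500, can be sketched as follows. The validate_files helper and the shape of the error body are hypothetical; the actual API response may differ.

```python
def validate_files(files):
    """Return None if the request may proceed, else a (status, body) pair."""
    if not files:
        # Reject the request explicitly rather than failing later in processing.
        return 400, {
            "error": "Bad Request",
            "detail": "The 'files' parameter must contain at least one file.",
        }
    return None


rejected = validate_files([])
accepted = validate_files(["invoice.pdf"])
```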

Security

Fixed

Updating oauthlib and cryptography To address security vulnerabilities, we've updated oauthlib to 3.2.2 and cryptography to 39.0.1.

35.0.12 (10 Feb 2023)

Connections

Fixed

Logs for HTTP Notifier Output Blocks and Submission Initialization Blocks – We've resolved an issue that caused authentication information to be included in logs for HTTP Notifier Output Blocks and Submission Initialization Blocks using Generic Web Storage (HTTP/HTTPS) configurations.

Settings

Fixed

Functionality and description of Submission Upload setting – We've fixed an issue that caused the Create Submission button to be enabled when the Submission Upload setting was disabled in the System Settings (Administration > System Settings). Also, we've updated the description of the Submission Upload setting with the correct name of the Create Submission button.

Security

Fixed

Updating ddtrace – To address security concerns, we've updated ddtrace to 1.7.2.

35.0.11 (3 Feb 2023)

Flows

Fixed

"Go to Model Management" link for Machine Transcription Block errors – Previously, when a user clicked the Go to Model Management button for a flow-validation error related to a Machine Transcription Block, the user would be taken to the list of Transcription models in the instance rather than the details page of the model causing the error. A fix for this issue is included in v35.0.11. 

Submission Processing 

Updated

Size of submission metadata – To reduce the size of submission metadata in the database, we've eliminated unnecessary whitespace in the submission metadata JSON. 
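To show the effect of this kind of change, the sketch below serializes the same (made-up) metadata with Python's default separators and with compact ones. It illustrates the technique only and is not the product's serialization code.

```python
import json

# Hypothetical metadata; real submission metadata will differ.
metadata = {"source": "scanner-1", "tags": ["a", "b"]}

# Default separators add a space after each "," and ":".
default = json.dumps(metadata)

# Compact separators drop that whitespace without changing the data.
compact = json.dumps(metadata, separators=(",", ":"))
```

Both strings parse back to the same object; only the stored size changes.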

Documents

Fixed

Viewing deleted documents – We've resolved an issue that caused errors to occur when users attempted to view images of documents that had been deleted (e.g., documents that were deleted during PII-data deletion).

Custom Supervision

Fixed

Keyboard shortcuts for marking incomplete mandatory tasks as complete – We've fixed an issue that allowed keyers to mark incomplete mandatory tasks as complete by using the CTRL + Enter keyboard shortcut, even though the Complete task (CTRL + Enter) button was disabled.

Reporting

Fixed

Data for Semi-structured documents in Usage Reports – We've resolved an issue that prevented data for Semi-structured documents from appearing in the following Usage Report files:

  • checkbox_machine_transcriptions_report.csv

  • supervision_transcriptions_report.csv

  • machine_transcriptions_report.csv

Inclusion of data from the last hour of the current day – We've fixed an issue that caused data from the last hour in a day (23:00:00 to 23:59:59) to be excluded from the following reports on the Tasks Overview page (Tasks > Overview) for the current day:

  • Average Time Spent per Task Type Today

  • Volume and Processing Distribution Completed Today
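The release notes do not state what caused the last hour to be dropped. As a general illustration, one common way to make sure a daily report covers the whole day is to filter with a half-open interval that runs from midnight to the next midnight. The function names below are hypothetical.

```python
from datetime import date, datetime, time, timedelta


def day_bounds(day: date):
    """Half-open interval [start, end) covering the whole calendar day."""
    start = datetime.combine(day, time.min)
    return start, start + timedelta(days=1)


def in_day(timestamp: datetime, day: date) -> bool:
    start, end = day_bounds(day)
    return start <= timestamp < end


today = date(2023, 2, 3)
late_event = datetime(2023, 2, 3, 23, 59, 59)
included = in_day(late_event, today)
```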

Parsing of data timestamps – We've resolved an issue that caused data timestamps to be parsed incorrectly in certain situations, which affected the accuracy of reports.

Message Queue Listeners

Updated

Handling of NullPointerException exceptions and non-2xx HTTP errors – We've made changes to ensure that NullPointerException exceptions and non-2xx HTTP errors are managed in ways that are transparent to the user and do not disrupt the user experience. Specifically, the system manages non-2xx HTTP errors as exceptions, and if one of these errors occurs or a NullPointerException is thrown, the process of handling the message is stopped.

35.0.10 (20 Jan 2023)

Classification

Fixed

Generating Manual Classification tasks – We’ve resolved an issue that prevented Manual Classification tasks from being generated. 

Tables

Fixed

Ground-truth data for nested tables – We’ve fixed an issue with ground-truth data for nested tables that prevented child rows from being linked to their respective parent rows.

Custom Supervision

New

Decision dependencies – The addition of decision dependencies for Custom Supervision tasks provides you with more flexibility for configuring available options in decision drop-down menus. In v35.0.10, Custom Supervision tasks support both decision dependencies within a single document and decision dependencies across multiple documents. 

Configuring decision dependencies within a single document allows you to present different options in decision drop-down menus based on user input. For example, you can configure Custom Decision 1 with two possible answers. Based on the selected answer for Custom Decision 1, you will receive different possible answers for Custom Decision 2. For more information about the above example, see the following table:

Custom Decision 1 Question | Custom Decision 1 Answer | Custom Decision 2 Question | Custom Decision 2 Possible Answers
Has JavaScript experience? | Yes | Select job offer | Frontend Engineer, Senior Frontend Engineer
Has JavaScript experience? | No | Select job offer | Backend Engineer, Senior Backend Engineer
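The dependency in this example can be thought of as a lookup from the first answer to the options offered for the second decision. The sketch below models only that idea; it is not the configuration format used in the application, and the names are hypothetical.

```python
# Answer to "Has JavaScript experience?" -> options for "Select job offer".
JOB_OFFER_OPTIONS = {
    "Yes": ["Frontend Engineer", "Senior Frontend Engineer"],
    "No": ["Backend Engineer", "Senior Backend Engineer"],
}


def job_offer_options(has_js_experience: str) -> list:
    """Options shown for Custom Decision 2, given the answer to Decision 1."""
    return JOB_OFFER_OPTIONS.get(has_js_experience, [])


options = job_offer_options("Yes")
```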



The newly-added support for decision dependencies across multiple documents also allows you to present different options in decision drop-down menus based on user input. The difference here is that your answers in one of the documents affect the possible answers in other documents. For example, you can configure Custom Decision 1 in Document 1 with two possible answers. Based on the selected answer for Custom Decision 1, you will receive different possible answers for Custom Decision 2 in Document 2. For more information about the above example, see the following table:

Custom Decision 1 Question in Document 1 | Custom Decision 1 Answer in Document 1 | Custom Decision 2 Question in Document 2 | Custom Decision 2 Possible Answers in Document 2
Has JavaScript experience? | Yes | Select job offer | Frontend Engineer, Senior Frontend Engineer
Has JavaScript experience? | No | Select job offer | Backend Engineer, Senior Backend Engineer

Note that you can configure decision dependencies only for documents and cases.

Mandatory decisions – We’ve added support for mandatory decisions in Custom Supervision tasks. To complete a Custom Supervision task, you need to select answers for all mandatory decisions.

Reporting

Fixed

Records of tasks performed on Table Identification training data – We've fixed a task-reporting issue that caused the system to create duplicate records for user tasks that were performed on Table Identification training data.

35.0.9 (13 Jan 2023)

Models

Fixed

Running Training Data Analysis on instances with MSSQL databases – We've fixed an issue related to page-count calculations that caused Training Data Analysis to fail on instances with MSSQL databases.

Upgrades

Fixed

Blank field predictions after upgrading from v34 to v35 – We've resolved an issue that resulted in blank predictions for fields after retraining a model during v34-to-v35 upgrades. The issue was caused by broken mapping between training data and the text segments in documents.

35.0.8 (9 Jan 2023)

Flows

Fixed

"Go to Model Management" buttons in error descriptions – We've fixed an issue that caused each Go to Model Management button in validation-error descriptions to link to the Models page rather than the Model Details page for the relevant model.

Box Integration

Fixed

Maintaining connections to Box – We've resolved an issue that caused connect-reset errors when connecting to Box.

35.0.7 (12 Dec 2022)

Custom Supervision

Fixed 

Deadlocks for submissions with Custom Supervision tasks – We've fixed a decision-record-deletion issue that caused deadlock errors when multiple submissions were processed through a Custom Supervision Block at the same time.

35.0.6 (8 Dec 2022)

Model Training

Fixed

Dividing by zero during the training of Field Identification models – We've fixed a division-by-zero issue that caused the training of Field Identification models to fail in some situations.

Training-data migration and submission metadata – We've resolved an issue that caused training-data migration to fail if submissions had metadata formatted as strings rather than dictionaries.
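A defensive way to handle both representations is to normalize metadata before using it. The sketch below assumes string metadata is JSON-encoded, which the release notes do not confirm; normalize_metadata is a hypothetical helper.

```python
import json


def normalize_metadata(metadata):
    """Return metadata as a dict, decoding it first if it arrived as a string."""
    if isinstance(metadata, dict):
        return metadata
    if isinstance(metadata, str):
        parsed = json.loads(metadata)  # assumption: strings hold JSON
        if not isinstance(parsed, dict):
            raise ValueError("metadata string does not encode an object")
        return parsed
    raise TypeError(f"unsupported metadata type: {type(metadata).__name__}")


from_string = normalize_metadata('{"batch": "A-17"}')
from_dict = normalize_metadata({"batch": "A-17"})
```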

Supervision

Fixed

Propagating changes made in Field Identification and Transcription Supervision – We've fixed an issue that prevented changes made during Field Identification and Transcription Supervision from being reflected in subsequent tasks in some circumstances. For example, fields shown during Transcription Supervision matched those predicted during Machine Identification, not the edited versions created during Field Identification Supervision.

Exported Data

Fixed

Redacting restricted information from CSV exports – We've removed restricted information from the exports of page-, field-, and table-level data, both in the application and in our API endpoints. Examples of restricted information include the UUIDs of submissions and flows that the user generating the exports does not have access to.

API

Fixed

Submissions created with Content-Type: application/json; charset=utf-8 – We've fixed an issue that caused internal server errors to occur when submissions with Content-Type: application/json; charset=utf-8 were created via the API.
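Headers like this one carry parameters after the media type, so comparing the raw header value with "application/json" fails. Whether that was the cause here is not stated in these notes. The sketch below shows one standard-library way to split the media type from its charset; parse_content_type is a hypothetical helper.

```python
from email.message import Message


def parse_content_type(header_value: str):
    """Split a Content-Type header into (media type, charset or None)."""
    message = Message()
    message["Content-Type"] = header_value
    return message.get_content_type(), message.get_content_charset()


media_type, charset = parse_content_type("application/json; charset=utf-8")
```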

Functionality of the /api/artifacts/schema endpoint – We've removed unnecessary code that caused requests sent to the /api/artifacts/schema endpoint to result in 500 errors.

Security

Fixed

Updating wheel to v0.38.4 – To address security vulnerabilities, we've updated wheel to v0.38.4.

Infrastructure

New

One Identity Safeguard integration – With the addition of a One Identity Safeguard integration, we now provide you with another way to protect and encrypt your system-level secrets. The security that comes with this integration is highly recommended in cases where server-level controls are insufficient. For example, you can protect your database credentials and authentication secrets by storing them in your One Identity Safeguard account instead of keeping them as plain text in your “.env” file.

To learn more about integrating a One Identity Safeguard solution into your instance, see One Identity Safeguard.

Updated

Dynamically-generated secrets.yml file – To reduce the margin of error and improve the experience of configuring secrets-management integrations, the system now dynamically generates the secrets.yml file. The secrets.yml file was previously required for the AWS Secrets Manager and CyberArk Conjur integrations. In v35.0.6 and later, you no longer need to manually create and configure any .yml files. However, if you have configured a secrets.yml file in previous versions, you can continue to use it in future versions of the application. 

35.0.5 (1 Dec 2022)

This version of v35 is not officially supported. v35.0.6 is the first officially supported version of v35.

Model Validation Tasks

Fixed

Displaying multiline bounding boxes – We’ve fixed an issue that caused multiline bounding boxes to be displayed as separate segments during Model Validation Tasks. 

Reporting

Fixed

Downloading the Usage report – We’ve fixed an issue that prevented the Usage report (Reporting > Usage) from populating data for the current date. This issue occurred when downloading the report with one of the following predefined date ranges:

  • Week to Date

  • Month to Date

  • Year to Date

SaaS

Fixed

Access to the Administration page for System Admins – We’ve fixed an issue with SaaS instances that caused a 403 error to occur when System Admins attempted to open the Administration page.

Security

Fixed

Updating cryptography to v38.0.3 – To address security vulnerabilities, we've updated cryptography to v38.0.3.

API

Fixed

Flow-based permissions for API endpoints – To protect API endpoints against unauthorized access, we’ve applied flow-based permissions to all v5 API endpoints that access flow-related resources. For example, users who do not have access to a given flow won’t have access to submissions that are processed by this flow.
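In outline, flow-based filtering restricts results to resources whose flow the requesting user can access. The sketch below uses made-up data structures and is not the API's implementation.

```python
def visible_submissions(submissions, allowed_flow_ids):
    """Keep only submissions processed by flows the user can access."""
    allowed = set(allowed_flow_ids)
    return [s for s in submissions if s["flow_id"] in allowed]


# Hypothetical records; field names are assumptions.
submissions = [
    {"id": 1, "flow_id": "flow-a"},
    {"id": 2, "flow_id": "flow-b"},
]
visible = visible_submissions(submissions, ["flow-a"])
```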

35.0.4 (22 Nov 2022)

This version of v35 is not officially supported. v35.0.6 is the first officially supported version of v35.

Quality Assurance

Fixed

QA tasks for submissions with wiped PII data – We’ve fixed an issue that prevented the system from removing QA tasks for submissions whose PII data had already been wiped. 

Keyer Data Management

Fixed

Annotations and Field ID model training – We’ve fixed an issue that invalidated ground-truth annotations in some circumstances. This issue caused the training of Field ID models to fail. 

Training

Fixed

Running training data analysis – We’ve fixed an issue that caused training-data analysis to fail for some layout variations.

35.0.3 (18 Nov 2022)

This version of v35 is not officially supported. v35.0.6 is the first officially supported version of v35.

Quality Assurance

Fixed

Clearing QA tasks on instances that use MSSQL databases – We’ve fixed an issue that prevented users from clearing more than 2000 QA tasks at the same time on instances that use MSSQL databases. 
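The release notes do not describe the root cause or the fix. For context, SQL Server caps the number of parameters in a single request at 2,100, and a common way to stay under limits of this kind is to process IDs in batches. The sketch below illustrates batching with a hypothetical batch size.

```python
def chunked(ids, size):
    """Yield consecutive slices of ids, each with at most size items."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# 4,500 hypothetical task IDs cleared in batches of 2,000.
task_ids = list(range(4500))
batch_sizes = [len(batch) for batch in chunked(task_ids, 2000)]
```

Each batch would then be cleared with its own query.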

Reporting

Fixed

Time zones and reporting – We’ve fixed an issue that caused some reports to use UTC instead of the timezone set in the “.env” file.

35.0.2 (10 Nov 2022)

This version of v35 is not officially supported. v35.0.6 is the first officially supported version of v35.

Training

Fixed

Restoring the “Trainer tasks” page in /admin – We’ve restored the “Trainer tasks” page in /admin.

Connections

Updated

“Scope” and “Additional OAuth Request Parameters” settings for HTTP Notifier Blocks – We've added optional Scope and Additional OAuth Request Parameters settings for the OAuth 2.0 Client Credentials authorization type in HTTP Notifier Blocks.

Security

Fixed

Updating lxml to v4.9.1 – To address a security vulnerability, we've updated lxml to v4.9.1.

35.0.1 (8 Nov 2022)

This version of v35 is not officially supported. v35.0.6 is the first officially supported version of v35.

User Experience

Updated

Support for Internet Explorer – In v35, we are deprecating support for Internet Explorer. Beginning in v36, Internet Explorer will no longer be a supported browser. 

Although you can continue using Internet Explorer in v35, we recommend using the latest version of Chrome or Microsoft Edge instead, as we will continue to support those browsers.

Languages

Updated

Korean and English language support – With our first dual-language model, we are able to process documents where a given field can be in either English or Korean. The model automatically handles transcription of either language within a field. This model is ideal for registration documents where entries can be in either Hangul or the English alphabet, and it is especially useful on documents with mixed-language use cases involving tables.

The Korean and English model is available for both Structured and Semi-structured documents. To use it, select the Korean and English option anywhere a language is selectable in the application.

Layouts

Fixed

Editing field customizations on instances that use MSSQL databases – We’ve fixed an issue that resulted in an unexpected error when editing field customizations on instances that use MSSQL databases. 

Layout Editor

Fixed

Deactivating auto-cloned fields – Previously, all auto-cloned fields of a layout variation were linked to a single shared field. Deactivating any of the layout variation’s auto-cloned fields caused this layout variation’s Layout Editor page to crash when a user attempted to open it. A fix for this issue is included in v35.0.1.

Deactivating fields in Structured layout variations – Previously, deactivating fields in Structured layout variations prevented these fields from being displayed in the layout editor’s Inactive items list. If you clicked the Commit changes button after deactivating some fields in a layout variation, the same issue also prevented you from opening this layout variation. A fix for this issue is included in v35.0.1.

Models

New

Managing transcription models – With the new Transcription Models option on the Models page (Library > Models), you can view a list of the active v35 finetuning models in your instance. You can also enable or disable daily autotraining for your models by selecting or deselecting the Daily autotraining enabled option above the list of models.

For each model listed, you can see the following details:

  • The model's name

  • The number of flows using the model, with a link to more information about those flows

  • Whether the model is currently live

  • What the model applies to (e.g., language family, checkboxes, signatures)

Clicking on a model's name allows you to view the model's history, including its past versions. If needed, you can deploy an older version of a model by clicking its Deploy button. From the model's details page, you can also upload a new version of the model, as well as download all or specific versions of it.

To learn more about managing transcription models, see Managing Transcription Models.

Updated

Training data for Table Locator models – Beginning in v35, Table Locator models are only trained with data from model validation tasks (MVTs) and QA tasks by default. 

You can also use data from Supervision tasks to train these models. To enable this option in your instance, contact your Hyperscience representative.

Flows

New

Configuring mid-flow notifications – You can now configure mid-flow notifications in Flow Studio. As part of this update:

  • You can assign Notification flows to Document Processing flows and set up notifications for specific submission-status changes to be sent to your downstream systems.

  • You can also view and edit each Notification flow on the same page as its Document Processing flow.

These changes make the flow-configuration process more intuitive and give you more visibility into how your flows are connected.

To learn more about mid-flow notifications, see Connecting Flow Blocks to Other Flows.

Running flows on specific application machines – In v35, you can run flows with models that were created in v33 and v34, as well as v35. However, system resources can become limited due to multiple versions of models running at the same time. This issue may cause out-of-memory errors and delays in submission processing. With the Memory Management feature included in v35, you can now allocate specific application machines to flows that were created in a specific version of Hyperscience, eliminating resource-contention issues across flow versions.

Note that this feature does not change our support for flows from previous Hyperscience versions. You can still run flows from Hyperscience versions that were created in the two versions prior to your current application version, but not from versions earlier than those. For example, you can run flows from v35, v34, and v33, but not from v32 or earlier.

To learn more about Memory Management, see Memory Management.

Logging method for Custom Code and Python Blocks – With the new log method in the Flows SDK's HsBlockInstance interface, you can now view log messages from custom blocks on Flow Execution pages in the application and in Docker logs. This method collects messages upon completion of the blocks' tasks, whether the tasks were executed successfully or ended with errors. As a result, this update simplifies the flow-troubleshooting process for developers.

To learn more about flow logs, see Testing and Debugging Flows.

Uploading custom Python packages in the application – If you would like to install custom Python packages for your custom blocks, a System Admin can do so on your behalf on the Python Packages page of the application (Flows > Python Packages). To enable this feature, add the HS_ALLOW_EXTERNAL_PYTHON_PACKAGES_UPLOAD variable to your ".env" file and set it to true.

For more information, see Developing Flows.

Updated

Flow-validation framework and error messaging – To improve the flow-development experience, we've enhanced the validation framework for flows. Updates include clearer descriptions of validation errors, explicit distinctions between errors and warnings, and direction on how to resolve the errors where possible.

Flow Blocks

New

Named Entity Recognition Block – To extend our extraction capabilities and better support use cases like redaction within our flows, we've introduced Named Entity Recognition Blocks. These blocks allow you to:

  • detect key PII entities such as names, addresses, locations, organizations, and companies.

  • enhance the full-page transcription output with information about detected entities.

You must use Named Entity Recognition Blocks in conjunction with Full Page Transcription Blocks. For example, you can build a redaction flow that processes documents through full-page transcription, detects all personal names, and then uses a Custom Code Block to place black boxes over the detected names.
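
The final step of such a redaction flow might look like the sketch below. It is only an illustration: the entity dictionaries (a "type" and a pixel "box"), the "PERSON" label, and the page represented as a grid of grayscale values are all assumptions, not the actual output format of the Named Entity Recognition Block.

```python
from typing import Dict, List


def redact(page: List[List[int]], entities: List[Dict], entity_type: str) -> List[List[int]]:
    """Return a copy of the page with matching entity boxes painted black (0)."""
    redacted = [row[:] for row in page]
    height = len(redacted)
    width = len(redacted[0]) if redacted else 0
    for entity in entities:
        if entity["type"] != entity_type:
            continue
        # Hypothetical box format: (left, top, right, bottom), end-exclusive.
        left, top, right, bottom = entity["box"]
        # Clamp the box to the page so out-of-range boxes cannot raise errors.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                redacted[y][x] = 0
    return redacted


# A blank 6x4 "page" (255 = white) with two hypothetical detected entities.
page = [[255] * 6 for _ in range(4)]
entities = [
    {"type": "PERSON", "box": (1, 1, 4, 3)},
    {"type": "ORGANIZATION", "box": (0, 0, 2, 1)},
]
result = redact(page, entities, "PERSON")
```

Only the box of the matching entity type is blacked out, and the original page is left unmodified.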

Fixed

Default value of the Image Correction setting – We’ve fixed an issue that caused the Machine Classification Block’s Image Correction setting to be disabled by default.

Submission Initialization Block

Updated

Testing Submission Initialization Block connections – When setting up or troubleshooting Submission Initialization Blocks, you can now test connectivity to your S3 buckets, OCS buckets, or generic web storage (HTTP/HTTPS). To do so, click the Test File Store Retrieval button under the block's settings.

Submissions Table

Fixed

Names of disabled flows – We've fixed an issue that caused incorrect names to appear for disabled flows in the Submissions table. Previously, a combination of the flow's identifier and UUID would be given as the flow's name.

Submission Processing

Fixed

Excluding a layout variation and halted submissions – Previously, if you had a live layout with multiple variations and successfully processed documents with one of these variations, excluding this variation from a new release resulted in halted submissions when resubmitting the same documents. A fix for this issue is included in v35.0.1.

Tasks

New

Service Level Agreement (SLA) and Task Management – SLA Management is the creation and management of Service Level Agreement definitions in Hyperscience. SLA Management allows users to set submission deadlines. The system uses these deadlines to automatically prioritize tasks in the Task Queue tab. 

Task Management allows users to manage the Task Queue tab by viewing which tasks are about to breach their SLAs. 

To achieve the goals of SLA and Task Management, we’ve made the following changes to the Tasks page:

  • SLA Rules tab

    • We’ve added an SLA Rules tab to the Tasks page. This tab allows you to manage and add SLA rules. 

    • We’ve added an SLA rule dialog that allows you to configure deadlines for submissions based on their flow and Input Block, or layout variation. The system then uses these deadlines to sort the tasks by priority in the Task Queue tab.

  • Task Queue tab

    • We’ve added an SLA Deadline column to the tasks table in the Task Queue tab.

    • We’ve added SLA Name and SLA Deadline filters to the tasks table in the Task Queue tab.

  • Overview tab

    • We’ve added a Quality Assurance Tasks Chart to the Overview tab that shows the quantity of available QA tasks by task type.

    • We’ve added a Task SLA Performance Against Deadlines table to the Overview tab that shows the volume of tasks that are approaching their deadlines.

    • We’ve added an Average Time Spent per Task Type Today table to the Overview tab that displays the average wait and active time per task type for the current day.

    • We’ve added a Volume and Processing Distribution Completed Today table to the Overview tab that displays the number of completed submissions, documents, pages, fields, table cells, Supervision tasks, and QA tasks for the current day.

    • We’ve renamed the Active Workers table to Active Users. The Active Users table now displays the names of all active users and the task type each one is currently working on. 
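
Deadline-based prioritization of the kind described above can be sketched as a simple sort. The Task type and the rule that tasks without an SLA deadline come last are assumptions made for illustration; the application's actual ordering logic may weigh other factors.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class Task(NamedTuple):
    """Hypothetical task record; not the application's data model."""
    task_id: int
    sla_deadline: Optional[datetime]


def prioritize(tasks: List[Task]) -> List[Task]:
    # Tasks with a deadline come first, earliest deadline first;
    # tasks without a deadline follow (an assumption of this sketch).
    return sorted(
        tasks,
        key=lambda t: (t.sla_deadline is None, t.sla_deadline or datetime.max),
    )
```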

Fixed

Flow filter when switching tabs – We’ve fixed an issue that reset the flow filter when switching from the Perform Tasks tab to the Task Queue tab.

Task Queue

Updated

Additional info about SLA rules in the SLA Name filter – To help users differentiate among SLA rules with the same name, we've added an info icon to each SLA rule in the Task Queue's SLA Name filter. Hovering over this icon reveals the definition of the SLA rule.

Filtering tasks by submission-creation date and time – You can now filter the tasks in the Task Queue based on their submissions' creation dates and times. To use this new filter, click on the Filter by Submission Date text box at the top of the Task Queue page.

Text color for SLA deadlines within the next hour – In the Task Queue, SLA deadlines that are within the next 60 minutes appear in orange text.
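
The 60-minute rule can be expressed as a small check. This is a sketch only: is_due_within_hour is a hypothetical helper, and treating deadlines that have already passed as outside the window is an assumption, since these notes do not say how breached deadlines are displayed.

```python
from datetime import datetime, timedelta


def is_due_within_hour(deadline: datetime, now: datetime) -> bool:
    """Return True if the deadline falls within the next 60 minutes."""
    remaining = deadline - now
    # Assumption: deadlines that have already passed do not count.
    return timedelta(0) <= remaining <= timedelta(minutes=60)
```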

Classification

New

Reclassification of machine-classified Structured documents – Reclassification of Structured documents allows keyers to add and remove pages from machine-classified Structured documents with missing pages. To learn more, see Document Classification.

Updated

Grouping and ordering improvements – To enhance the Document Classification experience and give you more flexibility in how you group and order pages, we've rebuilt our Document Classification page. Our new Document Classification page allows you to add, remove, and reorder pages in grouped documents, using drag-and-drop functionality and keyboard shortcuts. To learn more about grouping and ordering improvements, see Document Classification.

Removing document foldering and metadata – We've removed the document foldering and metadata features from the application. These features were available in steps 2 and 3 of the Document Classification process, respectively, and you can use Case Collation in their place. If you have questions about your specific use case, contact your Hyperscience representative.

Fixed

Layout variations during Manual Classification – We’ve fixed an issue that caused layout variations to appear multiple times in the Layout drop-down list during Manual Classification.

Manual Classification task after upgrading the application from v33 – We’ve fixed a few issues with submissions that had machine-classified documents in v33 and for which Manual Classification was started after upgrading to v35. The issues were the following:

  • Submission page numbers were missing.

  • You couldn’t add uncategorized pages to machine-classified documents with missing pages. 

  • An unexpected error occurred when you created a new grouped document, added a page to it, and slightly moved this page to the right with drag-and-drop.

Transcription

New

Fine-tuning for Semi-structured documents – To improve transcription performance and minimize the human effort needed to process Semi-structured documents, we’ve made the following changes:

  • We’ve enabled fine-tuning and recalibration for Semi-structured text and Semi-structured checkboxes.

    • Fine-tuning can be used only for Semi-structured fields that do not have multiple bounding boxes.

  • We’ve added a Transcription Automation Training setting to both Structured and Semi-structured Document Transcription flow settings. 

  • We’ve added a Periods of Records to Use setting to both Structured and Semi-structured Document Transcription flow settings.

  • We’ve moved the Improved Threshold Accuracy setting to the Structured Document Transcription flow settings.

  • We’ve moved the Finetuning Only For Trained Layouts setting to the Structured Document Transcription flow settings.

  • We’ve removed the Semi-structured Transcription Confidence Boost setting.

Automatic QA Sample Rate setting – To help you determine the Transcription QA sample rate for your fields, we’ve introduced the Automatic QA Sample Rate setting. When this setting is enabled, the system automatically calculates and sets the percentage of fields that should be selected for Transcription QA. This setting applies to all fields, including text, checkbox, and signature fields.

You can enable the Automatic QA Sample Rate setting in the General Transcription section of your flow’s settings. 

Configuring the currency prefix for Transcription tasks – We’ve added a Currency Mask Prefix setting in Administration > Settings. This setting allows you to override the default “$” prefix that is used for currency data types during Transcription tasks. 

Model Validation Tasks

Updated

Removing Model Validation Tasks (MVTs) from the QA queue – We’ve removed MVTs for Field and Table Locator models from the Quality Assurance Tasks card on the Perform Tasks tab (Tasks > Perform Tasks). You can still access MVTs on the Model Details page by clicking the Perform Tasks button.

Training

Fixed

Restoring the “Trainer tasks” page in /admin – We’ve restored the “Trainer tasks” page in /admin.

Keyer Data Management

New

Document clustering – To reduce the number of annotation errors and make the process of training Field Locator and Table Locator models faster and less complex, we’re introducing document clustering in the Keyer Data Management tools. Document clustering allows data keyers and knowledge workers to complete tasks more quickly and to be as accurate as possible while doing so. 

To enable document clustering, we’ve added a Training Data Analysis card to the Model Details page. You can use this card to run a data analysis that helps you identify and fix annotation errors more easily. The analysis assigns training documents to groups and suggests how you can improve model performance for each group, either by adding training documents to the group or by removing them from it.

Annotating training documents with guidance – To speed up the annotation process in Keyer Data Management, we’ve introduced annotation suggestions. These annotation suggestions provide you with predictions about where a given field or a table column might be located. The system automatically starts generating these suggestions once you annotate two or three documents from a group of training documents. To learn more about annotating training documents with guidance, see Training Data Analysis and Guided Data Labeling.

Adding new fields and columns to layout variations – Previously, when adding a new field or a table column to a layout variation, the annotated training data became unusable. To train a model on the new layout variation, you had to upload and annotate a new set of documents with the new field or table column.

With the improvements included in v35, even if you’ve added a new field or a table column, you can reuse existing ground-truth data of annotated fields and table columns. To train a model with newly added fields and table columns without re-annotating all existing ground-truth data, we’ve introduced the option to add annotations for these fields and table columns, using the Keyer Data Management tools.

Importing and exporting ground-truth data – To save users the effort of resubmitting and annotating old data for each new Field Locator or Table Locator model, we now allow users to export and import ground-truth data within the same layout across different instances. For example, you can move ground-truth data across variations of the same layout in different instances, but you cannot move this data across entirely different layouts. 

Exporting ground-truth data allows you to safely test how a new model performs in another environment and retrain the model if needed. Once the model achieves the desired results, you can import your new ground-truth data into your production environment. During imports, you can choose whether to overwrite or skip importing ground-truth data for duplicate documents. 

In addition to these benefits, the exporting functionality also allows you to store ground-truth data externally, regardless of your PII data deletion settings. When PII data deletion is enabled, a “Model training data at risk” warning message appears on the Model Details page. 

Migrating ground-truth data after upgrading to v35 – We’ve added a background job that automatically migrates ground-truth data after upgrading to v35. This job usually takes a few hours to complete. For a large amount of ground-truth data, this job might take more than a few hours but less than a day to complete.

Reporting

New

Submissions SLA report – We’ve added a Submissions SLA report (Reporting > Processing Time) that can be downloaded. The report includes the following columns:

  • Submission ID 

  • Submission Creation Time 

  • Submission Due Time 

  • Submission Complete Time

  • Delta Between Due Time and Complete Time

    • The value is positive if work has been completed before the SLA deadline.

    • The value is negative if work has been completed after the SLA deadline.

  • Submission Due Date Source

  • SLA Rule Applied - Name

  • SLA Rule Applied - Definition

  • Number of Folders

  • Number of Documents

  • Number of Pages

  • Number of Page Failures

  • Number of Fields

  • Number of Machine Only Fields Transcribed

  • Number of Tables

  • Number of Table Cells

  • Number of Machine Only Table Cells Transcribed

  • Submission State

  • Submission Substate

  • Submission Metadata

  • External ID

To learn more about the Submissions SLA report, see Submissions SLA Report.
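
The sign convention of the Delta Between Due Time and Complete Time column corresponds to subtracting the completion time from the due time. The sketch below illustrates that convention with Python datetimes; sla_delta is a hypothetical helper, and the unit in which the report expresses the value is not stated in these notes.

```python
from datetime import datetime, timedelta


def sla_delta(due_time: datetime, complete_time: datetime) -> timedelta:
    """Positive when work finished before the deadline, negative when after."""
    return due_time - complete_time
```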

Browser-usage data in Application Usage Report – We've added browser-usage data to the Application Usage Report. This data is provided in a separate CSV file whose name is in the format browser_usage_by_day_and_user__start_date_YYYY-MM-DD__end_date_YYYY-MM-DD.csv.
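
If you process these files in scripts, the file name for a given date range can be reconstructed from the format above. The helper name browser_usage_filename is hypothetical; the name pattern itself is taken from this note.

```python
from datetime import date


def browser_usage_filename(start: date, end: date) -> str:
    """Build the CSV file name for a date range, using YYYY-MM-DD dates."""
    # date.isoformat() produces the YYYY-MM-DD form used in the file name.
    return (
        "browser_usage_by_day_and_user"
        f"__start_date_{start.isoformat()}__end_date_{end.isoformat()}.csv"
    )
```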

Field Exceptions Report – We've added a Field Exceptions Report to the Accuracy tab in the Reporting section of the application. For each field exception, the report includes information about the field's submission, layout, document, page, and field.

Updated

Separate data for Supervision and QA tasks in KeyerPerformance.csv – We’ve updated KeyerPerformance.csv, which is part of the Keyer Projection Report (Reporting > User Performance), to have separate data for Supervision and QA tasks.

Excluding auto-transcribed fields from reports – We’ve added a toggle to the following reports to enable users to exclude auto-transcribed fields:

  • Automation (Reporting > Overview)

  • Document Output Accuracy (Reporting > Overview)

  • Manual Accuracy vs Machine Accuracy (Reporting > Accuracy)

  • System Transcription Sampled Errors (Reporting > Accuracy)

Filtering by flow – We’ve added a Flow filter to the following reports:

  • Manual Accuracy vs Machine Accuracy (Reporting > Accuracy)

  • Performance Distribution (Reporting > User Performance)

  • All Users Performance Summary (Reporting > User Performance)

We’ve also added a Flow column to the following “.csv” files from the Keyer Projection Report (Reporting > User Performance):

  • HourlyReportingTaskOverview

  • HourlyReportingSubmissionOverview

  • HistoricalProcessing

  • KeyerPerformance

New metrics in the Keyer Projection Report – We've added columns to the Keyer Projection Report for the following metrics:

  • Supervision tasks

    • Field Identification Supervision Time Spent (Seconds)

    • Fields Identified in Supervision

    • Field Transcription Supervision Time Spent (Seconds)

    • Fields Transcribed in Supervision

    • Fields Thru Puts in Supervision (Hours)

    • Field Characters Keyed in Supervision

    • Field Characters Thru Puts in Supervision (Hours)

    • Table Cell Transcription Supervision Time Spent (Seconds)

    • Table Cells Transcribed in Supervision

    • Table Characters Keyed in Supervision

    • Table Characters Thru Puts in Supervision (Hours)

  • QA tasks

    • Field Identification QA Time Spent (Seconds)

    • Fields Identified in QA

    • Field Transcription QA Time Spent (Seconds)

    • Fields Transcribed in QA

    • Fields Thru Puts in QA (Hours)

    • Field Characters Keyed in QA

    • Field Characters Thru Puts in QA (Hours)

    • Table Cell Transcription QA Time Spent (Seconds)

    • Table Cells Transcribed in QA

    • Table Characters Keyed in QA

    • Table Characters Thru Puts in QA (Hours)

Note that values for these columns are not provided retroactively; they contain data only for tasks completed after upgrading to v35.0.1.

Permissions

Updated

Messaging about flow assignments and existing tasks – When you assign flows to a permission group, any open tasks created from those flows are not restricted to members of that permission group. We've added messaging to the permission group-assignment process to clarify this aspect of the assignments.

Labeling of available flows in Assign Group Access to Specific Flows – Previously, the flows listed under Assign Group Access to Specific Flows were labeled as "Available Live Flows," even if some of the flows listed were disabled. For accuracy, we've changed this label to "Available Flows."

Fixed

Number of available permission groups when editing task restrictions – Previously, a maximum of 50 permission groups were visible and available to choose from when editing task restrictions. A fix for this issue is included in v35.0.1.

Security

Fixed

Addressing security vulnerabilities – To ensure security, we've made the following updates:

  • We’ve pinned importlib-metadata to 4.3.0.

  • We’ve updated protobuf to 3.19.16.

  • We’ve updated org.apache.pdfbox to 2.0.27.

  • We’ve updated lxml to 4.9.1.

  • We’ve updated joblib to 1.2.0.

Integrations

Updated

Support for Microsoft 365 in the Email Listener – The Email Listener now supports connections to Microsoft 365 Outlook accounts. To ingest emails from Microsoft 365 inboxes, select Microsoft 365 Outlook from the new Email Provider drop-down list in the Email Listener connection's settings. Then, enter your account and authentication information in the Connection settings fields.

Note that the Email Listener uses Microsoft Graph APIs for account authentication; OAuth authentication is not supported at this time.

More information on the Email Listener can be found in Email Listener.

"Full and Unmatched" export type for output connections – We've introduced a new "Full and Unmatched" export type, which builds on the functionality of the "Individual documents" export type by sending a notification for each unmatched page in a submission. 

For more information on export types, see Universal Integration Block Settings.

Selecting export types for output connections – We've changed the Export Type drop-down list in output connection settings to a set of buttons, with a description of each export type in its respective button. This update makes the selection of an export type more intuitive. As part of this update, the "Individual documents" export type has been renamed "Full individual object."

SSL certificates for HTTP Notifier connections – With the new SSL Certificate Settings options for HTTP Notifier connections, you can now provide information about SSL certificates without editing your ".env" file. 

To learn more, see HTTP Notifier.

Databases

Updated

Support for PostgreSQL 14 – We've added support for PostgreSQL 14.

Networking

Updated

Support for IPv6 – The Hyperscience Platform can be used in both IPv4 and IPv6 network environments.

Installations

Fixed

New installations in instances with MSSQL databases – We've fixed a data-migration issue that caused application deployments to fail in instances with MSSQL databases.

Upgrades

Updated

Forward-compatible models – As part of our efforts to improve the application experience, we've made models forward compatible, decoupling the model-upgrade process from the application-upgrade process. Flows from v33 and v34 that use v33 or v34 models, respectively, for automation continue to work after you upgrade to v35. In future versions of Hyperscience, we plan on expanding the number of previous model versions supported by this feature.

Note that this feature does not eliminate the need to train new versions of models. Rather, it allows you to complete this training after the application-upgrade process is complete, making it easier to upgrade to new versions of Hyperscience.

API

New

List Transformed Submissions endpoint – We've added a new List Transformed Submissions endpoint, which allows you to retrieve information about multiple transformed submissions through a single request.

Updated

Filtering List Submissions responses by flow UUID – With the new flow_uuid parameter in the List Submissions endpoint, you can now obtain information about submissions that were processed in specific flows.

Submission IDs in Document objects – We've added a new submission_id property to the Document object. The value of this property is the id of the Submission that the Document originated from.

Debugging submissions via the API – We've added a debug parameter to the Get Submission and List Submissions endpoints that, when set to true, allows you to retrieve the following additional information about the submissions:

  • The name, version, and UUID of the flow that each submission was processed through (flow_name, flow_version, flow_uuid)

  • The name of the flow's release (release_name)

  • Whether the submission was created via API, through an Input Block, or manually in the application, along with details about the user or block responsible for the submission's creation (originator_info)
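
Once a response has been retrieved with debug set to true, the additional properties listed above could be pulled out as in the following sketch. The property names come from this note; the assumption that they sit at the top level of each Submission object, and the helper name extract_debug_info, are illustrative only.

```python
from typing import Any, Dict

# Property names listed in this release note.
DEBUG_FIELDS = (
    "flow_name",
    "flow_version",
    "flow_uuid",
    "release_name",
    "originator_info",
)


def extract_debug_info(submission: Dict[str, Any]) -> Dict[str, Any]:
    """Collect the debug properties from a parsed Submission object.

    Assumes the properties are top-level keys; missing ones map to None.
    """
    return {name: submission.get(name) for name in DEBUG_FIELDS}
```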

Source routing tags in Submission objects – When retrieving submission information via the Get Submission or List Submissions endpoints, you can now view a list of each submission's source routing tags in the responses. This list is shown as the value of the new source_routing_tag property in Submission objects.

Flow UUID in Submissions reports – We’ve added a Workflow UUID column to Submissions reports created via our API. To add this data to your report, enter include_extra_fields=flow_uuid as a query parameter in your request (i.e., /api/v5/submissions/csv?include_extra_fields=flow_uuid).

As part of this update, we’ve also added a flow_uuid filter to these reports. When filtering by multiple flows, include multiple flow_uuid filters in your request (e.g., /api/v5/submissions/csv?flow_uuid=123&flow_uuid=345&include_extra_fields=flow_uuid).
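
A request path with repeated flow_uuid filters, like the one in the example above, can be assembled with Python's standard library. This is a minimal sketch: submissions_csv_path is a hypothetical helper, and the UUID values are the placeholders from the example.

```python
from typing import Iterable
from urllib.parse import urlencode


def submissions_csv_path(flow_uuids: Iterable[str], include_flow_uuid_column: bool = True) -> str:
    """Build the report path with one flow_uuid filter per flow."""
    # A list of pairs lets the same parameter name appear several times.
    params = [("flow_uuid", uuid) for uuid in flow_uuids]
    if include_flow_uuid_column:
        params.append(("include_extra_fields", "flow_uuid"))
    query = urlencode(params)
    return "/api/v5/submissions/csv" + (f"?{query}" if query else "")
```

Passing the parameters as a list of pairs, rather than a dictionary, is what allows the flow_uuid name to be repeated in the query string.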

Flow-based permissions for API endpoints – To protect API endpoints against unauthorized access from flow-restricted users, we’ve applied flow-based permissions. All v5 API endpoints that access flow-related resources are now protected by flow-based permissions. For example, users who do not have access to a given flow won’t have access to submissions that are processed by this flow.

If you use API calls to access submission resources, make sure that users are assigned permissions to all relevant flows that process those submissions. To learn more about flow-based permissions, see Flow-based Permissions.

For SaaS instances that have API accounts, make sure to assign all relevant flows to the API User permission group.

Fixed

Reverting recent changes to API v4 – We've reverted submission-SLA changes that were made to v4 of our API. This version of our API is deprecated, so we are not adding any new features to it.