V33 Release Notes

33.1.44 (24 Mar 2023)

Input Connections

Fixed

Retrying submissions – We've fixed a flow-run ID issue that sometimes caused images from submissions ingested through folder-based connections to be deleted prior to processing. The issue prevented affected submissions from being resubmitted.

33.1.43 (20 Feb 2023)

Quality Assurance

Fixed

Sampling a large number of values for Table Identification QA in instances with MSSQL databases – We've fixed an issue that caused submissions to halt in the Complete Block when a large number of values in those submissions were sampled for Table Identification QA in instances with MSSQL databases.

PII Data Deletion

Fixed

Accessing pages whose images have been deleted – Previously, if a user tried to access page images that had been deleted in accordance with the PII Data Deletion policy, no message explained why those images were unavailable. A fix for this issue is included in v33.1.43.

Reporting

Fixed

Downloading Usage Reports after completing checkbox or signature tasks for Semi-structured documents – We've resolved an issue that prevented Usage Reports from being downloaded after keyers completed Supervision and QA tasks for checkboxes or signatures in Semi-structured documents.

Settings

Fixed

Functionality and description of “Submission Upload” setting – We've fixed an issue that caused the Create Submission button to be enabled when the Submission Upload setting was disabled in the System Settings (Administration > System Settings). Also, we've updated the description of the Submission Upload setting with the correct name of the Create Submission button.

Security

Fixed

Updating dependencies – To address security concerns, we've updated:

certifi to 2022.12.7,
oauthlib to 3.2.2,
cryptography to 39.0.1, and
ddtrace to 1.7.2.

33.1.42 (26 Jan 2023)

Submission Processing

Updated

Storage of submission metadata – To reduce the size of submission metadata in the database, we've eliminated unnecessary whitespace in the submission metadata JSON.
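To illustrate the kind of saving involved, the sketch below serializes the same data with and without whitespace using Python's standard json module. The sample metadata is hypothetical, and this is not the application's actual serialization code.

```python
import json

# Hypothetical submission metadata, used only for illustration.
metadata = {"source": "folder", "tags": ["invoice", "2023"], "priority": 1}

# The default separators are ", " and ": ", which add a space after
# every comma and colon in the output.
default_json = json.dumps(metadata)

# Compact separators drop that whitespace without changing the data.
compact_json = json.dumps(metadata, separators=(",", ":"))

# Both strings decode to the same object; only the stored size differs.
print(len(default_json), len(compact_json))
```

The saving grows with the number of keys and list items, since each one carries at least one separator.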

Reporting

Fixed

Data for Semi-structured documents in Usage Reports – We've resolved an issue that prevented data for Semi-structured documents from appearing in the following Usage Report files:

checkbox_machine_transcriptions_report.csv
supervision_transcriptions_report.csv
machine_transcriptions_report.csv

SaaS

Fixed

User-Agent HTTP headers in Client Library authentication requests – To address security concerns, the system now requires User-Agent HTTP headers in all Client Library authentication requests.
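A minimal sketch of attaching a User-Agent header to a request with Python's standard library is shown below. The URL and the User-Agent value are hypothetical placeholders; the actual authentication endpoint and the header values your client should send are not specified in these notes.

```python
from urllib.request import Request


def build_auth_request(url: str, user_agent: str) -> Request:
    """Build a POST request that always carries a User-Agent header."""
    if not user_agent.strip():
        raise ValueError("A non-empty User-Agent value is required")
    return Request(url, method="POST", headers={"User-Agent": user_agent})


# Hypothetical URL and client name, for illustration only.
request = build_auth_request("https://hyperscience.example.com/auth", "my-client/1.0")
```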

API

New

Swagger UI and schema endpoints – We've enabled the following Swagger endpoints:

/api/v5/swagger – Displays information about Hyperscience API endpoints that currently have OpenAPI specifications
/api/v5/schema – Downloads the OpenAPI specifications for the instance's version of Hyperscience
/api/v5/schema.json – Provides the same data as /api/v5/schema, but in JSON format
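As a rough illustration, the full URLs for these endpoints can be built from an instance's base URL as sketched below. The base URL is a hypothetical placeholder, and the helper name is ours rather than part of any Hyperscience library.

```python
from urllib.parse import urljoin

# Endpoint paths taken from the list above.
SWAGGER_PATHS = {
    "ui": "/api/v5/swagger",
    "schema": "/api/v5/schema",
    "schema_json": "/api/v5/schema.json",
}


def swagger_urls(base_url: str) -> dict:
    """Return the full URL of each Swagger endpoint for one instance."""
    return {name: urljoin(base_url, path) for name, path in SWAGGER_PATHS.items()}


# Hypothetical instance address, for illustration only.
urls = swagger_urls("https://hyperscience.example.com")
```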

33.1.41 (6 Jan 2023)

Document Classification

Fixed

Classification of pages as blank – We've resolved an issue that caused pages containing only a set of elements that were positioned closely together to be misclassified as blank.

Potential Layout Variations

Fixed

Running Potential Layout Variations jobs with images that are marked to be deleted – We’ve fixed an issue that resulted in a 404 error when running a Potential Layout Variations job that included images that had been marked to be deleted. With this fix, images that are marked to be deleted are no longer included in Potential Layout Variations jobs.

Submission Processing

Fixed

Synchronization of submission-data updates when database connections are lost – We've resolved an issue that prevented update-synchronization managers from recovering when the application's database connections were lost. This issue caused submissions to remain in Submission Initialization and jobs to stay in the "Pending" status.

Processing submissions after adding database capacity – We've fixed an issue that required users to restart the application containers in order to continue processing submissions after increasing the size of the database.

Repeated updates to submission data – We've fixed a race condition that caused tasks that updated submission data to be executed twice when those tasks timed out.

File Storage

Fixed

Limiting the size of uploads to the file store – Previously, if a file containing over 1000 pages was uploaded to the file store in a single HTTP request, the request would fail. In v33.1.41 and later, the system divides these files into multiple batches and uploads the batches individually, preventing the HTTP requests from failing.
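The batching idea can be sketched as follows. The batch size of 1000 is an assumption based on the limit mentioned above, and the function is illustrative only; the system's actual batch size and upload code are not described in these notes.

```python
from typing import Iterator, Sequence

# Assumed upper bound per request, based on the limit described above.
MAX_PAGES_PER_REQUEST = 1000


def batch_pages(pages: Sequence, batch_size: int = MAX_PAGES_PER_REQUEST) -> Iterator[Sequence]:
    """Yield consecutive slices of pages, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(pages), batch_size):
        yield pages[start:start + batch_size]


# A 2500-page file is split into three uploads.
sizes = [len(batch) for batch in batch_pages(list(range(2500)))]
```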

Classification

Fixed

Generating Manual Classification tasks – We’ve resolved an issue that prevented Manual Classification tasks from being generated.

PII Data Deletion

Fixed

Deleting PII from large submissions – We’ve fixed an issue that reduced database performance when PII was deleted from submissions containing 1,000 or more pages.

Box Integration

Fixed

Maintaining connections to Box – We've resolved an issue that caused connect-reset errors when connecting to Box.

Processing of fields marked as illegible – Previously, if a submission contained fields marked as illegible during Transcription Supervision, a 400 error would occur in Box. This error occurred because values of illegible fields are recorded as null in Hyperscience, and Box does not support null values. A fix for this issue is included in v33.1.41: Hyperscience now ignores illegible fields when sending submissions to Box.
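Conceptually, ignoring illegible fields amounts to dropping null values before the data is sent. The sketch below shows that idea on a plain dictionary; the field names are hypothetical, and this is not the integration's actual code.

```python
def drop_illegible_fields(fields: dict) -> dict:
    """Return a copy of fields without entries whose value is None (null)."""
    return {name: value for name, value in fields.items() if value is not None}


# Hypothetical transcribed fields; "amount" was marked as illegible.
transcribed = {"name": "Ada", "amount": None, "notes": ""}
payload = drop_illegible_fields(transcribed)
```

Note that only null values are removed. Empty strings are kept, because they are valid values rather than missing ones.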

33.1.40 (8 Dec 2022)

Submission Processing

Fixed

Checking job-queue worker compatibility in flows engine and task managers – To prevent submissions from halting, we've added job-queue-worker compatibility checks to the flows engine and task managers. As part of this update, the system now checks for, logs, and kills obsolete services.

Flows

Fixed

Overflow of SLRU cache in PostgreSQL databases – To prevent SLRU cache overflow in PostgreSQL databases, we've removed several subtransactions and savepoints from our flow processes.

Updating the flows engine – In previous versions, if updates to task records failed because the tasks could not be found (e.g., they had timed out or been rescheduled), no exceptions were raised. Also, the preparations for the completion of the tasks were canceled, causing task-processing errors. A fix for this issue is included in v33.1.40.

Upgrades

Fixed

Migration of retry_submission_id values in Oracle databases – We've fixed a data-type issue that caused the migration of retry_submission_id values to fail when upgrading instances with Oracle databases from v32 to v33.1.0 or later.

33.1.39 (7 Dec 2022)

Field Identification

Fixed

Joining and transcription of adjacent text segments – Previously, the Machine Identification Block would sometimes not join adjacent text segments into a single field, causing them to be transcribed separately. A fix for this issue is included in v33.1.39.

Security

Fixed

Updating wheel to v0.38.4 – To address security vulnerabilities, we've updated wheel to v0.38.4.

33.1.38 (30 Nov 2022)

Potential Layout Variations

Fixed

Quality Assurance

Fixed

QA tasks for submissions with wiped PII data – We’ve fixed an issue that prevented the system from removing QA tasks for submissions whose PII data had already been wiped.

Connections

Fixed

Sending a large number of objects through Message Queue Listener connections – We’ve fixed an issue with sending a large number of objects through Message Queue Listener connections. This issue caused timeouts to occur and prevented submissions from being processed.

Reporting

Fixed

Downloading the Usage report – We’ve fixed an issue that prevented the Usage report (Reporting > Usage) from populating data for the current date. This issue occurred when downloading the report with one of the following predefined date ranges:

Week to Date
Month to Date
Year to Date

SaaS

Fixed

Access to the Administration page for System Admins – We’ve fixed an issue with SaaS instances that caused a 403 error to occur when System Admins attempted to open the Administration page.

Security

Updated

Updating cryptography to v38.0.3 – To address security vulnerabilities, we've updated cryptography to v38.0.3.

33.1.37 (15 Nov 2022)

Quality Assurance

Fixed

Clearing QA tasks on instances that use MSSQL databases – We’ve fixed an issue that prevented users from clearing more than 2000 QA tasks at the same time on instances that use MSSQL databases.

Reporting

Fixed

Time zones and reporting – We’ve fixed an issue that caused some reports to use UTC instead of the time zone set in the “.env” file.

33.1.36 (7 Nov 2022)

Layout Editor

Fixed

Deactivating auto-cloned fields – Previously, all auto-cloned fields of a layout variation were linked to a single shared field. Deactivating any of the layout variation’s auto-cloned fields caused this layout variation’s Layout Editor page to crash when a user attempted to open it. A fix for this issue is included in v33.1.36.

Security

Updated

Updating lxml – To address a security vulnerability, we've updated lxml to v4.9.1.

33.1.35 (31 Oct 2022)

Layout Editor

Fixed

Deactivating fields in Structured layout variations – Previously, deactivating fields in Structured layout variations prevented these fields from being displayed in the layout editor’s Inactive items list. If you clicked the Commit changes button after deactivating some fields in a layout variation, the same issue also prevented you from opening this layout variation. A fix for this issue is included in v33.1.35.

Submissions Table

Fixed

Names of disabled flows – We've fixed an issue that caused incorrect names to appear for disabled flows in the Submissions table. Previously, a combination of the flow's identifier and UUID would be given as the flow's name.

Submission Processing

Fixed

Excluding a layout variation and halted submissions – Previously, if you had a live layout with multiple variations and successfully processed documents with one of these variations, excluding this variation from a new release resulted in halted submissions when resubmitting the same documents. A fix for this issue is included in v33.1.35.

Tasks

Fixed

Flow filter when switching tabs – We’ve fixed an issue that reset the flow filter when switching from the Perform Tasks tab to the Task Queue tab.

Keyer Data Management

Fixed

Deleting fields with multiple bounding boxes – In previous versions, if a user created a new version of a Semi-structured layout that had fields with multiple bounding boxes, and they then deleted those fields with the Keyer Data Management tools, the fields would not be deleted. A fix for this issue is included in v33.1.35.

Connections

Fixed

Notification of failed Message Queue Notifier Output Block test connections – We've fixed an issue that caused failed test connections for Message Queue Notifier Output Blocks to be shown as successful.

33.1.34 (21 Oct 2022)

Classification

Fixed

Submissions with pages classified as "No Layout Variation Found" – We've resolved a pagination issue that caused some submissions to have their pages incorrectly classified as "No Layout Variation Found." The same issue also prevented these submissions from being sent to Manual Classification.

Training

Fixed

Restoring the “Trainer tasks” page in /admin – We’ve restored the “Trainer tasks” page in /admin.

Reporting

New

Field Exceptions Report – We've added a Field Exceptions Report to the Accuracy tab in the Reporting section of the application. For each field exception, the report includes information about the field's submission, layout, document, page, and field.

Security

Fixed

Addressing security vulnerabilities – To address security vulnerabilities, we've updated:

protobuf to 3.19.16.
org.apache.pdfbox to 2.0.27.

33.1.33 (10 Oct 2022)

File Storage

Fixed

Names of files from Generic Web Storage (HTTP/HTTPS) file stores – Previously, the filenames generated for files from Generic Web Storage (HTTP/HTTPS) file stores would sometimes cause errors due to their length. A fix for this issue is included in this version.

33.1.32 (28 Sep 2022)

Kubernetes

Fixed

Custom Code Blocks with Python libraries – We’ve fixed an issue with some tasks that prevented Custom Code Blocks with Python libraries from completing on Kubernetes deployments.

Backward-incompatible API – For v33.1.32, we’ve made the API backward incompatible for Kubernetes deployments. This application version is compatible only with version 4.0.0+ of the operator and version 6.0.0+ of the Hyperscience Helm chart.

33.1.31 (27 Sep 2022)

User Experience

Fixed

Assigning a flow to a permission group – To improve the user experience of assigning a flow to a permission group, we’ve renamed the Available Live Flows label to Available Flows. The Available Live Flows label was incorrect, as the flows table in a permission group’s configuration page included both live and disabled flows.

Layout Editor

Updated

Deleting multiple fields in the Layout Editor – You can now delete multiple fields in the Layout Editor at the same time.

Making changes to shared fields – We've reduced the time it takes for the system to register changes made to shared fields in the Layout Editor.

33.1.30 (14 Sep 2022)

API

Updated

Flow UUID in Submissions reports – We’ve added a Workflow UUID column to Submissions reports created via our API. To add this data to your report, enter include_extra_fields=flow_uuid as a query parameter in your request (i.e., /api/v5/submissions/csv?include_extra_fields=flow_uuid).

As part of this update, we’ve also added a flow_uuid filter to these reports. When filtering by multiple flows, include multiple flow_uuid filters in your request (e.g., /api/v5/submissions/csv?flow_uuid=123&flow_uuid=345&include_extra_fields=flow_uuid).
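The request paths above can be assembled as in the following sketch, which repeats the flow_uuid parameter once per flow. The helper name is ours, and the UUID values are the placeholder values from the example above.

```python
from urllib.parse import urlencode


def submissions_csv_path(flow_uuids, include_flow_uuid_column=True):
    """Build the report path with one flow_uuid filter per flow."""
    # A list of pairs lets the same parameter name appear several times.
    params = [("flow_uuid", flow_uuid) for flow_uuid in flow_uuids]
    if include_flow_uuid_column:
        params.append(("include_extra_fields", "flow_uuid"))
    query = urlencode(params)
    return "/api/v5/submissions/csv" + ("?" + query if query else "")


path = submissions_csv_path(["123", "345"])
```

Passing an empty list with include_flow_uuid_column=True produces the unfiltered report with the extra column.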

33.1.29 (13 Sep 2022)

Reporting

Fixed

Software Version column in the Usage report – We’ve fixed an issue that prevented the Software Version column in the Usage report (Reporting > Usage) from being populated.

33.1.28 (12 Sep 2022)

Models

Fixed

Memory usage when downloading Classification models and training data – We've fixed an issue that caused large amounts of memory to be consumed when users downloaded Classification models and their training data.

Authentication

Updated

Adding a SAML_ACS_URL environment variable – We’ve made the SAML AssertionConsumerService URL configurable by adding a new environment variable called SAML_ACS_URL. This variable allows you to override the consumer URL with a fixed path.

Installations

Fixed

Installing the application on instances that use MSSQL databases – We’ve fixed an issue with installing the application on instances that use MSSQL databases. This issue caused timeouts to occur and prevented users from completing the installation process.

33.1.27 (29 Aug 2022)

Classification

Fixed

Multiple high-confidence layout candidates for Structured pages – We’ve fixed a page-matching issue with Structured pages that had more than one high-confidence layout candidate. This issue caused errors during the recalibration of Classification models for Structured documents.

Flows

Fixed

Active finetuning models and using the Flow Studio – We’ve fixed an issue that caused error messages for transcription settings to appear in the Flow Studio if you had at least one active finetuning model in the system for the current version of the application.

Reporting

Updated

Adding new CSV files to the Usage report – We’ve added the following CSV files to the Usage report (Reporting > Usage):

settings.csv – contains all application settings.
db_entity_counts_YYYYMMDD_HHMM.csv – contains counts of database entities.
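If you need to locate the entity-counts file programmatically, a filename matching this pattern can be generated as sketched below. We assume that YYYYMMDD_HHMM stands for the year, month, day, hour, and minute; which clock or time zone the application uses for the timestamp is not stated here.

```python
from datetime import datetime


def entity_counts_filename(moment: datetime) -> str:
    """Format a timestamp using the db_entity_counts_YYYYMMDD_HHMM.csv pattern."""
    return moment.strftime("db_entity_counts_%Y%m%d_%H%M.csv")


# Example timestamp, for illustration only.
filename = entity_counts_filename(datetime(2022, 8, 29, 14, 5))
```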

33.1.26 (24 Aug 2022)

Layout Editor

Fixed

33.1.25 (22 Aug 2022)

Table Identification

Fixed

Using the “Next column” and “Previous column” shortcuts with Caps Lock enabled – Previously, when the Caps Lock key was enabled, using the keyboard shortcuts for Next column (E or ⬇) and Previous column (W or ⬆) reversed the order in which the columns were iterated over. A fix for this issue is included in this version.

Transcription

Fixed

Reaching consensus during Transcription tasks – We’ve fixed an issue that caused Transcription tasks to halt if you couldn’t reach consensus for a field in your first two entries.

Integrations

Fixed

Using the Salesforce Listener in the EST time zone – Before the Salesforce Listener Block starts running, it always checks if a record for the configured channel exists in Salesforce’s Last_Processed_Ingestion_Time object. If no record exists, the Salesforce Listener Block creates a new one. Using the Salesforce Listener on a machine that was configured for the EST time zone caused the record in Salesforce’s Last_Processed_Ingestion_Time object to be incorrect. A fix for this issue is included in this version.

33.1.24 (19 Aug 2022)

Cases

Fixed

Refreshing the Cases page after editing or deleting cases – We've fixed an issue that prevented the data on the Cases page from being refreshed after users edited or deleted cases. For example, a deleted case would still appear in the Cases table after being deleted.

Security

Fixed

HttpResponse usage – To address security vulnerabilities, we've made updates to the system's usage of HttpResponse, including adding missing content-type information and output encoding.

Upgrading okhttp to 4.10.0 – To address a security vulnerability, we've upgraded okhttp to the latest stable version, 4.10.0.

33.1.23 (8 Aug 2022)

Cases

Fixed

Multiple-page documents in the Documents table – We've fixed an issue that caused multiple-page documents to appear multiple times in the Documents table on the Case Details page, one time for each page in the document.

SaaS

Updated

Access to /admin – We've restricted /admin access in SaaS instances to Hyperscience employees.

Fixed

Access to other users' API tokens – For security purposes, we've removed the ability of System Admins and Business Admins to view the API tokens of other users.

Security

Fixed

Caching of pages containing submission images – To prevent unauthorized access to sensitive or personally identifiable information, the system now prevents pages containing submission images from being cached by intermediary proxy servers and local web browsers.

33.1.22 (1 Aug 2022)

Submissions

Fixed

Displaying layout identifiers when no fields have been identified – We've resolved an issue that prevented layout identifiers from being shown in the document viewer when no fields were identified on at least one of the document's pages.

Flows

Fixed

Filtering by flows that a user cannot access – We've fixed an issue that allowed users to filter tables and reports by flows that they did not have access to.

Connections

Fixed

Updating Amazon SQS metadata – We've resolved an issue that prevented metadata updates from Hyperscience from being reflected in Amazon SQS.

33.1.21 (29 Jul 2022)

Submission Processing

Fixed

Resubmitting v31 submissions – Previously, submissions that had been successfully matched to layout variations in v31 weren’t matched to any layout variations after being resubmitted in v33. A fix for this issue is included in this version.

Training

Fixed

Page IDs and Classification model training – In rare cases, IDs of submission pages and Training Data pages are the same. We’ve fixed an issue that caused submission page IDs to be used during Classification model training in those rare cases. The issue led to decreased performance after training a Classification model with such pages.

33.1.20 (27 Jul 2022)

Layouts

Fixed

Queries during layout import – We've removed a query from the layout-import process that caused errors when importing layouts and upgrading Hyperscience.

Submissions

Updated

Populating corrected_image_url for unassigned pages – We now provide corrected_image_url values for pages that are not assigned to a layout. With this update, you can obtain URLs for transformed versions of these pages in submission JSON outputs and in Page data in API v5 responses.

Cases

Fixed

Response times for retrieving case information – We've updated the queries used to retrieve case data by external case ID, making the system more responsive when users navigate the Cases table, upload submissions, or complete Custom Supervision tasks.

Flows

Updated

Enhancements to flow filtering – We've improved how the system filters flow-specific data in the application and in reports. In particular, the processes for filtering out restricted flows and determining whether a flow is a Notification flow are more efficient.

Fixed

Restricted flows and deleting assigned groups – We've fixed an issue that caused flows to still be restricted after all the groups assigned to them had been deleted.

Supervision

Fixed

Redirecting keyers after completing submissions – We've resolved an issue that caused keyers to be redirected to the Submissions page after completing a submission's Supervision tasks. The system now redirects keyers to the Submission Completed page.

Quality Assurance

Updated

Confirmation dialog box for clearing QA tasks – When you click Clear QA Tasks on the Perform Tasks page, a confirmation dialog box now appears before the QA tasks are cleared, giving you the opportunity to cancel the deletion if needed.

Fixed

Classification QA and images marked for deletion – We've resolved an issue that caused Classification QA tasks to contain page images that had already been deleted. When generating a Classification QA task, the system now confirms that the page images included in the task are not marked for deletion.

Settings

Fixed

Enabling "Gather health statistics" and resetting to default settings – We've fixed an issue that caused an error to occur after users enabled the Gather health statistics setting and then clicked Reset to Default Settings on the Settings page (Administration > Settings).

Authentication

Updated

Restricting application access to members of specific permission groups – With the HS_ALLOWED_LOGIN_GROUPS ".env" file variable, you can now restrict application access to certain permission groups. While it can be used on its own, this feature is designed to be used along with JSON Web Tokens (JWTs) to ensure that keyers can only access the application through the embedded Supervision widget.

TVE Instances

Fixed

Accessing /admin – We've resolved an issue that caused a 500 error to occur when attempting to access the /admin section of TVE instances.

33.1.19 (18 Jul 2022)

Releases

Fixed

Deploying empty releases – We've fixed an issue that allowed users to assign empty releases to flows and then deploy those flows. Doing so resulted in errors when the system attempted to complete document classification.

Flow Blocks

Fixed

Submitting block outputs to localhost URLs – We've fixed an issue that caused blocks to submit their outputs to localhost URLs in some situations. This issue resulted in halted submissions.

Cases

Updated

Cursor-based pagination in the Cases table – We've implemented cursor-based pagination in the Cases table. As part of this change, we've removed the Per Page drop-down list from the bottom of the table, and we've changed the case count to display "100+ Cases" when the total number of cases exceeds 100.
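For readers unfamiliar with the technique, the sketch below shows the general idea of cursor-based pagination over an in-memory list, along with the capped count label. It is a simplified illustration, not the application's implementation: the function names are ours, and the label format for counts of 100 or fewer is an assumption.

```python
from typing import List, Optional, Tuple


def case_count_label(total: int, cap: int = 100) -> str:
    """Show an exact count up to the cap and a capped label beyond it."""
    return f"{cap}+ Cases" if total > cap else f"{total} Cases"


def fetch_page(case_ids: List[int], cursor: Optional[int], page_size: int) -> Tuple[List[int], Optional[int]]:
    """Return the page that follows cursor, plus the cursor for the next page.

    case_ids must be sorted in ascending order. The cursor is the last ID
    of the previous page, or None for the first page.
    """
    remaining = [case_id for case_id in case_ids if cursor is None or case_id > cursor]
    page = remaining[:page_size]
    # A next cursor is returned only if more rows exist beyond this page.
    next_cursor = page[-1] if len(remaining) > page_size else None
    return page, next_cursor


ids = list(range(1, 8))
first_page, first_cursor = fetch_page(ids, None, 3)
second_page, second_cursor = fetch_page(ids, first_cursor, 3)
last_page, last_cursor = fetch_page(ids, second_cursor, 3)
```

Because each request only needs the rows after the cursor, the system does not have to count or skip earlier rows, which is why an exact total and a per-page selector are less natural in this model.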

Fixed

System responsiveness when searching for cases – We've fixed an issue that caused search results to take more than 3 seconds to load on the Cases page when there were a large number of cases in the system.

Quality Assurance

Fixed

QA Uniqueness and clearing QA tasks – We've resolved an issue that prevented the system from taking QA Uniqueness into account when clearing QA tasks. This issue caused all QA tasks to be cleared, even if they were assigned to keyers.

SaaS

Fixed

Editing permission groups when no identity-provider federation is configured – We've fixed an issue that prevented users from editing permission groups in SaaS deployments that did not have any identity-provider federation configured and used an Amazon Web Services Application Load Balancer (AWS ALB).

33.1.18 (14 Jul 2022)

Models

Fixed

Transcription confidence levels for multiline fields – We've fixed a regex issue that caused transcription confidence levels for multiline fields to be low.

Tasks

Fixed

Performing tasks from the Cases page – We've resolved an issue that caused 404 errors to occur after keyers processed documents by clicking Perform Tasks on the Cases page.

Integrations

Fixed

Submissions created from the Box Folder Listener Block – Previously, if the folder specified in Folder to Move Completed Files did not exist, the Box Folder Listener Block created an infinite number of submissions. With the fix included in this version, the Box Folder Listener Block creates submissions only if the folder specified in Folder to Move Completed Files exists.

Podman

Updated

Log retention for Podman containers – We've increased the maximum size of the file used to collect stdout data from each Podman container, allowing logged data to be retained for longer periods of time before being overwritten.

Security

Updated

Restrictions for users authenticating with JSON Web Tokens (JWTs) – We've added settings that allow System Admins to restrict application access for users authenticating with JWTs. These settings can prevent these users from accessing specific parts of the application.

Fixed

Authorization for /admin URLs – To address security vulnerabilities, we've updated the permissions required to access several /admin URLs that allow users to upload files or execute code.

33.1.17 (7 Jul 2022)

This version of Hyperscience has issues with using v31 flows after an upgrade. We recommend upgrading to this version only if you plan to use v32 or v33 flows.

Submission Processing

Fixed

PDF_PAGE_BOX “.env” file variable and submission processing – We’ve fixed an issue that prevented submission-processing blocks from respecting the PDF_PAGE_BOX variable in the “.env” file.

Permissions

Updated

“View Users” permission and Users table – We’ve updated the View Users permission to allow users to view other users in the Users table.

33.1.16 (6 Jul 2022)

This version of Hyperscience has issues with using v31 flows after an upgrade. We recommend upgrading to this version only if you plan to use v32 or v33 flows.

Releases

Fixed

Response times when creating releases or viewing releases’ layouts – We've fixed an issue that caused the system to be unresponsive when creating releases or viewing the layouts contained in releases.

Models

Fixed

Retrieving Field Level Automation data on the Model details page – We've resolved an issue that caused timeouts to occur when the system attempted to retrieve the Field Level Automation data for display on the Model details page.

Supervision

Fixed

Submitting a single task multiple times – We've resolved an issue that caused Supervision tasks to be submitted multiple times and in place of other tasks in certain situations.

Duplicate cells in a row for a single column – We've fixed an issue that caused two cells to appear in a row for a single column during some Table ID Supervision tasks.

All fields pointing to the same bounding box during Flexible Extraction – We've resolved an issue that caused all fields in a document to point to the same bounding box during Flexible Extraction tasks. The issue occurred when none of the fields in the document had shared_field_id values.

Quality Assurance

Fixed

Clearing QA tasks when filtering to a specific flow – Previously, if a user clicked Clear QA Tasks after filtering the Tasks page to show information for a specific flow, the system cleared all unassigned QA tasks of the selected type. However, the system should have only cleared tasks of the selected type for the selected flow. A fix for this issue is included in this version.

Clearing QA tasks with the “All Flows” filter selected – In previous versions, if the All Flows filter was selected on the Tasks page when QA tasks were cleared, the system cleared all unassigned QA tasks from the queue. However, the system should have only cleared tasks for the flows that the user had access to. A fix for this issue is included in this version.
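The intended scoping described in the two fixes above can be sketched as a single filter. The QATask fields and the function name are hypothetical and exist only for this illustration; they do not reflect the application's data model.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class QATask:
    task_type: str
    flow: str
    assigned_to: Optional[str] = None


def tasks_to_clear(
    tasks: Iterable[QATask],
    task_type: str,
    accessible_flows: Iterable[str],
    selected_flow: Optional[str] = None,
) -> List[QATask]:
    """Select the unassigned tasks of one type that the user may clear."""
    allowed = set(accessible_flows)
    # With a specific flow selected, narrow to that flow; with "All Flows"
    # (selected_flow is None), use every flow the user can access.
    flows = allowed & {selected_flow} if selected_flow is not None else allowed
    return [
        task
        for task in tasks
        if task.task_type == task_type and task.assigned_to is None and task.flow in flows
    ]


tasks = [
    QATask("transcription", "A"),
    QATask("transcription", "B"),
    QATask("transcription", "C"),
    QATask("transcription", "A", assigned_to="keyer1"),
    QATask("table_id", "A"),
]
cleared_all = tasks_to_clear(tasks, "transcription", accessible_flows={"A", "B"})
cleared_one = tasks_to_clear(tasks, "transcription", accessible_flows={"A", "B"}, selected_flow="A")
```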

Cases

Fixed

System responsiveness when sorting by Case ID – We've resolved an issue that caused delays in system responsiveness when a user sorted the Cases table by Case ID on the Cases page (Library > Cases).

33.1.15 (29 Jun 2022)

This version of Hyperscience has issues with using v31 flows after an upgrade. We recommend upgrading to this version only if you plan to use v32 or v33 flows.

Models

Updated

System Upgrade Evaluation tab and increased performance – To increase performance, we’ve split the request for loading the Current System Version and System Upgrade Evaluation tabs in the Model Details page. We now have a separate request that loads the System Upgrade Evaluation tab only when a user clicks on the tab.

33.1.14 (28 Jun 2022)

This version of Hyperscience has issues with using v31 flows after an upgrade. We recommend upgrading to this version only if you plan to use v32 or v33 flows.

Tasks

Updated

“Flows” filter in the Perform Tasks tab – Previously, the Flows filter in the Perform Tasks tab updated the number of tasks for a given flow in the Supervision Tasks and Quality Assurance Tasks cards, but it was not applied to the actual tasks that were loaded upon clicking a card’s Perform Tasks button. In v33.1.14, we’ve updated the Flows filter to apply to the actual tasks that are loaded from the cards.

“View Task Queue” permission and filters in Task Queue – We’ve updated the View Task Queue permission to allow users to edit filters in the Task Queue tab.

Fixed

Viewing all active workers in the Active Workers card – We’ve fixed an issue that prevented users from viewing all active workers in the Active Workers card on the Tasks Overview page.

Models

Fixed

Recalibration of Classification models for Structured documents – We've fixed an issue that caused the page IDs of Flexible Extraction documents to point to pages without fields in some situations. This issue caused errors during the recalibration of Classification models for Structured documents.

Permissions

Updated

Permissions and filtering by flow – To allow users to filter by flow without having the View Flows permission, we’ve removed the filtering functionality from the View Flows permission. With the improvements included in this version, the ability to filter by flow is now available to all users regardless of their permissions.

33.1.13 (23 Jun 2022)

User Experience

Updated

Custom login warning message – You can now create a custom login warning message, which is displayed every time a user logs in to your Hyperscience application. The message is shown as a dialog box that the user can dismiss by clicking the dialog's Continue button.

To add this message to your login experience, add the LOGIN_WARNING_BANNER_TEXT variable to your “.env” file, and enter your message as the variable’s value.

Databases

Fixed

Compatibility error message for Azure SQL Managed instances – We’ve fixed a compatibility issue for Azure SQL Managed instances that resulted in the following error message after running ./run.sh init:

“Could not parse Microsoft SQL Server version.”

API

Fixed

Client ID validation in JSON Web Token (JWT) authentication – We’ve enabled client ID validation in JWT authentication. We now properly handle errors when JWT authentication fails due to being disabled.

33.1.12 (22 Jun 2022)

Integrations

Fixed

33.1.11 (17 Jun 2022)

Layout Editor

Fixed

Text of "Read only" label in the Layout Editor – Previously, the "Read only" label in the Layout Editor had white text on a white background, making the text illegible. The issue is fixed in this version, and the "Read only" text is now legible.

Clicking "Undo" when adding a field – We've fixed an issue that caused the Layout Editor to enter an invalid state when a user clicked Undo after creating a field with no Name or Data Type values. In these situations, a "Missing Info" notification would be shown to the user, but the user could not provide the "missing info" or commit their changes.

SaaS

Updated

Restricting access to SQL Explorer – In SaaS deployments, we've restricted access to the SQL Explorer to Hyperscience employees only.

33.1.10 (15 June 2022)

Training

Updated

Warning message for instances without a trainer – We’ve added a “No Active Trainer” warning message to the Model Details page. If there is no trainer attached, the following message informs users that training tasks can only be queued but cannot be completed: “Trainer tasks can be created and queued, but they will not be completed because there is no active trainer.”

Fixed

Eligible documents for Table ID model training – We’ve fixed an issue that caused documents that had not reached consensus after completing a round of Table ID QA to be eligible for Table ID model training. The issue occurred when the Tables Nonconsensus Training Disabled setting was enabled in /admin.

Model Validation Tasks

Updated

Optimized database queries for Model Validation Tasks (MVTs) – To increase performance, we’ve removed unnecessary JOIN clauses from the queries that generate Field ID MVTs and save their results.

Flows

Fixed

Importing flows in ZIP files – We’ve fixed an issue that caused the flow-import process to fail when importing a ZIP file that was created manually by compressing the parent directory.
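To illustrate the difference, compressing the parent directory puts every entry under one extra top-level folder. The sketch below shows one way an importer could tolerate such archives by stripping that shared folder. It is an assumption about the general technique, not a description of the actual fix, and the function names and file names (my_flow, flow.json, blocks/a.py) are hypothetical.

```python
import io
import zipfile


def common_top_level_dir(names):
    """Return the single folder that contains every entry, or None if there is none."""
    tops = {name.split("/", 1)[0] for name in names}
    if len(tops) == 1 and all("/" in name for name in names):
        return tops.pop()
    return None


def normalized_members(archive):
    """List file entries with a shared top-level folder removed, if one exists."""
    names = [n for n in archive.namelist() if not n.endswith("/")]
    prefix = common_top_level_dir(names)
    if prefix is None:
        return names
    return [n[len(prefix) + 1:] for n in names]


# Build an archive the way "compress the parent directory" would.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("my_flow/flow.json", "{}")
    zf.writestr("my_flow/blocks/a.py", "")

with zipfile.ZipFile(buffer) as zf:
    print(normalized_members(zf))
```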

Transcription

Updated

Optimized database queries for Flexible Extraction tasks – To increase performance, we’ve removed unnecessary JOIN clauses from the queries that generate Flexible Extraction tasks.

Submission Processing

Fixed

Memory usage for large submissions – We've fixed an issue that caused the system's processing of large submissions to consume a large amount of memory, with the system using all available memory in some circumstances.

33.1.9 (3 June 2022)

Submission Output

Fixed

Field Identification percentage in the Automation card – We’ve fixed an issue that caused the Field Identification percentage in the Automation card on the Submission Output page to be incorrect in the following cases:

Multiple occurrences of a field were on different pages.
The first page did not have any fields.

Rejected documents in the Submission Output page – We’ve fixed an issue that caused documents that were rejected during Transcription Supervision to be accessible in the Submission Output page.

Custom Supervision

Fixed

Transcribed fields with overridden formatting validations – We’ve fixed an issue that caused fields with overridden formatting validations to be blank in Custom Supervision even if they were transcribed in a previous task.

Loading Custom Supervision tasks and “Yes” transcription values for fields – We’ve fixed an issue that resulted in an error when loading Custom Supervision tasks that contained “Yes” transcription values for fields.

Supervision Transcription masking and custom data types – We’ve fixed an issue that prevented Supervision Transcription masking from being applied to custom data types.

Quality Assurance

Fixed

Scrolling through table rows with keyboard shortcuts during Table ID QA – We’ve fixed an issue that resulted in an unexpected error if you used keyboard shortcuts to scroll through table rows during Table ID QA.

Keyer Data Management

Fixed

Page images for Keyer Data Management and Field ID training – We’ve fixed an issue that caused the page image used for Keyer Data Management to be different from the page image used for Field ID training.

Training

Fixed

Documents without field locations and Field ID training – We’ve fixed an issue that prevented documents without any field locations from being eligible for Field ID training.

Releases

Fixed

Auto-saving of releases when selecting a different layout version – We’ve fixed an issue that prevented releases from auto-saving when a user selected a different version of a layout.

Databases

Updated

Support for PostgreSQL 13.x – We've added support for PostgreSQL 13.x.

Security

Fixed

Changing the javax.el library to jakarta.el and updating it – To address a security vulnerability, we’ve changed the Maven javax.el library to jakarta.el and updated it to 3.0.4.

API

Fixed

Machine Accuracy and Machine Accuracy Margin of Error in API queries – We’ve fixed an issue that caused the values for Machine Accuracy and Machine Accuracy Margin of Error to be blank in the following queries:

/api/v5/reporting/transcription_accuracy?start_date=&flow_uuid=
/api/reporting/qa_stats
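For reference, the two query URLs above can be assembled as in the following sketch. The host name, the date format, and the flow UUID value are placeholders chosen for illustration, and the helper name reporting_url is hypothetical. Only the paths and the start_date and flow_uuid parameter names come from these notes.

```python
from urllib.parse import urlencode, urljoin


def reporting_url(base_url, path, **params):
    """Join the instance URL and an API path, appending query parameters if given."""
    url = urljoin(base_url, path)
    return f"{url}?{urlencode(params)}" if params else url


base = "https://hyperscience.example.com"  # placeholder host

print(reporting_url(
    base,
    "/api/v5/reporting/transcription_accuracy",
    start_date="2022-06-01",  # assumed date format
    flow_uuid="1234",         # placeholder value
))
print(reporting_url(base, "/api/reporting/qa_stats"))
```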

33.1.8 (25 May 2022)

Flows

Updated

Consistency among Python file names, ZIP file names, and folder names – We’ve updated the Python file names to be consistent with ZIP file names and folder names. Previously, for example, Python file names ended in “_32”, while ZIP file names and folder names ended in “_v32”.

Output Blocks

Updated

“Additional SQS Metadata” setting for Message Queue Notifier Output Blocks – We’ve added an Additional SQS Metadata setting to Message Queue Notifier Output Blocks. This setting is available for the Amazon SQS message queue type and determines what additional metadata is sent to Amazon SQS.

API

Fixed

Configuring claims in JSON Web Tokens (JWTs) – We’ve fixed an issue that prevented users from configuring the claims that are provided in JWTs. The issue occurred in the following scenario when authenticating with a JWT:

A user created a JWT using the Client Credentials Flow. To learn more about the Client Credentials Flow, see Auth0’s Client Credentials Flow.
A user made an API request with the above-mentioned JWT.

SaaS

Fixed

Validating JSON Web Tokens (JWTs) generated by the AWS Application Load Balancer (AWS ALB) – We’ve fixed an issue for SaaS instances that prevented JWTs generated by the AWS ALB from passing validation. This issue resulted in users being unable to log in despite authenticating successfully.

33.1.7 (20 May 2022)

Data Types

New

Dates with Punctuation data type – We've added a Dates with Punctuation data type, which supports dates that are preceded by an opening parenthesis and followed by a closing parenthesis, comma, or period, or any combination of those characters (e.g., "(June 21, 2021).").
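The kind of input this data type describes can be sketched with a regular expression that allows one opening parenthesis before the date and any run of closing parentheses, commas, and periods after it. This is only an illustration of the pattern. The names WRAPPED_DATE and strip_date_punctuation are hypothetical, and the data type's actual matching rules are not documented here.

```python
import re

# One optional "(" before the date; any mix of ")", ",", "." and spaces after it.
WRAPPED_DATE = re.compile(r"^\s*\(?\s*(?P<date>.*?)[\s),.]*$")


def strip_date_punctuation(text):
    """Return the date text without the surrounding punctuation."""
    match = WRAPPED_DATE.match(text)
    return match.group("date") if match else text


for sample in ["(June 21, 2021). ", "June 21, 2021,", "(06/21/2021)"]:
    print(repr(sample), "->", repr(strip_date_punctuation(sample)))
```

The comma inside "June 21, 2021" is kept because only punctuation at the very end of the text is removed.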

Quality Assurance

Fixed

Fields without locations and Transcription QA – Previously, if a user manually classified a structured document and it did not go through Flexible Extraction, fields without locations were sent to Transcription QA. This resulted in an error when trying to complete the Transcription QA task. With the fix included in this version, fields without locations are no longer sent to Transcription QA.

Output Connections

Fixed

Changing the Message Queue Notifier’s parameters – We’ve fixed an issue that prevented Flow Studio from displaying the updated values for the following parameters after a user made changes to them for a deployed flow:

MQ_NO_AUTH_REQUIRED
MQ_USE_EC2_INSTANCE_CREDENTIALS
MQ_SSL_CIPHER_SUITE
MQ_CONNECTION_TYPE
MQ_MESSAGE_GROUP_ID
MQ_MESSAGE_METADATA

Authentication

Updated

Importing and exporting authentication groups – You can now export authentication groups and import them to other instances.

Finetuning Models

Updated

Importing and exporting finetuning models – We’ve added a warning message that informs users that they cannot import or export finetuning models if their instance does not have any flows.

The warning message for exporting appears when clicking the Export button on the Export Finetuning Model page at /admin/forms_qa/finetuningmodelmeta/export-file/.
The warning message for importing appears when clicking the Import button on the Import Finetuning Model page at /admin/forms_qa/finetuningmodelmeta/import-file/.

33.1.6 (20 May 2022)

Training

Fixed

Training with multiple-occurrence fields and checkboxes or signatures – We've fixed a filtering issue that created empty occurrences if both multiple-occurrence fields and checkboxes or signatures were present in a dataset. These empty occurrences caused training to fail.

Tasks

Fixed

Actions for restricted tasks – Previously, if a task in the Task Queue table or the Submissions table was restricted, the Actions links were disabled, even if the keyer had permissions for those actions. With the fix provided in this version, these tasks still appear as restricted, but keyers can now access the actions they have permissions for.

Supervision Widget

Updated

Headers for JWT (JSON Web Token) authentication – We've updated the default header for JWT-authenticated API requests from a custom header to the standard Authorization: Bearer header. As part of this change, the system automatically detects whether a JWT or Django REST Framework (DRF) token is being passed in Authorization: Bearer.
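These notes do not describe how the detection works. As a hedged sketch of one possible heuristic, a JWT consists of three dot-separated segments whose first segment is base64url-encoded JSON containing an "alg" entry, whereas a default DRF token is a 40-character hexadecimal string with no dots. The function names below are hypothetical.

```python
import base64
import json


def looks_like_jwt(value):
    """Heuristic: three segments, and the first decodes to a JSON header with "alg"."""
    parts = value.split(".")
    if len(parts) != 3:
        return False
    try:
        # base64url segments in JWTs omit padding, so restore it before decoding.
        padded = parts[0] + "=" * (-len(parts[0]) % 4)
        header = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:
        return False
    return isinstance(header, dict) and "alg" in header


def classify_bearer_value(value):
    """Label the Bearer value as a JWT or as an opaque DRF-style token."""
    return "jwt" if looks_like_jwt(value) else "drf_token"


print(classify_bearer_value("9944b09199c62bcf9418ad846dd0e4bbdfc6ee4b"))
```

A check like this only decides which validation path to use. The token still has to be verified by that path.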

33.1.5 (13 May 2022)

Supervision Widget

Updated

Smooth token updates and support for JSON Web Tokens (JWTs) – The embedded Supervision widget now accepts JWTs in addition to the existing token authentication. The third-party system the Supervision widget is embedded into handles the management of JWTs.

We’ve also added two distinct approaches to ensure smooth token updates without sacrificing user experience:

an onTokenExpiring callback that is called when the token is close to expiring. The callback should return a Promise object with the value of the renewed token.
an updateAuthToken function that updates the authentication token on demand.

33.1.4 (12 May 2022)

Submission Processing

Fixed

Optimized data-aggregation queries upon submission completion – To increase performance, we’ve optimized the data-aggregation queries that run upon submission completion.

Reporting

Fixed

Accuracy and User Performance tabs – We’ve fixed an issue that resulted in an unexpected error in the following Reporting tabs:

Reporting > Accuracy
Reporting > User Performance

Security

Fixed

Updating jackson-databind and jackson-dataformat-cbor – To fix security issues, we’ve updated jackson-databind to 2.13.2.2 and jackson-dataformat-cbor to 2.13.2.

33.1.3 (11 May 2022)

Field Identification

Fixed

Multiple occurrences and shortcut keys – We’ve resolved an issue that prevented users from navigating between a field’s multiple occurrences when using shortcut keys.

Transcription

Fixed

“None” normalized transcription values and the Semi-structured Transcription Confidence Boost setting – We’ve fixed an issue that prevented the Semi-structured Transcription Confidence Boost setting from handling “None” normalized transcription values.

Quality Assurance

Updated

Optimized Field ID QA query – To increase performance, we’ve optimized the query that checks for consensus during Field ID QA tasks.

Fixed

Transcription QA tasks and illegible fields – We’ve fixed an issue that caused a field’s transcription value to be normalized to “None”. The issue occurred in Transcription QA tasks when the user reached consensus for a field that had been previously marked as illegible during Transcription Supervision or Transcription QA.

Keyer Data Management

Updated

Pagination for tables in the Annotations page – Previously, you were able to scroll through all pages of a document when editing table annotations, even if there were many pages in the document. As a result, annotating actions took between 2 and 5 seconds to load for documents with more than 18 pages. In v33.1.3, we’ve added a button to the top toolbar of the Annotations page that, when activated, restricts scrolling between pages and loads a single page at a time. Clicking this button improves the loading time for annotating actions for large documents. You can still navigate between pages via shortcut keys.

Reporting

Updated

Reporting optimizations – To increase performance, we’ve removed unnecessary data-aggregation operations that do not include reporting data.

Fixed

Selecting a filtering option in the Document Output Accuracy report – We’ve fixed an issue that resulted in an unexpected error when selecting Checkbox from the field types filter in the Document Output Accuracy report (Reporting > Overview).

Permissions

Updated

Warning message when assigning permission groups to flows – We’ve added a warning message that informs users that flow-based permissions are not applied to existing tasks. The warning message appears when assigning a permission group to a flow.

Input Connections

Fixed

Time zones for Warm-Up Interval comparisons for the Universal Folder Listener – To resolve Warm-Up Time Interval calculation issues for the Universal Folder Listener, the system now uses a file's last-modified time in the local time zone rather than UTC when determining if the file is eligible for processing.

Security

Fixed

Updating OpenSSL and OS packages – When creating an installation bundle for a new version of Hyperscience, we now use the latest available version of OpenSSL, and we update relevant OS packages.

33.1.2 (28 Apr 2022)

Flows

Fixed

Halted submissions after Supervision in v31 flows – We've fixed an issue that caused submissions processed in v31 flows to halt after their Supervision tasks were completed.

Task Queue

Fixed

Inactive "Perform Tasks" button – We've fixed an issue that caused the Perform Tasks button to be inactive, even when the user selected checkboxes for submissions that had available tasks.

PII Data Deletion

Updated

Optimized queries for excluding training documents – Previously, queries that excluded training documents with an "Always" Training Status from PII data deletion caused spikes in database CPU usage. A fix for this issue is included in v33.1.2, which prevents these spikes from occurring and improves system performance.

33.1.1 (22 Apr 2022)

Languages

New

Support for Polish submissions – We now support automation on Structured and Semi-structured documents written in Polish. The Polish language model allows our system to extract data from Polish documents, and keyers can complete Transcription Supervision and Transcription QA tasks by entering Polish text.

Note that the Alphanumeric data type is not fully compatible with Polish at this time.

Flows

Updated

Importing v31 and v32 flows – We’ve added support for importing v31 and v32 flows.

Submission Processing

Updated

Optimizing field queries – We've updated certain field-related queries to minimize the use of JOIN clauses. These changes reduce the CPU resources required by these queries and improve overall system responsiveness.

Optimizing queries for viewing documents – We've made improvements to the queries used to show document pages in the application. Previously, pages could take over a minute to load in some instances.

Multiple processing queues – We've added support for multiple processing queues, which allows resource-intensive bulk-insert operations to be executed in one queue and submission-initialization tasks to be executed in another. Executing these types of tasks in parallel improves overall processing times.

Fixed

Matching pages to Structured layouts during Document Classification Supervision – We’ve fixed an issue that prevented users from matching pages to Structured layouts during Document Classification Supervision.

Submissions and Documents tables

Updated

Task count in the Submissions table – To increase performance, we’ve optimized the query that calculates the task count for each submission in the Submissions table.

Fixed

Adding the “Manual Classification” option to the “Status” filter – We’ve added the Manual Classification option back to the Submissions table’s Status filter.

Permissions and “Perform Tasks” button – We’ve fixed an issue that allowed users to click the Perform Tasks button for submissions and documents they did not have access to. The issue occurred in the Submissions and Documents tables.

Submission Output

Fixed

Documents table’s task links in the Submission Output page – We’ve resolved an issue that caused the Documents table’s task links in the Submission Output page to be unclickable.

Tasks

Updated

Limits for Supervision and QA task counts – To reduce page-loading times, we now limit Supervision Tasks and Quality Assurance Task counts on the Perform Tasks page. By default, if there are more than 9999 tasks of a certain type, “9999+” appears as the task count in the application. You can configure this limit by adding the TASKS_LIMIT_PERFORM_TASKS_PAGE variable to your ".env" file.
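The display rule can be sketched as follows. This is an illustration of the capping behavior only: the function name format_task_count is hypothetical, and reading the limit from the process environment is an assumption about how the ".env" value reaches the application.

```python
import os

DEFAULT_LIMIT = 9999  # default described in these notes


def format_task_count(count, limit=None):
    """Return the count as text, or "<limit>+" when the count exceeds the limit."""
    if limit is None:
        limit = int(os.environ.get("TASKS_LIMIT_PERFORM_TASKS_PAGE", DEFAULT_LIMIT))
    return f"{limit}+" if count > limit else str(count)


print(format_task_count(42, limit=9999))
print(format_task_count(25000, limit=9999))
```

A cap like this typically helps because the underlying query can stop counting once it passes the limit, though the notes do not say whether that is how the limit is applied.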

Fixed

Permissions and “Complete Tasks” button – We’ve fixed an issue that allowed users to click the Complete Tasks button in the Task Queue page for tasks they did not have access to.

Model Validation Tasks

Updated

Order of Model Validation Tasks (MVTs) presented – We've changed the order of the MVTs presented to the user so that it matches the order in which they were generated by the system. Tasks generated first generally contribute more to the model's effectiveness, so finishing them first increases its effectiveness faster.

Field Identification

Updated

Optimized queries for Field Identification – We've made the following updates to the queries executed during Field Identification tasks:

Only the database columns needed for each task are retrieved.
Where possible, complex queries for retrieving fields have been refactored into simpler queries.
We've redesigned the field-ordering process so that it is executed in the application rather than the database, making it more efficient.

Custom Supervision

Fixed

Overlapping objects and an extra scrollbar during Custom Supervision in IE 11 – We’ve fixed an issue that caused an extra scrollbar to appear in the Action panel and the following objects to overlap the Complete task button in IE 11:

expanded drop-down menus that are located at the bottom of the Action panel. When expanded, these drop-down menus also added blank spaces at the bottom of the Custom Supervision page.
case ID that is located at the bottom of the Action panel.

Action panel during Custom Supervision in IE 11 – Previously, if you had configured three tabs in the Action panel, some of these tabs would be hidden in IE 11, and a horizontal scrollbar would be created for the Action panel. With the fix included in this version, we’ve condensed tabs in the Action panel so they do not need a scrollbar.

Quality Assurance

Updated

Optimized queries for QA tasks – We've removed unnecessary JOIN clauses from the queries that are executed during Transcription QA, Field ID QA, and Table ID QA tasks. These updates make more CPU resources available to other queries and processes, helping to increase overall system performance.

Optimizations for the abandonment of QA tasks – We've improved the queries executed in the abandonment of QA tasks in the following situations:

The keyer clicks Mark Layout Variation Incorrect during a Field ID QA task.
The user clears Identification QA or Transcription QA tasks on the Perform Tasks page.

Fixed

Use of system-level Transcription QA settings – We've fixed an issue that caused system-level Transcription QA settings to be used in place of flow-level settings in some situations.

Sampling non-identified fields for QA – We've resolved an issue that caused a submission's fields to be sampled for QA, even though none of its fields were identified during Field Identification.

Connections

Updated

"Audience" setting for HTTP Notifier and HTTP REST API Blocks – We've made the Audience setting optional in HTTP Notifier and HTTP REST API Blocks.

"Timeout (Seconds)" setting for HTTP Notifier Blocks – We've added a Timeout (Seconds) setting to HTTP Notifier Blocks. This setting determines how many seconds the connection remains open if no data is received from the endpoint.

Keyer Data Management

Fixed

Number of documents shown at one time – We've resolved an issue that caused all of a model's training documents to be listed at once on its Training Data page, even when the number of documents was greater than the value selected in the Per Page drop-down list.

Reporting

Fixed

Daylight Saving Time and KeyerProjection.csv – We’ve fixed an issue related to Daylight Saving Time in KeyerPerformance.csv, which is part of the Keyer Projection report (Reporting > User Performance). The issue caused KeyerPerformance.csv to include data that is outside of the report’s date range.

Jobs

Updated

"Retry halted jobs" and "Retry halted flows" actions – We've made the following enhancements to the Retry halted jobs and Retry halted flows actions on the Jobs page:

If the action cannot be performed, its name is shown in gray print.
When a user clicks the name of the action, a count of the number of jobs or flows that will be retried is shown in the confirmation dialog box.

PII Data Deletion

Fixed

Duplicated deletion efforts – We've fixed an issue that caused the system to attempt to delete previously deleted PII data.

Submission Deletion

Fixed

Blocked jobs – Previously, if multiple jobs attempted to delete the same submission, the jobs were sometimes blocked until that submission was deleted. A fix for this issue is included in v33.1.1, increasing the availability of submission-processing resources.

SaaS

Updated

Autoscaling of trainers – We've updated the registration of trainers in SaaS instances to pass trainer-version data to the application, helping to optimize the autoscaling of trainers.

TVE Instances

Fixed

Database schema in the SQL Explorer tool – We’ve fixed an issue that prevented the database schema from being displayed in the SQL Explorer tool in TVE instances.

API

Updated

Optimized application of filters – We've made the application of filters more efficient by reducing the number of queries that are executed, as well as the JOIN clauses included in the queries.

33.1.0 (6 Apr 2022)

User Experience

Updated

Perform Tasks landing page – After logging in to the application, users are directed to the new Perform Tasks page rather than the Tasks Overview page.

To learn more about the changes we've made to the Tasks pages, see the Tasks section of these release notes.

Layouts

Fixed

Updating layout tags and layout names when importing existing layout variations – We’ve fixed an issue that prevented layout tags and layout names from being updated when importing existing layout variations.

Data Types

Updated

Creating custom field data types (CFDTs) from patterns with “(space)” – Previously, if a user selected the (space) option in the Define Normalization dialog box when creating CFDTs from patterns, the “space” symbol was not visualized in the list of characters stripped in output. With the update included in this version, the “space” symbol is now visualized as “(space)” in the list of stripped characters.

Releases

Fixed

Assign to Flow button and Assign To Flow dialog box – We’ve fixed an issue that caused the Assign to Flow button to be enabled even if a release was assigned to all available flows. When a release was assigned to all available flows, the Current Release column in the Assign To Flow dialog box had an “N/A” value for each flow. A fix for this issue is included in v33.1.0.

Restricted flows and Assign To Flow dialog box – We’ve fixed an issue that caused some flows with restrictions to appear as unrestricted in the Assign To Flow dialog box when assigning a release to a flow.

Potential Layout Variations

Updated

Moving the Find Potential Layouts button – We’ve moved the Find Potential Layouts button from the bottom of the No Layout Variation Found tab to the newly created Actions drop-down menu. This button is now called Find Potential Layout Variations.

Previously, clicking the Find Potential Layouts button redirected users to the Potential Layout Variations tab without starting a Potential Layout Variations job. With the update included in this version, clicking the Find Potential Layout Variations button opens a dialog box that asks for confirmation before starting the Potential Layout Variations job.

Submissions and Documents

Updated

Start and end times in date filters – When filtering the Submissions or Documents table by date, you can now add specific times to the filter's start and end dates. To do so, after filtering the table by date, add the timestamps to the date_min and date_max parameters in the URL shown in your web browser.
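The URL edit can be sketched as below. The timestamp format is not specified in these notes, so the example assumes an ISO 8601 style value (date, "T", then time). The host, the /submissions path, and the helper name add_times_to_date_filter are hypothetical. Only the date_min and date_max parameter names come from the text above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_times_to_date_filter(url, start_time, end_time):
    """Append times to the date_min and date_max query parameters of a URL."""
    parts = urlsplit(url)
    times = {"date_min": start_time, "date_max": end_time}
    pairs = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        # Only touch the two date parameters, and skip values that already carry a time.
        if key in times and "T" not in value:
            value = f"{value}T{times[key]}"
        pairs.append((key, value))
    # safe=":" keeps the colons in the time readable instead of percent-encoding them.
    return urlunsplit(parts._replace(query=urlencode(pairs, safe=":")))


filtered = "https://hyperscience.example.com/submissions?date_min=2022-04-01&date_max=2022-04-06"
print(add_times_to_date_filter(filtered, "09:00:00", "17:30:00"))
```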

Searching by page ID – To optimize overall search performance, we've removed the ability to search the Submissions and Documents tables by page ID.

Cursor-based pagination in the Submissions and Documents tables – We've implemented cursor-based pagination in the Submissions and Documents tables, and we've removed the Per Page drop-down lists for those tables.
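For readers unfamiliar with the technique, cursor-based pagination fetches each page relative to the last row already seen rather than by page number, so the database does not have to skip over earlier rows. The sketch below shows the general idea with an in-memory SQLite table. The table, its columns, and the function name fetch_page are hypothetical and do not reflect the application's schema.

```python
import sqlite3


def fetch_page(conn, page_size, cursor=None):
    """Return (rows, next_cursor), newest id first; the cursor is the last id seen."""
    if cursor is None:
        query = "SELECT id, name FROM submissions ORDER BY id DESC LIMIT ?"
        rows = conn.execute(query, (page_size,)).fetchall()
    else:
        query = "SELECT id, name FROM submissions WHERE id < ? ORDER BY id DESC LIMIT ?"
        rows = conn.execute(query, (cursor, page_size)).fetchall()
    # A short page means there is nothing left to fetch.
    next_cursor = rows[-1][0] if len(rows) == page_size else None
    return rows, next_cursor


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submissions (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO submissions (name) VALUES (?)",
    [(f"sub-{i}",) for i in range(1, 6)],
)

page, cursor = fetch_page(conn, 2)
while page:
    print(page)
    if cursor is None:
        break
    page, cursor = fetch_page(conn, 2, cursor)
```

Because each request depends on the previous cursor, jumping to an arbitrary page is not possible with this approach.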

Submissions

Updated

Showing task data on the Submissions page – We've updated the queries used to show data about submissions' pending tasks on the Submissions page of the application. These updates improve system performance in instances with a large number of submissions.

Filtering capabilities on the Submissions page – To improve overall filter performance, we've removed the following filters from the Submissions page:

Date Completed
Halted Jobs
Field Exceptions
Potential Layout Variation Candidates
Document-level filters
Page Status

We've also made improvements to the queries executed for other Submissions filters.

Search capabilities on the Submissions page – To improve overall search performance, we've removed the ability to search by file name on the Submissions page.

Submissions table columns – To optimize the responsiveness of the Submissions page, we've made the following changes to the columns in the Submissions table:

We've removed the Field Exceptions column.
The Tasks column no longer shows task counts.
The Completion Date and Submission ID columns are no longer sortable. The only sortable column is Submission Date.

Loading the Submissions table when a large amount of data is in the database – To increase performance, we’ve optimized the queries for loading the Submissions table when the database contains a large amount of data.

Documents

Updated

Sorting the Documents table by column – To improve system responsiveness, we've removed column-sorting capabilities in the Documents table for all columns except Document Date.

Filtering capabilities on the Documents page – To improve overall filter performance, we've removed the Date Completed, Type, and Page Status filters on the Documents page. We've also made improvements to the queries executed for other Documents filters.

Tasks

Updated

Reorganization of Tasks pages – To reduce loading times of Tasks pages, we've reorganized their content and added an additional page.

The new Perform Tasks page (Tasks > Perform Tasks) shows the number of Supervision and QA tasks available to the user and contains Perform Tasks links for each type of task.
The Tasks Overview page (Tasks > Overview) now contains only the In Queue, Active Workers, and Task Deadline Breakdown cards.
We've removed the graph showing processing deadlines from the Task Queue page (Tasks > Task Queue), named it Task Deadline Breakdown, and added it as a card on the Tasks Overview page.

Filtering Perform Tasks content by flow – You can now click All Flows on the Perform Tasks page to choose which flow you would like to see task information for. When you filter Perform Tasks content by flow, the task counts and Perform Tasks links are specific to the flow you select.

Filtering capabilities on the Task Queue page – To improve overall filter performance, we've removed the Halted and Time Since Submission filters on the Task Queue page. We've also made improvements to the queries executed for other Task Queue filters.

Search capabilities on the Task Queue page – To improve overall search performance, we've removed the ability to search by file name on the Task Queue page.

Queries for retrieving Task Queue page data – We've optimized the queries used to retrieve the data shown on the Task Queue page, which reduces page loading times in high-volume instances.

Cursor-based pagination in the Task Queue table – We've implemented cursor-based pagination in the table on the Task Queue page, and we've removed the Per Page drop-down list for the table.

Document counts when filtering the Tasks Overview page – We've optimized the way the system generates document counts on the Tasks Overview page when the page's content is filtered to a specific flow. This update reduces the loading time of the Tasks Overview page.

Queries for Supervision Tasks, QA Tasks, and In Queue cards – We've merged and optimized the queries used to retrieve the data shown in the Supervision Tasks, QA Tasks, and In Queue cards on the Tasks Overview page. This update improves system responsiveness in instances with a large number of available tasks.

Queries for QA Uniqueness – We've reduced the CPU resources required to execute queries for the QA Uniqueness feature.

Removal of "QA Uniqueness Timeout" setting – To increase performance, we've removed the QA Uniqueness Timeout setting.

Performance optimizations in instances with large numbers of tasks and documents – In addition to the improvements already described, we've optimized the following aspects of Tasks pages in instances with large numbers of tasks and documents:

We've improved the responsiveness of the Tasks Overview page.
We've changed the queries used to retrieve Task Queue page data to prevent timeouts.
We've reduced the amount of time needed to open tasks after clicking the Perform Tasks links.

Fixed

Unexpected error when accessing the Task Queue – We've fixed an issue that caused an "unexpected error" to occur when users attempted to access the Task Queue page.

Transcription

Updated

Optimized Transcription Supervision queries – To increase performance, we’ve improved our Transcription Supervision queries by removing all unnecessary operations from them.

Fixed

Halting of multi-page Structured submissions during Flexible Extraction – We’ve fixed an issue that caused multi-page Structured submissions to halt if they were sent to Flexible Extraction and did not contain all pages of a multi-page Structured layout.

Quality Assurance

Updated

Optimized Transcription QA queries – To increase performance, we’ve removed unnecessary operations from Field Transcription QA and Table Transcription QA tasks.

Submission Processing

Updated

Queries for completing submission processing – We've optimized the queries executed by the Complete Block, reducing the amount of time needed to complete submission processing.

Optimized queries for counting multiple occurrences – To reduce unnecessary database load and increase performance, we’ve optimized the queries that count occurrences of fields during Field Identification, Transcription, and Flexible Extraction tasks.

Rescheduling system tasks – The system no longer attempts to reschedule system tasks if there are parallel tasks that are timed out. This update reduces submission-processing times and increases system responsiveness.

Upload Submissions dialog box and default layouts – The Upload Submissions dialog box no longer defaults to any particular Semi-structured layout.

Cases

Updated

Subquery for loading the Cases page – We've added a subquery to decrease the loading time of the Cases page.

Sorting by number of documents – We've removed the ability to sort the Cases table by the number of documents in a case.

Training

Updated

Optimized memory usage during Table ID model training – We’ve optimized the memory usage during Table ID model training for documents with single-row tables.

Fixed

Training Field ID models on a future trainer version and upgrading the application – We’ve fixed an issue that prevented Field ID models that were trained on a future trainer version from predicting fields’ locations, even after upgrading the application.

Keyer Data Management

Fixed

Training Data page’s paging – Previously, regardless of the paging button’s value on the Training Data page (25, 50, or 100 results per page), the Training Data page showed all results. For example, if you selected 25 results per page and there were 30 results in total, the Training Data page would display all 30 results on a single page. With the fix included in this version, the Training Data page respects the paging button’s value.
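The expected paging behavior can be illustrated with a short sketch. It is not the application's code; the function names and the in-memory list of results are assumptions made for this example only.

```python
import math

def page_count(total_results, page_size):
    """Number of pages needed to show total_results at page_size per page."""
    return math.ceil(total_results / page_size)

def page_slice(results, page_size, page_number):
    """Return the results shown on a 1-based page_number."""
    start = (page_number - 1) * page_size
    return results[start:start + page_size]

# Hypothetical data: 30 results viewed at 25 results per page.
results = list(range(30))
first_page = page_slice(results, 25, 1)
second_page = page_slice(results, 25, 2)
```

With 30 results and 25 results per page, the first page holds 25 results and the second holds the remaining 5.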

Reporting

Updated

Optimized reporting-aggregation queries upon submission completion – To increase performance, we’ve removed all unnecessary operations from the reporting-aggregation queries that run upon submission completion.

Fixed

Time zones ahead of UTC time – Previously, when the system time zone was ahead of UTC, some tooltips on the Reporting pages showed dates that were one day ahead of the data's actual dates. A fix for this issue is included in v33.1.0.
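This note does not describe the root cause of the issue. As a general illustration of how a one-day shift can appear in time zones ahead of UTC, the sketch below converts a single UTC timestamp to a fixed UTC+9 offset. The timestamp and offset are hypothetical values chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical values chosen for illustration only.
utc_timestamp = datetime(2022, 3, 30, 23, 30, tzinfo=timezone.utc)
ahead_of_utc = timezone(timedelta(hours=9))  # a fixed UTC+9 offset

def calendar_date_in(ts, tz):
    """Return the calendar date of an aware timestamp as seen in tz."""
    return ts.astimezone(tz).date()

utc_date = calendar_date_in(utc_timestamp, timezone.utc)
local_date = calendar_date_in(utc_timestamp, ahead_of_utc)
```

Here the same instant falls on 30 March in UTC and on 31 March at UTC+9, so a date label derived from the wrong zone is off by one day.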

Notifications

Updated

Asynchronous notifications for retrying halted jobs and failed flows – To increase performance, we’ve implemented asynchronous notifications for retrying halted jobs and failed flows.

Permissions

Updated

Permissions for Tasks pages – To reflect the new organization of the Tasks pages, we’ve made the following changes to permissions:

The View Supervision Queue Card permission is now named View Supervision and QA Cards.
The View In Queue Card and View Active Workers Card permissions have been combined into a permission named View Task Overview. The System Admin, Business Admin, Data Keyer Admin, Data Keyer, and Knowledge Worker permission groups have this permission.
We've removed the View Time In Queue Card and View Completed Today Card permissions.

Determining permissions for submissions and documents – We've optimized the way the system determines whether a user has access to specific submissions and documents, helping to improve overall system responsiveness.

Jobs

Updated

Retrying halted jobs and failed flows – Previously, regardless of the filters applied on the Jobs page (Administration > Jobs), clicking the Retry halted jobs button retried all halted jobs, while clicking the Retry failed flows button retried all failed flows on the current page. We’ve now added the following buttons to the Actions drop-down menu:

On the Legacy Jobs page, we’ve added a new Retry halted jobs in filter button that respects the filter applied on the page.
On the Flows page, we’ve added a new Retry failed flows in filter button that respects the filter applied on the page.

We’ve also renamed the Retry halted jobs and Retry failed flows buttons to Retry all halted jobs and Retry all failed flows, respectively. Clicking the Retry all failed flows button now retries all failed flows, not just those on the current page.

Pagination for the Jobs page – To increase performance, we’ve implemented an improved pagination for the Jobs page (Administration > Jobs). The improved pagination applies to both the Legacy Jobs tab and the Flows tab.

Data Deletion

Updated

Optimized queries for PII data deletion – To reduce CPU usage and increase performance, we’ve made the following changes:

We’ve optimized database queries that exclude documents with Always training statuses from PII data deletion.
We’ve optimized a database query that caused poor system performance during PII data deletion.
We’ve optimized queries for submissions whose data has already been PII wiped.

Optimized database queries for deleting orphaned images – We’ve optimized the database queries for deleting orphaned images. These database queries now only check the images that have been created since the last deletion and ignore the images that are set to be deleted.

System & Health

New

New “Gather health statistics” setting – We’ve added a new Gather health statistics setting to the Settings page (Administration > Settings). This setting improves performance and removes some of the reported data on the System & Health page (Administration > Health & System).

Authentication

Updated

Active Workers card and OpenID Connect usernames – If a user's first and last names are not retrieved by our OpenID Connect implementation, the user's email address is used in place of their full name in the Active Workers card on the Tasks Overview page.

Fixed

Parsing of OpenID Connect’s groups claims – We’ve improved how the system reads OpenID Connect’s groups claims by adding support for additional delimited strings. Previously, Hyperscience supported OpenID Connect’s groups claims only in the format of lists of strings.
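The difference between the two claim formats can be sketched as follows. This is an illustrative normalizer, not the product's parser: the function name is hypothetical, and the note does not state which delimiters are supported, so the comma default is an assumption.

```python
import re

def parse_groups_claim(claim, delimiters=","):
    """Normalize a groups claim to a list of group names.

    Accepts either a list of strings or one delimited string.
    The default delimiter is an assumption made for this sketch.
    """
    if isinstance(claim, str):
        # Split on any run of the configured delimiter characters.
        pattern = "[" + re.escape(delimiters) + "]+"
        parts = re.split(pattern, claim)
    else:
        parts = list(claim)
    # Drop surrounding whitespace and empty entries.
    return [part.strip() for part in parts if part.strip()]

from_list = parse_groups_claim(["admins", "keyers"])
from_string = parse_groups_claim("admins, keyers,qa")
```

Both calls produce a plain list of group names, so downstream group matching does not need to know which format the identity provider sent.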

Security

Fixed

Audit log editing permissions – We've fixed an issue that allowed System Admins to edit the audit log.

Installations

Updated

PostgreSQL 12.10-alpine Docker image – Our installations now include PostgreSQL 12.10-alpine Docker images.

Upgrades

Updated

Increased upgrade times – In v33, we introduced various performance improvements that required the creation of additional database indexes and the migration of data stored in the application’s database.

On-premise customers may see an increase in the time it takes to upgrade their application services to v33 and should plan up to an extra day in their upgrade window. The impact of these changes is greater for customers with larger databases.

Our v33 database migrations are executed after running ./run.sh init for the first time within the new bundle, and migrations must be completed before the Hyperscience application can be used. To reduce the size of your database and the time required to upgrade, you can configure a shorter Submission Record Deletion period in Settings.

Fixed

Deployment of "Document Processing Notifications (V33)" flow – We've fixed an issue that prevented the "Document Processing Notifications (V33)" flow from being deployed when upgrading from v32 to v33.

Recalibrating Semi-structured fields – We’ve fixed an issue that prevented the system from passing non-English text when recalibrating Semi-structured fields. The issue led to decreased performance for non-English Semi-structured layouts after upgrading the application.

S3 Submission Retrieval Store

Updated

Support for AWS Signature Version 2 in Submission Initialization Blocks – We’ve added support for AWS Signature Version 2 in Submission Initialization Blocks.

Databases

Fixed

Loading submissions on MSSQL databases – We've fixed an issue that prevented all submissions from being loaded in instances that processed a high volume of submissions and had MSSQL databases.

API

Updated

Pagination of Submission Listing responses – We've added an optional cursor_pagination parameter to the Submission Listing API call. When set to true, count is not included in the response, and the system uses cursor-based pagination rather than numeric offsets when generating next and prev URLs. Also, start_time__gte and start_time__lt are the only other parameters that can be used when cursor_pagination is set to true.
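A client might use the parameter as sketched below. The parameter names cursor_pagination, start_time__gte, and start_time__lt and the next URL come from the note above. The endpoint path, the "results" key, and the fetch_json callable are assumptions for illustration, so check them against your API documentation.

```python
from urllib.parse import urlencode

# Hypothetical path used for illustration; confirm it in your API docs.
SUBMISSIONS_PATH = "/api/v5/submissions"

def first_page_url(base_url, start_time_gte=None, start_time_lt=None):
    """Build the first Submission Listing URL with cursor pagination on."""
    params = {"cursor_pagination": "true"}
    if start_time_gte:
        params["start_time__gte"] = start_time_gte
    if start_time_lt:
        params["start_time__lt"] = start_time_lt
    return f"{base_url}{SUBMISSIONS_PATH}?{urlencode(params)}"

def iter_submissions(first_url, fetch_json):
    """Yield submissions page by page by following each page's next URL.

    fetch_json is any callable that takes a URL and returns the decoded
    JSON body; the "results" key is an assumed response field.
    """
    url = first_url
    while url:
        page = fetch_json(url)
        yield from page.get("results", [])
        url = page.get("next")
```

Because count is omitted when cursor_pagination is true, the loop stops when a page has no next URL rather than by comparing against a total.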

33.0.2 (30 Mar 2022)

This version of v33 is not officially supported. The next patch version of v33, v33.1.0, is the first officially supported version of v33.

For more information, contact your Hyperscience representative.

Flows

Updated

Queries for Transcription Automation – We've optimized the queries used to calculate and display values related to Transcription Automation flow settings.

Queries for Input Block validation – We've eliminated the unnecessary queries that the system executed when checking for duplicate Input Block settings across flows.

Submission Output

Fixed

Automation data and Custom Supervision – We've fixed an issue that caused automation rates to be shown as greater than 100% on the Automation card on the Submission Output page. This issue occurred when Custom Supervision tasks were completed for the submission.

API

Updated

33.0.1 (18 Mar 2022)

This version of v33 is not officially supported. The next patch version of v33, v33.1.0, is the first officially supported version of v33.

For more information, contact your Hyperscience representative.

User Experience

Fixed

Making selections in dialog boxes – Previously, when a user attempted to select text in dialog boxes, the selection was automatically cleared. A fix for this issue is included in v33.0.1.

Layouts

Fixed

Font size of the “Archive layouts” dialog box’s text – We’ve made the Archive layouts dialog box’s text size consistent across the body of the dialog box.

Data Types

Fixed

Stripping of spaces in pattern-based data types – Previously, if (space) was selected as a character to be stripped from the normalized output of a pattern-based data type, spaces were not stripped from the normalized output. A fix for this issue is included in v33.0.1.

Models

Updated

Asynchronous download of Classification models – Previously, when downloading a Classification model, the file was directly downloaded to the user's machine. With the updates in v33.0.1, the system prepares the file in the background and creates a notification in the Notification Center when the file is ready to be downloaded. This change applies to the downloading of Classification models with and without training data.

Fixed

Model-import error messages in IE 11 – Previously, if an attempt to import a model in IE 11 was not successful, the error message shown was not completely contained in the dialog box. A fix for this issue is included in v33.0.1, and the user no longer needs to scroll the dialog box horizontally to view the complete message.

Opening the Models tab – We’ve fixed an issue with opening the Models tab (Library > Models). The issue resulted in system slowness.

Optimized payload queries for cell-level recalibration – We’ve removed redundant operations from payload queries for cell-level recalibration.

Flows

Updated

Optimized queries for loading and calculating Transcription target accuracy and automation values – We’ve optimized the queries for loading and calculating the Transcription target accuracy and automation values in flow settings. Previously, if a user edited a Classification setting’s value, the system would unnecessarily reload all target accuracy and automation values. With the improvement applied in v33.0.1, the system loads and calculates the Transcription target accuracy and automation values only in the following scenarios:

A user loads the Flow Studio page, and the Transcription Automation Training setting is enabled.
A user enables the Transcription Automation Training setting.
A user edits the Period of Records to Use setting’s value.

Fixed

Configuring an Output Block – We’ve fixed an issue that caused an Output Block to be non-configurable if it was the only block in a flow.

Names of restricted flows in Submissions and Documents tables – We've fixed an issue that caused the UUIDs of restricted flows, instead of the flows' names, to appear in the Submissions and Documents tables.

Flow Blocks

New

PDF Decrypt flow block – We’ve added a PDF Decrypt flow block that utilizes the QPDF command-line tool to decrypt PDF submissions prior to processing.
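For reference, decrypting a PDF with the QPDF command-line tool generally looks like the sketch below. How the flow block invokes QPDF is not documented here. The function names and file paths are hypothetical, and running decrypt_pdf requires qpdf to be installed.

```python
import subprocess

def qpdf_decrypt_command(src, dst, password=""):
    """Build a qpdf command that writes a decrypted copy of src to dst."""
    return ["qpdf", f"--password={password}", "--decrypt", src, dst]

def decrypt_pdf(src, dst, password=""):
    """Run qpdf; raises CalledProcessError if qpdf exits with an error."""
    subprocess.run(qpdf_decrypt_command(src, dst, password), check=True)

# Hypothetical file names; this only builds the command, it does not run it.
command = qpdf_decrypt_command("encrypted.pdf", "decrypted.pdf", "secret")
```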

Flow Execution

Fixed

Viewing flow diagrams in the Flow Execution page in IE 11 – We’ve fixed an issue with the Flow Execution page that prevented flow diagrams from being centered in IE 11.

Submission Processing

Updated

Omitting unnecessary columns from queries – We've optimized the database queries run by the system during various user tasks. The queries retrieve data only from columns relevant to the task the user is performing, increasing system responsiveness in high-volume instances.

“Upload Submissions” dialog box and default layouts – The Upload Submissions dialog box no longer defaults to any particular Semi-structured layout.

Storing field data types in Submission objects – We’ve fixed an issue with executing unnecessary MSSQL queries for retrieving field data types during submission processing. These queries slowed down system performance. We now store field data types in Submission objects, which leads to increased performance.

Fixed

Halted submissions and flows without document-processing blocks – We’ve fixed an issue that caused submissions to halt if they were sent to a flow that does not contain any document-processing blocks (e.g., Machine Identification Block, Manual Identification Block, Machine Transcription Block, etc.).

Submission Output

Fixed

Transcription automation and editing field transcriptions during Custom Supervision – We’ve fixed an issue that caused the Automation card on the Submission Output page to display incorrect percentages for Transcription. This issue occurred when a user edited field transcriptions during Custom Supervision.

Submissions and Documents Tables

Updated

Status filter for Custom Supervision in the Submissions and Documents tables – We’ve added Custom Supervision as an option in the Status filter in the Submissions and Documents tables.

Documents

Updated

Showing data about documents on the Documents page – We've updated the queries used to show data about documents' submissions and task restrictions on the Documents page of the application. These updates improve system performance in instances with a large number of documents.

Fixed

Optimized database queries for scheduling recalibration, auto-thresholding, and finetuning – We’ve fixed an issue with executing unnecessary database queries for scheduling recalibration, auto-thresholding, and finetuning.

MSSQL queries and system performance – We’ve fixed an issue with executing unnecessary MSSQL queries that caused spikes in CPU usage. These spikes resulted in system slowness.

Tasks

Updated

Placement of Task Deadline Breakdown chart – We've moved the Task Deadline Breakdown chart from the Task Queue page to the Task Overview page.

Redesign of In Queue chart – We've redesigned the In Queue chart on the Tasks Overview page to consist of a series of vertical bars, one for each task type. Each bar shows the total number of tasks of that type and the number of tasks that are overdue. These counts only include the tasks that the user can perform.

Counting and prioritizing overdue tasks – We've updated the calculations used to determine how many overdue tasks are in the system and the priority of each, reducing the loading time of pages in the Tasks section of the application.

Fixed

Incorrect values for the “Fields Manually Identified” data points in the Field Identification card – We’ve fixed an issue that caused incorrect Fields Manually Identified values to be shown in the Field Identification card (Tasks > Overview).

Field Identification

Fixed

Bounding boxes in Model Validation Tasks (MVTs) – We’ve fixed an issue that caused MVTs to display correct bounding boxes on incorrect pages. The issue occurred in documents with deleted pages.

Table Identification

Fixed

Top toolbar during table-identification review – We’ve fixed an issue that caused the top toolbar and its buttons to be duplicated during table-identification review.

Table Transcription

Fixed

Background color of active cells during Table Transcription – We’ve fixed an issue that caused the background color of a page and an active cell to be the same. With the fix applied in v33.0.1, active cells are now highlighted with a different color from the page’s background.

Custom Supervision

Fixed

Halting of submissions with pages matched to incorrect layout variations – Previously, when pages marked as matched to incorrect layout variations were sent to Custom Supervision, the pages' submissions would halt. A fix for this issue is included in v33.0.1.

Reporting

Fixed

Timestamps in downloaded All Users Performance reports – We've fixed an issue that caused end times to be omitted from downloaded All Users Performance reports.

Optimized queries for loading the Transcription Sampled Errors report – We’ve fixed an issue with the Transcription Sampled Errors report that caused system slowness.

Incorrect values for the “Fields Identified” column in KeyerPerformance.csv – We’ve fixed an issue with the “Fields Identified” column in KeyerPerformance.csv, which is part of the Keyer Projection report (Reporting > User Performance). The issue resulted in incorrect values for the “Fields Identified” column.

Incorrect values for the “ID Fields Completed” column in HourlyReportingSubmissionOverview.csv – We’ve fixed an issue with the “ID Fields Completed” column in HourlyReportingSubmissionOverview.csv, which is part of the Keyer Projection report (Reporting > User Performance). The issue resulted in incorrect values for the “ID Fields Completed” column.

Incorrect values for the “Field ID Supervision” column in the All Users Performance Summary report – We’ve fixed an issue with the “Field ID Supervision” column in the All Users Performance Summary report (Reporting > User Performance). The issue resulted in incorrect values for the “Field ID Supervision” column.

Incorrect value for the “Field ID” column of interest in the Supervision Volume report – We’ve fixed an issue with the “Field ID” column in the Supervision Volume report (Reporting > User Performance). The issue resulted in an incorrect value for the “Field Identification” column of interest.

Permissions

Updated

Indicators for restricted flows, tasks, submissions, documents, layouts, and models – If a user is in a permission group that does not have access to a restricted flow, task, submission, document, layout, or model, a lock icon and tooltip appear next to that object to let them know why they cannot access it.

Assigning a release to a restricted flow – If a user attempts to assign a release to a flow that they don't have access to, the system shows a message explaining that they will not be able to access the release after they assign it to the flow.

Determining whether a user has access to a submission or document – We've optimized the queries used to determine whether a user can access specific submissions or documents, improving system responsiveness.

"Selected Permissions" option in "Configure Group Access Permissions" – We've added a Selected Permissions option to the All Permissions drop-down list under Configure Group Access Permissions.

Fixed

System Admins and task restrictions – We've fixed an issue that caused task restrictions to be applied to System Admins. System Admins now have access to all tasks in the instance.

Name of imported permission groups – Previously, when a user imported a permission group whose name length exceeded the 150-character maximum, the import would fail, but an error message explaining the failure was not shown to the user. A fix for this issue is included in v33.0.1, and an explanatory error message is shown.

Adding authentication groups to permission groups – We've fixed an issue that caused the Hyperscience Permission Group drop-down list to retract when its Select All link, Clear All link, or Select Group Name search box was clicked.

Width of permission group name when viewing and editing groups – We've made the width of the permission group name consistent when viewing and editing a permission group.

Permission group name in "Delete" dialog box – We've fixed an issue in permission group deletion that caused long group names to extend outside of the Delete dialog box.

Authentication

Updated

Security

Fixed

Updating com.google.code.gson to 2.8.9 – To fix a security issue, we’ve updated com.google.code.gson to 2.8.9.

Installations

Updated

PostgreSQL 12.10-alpine Docker image – Our installations now include PostgreSQL 12.10-alpine Docker images.

Databases

Fixed

Submission log deletion in MSSQL databases – We've fixed an issue in submission log deletion queries that caused spikes in database CPU usage.

Task-request indexing in MSSQL databases – We've resolved an issue in the indexing of task requests that caused lock contention in MSSQL databases.

Task retrieval in MSSQL databases – We've fixed an issue that caused task-retrieval queries to consume excessive CPU resources in MSSQL databases.

S3 Submission Retrieval Store

Updated

Support for AWS Signature Version 2 in Submission Initialization Blocks – We’ve added support for AWS Signature Version 2 in Submission Initialization Blocks.

API

Updated

Adding information about API v4 deprecation – We’ve added information about API v4 deprecation to our API documentation.

Adding information about Base64-encoded JSON data in submission creation – We’ve added information in our API documentation about sending submission data in JSON format when creating submissions via the Submission Creation endpoint.

33.0.0 (17 Mar 2022)

This version of v33 is not officially supported. The next patch version of v33, v33.1.0, is the first officially supported version of v33.

For more information, contact your Hyperscience representative.

Languages

New

New languages – We support automation on documents in the following languages:

Korean – We’ve added support for the extraction of Korean printed text from Semi-structured documents. The extraction of handwritten Korean text or Korean text from Structured documents is not supported at this time.
Chinese – We now support the extraction of Chinese printed text from Structured documents. However, it is not yet possible to extract Chinese handwritten text or Chinese text from Semi-structured documents.
Japanese – You can now extract Japanese printed text from Structured documents. Note that we do not support the extraction of Japanese handwritten text or Japanese text from Semi-structured documents.

Note that Auto Thresholding and Transcription Automation (“finetuning”) are not supported in documents containing Korean, Chinese, or Japanese text.

Updated

Enhanced support for Arabic documents – We now fully support the extraction of printed and handwritten Arabic text from both Structured and Semi-structured documents. In this version, we've added support for the extraction of handwritten and printed Arabic text from Semi-structured documents. Additionally, we've expanded our support for Structured Arabic documents to include handwritten text.

Note that Auto Thresholding and Transcription Automation (“finetuning”) are not supported in documents containing Arabic text.

Submissions

Updated

New supported file types – We now support submissions sent as HEIC or HTML files.

Flows

Updated

Importing and exporting flows – We've made the following improvements to the process of exporting and importing flows:

Flows are now exported as ZIP files. Each flow's ZIP file contains the flow's JSON file and any Python files used by the flow (e.g., code in Custom Code Blocks).
When importing flows, you can import the flow's entire ZIP file.
Upon importing a new version of a flow, the system compares the flow's current JSON file with the JSON file being imported and shows you the differences in key/value pairs between the two. You then have the option of:
- downloading the current JSON file for the flow,
- committing the changes, or
- canceling the import process.

Note that the system only compares the current and imported JSON files, not any Python files used by the flow.
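A key/value comparison of two flow JSON files can be sketched as follows. This is an illustration of the idea rather than the application's comparison logic; the function name, the dotted-path output, and the sample flow contents are assumptions.

```python
MISSING = "<missing>"  # marker for a key present on only one side

def diff_key_values(current, imported, prefix=""):
    """Return {path: (current_value, imported_value)} for differing keys."""
    changes = {}
    for key in sorted(set(current) | set(imported)):
        path = f"{prefix}.{key}" if prefix else str(key)
        old = current.get(key, MISSING)
        new = imported.get(key, MISSING)
        if isinstance(old, dict) and isinstance(new, dict):
            # Recurse so nested settings are reported by their full path.
            changes.update(diff_key_values(old, new, path))
        elif old != new:
            changes[path] = (old, new)
    return changes

# Hypothetical flow contents, as parsed with json.load().
current_flow = {"name": "Flow A", "settings": {"x": 1, "y": 2}}
imported_flow = {"name": "Flow A", "settings": {"x": 1, "y": 3}, "new": True}
changes = diff_key_values(current_flow, imported_flow)
```

The result lists only the changed and added keys, which is the kind of summary a reviewer would check before committing an import.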

Flow-based permissions – To expand our support for multiple lines of business—and to prevent the performance of one line from affecting the performance of others—we've added flow-specific permissions to the application. When you assign a permission group to a flow, members of that permission group are the only ones who can view and modify:

the flow,
submissions sent to the flow,
documents processed by the flow,
the release assigned to the flow, the release's Classification model and layouts, and the Field Identification models for those layouts.

Members of the assigned permission group are also the only users who can complete Supervision tasks created from their assigned flow. To help you see the work available for a specific flow, you can now filter the entire contents of the Task Queue page to show information about that flow.

As part of this update, we've created a new interface for configuring permission groups, and you can export a permission group's settings and import them to other instances.

Note that, when creating a new release, users can still add any and all layouts, regardless of whether they have access to the layouts they are adding. Therefore, we recommend training your users to ensure that they add the appropriate layouts to releases.

Flows SDK

New

Flows SDK v1.0 – On March 31, 2022, we will release v1.0 of our Flows SDK. With the Flows SDK, engineers at your organization can create flows in Python with our external developer library. The library contains Python classes for each available flow block, which you can use to create custom flows and modify any flows you create with the SDK. When you compile a completed flow, the library creates a JSON file, which you can then upload to Hyperscience. Flows created with the Flows SDK are compatible with Hyperscience v32 and v33.

Note that you cannot use the SDK to modify flows created outside of the Flows SDK.

For more information about the Flows SDK and for access, contact your Hyperscience representative or email us at [email protected].

Submission Initialization Block

Updated

Support for configuring additional submission sources – We've added OCS Configuration and Generic Web Storage (HTTP/HTTPS) Configuration settings to the Submission Initialization Block in the “Document Processing (V33)” flow, allowing you to set up your file store from within the application.

SOAP API Blocks

Updated

WSDL support in SOAP API Blocks – We now support the use of WSDL files in SOAP API Blocks. As part of this update, we’ve added a boolean Use WSDL setting to these blocks. If this setting is enabled, you can enter the WSDL file's URI in the new WSDL URI setting. Enabling this setting also reveals the WSDL Service Name and WSDL Port Name settings.

If this setting is disabled, the behavior of the block is the same as it was in previous versions, except for the sending of the SOAP Headers and SOAP Parameters values in JSON format.

Testing connections – You can now test connections you’ve configured in SOAP API Blocks. To test a connection, click the new Test Connection button beneath the SOAP API Block settings.

Classification

New

Importing and exporting Classification training data – You can now move Classification training data from one instance to another. When exporting a Classification model, you can choose to bundle the model’s training documents with the model, eliminating the need to regenerate training data in each of your Hyperscience instances.

Note that, when importing Classification training data, the PII Deletion policy in the destination instance is applied to the training documents.

Field Identification

New

Identification QA tasks and automation for fields with multiple occurrences – To reduce the need for human input while still reaching the desired output accuracy, we now allow automation for fields with multiple occurrences.

With the introduction of a new ID model for fields with multiple occurrences, you can achieve automation based on the threshold you specify in the Field Identification Target Accuracy flow setting. If the machine’s confidence is below this threshold, the system generates a field ID task for all occurrences.
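The threshold rule described above can be sketched as a simple decision. The sketch assumes the model yields one confidence value per field and uses hypothetical names throughout; it is not the system's implementation.

```python
def route_field(occurrences, confidence, threshold):
    """Automate all occurrences, or send all of them to a field ID task."""
    if confidence < threshold:
        # Below the threshold, every occurrence goes to manual review.
        return {"automated": [], "field_id_task": list(occurrences)}
    return {"automated": list(occurrences), "field_id_task": []}

# Hypothetical occurrences of one field and a 0.90 target threshold.
occurrences = ["invoice_no#1", "invoice_no#2"]
confident = route_field(occurrences, confidence=0.95, threshold=0.90)
uncertain = route_field(occurrences, confidence=0.70, threshold=0.90)
```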

To automate the identification of all occurrences of a field for a specific layout, you need to select the new field ID model called MULTIPLE_OCCURRENCES under Flex Engine Type for Training for this specific layout before training. This setting can be found at /admin/form_extraction/template/. To improve the training data’s quality, you can identify any inconsistencies that could lead to poor model performance and fix them by adding and removing occurrences using the Keyer Data Management tools. Adding and removing occurrences is also supported in Field Identification QA, Model Validation Tasks, and Flexible Extraction.

With the addition of Field Identification QA tasks for fields with multiple occurrences, you can improve the system’s performance. To support multiple occurrences, the Field Identification QA tasks’ interface is now condensed to a single page containing all occurrences that need to go through QA.

Reporting is now provided on an occurrence level, which gives your organization the ability to track accuracy and automation for all occurrences of fields.

Table Identification

New

Storing low-confidence machine predictions for table cells in Document Output page – Previously, low-confidence machine predictions for table cells were not stored in the Document Output page. In v33 and later, if Manual Identification Supervision is disabled, and the machine has a low level of confidence in table cells’ predictions, the table cells are stored in the Document Output page with the Identification Supervision Required, But Disabled exception. If Supervision is enabled, the table cells undergo Supervision and are stored as identified by machine or user.
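The outcomes described above can be summarized in a small decision function. The exception name is taken from the note. The function name, the numeric threshold, and the handling of confident predictions are assumptions for illustration.

```python
def table_cell_outcome(confidence, threshold, supervision_enabled):
    """Describe what happens to a table cell prediction."""
    if confidence >= threshold:
        # Assumed: confident predictions are stored without review.
        return "stored: identified by machine"
    if supervision_enabled:
        return "sent to Identification Supervision"
    return ("stored with exception: "
            "Identification Supervision Required, But Disabled")

# Hypothetical low-confidence cell with Supervision disabled.
outcome = table_cell_outcome(0.4, 0.8, supervision_enabled=False)
```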

Updated

User experience improvements to Table Identification tasks – We've made the following enhancements to the Table Identification user experience:

Improved copycat performance – We've improved the output of the copycat tool in large documents.
Adding a new row – When adding a new row to a table, you only need to draw the row. You do not need to click the + button on the left side of the table. When you draw a new row, the edges of the bounding box automatically snap to nearby text segments.
Navigating tables with the keyboard – You can now use the up and down arrows on your keyboard to select rows above or below your current position in the table.
Improved row splitting – The portions of the cells under the splitter become their own cells, and their bounding boxes automatically snap to nearby text segments.
Drawing over multiple rows – When drawing a bounding box across multiple rows, a cell is created for each row in each selected column.

Transcription

New

Threshold and target accuracy for table cells – The system now allows you to configure a threshold and target accuracy for table cells independently. This configuration enables auto-thresholding for table cells. You can set these table cells’ transcription settings on a flow level. The settings include:

Table Target Accuracy – Your desired accuracy for the transcription of table cells in Semi-structured documents. If Transcription Automation Training is enabled, the system uses this value to calculate Table Automation once the minimum amount of training data is obtained through Table Transcription QA.
Table Automation – This setting shows the level of automation you can expect when the system is working to reach the target accuracy set in Table Target Accuracy. The system automatically calculates this value after the minimum amount of training data is obtained through Table Transcription QA.
Table Threshold – This setting determines the minimum confidence score needed for a table cell to be automatically processed. If Transcription Automation Training is enabled, any value you enter manually will be overwritten by the value the system calculates based on your target accuracy.
Table Minimum Legibility Threshold – The minimum confidence score a table cell in a Structured document must have in order for the system to process it automatically. If a table cell’s confidence score is below this value, the system marks the table cell as illegible.
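The relationship between target accuracy, threshold, and automation can be illustrated with a generic sketch. It is not the system's auto-thresholding algorithm. It assumes QA samples are available as (confidence, is_correct) pairs and picks the lowest threshold whose retained samples meet the target accuracy.

```python
def pick_threshold(samples, target_accuracy):
    """Return (threshold, automation) or None if the target is unreachable.

    samples: list of (confidence, is_correct) pairs from QA.
    automation: share of samples at or above the chosen threshold.
    """
    best = None
    for threshold in sorted({conf for conf, _ in samples}, reverse=True):
        kept = [ok for conf, ok in samples if conf >= threshold]
        accuracy = sum(kept) / len(kept)
        if accuracy >= target_accuracy:
            # Thresholds are visited high to low, so the last match is lowest.
            best = (threshold, len(kept) / len(samples))
    return best

# Hypothetical QA samples for table cells.
samples = [(0.95, True), (0.9, True), (0.8, True), (0.7, False), (0.6, True)]
threshold, automation = pick_threshold(samples, target_accuracy=0.9)
```

In this example a 90% target keeps only the cells scored 0.8 or higher, so three of the five sampled cells would be automated.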

Updated

Transcribing the rupee symbol – The system can now automatically transcribe the rupee symbol (₹) in submissions.

Custom Supervision

New

Support for editing text crops in documents without defined fields – To enable Supervision for documents that do not have any corresponding layouts or field lists, we now allow you to edit crops, or text segments, from full-page transcription using Custom Supervision.

The supported use case in v33 is the following:

A custom flow processes selected pages through Full Page Transcription. A Custom Code Block and a Routing Block determine whether a page should be sent to Custom Supervision based on predefined criteria, such as:

the machine has low confidence in transcribing text crops, or
the machine identifies text crops that match particular predefined keywords.

Then, you complete tasks for all crops that are sent to Custom Supervision, which allows you to extract the crops’ text with complete accuracy.
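
The two routing criteria above can be sketched in Python as a single decision function. This is only an illustration of the logic, not the Custom Code Block interface: the `Crop` class, the `needs_custom_supervision` function, the default confidence cutoff, and the example keywords are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass
class Crop:
    """Hypothetical stand-in for one transcribed text segment."""
    text: str
    confidence: float


def needs_custom_supervision(
    crops: Iterable[Crop],
    min_confidence: float = 0.8,           # assumed cutoff, not a product default
    keywords: Sequence[str] = ("account", "policy"),  # example keywords only
) -> bool:
    """Return True if any crop is low-confidence or contains a keyword."""
    lowered = [keyword.lower() for keyword in keywords]
    for crop in crops:
        if crop.confidence < min_confidence:
            return True
        if any(keyword in crop.text.lower() for keyword in lowered):
            return True
    return False


page = [Crop("Invoice total", 0.97), Crop("Policy number 123", 0.93)]
send_to_supervision = needs_custom_supervision(page)
```

In a flow, a result like `send_to_supervision` would be the value a Routing Block branches on.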

Reporting

Updated

“Pages Processed” and “Pages Submitted” filters in the System Throughput report – To display the number of processed and submitted pages, we’ve updated the System Throughput report (Reporting > Overview) to include Pages Processed and Pages Submitted filters.

Classification data in the Keyer Projection report – We’ve updated the Keyer Projection report (Reporting > User Performance) to include the following columns:

Users Performing Classification
Classification Tasks in Starting Work Queue
Classification Tasks Added to Work Queue
Classification Tasks Completed
Classification Tasks In Ending Work Queue
Time Spent in Classification Tasks (Seconds)
Users Performing Classification QA
Classification QA Tasks in Starting Work Queue
Classification QA Tasks Added to Work Queue
Classification QA Tasks Completed
Classification QA Tasks In Ending Work Queue
Time Spent in Classification QA Tasks (Seconds)

Renaming “Doc Org” to “Document Classification” in HourlyReportingTaskOverview.csv – We’ve renamed the “Doc Org” task type to “Document Classification” in HourlyReportingTaskOverview.csv, which is part of the Keyer Projection report (Reporting > User Performance).

Classification data in Supervision Volume, Performance Distribution, and All Users Performance Summary reports – To help you manage your workforce, we’ve updated the following reports to include data about Classification tasks:

Supervision Volume (Reporting > User Performance), data visible in the Classification filter.
Performance Distribution (Reporting > User Performance), data visible in the Classification filter.
All Users Performance Summary (Reporting > User Performance), data visible in the Classification filter.

You can also download the Classification tasks’ data from the above-mentioned reports.

Keyer Data Management

New

Keyer Data Management for tables – In previous versions of Hyperscience, the Keyer Data Management tools were only available for non-table fields. To give you more control over your table cell identification data and automation, we're introducing the following changes in v33:

Editing table annotations – Users with the Edit Training Data permission can edit a document's table annotations, which changes the ground-truth data of the model. To edit table annotations, we’ve introduced a new button under the Actions column in the Training Data table. This button is used only for editing table annotations.
Training status – While editing table annotations, users with the Edit Training Data permission can also indicate whether the document should always or never be included in future model trainings, overriding any PII Deletion settings the document may be subject to.
Column-level Automation table – Similar to the Field-level Automation table, you can now check the automation rates for your layout’s table cells. The Column-level Automation table helps you identify potential ground-truth errors quickly. Each table column has a View Annotations button that redirects you to the Documents table.
Annotation UI for tables – To keep consistency between Table ID Supervision and the annotation UI for tables, the annotation UI for tables uses the Template Tool’s functionality.

Input Connections and Output Connections

Updated

“Headers to Include” setting for Email Listener – We’ve added a Headers to Include setting for the Email Listener input connector. This setting allows you to include headers from emails ingested into Hyperscience via the Email Listener connector.
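
To illustrate what selecting headers from an email means in practice, here is a minimal Python sketch that uses the standard library's `email` package. It does not show how the Email Listener is implemented or how the headers appear in a submission. The `select_headers` function and the sample message are assumptions for this example.

```python
from email import message_from_string
from typing import Dict, Iterable


def select_headers(raw_email: str, headers_to_include: Iterable[str]) -> Dict[str, str]:
    """Return only the requested headers that are present in the message."""
    message = message_from_string(raw_email)
    selected = {}
    for name in headers_to_include:
        value = message[name]  # header lookup is case-insensitive
        if value is not None:
            selected[name] = value
    return selected


raw = "From: sender@example.com\nSubject: Invoice 42\nX-Priority: 1\n\nBody text"
headers = select_headers(raw, ["Subject", "X-Priority", "Reply-To"])
```

Headers that are requested but absent from the message, such as `Reply-To` here, are simply left out of the result.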

Removal of "v3" option – Because we have sunset v3 of our API, we have removed the v3 option from the API Version setting in input connections and output connections.

For more information about the sunsetting of API v3, see the API section of these release notes.

Fixed

Hiding “Exchange” and “Routing Key” settings for RabbitMQ Listener – We’ve hidden the Exchange and Routing Key settings for RabbitMQ Listener. Previously, these settings were displayed for RabbitMQ Listener but were unnecessary.

Universal Folder Listener's performance with a large number of resources – We've fixed a resource-detection issue in the Universal Folder Listener's block-process manager that caused delays when a large number of images were present in the source folder.

Updates to "Box Metadata Template Key" setting – Because the Box Metadata Template Key setting is required in Box Notifier Output Blocks, we've removed the "Optional" label for that setting. We've also removed the setting's default value.

Permissions

Updated

Permission groups settings – We’ve updated the interface for permission groups settings. When you click on a permission group (Users > Permission Groups), the permission group’s details are divided into the following sections:

Linked Authentication Groups – If you are using an LDAP authentication provider, the users in these authentication groups are automatically added to the permission group.
Users – This section shows the users who have been added to the permission group.
Access Permissions – A list of the permissions assigned to the permission group appears in this section.
Assign Group Access to Specific Flows – If this permission group has been assigned access to particular live flows, they are listed in this section. Users in this permission group, along with users in any other groups that have been assigned access to these flows, can access these flows and their submissions, documents, releases, layouts, and models. For more information, see the Flows section of these release notes.

Exporting and importing permission groups – You can now export permission groups and import them to other instances.

Duplicating permission groups – If you want to create a permission group that’s very similar to one of your existing permission groups, you can now do so by duplicating the existing permission group and editing the duplicate.

Infrastructure

New

Kubernetes support for private cloud deployments – We now support the use of Kubernetes as a container-orchestration solution in private cloud deployments of Hyperscience.

Note that Kubernetes is supported only in new installations of Hyperscience. It is not currently possible to migrate an existing instance to a Kubernetes-enabled infrastructure.

Furthermore, the use of Kubernetes is not supported in bare-metal, fully on-premise deployments of Hyperscience. In v33, we support only Kubernetes deployments that run on Amazon Elastic Kubernetes Service (EKS).

Our Kubernetes deployments autoscale the trainer only, not the application. In future versions of Hyperscience, we plan to extend our Kubernetes offering to include autoscaling of the application.

To learn more about Kubernetes, including its capabilities and implementation, see Kubernetes.

SaaS

Updated

SaaS in EMEA – We now fully support the use of our SaaS solution in Europe. As part of this update, you can choose to have your application and all of your data hosted in the EU, and if you choose this option, only Hyperscience personnel located in Europe can access your Hyperscience infrastructure and application. In accordance with GDPR requirements, you can also request that data about a particular individual be erased from the Hyperscience instance.

While these updates are meant to meet the requirements outlined in European privacy laws, they also allow us to support SaaS deployments in the Middle East and Africa.

System & Health

Updated

Health checks for Database Blocks – The System & Health page now has an External Database Blocks section, which contains a card for each Database Block in each flow. Each card shows the database's status and linked flow, as well as the number of errors found for the database. Each database also has a Connection Logs button, which shows the connection logs for the database's block when clicked.

To learn more, see System & Health Page.

Databases

Updated

Support for PostgreSQL 13.x – We've added support for PostgreSQL 13.x.

Security

Updated

TLS ".env" file variables and output connections – The values for the HS_TLS_CA_BUNDLE and HS_TLS_VERIFY_ENABLED ".env" file variables now apply to connections that use Java Message Service (JMS). These connections include RabbitMQ, ActiveMQ, and IBM MQ output connections.

For more information about these variables, see Security.
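
As a general illustration of what a CA-bundle variable and a verification toggle usually control, the Python sketch below builds a TLS context from the two variable names. It is not Hyperscience code. It assumes that HS_TLS_CA_BUNDLE holds a path to a CA bundle file and that HS_TLS_VERIFY_ENABLED accepts values such as "true" and "false"; check the Security documentation for the actual accepted values.

```python
import os
import ssl
from typing import Mapping


def build_tls_context(env: Mapping[str, str] = os.environ) -> ssl.SSLContext:
    """Build a client TLS context from the two variables (assumed semantics)."""
    context = ssl.create_default_context()

    # Assumption: the variable holds a filesystem path to a PEM CA bundle.
    ca_bundle = env.get("HS_TLS_CA_BUNDLE")
    if ca_bundle:
        context.load_verify_locations(cafile=ca_bundle)

    # Assumption: verification stays on unless explicitly disabled.
    verify = env.get("HS_TLS_VERIFY_ENABLED", "true").strip().lower()
    if verify in ("false", "0", "no"):
        context.check_hostname = False  # must be cleared before CERT_NONE
        context.verify_mode = ssl.CERT_NONE
    return context


insecure = build_tls_context({"HS_TLS_VERIFY_ENABLED": "false"})
default = build_tls_context({})
```

Disabling verification removes protection against impersonated brokers, so a sketch like this would normally keep verification enabled outside of test environments.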

API

Fixed

Time zones for start_time and complete_time – Previously, the system used:

the value of the server’s time zone to calculate start_time and complete_time in API responses, and
two different time zones to calculate start_time and complete_time values inside and outside of the output key in the output produced by Output Notification Blocks.

The system now uses UTC time to calculate these values, regardless of where they appear.

Note that this update could cause your integration with Hyperscience to stop working as intended if its logic relies on start_time or complete_time.
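
If your integration converts these values to a local time zone, one defensive approach is to treat every start_time and complete_time value as UTC before converting it. The Python sketch below shows the idea. The ISO 8601 timestamp format and the `parse_utc` helper are assumptions for this example; confirm the actual format in the API documentation.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp and return a timezone-aware UTC datetime."""
    if timestamp.endswith("Z"):
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        timestamp = timestamp[:-1] + "+00:00"
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # No offset given: interpret the value as UTC, per the change above.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


start = parse_utc("2023-01-26T14:30:00Z")
complete = parse_utc("2023-01-26T14:32:15")
duration = complete - start

# Convert for display only; keep stored and compared values in UTC.
local_start = start.astimezone(timezone(timedelta(hours=-5)))
```

Comparing and storing the UTC values, and converting only for display, keeps duration calculations independent of the server's time zone.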

Updated

Sunsetting of API v3 – With the release of v33 of Hyperscience, we have sunset v3 of our API. The code for API v3 has been removed, and it is no longer possible to send API requests with the /api/v3 URL prefix.

To learn more about our API deprecation policy, see the Hyperscience API Deprecation Policy & Schedule in our API documentation.

Deprecating API v4 – With the release of v33 of Hyperscience, we are deprecating v4 of our API. We will no longer add features or fixes to API v4, and we will sunset it in Hyperscience v37 or March 2023 (whichever is later). At that point, API v4 will be removed from our application.

If you are using this version of our API, we encourage you to use API v5, the latest version of our API.

To learn more about API v5, see our API documentation.
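
To find calls in your own integration that are affected by these changes, you could scan the URLs it uses for versioned prefixes. The Python sketch below is a hypothetical helper, not part of any Hyperscience tooling. It assumes that API v4 and v5 use `/api/v4` and `/api/v5` prefixes by analogy with the `/api/v3` prefix mentioned above, and the host name is a placeholder.

```python
import re
from urllib.parse import urlparse

REMOVED_VERSIONS = {"v3"}      # sunset in v33, per these release notes
DEPRECATED_VERSIONS = {"v4"}   # deprecated in v33, per these release notes

_VERSION_PATTERN = re.compile(r"^/api/(v\d+)(?:/|$)")


def api_version_status(url: str) -> str:
    """Classify a URL as 'removed', 'deprecated', 'supported', or 'unknown'."""
    match = _VERSION_PATTERN.match(urlparse(url).path)
    if match is None:
        return "unknown"
    version = match.group(1)
    if version in REMOVED_VERSIONS:
        return "removed"
    if version in DEPRECATED_VERSIONS:
        return "deprecated"
    return "supported"


urls = [
    "https://hyperscience.example.com/api/v3/submissions",
    "https://hyperscience.example.com/api/v4/submissions",
    "https://hyperscience.example.com/api/v5/submissions",
]
statuses = [api_version_status(url) for url in urls]
```

Any URL reported as "removed" will no longer work in v33, and any URL reported as "deprecated" is a candidate for migration to API v5.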

Support for Base64-encoded JSON data in submission creation – You can now send submission data in JSON format when creating submissions via the Submission Creation endpoint. To do so, include the Content-Type: application/json header in your request. When sending requests with this header, note that the request body has a different format than requests sent as multipart/form-data or application/x-www-form-urlencoded.

To learn more about sending data in JSON format, see Submission Creation in our API documentation.
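
The exact request body is defined in the API documentation and is not reproduced here. Purely to illustrate the general idea of carrying file content as Base64 text inside a JSON body, the Python sketch below builds such a request. The field names `files`, `filename`, and `content_base64` are hypothetical placeholders, not the Submission Creation endpoint's real schema; only the Content-Type header comes from these release notes.

```python
import base64
import json
from typing import Dict, Tuple


def build_json_submission(filename: str, content: bytes) -> Tuple[Dict[str, str], str]:
    """Return (headers, body) for a JSON request with Base64-encoded content."""
    payload = {
        # Hypothetical field names, for illustration only.
        "files": [
            {
                "filename": filename,
                "content_base64": base64.b64encode(content).decode("ascii"),
            }
        ]
    }
    headers = {"Content-Type": "application/json"}
    return headers, json.dumps(payload)


headers, body = build_json_submission("form.pdf", b"%PDF-1.7 example bytes")
decoded = base64.b64decode(json.loads(body)["files"][0]["content_base64"])
```

Base64 encoding makes binary file content safe to embed in JSON, at the cost of a body roughly one third larger than the raw bytes.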