Flow Blocks

With the introduction of flows, we have created the following types of blocks to help you customize your flows to meet your teams’ needs. To learn more about any of these blocks, or for assistance in adding them to your flows, contact your Hyperscience representative.

Block types and settings

Application-level and flow-level settings
While some blocks have settings that allow you to customize how they work, many blocks are affected by application-level or flow-level settings. To learn more about these settings, see Application Settings Overview and Flow Settings.

Input Blocks

Formerly known as “Input Connectors,” Input Blocks allow you to integrate your organization’s data sources into our system. Through these blocks, you can process documents from a variety of sources, such as inboxes, message queues, or folders on the network.

A full list of the Input Blocks we currently support, along with more details about each, can be found in Input Blocks.

Settings

No matter how many Input Blocks you choose to enable, information about those blocks is contained in a main Input Block for your flow. This block has the settings described below.

Name	Required?	Description
Allow API submissions	No	Indicates whether the flow accepts submissions sent through the API.
Allow manual submissions	No	Indicates whether the flow accepts manually uploaded submissions.

Submission Initialization

The Submission Initialization Block contains settings that connect your flow to your:

AWS S3 submission retrieval store,
OCS submission retrieval store,
Generic web storage (HTTP/HTTPS) submission retrieval store,
[v39.2 and later] Azure Blob Storage retrieval store.

Settings

You can customize the functionality of your block by editing the settings described below.

AWS S3

S3 Submission Retrieval Store

If you are using an S3 bucket as your submission retrieval store and you are not authenticating through IAM roles, provide your AWS access key ID and secret access key in the S3 Submission Retrieval Store field.

To enter your credentials:

Click Edit value.
Enter your credentials in JSON format:
```
{
"aws_access_key_id": "<your_access_key_id>",
"aws_secret_access_key": "<your_secret_key>"
}
```
Optionally, to authenticate requests with AWS Signature Version 2 (SigV2), add the following key and value to the JSON in the S3 Submission Retrieval Store field:
```
"s3_signature_version":"s3"
```
Click Done.
Click Save in the upper-right corner of the page.
In the dialog box that appears, click Save & Deploy.

For more information about AWS access key IDs and secret access keys, see Amazon's Understanding and getting your AWS credentials.
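
As a minimal sketch, the JSON value for the S3 Submission Retrieval Store field could be generated with a short Python script rather than typed by hand, which helps avoid quoting and comma mistakes. The function name build_s3_store_value is hypothetical and is not part of Hyperscience; only the three keys come from the steps above.

```python
import json


def build_s3_store_value(access_key_id, secret_access_key, use_sigv2=False):
    """Return JSON text for the S3 Submission Retrieval Store field.

    Hypothetical helper for illustration only; it is not a Hyperscience API.
    """
    value = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }
    if use_sigv2:
        # Optional key from the steps above: "s3" selects AWS Signature Version 2.
        value["s3_signature_version"] = "s3"
    return json.dumps(value, indent=2)


if __name__ == "__main__":
    # Placeholders only; substitute your own credentials before pasting the output.
    print(build_s3_store_value("<your_access_key_id>", "<your_secret_key>", use_sigv2=True))
```

The printed text can then be pasted into the field after you click Edit value.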

S3 Submission Retrieval Endpoint URL

If your submission retrieval store is not in the public cloud (that is, its URL does not point to s3.amazonaws.com, as with a government cloud or an S3-compatible internal setup), enter its URL in the S3 Submission Retrieval Endpoint URL field. You do not need to edit your “.env” file to update this URL.

To edit the endpoint URL for your S3 submission retrieval store:

Enter the URL in the S3 Submission Retrieval Endpoint URL field or edit its contents.
Click Save in the upper-right corner of the page.
In the dialog box that appears, click Save & Deploy.

If the bucket you’re using as your submission retrieval store is in a public cloud (as opposed to a government cloud or an S3-compatible internal setup), leave this field blank.

OCS

OCS Configuration

If you are using an OCS submission file store, enter the configuration details for your file store in this field.

To enter your configuration details:

Click Edit value.
Enter the configuration details in JSON format:
```
{
"host_url": "", 
"username": "", 
"password": "", 
"ssl_cert": ""
}
```
The value of ssl_cert should match the CA bundle filename inside the $HS_PATH/certs directory. To disable certificate validation, set this value to SKIP.
Click Done.
Click Save in the upper-right corner of the page.
In the dialog box that appears, click Save & Deploy.
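
Before pasting an OCS configuration into the field, it may help to check it locally. The sketch below is a hypothetical pre-check, not a Hyperscience tool. It assumes that all four keys must be filled in, and that ssl_cert is either SKIP or a bare filename (because it has to match a file in the $HS_PATH/certs directory). The certs_dir argument is an optional local path that you supply yourself.

```python
import json
import os

REQUIRED_KEYS = ("host_url", "username", "password", "ssl_cert")


def check_ocs_config(text, certs_dir=None):
    """Return a list of problems found in an OCS Configuration JSON string.

    Hypothetical pre-check for illustration; assumes every key is required.
    """
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(config, dict):
        return ["top-level value must be a JSON object"]

    problems = [
        f"missing or empty key: {key}" for key in REQUIRED_KEYS if not config.get(key)
    ]

    ssl_cert = config.get("ssl_cert", "")
    if ssl_cert and ssl_cert != "SKIP":
        if os.path.basename(ssl_cert) != ssl_cert:
            # The value should match a filename inside the certs directory.
            problems.append("ssl_cert should be a filename, not a path")
        elif certs_dir and not os.path.isfile(os.path.join(certs_dir, ssl_cert)):
            problems.append(f"{ssl_cert} not found in {certs_dir}")
    return problems
```

An empty list means that the sketch found nothing to flag; it does not confirm that the file store accepts the credentials.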

Generic Web Storage (HTTP/HTTPS)

Generic Web Storage (HTTP/HTTPS) Configuration

If you are using a generic web storage submission file store, enter the configuration details for your file store in this field.

We use Basic Authentication for Generic Web Storage Configuration.

To enter your configuration details:

Click Edit value.
Enter the configuration details in JSON format:
```
{ 
"username": "", 
"password": "", 
"ssl_cert": ""
}
```
The value of ssl_cert should match the CA bundle filename inside the $HS_PATH/certs directory. To disable certificate validation, set this value to SKIP.
Click Done.
Click Save in the upper-right corner of the page.
In the dialog box that appears, click Save & Deploy.
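
For context on the Basic Authentication mentioned above: under the general HTTP Basic scheme (RFC 7617), the username and password are joined with a colon, Base64-encoded, and sent in an Authorization header. The sketch below illustrates that scheme only. How Hyperscience builds its requests internally is not described in this article, and the function name is hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build an HTTP Basic Authentication header (general scheme, RFC 7617)."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Example credentials taken from RFC 7617, not real ones.
    print(basic_auth_header("Aladdin", "open sesame"))
```

Base64 is an encoding, not encryption, so credentials sent this way are readable by anyone who can see the request unless the connection uses HTTPS.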

Azure Blob Storage

The Azure Blob Storage option for submission retrieval storage is available in v39.2 and later.

If you are using Azure Blob Storage as your submission retrieval store, you can use the fields described below to configure the system’s connection to the blob.

Azure Blob Storage Authentication Type

From the Azure Blob Storage Authentication Type drop-down list, select the authentication type the system should use to access the blob:

SAS Token Only
Service Principal
Managed Identity
Account Key

When you select an authentication type, additional settings appear.

Settings for SAS Token Only authentication

Name	Required?	Description
Azure Blob Storage Account URL	Yes	The URL of the storage account (e.g., https://.blob.core.windows.net)

Settings for Service Principal authentication

Name	Required?	Description
Azure Blob Storage Account URL	Yes	The URL of the storage account (e.g., https://.blob.core.windows.net)
Azure Blob Storage Tenant ID	No	The tenant ID of the service principal
Azure Blob Storage Client ID	No	The client ID of the service principal. If multiple client IDs exist for the service principal, and Azure Blob Storage Client ID is left blank, the default client ID will be used.
Azure Blob Storage Client Secret	No	The client secret for the service principal
Azure Blob Storage Authority Host	No	The host of the Microsoft Entra authority for the storage account. If omitted, the host of the Azure Public Cloud authority (login.microsoftonline.com) is used. For a list of valid values, see Microsoft’s azure.identity.AzureAuthorityHosts class.

Settings for Managed Identity authentication

Name	Required?	Description
Azure Blob Storage Account URL	Yes	The URL of the storage account (e.g., https://.blob.core.windows.net)
Azure Blob Storage Client ID	No	The client ID of the managed identity. If multiple client IDs exist for the managed identity, and Azure Blob Storage Client ID is left blank, the default client ID will be used.

Settings for Account Key authentication

Name	Required?	Description
Azure Blob Storage Account URL	Yes	The URL of the storage account (e.g., https://.blob.core.windows.net)
Azure Blob Storage Account Key	No	The access key for the storage account
Azure Blob Storage Account Name	No	The name of the storage account

If incorrect authentication information is entered, the flow runs for the attempted file ingestions will fail. The flow runs’ output will contain the error messages that Azure returned to the system.

For more information about troubleshooting flow runs, see Testing and Debugging Flows.
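
The relationship between authentication types and their settings can be summarized as a lookup table. The sketch below is a hypothetical local check, not part of Hyperscience. It encodes the settings tables above and reports required settings that are missing, plus settings that do not belong to the selected authentication type. Client ID for Managed Identity is treated as optional here, which is an assumption based on its description.

```python
# Setting name -> whether it is required, per authentication type (from the tables above).
SETTINGS_BY_AUTH_TYPE = {
    "SAS Token Only": {
        "Azure Blob Storage Account URL": True,
    },
    "Service Principal": {
        "Azure Blob Storage Account URL": True,
        "Azure Blob Storage Tenant ID": False,
        "Azure Blob Storage Client ID": False,
        "Azure Blob Storage Client Secret": False,
        "Azure Blob Storage Authority Host": False,
    },
    "Managed Identity": {
        "Azure Blob Storage Account URL": True,
        "Azure Blob Storage Client ID": False,  # assumed optional
    },
    "Account Key": {
        "Azure Blob Storage Account URL": True,
        "Azure Blob Storage Account Key": False,
        "Azure Blob Storage Account Name": False,
    },
}


def check_azure_settings(auth_type, values):
    """Return a list of problems for the given authentication type and settings."""
    if auth_type not in SETTINGS_BY_AUTH_TYPE:
        return [f"unknown authentication type: {auth_type}"]
    allowed = SETTINGS_BY_AUTH_TYPE[auth_type]
    problems = []
    for name, required in allowed.items():
        if required and not values.get(name):
            problems.append(f"missing required setting: {name}")
    for name in values:
        if name not in allowed:
            problems.append(f"not used by {auth_type}: {name}")
    return problems
```

A check like this only catches structural mistakes. Wrong credentials still surface as Azure error messages in the flow run output.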

Classification Blocks

We’ve divided our Classification function into two blocks: one for Machine Classification and another for Manual Classification.

Machine Classification

With machine classification, Hyperscience can automatically match your submissions to Structured, Semi-structured, or Additional layouts. Machine classification requires training to recognize the kinds of submissions you process through Hyperscience.

Settings

Name	Required?	Description
Image Correction	No	Identifies and corrects the orientation of Semi-structured images by rotating them. Cannot be enabled if Faster PDF Transcription is enabled.
Faster PDF Transcription	No	If enabled, the system processes pages in PDF files in their native format, allowing for faster transcription. If disabled, the system processes PDF pages by creating images of them and extracting data from those images. To ensure that this feature works as intended, only enable Faster PDF Transcription when submitting PDFs whose pages are correctly oriented and do not require rotation before processing. Cannot be enabled if Image Correction is enabled. If you are processing PDFs and other file types in your flow, consider creating a custom flow that routes PDFs to a Machine Classification Block that has Faster PDF Transcription enabled.
Captured Image Enhancement	No	Improves machine readability of Semi-structured documents captured by mobile devices. To rotate and properly process Semi-structured documents captured by mobile devices, we recommend enabling both Captured Image Enhancement and Image Correction. Before enabling Captured Image Enhancement, make sure that the majority of the pages you will be processing are captured by mobile devices. Contact your Hyperscience representative for more information.
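
The constraints between these three settings can be expressed as a small check. The sketch below is illustrative only: the class and field names are hypothetical and do not correspond to a Hyperscience API. It encodes two statements from the table: Image Correction and Faster PDF Transcription are mutually exclusive, and Captured Image Enhancement is recommended together with Image Correction.

```python
from dataclasses import dataclass


@dataclass
class MachineClassificationSettings:
    # Hypothetical field names mirroring the three settings in the table.
    image_correction: bool
    faster_pdf_transcription: bool
    captured_image_enhancement: bool


def review_settings(settings):
    """Return (errors, recommendations) for a combination of settings."""
    errors, recommendations = [], []
    if settings.image_correction and settings.faster_pdf_transcription:
        errors.append(
            "Image Correction and Faster PDF Transcription cannot both be enabled."
        )
    if settings.captured_image_enhancement and not settings.image_correction:
        recommendations.append(
            "Enable Image Correction together with Captured Image Enhancement."
        )
    return errors, recommendations
```

For example, review_settings(MachineClassificationSettings(True, True, False)) reports the conflict, which is why a flow that handles both PDFs and other file types may need two Machine Classification Blocks with different settings.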

Manual Classification

Manual Classification, or Classification Supervision, allows your keyers to manually match submissions to their layouts. Depending on your flow, keyers may perform Classification Supervision if the system cannot match a submission to a layout with high confidence.

Settings

You can customize the functionality of your Manual Classification Block by adding Task Restrictions.

Name	Required?	Description
Default task restrictions	No	Select the task restrictions that should be applied to tasks created by this block. See Task Restrictions Overview for more information.

Identification Blocks

We’ve created two Identification blocks to cover both our Machine Identification and Identification Supervision capabilities.

Machine Identification

With Machine Identification, you can automate the identification of fields and tables in your submissions.

Settings

Other than the settings under “Block Details,” Machine Identification Blocks have no block-specific settings.

Manual Identification

Manual Identification allows your keyers to complete Field ID Supervision or Table ID Supervision tasks, where they draw bounding boxes around the contents of certain fields, table columns, or table rows. This identification process ensures that the system transcribes the correct content in the Transcription steps of the data-extraction process.

Settings

You can customize the functionality of your Manual Identification Block by adding Task Restrictions.

Name	Required?	Description
Default task restrictions	No	Select the task restrictions that should be applied to tasks created by this block. See Task Restrictions Overview for more information.

Transcription Blocks

Just as we did with Classification and Identification, we’ve divided our Transcription capabilities into Machine Transcription and Manual Transcription Blocks.

Machine Transcription

In the Machine Transcription Block of your flow, Hyperscience automatically transcribes the content of your submissions, whether the text is handwritten or machine-printed.

Settings

Other than the settings under “Block Details,” Machine Transcription Blocks have no block-specific settings.

Manual Transcription

Manual Transcription, or Transcription Supervision, lets your keyers manually enter the text found in fields or tables. Depending on your settings, your keyers may manually transcribe certain pre-selected fields or fields that the machine could not transcribe with high confidence.

Settings

You can customize the functionality of your block by editing the settings described below.

Task Restrictions

Name	Required?	Description
Default task restrictions	No	Select the task restrictions that should be applied to tasks created by this block. See Task Restrictions Overview for more information.

Supervision

Name	Required?	Description
Supervision Transcription masking	No	Prevents users from inputting invalid characters during Supervision Transcription tasks.
Table output manual review	No	Generates a Table Transcription task if any table cells are identified, during which the keyer will complete a full manual review of both the transcribed data and its bounding boxes. If disabled, a Table Transcription task will only be generated if one or more cells have transcribed values below the accuracy thresholds defined in Application > Settings.
Create Manual Transcription Task for Tables with Blank Cells	No	Always sends blank cells to Manual Transcription, regardless of machine confidence. Enabled by default. If disabled, a Table Transcription task will only be generated if one or more cells in the table have transcribed values below the accuracy thresholds defined in Application > Settings.
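
One way to read the two table-related settings is as a decision about whether a Table Transcription task is generated. The sketch below is our interpretation of the descriptions above, not the platform's actual logic. The Cell type, the function name, and the use of a single numeric threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    text: str          # transcribed value; "" stands for a blank cell
    confidence: float  # hypothetical machine confidence between 0 and 1


def needs_table_transcription_task(cells, threshold, manual_review,
                                   blank_cells_to_manual=True):
    """Decide whether a table would be sent to a keyer (illustrative reading)."""
    if not cells:
        return False  # no table cells were identified
    if manual_review:
        return True   # "Table output manual review": always review identified cells
    if blank_cells_to_manual and any(cell.text == "" for cell in cells):
        return True   # blank cells go to a keyer regardless of confidence
    # Otherwise, only cells below the accuracy threshold trigger a task.
    return any(cell.confidence < threshold for cell in cells)
```

Under this reading, disabling both settings means that a table is reviewed only when at least one cell falls below the threshold.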

Flexible Extraction Block

Depending on your flow’s configuration, Flexible Extraction tasks allow your keyers to:

validate transcriptions, or
add transcriptions to manually categorized Structured pages, which did not go through regular Transcription Supervision.

To use Flexible Extraction as a data-validation method, you need a Custom Code Block. The rules in that block determine when a document should be sent to Flexible Extraction, as well as whether the entire document or particular fields should be validated. For information on setting up Custom Code Blocks and Flexible Extraction Blocks in this way, contact your Hyperscience representative.

Settings

You can customize the functionality of your block by editing the settings described below.

Task Restrictions

Name	Required?	Description
Default task restrictions	No	Select the task restrictions that should be applied to tasks created by this block. See Task Restrictions Overview for more information.

Supervision

Name	Required?	Description
Flexible Extraction Transcription masking	No	Prevents users from inputting invalid characters during Flexible Extraction tasks.

Collation Block

We’ve created a Collation Block to allow the grouping of files, documents, and pages into cases. To learn more about Case Collation, see Case Collation.

Settings

You can customize the functionality of your block by editing the settings described below.

Name	Required?	Description
Replace case data from duplicate file names	No	Replaces case data from repeated file names within the same case. For example, you can enable this setting if you want to resubmit a file containing new data and remove the old, duplicated data from the case. Note that this setting does not delete the old data; it just removes it from the case.

Custom Supervision Block

To enable the tailoring of a Supervision task’s interface to a specific business process, we’ve created a Custom Supervision Block.

To use Custom Supervision, you need a custom flow with a Custom Code Block:

A custom flow is required because the Custom Supervision Block is not included in the default Document Processing flow.
A Custom Code Block is required to define and format the data input that the Custom Supervision Block needs to show a task.
A Routing Block is not required, but it controls whether a submission is sent to Custom Supervision or not. Without a Routing Block, every submission to your custom flow will go to Custom Supervision.

Settings

Name	Required?	Description
Task purpose	Yes	The custom task name given to the Custom Supervision task in the Task Queue. To learn how to make this custom task name visible in the Task Queue, see Navigating the Task Queue.
Default task restrictions	No	Select the task restrictions that should be applied to tasks created by this block. See Task Restrictions Overview for more information.
Custom Supervision transcription masking	No	Prevents users from inputting invalid characters during Custom Supervision Transcription tasks.

Database Blocks

Database Blocks let you query databases from Hyperscience, which can speed up your flows by reducing the need for manual transcription.

To learn more about Database Blocks, see Database Blocks.

Custom Code Blocks

Custom Code Blocks enable you to transform and validate extracted submission data before Hyperscience sends it to your downstream systems. The table below lists the kinds of post-processing rules you can implement with Custom Code Blocks.

Rule Type	Description	Examples
Field normalization / Data transformation	Change the formatting of data for compatibility with downstream systems	Make sure names have the correct capitalization and correct it, if needed; Convert driver's license information to a specific format; In addresses, change "Street" to "ST," "Drive" to "DR," and "Avenue" to "Ave"; Remove "LLC" from the end of Company Name entries
Data validation	Perform an external data lookup or check data within the submission to make sure the data is valid, and flag the submission as NIGO if it is not	Verify that a transcribed ZIP code is valid; Check that a customer's name and membership number match the customer's record; Ensure that only option A or option B is checked, and if both are checked, flag submission as NIGO
Data augmentation	Add data to a submission's JSON to prevent processing issues or to route data to specific downstream systems	Determine a customer's country of residence based on their customer ID, and add the country code to the submission's output; Add a "total price" value based on the quantity and cost of items ordered

You cannot add Custom Code Blocks to your flow through the platform. To learn more about Custom Code Blocks, or to add them to your flows, contact your Hyperscience representative.
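
To give a sense of what rules like those in the table involve, the sketch below implements a few of the examples as plain Python functions. It does not use the Custom Code Block interface, which this article does not describe. The function names, the order structure, and the ZIP code check (a US format check only, not a lookup) are assumptions made for illustration.

```python
import re

STREET_ABBREVIATIONS = {"Street": "ST", "Drive": "DR", "Avenue": "Ave"}
_STREET_PATTERN = re.compile(r"\b(" + "|".join(STREET_ABBREVIATIONS) + r")\b")


def normalize_address(address):
    """Field normalization: abbreviate street types as in the table's example."""
    return _STREET_PATTERN.sub(lambda m: STREET_ABBREVIATIONS[m.group(1)], address)


def strip_llc(company_name):
    """Field normalization: remove a trailing "LLC" from a company name."""
    return re.sub(r",?\s*\bLLC\.?\s*$", "", company_name)


def zip_format_is_valid(zip_code):
    """Data validation: check the 5-digit or ZIP+4 format (format only)."""
    return re.fullmatch(r"\d{5}(-\d{4})?", zip_code) is not None


def add_total_price(order):
    """Data augmentation: add a total based on quantity and unit cost."""
    total = sum(item["quantity"] * item["unit_cost"] for item in order["items"])
    return {**order, "total_price": total}
```

In a real flow, a failed validation would flag the submission as NIGO rather than return False; that step depends on the block's interface and is left out here.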

Settings

If your business needs change and you need to modify a Custom Code Block’s code (e.g., add a keyword to a keyword search), you can do so with guidance from Hyperscience. To learn more, see Modifying Custom Code Blocks.

Named Entity Recognition Block

The Named Entity Recognition Block allows you to:

detect key PII entities such as:
- Names
- Addresses
- Locations
- Organizations
- Companies
enhance full-page transcription output with information about detected entities.

You need to use Named Entity Recognition Blocks in conjunction with Full Page Transcription Blocks. For example, you can build a redaction flow that processes documents through full-page transcription, then detects all personal names, and at the end uses a Custom Code Block to put black boxes over the detected names.

Custom Entity Detection Block (Beta)

The Custom Entity Detection Block allows you to locate and identify:

single words, and
word patterns that can be described with a combination of regular expressions and keywords

You need to use Custom Entity Detection Blocks in conjunction with a block, such as the Full Page Transcription Block, that returns a collection of text segments. For example, you can build a redaction flow that processes documents through full-page transcription, then detects all phone numbers, addresses, and names, and at the end uses a Custom Code Block to place black boxes over the detected text segments.
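
The idea of combining regular expressions with keywords can be sketched in plain Python. The example below is not the block's configuration format, which this article does not describe. The pattern, the keyword list, and the function name are hypothetical. It reports a phone-number match only when the same text segment also contains one of the keywords.

```python
import re

# Hypothetical pattern for numbers such as 555-123-4567 or (555) 123-4567.
PHONE_PATTERN = re.compile(r"\(?\b\d{3}\)?[-. ]\d{3}[-. ]\d{4}\b")
KEYWORDS = ("phone", "tel", "call")


def detect_phone_numbers(segments):
    """Return matches found in text segments that also mention a keyword."""
    hits = []
    for index, text in enumerate(segments):
        lowered = text.lower()
        if not any(keyword in lowered for keyword in KEYWORDS):
            continue  # the keyword condition is not met for this segment
        for match in PHONE_PATTERN.finditer(text):
            hits.append({
                "segment": index,
                "start": match.start(),
                "end": match.end(),
                "text": match.group(),
            })
    return hits
```

The start and end offsets are the kind of information a later step, such as a redaction rule in a Custom Code Block, would need in order to locate the text to cover. Keyword matching here is a simple substring test, so "tel" also matches "hotel"; a stricter rule would match whole words.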

The Custom Entity Detection Block is a beta feature that is not yet part of our Flows SDK. For information on setting up Custom Entity Detection Blocks, contact your Hyperscience representative.

Routing Blocks

Routing Blocks let you send submission data to different destinations based on the criteria you specify. In this way, Routing Blocks create branches in your flow.

You cannot add or configure Routing Blocks in the application. For assistance, contact your Hyperscience representative.

Settings

Other than the settings under “Block Details,” Routing Blocks have no block-specific settings.

API Blocks

API Blocks allow you to connect to other data sources in your organization in order to augment or verify extracted data. You can work with your Hyperscience representative to configure these blocks and place them anywhere in your flow after data extraction. API Blocks do not contain business logic; that logic lives in subsequent flow blocks.

We offer two types of API Blocks: HTTP REST and SOAP.

For more information about API Blocks, see API Blocks.

Complete Blocks

Every flow needs a Complete Block. This block initiates Quality Assurance tasks and changes the submission’s status to “Complete.”

Settings

Other than the settings under “Block Details,” Complete Blocks have no block-specific settings.

Output Blocks

Output Blocks were called “Output Connectors” in previous versions of Hyperscience. With Output Blocks, you can send data extracted by Hyperscience to other systems for downstream processing. If you want your flow to send notifications for submission statuses other than “Complete,” you will need to work with your Hyperscience representative to set up a separate Notification flow.

A full list of the Output Blocks we currently support, along with more details about each, can be found in Output Blocks.

You can control which Output Blocks are enabled in your flow at any time by selecting or deselecting the Enabled option in each Output Block.

Settings shared by all block types

All blocks have the following settings under “Block Details”:

Name	Required?	Description
Display Name	Yes	The block's name in Flow Studio. You can change the name for each of your flows.
Description	No	The block's description in Flow Studio. You can change the description for each of your flows.