Flow Blocks

With the introduction of flows, we have created the following types of blocks to help you customize your flows to meet your teams’ needs. To learn more about any of these blocks, or for assistance in adding them to your flows, contact your Hyperscience representative.

Block types and settings

Application-level and flow-level settings

While some blocks have settings that allow you to customize how they work, many blocks are affected by application-level or flow-level settings. To learn more about these settings, see Application Settings Overview and Flow Settings.

Input Blocks

Formerly known as “Input Connectors,” Input Blocks allow you to integrate your organization’s data sources into our system. Through these blocks, you can process documents from a variety of sources, such as inboxes, message queues, or folders on the network.

A full list of the Input Blocks we currently support, along with more details about each, can be found in Input Blocks.

Settings

No matter how many Input blocks you choose to enable, information about those blocks is contained in a main Input block for your flow. This block has the settings described below.

Name | Required? | Description
Allow API submissions | No | Indicates whether the flow accepts submissions sent via the API.
Allow manual submissions | No | Indicates whether the flow accepts manually uploaded submissions.

Submission Initialization

The Submission Initialization Block contains settings that connect your flow to your:

  • AWS S3 submission retrieval store,

  • OCS submission retrieval store,

  • Generic web storage (HTTP/HTTPS) submission retrieval store, or

  • Azure Blob Storage retrieval store.

Settings

You can customize the functionality of your block by editing the settings described below.

AWS S3

S3 Submission Retrieval Store

If you are using an S3 bucket as your submission retrieval store and you are not authenticating through IAM roles, provide your AWS access key ID and secret access key in the S3 Submission Retrieval Store field.

To enter your credentials:

  1. Click Edit value.

  2. Enter your credentials in JSON format:

    {
    "aws_access_key_id": "<your_access_key_id>",
    "aws_secret_access_key": "<your_secret_key>"
    }

    To authenticate requests with AWS Signature Version 2 (SigV2), add the following key-value pair to the JSON object in the S3 Submission Retrieval Store field:

    "s3_signature_version": "s3"

  3. Click Done.

  4. Click Save in the upper-right corner of the page.

  5. In the dialog box that appears, click Save & Deploy.

For more information about AWS access key IDs and secret access keys, see Amazon's Understanding and getting your AWS credentials.
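As an illustration of how the optional SigV2 entry fits into the same JSON object as the credentials, here is a minimal Python sketch that assembles the field value. The function name build_s3_store_value and its parameters are hypothetical; only the three JSON keys come from the steps above.

```python
import json


def build_s3_store_value(access_key_id, secret_access_key, use_sigv2=False):
    """Return JSON text for the S3 Submission Retrieval Store field."""
    value = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }
    if use_sigv2:
        # Opt in to AWS Signature Version 2 with the key described above.
        value["s3_signature_version"] = "s3"
    return json.dumps(value, indent=4)


if __name__ == "__main__":
    # Placeholders only; paste real values into the field, not into scripts.
    print(build_s3_store_value("<your_access_key_id>", "<your_secret_key>", use_sigv2=True))
```

Generating the value this way mainly guards against JSON syntax slips, such as a missing comma before the added key.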

S3 Submission Retrieval Endpoint URL

If your submission retrieval store is not in the public cloud (i.e., its URL does not point to s3.amazonaws.com — for example, a government cloud or an S3-compatible internal setup), enter its URL in S3 Submission Retrieval Endpoint URL. You do not need to edit your “.env” file to update this URL.

To edit the endpoint URL for your S3 submission retrieval store:

  1. Enter the URL in the S3 Submission Retrieval Endpoint URL field or edit its contents.

  2. Click Save in the upper-right corner of the page.

  3. In the dialog box that appears, click Save & Deploy.

If the bucket you’re using as your submission retrieval store is in a public cloud (as opposed to a government cloud or an S3-compatible internal setup), leave this field blank.  

OCS

OCS Configuration

If you are using an OCS submission file store, enter the configuration details for your file store in this field.

To enter your configuration details:

  1. Click Edit value.

  2. Enter the configuration details in JSON format: 

    {
    "host_url": "<your_host_url>", 
    "username": "<your_username>", 
    "password": "<your_password>", 
    "ssl_cert": "<CA_bundle_filename_OR_SKIP>"
    }

    The value of ssl_cert should match the CA bundle filename inside the $HS_PATH/certs directory. To disable certificate validation, set this value to SKIP.

  3. Click Done.

  4. Click Save in the upper-right corner of the page.

  5. In the dialog box that appears, click Save & Deploy.
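Before pasting an OCS configuration into the field, it can help to check that the JSON parses and that ssl_cert is either SKIP or the name of a file in the certs directory. The sketch below is one way to do that locally. The function check_ocs_config is hypothetical, and treating all four keys as expected is an assumption based on the example above.

```python
import json
from pathlib import Path

# Keys shown in the OCS example above; assumed here to all be expected.
EXPECTED_KEYS = ("host_url", "username", "password", "ssl_cert")


def check_ocs_config(raw_json, certs_dir):
    """Return a list of problems found in an OCS Configuration value."""
    config = json.loads(raw_json)  # raises ValueError on malformed JSON
    problems = [f"missing key: {key}" for key in EXPECTED_KEYS if key not in config]
    cert = config.get("ssl_cert")
    # SKIP disables certificate validation; anything else must be a file name.
    if cert is not None and cert != "SKIP":
        if not (Path(certs_dir) / cert).is_file():
            problems.append(f"CA bundle not found in {certs_dir}: {cert}")
    return problems
```

On a machine where HS_PATH is set, certs_dir would be the certs directory under that path.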

Generic Web Storage (HTTP/HTTPS)

Generic Web Storage (HTTP/HTTPS) Configuration

If you are using a generic web storage submission file store, enter the configuration details for your file store in this field.

We use Basic Authentication for Generic Web Storage Configuration.

To enter your configuration details:

  1. Click Edit value.

  2. Enter the configuration details in JSON format: 

    { 
    "username": "<your_username>", 
    "password": "<your_password>", 
    "ssl_cert": "<CA_bundle_filename_OR_SKIP>"
    }

    The value of ssl_cert should match the CA bundle filename inside the $HS_PATH/certs directory. To disable certificate validation, set this value to SKIP.

  3. Click Done.

  4. Click Save in the upper-right corner of the page.

  5. In the dialog box that appears, click Save & Deploy.
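For readers unfamiliar with Basic Authentication, the sketch below shows what the scheme does in general (per RFC 7617): the username and password are joined with a colon, base64-encoded, and sent in the Authorization header. It illustrates the scheme only and is not a description of how Hyperscience implements the request internally. The function name is hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header value used by HTTP Basic Authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"
```

Because base64 is an encoding rather than encryption, Basic Authentication is normally paired with HTTPS, which is where the ssl_cert setting comes in.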

Azure Blob Storage

The Azure Blob Storage option for submission retrieval storage is available in v39.2 and later.

If you are using Azure Blob Storage as your submission retrieval store, you can use the fields described below to configure the system’s connection to the blob.

Azure Blob Storage Authentication Type

From the Azure Blob Storage Authentication Type drop-down list, select the authentication type the system should use to access the blob:

  • SAS Token Only

  • Service Principal

  • Managed Identity

  • Account Key

When you select an authentication type, additional settings appear.

Settings for SAS Token Only authentication

Name | Required? | Description
Azure Blob Storage Account URL | Yes | The URL of the storage account (e.g., https://<account_name>.blob.core.windows.net)

Settings for Service Principal authentication

Name | Required? | Description
Azure Blob Storage Account URL | Yes | The URL of the storage account (e.g., https://<account_name>.blob.core.windows.net)
Azure Blob Storage Tenant ID | No | The tenant ID of the service principal
Azure Blob Storage Client ID | No | The client ID of the service principal. If multiple client IDs exist for the service principal, and Azure Blob Storage Client ID is left blank, the default client ID will be used.
Azure Blob Storage Client Secret | No | The client secret for the service principal
Azure Blob Storage Authority Host | No | The host of the Microsoft Entra authority for the storage account. If omitted, the host of the Azure Public Cloud authority (login.microsoftonline.com) is used. For a list of valid values, see Microsoft’s azure.identity.AzureAuthorityHosts class.

Settings for Managed Identity authentication

Name | Required? | Description
Azure Blob Storage Account URL | Yes | The URL of the storage account (e.g., https://<account_name>.blob.core.windows.net)
Azure Blob Storage Client ID | No | The client ID of the managed identity. If multiple client IDs exist for the managed identity, and Azure Blob Storage Client ID is left blank, the default client ID will be used.

Settings for Account Key authentication

Name | Required? | Description
Azure Blob Storage Account URL | Yes | The URL of the storage account (e.g., https://<account_name>.blob.core.windows.net)
Azure Blob Storage Account Key | No | The access key for the storage account
Azure Blob Storage Account Name | No | The name of the storage account

If incorrect authentication information is entered, the flow runs for the affected file-ingestion attempts will fail. The flow runs’ output will contain the error messages that Azure passes to the system.

For more information about troubleshooting flow runs, see Testing and Debugging Flows.
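The four authentication types and their settings can be summarized as a small lookup table. The Python sketch below encodes the tables above and flags settings that are missing or that do not belong to the selected type. The short setting names (account_url, tenant_id, and so on) and the function check_azure_settings are hypothetical shorthand, not identifiers used by the application.

```python
# Settings per authentication type, as listed in the tables above.
# True marks a setting the tables show as required.
AZURE_AUTH_SETTINGS = {
    "SAS Token Only": {"account_url": True},
    "Service Principal": {
        "account_url": True,
        "tenant_id": False,
        "client_id": False,
        "client_secret": False,
        "authority_host": False,
    },
    "Managed Identity": {"account_url": True, "client_id": False},
    "Account Key": {"account_url": True, "account_key": False, "account_name": False},
}


def check_azure_settings(auth_type, settings):
    """Return a list of problems for the given authentication type."""
    spec = AZURE_AUTH_SETTINGS[auth_type]  # KeyError for an unknown type
    problems = []
    for name, required in spec.items():
        if required and not settings.get(name):
            problems.append(f"missing required setting: {name}")
    for name in settings:
        if name not in spec:
            problems.append(f"setting not used by {auth_type}: {name}")
    return problems
```

A check like this only catches structural mistakes. Wrong credentials still surface as Azure error messages in the flow run output, as described above.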

Classification Blocks

We’ve divided our Classification function into two blocks: one for Machine Classification and another for Manual Classification.

Machine Classification

With machine classification, Hyperscience can automatically match your submissions to Structured, Semi-structured, or Additional layouts. Machine classification requires training to recognize the kinds of submissions you process through Hyperscience.

Settings

Name | Required? | Description
Image Correction | No | Identifies and corrects the orientation of Semi-structured images by rotating them. Cannot be enabled if Faster PDF Transcription is enabled.
Faster PDF Transcription | No | If enabled, the system processes pages in PDF files in their native format, allowing for faster transcription. If disabled, the system processes PDF pages by creating images of them and extracting data from those images. To ensure that this feature works as intended, only enable Faster PDF Transcription when submitting PDFs whose pages are correctly oriented and do not require rotation before processing. Cannot be enabled if Image Correction is enabled. If you are processing PDFs and other file types in your flow, consider creating a custom flow that routes PDFs to a Machine Classification Block that has Faster PDF Transcription enabled.
Captured Image Enhancement | No | Improves machine readability of Semi-structured documents captured by mobile devices. To rotate and properly process Semi-structured documents captured by mobile devices, we recommend enabling both Captured Image Enhancement and Image Correction. Before enabling Captured Image Enhancement, make sure that the majority of the pages you will be processing are captured by mobile devices. Contact your Hyperscience representative for more information.
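The constraints among these three settings can be expressed as a short check. The following Python sketch is a reading of the descriptions above, not application code. The function name and the wording of the messages are hypothetical.

```python
def check_machine_classification(image_correction, faster_pdf_transcription,
                                 captured_image_enhancement):
    """Return notes about conflicting or incomplete setting combinations."""
    notes = []
    if image_correction and faster_pdf_transcription:
        # The two settings are mutually exclusive.
        notes.append("error: Image Correction and Faster PDF Transcription "
                     "cannot both be enabled")
    if captured_image_enhancement and not image_correction:
        # Enabling both is a recommendation, not a requirement.
        notes.append("hint: Image Correction is recommended together with "
                     "Captured Image Enhancement")
    return notes
```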

Manual Classification

Manual Classification, or Classification Supervision, allows your keyers to manually match submissions to their layouts. Depending on your flow, keyers may perform Classification Supervision if the system cannot match a submission to a layout with high confidence.

Settings

You can customize the functionality of your Manual Classification Block by adding Task Restrictions.

Name | Required? | Description
Default task restrictions | No | Select the task restrictions that should be applied to tasks created by this block. See Task Restrictions Overview for more information.

Identification Blocks

We’ve created two Identification blocks to cover both our Machine Identification and Identification Supervision capabilities.

Machine Identification

With Machine Identification, you can automate the identification of fields and tables in your submissions.

Settings

Other than the settings under “Block Details,” Machine Identification Blocks have no block-specific settings. 

Manual Identification

Manual Identification allows your keyers to complete Field ID Supervision or Table ID Supervision tasks, where they draw bounding boxes around the contents of certain fields, table columns, or table rows. This identification process ensures that the system transcribes the correct content in the Transcription steps of the data-extraction process.

Settings

You can customize the functionality of your Manual Identification Block by adding Task Restrictions.

Name | Required? | Description
Default task restrictions | No | Select the task restrictions that should be applied to tasks created by this block. See Task Restrictions Overview for more information.

Transcription Blocks

Just as we did with Classification and Identification, we’ve divided our Transcription capabilities into Machine Transcription and Manual Transcription Blocks.

Machine Transcription

In the Machine Transcription Block of your flow, Hyperscience automatically transcribes the content of your submissions, whether the text was handwritten or machine-printed.

Settings

Other than the settings under “Block Details,” Machine Transcription Blocks have no block-specific settings.

Manual Transcription

Manual Transcription, or Transcription Supervision, lets your keyers manually enter the text found in fields or tables. Depending on your settings, your keyers may manually transcribe certain pre-selected fields or fields that the machine could not transcribe with high confidence.

Settings

You can customize the functionality of your block by editing the settings described below.

Task Restrictions

Name | Required? | Description
Default task restrictions | No | Select the task restrictions that should be applied to tasks created by this block. See Task Restrictions Overview for more information.

Supervision

Name | Required? | Description
Supervision Transcription masking | No | Prevents users from inputting invalid characters during Supervision Transcription tasks.
Table output manual review | No | Generates a Table Transcription task if any table cells are identified, during which the keyer will complete a full manual review of both the transcribed data and its bounding boxes. If disabled, a Table Transcription task will only be generated if one or more cells have transcribed values below the accuracy thresholds defined in Application > Settings.
Create Manual Transcription Task for Tables with Blank Cells | No | Always sends blank cells to Manual Transcription, regardless of machine confidence. Enabled by default. If disabled, a Table Transcription task will only be generated if one or more cells in the table have transcribed values below the accuracy thresholds defined in Application > Settings.
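One way to read how the two table-related settings interact is sketched below in Python. This is an interpretation of the descriptions above under simplifying assumptions: a single threshold stands in for the accuracy thresholds defined in Application > Settings, and the Cell class, its fields, and the function name are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    text: Optional[str]   # None or "" means the cell is blank
    confidence: float     # hypothetical machine confidence in [0, 1]


def needs_table_transcription_task(cells: List[Cell], threshold: float,
                                   manual_review: bool = False,
                                   blank_cells_to_manual: bool = True) -> bool:
    """Decide whether a Table Transcription task would be generated."""
    if not cells:
        return False  # no table cells were identified
    if manual_review:
        return True   # full manual review whenever cells are identified
    if blank_cells_to_manual and any(not cell.text for cell in cells):
        return True   # blank cells go to a keyer regardless of confidence
    # Otherwise, only cells below the threshold trigger a task.
    return any(cell.confidence < threshold for cell in cells)
```

With both settings disabled, only low-confidence cells trigger a task, which is the behavior both descriptions fall back to.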

Flexible Extraction Block

Depending on your flow’s configuration, Flexible Extraction tasks allow your keyers to:

  • validate transcriptions, or

  • add transcriptions to manually categorized Structured pages, which did not go through regular Transcription Supervision. 

To use Flexible Extraction as a data-validation method, you need a Custom Code Block. The rules in that block determine when a document should be sent to Flexible Extraction, as well as whether the entire document or particular fields should be validated. For information on setting up Custom Code Blocks and Flexible Extraction Blocks in this way, contact your Hyperscience representative.

Settings

You can customize the functionality of your block by editing the settings described below.

Task Restrictions

Name | Required? | Description
Default task restrictions | No | Select the task restrictions that should be applied to tasks created by this block. See Task Restrictions Overview for more information.

Supervision

Name | Required? | Description
Flexible Extraction Transcription masking | No | Prevents users from inputting invalid characters during Flexible Extraction tasks.

Collation Block

We’ve created a Collation Block to allow the grouping of files, documents, and pages into cases. To learn more about Case Collation, see Case Collation.

Settings

You can customize the functionality of your block by editing the settings described below.

Name | Required? | Description
Replace case data from duplicate file names | No | Replaces case data from repeated file names within the same case. For example, you can enable this setting if you want to resubmit a file containing new data and remove the old, duplicated data from the case. Note that this setting does not delete the old data; it just removes it from the case.

Custom Supervision Block

To enable the tailoring of a Supervision task’s interface to a specific business process, we’ve created a Custom Supervision Block.

To use Custom Supervision, you need a custom flow with a Custom Code Block:

  • A custom flow is required because the Custom Supervision Block is not included in the default Document Processing flow. 

  • A Custom Code Block is required to define and format the data input that the Custom Supervision Block needs to show a task.

  • A Routing Block is not required, but it controls whether a submission is sent to Custom Supervision or not. Without a Routing Block, every submission to your custom flow will go to Custom Supervision. 

Settings

Name | Required? | Description
Task purpose | Yes | The custom task name given to the Custom Supervision task in the Task Queue. To learn how to make this custom task name visible in the Task Queue, see Navigating the Task Queue.
Default task restrictions | No | Select the task restrictions that should be applied to tasks created by this block. See Task Restrictions Overview for more information.
Custom Supervision transcription masking | No | Prevents users from inputting invalid characters during Custom Supervision Transcription tasks.

Database Blocks

Database Blocks allow you to make queries from Hyperscience to databases, which increases the overall speed of your flows by minimizing the need for manual transcription.

To learn more about Database Blocks, see Database Blocks.

Custom Code Blocks

Custom Code Blocks enable you to transform and validate extracted submission data before Hyperscience sends it to your downstream systems. The table below lists the kinds of post-processing rules you can implement with Custom Code Blocks.

Rule type: Field normalization / Data transformation

Description: Change the formatting of data for compatibility with downstream systems

Examples:

  • Make sure names have the correct capitalization and correct it, if needed

  • Convert driver's license information to a specific format

  • In addresses, change "Street" to "ST," "Drive" to "DR," and "Avenue" to "Ave"

  • Remove "LLC" from the end of Company Name entries

Rule type: Data validation

Description: Perform an external data lookup or check data within the submission to make sure the data is valid, and flag the submission as NIGO if it is not

Examples:

  • Verify that a transcribed ZIP code is valid

  • Check that a customer's name and membership number match the customer's record

  • Ensure that only option A or option B is checked, and if both are checked, flag submission as NIGO

Rule type: Data augmentation

Description: Add data to a submission's JSON to prevent processing issues or to route data to specific downstream systems

Examples:

  • Determine a customer's country of residence based on their customer ID, and add the country code to the submission's output

  • Add a "total price" value based on the quantity and cost of items ordered

You cannot add Custom Code Blocks to your flow through the platform. To learn more about Custom Code Blocks, or to add them to your flows, contact your Hyperscience representative.
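To make the three rule types more concrete, here is a minimal Python sketch of a few of the example rules as plain functions. It does not use or represent the Custom Code Block interface, and every function name, dictionary key, and data shape in it is hypothetical. The ZIP check validates format only. Confirming that a ZIP code actually exists would need an external lookup.

```python
import re

# Abbreviations taken from the address example above.
STREET_ABBREVIATIONS = {"Street": "ST", "Drive": "DR", "Avenue": "Ave"}


def normalize_address(address):
    """Field normalization: abbreviate street types, matching whole words only."""
    for word, abbreviation in STREET_ABBREVIATIONS.items():
        address = re.sub(rf"\b{word}\b", abbreviation, address)
    return address


def strip_llc(company_name):
    """Field normalization: remove a trailing "LLC" and any comma before it."""
    return re.sub(r",?\s+LLC\.?\s*$", "", company_name)


def is_valid_zip_format(zip_code):
    """Data validation: accept US ZIP (12345) or ZIP+4 (12345-6789) formats."""
    return re.fullmatch(r"\d{5}(-\d{4})?", zip_code) is not None


def add_total_price(order):
    """Data augmentation: return a copy of the order with a total_price entry."""
    total = sum(item["quantity"] * item["unit_cost"] for item in order["items"])
    return {**order, "total_price": total}
```

Keeping each rule as a small pure function makes it easy to test against sample transcriptions before asking for it to be deployed in a flow.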

Settings

If your business needs change and you need to modify a Custom Code Block’s code (e.g., add a keyword to a keyword search), you can do so with guidance from Hyperscience. To learn more, see Modifying Custom Code Blocks.

Named Entity Recognition Block

The Named Entity Recognition Block allows you to:

  • detect key PII entities such as:

    • Names

    • Addresses

    • Locations

    • Organizations

    • Companies

  • enhance full-page transcription output with information about detected entities.

You need to use Named Entity Recognition Blocks in conjunction with Full Page Transcription Blocks. For example, you can build a redaction flow that processes documents through full-page transcription, then detects all personal names, and at the end uses a Custom Code Block to put black boxes over the detected names. 

Custom Entity Detection Block (Beta)

The Custom Entity Detection Block allows you to locate and identify:

  • single words, and

  • word patterns that can be described with a combination of regular expressions and keywords

You need to use Custom Entity Detection Blocks in conjunction with a block, such as the Full Page Transcription Block, that returns a collection of text segments. For example, you can build a redaction flow that processes documents through full-page transcription, then detects all phone numbers, addresses, and names, and at the end uses a Custom Code Block to place black boxes over the detected text segments. 

The Custom Entity Detection Block is a beta feature that is not yet part of our Flows SDK. For information on setting up Custom Entity Detection Blocks, contact your Hyperscience representative.
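The idea of combining regular expressions with keywords can be illustrated outside the product with a few lines of Python. In the sketch below, a phone-like pattern counts as a detection only when a keyword appears shortly before it in the same text segment. The pattern, the keyword list, the window size, and the function name are all assumptions made for illustration. They are not the block's configuration format.

```python
import re

# Matches forms such as (212) 555-0100, 212-555-0100, and 212.555.0100.
PHONE_PATTERN = re.compile(r"\(?\b\d{3}\)?[-.\s]\d{3}[-.\s]\d{4}\b")
KEYWORDS = ("phone", "tel", "call")


def detect_phone_numbers(segments, window=30):
    """Return (segment_index, matched_text) for phone-like text near a keyword."""
    hits = []
    for index, text in enumerate(segments):
        for match in PHONE_PATTERN.finditer(text):
            # Look for a keyword in the characters just before the match.
            start = max(0, match.start() - window)
            context = text[start:match.start()].lower()
            if any(keyword in context for keyword in KEYWORDS):
                hits.append((index, match.group()))
    return hits
```

Requiring a nearby keyword reduces false positives from other digit sequences, such as invoice or account numbers, that happen to share the same shape.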

Routing Blocks

Routing Blocks let you send submission data to different destinations based on the criteria you specify. In this way, Routing Blocks create branches in your flow. 

You cannot add or configure Routing Blocks in the application. For assistance, contact your Hyperscience representative.

Settings

Other than the settings under “Block Details,” Routing Blocks have no block-specific settings. 

API Blocks

API Blocks allow you to connect to other data sources in your organization in order to augment or verify extracted data. You can work with your Hyperscience representative to configure these blocks and place them anywhere in your flow after data extraction. API Blocks do not contain business logic; that logic lives in subsequent flow blocks.

We offer two types of API Blocks: HTTP REST and SOAP.

For more information about API Blocks, see API Blocks.

Document Renderer

Hyperscience converts files into images before processing begins. To convert this data into a format that’s easier to process downstream, we’ve introduced the Document Renderer Block. This block allows you to download a PDF file from submissions that have gone through Machine or Manual Classification.

The Document Renderer block is included in Document Processing Subflow V40.

To configure it:

  1. In the left-hand sidebar, click Flows, and click on the name of the flow that contains Document Processing Subflow V40 (e.g., Document Processing).  

  2. Click Edit Flows.

  3. In Flow Studio, click Start Document Processing Subflow.

  4. In the Settings Type drop-down list, click on Document Rendering.

  5. Select the Document Rendering Enabled setting.

  6. Enter your desired size and quality settings for rendered documents:

    • Adjust the page size (in inches or millimeters), width, and height.

    • Specify the quality of the images — By default, the quality is set to 50%, which balances image clarity and file size. We recommend using this default setting for best results. Lowering the quality reduces the file size but may make images less clear, while increasing the quality creates larger files with sharper images. For example, a document originally 1 MB in size can grow to 40 MB when rendered in high resolution.

  7. Click Save.

Download a document

After your submission is complete, a download URL is available in the submission’s JSON output. To download the documents:

  1. Go to the Submissions page.

  2. Open the submission whose documents you want to download.

  3. Click Actions, and then click View JSON Output.

  4. Use your browser’s search function to locate download_url.

  5. Copy the URL and append it to your environment’s URL (e.g., example.hyperscience.com/api/<URL>).

  6. Choose a folder on your local machine to save the file, and the download will begin.
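Steps 4 and 5 can also be scripted once the JSON output has been saved or retrieved. The Python sketch below searches a parsed JSON document for download_url keys at any depth and appends each value to the environment's URL. The layout of the JSON output is not described here, so the recursive search makes no assumption about where the key sits. The https scheme, the function names, and the sample values used to exercise the code are assumptions.

```python
def find_download_urls(node):
    """Yield every string stored under a "download_url" key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "download_url" and isinstance(value, str):
                yield value
            else:
                yield from find_download_urls(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_download_urls(item)


def full_download_url(environment_host, download_url):
    """Append the download URL to the environment's /api/ base, as in step 5."""
    return f"https://{environment_host}/api/{download_url.lstrip('/')}"
```

For example, passing the host example.hyperscience.com and a value found by find_download_urls produces an address of the form shown in step 5.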

Complete Blocks

Every flow needs a Complete Block. This block initiates Quality Assurance tasks and changes the submission’s status to “Complete.”

Settings

Other than the settings under “Block Details,” Complete Blocks have no block-specific settings. 

Output Blocks

Output Blocks were called “Output Connectors” in previous versions of Hyperscience. With Output Blocks, you can send data extracted by Hyperscience to other systems for downstream processing. If you want your flow to send notifications for submission statuses other than “Complete,” you will need to work with your Hyperscience representative to set up a separate Notification flow.

A full list of the Output Blocks we currently support, along with more details about each, can be found in Output Blocks.

You can control which Output Blocks are enabled in your flow at any time by selecting or deselecting the Enabled option in each Output Block.

Settings shared by all block types

All blocks have the following settings under “Block Details”:

Name | Required? | Description
Display Name | Yes | The block's name in Flow Studio. You can change the name for each of your flows.
Description | No | The block's description in Flow Studio. You can change the description for each of your flows.