Supported Languages

Hyperscience supports the automation of submissions written in the following languages:

  • Arabic

  • Bulgarian*

  • Chinese (Simplified)*

  • Czech*

  • Dutch

  • English

  • Estonian*

  • French

  • German

  • Hebrew*

  • Italian

  • Japanese*

  • Kazakh*

  • Korean

  • Korean and English

  • Latvian*

  • Lithuanian*

  • Polish*

  • Portuguese

  • Russian*

  • Slovak*

  • Spanish

  • Thai*

  • Turkish*

(* Support for these languages is currently limited. See Language-specific capabilities and features below for more details.)

Note that Hyperscience does not support the auto-detection of languages at the layout or field levels. Instead, the system determines which languages to use for automation from the layout variations that the submission's pages are matched to.

Language families

We group our supported languages into language families based on their character sets.

Our language families are outlined in the table below.

| Language family | Included languages |
|-----------------|--------------------|
| Arabic | Arabic |
| Baltic-Slavic | Latvian, Lithuanian |
| Chinese | Simplified Chinese |
| Japanese | Japanese |
| Korean | Korean, Korean and English |
| Kra-Dai | Thai |
| Latin | Dutch, English, French, German, Italian, Spanish, Portuguese |
| Slavic | Bulgarian, Czech, Polish, Russian, Slovak |
| Semitic | Hebrew |
| Uralic | Estonian |
| Turkic | Kazakh, Turkish |

Language-specific capabilities and features

 

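| Language | Structured | Semi-structured | Hand-written | Printed | Auto-thresholding | Transcription automation | Image correction |
|----------|------------|-----------------|--------------|---------|-------------------|--------------------------|------------------|
| Arabic | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Bulgarian | ✔ | ✔ | ✔** | ✔ |  |  |  |
| Simplified Chinese | ✔ |  |  | ✔ |  |  |  |
| Czech | ✔ | ✔ | ✔** | ✔ |  |  | ✔ |
| Dutch | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| English | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Estonian | ✔ | ✔ | ✔** | ✔ |  |  | ✔ |
| French | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| German | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Hebrew | ✔ | ✔ | ✔** | ✔ |  |  | ✔ |
| Italian | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Japanese | ✔ |  |  | ✔ |  |  |  |
| Kazakh | ✔ | ✔ | ✔** | ✔ |  |  |  |
| Korean | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Korean and English | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Latvian | ✔ | ✔ | ✔** | ✔ |  |  | ✔ |
| Lithuanian | ✔ | ✔ | ✔** | ✔ |  |  | ✔ |
| Polish | ✔ | ✔ | ✔** | ✔ |  |  | ✔ |
| Portuguese | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Russian | ✔ | ✔ | ✔** | ✔ |  |  |  |
| Slovak | ✔ | ✔ | ✔** | ✔ |  |  | ✔ |
| Spanish | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Thai | ✔ | ✔ | ✔** | ✔ |  |  | ✔ |
| Turkish | ✔ | ✔ | ✔** | ✔ |  |  | ✔ |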

**Performance may be reduced for handwritten documents.

Notes on specific languages

Arabic

  • Performance may be negatively affected if Arabic characters appear in the same form field as Latin letters and numbers.

Korean

  • The template dropout feature does not always work correctly, but it will be optimized in future versions of Hyperscience.

  • The following currency characters are not currently supported in Korean transcription and extraction: £, $, €, and ¥.

  • Fields that may contain Latin characters, such as weight (kg) or length (km), are not accurately transcribed. Email addresses are similarly affected, as they consist mostly of Latin characters.

Polish

  • The Alphanumeric data type is not fully compatible with Polish at this time.

Assigning languages to layouts

Each layout, regardless of its type, has one layout-level language, which you select during the layout-creation process. All of a layout’s variations have the same layout-level language; you cannot select a specific language for each variation. By default, the system applies this language to all fields in the layout’s variations.

Each Semi-structured or Structured layout may also have multiple field-level languages, which you can select in the Layout Editor for individual fields in a layout variation. When assigning field-level languages, you can select one language per field, and you can choose from any of our supported languages.
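
The precedence described above (a field-level language, where one is selected, takes the place of the layout-level language for that field) can be sketched in a few lines of Python. This is an illustrative model only, not Hyperscience code or its API: the `Layout` and `Field` classes, the `effective_language` function, and the sample form are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Field:
    name: str
    # Hypothetical convention: None means no field-level language was selected.
    language: Optional[str] = None


@dataclass
class Layout:
    name: str
    language: str  # the single layout-level language
    fields: list[Field] = field(default_factory=list)


def effective_language(layout: Layout, fld: Field) -> str:
    """Return the field-level language if one is set, else the layout-level one."""
    return fld.language if fld.language is not None else layout.language


# Hypothetical layout: one field inherits English, one overrides it with Spanish.
claim_form = Layout(
    name="Claim form",
    language="English",
    fields=[Field("policy_number"), Field("insured_name", language="Spanish")],
)
languages = {f.name: effective_language(claim_form, f) for f in claim_form.fields}
```

Here `policy_number` falls back to the layout-level language, while `insured_name` uses its own field-level setting.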

To learn more about creating layouts, see Creating Structured Layouts, Creating Semi-structured Layouts, and Creating Additional Layouts.

Multiple languages within a submission

When a submission is ingested into Hyperscience, the system matches its pages to one or more layout variations for automation. 

  • Each set of pages matched to a layout variation becomes a document, and each layout variation has a single layout-level language assigned to it. The system will use that language when processing the document, with the exception of any fields that have their own language settings.

  • Each submission can include documents with different layout-level languages, provided those languages belong to the same language family (e.g., two French documents and one English document). This holds regardless of any field-level languages the documents have.

Using a layout across languages

If you have a single layout that you want to use for automation in multiple layout-level languages, you need to create an instance of Hyperscience for each of those languages and add the layout to each instance. 

For example, if you have an insurance form that some customers complete in Spanish and other customers complete in English, you would create a "Spanish" instance with Spanish selected as the layout's language and an "English" instance with English selected as the layout's language.

Languages and releases

If you have layouts in multiple layout-level languages, each release you create can only contain layouts with layout-level languages from one language family.

For example, if you have layouts in Simplified Chinese, Japanese, and Korean, you need a release for your Simplified Chinese layouts, one for your Japanese layouts, and another for your Korean layouts. However, a single release can contain English, Italian, and Spanish layouts.
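
One way to plan releases under this rule is to group layouts by the language family of their layout-level language. The sketch below is illustrative only: the `LANGUAGE_FAMILY` mapping is transcribed from the language families table in this article, while the `plan_releases` function and the layout names are hypothetical and are not part of any Hyperscience API.

```python
from collections import defaultdict

# Language -> language family, transcribed from the "Language families" table.
LANGUAGE_FAMILY = {
    "Arabic": "Arabic",
    "Latvian": "Baltic-Slavic",
    "Lithuanian": "Baltic-Slavic",
    "Simplified Chinese": "Chinese",
    "Japanese": "Japanese",
    "Korean": "Korean",
    "Korean and English": "Korean",
    "Thai": "Kra-Dai",
    "Dutch": "Latin",
    "English": "Latin",
    "French": "Latin",
    "German": "Latin",
    "Italian": "Latin",
    "Spanish": "Latin",
    "Portuguese": "Latin",
    "Bulgarian": "Slavic",
    "Czech": "Slavic",
    "Polish": "Slavic",
    "Russian": "Slavic",
    "Slovak": "Slavic",
    "Hebrew": "Semitic",
    "Estonian": "Uralic",
    "Kazakh": "Turkic",
    "Turkish": "Turkic",
}


def plan_releases(layouts: dict[str, str]) -> dict[str, list[str]]:
    """Group layout names by the family of their layout-level language.

    `layouts` maps a layout name to its layout-level language. Each key of
    the result corresponds to one release that satisfies the one-family rule.
    """
    releases = defaultdict(list)
    for layout_name, language in layouts.items():
        releases[LANGUAGE_FAMILY[language]].append(layout_name)
    return dict(releases)


# Hypothetical layouts mirroring the example in the text above.
layouts = {
    "Invoice (ZH)": "Simplified Chinese",
    "Invoice (JA)": "Japanese",
    "Invoice (KO)": "Korean",
    "Claim (EN)": "English",
    "Claim (IT)": "Italian",
    "Claim (ES)": "Spanish",
}
releases = plan_releases(layouts)
```

With these inputs, the Simplified Chinese, Japanese, and Korean layouts each land in their own group, and the English, Italian, and Spanish layouts share the single Latin group.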