Hyperscience supports the automation of submissions written in the following languages:
Arabic
Bulgarian*
Chinese*
Czech*
Dutch
English
Estonian*
French
German
Hebrew*
Italian
Japanese*
Kazakh*
Korean
Korean and English
Latvian*
Lithuanian*
Polish*
Portuguese
Russian*
Slovak*
Spanish
Thai*
Turkish*
(* Support for these languages is currently limited. See Language-specific capabilities and features below for more details.)
Note that Hyperscience does not support the auto-detection of languages at the layout or field levels. Instead, the system determines which languages to use for automation from the layout variations that the submission's pages are matched to.
Language families
We group our supported languages into language families based on their character sets.
Our language families are outlined in the table below.
Language family | Included languages |
---|---|
Arabic |
|
Baltic-Slavic |
|
Chinese |
|
Japanese |
|
Korean |
|
Kra-Dai |
|
Latin |
|
Slavic |
|
Semitic |
|
Uralic |
|
Turkic |
|
Language-specific capabilities and features
Language | Structured | Semi-structured | Hand-written | Printed | Auto-thresholding | Transcript. Auto. | Image Correction |
---|---|---|---|---|---|---|---|
Arabic | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Bulgarian | ✔ | ✔ | ✔** | ✔ | | | |
Simplified Chinese | ✔ | ✔ | | ✔ | ✔ | ✔ | ✔ |
Czech | ✔ | ✔ | ✔** | ✔ | | | ✔ |
Dutch | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
English | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Estonian | ✔ | ✔ | ✔** | ✔ | | | ✔ |
French | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
German | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Hebrew | ✔ | ✔ | ✔** | ✔ | | | ✔ |
Italian | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Japanese | ✔ | | | ✔ | | | |
Kazakh | ✔ | ✔ | ✔** | ✔ | | | |
Korean | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Korean and English | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Latvian | ✔ | ✔ | ✔** | ✔ | | | ✔ |
Lithuanian | ✔ | ✔ | ✔** | ✔ | | | ✔ |
Polish | ✔ | ✔ | ✔** | ✔ | | | ✔ |
Portuguese | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Russian | ✔ | ✔ | ✔** | ✔ | | | |
Slovak | ✔ | ✔ | ✔** | ✔ | | | ✔ |
Spanish | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Thai | ✔ | ✔ | ✔** | ✔ | | | ✔ |
Turkish | ✔ | ✔ | ✔** | ✔ | | | ✔ |
**Performance may be reduced for handwritten documents.
Notes on specific languages
Arabic
Performance may be negatively affected if Arabic characters appear in the same form field as Latin letters and numbers.
Korean
The template dropout feature does not always work correctly. It will be optimized in future versions of Hyperscience.
The following currency characters are not currently supported in Korean transcription and extraction: £, $, €, and ¥.
Fields that may use Latin characters, such as weight (kg) or length (km), are not accurately transcribed. Email addresses are similarly affected, as they are written mostly in Latin characters.
Polish
The Alphanumeric data type is not fully compatible with Polish at this time.
Assigning languages to layouts
Each layout, regardless of its type, has one layout-level language, which you select during the layout-creation process. All of a layout’s variations have the same layout-level language; you cannot select a specific language for each variation. By default, the system applies this language to all fields in the layout’s variations.
Each Semi-structured or Structured layout may also have multiple field-level languages, which you can select in the Layout Editor for individual fields in a layout variation. When assigning field-level languages, you can select one language per field, and you can choose from any of our supported languages.
To learn more about creating layouts, see Creating Structured Layouts, Creating Semi-structured Layouts, and Creating Additional Layouts.
Multiple languages within a submission
When a submission is ingested into Hyperscience, the system matches its pages to one or more layout variations for automation.
Each set of pages matched to a layout variation becomes a document, and each layout variation has a single layout-level language assigned to it. The system will use that language when processing the document, with the exception of any fields that have their own language settings.
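The precedence described above (a field-level language, when set, overrides the layout-level language) can be illustrated with a minimal Python sketch. The `LayoutVariation` class, the `effective_language` function, and the field names are hypothetical and exist only for illustration; they are not part of any Hyperscience API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LayoutVariation:
    """Hypothetical model of a layout variation (not a Hyperscience type)."""
    layout_language: str
    # Optional field-level languages, keyed by an illustrative field name.
    field_languages: Dict[str, str] = field(default_factory=dict)


def effective_language(variation: LayoutVariation, field_name: str) -> str:
    """Return the field-level language if one is set, else the layout-level one."""
    return variation.field_languages.get(field_name, variation.layout_language)


# Example: a Spanish layout with one field explicitly set to English.
variation = LayoutVariation("Spanish", {"policy_holder_name": "English"})
print(effective_language(variation, "policy_holder_name"))  # English
print(effective_language(variation, "claim_date"))          # Spanish
```

In this sketch, fields without their own setting fall back to the layout-level language, which mirrors the default behavior described in Assigning languages to layouts.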
Each submission can include documents in different layout-level languages, as long as those languages belong to the same language family (e.g., two French documents and one English document). This applies regardless of any field-level languages the documents have.
Using a layout across languages
If you have a single layout that you want to use for automation in multiple layout-level languages, you need to create an instance of Hyperscience for each of those languages and add the layout to each instance.
For example, if you have an insurance form that some customers complete in Spanish and other customers complete in English, you would create a "Spanish" instance with Spanish selected as the layout's language and an "English" instance with English selected as the layout's language.
Languages and releases
If you have layouts in multiple layout-level languages, each release you create can only contain layouts with layout-level languages from one language family.
For example, if you have layouts in Simplified Chinese, Japanese, and Korean, you need a release for your Simplified Chinese layouts, one for your Japanese layouts, and another for your Korean layouts. However, a single release can contain English, Italian, and Spanish layouts.
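As a planning aid, the one-family-per-release rule can be sketched as a grouping step in Python. This is only an illustration: the `LANGUAGE_FAMILY` mapping is a partial assumption inferred from the examples in this article (English, French, Italian, and Spanish share a family; Simplified Chinese, Japanese, and Korean each sit in their own), and the `plan_releases` function and layout names are hypothetical rather than part of Hyperscience.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumption: partial language-to-family mapping inferred from this article's
# examples. Consult the language families table for the authoritative list.
LANGUAGE_FAMILY: Dict[str, str] = {
    "English": "Latin",
    "French": "Latin",
    "Italian": "Latin",
    "Spanish": "Latin",
    "Simplified Chinese": "Chinese",
    "Japanese": "Japanese",
    "Korean": "Korean",
}


def plan_releases(layouts: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (layout_name, layout_language) pairs so that each group
    contains layouts from a single language family."""
    releases: Dict[str, List[str]] = defaultdict(list)
    for name, language in layouts:
        family = LANGUAGE_FAMILY[language]  # raises KeyError if unmapped
        releases[family].append(name)
    return dict(releases)


# Hypothetical layouts: two Latin-family layouts can share a release,
# while the Chinese and Japanese layouts each need their own.
layouts = [
    ("claim_form_en", "English"),
    ("claim_form_it", "Italian"),
    ("invoice_zh", "Simplified Chinese"),
    ("invoice_ja", "Japanese"),
]
plan = plan_releases(layouts)
for family, names in plan.items():
    print(family, names)
```

Each key in the returned dictionary corresponds to one release you would need to create.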