File block "Locale" option

Modified on Wed, 24 Apr 2019 at 06:53 PM

This article shows the effect of the Locale setting in the File block and when to use it. When you open a CSV file, the locale in which the file was created determines how its numbers should be read.

Here one CSV file (attached, and also shown below) is imported with two different Locale settings. The file uses a semicolon ";" to separate cell values, which is common in non-English locales because it avoids a clash between "," used as the decimal point and "," used as the CSV cell separator.

dot only (English);comma only (French);dot then comma (Bad);comma then dot (English);space then comma (Bad);space then dot (Bad);thin space then dot (Bad);thin space then comma (French)
12345.6;12345,6;12.345,6;12,345.6;12 345,6;12 345.6;12 345.6;12 345,6

The file contains several examples of how the same number might be formatted in different locales.
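As a rough illustration of why the semicolon separator matters, the sketch below reads a trimmed copy of the sample with Python's standard csv module. This is not how Omniscope reads the file; the SAMPLE string and the read_semicolon_csv helper are assumptions made for this example only.

```python
import csv
import io

# Three of the eight columns from the CSV shown above, kept as raw text.
SAMPLE = (
    "dot only (English);comma only (French);comma then dot (English)\n"
    "12345.6;12345,6;12,345.6\n"
)

def read_semicolon_csv(text):
    """Return one dict per data row, leaving every cell as an unparsed string."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return list(reader)

rows = read_semicolon_csv(SAMPLE)
for name, cell in rows[0].items():
    # Commas inside a cell survive because ";" is the only separator.
    print(f"{name!r}: {cell!r}")
```

Because ";" is the only cell separator, a value such as 12345,6 stays in one cell, and the question of what the comma means is left to the locale setting.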

The number written in English locales as 12,345.6 is written in French locales as 12 345,6, both meaning 12 <thousand separator> 345 <decimal point> 6.

In the CSV above, these cases are the column "comma then dot (English)" with value "12,345.6", and the column "thin space then comma (French)" with value "12 345,6".

Omniscope tolerates only one character (locale-dependent) as the thousand separator, and disregards it wherever it appears. For English it is "," and for French it is a thin-space Unicode character. The CSV file contains examples of both, plus examples that use "." or a normal space as the thousand separator, as seen in some other locales not demonstrated here.

Equally, Omniscope only tolerates one character for the decimal point. For English it's "." and for French it's ",".

If Omniscope sees any other punctuation or any non-numeric characters, it will assume the column is text.
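The rules above can be sketched in a few lines of Python. This is a literal reading of the rules as stated here, not Omniscope's actual parser, and it may differ on edge cases. The LOCALES table, the interpret function and the choice of U+2009 for the thin space are all assumptions made for illustration; signs and exponents are left out.

```python
import re

THIN_SPACE = "\u2009"  # assumption: Omniscope's French separator may be another code point

# Hypothetical locale table: (thousand separator, decimal point), per the rules above.
LOCALES = {
    "English": (",", "."),
    "French": (THIN_SPACE, ","),
}

def interpret(cell, locale):
    """Classify one cell as ("Integer", int), ("Decimal", float) or ("Text", str)."""
    thousands, decimal = LOCALES[locale]
    # The thousand separator is disregarded wherever it appears.
    stripped = cell.replace(thousands, "")
    # What remains must be digits, optionally followed by the locale's
    # decimal point and more digits; anything else makes the cell text.
    pattern = rf"[0-9]+(?:{re.escape(decimal)}([0-9]+))?"
    match = re.fullmatch(pattern, stripped)
    if match is None:
        return ("Text", cell)
    if match.group(1) is None:
        return ("Integer", int(stripped))
    return ("Decimal", float(stripped.replace(decimal, ".")))

for locale in LOCALES:
    for cell in ["12345.6", "12345,6", "12,345.6", "12" + THIN_SPACE + "345,6"]:
        print(locale, repr(cell), interpret(cell, locale))
```

Under this sketch the English setting reads "12345,6" as the integer 123456, because the comma is dropped as a thousand separator, while the French setting reads the same cell as the decimal 12345.6.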

In the attached example IOZ file, we're using the Validate block to assert the expected output:

First, using schema validation:

We expect Text for columns that can't be interpreted as numbers under the locale configured in the respective File block.

We expect Integer for the case of the English locale wrongly interpreting the French-formatted number "12345,6": the comma is disregarded as a thousand separator, leaving 123456.

And we expect Decimal for all numbers which are interpreted correctly.

Second, using cell value validation:

We expect 12345.6 for the columns that are expected to parse successfully.

We expect 123456 for the wrongly interpreted integer case (as above).

Note: the number format seen in the report, in the block data preview, and in the Validate block text inputs is determined by the server's locale (e.g. in the UK numbers will be formatted like 10,000.0), irrespective of how the original CSV column was formatted or which locale the File block is configured with. The File block option is purely about how Omniscope interprets the data. Thereafter the data is always displayed using the server's data locale, which defaults to the system locale.
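To illustrate that interpreting and displaying are separate steps, here is a minimal sketch that formats an already-parsed number for two display locales. The display function, its locale names and the use of U+2009 as the French thousand separator are assumptions for this example and do not reflect Omniscope's formatting code.

```python
def display(value, server_locale):
    """Format a parsed number with one decimal place for a hypothetical server locale."""
    english = f"{value:,.1f}"  # "," as thousand separator, "." as decimal point
    if server_locale == "English":
        return english
    if server_locale == "French":
        # Swap separators: "," becomes a thin space, then "." becomes ",".
        return english.replace(",", "\u2009").replace(".", ",")
    raise ValueError(f"unsupported locale: {server_locale}")

# The same stored value is shown differently depending only on the display locale.
value = 12345.6
print(repr(display(value, "English")))
print(repr(display(value, "French")))
```

Whichever File block locale produced the value 12345.6, the displayed text depends only on the locale passed to the formatting step.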

To recreate this example, download both the attached CSV and IOZ files into your sharing folder. Open Omniscope Web and import the IOZ. (WARNING: the attached IOZ file will not work correctly if imported into a server running in a non-English locale; the cell value validation rules will need to be reconfigured. The file is also not backwards compatible with anything but the latest daily builds.)

Alternatively see the live example here.
