Powerful data validation using the Validate Data block

Modified on Wed, 29 Jul 2020 at 07:19 PM

The Validate Data block in Omniscope Evo lets you run checks on your data and then halt execution, send email notifications, or display the validation results in your dashboards.

We've recently expanded the block's functionality, so you can now use it to:

  • Validate the schema. Check that the incoming data matches a particular schema, precisely or loosely, by detecting missing fields, wrong field data types, and/or extraneous fields. You can also capture the current input data schema as rules with one click on the Reset to input button (new), so that any future changes to the incoming data are detected immediately.

  • Validate cell values. Check values in the data using a list of rules. For example, you might define the valid values for a categorical field, or flag numeric fields whose values fall outside an acceptable range.

  • Validate record count (new). Check that the total record count is at least a minimum, no greater than a maximum, or within a range. For example, you might expect a fixed number of records, at least 1 record (to fail on no data), at most 0 records (to fail on any data), or simply a range of healthy record counts.

  • Output problem metadata (new). The block's second output, Problems, contains granular data about every problem occurrence for all configured validation rules (schema, cell value and/or record count), irrespective of whether they are configured as Warning or Ignore. Note that you cannot use it in conjunction with Error, which aborts execution altogether. The output has one row per problem occurrence, with a set of machine- and human-readable columns describing exactly what has gone wrong. You can process this metadata further as regular data, for example by filtering or aggregating it, and act upon it if needed, e.g. in an Email Output block (new), or by connecting it to a Report block and displaying the problems using views and formulas in your dashboard.

  • Customise outcome. When problems arise, you can choose to fail with an error and stop data flowing onwards, to warn and let the data pass, or to ignore a particular failure condition completely.

  • Send emails. You can also opt to send a notification email when a particular validation event arises. This is a simple email without any data-driven content, used only to flag a problem. To send a more sophisticated validation email with a summary of the problems, use the Problems data output in conjunction with the Email Output block (new).
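
To make the rule types and outcomes above more concrete, here is a minimal sketch in plain Python. It is not Omniscope's API and says nothing about how the block is implemented: every function name, field name, and rule below is hypothetical, and the data is made up for illustration. It only mimics the idea of schema, cell value and record count rules that each either abort (error) or pass their findings to a Problems-style list (warning or ignore).

```python
# Illustrative sketch only: plain Python, NOT Omniscope's API.
# All function, field, and rule names are hypothetical.

class ValidationError(Exception):
    """Raised when a rule configured as 'error' finds a problem."""


def check_schema(rows, expected, allow_extra=False):
    """Report missing fields, wrong types, and (unless allowed) extra fields."""
    problems = []
    for i, row in enumerate(rows):
        for field, ftype in expected.items():
            if field not in row:
                problems.append({"rule": "schema", "row": i,
                                 "message": f"missing field '{field}'"})
            elif not isinstance(row[field], ftype):
                problems.append({"rule": "schema", "row": i,
                                 "message": f"'{field}' is not {ftype.__name__}"})
        if not allow_extra:
            for field in sorted(row.keys() - expected.keys()):
                problems.append({"rule": "schema", "row": i,
                                 "message": f"unexpected field '{field}'"})
    return problems


def check_cells(rows, rules):
    """Apply a predicate per field; fields absent from a row are skipped."""
    problems = []
    for i, row in enumerate(rows):
        for field, is_valid in rules.items():
            if field in row and not is_valid(row[field]):
                problems.append({"rule": "cell", "row": i,
                                 "message": f"invalid {field}: {row[field]!r}"})
    return problems


def check_record_count(rows, minimum=None, maximum=None):
    """Check the total record count against optional bounds."""
    n = len(rows)
    if minimum is not None and n < minimum:
        return [{"rule": "record_count", "row": None,
                 "message": f"expected at least {minimum} records, got {n}"}]
    if maximum is not None and n > maximum:
        return [{"rule": "record_count", "row": None,
                 "message": f"expected at most {maximum} records, got {n}"}]
    return []


def run_validation(rows, checks):
    """Run (outcome, check) pairs; 'error' aborts, other outcomes are collected."""
    problems = []
    for outcome, check in checks:
        found = check(rows)
        if found and outcome == "error":
            raise ValidationError(found[0]["message"])
        problems.extend({**p, "outcome": outcome} for p in found)
    return problems


# Made-up sample data.
rows = [
    {"country": "UK", "sales": 120.0},
    {"country": "XX", "sales": -5.0},
    {"country": "FR"},
]
expected = {"country": str, "sales": float}
cell_rules = {"country": lambda v: v in {"UK", "FR"}, "sales": lambda v: v >= 0}

problems = run_validation(rows, [
    ("warning", lambda r: check_schema(r, expected)),
    ("warning", lambda r: check_cells(r, cell_rules)),
    ("ignore", lambda r: check_record_count(r, minimum=5)),
])
for p in problems:
    print(p["outcome"], p["rule"], p["row"], p["message"])
```

Because the result is one plain record per problem occurrence, it can be filtered or aggregated like any other data, which is the same idea as processing the Problems output downstream.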

Example: To separate 'clean' from 'faulty' records and also see the data validation diagnostics, you could build a model with two Validate Data blocks. Set the first to raise a warning and generate the diagnostics, then feed the data into a Data Table block, where values can be edited and fixed, and a Field Organiser block, where data types can be corrected. Follow these with a second Validate Data block that stops faulty records from reaching the report (Error outcome on failure).
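
The shape of that two-stage model can be sketched in plain Python. This is only an analogy under stated assumptions, not Omniscope code: the `amount` field, the single rule, and the function names are all hypothetical, and `fix_rows` merely stands in for the manual edits and type corrections made in the intermediate blocks.

```python
# Illustrative sketch only: plain Python, NOT Omniscope's API.
# The 'amount' rule and all names below are hypothetical.

def find_problems(rows):
    """Cell-value rule: 'amount' must be a non-negative number."""
    return [
        {"row": i, "field": "amount", "value": row.get("amount")}
        for i, row in enumerate(rows)
        if not isinstance(row.get("amount"), (int, float)) or row["amount"] < 0
    ]


def fix_rows(rows, edits):
    """Stand-in for the fixing step: apply per-row edits, then coerce text to float."""
    fixed = []
    for i, row in enumerate(rows):
        row = {**row, **edits.get(i, {})}
        if isinstance(row.get("amount"), str):
            # float() raises ValueError if the text is not numeric.
            row["amount"] = float(row["amount"])
        fixed.append(row)
    return fixed


def strict_gate(rows):
    """Second validation stage: any remaining problem stops the flow."""
    remaining = find_problems(rows)
    if remaining:
        raise ValueError(f"{len(remaining)} faulty record(s); report not updated")
    return rows


# Made-up sample data.
rows = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": "25.5"},   # wrong data type
    {"id": 3, "amount": -4.0},     # value out of range
]

# Stage 1 (warning): collect diagnostics, but let the data flow onwards.
diagnostics = find_problems(rows)

# Fixing step: one manual edit for row index 2, plus type coercion.
fixed = fix_rows(rows, edits={2: {"amount": 4.0}})

# Stage 2 (error): only fully valid data reaches the report.
report_rows = strict_gate(fixed)
print(len(diagnostics), "problem(s) found;", len(report_rows), "row(s) reach the report")
```

Running the first stage in a warning-like mode keeps the diagnostics available for inspection, while the strict second stage guarantees that nothing faulty reaches the report.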
