Omniscope's Compare block lets you quickly and systematically compare data from multiple sources and apply checks to it, whether you are comparing raw data or the output of your own transformations.
The block has two inputs: the “actual,” which is the dataset you wish to check, and the “expected,” which is what the correct dataset should look like.
Users of the Validate Data block will be familiar with the three levels of checks applied in the Compare process:
- Compare Schema: This checks the fields/columns, identifying mismatches in field names, counts, and formats.
- Compare Records (Rows): This checks whether the two sources have the same number of records.
- Compare Cells: This compares the values inside the cells and reports any mismatches.
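To make the three levels concrete, here is a minimal Python sketch of the same idea outside Omniscope. It is not how Omniscope implements the block: the function name `compare_tables` and the representation of rows as plain dictionaries are assumptions made for illustration, and the schema check here looks only at field names, not formats.

```python
def compare_tables(actual, expected):
    """Return mismatch messages grouped by check level.

    Rows are plain dicts (field name -> value). This is an illustrative
    stand-in, not Omniscope's internal data model.
    """
    report = {"schema": [], "records": [], "cells": []}

    # Schema: compare field names, taken from the first row of each input.
    actual_fields = list(actual[0].keys()) if actual else []
    expected_fields = list(expected[0].keys()) if expected else []
    for name in expected_fields:
        if name not in actual_fields:
            report["schema"].append(f"missing field: {name}")
    for name in actual_fields:
        if name not in expected_fields:
            report["schema"].append(f"extra field: {name}")

    # Records: compare row counts.
    if len(actual) != len(expected):
        report["records"].append(
            f"expected {len(expected)} records, got {len(actual)}"
        )

    # Cells: compare values row by row over the fields both inputs share.
    shared = [f for f in expected_fields if f in actual_fields]
    for i, (a_row, e_row) in enumerate(zip(actual, expected)):
        for name in shared:
            if a_row.get(name) != e_row.get(name):
                report["cells"].append(
                    f"row {i}, field {name}: "
                    f"expected {e_row.get(name)!r}, got {a_row.get(name)!r}"
                )
    return report
```

Each list in the returned report is empty when the corresponding check passes, so the three levels can be inspected independently.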
The outcome of each of these checks can be set to “error,” “warning,” or “none”:
- Error: If the data fails one or more checks, it will not flow through the block.
- Warning: The data still flows through, but the block shows a warning message explaining where the dataset failed the validation criteria (e.g., extra fields, fewer records than expected, or mismatching cell values).
- None: The data flows through regardless of the result of the check.
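The effect of the outcome setting can be sketched in Python as follows. This is an illustration under stated assumptions rather than Omniscope's own logic: `apply_outcomes`, `ValidationError`, and the dictionary shapes are hypothetical names chosen for the example.

```python
class ValidationError(Exception):
    """Raised when a check set to 'error' fails (hypothetical name)."""


def apply_outcomes(actual, report, outcomes):
    """Pass `actual` through unless a failed check is set to 'error'.

    report:   check name -> list of mismatch messages (empty means passed)
    outcomes: check name -> "error", "warning" or "none"
    Returns a tuple (actual, warnings).
    """
    warnings = []
    for check, messages in report.items():
        level = outcomes.get(check, "none")
        # A passed check, or one set to "none", never affects the output.
        if not messages or level == "none":
            continue
        if level == "error":
            # Block the data: nothing is returned to downstream steps.
            raise ValidationError(f"{check} check failed: " + "; ".join(messages))
        if level == "warning":
            # Let the data through, but keep the explanation.
            warnings.extend(f"{check}: {m}" for m in messages)
    return actual, warnings
```

With a failed records check set to "warning", the call returns the unchanged `actual` data plus the messages; with the same check set to "error", it raises instead of returning.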
In the “Compare Cells” section, you can refine the comparison:
- Decide whether to compare values in all columns or only in selected ones.
- Use dynamic rules to compare only text fields, or only fields whose names contain some text, e.g. ‘category’. Because the fields are chosen by rule rather than by a fixed list, the check stays robust to changes in future data inputs.
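The idea behind rule-based field selection can be sketched in Python. This is only an illustration: `select_fields` and its parameters are hypothetical, and it assumes the ‘category’ rule matches on field names, which may differ from how Omniscope evaluates its dynamic rules.

```python
def select_fields(rows, name_contains=None, only_text=False):
    """Pick the fields to compare using rules instead of a fixed list.

    rows:          list of dicts (field name -> value)
    name_contains: keep only fields whose name contains this text
    only_text:     keep only fields whose values are all strings
    """
    if not rows:
        return []
    selected = []
    for name in rows[0]:
        # Rule 1: case-insensitive match on the field name.
        if name_contains is not None and name_contains.lower() not in name.lower():
            continue
        # Rule 2: every value in the field must be text.
        if only_text and not all(isinstance(r.get(name), str) for r in rows):
            continue
        selected.append(name)
    return selected
```

Because the selection is recomputed from whatever fields arrive, a new field such as `sub_category` is picked up automatically without editing the rule.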
In addition, any setting shown with a link icon can be parameterised, so that its behaviour can be controlled remotely.
Note: The output of the Compare block is the ‘actual’ input. It is passed through when the data has passed all checks, or when the failed checks were set to ‘warning’ or ‘none’.