Prerequisite: download and install the app
On the Omniscope download page you will find two builds: 'Rock' and 'latest daily build'. The first is older and more stable; the second comes with all the latest features, but is not as thoroughly tested.
After the download, install the app (following instructions here if necessary) and start Omniscope by clicking on the blue Omniscope icon on your desktop.
It’s a Web application, so use any device to interact with it
Omniscope will open in a browser, and from here you will be able to create new project files and folders, or upload the data files that you wish to use as sources (see the three blue buttons in the top right-hand corner). Storing the data in the same folder as the project file/reports is optional, since an Omniscope workflow can connect to remote data sources.
All menus are finger-friendly to allow you to create and access reports on different devices: you could create a report in the office, then make edits at home or on the train. Omniscope will run not only on Windows, Mac and Linux machines, but also on any web-enabled touch-screen device, such as iPads, Android tablets and mobile phones.
Get started - create a project
Clicking on the plus button will open a new project interface, where you will be able to assemble the data transformation workflow and interactive visualisations (see image below).
If you are working with locally-stored reports and connecting to files on your machine as data sources, you will be able to do all this even if you have no internet connection.
The small blue icon at the top of the screen on Mac, or in the bottom tray on Windows machines, contains shortcuts to options such as the Omniscope Classic application (if you want to work on legacy IOK files), error reports, log files etc.
Tip - use the 'Exit' option from this menu as a reliable way to quit the application before updating your Omniscope build. To update, download and install the new version; there is no need to delete the existing build, as the new one will replace it. By doing this on a regular basis, you will benefit from all the software improvements throughout the year - see the changelog section on the download page.
The first thing you will see in your new project is the Workflow app. The toolboxes on the left contain all sorts of operations and inputs for ETL and analytics: Inputs, Connectors, Preparation, Analytics, Text Analysis, Outputs and Reports.
Expanding each of these menus will allow you to drag a block to the workspace to create the data transformation process (here we've expanded the Inputs menu).
All blocks are colour-coded, so you can quickly see that in the project above we are working with four source files, all in yellow.
The first building blocks are your data sources, so open the Inputs menu and drag a File block onto the workspace if you want to use data from an Excel spreadsheet, or a Database block to retrieve data from a database.
Inputs menu contents:
- Demo data block comes with multiple useful demo datasets, which are great for testing and training. Some of these datasets are used in our training videos.
- File – used for Excel spreadsheets, .csv, .tsv, .txt, legacy IOK files, XML, JSON, IOD and other data files.
Note that you can use the File block to connect to any XML or JSON REST API, setting optional HTTP headers, for example to pass an authentication token.
- Database block enables you to connect to local, network or cloud-hosted databases such as Oracle, SQL, Actian Vector, Ingres, MS Access, etc., as well as any database that comes with a JDBC driver.
- Lookup data might come in handy if your data lacks the information needed to display it on a map, as it contains Lat/Long data for countries and towns around the world.
- ‘Batch append’ and ‘Append files’ will help you append multiple files from one or several folders. This avoids cluttering the screen with too many individual data sources when, for example, you have files with the same structure covering different periods (week1, week2, week3...).
- ‘File attributes’ is useful when taking an inventory of files and reports: it shows files, folders, field metadata and data schema, and also lists the report features used in legacy IOK files.
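The note about REST APIs in the File block description can be pictured, outside Omniscope, as an ordinary HTTP request that carries an extra header. The Python sketch below only illustrates that idea and is not how Omniscope implements it; the URL, the token, the `Bearer` scheme and the helper names `build_request` and `parse_records` are all hypothetical. No network call is made: the request is only constructed, and a sample response body stands in for what such an API might return.

```python
import json
import urllib.request


def build_request(url: str, token: str) -> urllib.request.Request:
    """Build a GET request that passes an authentication token in a header."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Accept": "application/json",
        },
    )


def parse_records(body: str) -> list[dict]:
    """Turn a JSON array of objects into a list of row dictionaries."""
    data = json.loads(body)
    return [row for row in data if isinstance(row, dict)]


# Hypothetical endpoint and token, used only to show the shape of the request.
request = build_request("https://api.example.com/v1/sales", "MY_TOKEN")

# Sample response body (made up for illustration), parsed into rows.
sample_body = '[{"region": "North", "sales": 120}, {"region": "South", "sales": 95}]'
rows = parse_records(sample_body)
```

In the File block itself you would enter the API address and the header name/value pair in the block's options rather than writing code.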
The Connectors menu uses publicly available or commercial API connectors for external data sources (social media, advertising, financial data), such as Twitter, Reddit, Google Analytics, DoubleClick, Currency data and others.
This menu is updated periodically to add more sources, or to remove connectors that are being updated/removed by the vendor.
All blocks contain a configuration tab and a data preview tab plus, in some cases, extra tabs with instructions and further information. After configuring the options on the first tab you will need to press the ‘Execute’ button at the top of the block. This allows the data to flow through the block, undergo any defined actions, then appear on the data preview tab, where you can examine the results. To make this inspection easier, click on a column header to sort the data, or on the cog button to create a mini chart showing the data distribution inside that field. Clicking on the small chart icon in the top left-hand corner will create charts for all the fields. Another useful feature is the pin icon next to the Data tab, which brings the data preview forward and displays it in the bottom half of the config tab. In some cases the preview may show a data sample even before the block has been executed.
Preparation menu - all the tools you need for data management and transformation
Append – brings together multiple data sources and stacks them vertically. Typically used to append data that contains the same fields, for example transactions in consecutive time periods.
Join – performs a merge operation on one or multiple fields, and allows separation of non-merged records. Tip - deduplicate and remove blanks on both sides before merging!
Normalise case is used for textual data - it will eliminate multiple case variations of the same text (e.g. London and LONDON).
Bulk field organiser – lets you edit or delete multiple fields at once.
Validate data will do data type checks for all the fields, making sure the contents of the fields follow the correct data schema (e.g. the Date field contains date values).
Record filter allows the user to set criteria on one or multiple field values, then accept the records that satisfy the criteria and reject those that do not. Two data preview tabs display the accepted and rejected records, and both sets can be used as outputs of this block.
Search/replace block will clean data in one or multiple fields, using any number of search terms.
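To see why the tip about de-duplicating and removing blanks matters, here is a minimal Python sketch of a merge on a single field. It is an illustration of the general idea only, not Omniscope's Join block; the helper names `clean` and `join` and the sample records are hypothetical.

```python
def clean(rows, key):
    """Drop rows whose key is blank and keep only the first row per key."""
    seen = set()
    result = []
    for row in rows:
        value = row.get(key)
        if value is None or value.strip() == "" or value in seen:
            continue
        seen.add(value)
        result.append(row)
    return result


def join(left, right, key):
    """Merge two row lists on `key`; also return the non-merged left rows."""
    lookup = {row[key]: row for row in right}
    merged, unmatched = [], []
    for row in left:
        match = lookup.get(row[key])
        if match is None:
            unmatched.append(row)
        else:
            merged.append({**row, **match})
    return merged, unmatched


orders = [
    {"customer": "A1", "amount": 10},
    {"customer": "", "amount": 5},      # blank key
    {"customer": "B2", "amount": 7},
    {"customer": "C3", "amount": 3},    # no matching customer record
]
customers = [
    {"customer": "A1", "city": "London"},
    {"customer": "A1", "city": "London"},   # duplicate
    {"customer": "", "city": "Unknown"},    # blank key
    {"customer": "B2", "city": "Leeds"},
]

merged, unmatched = join(clean(orders, "customer"), clean(customers, "customer"), "customer")
```

Without the `clean` step, the blank order would be merged with the blank customer record, because two blanks count as a match.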
Aggregate block can summarise the dataset using one or multiple criteria, while the user can pick resulting fields and functions, such as sum, mean, max etc.
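As a rough picture of what summarising by a criterion means, the following Python sketch groups records by one field and computes sum, mean and max of another. It is only an illustration under simple assumptions, not the Aggregate block's implementation; the `aggregate` helper and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate(rows, group_field, value_field):
    """Group rows by one field and compute sum, mean and max of another."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_field]].append(row[value_field])
    return {
        key: {"sum": sum(values), "mean": mean(values), "max": max(values)}
        for key, values in groups.items()
    }


sales = [
    {"region": "North", "amount": 10},
    {"region": "North", "amount": 30},
    {"region": "South", "amount": 5},
]
summary = aggregate(sales, "region", "amount")
```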
De-duplicate and Delete empty data are useful steps, especially before the data is merged with another dataset. Beware - two blanks are a match!
Pivot, De-pivot and Transpose are priceless tools when we need to change the data orientation. Pivot and De-pivot can isolate and reorganise several fields while leaving the others in place, while Transpose is a mechanical operation that changes the orientation of all the fields. For example, the fields Jan, Feb and March can be merged into a single Date field by using the De-pivot operation; Pivot will reverse this.
Normalise operation is used to normalise values in one or multiple numeric fields, so that they can later be displayed and analysed together. This helps when the original values are on different scales, which can make them difficult to visualise or distort their influence on a target variable in a statistical model.
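The Jan/Feb/March example can be sketched in a few lines of Python. This is an illustration of the reshaping idea only, not Omniscope's implementation; the helper names `depivot` and `pivot`, the `Sales` field and the sample data are hypothetical.

```python
def depivot(rows, id_field, value_fields, name_field, value_field):
    """Turn several columns into (name, value) pairs, one output row per pair."""
    long_rows = []
    for row in rows:
        for field in value_fields:
            long_rows.append(
                {id_field: row[id_field], name_field: field, value_field: row[field]}
            )
    return long_rows


def pivot(rows, id_field, name_field, value_field):
    """Reverse of depivot: spread (name, value) pairs back into columns."""
    wide = {}
    for row in rows:
        target = wide.setdefault(row[id_field], {id_field: row[id_field]})
        target[row[name_field]] = row[value_field]
    return list(wide.values())


wide_rows = [
    {"product": "Tea", "Jan": 10, "Feb": 12, "March": 9},
    {"product": "Coffee", "Jan": 20, "Feb": 18, "March": 25},
]

# De-pivot: Jan, Feb, March become values of a single Date field.
long_rows = depivot(wide_rows, "product", ["Jan", "Feb", "March"], "Date", "Sales")

# Pivot: restores one column per month.
round_trip = pivot(long_rows, "product", "Date", "Sales")
```

Note that the `product` field is left in place throughout, which is the difference from a full transpose.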
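One common way to bring fields on different scales into a comparable range is min-max scaling, sketched below in Python. This is an assumption made for illustration: the text above does not say which method the Normalise block applies, and the `min_max_normalise` helper and sample values are hypothetical.

```python
def min_max_normalise(values):
    """Rescale numbers to the 0-1 range (min-max scaling, assumed method)."""
    low, high = min(values), max(values)
    if high == low:
        # All values identical: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


revenue = [1000, 5000, 9000]  # large scale
rating = [1, 3, 5]            # small scale

scaled_revenue = min_max_normalise(revenue)
scaled_rating = min_max_normalise(rating)
```

After scaling, both fields run from 0 to 1, so they can share a chart axis or enter a model without the larger-valued field dominating.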
Scramble data is a simple way to replace sensitive text data with unrecognisable, yet consistent patterns, enabling the user to share the output with others, while the analysis and visualisation of the results are not affected.
Split values will separate fields using a single-character separator or fixed width, while the Collapse block will do the opposite and concatenate the contents of multiple fields.
Sort will enforce data ordering according to one or more fields.
Split data – splits records randomly into two sets with given proportions. Used to create input for statistical models, so one subset of the dataset can be used to train the model, and the other one for testing.
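A random split with given proportions can be pictured with the short Python sketch below. It is illustrative only and not the Split data block's implementation; the `split_records` helper, the fixed seed and the 70/30 proportion are assumptions chosen for the example.

```python
import random


def split_records(records, train_fraction, seed=0):
    """Randomly split records into two lists with the given proportion."""
    shuffled = list(records)               # copy, so the input stays untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


records = list(range(10))
train_set, test_set = split_records(records, 0.7)
```

Every record ends up in exactly one of the two sets, so the model is never tested on data it was trained on.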
Random sample picks records from the dataset, either as a percentage of the dataset size or as a fixed number of records.
Geocode and Reverse Geocode rely on Esri (paid for) or Nominatim (free service) to allow users to assign Lat/Long location data based on address information, or the other way round. This is a less reliable alternative to using 'data mapping' and merging the dataset with a list of locations.
Making a connection - one-to-many or many-to-many
Connecting blocks is easy - you will see arrows on one or both sides of each block, which you can grab and drag to connect it with other blocks. Data sources will usually have an output only; the exceptions are the Database block and connectors, which may take an optional parameters input. The output of one block can be connected to multiple subsequent preparation blocks or reports. In the image of the workflow above, a Field organiser block called 'World' is feeding data into two analytics blocks, as well as a blue 'Report 3' block.
Best practice tip 1: whatever your data source, connect it to a Field organiser block, where you can see a list of all the fields and quickly eliminate fields that are not needed, correct the data types, or rename the fields.
Best practice tip 2: name your blocks by clicking on the name at the top of an open block, or create workflow notes from the + button > Shapes at the top of the workflow.
This will make any subsequent review and editing easier, especially if you are working on the file collaboratively with other users.
Once configured, the workflow will allow you to retrieve new data from your data sources and update the dashboards and reports, just by clicking on the refresh button. This refresh can be performed on demand, by clicking on the ‘play’ button, or automated using the Scheduler application, available to Omniscope Server users. In this scenario you could specify how often each report is produced, so that some reports are updated every hour, day, week or month, or more frequently.
You can read more about the execution and refresh behaviour in this article.