Split data into multiple datasets

Posted over 5 years ago by Magali Colin - Avizua



I am using the « split data » operation block.

Would it be feasible to allow the output of several datasets, according to the content of a column?

For example: if the column « country » contains UK, France and Italy → split the input dataset into 3 datasets, one for each country?

Best Regards







Antonio Poggi posted over 5 years ago Admin

Hello Magali,

The "Split data" block is meant to split data into two sets using various methodologies, like Proportional Data Split or Intelligent Cut-off, and it works on date or numeric fields. It is not the right block for your use case.

What you are describing sounds like a workflow that creates one dataset per value found in a field containing comma-separated values.

Omniscope does not have a block that solves this, but we can use the Python block as if it were an output block, capable of splitting the output and writing it to multiple CSV files, one per "partition" / data split.

I have created a demo for you that does what you described: https://omniscope.me/Forums/Split+data+in+Datasets.iox/ 

The workflow splits the data per country and creates CSV files, one per country.
First I use a De-tokenise block on the "Countries" field to duplicate rows, so that each row holds a single country. Then I rename "Countries" to "Country" in a Field organiser block. Finally, the Python block outputs the data partitioned by "Country", writing one file per country / partition.
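
For readers who cannot open the demo, here is a minimal sketch of the same three steps in plain Python. It is not the script used in the demo project: the function names `detokenise` and `write_partitions`, the sample rows and the "Customer" field are assumptions made for illustration, and it uses only the standard library rather than whatever the Python block exposes inside Omniscope.

```python
import csv
import os
import tempfile
from collections import defaultdict


def detokenise(rows, field, new_field, sep=","):
    """Duplicate each row once per token in `field`, storing the token in `new_field`."""
    out = []
    for row in rows:
        for token in row[field].split(sep):
            token = token.strip()
            if not token:
                continue  # skip empty tokens such as a trailing comma
            new_row = {k: v for k, v in row.items() if k != field}
            new_row[new_field] = token
            out.append(new_row)
    return out


def write_partitions(rows, field, out_dir):
    """Write one CSV per distinct value of `field`; return {value: path}.

    Assumes the values are safe to use as file names (no path separators).
    """
    os.makedirs(out_dir, exist_ok=True)
    groups = defaultdict(list)
    for row in rows:
        groups[row[field]].append(row)

    paths = {}
    for value, group in groups.items():
        path = os.path.join(out_dir, f"{value}.csv")
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=list(group[0].keys()))
            writer.writeheader()
            writer.writerows(group)
        paths[value] = path
    return paths


if __name__ == "__main__":
    # Hypothetical sample data standing in for the input dataset.
    sample = [
        {"Customer": "A", "Countries": "UK, France"},
        {"Customer": "B", "Countries": "Italy"},
    ]
    rows = detokenise(sample, "Countries", "Country")
    with tempfile.TemporaryDirectory() as folder:
        written = write_partitions(rows, "Country", folder)
        print(sorted(written))  # names of the partitions written
```

Inside Omniscope, the De-tokenise and Field organiser blocks already do the work of `detokenise`, so only the grouping and writing part would need to live in the Python block.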

I also added a Batch append folder block to verify that the data is written correctly.

You can download the project and continue locally.

Let me know if this achieves your goal.
