Split data into multiple datasets

Posted over 5 years ago by Magali Colin - Avizua

Solved

Hi,

I am using the "Split data" operation block.

Would it be feasible to allow it to output several datasets, according to the content of a column?

For example: if the column "country" contains UK, France and Italy → split the input dataset into 3 datasets, one for each country.

Best Regards

 




1 Comment


Antonio Poggi posted over 5 years ago Admin

Hello Magali,


The "Split data" block is meant to split data into two sets using various methodologies, such as Proportional Data Split or Intelligent Cut-off, and it works on date or numeric fields. It is not the right block for your use case.


What you are describing sounds like a workflow that creates datasets, one for each value found in a field containing comma-separated values.

Omniscope does not have a block that solves this directly, but we can use the Python block as if it were an output block, splitting the data and writing multiple CSV files, one per "partition" / data split.
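
To illustrate the idea of writing one CSV file per partition, here is a minimal sketch in plain Python. It is not the script from the demo project and does not use the Omniscope Python block API; the function name `write_partitions`, the sample rows and the output folder are all hypothetical, and it assumes the partition values are safe to use as file names.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path


def write_partitions(rows, field, out_dir):
    """Write one CSV file per distinct value of `field`.

    `rows` is a list of dictionaries sharing the same keys.
    Returns a dict mapping each partition value to the path written.
    """
    # Group the rows by the value found in the partition field.
    groups = defaultdict(list)
    for row in rows:
        groups[row[field]].append(row)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = {}
    for value, group in groups.items():
        # Assumption: the value is a valid file name (no slashes etc.).
        path = out_dir / f"{value}.csv"
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(group[0].keys()))
            writer.writeheader()
            writer.writerows(group)
        written[value] = path
    return written


# Hypothetical sample data, one country per row.
rows = [
    {"Country": "UK", "Sales": 10},
    {"Country": "France", "Sales": 7},
    {"Country": "UK", "Sales": 3},
    {"Country": "Italy", "Sales": 5},
]

# Write into a temporary folder; replace with a real output folder as needed.
paths = write_partitions(rows, "Country", tempfile.mkdtemp())
for value, path in sorted(paths.items()):
    print(value, "->", path.name)
```

Inside an actual Python block the rows would come from the block's input rather than from a hard-coded list, so the grouping and writing steps are the part to adapt.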


I have created a demo for you that does what you described: https://omniscope.me/Forums/Split+data+in+Datasets.iox/ 

The workflow splits the data per country and creates CSV files, one per country.
First I use a De-tokenise block on the "Countries" field to duplicate rows so that each row holds a single country. Then I rename "Countries" to "Country" in a Field organiser block. Finally, the Python block outputs the data partitioned by Country, one CSV file per country / partition.
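
For readers who want to see what the first two steps do to the data, the following plain Python sketch mimics them outside Omniscope. It is only an approximation of the De-tokenise and Field organiser blocks, not their implementation; the helper names `detokenise` and `rename_field` and the sample rows are hypothetical, and it assumes the values are separated by commas.

```python
def detokenise(rows, field, separator=","):
    """Duplicate each row once per token found in `field`."""
    expanded = []
    for row in rows:
        for token in row[field].split(separator):
            token = token.strip()
            if token:  # skip empty tokens such as a trailing comma
                new_row = dict(row)
                new_row[field] = token
                expanded.append(new_row)
    return expanded


def rename_field(rows, old, new):
    """Return copies of the rows with the key `old` renamed to `new`."""
    return [
        {(new if key == old else key): value for key, value in row.items()}
        for row in rows
    ]


# Hypothetical input: several countries packed into one field.
rows = [
    {"Countries": "UK, France", "Sales": 10},
    {"Countries": "Italy", "Sales": 5},
]

expanded = rename_field(detokenise(rows, "Countries"), "Countries", "Country")
for row in expanded:
    print(row)
```

After these steps every row carries exactly one country, which is what makes partitioning by the "Country" field straightforward.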


I also added a Batch append folder block to verify the data is written correctly.


You can download the project and continue locally.


Let me know if this achieves your goal.


