Solved

Split data into multiple datasets

Hi,

I am using the « split data » operation block.

Would it be feasible to allow the output of several datasets, according to the content of a column?

For example: if column « country » contains UK, France, Italy → split the input dataset into 3 datasets, one for each country?

Best Regards

 


1 Comment

Hello Magali,


The "Split data" block is meant to split data into two sets using various methods, such as Proportional Data Split or Intelligent Cut-off, and it works on date or numeric fields. It is not the right block for your use case.


What you are describing sounds like a workflow that creates one dataset for each value found in a field containing comma-separated values.

Omniscope does not have a block that does this directly, but we can use the Python block as if it were an output block, splitting the data and writing it to multiple CSV files, one per "partition" (data split).
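As a rough illustration of the idea (this is not the script from the demo project), the partition-and-write step could look like the pandas sketch below. The function name `write_partitions`, the output folder, and the file naming are assumptions made for the example. How the Python block hands the input data to the script is not shown here, so check the block's own template for that part.

```python
from pathlib import Path

import pandas as pd


def write_partitions(df, column, out_dir):
    """Write one CSV file per distinct value of `column`.

    Hypothetical helper for illustration. Rows where `column` is
    missing are skipped, because groupby drops NaN keys by default.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    for value, part in df.groupby(column):
        # Assumes the values are safe to use as file names.
        path = out_dir / f"{value}.csv"
        part.to_csv(path, index=False)
        paths.append(path)
    return paths
```

Calling `write_partitions(df, "Country", "output")` on a table with UK, France and Italy rows would produce `UK.csv`, `France.csv` and `Italy.csv` in the `output` folder.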


I have created a demo for you that does what you described: https://omniscope.me/Forums/Split+data+in+Datasets.iox/ 

The workflow splits the data by country and creates CSV files, one per country.
First I use a De-tokenise block on the "Countries" field to duplicate rows, one per country value. Then I rename "Countries" to "Country" in a Field organiser block. Finally, the Python block outputs the data partitioned by Country, writing one file per country.
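For readers who want to see what the first two steps do to the data, here is a pandas analogue of de-tokenising and renaming. It is only a sketch: the exact behaviour of the De-tokenise and Field organiser blocks is not reproduced, and the function name `detokenise`, the "Customer" column in the usage note, and the comma separator are assumptions for the example.

```python
import pandas as pd


def detokenise(df, column, new_name, sep=","):
    """Turn each separated token in `column` into its own row,
    then rename the column. Hypothetical helper for illustration."""
    # Split "UK, France" into ["UK", " France"].
    tokens = df[column].str.split(sep)
    # One row per token; the other columns are repeated.
    out = df.assign(**{column: tokens}).explode(column, ignore_index=True)
    # Remove the spaces left around each token.
    out[column] = out[column].str.strip()
    return out.rename(columns={column: new_name})
```

For instance, a row with Customer "A" and Countries "UK, France" becomes two rows, one with Country "UK" and one with Country "France", both keeping Customer "A".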


I also added a Batch append folder block to verify the data is written correctly.


You can download the project and continue locally.


Let me know if this achieves your goal.


