Introduction
Omniscope supports reading and writing Apache Parquet and Avro files.
Parquet is a popular column-based file format used by Hadoop systems. It is designed to store large data sets efficiently and uses the .parquet file extension.
Avro is a row-oriented file format, developed by Apache.
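To illustrate the difference between the two layouts, here is a minimal Python sketch. It is not how Parquet or Avro encode data on disk; it only shows the idea of keeping values together by record (row-oriented) versus by field (column-oriented). The sample records and the function names are made up for illustration.

```python
# Three illustrative records (made-up sample data).
records = [
    {"city": "Leeds", "year": 2021, "sales": 120.5},
    {"city": "York", "year": 2021, "sales": 98.0},
    {"city": "Leeds", "year": 2022, "sales": 143.25},
]


def to_row_layout(rows):
    """Row-oriented: the values of each record are kept together."""
    return [tuple(row.values()) for row in rows]


def to_column_layout(rows):
    """Column-oriented: all values of one field are kept together."""
    return {name: [row[name] for row in rows] for name in rows[0]}


row_layout = to_row_layout(records)
column_layout = to_column_layout(records)

# Summing one field only touches that column in the columnar layout...
total_sales = sum(column_layout["sales"])
# ...whereas the row layout has to visit every record to pick the field out.
total_sales_from_rows = sum(row[2] for row in row_layout)
```

In general, a column-oriented layout suits analysis that reads a few fields across many records, while a row-oriented layout suits writing or reading whole records one at a time.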
Reading a Parquet or Avro file
Inside your Omniscope workflow, add a new File input block. Double-click on the block to open the options. Select the location of the Parquet file. If the file has the expected .parquet extension, Omniscope will automatically pick the Parquet file format. Click the Play button to execute the block and read the data.
Writing a Parquet or Avro file
Inside your Omniscope workflow, add a new File output block. Connect the data that you want to write to your output block.
Double-click on the File output block to open the options. Select the location and name of the file you want to create. Change the Format to Apache Parquet (.parquet file). Click the Play button to write the data.
Limitations
When reading a Parquet file, Omniscope only supports the following logical types: STRING, ENUM, INTEGER, DECIMAL, DATE, TIME, TIMESTAMP, JSON. Other types, such as LIST and MAP, are not currently supported. If you need to import data that uses one or more unsupported types, please get in touch with us, as it may be possible for us to develop support for them.
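If you already know the logical types used in a file, you can check them against the list above before importing. The following is a minimal Python sketch, not part of Omniscope: the function name and the example schema are hypothetical, and it assumes you have obtained a mapping of column name to logical type name from whichever tool you use to inspect Parquet files. Such a tool may spell a type differently from the list above, so adjust the names as needed.

```python
# Logical types listed as supported in the Limitations section of this article.
SUPPORTED_LOGICAL_TYPES = frozenset(
    {"STRING", "ENUM", "INTEGER", "DECIMAL", "DATE", "TIME", "TIMESTAMP", "JSON"}
)


def find_unsupported_columns(column_types):
    """Return {column name: logical type} for types outside the supported list.

    `column_types` is a hypothetical mapping of column name to logical type
    name, produced by whatever Parquet inspection tool you have available.
    """
    return {
        name: logical_type
        for name, logical_type in column_types.items()
        if logical_type.upper() not in SUPPORTED_LOGICAL_TYPES
    }


# Made-up example schema for illustration only.
example_schema = {
    "customer": "STRING",
    "order_date": "DATE",
    "amount": "DECIMAL",
    "tags": "LIST",
    "attributes": "MAP",
}

unsupported = find_unsupported_columns(example_schema)
for column, logical_type in unsupported.items():
    print(f"Column '{column}' uses unsupported logical type {logical_type}")
```

With the example schema above, the columns reported are tags and attributes, which use the LIST and MAP types mentioned as unsupported.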