Apache: Big Data Europe 2016
Wednesday, November 16 • 13:00 - 13:50
Parquet Format in Practice & Detail - Uwe L. Korn, Blue Yonder

Apache Parquet is among the most commonly used column-oriented data formats in the big data processing space. It uses various techniques to store data in a CPU- and I/O-efficient way. It can also push analytical queries down to the I/O layer, so that irrelevant data chunks are never loaded. With several Java implementations and a C++ implementation, Parquet is also well suited to exchanging data between different technology stacks.
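
The idea of skipping irrelevant chunks can be illustrated with a minimal sketch in plain Python. It is not the Parquet API or file layout: `ColumnChunk`, `write_chunks`, and `scan_greater_than` are hypothetical names, and the only assumption is that each chunk carries minimum and maximum statistics recorded when it was written.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ColumnChunk:
    """Hypothetical stand-in for one stored chunk of a single column."""
    values: List[int]
    min_value: int  # statistic recorded at write time
    max_value: int  # statistic recorded at write time


def write_chunks(values: List[int], chunk_size: int) -> List[ColumnChunk]:
    """Split a column into fixed-size chunks and record min/max per chunk."""
    chunks = []
    for start in range(0, len(values), chunk_size):
        part = values[start:start + chunk_size]
        chunks.append(ColumnChunk(part, min(part), max(part)))
    return chunks


def scan_greater_than(chunks: List[ColumnChunk], threshold: int) -> Tuple[List[int], int]:
    """Return the values above threshold and the number of chunks actually read."""
    matches: List[int] = []
    chunks_read = 0
    for chunk in chunks:
        # The statistics alone prove that no value here can match: skip the chunk.
        if chunk.max_value <= threshold:
            continue
        chunks_read += 1
        matches.extend(v for v in chunk.values if v > threshold)
    return matches, chunks_read


chunks = write_chunks(list(range(100)), chunk_size=25)
matches, chunks_read = scan_greater_than(chunks, threshold=80)
print(len(matches), chunks_read)  # prints "19 1"
```

In this toy run only one of the four chunks is read, because the statistics of the other three rule them out before any values are touched.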

This talk gives a general introduction to the format and its techniques. It explains their benefits and some of the inner workings to give a better understanding of how Parquet achieves its performance. It closes with benchmarks comparing the new C++ & Python implementation with other formats.

Speakers
Uwe L. Korn

Data Scientist, Blue Yonder GmbH
Uwe Korn is a Data Scientist at the German RetailTec company Blue Yonder. His expertise is in building architectures for machine learning services that scale across multiple customers, with a focus on high service availability, and in rapidly prototyping solutions to evaluate the feasibility of his design decisions. As part of his work on efficient data interchange, he became a core committer to the Apache Parquet project.


Wednesday November 16, 2016 13:00 - 13:50
Nervion/Arenal I
