Apache: Big Data Europe 2016
Monday, November 14 • 12:00 - 12:50
An Overview on Optimization in Apache Hive: Past, Present, Future - Jesús Camacho Rodríguez, Hortonworks

Apache Hive has been continuously evolving to support a broad range of use cases, moving beyond its batch processing roots to its current support for interactive queries with LLAP. However, improving its execution internals alone is not sufficient to guarantee efficient performance, since poorly optimized queries can create a bottleneck in the system. Hence, each release of Hive has included new optimizer features aimed at generating better plans and improving query execution. In this talk, we present the development of the optimizer since its initial release. We describe its current state and how Hive leverages the latest Apache Calcite features to generate the most efficient execution plans. We show numbers demonstrating the improvements brought to Hive performance, and we discuss future directions for the next-generation Hive optimizer.

Speakers
Jesús Camacho Rodríguez

Member of Technical Staff, Hortonworks
Jesús Camacho Rodríguez is a Member of Technical Staff at Hortonworks, the PMC chair of Apache Calcite, and a PMC member of Apache Hive. His current work focuses on extending and improving query processing and optimization, ensuring that the increasingly complex workloads supported...



Monday November 14, 2016 12:00 - 12:50 CET
Nervion/Arenal II/III