Apache: Big Data Europe 2016
Monday, November 14 • 16:30 - 17:20
Interactive Analytics at Scale in Apache Hive Using Druid - Jesús Camacho Rodríguez, Hortonworks

Druid is an open-source analytics data store designed to execute OLAP queries on event data. Its speed, scalability, and efficiency have made it a popular choice for powering user-facing analytic applications. However, it lacks important features that many of these applications require, such as a SQL interface and support for complex operations such as joins. This talk presents our work on extending Druid's indexing and querying capabilities using Apache Hive. In particular, our solution lets users index complex query results in Druid using Hive, query Druid data sources from Hive using SQL, and execute complex Hive queries on top of Druid data sources. We describe how we built an extension that benefits both systems, leveraging Apache Calcite to overcome the challenge of transparently generating Druid JSON queries from the input Hive SQL queries.

Speakers
Jesús Camacho Rodríguez

Member of Technical Staff, Hortonworks
Jesús Camacho Rodríguez is a Member of Technical Staff at Hortonworks, the PMC chair of Apache Calcite, and a PMC member of Apache Hive. His current work focuses on extending and improving query processing and optimization, ensuring that the increasingly complex workloads supported...



Monday November 14, 2016 16:30 - 17:20 CET
Nervion/Arenal II/III