Name: Massively Parallel Data Warehousing in the Hadoop Stack - Gregory Chase & Roman Shaposhnik, Pivotal
Start: 2016-11-15T13:00:00+0100
End: 2016-11-15T13:50:00+0100

Apache: Big Data Europe 2016

Massively Parallel Data Warehousing in the Hadoop Stack - Gregory Chase & Roman Shaposhnik, Pivotal

Hadoop has been touted as a replacement for data warehouses. In practice, Hadoop has succeeded at offloading ETL/ELT workloads but still has gaps in serving operational analytics requirements.

Apache Bigtop now includes Greenplum Database in its deployment of big data solutions. Greenplum Database is an open source, massively parallel data warehouse based on PostgreSQL, and it is an excellent addition to the Hadoop ecosystem.

In this session we'll cover:

Introduction to Greenplum
Bigtop Support for Greenplum
External tables in Hadoop by Greenplum
Parallel reads and writes to Hadoop by Greenplum
Running advanced analytics on structured and unstructured data in both Hadoop and Greenplum via Apache MADlib (incubating)
Geospatial and Machine Learning in Greenplum based on HDFS data
Storing data from a data lake in Greenplum for high throughput analytical queries

Speakers

Gregory Chase

Director Product Marketing, PagerDuty

Greg Chase is Director of Product Marketing for PagerDuty Automation and Rundeck. He's been in marketing and engineering in software companies for too many decades, evangelizing and building automation platforms, developer tools and data engineering frameworks. Before PagerDuty, Greg...

Roman Shaposhnik

Director of Open Source, Linux Foundation

Apache Software Foundation and Data, oh but also unikernels

Tuesday November 15, 2016 13:00 - 13:50 CET
Nervion/Arenal I
