
07:00 CET
08:30 CET
09:30 CET
09:45 CET
10:10 CET
10:25 CET
11:00 CET
12:00 CET
Data Science with Spark and Case Study with Non-Motorized Travel Social Data for the Public - Yi Fan Zhang, IBM Giralda III/IV
Geospatial Track: Geospatial Big Data: Software Architectures and the Role of APIs in Standardized Environments - Ingo Simonis, Open Geospatial Consortium (OGC) Carmona
Graph Processing with Apache Tinkerpop on Apache S2Graph - Doyung Yoon, Kakao Corp. Giralda VI/VII
An Overview on Optimization in Apache Hive: Past, Present, Future - Jesús Camacho Rodríguez, Hortonworks Nervion/Arenal II/III
Machine Learning in Apache Zeppelin - Alexander Bezzubov, NF Labs Santa Cruz
Managing Deeply Nested Documents in Apache Solr - Anshum Gupta, IBM Watson Giralda V
Building a Scalable Recommendation Engine with Apache Spark, Apache Kafka and Elasticsearch - Nick Pentreath, IBM Giralda I/II
Property-based Testing for Spark Streaming - Adrian Riesco, Universidad Complutense de Madrid Nervion/Arenal I
13:00 CET
Uber - Your Realtime Data Pipeline is Arriving Now! - Ankur Bansal, Uber Giralda III/IV
Geospatial Track: Crowd Learning for Indoor Navigation - Thomas Burgess, indoo.rs GmbH Carmona
Apache S2Graph (incubating) as a User Event Hub - Hyunsung Jo, Daewon Jeong & Hwansung Yu, Kakao Corp. Giralda VI/VII
Hadoop, Hive, Spark and Object Stores - Steve Loughran, Hortonworks Nervion/Arenal II/III
Introducing Apache Apex: Next Gen Big Data Processing on Hadoop - Thomas Weise, DataTorrent Nervion/Arenal I
Distributed In-Database Machine Learning with Apache MADlib (incubating) - Roman Shaposhnik, Pivotal Santa Cruz
Fast & Scalable Email System with Apache Solr - Strategies, Tradeoffs and Optimizations - Arnon Yogev, IBM Research Giralda V
Building Apache Spark Application Pipelines for the Kubernetes Ecosystem - Michael McCune, Red Hat Giralda I/II
13:50 CET
15:30 CET
Fighting Identity Theft: Big Data Analytics to the Rescue - Seshika Fernando, WSO2 Giralda III/IV
Performance Monitoring for the Cloud - Werner Keil, Agile Coach Santa Cruz
Processing Planetary Sized Datasets - Tim Park, Microsoft Carmona
Moven: Machine/Deep Learning Models Distribution Relying on the Maven Infrastructure - Sergio Fernandez, Redlink GmbH Nervion/Arenal II/III
Large Scale SolrCloud Cluster Management via APIs - Anshum Gupta, IBM Watson Giralda V
Open Source Operations: Building on Apache Spark with InsightEdge, TensorFlow, Apache Zeppelin, and/or Apache - Samuel Cozannet, Canonical Giralda VI/VII
Scalable Data Science in R and Apache Spark 2.0 - Felix Cheung, Committer Giralda I/II
Streaming Report: Functional Comparison and Performance Evaluation - Huafeng Wang, Intel Nervion/Arenal I
16:30 CET
How Big Data/IoT Leverage the Power of OpenSource to Solve Healthcare Use Cases - Manidipa Mitra, ValueLabs Giralda III/IV
Interactive Analytics at Scale in Apache Hive Using Druid - Jesús Camacho Rodríguez, Hortonworks Nervion/Arenal II/III
Integrators at Work! Real-Life Applications of Apache Big Data Components - Moderated by Phil Archer, W3C Giralda I/II
SystemML - Declarative Machine Learning - Luciano Resende, IBM Santa Cruz
ETL Pipelines with OODT, Solr and Stuff - Tom Barber, Meteorite Consulting Giralda V
Deep Neural Network Regression at Scale in Spark MLlib - Jeremy Nixon, Spark Technology Center Giralda VI/VII
Myriad, Spark, Cassandra, and Friends - Big Data Powered by Mesos - Jörg Schad, Mesosphere Carmona
Real Time Aggregates in Apache Calcite -- Optimal Use of your Streaming Data - Atri Sharma, Microsoft Nervion/Arenal I
17:30 CET