Apache: Big Data Europe 2016
Monday, November 14 • 15:30 - 16:20
Processing Planetary Sized Datasets - Tim Park, Microsoft

In my group at Microsoft, we have worked with the United Nations, Guide Dogs for the Blind in the UK, several automotive companies, and Ströer on a number of projects involving high-scale geospatial data.

In this talk, I'll share some of the best practices and patterns that have come out of those experiences: storing and indexing geospatial data at scale, incremental ingestion and slice processing of the data, and efficiently building and presenting progressive levels of detail.

The audience will walk away with an understanding of how to efficiently summarize data over a geographic area, general methods for ingestion with Apache Kafka (or other event ingestion systems), incremental updates to large-scale datasets with Apache Spark, and best practices for visualizing this data on the frontend.

Tim Park

Software Engineer, Microsoft
Tim is a Principal Software Engineer at Microsoft and works with customers and partners to help them use open source platforms on Microsoft’s Azure cloud. He focuses on big data, especially processing large-scale geospatial data. His project experience...

Monday November 14, 2016 15:30 - 16:20 CET