Apache: Big Data Europe 2016
Wednesday, November 16 • 11:00 - 11:50
Why is My Hadoop Cluster Slow? - Steve Loughran, Hortonworks

Apache Hadoop is used to run jobs that execute tasks over multiple machines with complex dependencies between tasks. At scale, there can be 10s to 1000s of tasks running over 100s to 1000s of machines, which increases the challenge of making sense of their performance. Pipelines of such jobs that logically run a business workflow add another level of complexity. No wonder that the question of why Hadoop jobs run slower than expected remains a perennial source of grief for developers. In this talk, we will draw on our experience in debugging and analyzing Hadoop jobs to describe some methodical approaches to this problem, and present current and new tracing and tooling ideas that can help semi-automate parts of it.

Steve Loughran

Member of Technical Staff, Hortonworks
Steve Loughran is a developer at Hortonworks, where he works on leading-edge Hadoop applications, most recently on Apache Slider and on Apache Spark's integration with Hadoop and YARN, and Hadoop's S3A connector to Amazon S3. He's the author of Ant in Action, a member of the Apache...

Wednesday November 16, 2016 11:00 - 11:50 CET