Apache: Big Data Europe 2016
Tuesday, November 15 • 15:30 - 16:20
AMIDST Toolbox: A Java Toolbox for Scalable Probabilistic Machine Learning - Andres Masegosa, NTNU

We present AMIDST, our open-source toolbox for analyzing large-scale data sets with probabilistic machine learning models. AMIDST runs distributed algorithms for learning a wide range of latent variable models, such as Gaussian mixtures, (probabilistic) principal component analysis, hidden Markov models, Kalman filters, and Latent Dirichlet Allocation. The toolbox can learn any user-defined probabilistic (graphical) model with billions of nodes using novel message passing algorithms.

We plan to give an overview of the AMIDST toolbox, some details about its API and its integration with Flink, Spark, and other open-source tools, and an analysis of the scalability of our learning algorithms. All of this is presented in the context of a real use case in the financial domain (BCC group), where millions of customer profiles are analyzed.

Andres Masegosa

I am a research fellow at NTNU (Norway) with broad interests in data mining and machine learning using probabilistic graphical models. Lately, my research has focused on scalable machine learning methods for solving real use cases in the financial (BCC group) and automotive industries...

Tuesday November 15, 2016 15:30 - 16:20 CET