Apache: Big Data Europe 2016
Tuesday, November 15 • 15:30 - 16:20
Classifying Unstructured Text - Deterministic and Machine Learning Approaches - Christian Winkler & Stephanie Fischer, mgm Technology Partners GmbH

Text is one of the most widely used forms of communication and is ubiquitous on the Internet. Social networks like Facebook and Twitter mainly contain unstructured text; the same is true for content-driven websites.

For humans it is easy to grasp the meaning of text; for computers it is much more difficult. Used correctly, however, computers can help humans tremendously in structuring and classifying huge amounts of text. This "symbiosis" can help humans work more efficiently, reduce repetitive work, and make use of the uncovered structure.

Our talk starts with visualizations that give us ideas for how to classify texts automatically. Then we will demonstrate that manual intervention is sometimes necessary and how it can be used as a basis for machine learning. This helps significantly in classifying more complicated cases.

As software tools we use R, Apache Solr, D3.js, and several NLP and ML tools from the ASF.

Speakers
Stephanie Fischer

Big Data, Agile and Change Management, mgm consulting partners
I concentrate on the user-centricity of Big Data technologies. My focus is finding the questions really worth solving. I think Big Data has the potential to advance humanity in a desirable direction. I have a background in organizational development, agility and business analytics...

Christian Winkler

Enterprise architect, mgm technology partners GmbH
Christian has worked with Internet technologies for 20 years. Recently, he has focused on working with large amounts of data or many users. As big data applications become more and more popular, lots of applications evolve. Many aggregates have to be calculated to describe characteristics...


Tuesday November 15, 2016 15:30 - 16:20 CET
Giralda V