Novel machine learning models for text mining and human mobility analytics


  • Data Science Colloquium


  • 2017/2018


Stan Matwin, Institute for Big Data Analytics, Dalhousie University


Wednesday, February 28, 2018 - 13:00 to 14:30


Sala Gerace, Dipart. Informatica, University of Pisa, Edificio C, Polo Fibonacci, Largo Pontecorvo 3, Pisa, Italy

For a number of years there was a general consensus in Natural Language Processing (NLP) and in Machine Vision that the existing representations were not adequate for more semantically challenging tasks. These classical representations, dating from the 1970s, by and large ignored word context. In NLP, the standard Bag of Words representation seemed to have reached its limits in many tasks, such as emotional valuation, word sense disambiguation, or word similarity. The 2013 paper by Mikolov introducing the learned, embedding-based word2vec (w2v) representation was therefore a breakthrough that significantly advanced the entire area of text analytics. In this presentation I will discuss an algebraic (as opposed to neural network) approach to learning embeddings. I will present our recent work on a novel way of using negative examples to obtain embeddings for text data in such an algebraic framework. I will also discuss our new approach to obtaining and using embeddings that contextualize human mobility data. I will wrap up with some work in progress on Deep Learning from vessel mobility data, and some general thoughts on Deep Learning.

Stan Matwin is the Director of the Institute for Big Data Analytics at Dalhousie University, Canada, where he is a Professor and Canada Research Chair. He is also a Distinguished Professor at the University of Ottawa, a State Professor at the Polish Academy of Sciences, and a member of the Board of the Data Science PhD. His main research results are in big data with a focus on ocean data, in text mining, in machine learning, and in data privacy. Stan Matwin is a member of the Editorial Boards of IEEE TKDE and JIIS. He was the General Chair of KDD 2017 in Halifax, Canada, and is the General Chair of Document Engineering 2018.
