Analytics, Bernie Najlis

Named Entity Recognition from Online News

Posted on May 27, 2018 by bnajlis

This is a project from the Natural Language Processing course in my Masters in Data Science program. The project aimed to create a series of models for the extraction of Named Entities (People, Locations, Organizations, Dates) from news headlines obtained online. We created two models: a traditional Natural Processing Language Model using Maximum Entropy , and a Deep Neural Network Model using pre-trained word embeddings. Accuracy results of both models show similar performance, but the requirements and limitations of both models are different and can help determine what type of model is best suited for each specific use case.

The final conclusion is that, as the Deep Learning Model is less dependent on specific language grammar rules, it is more generalizable (given embeddings and some labeled corpora is provided in any language) whereas the Maximum Entropy model will perform poorly on an language where there is no Domain Knowledge to create the required features.

This is our deck for the final presentation:

This is our final report / paper with our results and conclusion:

All source code for this project can be found in this GitHub repository: https://github.com/bnajlis/named_entity_recognition

Introduction to KNIME

Posted on January 23, 2017 by bnajlis

KNIME is one of the many open source data analytics and blending tools available for free online.

This is a very basic presentation about KNIME I did at one of the labs as part of a Data Mining course in the Masters in Data Science and Analytics program at Ryerson University. The tool is really great and I ended up using it as the main analytics tool to deliver the final project for the same course.