#FluxFlow: A visual tool to analyze “anomalous information” in social media

Thanks to the Social Media Analytics course I’m taking as part of my Masters in Data Science program, I found a very interesting paper about #FluxFlow that I had to summarize and present. #FluxFlow is an analytics data visualization tool that helps identify and understand how ‘anomalous’ information spreads in social media. In the context […]

Social Media Analytics: Bell Let’s Talk 2017

Two weeks ago I started the second semester of the Masters in Data Science program, and as part of it I am taking a course in Social Media Analytics. The first lab assignment for this course was on January 25, and the objective was to analyze the Bell Let’s Talk social media campaign. Using a proposed […]

Implementing a TF-IDF (term frequency-inverse document frequency) index with Python in Spark

Introduction As part of the final exam assignment for my Masters in Data Science course “DS8003 – Management of Big Data Tools”, I created a Big Data TF-IDF index builder and query tool. The tool consists of a script with functions to create a TF-IDF (term frequency-inverse document frequency) index and it is then used it […]

Analyzing Reddit Public Comments on Azure Data Lake and Azure Data Analytics (Part 1.5)

In the previous article in this series, I skipped the part where I downloaded the data. At first I used my laptop and a downloader to get the files locally, which I ended up uploading to the Azure Data Lake Store folders. Another alternative that I wanted to give a try and will show you in […]

Analyzing Reddit Public Comments on Azure Data Lake and Azure Data Analytics (Part 1)

With some free time on my hands in between Coursera courses and classes not starting for the next couple of weeks, I wanted to use some of the new Azure Data Lake services and build a Big Data analytics proof of concept based on a large public dataset. So I decided to create this series […]

Free Azure Machine Learning? Yes, Please!

With Azure Machine Learning being released to General Availability this week (Feb 18th, 2015), more interesting news comes to light. There are a couple of (somewhat confusing) options to try and use AzureML. Better to be informed before you jump in and register your account with Azure… AzureML Free Tier With GA, Microsoft decided to release […]

Azure Machine Learning: Data Mining 2.0

Azure Machine Learning (aka AzureML) is one of the new products/services in this bold new world of ‘cloud first, mobile first’ that Microsoft is pursuing. It helps you create predictive analytics from your data in a very quick and simple way, and easily integrate this with all your applications. And you can do that armed just […]

mongoDB – What’s great (and not so great) about it

mongoDB is a relatively new database management system, one of the prime examples of the No-SQL database movement (if such a thing exists). In No-SQL databases, which can also be referred to as ‘non-relational databases’, you don’t represent data as tables that store rows and their relations. Each No-SQL database has its own particular way of […]