Using NLP to discover hot topics on social networks


Case 1 - problem
The volume of data on social platforms grows every day.

How can we use this data to discover what is happening on Facebook or Twitter every day, or even every hour?

We can apply topic modeling to this problem:

– INPUT: A dataset of Facebook posts from the most popular fanpages in Vietnam at a specific time.

– OUTPUT: The hottest topics on Vietnamese Facebook at that time, together with their keywords.


Solution social network

To solve this problem, we use the Latent Dirichlet Allocation (LDA) topic modeling technique.

LDA is a probabilistic model that discovers the distribution of topics in a corpus (a set of documents). It treats each document as a mixture of topics and each topic as a distribution over words.
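To make the idea concrete, here is a minimal sketch of LDA fitted with collapsed Gibbs sampling, written in plain Python. It is not the pipeline used for this case study: the function names (`fit_lda`, `top_keywords`), the hyperparameters `alpha` and `beta`, and the tiny English sample posts are all assumptions chosen for illustration. A real project would more likely use an existing library implementation and a Vietnamese tokenizer.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Weight of topic t is proportional to
                # (n_dt + alpha) * (n_tw + beta) / (n_t + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic distributions.
    theta = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    return theta, topic_word


def top_keywords(topic_word, n=3):
    """Return the n most frequent words assigned to each topic."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]


# Hypothetical, already-tokenized posts (illustrative data only).
posts = [
    "football match goal team football".split(),
    "team coach football goal match".split(),
    "pho noodle recipe soup beef".split(),
    "recipe soup noodle kitchen pho".split(),
]

theta, topic_word = fit_lda(posts, num_topics=2)
keywords = top_keywords(topic_word)
for k, words in enumerate(keywords):
    print(f"topic {k}: {words}")
```

Each row of `theta` is one post's topic mixture, and `keywords` gives the words most often assigned to each topic. This mirrors the OUTPUT described above: a set of topics plus their keywords.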


In October 2019, we found that these were the hottest topics among Vietnamese Facebook users:

case 1 - Result1

We can also represent the fanpages as vectors and compare them to find similarities:

case 1 - result2

Here is a comparison between pages based on their content in October 2019. The bluer a square is, the more similar the two fanpages are:

case 1 - result3
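A pairwise comparison like this can be sketched in a few lines of Python. The text does not say how the fanpages were vectorized or which similarity measure was used, so both choices here are assumptions: a simple bag-of-words vector and cosine similarity. The page names and texts are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words vector: word -> count (an assumed representation)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical fanpages with their concatenated post text.
pages = {
    "page_sports": "football match goal football team",
    "page_match_news": "football match score tonight",
    "page_food": "pho recipe noodle soup",
}

vectors = {name: vectorize(text) for name, text in pages.items()}
names = sorted(pages)

# Square similarity matrix; each cell corresponds to one square in a heatmap.
matrix = [
    [cosine_similarity(vectors[row], vectors[col]) for col in names]
    for row in names
]

for name, row in zip(names, matrix):
    print(name, [round(value, 2) for value in row])
```

The same `cosine_similarity` function would work unchanged on other vector representations, for example each page's topic proportions from a topic model, as long as they are stored as word-to-weight (or topic-to-weight) mappings.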


With a proper crawler and an improved topic modeling technique, we can track the hot topics on Facebook or any other social network every hour, or even every second.