Twitter Analysis to Uncover Communities: Implementing the Louvain Algorithm on a Neo4j Graph Database

Pradhyumna Reddy Madhulapally
Apr 25, 2023


With millions of users posting every day, Twitter is one of the most widely used social media sites in the world. The platform has become an effective resource for researching social networks and examining user activity. In this article, we will look at how to apply the Louvain algorithm to discover Twitter communities using a Neo4j graph database.

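The Louvain algorithm is a common technique for identifying communities in large networks. It works by repeatedly improving the network's modularity score, which measures how densely nodes are connected inside communities compared with what a random network would produce. The method starts with every node in its own community, moves nodes into neighbouring communities whenever that raises modularity, then collapses each community into a single node and repeats the process until the modularity score cannot be increased any more.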
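
For reference, the modularity score that Louvain optimises is commonly written as follows for an undirected graph. This is the standard textbook definition, included here as background; it is not taken from the dataset or from Neo4j's implementation, which may use a weighted or directed variant.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
```

Here $A_{ij}$ is 1 if nodes $i$ and $j$ are connected and 0 otherwise, $k_i$ is the degree of node $i$, $m$ is the total number of edges, $c_i$ is the community of node $i$, and $\delta(c_i, c_j)$ is 1 when both nodes are in the same community and 0 otherwise. A higher $Q$ means more edges fall inside communities than would be expected by chance.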

Before applying the Louvain algorithm to the Twitter data, we first need to build a graph database in Neo4j. We do this by importing our Twitter dataset, which contains 81,306 nodes and 1,768,149 edges. Each Twitter user is represented as a node, and each link between users (such as a follower-following relationship) is represented as an edge.
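
As a rough sketch of what such an import could look like, the Cypher below loads an edge list with `LOAD CSV`. Several things here are assumptions rather than details from the dataset: the file name `twitter_edges.csv`, its `source,target` header row, the `id` property, and the reading of each row as "source follows target". The `User` label and `FOLLOWS` relationship type match the query used later in this article. The syntax targets a Neo4j 3.5-era server, which is what the `algo.*` procedures require; newer versions use different constraint and batching syntax.

```cypher
// Assumption: users are identified by a hypothetical `id` property.
// The uniqueness constraint also creates the index that keeps MERGE fast.
CREATE CONSTRAINT ON (u:User) ASSERT u.id IS UNIQUE;

// Assumption: twitter_edges.csv sits in Neo4j's import directory
// and starts with the header row "source,target".
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM 'file:///twitter_edges.csv' AS row
MERGE (a:User {id: row.source})
MERGE (b:User {id: row.target})
MERGE (a)-[:FOLLOWS]->(b);
```

Using `MERGE` rather than `CREATE` means a user who appears in many rows still ends up as a single node, and a repeated row does not produce a duplicate relationship.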

The Louvain algorithm can be run in Neo4j with the Neo4j Graph Algorithms library, which offers a built-in implementation of the method. Using the Cypher query language, we can run the algorithm on our Twitter graph and locate its communities. Once the communities have been identified, we can visualise them with tools such as Gephi or the Neo4j Browser to learn more about their structure and connections.

With the data imported into Neo4j, we can use the Louvain algorithm to find communities in the network. To do this, we write a Cypher query that runs the algorithm over our graph. The query could look like this:

CALL algo.louvain('User', 'FOLLOWS', {
  write: true,
  writeProperty: 'community'
})
YIELD nodes, communities, iterations, loadMillis, computeMillis, writeMillis;

This query uses the ‘User’ label for nodes and the ‘FOLLOWS’ relationship type for edges to apply the Louvain algorithm to our graph. Additionally, it updates each node’s ‘community’ property with the community assignments.
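
The `algo.louvain` procedure above belongs to the Neo4j Graph Algorithms library, which has since been superseded by the Graph Data Science (GDS) library. On a GDS 2.x installation, a roughly equivalent run might look like the sketch below. The graph name `twitter` is a hypothetical choice, and the projection assumes the same `User` label and `FOLLOWS` relationship type as above.

```cypher
// Project the stored graph into GDS's in-memory catalog.
// 'twitter' is a hypothetical name for the projection.
CALL gds.graph.project('twitter', 'User', 'FOLLOWS');

// Run Louvain and write each node's community id back to the database.
CALL gds.louvain.write('twitter', { writeProperty: 'community' })
YIELD communityCount, modularity, ranLevels;
```

The main difference is the explicit projection step: GDS runs algorithms on a named in-memory graph instead of taking the label and relationship type directly in the algorithm call.
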
After running the Louvain algorithm, we can explore the communities it found in our network. Cypher queries can tell us the size of each community, the most influential members within each community, and the relationships between communities.
For instance, the following query returns each community's size:

MATCH (u:User)
RETURN u.community AS community, count(*) AS size
ORDER BY size DESC;

This query returns a list of communities and their sizes, sorted from largest to smallest. With this data, we can see which Twitter communities are the largest and pick the ones worth examining more closely.
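
The other two questions mentioned earlier, influential members and links between communities, can be approached in a similar way. The queries below are a sketch under some assumptions: users carry a hypothetical `id` property, `(a)-[:FOLLOWS]->(b)` means that a follows b, and the number of followers inside a user's own community is used as a simple stand-in for influence.

```cypher
// Most-followed member of each community, counting only
// followers who belong to the same community.
MATCH (u:User)<-[:FOLLOWS]-(f:User)
WHERE f.community = u.community
WITH u.community AS community, u, count(f) AS followers
ORDER BY followers DESC
WITH community, head(collect({user: u.id, followers: followers})) AS top
RETURN community, top.user AS user, top.followers AS followers
ORDER BY followers DESC
LIMIT 10;

// Number of FOLLOWS relationships that cross from one community to another.
MATCH (a:User)-[:FOLLOWS]->(b:User)
WHERE a.community <> b.community
RETURN a.community AS fromCommunity, b.community AS toCommunity, count(*) AS links
ORDER BY links DESC
LIMIT 20;
```

Other measures of influence, such as PageRank, could be swapped in for the follower count; the in-community count is only the simplest option.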

In conclusion, applying the Louvain algorithm to a Neo4j graph database provides an effective way to analyse Twitter data and identify communities inside the network. With this technique, we can find the largest Twitter communities, investigate their characteristics, and learn more about how users connect within them. These insights can be valuable for businesses, scholars, and politicians given the growing significance of social media in our daily lives.
