Editor’s note: Tamilla is a speaker for ODSC East this May. Be sure to check out her talk, “Graph Viz: Exploring, Analyzing and Visualizing Graphs and Networks with Gephi and ChatGPT,” there!
Analyzing the collaboration between authors of scientific papers can provide valuable insights into research communities’ structure and individual paper impact. In this post, I will describe the process of analyzing a global AI research community network using Gephi and ChatGPT.
In graph theory, a graph is a collection of objects, called nodes, in which pairs of objects are connected by links called edges. Graph visualization finds applications in various fields, such as computer science, social network analysis, biology, and business. Here is an example of graphs with nodes representing people or objects and edges indicating connections between them.
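The same idea can be written in a few lines of Python. This is only a minimal sketch: the names `nodes`, `edges`, and `neighbors`, and the three sample people, are made up for illustration and are not part of the dataset used later in this post.

```python
# Hypothetical example: three people and the connections between them.
nodes = {"Ana", "Ben", "Chloe"}
edges = [("Ana", "Ben"), ("Ben", "Chloe")]  # each pair is one undirected edge


def neighbors(node, edges):
    """Return the set of nodes that share an edge with `node`."""
    linked = set()
    for a, b in edges:
        if a == node:
            linked.add(b)
        elif b == node:
            linked.add(a)
    return linked


print(sorted(neighbors("Ben", edges)))  # ['Ana', 'Chloe']
```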
Over the past few years, there has been a surge of interest in AI research worldwide, driven by the explosion of data, the availability of powerful computing resources, and the rapid advancements in machine learning algorithms. In this post, we will explore the present trends of global AI research collaboration to identify the countries with the highest number of AI-related research publications and the regions with the most frequent collaborations.
Step 1: Obtain Data
OECD.AI publishes real-time data on the location, methods, rate, and sectors in which AI is being developed and used. The dataset “Domestic and international collaboration in AI publications” contains data on international collaboration in AI scientific publications. Each publication is assigned to one or more countries based on the affiliations of its authors. For this post, I selected AI collaboration data for 2022.
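The exact layout of the OECD.AI export is not shown in this post, so the sketch below assumes a hypothetical table with one row per country pair and a count of shared publications. It reshapes such rows into an edge list with `Source`, `Target`, and `Weight` columns, the headers Gephi's spreadsheet importer recognizes for edge tables. The sample rows and counts are invented placeholders, not OECD.AI figures, and both summing the two directions of a pair and dropping domestic rows are assumptions that may not suit the real data.

```python
import csv
import io

# Hypothetical input: (country_a, country_b, shared_publications).
# These rows are invented placeholders, not real OECD.AI values.
raw_rows = [
    ("France", "Germany", 120),
    ("Germany", "France", 30),   # same pair listed in the other direction
    ("Japan", "France", 45),
    ("Japan", "Japan", 500),     # domestic collaboration, skipped below
]


def build_edge_list(rows):
    """Merge both directions of each country pair into one undirected edge."""
    weights = {}
    for a, b, count in rows:
        if a == b:
            continue  # assumption: domestic rows are left out of the network
        pair = tuple(sorted((a, b)))
        weights[pair] = weights.get(pair, 0) + count
    return [(a, b, w) for (a, b), w in sorted(weights.items())]


def to_gephi_csv(edge_list):
    """Serialize the edges as CSV text with Source, Target, Weight headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edge_list)
    return buffer.getvalue()


edge_list = build_edge_list(raw_rows)
print(to_gephi_csv(edge_list))
```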
Step 2: Visualize the Graph
To visualize the graph, I used the open-source platform Gephi. Gephi offers a variety of robust tools and layouts for generating intricate, visually appealing, and customizable network visualizations. After uploading data to Gephi, the initial graph visualization appeared like this:
Step 3: Apply a Layout
To enhance the visualization, I applied the ForceAtlas2 layout. This algorithm pulls nodes that have strong connections closer and pushes those with weaker connections apart. The resulting graph reveals that there is substantial activity between countries located in the center of the visual, while those on the edges have comparatively less activity.
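To make the push-and-pull idea concrete, here is a toy force-directed layout in plain Python. It is a simplified sketch in the spirit of ForceAtlas2, not Gephi's implementation: every pair of nodes repels with a force proportional to (degree + 1) of both nodes divided by their distance, and each edge attracts its endpoints in proportion to its weight times their distance. The function name `force_layout` and the parameters `step`, `max_move`, and `seed` are assumptions chosen for this example.

```python
import itertools
import math
import random


def force_layout(nodes, edges, iterations=500, step=0.02, max_move=0.5, seed=0):
    """Toy force-directed layout: all nodes repel, connected nodes attract."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    degree = {n: 0 for n in nodes}
    for a, b, _ in edges:
        degree[a] += 1
        degree[b] += 1

    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair, scaled by (degree + 1) of both nodes.
        for a, b in itertools.combinations(nodes, 2):
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            push = (degree[a] + 1) * (degree[b] + 1) / dist
            force[a][0] += push * dx / dist
            force[a][1] += push * dy / dist
            force[b][0] -= push * dx / dist
            force[b][1] -= push * dy / dist
        # Attraction along each edge, proportional to weight times distance.
        for a, b, weight in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += weight * dx
            force[a][1] += weight * dy
            force[b][0] -= weight * dx
            force[b][1] -= weight * dy
        # Move each node along its net force, capping the displacement.
        for n in nodes:
            fx, fy = force[n]
            magnitude = math.hypot(fx, fy)
            if magnitude > 0:
                scale = min(step * magnitude, max_move) / magnitude
                pos[n][0] += fx * scale
                pos[n][1] += fy * scale
    return pos


def distance(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Hypothetical three-node network: A-B is a strong tie, A-C a weak one.
layout = force_layout(["A", "B", "C"], [("A", "B", 5.0), ("A", "C", 1.0)])
print(distance(layout["A"], layout["B"]) < distance(layout["A"], layout["C"]))
```

In this small example the strongly tied pair should settle closer together than the weakly tied pair, which is the same effect that draws heavily collaborating countries toward the center of the Gephi visual.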
Step 4: Add Size and Color to the Nodes
To add information to the graph, I sized the nodes by degree and colored them by modularity. In graph theory, degree is the number of edges connected to a given node in a network. In a social network, a higher degree indicates a greater number of connections, which in our analysis implies more collaborations between authors. Larger nodes therefore indicate more collaborations than smaller ones.
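Degree-based sizing can be sketched in a few lines. The snippet below counts the edges touching each node and then rescales those counts linearly into a size range. The country pairs are placeholders rather than real collaboration data, and the helper names and the 10 to 50 size range are assumptions, not the settings used in Gephi for this post.

```python
from collections import Counter

# Hypothetical collaboration edges between countries (placeholders).
edges = [
    ("France", "Germany"),
    ("France", "Japan"),
    ("Germany", "Japan"),
    ("France", "Brazil"),
]


def degree_of(edges):
    """Count how many edges touch each node."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


def node_sizes(degrees, min_size=10.0, max_size=50.0):
    """Linearly rescale degrees into the range [min_size, max_size]."""
    low, high = min(degrees.values()), max(degrees.values())
    span = (high - low) or 1  # avoid dividing by zero when all degrees match
    return {
        node: min_size + (deg - low) / span * (max_size - min_size)
        for node, deg in degrees.items()
    }


degrees = degree_of(edges)
sizes = node_sizes(degrees)
print(degrees["France"], sizes["France"])  # 3 50.0
```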
Modularity is a measure used to identify closely knit communities of nodes within the network. In this analysis, there are three distinct communities of authors based on this metric.
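For readers who want to see what the metric computes, the sketch below scores a given partition of an unweighted, undirected graph using the standard modularity formula: for each community, the fraction of edges that fall inside it minus the fraction expected from its total degree. It only scores a partition that is handed to it; Gephi's modularity tool also searches for a good partition, which this example does not attempt. The six-node graph and the function name `modularity` are illustrative assumptions.

```python
def modularity(edges, community):
    """Score a partition: Q = sum over communities of L_c/m - (d_c / 2m)^2."""
    m = len(edges)
    internal = {}   # L_c: edges with both ends inside community c
    total_deg = {}  # d_c: sum of the degrees of the nodes in community c
    for a, b in edges:
        ca, cb = community[a], community[b]
        total_deg[ca] = total_deg.get(ca, 0) + 1
        total_deg[cb] = total_deg.get(cb, 0) + 1
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in total_deg.items()
    )


# Hypothetical graph: two triangles joined by a single bridge edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
split = {1: "left", 2: "left", 3: "left", 4: "right", 5: "right", 6: "right"}
merged = {node: "all" for node in split}

print(round(modularity(edges, split), 4))   # 0.3571
print(round(modularity(edges, merged), 4))  # 0.0
```

Splitting the graph along the bridge scores higher than lumping every node into one community, which is why the two triangles would be colored as separate groups.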
Step 5: Use ChatGPT to Generate Data
To accurately plot the countries on the map, we need precise coordinates such as latitude and longitude. I generated these coordinates using ChatGPT. The image below displays a sample of the generated data.
Prompt: Generate a table with latitude, longitude, continent, and subregion for countries in alphabetical order
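Because the coordinates come from a language model, a quick sanity check before loading them into Gephi seems worthwhile. The sketch below assumes the generated table has been parsed into dictionaries with `country`, `latitude`, and `longitude` keys (a hypothetical shape), flags coordinates outside the valid ranges, and lists countries that appear in the network but have no coordinates. The sample values are rough illustrations, not verified reference data.

```python
# Hypothetical rows in the shape requested by the prompt above.
# Coordinates are rough illustrative values, not verified reference data.
rows = [
    {"country": "France", "latitude": 46.0, "longitude": 2.0},
    {"country": "Japan", "latitude": 36.0, "longitude": 138.0},
    {"country": "Example", "latitude": 123.0, "longitude": 2.0},  # deliberately invalid
]


def invalid_rows(rows):
    """Return the countries whose coordinates fall outside the valid ranges."""
    bad = []
    for row in rows:
        lat_ok = -90.0 <= row["latitude"] <= 90.0
        lon_ok = -180.0 <= row["longitude"] <= 180.0
        if not (lat_ok and lon_ok):
            bad.append(row["country"])
    return bad


def missing_countries(network_countries, rows):
    """Return network countries that have no row in the coordinate table."""
    known = {row["country"] for row in rows}
    return sorted(set(network_countries) - known)


print(invalid_rows(rows))                                   # ['Example']
print(missing_countries(["France", "Japan", "Brazil"], rows))  # ['Brazil']
```

A range check like this only catches impossible values. It cannot tell whether a plausible-looking coordinate actually belongs to the named country, so spot-checking a few rows against a trusted source is still advisable.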
Step 6: Display Results on the Map
With the geolocation information at hand, I applied the GeoLayout in Gephi to plot the results on a world map. This visualization offers a comprehensive view of how AI researchers from various countries collaborate in their research efforts. Node sizes indicate the degree of collaboration, while node colors represent clusters of authors based on their collaborative patterns.
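A geographic layout works by converting each node's latitude and longitude into x and y positions on a flat canvas. The post does not state which projection was used, so the sketch below shows the Mercator projection as one common choice. The function name `mercator` and the `radius` scale parameter are assumptions for this example.

```python
import math


def mercator(latitude, longitude, radius=1.0):
    """Project degrees of latitude/longitude to planar (x, y) coordinates."""
    lam = math.radians(longitude)
    phi = math.radians(latitude)
    x = radius * lam
    # The y value grows without bound near the poles, so latitudes of
    # exactly +90 or -90 degrees should be avoided.
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y


x, y = mercator(45.0, 90.0)
print(round(x, 4), round(y, 4))  # 1.5708 0.8814
```

Nodes in the northern hemisphere get positive y values and nodes east of the prime meridian get positive x values, so the network ends up arranged like a familiar world map.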
In this post, I provided a brief overview of using Gephi and ChatGPT to create visual graphs. I will go into more detail on graph metrics, layouts, communities, and influencers in my upcoming talk at ODSC East 2023, titled Graph Viz: Exploring, Analyzing and Visualizing Graphs and Networks with Gephi and ChatGPT.
Source of data: OECD.AI (2023), Visualisations powered by JSI using data from OpenAlex, accessed on 22/3/2023, www.oecd.ai
Software: The post was prepared using Microsoft Office, ChatGPT, and Gephi.
About the author/ODSC East 2023 speaker:
Tamilla Triantoro is an Associate Professor of Computer Information Systems at Quinnipiac University and a leader of the Master’s Program in Business Analytics. She was previously an Academic Director of Data Analytics at the University of Connecticut. Dr. Triantoro is an author, speaker, researcher, and educator in the fields of artificial intelligence, data analytics, user experience with technology, and the future of work. She received her Ph.D. from the City University of New York, where she researched online user behavior. Dr. Triantoro presents her research around the world, attempting to demystify the complexity of today’s digital world and to make it understandable and relevant to business professionals and the general audience.