Lesson 9: Visualising and Analysing Network Data

Dr. Kam Tin Seong
Assoc. Professor of Information Systems (Practice)

School of Computing and Information Systems,
Singapore Management University

20 May 2025

Content

  • Introduction to Graph Visual Analytics

  • Graph Visualisation in Action

  • Basic Principles of Graph

  • Network data sets

    • Graph data format
  • Network Visualisation and Analysis

    • Network visualisation and analysis process model
    • Graph layouts and visual attributes
    • Network metrics

Introduction to Graph Analytics

  • What is Graph Analytics?

  • Basic concepts of graph

  • Networks in the Real World

  • Graph Visualisation in Action

What is Graph Analytics?

  • The study and analysis of data that can be transformed into a graph representation consisting of nodes and links.

  • Analytic tools used to determine the strength and direction of relationships between objects in a graph.

  • The focus of graph analytics is on pairwise relationships between two objects at a time and on the structural characteristics of the graph as a whole.

What is Graph Analytics?

  • For example, in a graph representing relationships (such as “liking” or “friending” another individual’s profile or site) between individuals, graph analytics can help answer questions like the following:
    • How many other individuals does the average individual “friend” with?
    • What is the maximum number of “friends” any one individual has?
    • How interconnected are groups of users with one another?
    • How many “friend” relationships does it take to get from one user to another user?
    • Are there isolated groups of individuals who are connected to each other but not to anyone outside their group?
  • Applications of Graph Analytics include clustering, partitioning, search, shortest path solution, widest path solution, finding connected components, and PageRank.
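
The questions above can be sketched in a few lines of Python. This is a minimal illustration only: the `friends` dictionary and the helper names `hops` and `components` are hypothetical, not part of any library.

```python
from collections import deque

# Hypothetical undirected "friend" graph stored as an adjacency dictionary.
friends = {
    "Ann": {"Bob", "Cat"},
    "Bob": {"Ann", "Cat"},
    "Cat": {"Ann", "Bob", "Dan"},
    "Dan": {"Cat"},
    "Eve": {"Fay"},
    "Fay": {"Eve"},
}

def hops(graph, start, goal):
    """Fewest "friend" relationships between two users (None if unreachable)."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph[node] - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None

def components(graph):
    """Groups whose members connect to each other but to nobody outside."""
    seen, groups = set(), []
    for node in graph:
        if node in seen:
            continue
        group, stack = set(), [node]
        while stack:
            cur = stack.pop()
            if cur not in group:
                group.add(cur)
                stack.extend(graph[cur] - group)
        seen |= group
        groups.append(group)
    return groups

degrees = {person: len(peers) for person, peers in friends.items()}
average_degree = sum(degrees.values()) / len(degrees)  # average number of friends
max_degree = max(degrees.values())                     # most friends of any user
```

Here `hops(friends, "Ann", "Dan")` is 2, and `components(friends)` finds two isolated groups.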

Graph Analytics in History: Classical Graph Theory

The Seven Bridges of Königsberg is a historically notable problem in mathematics. Its negative resolution by Leonhard Euler in 1735 laid the foundations of graph theory and prefigured the idea of topology.
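
Euler's negative result can be checked with a degree count. A connected graph has a walk that uses every edge exactly once only if zero or two of its nodes have an odd degree. In the sketch below, the letters A to D are labels chosen here for the four land masses.

```python
from collections import Counter

# The seven bridges as edges between four land masses.
# A = the island, B and C = the two river banks, D = the eastern land mass
# (labels are assumptions made for this sketch).
bridges = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]

degree = Counter()
for u, v in bridges:
    degree[u] += 1
    degree[v] += 1

odd_nodes = [node for node, d in degree.items() if d % 2 == 1]

# A walk crossing every bridge exactly once needs 0 or 2 odd-degree nodes.
walk_possible = len(odd_nodes) in (0, 2)
```

All four land masses have an odd number of bridges, so `walk_possible` is `False`.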

Source: The Seven Bridges of Königsberg

Graph Analytics in History: Sociogram

A sociogram is a tool for charting the relationships within a group. It’s a visual representation of the social links and preferences that each person has – valuable data for leaders.

Source: Valdis Krebs (2010) “Your Choices Reveal Who You Are: Mining and Visualizing Social Patterns” in Beautiful Visualization.

Where are Graphs used

  • Graphs are sometimes used in surprising ways. There are many problems which may not initially appear to take the form of graphs but can be solved more quickly if they are transformed into a graph:

    • Partitioning large physical volumes into smaller physical volumes as part of high performance simulations on supercomputers.
    • Parsing speech to determine what is the most likely sequence of words that matches a given set of sounds.
    • Analyzing the way different parts of a complex software program interact in order to proactively find and remove bugs.

Basic Principles of Graph

  • Basic Graph

  • Directed and Undirected Graphs

  • Weighted Graph

  • Ego-centric Graph

  • Bipartite Graph

  • Multimodal Graph

A Complete Graph

A complete graph is a simple undirected graph in which every pair of distinct vertices (also known as nodes) is connected by a unique edge (also known as a link).
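
As a small sketch of this definition, the edges of a complete graph are all unordered pairs of nodes, so a graph with n nodes has n(n - 1)/2 edges. The function name `complete_graph_edges` is hypothetical.

```python
from itertools import combinations

def complete_graph_edges(nodes):
    """Return one edge for every pair of distinct nodes."""
    return list(combinations(nodes, 2))

nodes = ["a", "b", "c", "d", "e"]
edges = complete_graph_edges(nodes)

n = len(nodes)
expected_edge_count = n * (n - 1) // 2  # 10 edges for 5 nodes
```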

A Directed Graph

Edges in a directed graph have a clear origin and destination. They are also known as asymmetric edges. Directed graphs are suitable for representing networks with non-reciprocal relationships, such as Twitter.

Graph representation

On the left is a normal graph, in the centre is a graph in which each edge is given a numerical value, and to the right is a directed graph.

A weighted graph

  • A weighted edge carries a value that indicates the strength or frequency of the tie. For example, the number of calls between two staff members.

A weighted graph

  • Edges of different thickness are used to represent the monthly calls between staff members.
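
A minimal sketch of how such weights and thicknesses might be derived, assuming a hypothetical call log with one record per call. The names and the width range are illustrative choices, not values from the slides.

```python
from collections import Counter

# Hypothetical call log: one (caller, receiver) record per call.
calls = [
    ("Amy", "Ben"), ("Ben", "Amy"), ("Amy", "Ben"),
    ("Ben", "Cho"), ("Amy", "Cho"), ("Cho", "Amy"), ("Amy", "Ben"),
]

# Treat calls as undirected: sorting each pair makes Amy-Ben and Ben-Amy
# share one key, so the count is the weight of that edge.
weights = Counter(tuple(sorted(pair)) for pair in calls)

def line_width(weight, max_weight, min_width=0.5, max_width=5.0):
    """Scale an edge weight linearly into a drawing thickness."""
    return min_width + (max_width - min_width) * weight / max_weight

heaviest = max(weights.values())
widths = {edge: line_width(w, heaviest) for edge, w in weights.items()}
```

The pair with the most calls is drawn with the thickest line, and the others are scaled in proportion.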

An ego-centric graph

  • Network consisting of an individual and their immediate peers (Heer & Boyd, 2005).
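
One way to extract an ego-centric graph from a larger network is sketched below. The graph and the function name `ego_network` are hypothetical.

```python
# Hypothetical undirected network as an adjacency dictionary.
graph = {
    "Ann": {"Bob", "Cat"},
    "Bob": {"Ann", "Cat", "Dan"},
    "Cat": {"Ann", "Bob"},
    "Dan": {"Bob", "Eve"},
    "Eve": {"Dan"},
}

def ego_network(graph, ego):
    """Keep the ego, its immediate peers, and the ties among them."""
    members = {ego} | graph[ego]
    return {node: graph[node] & members for node in members}

ann_network = ego_network(graph, "Ann")
```

Dan and Eve are dropped because neither is an immediate peer of Ann, but the tie between Bob and Cat is kept.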

Bipartite Graph

  • A graph whose vertices can be divided into two disjoint sets U and V such that every edge connects a vertex in U to one in V; that is, U and V are independent sets.

  • Equivalently, a bipartite graph is a graph that does not contain any odd-length cycles.
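
Both definitions lead to the same test: try to assign every vertex to one of two sets so that no edge joins vertices in the same set. The sketch below does this with a breadth-first search; `two_colour` is a hypothetical helper name.

```python
from collections import deque

def two_colour(graph):
    """Return a {vertex: 0 or 1} split if the graph is bipartite, else None."""
    colour = {}
    for start in graph:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in colour:
                    colour[nxt] = 1 - colour[node]  # put neighbour in the other set
                    queue.append(nxt)
                elif colour[nxt] == colour[node]:
                    return None  # an edge inside one set: odd cycle found
    return colour

square = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}  # cycle of length 4
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}           # cycle of length 3

square_split = two_colour(square)
triangle_split = two_colour(triangle)
```

The square (an even cycle) splits into two sets, while the triangle (an odd cycle) returns `None`.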

Affiliation Networks - Bipartite Graph

Source: Valdis Krebs (2010) “Your Choices Reveal Who You Are: Mining and Visualizing Social Patterns” in Beautiful Visualization.

A Multimodal Graph

Social network connecting different types of vertices. For example, a network may connect peers to discussion forums and blog posts they have commented on.

Networks in the Real World

  • Physical
    • Transportation (e.g. road, port, rail, etc)
    • Utility (electricity, water, gas, network cable, etc)
    • Natural (river, etc)
  • Abstract
    • Social media (e.g. e-mail, Facebook, Twitter, Wikipedia, etc)
    • Organisation (e.g. NGO, politics, customer-company, staff-to-staff, criminal, terrorist, disease, etc)

Real world network - Land transport

Real world network - Maritime transport

Real world network - Air transport

Real world network - Life line

Real world network - Social network

Graph Visualisation in action 1

Using graph visualisation to understand business networks.

Source: Exxon Secrets

Graph Visualisation in action 2

Graph visualisation is used to reveal voting patterns among United States senators.

Source: Social Action

Graph Visualisation in action 3

Graph visualisation is used to understand online social networks.

Graph Visualisation in action 4

Graph visualisation is used to show how news stories are connected by degrees of separation.

Source: Link

Graph Visualisation in action 5

Application of network analysis in project management.

Source: Pryke, S.D. (2004) “Analysing construction project coalitions: exploring the application of social network analysis”, Construction Management and Economics, 22, pp. 787-797.

Graph Visualisation in action 6

SecViz: Application of network graph in security.

  (a) Similarity graph of log entries and (b) similarity graph of network scans

Source: Graph Drawing for Security Visualization.

Graph Visualisation in action 7

Alumni network.

  • The networks of the universities shown are expanded in a breadth-first manner up to a depth of 2 (showing each university, its alumni and the companies they are associated with through employment, investment or other activities).
  • Size of the node reflects degree of the node (scaled logarithmically).

Source: Article

Graph Visualisation in action 8

Public Transport Network Analysis.

  • Degree centrality indexes for nodes in the existing (2006) and proposed (2020) public transport networks in Melbourne’s north-east

Graph Visualisation in action 9

  • Maritime Port Network Analysis:
    • Maritime degree, centrality and vulnerability: Port hierarchies and emerging areas in containerized transport (2008–2010)

Graph Visualisation in action 10

  • Firm Network Analysis
    • S&T cooperation network diagram of cities in China

Graph Visualisation in action 11 - UN Voting in Europe

To learn more, go to Visual Complexity

Source: Transportation Network

Graph Data

  • What is graph data?

  • Storing graph data

    • file-based
    • database management system
    • R object

Potential Graph Data Sources

  • Flight Stats.

  • The record indicates the city pair (that is, a link), such as ORD–LGA or LAX–ATL. Note that this particular data has directed links. ORD–LGA is a flight that starts in Chicago’s O’Hare Airport and ends at LaGuardia Airport in New York City and is different from LGA–ORD, which is a flight going in the other direction. Both links are valid.
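
A sketch of how such records could become directed links. The `records` list is hypothetical sample data in the city-pair style described above, written here with plain hyphens.

```python
from collections import Counter

# Hypothetical flight records, one "ORIGIN-DESTINATION" string per flight.
records = ["ORD-LGA", "LGA-ORD", "ORD-LGA", "LAX-ATL"]

# Keep the order within each pair: (origin, destination) is a directed link,
# so ORD->LGA and LGA->ORD are counted separately.
links = Counter(tuple(record.split("-")) for record in records)

# The airports that appear in any link form the node list.
airports = sorted({airport for link in links for airport in link})
```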

Potential Graph Data Sources

  • Sometimes only links are identified in a data set. One example is network log files. Although log files may seem arcane, they contain a wealth of interesting information—for example, from where people are connecting into a corporate network, when and where big files are transferred out, patterns of regular activity (such as network backup), and patterns of irregular activity (such as hackers attempting to break in).

Potential Graph Data Sources

Transaction Records

  • By looking at the items that co-exist in a transaction, you can construct a graph. Nodes are the items, and links are the co-occurrence of items within any transaction. Examples of this type of graph include a wide variety of social networking data (including e‑mail data), as well as multiple authors of documents such as books, news stories, or reports.
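
The construction can be sketched as follows, assuming hypothetical shopping baskets as the transactions. Each pair of items in the same basket adds one to the weight of the link between them.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each inner list is one basket of items.
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread"],
]

links = Counter()
for basket in baskets:
    # Sorting gives each unordered pair a single, consistent key.
    for pair in combinations(sorted(set(basket)), 2):
        links[pair] += 1

nodes = sorted({item for basket in baskets for item in basket})
```

A basket with a single item contributes a node but no links.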

Potential Graph Data Sources

Sequence Data

  • Sequence data is very similar to transaction data with explicit time stamps on each record.

Potential Graph Data Sources

Unstructured Data (for Example, Tweets)

  • Unstructured data can also be processed to extract nodes and links.
  • A means to identify nodes and identify links is required. For example, tweets are short messages (originally limited to 140 characters) publicly broadcast on Twitter. Tweets are a rich data source from which you can mine different kinds of nodes and links by looking for co-occurrence of hashtags (that is, user-defined topics), usernames, or stock symbols within tweets, and you can extract these to form graphs.
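
As a rough sketch of this extraction step, a regular expression can pull hashtags, usernames and stock symbols out of each message, and tokens that appear in the same message are then linked. The sample messages and the pattern are assumptions made for illustration and would need refining for real data.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical messages.
tweets = [
    "@ana loving #rstats and #dataviz today",
    "New #dataviz tutorial by @ben #rstats",
    "$AAPL earnings chat with @ana #finance",
]

# Hashtags (#), usernames (@) and stock symbols ($) followed by word characters.
TOKEN = re.compile(r"[#@$]\w+")

links = Counter()
for text in tweets:
    tokens = sorted({token.lower() for token in TOKEN.findall(text)})
    for pair in combinations(tokens, 2):
        links[pair] += 1  # the two tokens co-occur in this message
```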

Potential Graph Data Sources

Matrix (for Example, Trade, Migration)

  • Sometimes a matrix of data contains the same entries in both the first column and first row. For example, global trade flows between countries can be represented as a table of numbers (http://stats.oecd.org).
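
A matrix of this kind can be unpacked into a weighted, directed edge list, as sketched below. The labels and numbers are made up for illustration and are not real trade figures.

```python
# Hypothetical flow matrix: rows are origins, columns are destinations.
places = ["A", "B", "C"]
flows = [
    [0, 12, 7],
    [9, 0, 3],
    [0, 5, 0],
]

# One (origin, destination, value) edge per non-zero, off-diagonal cell.
edges = [
    (places[i], places[j], value)
    for i, row in enumerate(flows)
    for j, value in enumerate(row)
    if i != j and value > 0
]
```

Because the flow from A to B differs from the flow from B to A, the resulting graph is directed and weighted.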

Potential Graph Data Sources

Statistical Correlation (for Example, Stocks, News Stories)

  • Graphs can also be created statistically.
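
One common way to do this, sketched here under assumed data, is to link two items whenever the correlation between their series exceeds a threshold. The series, the names and the 0.8 cut-off are all hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical series, e.g. prices of three stocks over four periods.
series = {
    "stock_a": [1.0, 2.0, 3.0, 4.0],
    "stock_b": [2.0, 4.1, 5.9, 8.0],
    "stock_c": [4.0, 1.0, 3.5, 2.0],
}

THRESHOLD = 0.8  # assumed cut-off; the right value depends on the data

edges = {}
for (name_a, x), (name_b, y) in combinations(series.items(), 2):
    r = pearson(x, y)
    if abs(r) >= THRESHOLD:
        edges[(name_a, name_b)] = r  # keep the correlation as the edge weight
```

Only `stock_a` and `stock_b` move closely enough together to be linked in this example.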

Potential Graph Data Sources

Two Data Types (for Example, Board Memberships)

  • A bipartite graph has two different types of nodes, with linkages between the different types. For example, a graph analysis of executives and their board memberships reveals the connections between companies via board members. The two different data types in this example are people and companies. These are the nodes. The board memberships are the links that connect a person to a company.
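
The company-to-company connections mentioned above can be derived by projecting the bipartite graph onto one node type. The sketch below uses hypothetical people and companies.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical board memberships: (person, company) links.
memberships = [
    ("Alice", "Acme"), ("Alice", "Bolt"),
    ("Bob", "Bolt"), ("Bob", "Core"),
    ("Cara", "Acme"),
]

# Group the companies served by each person.
boards = defaultdict(set)
for person, company in memberships:
    boards[person].add(company)

# Two companies are connected when at least one person sits on both boards.
shared = defaultdict(set)
for person, companies in boards.items():
    for pair in combinations(sorted(companies), 2):
        shared[pair].add(person)
```

Acme and Bolt are connected through Alice, and Bolt and Core through Bob.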

Potential Graph Data Sources

  • People can be connected through many kinds of commonalities. For example, LinkedIn builds connections via companies, friendships, educational institutions, group memberships, and so on.

Graph database

Neo4j (Network Exploration and Optimization 4 Java) is a graph database management system developed by Neo4j, Inc.

Source: neo4j Get Started

R Graph Objects

Introducing tidygraph

  • A tidy API for graph/network manipulation in R

  • It provides a way to switch between node and edge tables.

  • It provides dplyr verbs for manipulating node and edge tables.

  • It provides access to a lot of graph algorithms with return values that facilitate their use in a tidy workflow.

  • The full reference guide is available at this link.

Network Graph Visualisation and Analysis

  • Layouts

  • Visual Attributes

  • Network Metrics

Graph Layouts

Graph layouts are algorithms that return coordinates for each node in a network graph.

  • Showing node-edge relationship.

  • Very challenging for large graphs.

One common method for drawing graphs is to draw nodes as markers and edges (also referred to as links) as lines connecting them.

Source: flare.prefuse.org

Force-Directed Layout

  • Force-directed graph drawing algorithms are a class of algorithms for drawing graphs in an aesthetically-pleasing way.
  • Their purpose is to position the nodes of a graph in two-dimensional or three-dimensional space so that all the edges are of more or less equal length and there are as few crossing edges as possible. They do this by assigning forces among the set of edges and the set of nodes, based on their relative positions, and then using these forces either to simulate the motion of the edges and nodes or to minimize their energy.
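
The idea can be sketched with a simplified spring simulation in the spirit of the Fruchterman-Reingold method: every pair of nodes repels, connected nodes attract, and each step is capped by a shrinking "temperature". This is an illustrative sketch, not the implementation used by any particular tool, and the parameter values are assumptions.

```python
import math
import random

def force_layout(nodes, edges, iterations=200, size=1.0, seed=42):
    """Return {node: [x, y]} after a simple force simulation (nodes is a list)."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(0, size), rng.uniform(0, size)] for n in nodes}
    k = size / math.sqrt(len(nodes))  # preferred edge length
    temperature = size / 10           # largest move allowed per step
    for _ in range(iterations):
        move = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = k * k / dist
                move[u][0] += dx / dist * push
                move[u][1] += dy / dist * push
                move[v][0] -= dx / dist * push
                move[v][1] -= dy / dist * push
        # Attraction along each edge.
        for u, v in edges:
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist * dist / k
            move[u][0] -= dx / dist * pull
            move[u][1] -= dy / dist * pull
            move[v][0] += dx / dist * pull
            move[v][1] += dy / dist * pull
        # Apply the net force, limited by the current temperature.
        for n in nodes:
            length = math.hypot(move[n][0], move[n][1]) or 1e-9
            step = min(length, temperature)
            pos[n][0] += move[n][0] / length * step
            pos[n][1] += move[n][1] / length * step
        temperature *= 0.97  # cool down so the layout settles
    return pos

nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a")]
layout = force_layout(nodes, edges)
```

Fixing the seed makes the layout repeatable. Without it, each run starts from different random positions and may settle into a different drawing.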

Source: Observable

BiPartite Layout

Source: BiPartite Layout

Node-Only Layout

Source: 2013 Budget Proposal Graphic

Time-Oriented Layout

Radial Hierarchical Layout

Source: Cluster

Tree Hierarchical Layout

Source: Tree

Geographic Layout

Source: ShinyNet

Chord Diagrams

Source: Migration Analytics

Sankey Diagrams

Source: Sankey Diagram

Hive Plot

Source: Hiveplot

Hive Plot of Network of Individuals at Risk of HIV

Source: hiveplot

Hive Plot in ggraph

Source: ggraph layout

Adjacency Matrix

Source: Adjacency Matrix

Basic Visual Attributes

Additional Visual Attributes

Combining Visual Attributes

Network Visualisation and Analysis Process Model

Source: Hansen, D. L. et al. (2009)

Network Metrics: Measures of Power and Influence

  • A collection of statistical measures to report:
    • the connectivity of a node within a network,
    • the complexity of a network,
    • the clusters or sub-groups within a network.

Network Metrics: Degree

  • Degree, the number of direct connections a node has.

  • Degree is often interpreted in terms of the immediate risk of a node catching whatever is flowing through the network (such as a virus, or some information).

Network Metrics: In-degree & Out-degree

  • If the network is directed (meaning that ties have direction), then we usually define two separate measures of degree centrality, namely indegree and outdegree.

  • Indegree is a count of the number of ties directed to the node, and outdegree is the number of ties that the node directs to others.

  • For positive relations such as friendship or advice, we normally interpret indegree as a form of popularity, and outdegree as gregariousness.
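
Indegree and outdegree can be counted directly from a list of directed ties, as in this sketch with hypothetical "asks advice from" data.

```python
from collections import Counter

# Hypothetical directed ties: (source, target).
ties = [
    ("Ann", "Bob"), ("Cat", "Bob"), ("Dan", "Bob"),
    ("Bob", "Ann"), ("Ann", "Cat"),
]

outdegree = Counter(source for source, _ in ties)  # ties the node directs to others
indegree = Counter(target for _, target in ties)   # ties directed to the node

most_popular = max(indegree, key=indegree.get)
```

A `Counter` returns 0 for nodes that never appear, so `indegree["Dan"]` is 0 even though Dan has an outgoing tie.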

Network Metrics: Betweenness centrality

  • Betweenness is a centrality measure of a vertex within a graph (there is also edge betweenness, which is not discussed here).

  • Vertices that occur on many shortest paths between other vertices have higher betweenness than those that do not.
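
A sketch of vertex betweenness for an unweighted, undirected graph, following the accumulation scheme of Brandes' algorithm. Scores are left unnormalised, and the function name `betweenness` is a hypothetical choice.

```python
from collections import deque

def betweenness(graph):
    """Number of shortest paths between other vertex pairs passing through each vertex."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        sigma[s] = 1
        dist = {s: 0}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, sharing out path dependencies.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted from both ends, so halve the totals.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical path graph: a - b - c - d - e
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
scores = betweenness(path)
```

On this path the middle vertex `c` scores highest (4), its neighbours `b` and `d` score 3, and the two end vertices score 0.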

Network Metrics: Closeness Centrality

  • In graph theory, closeness is a centrality measure of a vertex within a graph. Vertices that are ‘shallow’ to other vertices (that is, those that tend to have short geodesic distances to other vertices within the graph) have higher closeness.

  • Closeness is preferred in network analysis to mean shortest-path length, as it gives higher values to more central vertices, and so is usually positively associated with other measures such as degree.
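
One common definition, sketched here for a connected, unweighted graph, is the number of other vertices divided by the sum of the geodesic distances to them, which is the reciprocal of the mean shortest-path length. The function names are hypothetical.

```python
from collections import deque

def distances(graph, source):
    """Geodesic (fewest-edge) distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def closeness(graph, node):
    """Reachable vertices divided by total distance to them (0.0 if isolated)."""
    dist = distances(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

# Hypothetical path graph: a - b - c - d - e
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
central = closeness(path, "c")  # distances 2 + 1 + 1 + 2 = 6, so 4 / 6
edge = closeness(path, "a")     # distances 1 + 2 + 3 + 4 = 10, so 4 / 10
```

The middle vertex has the shorter average distance and therefore the higher closeness.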

Network Metrics: Eigenvector Centrality

  • A measure of the importance of a node in a network.

  • It assigns relative scores to all nodes in the network based on the principle that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes.
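
This principle can be sketched with power iteration: each node repeatedly takes the sum of its neighbours' scores, and the scores are rescaled after every round. Adding the node's own score in each round is an assumption made here so that the iteration also settles on bipartite graphs; it does not change the final ranking.

```python
def eigenvector_centrality(graph, iterations=100):
    """Approximate eigenvector centrality, scaled so the top node scores 1."""
    score = {v: 1.0 for v in graph}
    for _ in range(iterations):
        # Own score plus the scores of all neighbours.
        new = {v: score[v] + sum(score[u] for u in graph[v]) for v in graph}
        top = max(new.values()) or 1.0
        score = {v: value / top for v, value in new.items()}
    return score

# Hypothetical graph: a triangle a-b-c with d attached to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
scores = eigenvector_centrality(graph)
```

Here `c` scores highest, and `d` scores lowest despite being tied to the top node, because it has no other connections.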

Network Metrics: Clustering Coefficient

  • A measure of how connected a vertex’s neighbours are to one another. More specifically, it is the number of edges connecting a vertex’s neighbours divided by the total number of possible edges between the vertex’s neighbours.
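
The definition translates directly into code. In this sketch a vertex with fewer than two neighbours is given a coefficient of 0, which is a convention assumed here.

```python
from itertools import combinations

def local_clustering(graph, node):
    """Edges among the node's neighbours divided by the possible number of such edges."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # no pair of neighbours to connect
    actual = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    possible = k * (k - 1) / 2
    return actual / possible

# Hypothetical graph: a triangle a-b-c with d attached to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
coefficients = {v: local_clustering(graph, v) for v in graph}
```

Vertex `c` has three neighbours and only one of the three possible edges among them (a-b), so its coefficient is 1/3.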

Network Analytics Methods

  • Mapping relationships

  • Identifying hierarchies

  • Detecting communities

  • Analysing flow

Analysing Spatial Networks Relationship

Source: Link

Analysing Hierarchy of Spatial Network

Source: Link

Detecting communities

Source: Link

Spatial Networks for Flow Analysis

Arteries of the City.

Source: Arteries of the City

Swiss Knife for Graph Visualisation and Analysis I

NodeXL, an open-source template for Microsoft® Excel® 2007 and 2010 that makes it easy to explore network graphs.

Swiss Knife for Graph Visualisation and Analysis II

Gephi, an open source network graph visualisation and analysis toolkit.

R Package for Network Visualisation and Analysis

Web enabled Graph Visualisation Libraries

References

  • Richard Brath and David Jonker (2015) Graph Analysis and Visualization: Discovering Business Opportunity in Linked Data, John Wiley & Sons. This book is available online at SMU digital library.

  • Luke, Douglas A. (2015) A user’s guide to network analysis in R, Springer. This book is available online at SMU digital library.

Additional readings

  • Ian McCulloh, Helen Armstrong, and Anthony Johnson (2013) Social Network Analysis with Applications, Chapter 1-3. This book is available online at SMU digital library.

  • Scott, John (2017) Social network analysis (4th Edition). This book is available online at SMU digital library.

Geospatial Network Visualisation

This site provides an outstanding overview and survey of various geospatial network visualisations.