Mathematicians and computer scientists often look for patterns and connections to help them make sense of the world. Network theory finds connections between people, places, or things, and maps them into a graphical object.

These mathematical models of connections help visualise anything from the spread of disease through a population to fraud detection in banking. In biology, a common example is a food web, showing the overlapping feeding relationships between species and their environment.

Dr Sandipan Roy, a senior lecturer in the Department of Mathematical Sciences, uses statistical modelling to explore social, biological and financial networks.

Deciding on the data to use

Networks are made from individual data points, or ‘nodes’, and the links or ‘edges’ that connect them. Community detection is a common technique in network analysis that identifies groups or communities within networks, clustering tightly linked nodes together.

In social networks, close friends are grouped into a community. These people will have strong links with each other. They may also have links with other groups, but these are likely to be weaker connections.
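As a rough illustration of what "tightly linked" means, the sketch below scores two ways of grouping a small friendship network using modularity. Modularity is a standard measure that compares the share of edges falling inside groups with the share expected if links were placed at random. The names, the toy network and the `modularity` helper are all made up for illustration and are not taken from the research described here.

```python
from collections import defaultdict

# Hypothetical toy network: two friend groups joined by a single link.
edges = [
    ("Ana", "Ben"), ("Ana", "Cat"), ("Ben", "Cat"),   # first group
    ("Dev", "Eli"), ("Dev", "Fay"), ("Eli", "Fay"),   # second group
    ("Cat", "Dev"),                                   # bridge between groups
]

def modularity(edges, community_of):
    """Modularity Q of a grouping, for an undirected, unweighted network."""
    m = len(edges)
    degree = defaultdict(int)      # number of links per node
    internal = defaultdict(int)    # links with both ends in the same community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    total_degree = defaultdict(int)  # summed degree per community
    for node, d in degree.items():
        total_degree[community_of[node]] += d
    # Observed share of internal links minus the share expected at random.
    return sum(
        internal[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )

# Grouping that follows the two friend groups, and one that cuts across them.
good = {"Ana": 0, "Ben": 0, "Cat": 0, "Dev": 1, "Eli": 1, "Fay": 1}
poor = {"Ana": 0, "Ben": 1, "Cat": 0, "Dev": 1, "Eli": 0, "Fay": 1}

q_good = modularity(edges, good)
q_poor = modularity(edges, poor)
print(q_good, q_poor)
```

The grouping that follows the two friend groups scores higher than the one that cuts across them. Community detection methods typically search for groupings with high scores of this kind.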

Research into the maths behind these connections has been ongoing for many years. However, Sandipan and his team want to go beyond network connectivity alone. They are looking at other types of data in addition to these links.

Some people may be in the same location, may have attended the same college, or may even have similar hobbies. This extra data, known as metadata, could influence connectivity between individuals. Sandipan and his team want to understand how we can use this extra information to learn more about network structures.

Past studies that have used metadata have done so either in an ad hoc manner or without making a judgement on whether it is useful information or not. There is currently no concrete method available to help make informed decisions about which information to use. This is a problem Sandipan and his team are working to solve. Their research is driven by the mathematical and statistical methods behind these techniques, and how they apply in practice.
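To make the question of "useful" metadata a little more concrete, the toy sketch below checks how well one piece of metadata lines up with the links in a network. It compares the share of linked pairs that have the same value with the share among all possible pairs. This is only a crude heuristic under made-up data, not the method the team is developing; the names, the `college` and `hobby` values and the `alignment` helper are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: friendship links plus two pieces of metadata per person.
edges = {("Ana", "Ben"), ("Ana", "Cat"), ("Ben", "Cat"),
         ("Dev", "Eli"), ("Dev", "Fay"), ("Eli", "Fay"), ("Cat", "Dev")}
college = {"Ana": "North", "Ben": "North", "Cat": "North",
           "Dev": "South", "Eli": "South", "Fay": "South"}
hobby = {"Ana": "chess", "Ben": "running", "Cat": "chess",
         "Dev": "running", "Eli": "chess", "Fay": "running"}

def same_share(pairs, attribute):
    """Share of the given pairs whose two members have the same attribute value."""
    pairs = list(pairs)
    return sum(attribute[u] == attribute[v] for u, v in pairs) / len(pairs)

def alignment(edges, attribute):
    """Matching share among linked pairs minus the matching share among all pairs."""
    all_pairs = combinations(sorted(attribute), 2)
    return same_share(edges, attribute) - same_share(all_pairs, attribute)

college_score = alignment(edges, college)
hobby_score = alignment(edges, hobby)
print(college_score, hobby_score)
```

In this made-up example the college score is clearly positive, so college lines up with who is linked to whom, while the hobby score is slightly negative, so hobby tells us little about the links. A simple check like this ignores chance variation, which is one reason a principled statistical method is needed.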

From medicine to market predictions

Sandipan’s work uses real-world data, with both practical and theoretical applications.

A current project sees him collaborating with clinicians. He is using machine learning-based methods to predict multi-morbidity in older patients. Multi-morbidity is where two or more diseases are present in the same patient. In modelling patient data, he’s looking for features or connections that could act as an early warning system. With these flags, clinicians can better predict who is most at risk of developing multiple diseases. As a result, they can provide better care to vulnerable patients.

Another social network study focusses on predicting how certain networks can evolve. This could help us understand large and complex political networks, like the US Senate or European Parliament. In these networks, there are often strong, party-based connections between people. However, other factors can determine how connections develop. These are often cross-party issues, such as health or education.

Extra information, beyond party membership, could give insights into underlying communities. A better understanding of the network structure could help when it comes to making voting predictions. This applies not only to which parties are likely to come into power but also to what the key voting issues could be.

Network analysis can also prove useful for looking at financial or stock exchange data. Much like people in social networks, stock indices have links with one another, and there is often an underlying network structure present. The better we understand this structure and the relationships between different stock indices, the better we can predict what will happen in the financial market. This could help avoid or mitigate catastrophic events.

While the types of models used vary slightly, the methodology behind them uses fundamental theoretical concepts, such as graphical models and stochastic networks.

Future directions

Looking towards the future of network science, Sandipan can see his research having even wider applications.

Large Language Models (LLMs) used in AI tools such as ChatGPT and Google Gemini collect training samples from different areas. These samples feed into large machine learning algorithms, but they don’t behave in isolation. If we know the underlying network of how these samples connect, we can better understand automatic language translation systems. As a result, we will be able to improve their design in the future.

This learning could even apply to global challenges, such as climate change. Climate change isn’t the result of a single action. Instead, it’s caused by many different, interconnected actions around the world. By understanding the structural connectivity between these actions, we can make effective changes and take pre-emptive action to prevent or mitigate climate-related disasters.