# Our research

Students of SAMBa will be at the forefront of a future generation of statistical applied mathematicians, with careers in both universities and industry.

Our research interests are broad and multidisciplinary. Researchers span a continuum from modelling data with leading statistical methods, to investigating the fundamental movements of particles, to applying mathematics to real-world statistical phenomena.

Training in SAMBa provides students with exceptional skills in formulating statistical applied mathematics problems and with the tools to solve those problems. Students will have confidence in talking to people from a wide range of backgrounds and in bringing new perspectives to challenges faced across industry and academia.

## Approach

In order to address the modern challenges of analysing huge data sets and mapping them to real-time predictions, we believe it is essential that future generations of researchers are trained across the continuum of statistical applied mathematics, with confidence in computation, stochastics and a wide range of cross-disciplinary approaches.

There is also a need to work closely with industry and researchers from other disciplines in order to ensure that the benefits gained from this approach are widely shared and implemented.

## Impact

The range of applications, and the subsequent socio-economic impact, is very broad: insurance risk, medical genetics, energy management, communication networks, pharmaceutical development, safety management of physical systems, ecological and population monitoring, and retail analytics, to name but a few.

## Student PhD projects

### Averaging for fast-slow systems, Matthias Klar

Supervisors: Johannes Zimmer and Karsten Matthies

Matthias is studying systems with multiple time scales, so-called fast-slow systems. One aim is to derive effective large-scale descriptions of such systems by ‘averaging out’ the fast scale. Thermodynamic systems are prototypical examples of systems with such a separation of time scales, and the aim of this project is to advance averaging methods for thermodynamic models.
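As a generic illustration of what ‘averaging out’ means, the following is a textbook-style sketch and not necessarily the class of models studied in this project. The symbols $x$ (slow variable), $y$ (fast variable), $f$, $g$, $\varepsilon$ and $\mu_X$ are assumptions introduced here, and the limit holds only under suitable ergodicity conditions on the fast dynamics.

```latex
\begin{aligned}
\dot{x}^{\varepsilon} &= f\bigl(x^{\varepsilon}, y^{\varepsilon}\bigr), \\
\dot{y}^{\varepsilon} &= \tfrac{1}{\varepsilon}\, g\bigl(x^{\varepsilon}, y^{\varepsilon}\bigr),
\qquad 0 < \varepsilon \ll 1,
\end{aligned}
\qquad\Longrightarrow\qquad
\dot{X} = \bar{f}(X) := \int f(X, y)\, \mu_{X}(\mathrm{d}y).
```

Here $\mu_X$ denotes the invariant measure of the fast dynamics with the slow variable frozen at $X$; the averaged equation for $X$ is the effective large-scale description as $\varepsilon \to 0$.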

### Fast iterative regularisation methods, Malena Sabate Landman

Supervisor: Silvia Gazzola

Malena’s project is based on the study of fast iterative regularisation methods, with a particular focus on Krylov subspace methods and novel ways of determining regularisation operators and regularisation parameters. These tools are widely used in solving inverse problems, which are challenging as they can be large scale and severely ill-posed. As an example, Malena is exploring different imaging applications, such as tomography or deblurring and denoising of images.

### Hybrid models in biology, Cameron Smith

Supervisor: Kit Yates

Spatial hybrid models are emerging methods used to simulate biological, chemical and physical phenomena at multiple scales. These methods take different models of the same system, at varying spatial resolutions, and employ them concurrently in different regions of the spatial domain. The main purpose of such hybrid models is to utilise the efficiency of coarser methods, whilst maintaining accuracy by using the finer methods where necessary. Cameron is developing various spatial hybrid models for biological processes in order to gain insight into how the underlying systems behave. Focusing initially on reaction-diffusion systems, which can be used to model many biological systems, from cell migration to intracellular calcium dynamics, he is incorporating biological realism into such methods.

### Detection of underwater acoustic events in a large dataset with machine learning, Amélie Klein

Supervisors: Philippe Blondel and Kari Heine

Acoustic remote sensing listens to ambient noise underwater and uses it to recognise the sources of the sounds (e.g. marine life, human activities, weather). Passive sensors acquire data at very high rates (up to a million samples/second) for long periods (up to several years). In this project, Amélie is working on automating the processing and exploration of this large dataset using machine learning techniques and high-performance computing systems. The project aims to detect long-term trends, like the increase in shipping or seasonal variations in marine life, and transient events, such as loud sounds associated with seismic prospecting, vocalisations by animals (e.g. whales or dolphins), or small-scale weather observations. The key research questions lie in processing and analysing the vast amounts of continuous data and in deciding the best time scale at which to look at specific processes.

### Measure-valued martingales and applications, Dan NG

Supervisors: Alex Cox and Johannes Zimmer

Measure-valued martingales are stochastic processes in the space of probability measures which have certain nice martingale properties. They have applications in mathematical finance, such as the model-independent pricing and hedging of options. There are natural links to optimal transport and the construction of gradient flows for measure-valued processes. They also provide a framework for interpreting classical inequalities such as the log-Sobolev inequality. The aim of Daniel's PhD project is to establish some basic properties of such processes, and to consider variational methods for their construction.

### Large scale differential geometric MCMC, Tom Pennington

Supervisors: Karim Anaya-Izquierdo and Rob Scheichl

Uncertainty Quantification (UQ) concerns both the propagation of uncertainty through a physical model, known as the forward problem, and the inverse problem of inferring uncertain model parameters from noisy measurements. Markov Chain Monte Carlo (MCMC) methods are the most widely used tools for computing expectations in UQ and in large statistical models in general. Conventional approaches to MCMC are often inefficient and must compute many samples to achieve high accuracy. Geometric ideas can be used to improve the methods' statistical performance; two prominent algorithms in this line of thinking are Riemann Manifold Hamiltonian Monte Carlo (RMHMC) and Riemann Manifold Metropolis Adjusted Langevin Algorithm (RMMALA). Tom is interested in extending these ideas to exploit more general ideas from differential geometry, with a focus on developing methods that are suited to problems from UQ.
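For readers unfamiliar with MCMC, the sketch below shows a plain random-walk Metropolis sampler, one example of the conventional approaches mentioned above. It is not RMHMC, RMMALA or the method developed in this project; the function name `random_walk_metropolis`, its parameters and the standard normal target in the usage example are all illustrative assumptions.

```python
import math
import random


def random_walk_metropolis(log_density, x0, n_samples, step=1.0, seed=0):
    """Draw samples from a 1D target using random-walk Metropolis.

    log_density: function returning the log of the (unnormalised) target density.
    x0: starting point of the chain.
    step: standard deviation of the Gaussian proposal (a tuning assumption).
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        # Propose a symmetric Gaussian move around the current state.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p_new / p_current).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


if __name__ == "__main__":
    # Hypothetical target: a standard normal, log density -x^2/2 up to a constant.
    draws = random_walk_metropolis(lambda x: -0.5 * x * x, 0.0, 20000)
    mean = sum(draws) / len(draws)
    print("estimated mean:", mean)
```

Successive samples from such a chain are correlated, which is one reason many samples may be needed for an accurate estimate.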

### Bayesian inference for point processes, Nadeen Khaleel

Supervisor: Theresa Smith

Point patterns, specifically spatial and spatio-temporal point patterns, occur frequently in the environmental sciences and epidemiology. These phenomena can be modelled using point processes, from which it is possible to learn about any spatial relationships that cause the observed point pattern as well as the stochastic dependence between points in the pattern. In particular, Cox processes (or “doubly stochastic” processes) are practical models when the point pattern is clustered due to stochastic environmental heterogeneity. Nadeen is working on computational methods for a particular type of Cox process, the log-Gaussian Cox process, exploring the development of efficient MCMC techniques for fitting large-scale spatio-temporal point patterns and comparing the effects of predictors in different regions.

### Optimising First in Human trials, Lizzi Pitt

Supervisor: Chris Jennison

Lizzi's project, in collaboration with Roche, involves developing the statistical methodology used to design and make decisions in Phase I/First in Human clinical trials. This is the first stage of testing a potential new treatment in humans, after extensive laboratory testing. The primary aim is to establish the associated safety and tolerability in order to define the range of doses to be tested in Phase II. Clinical trials are expensive and time consuming, so research into optimising this process aims to reduce the number of people required, the duration and the cost. Lizzi is looking to develop existing model-based Bayesian dose-finding methodology, such as the Continual Reassessment Method, with this in mind. She is investigating properties of trial designs through simulation to ensure a design is both statistically robust and fit for practical use, and thus appealing to clinicians. Traditionally, at this stage there is no evaluation of whether or not the treatment works. Lizzi's research is therefore incorporating the analysis of an early signal of efficacy into the trial design. Furthermore, the majority of existing research in this area focuses on oncology, so Lizzi's research centres on a different therapeutic area.

### Discordant voting on evolving scale-free networks, John Fernley

Supervisors: Marcel Ortgiese and Peter Mörters

Similarly to the Contact Process, voting models describe the competing spread of two ‘opinions’ on a graph of interacting ‘voters’. Cooper et al., in their 2016 paper “Discordant voting processes on finite graphs”, explored the expected consensus time for a variety of voting models on extremal graphs. These discordant voting models could be seen as a bridge between the classical voter model and the Graph Fission evolving voter model of Durrett. John is interested in finding a universal description of the model's lifetime on scale-free heterogeneous networks, in particular with Chung-Lu type edge models. These models can then be made to evolve in time by vertex updating, and his next objective would be to show that this speeds up consensus.

### Spatial confounding, Emiko Dupont

Supervisor: Nicole Augustin

Spatial confounding is a problem that often occurs in environmental, ecological and epidemiological applications of spatial statistics. Models for spatial data usually include a fixed effect for the explanatory variable of interest as well as a random effect capturing spatial correlation in the data. Although the inclusion of a spatial random effect generally improves the goodness of fit of the model, it can also introduce bias in the estimated fixed effect due to collinearity of the fixed and random effects, which could lead to incorrect statistical inference. This is called spatial confounding and is a general problem that is not restricted to any specific type of statistical model. Emiko’s project is about gaining a better understanding of spatial confounding, using both real and simulated data to investigate when the problem occurs and what can be done to avoid it. She is considering both parametric and non-parametric spatial models.

### Methods for preferentially sampled spatial data, Elizabeth Gray

Supervisor: Evangelos Evangelou

In general, geostatistical methods deal with data under the assumption that the quantity being measured is independent of the locations at which measurements are being taken. However, this is often not the case. Preferential sampling refers to the situation in which there is some stochastic dependence between the quantity being measured and the process used to select the sampling locations, involving an investigator’s ‘design utility’. Ignoring such a dependence can lead to biased and inaccurate estimates. Elizabeth’s PhD involves investigating and developing methods for modelling such data.

### Spatial branching processes, Tsogzolmaa Saizmaa

Supervisor: Andreas Kyprianou

Tsoogii’s project belongs to the field of spatial branching processes, focusing on the exit measure induced by the limit of branching mechanisms of isotropic stable Lévy processes. Specifically, the spatial arrangement of mass of a d-dimensional isotropic super-stable process as it first exits an increasing sequence of balls is being studied. The location of mass in the exit measure is being explored via the overshoot of an embedded isotropic stable branching process, and its radius-dependent branching mechanism will be characterised. Convergence of this space-time stochastic process is explored as time goes to infinity.

### Interacting particle models and the geometry of their macroscopic description, Marcus Kaiser

Supervisors: Johannes Zimmer and Rob Jack

Marcus is studying the geometric properties of interacting particle systems and their hydrodynamic scaling limits described by non-linear partial differential equations, such as drift-diffusive systems. He is looking at processes that can serve as prototypes for non-equilibrium behaviour, having underlying descriptions as irreversible Markov chains. A better understanding of the geometric behaviour and the links between the microscopic and macroscopic models yields new insights, such as the way processes converge to equilibrium. See http://people.bath.ac.uk/mk806/ for more details.

### Modelling air pollution using data assimilation, Matt Thomas

Supervisors: Gavin Shaddick and Melina Freitag

In order to assess the burden of disease which may be attributable to air pollution, accurate estimates of exposure are required globally. There is a need for comprehensive integration of information from remote sensing, atmospheric models and surface monitoring to facilitate estimation of concentrations in areas throughout the world. Data assimilation is a method of combining model forecast data with observational data in order to more accurately understand the state of a system. Methods vary greatly in complexity and Matt is exploring different methods from both a statistical and numerical analysis standpoint. Elements of a suitable method include flexibility, modularity, the ability to incorporate multiple levels of uncertainty, and techniques that allow the relationships between surface monitoring, remote sensing and atmospheric models to vary spatially and allow information to be ‘borrowed’ where monitoring data may be sparse. Throughout the project, the efficacy of different methods in this setting is being examined by applying them to data from the Global Burden of Disease project. Of particular interest is their scalability with regard to use with high-dimensional data.
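The idea of combining a model forecast with an observation can be sketched in its simplest form as a scalar, Kalman-style update that weights each source by its uncertainty. This is only a minimal illustration and not the method used in the project; the function name `assimilate` and the numbers in the usage example are hypothetical.

```python
def assimilate(forecast, forecast_var, obs, obs_var):
    """Combine a scalar model forecast with a scalar observation.

    Each input is weighted by its variance: the more uncertain the forecast
    relative to the observation, the further the result moves towards the
    observation. Returns the updated estimate and its variance.
    """
    # Gain in [0, 1]: 0 trusts the forecast fully, 1 trusts the observation fully.
    gain = forecast_var / (forecast_var + obs_var)
    analysis = forecast + gain * (obs - forecast)
    analysis_var = (1.0 - gain) * forecast_var
    return analysis, analysis_var


if __name__ == "__main__":
    # Hypothetical values: a model forecast of 30 (variance 16)
    # and a monitor reading of 22 (variance 4).
    estimate, variance = assimilate(30.0, 16.0, 22.0, 4.0)
    print(estimate, variance)
```

The updated variance is never larger than either input variance, which is the sense in which combining the two sources gives a more accurate picture of the state.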

### Faraday wave-droplet dynamics: a hydrodynamic quantum analogue, Matt Durey

Supervisor: Paul Milewski

It has been observed on a microscopic scale that when a small fluid droplet is dropped onto a vertically vibrating fluid surface, it will ‘walk’ across the surface of the bath. The behaviour of the droplet-Faraday pilot wave pair is reminiscent of quantum physics; there is a particle-wave duality where the fluid droplet can undergo similar processes to a particle in the quantum world. On an unbounded domain, pairs of droplets can interact, deflect or capture each other, depending on various parameters. The quantum single-particle double-slit experiment can be reproduced for fluid droplets, with the interactions between wave field and slits producing a diffraction probability distribution for droplet positions. This phenomenon is the basis for two lines of research that are being explored by Matt: (i) the fluid dynamics of droplet-Faraday pilot wave reflection properties at planar boundaries; (ii) the long-time stationary behaviour of models for droplet-Faraday pilot wave dynamics in confined domains.

### SDEs for embedded successful genealogies, Dorka Fekete

Supervisor: Andreas Kyprianou

Dorka is using the mathematical medium of stochastic differential equations (SDEs) to describe the fitness of certain sub-populations in an asexual high-density stochastic population model known as a continuous-state branching process. In particular, she is looking at ways to describe genealogies that propagate prolific traits in surviving populations, where ‘survival’ can be interpreted in different ways. For example, it can mean survival beyond a certain time-horizon, but it can also mean survival according to some spatial criteria.

### Uncertainty Quantification for neutron transport problems, Matt Parkinson

Supervisors: Ivan Graham, Rob Scheichl and Paul Smith

Working in collaboration with Amec Foster Wheeler, Matt's PhD is developing the computation of uncertainty in the flux and fundamental eigenvalue of a simplified 1D monoenergetic neutron transport problem, with cross sections modelled by lognormal fields, using Karhunen-Loève (KL) sampling and Monte Carlo (MC) methods. The methods start with situations where the transport equation can be solved analytically and go on to consider numerical solutions by discrete ordinates and then by analogue MC simulation. He is analysing how the MC error and KL truncation affect the results, carrying out associated numerical experiments, and applying multilevel Monte Carlo (MLMC) methods to the problem, while assessing the possibility of applying multilevel techniques to the analogue MC solver for the simplified neutron transport problem.

### Analysis of transition rates for the Dean-Kawasaki model, Federico Cornalba

Supervisors: Johannes Zimmer and Tony Shardlow

Nucleation is a physical process, important in fields as diverse as physics, chemistry and biology. Nucleation is, broadly speaking, the process by which a material undergoes the formation of new thermodynamic phases via self-assembly. The mathematical description of this process comprises several relevant features. In his PhD, Federico is focusing his research on some aspects of the Dean-Kawasaki stochastic model, which arises from the theory of fluctuating hydrodynamics. For this model, Federico is primarily investigating the underlying mathematical geometry and the analysis of transition rates in the context of metastability, and will seek a description of the nucleation pathways.

### Numerics and analysis of waves in random media, Owen Pembery

Supervisors: Euan Spence and Ivan Graham

Wave propagation problems arise in applications such as seismic imaging, radar and ultrasound scanning. The Helmholtz equation is the simplest model of acoustic wave propagation: its solutions correspond to acoustic waves with a single frequency. Researchers have been studying the Helmholtz equation, and developing numerical methods to solve it, for many years. However, most of the research effort until now has been concerned with sound waves propagating through a homogeneous medium, where the speed of sound is constant. Owen is studying the Helmholtz equation where the medium is heterogeneous or random. He is developing numerical methods for uncertainty quantification for it and proving rigorous mathematical results about its solutions. These results will allow him to study the convergence behaviour of these numerical methods, and may suggest new numerical methods as well.
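For orientation, one common way of writing the Helmholtz equation in the two settings is sketched below. The notation is an assumption introduced here and may differ from that used in the project: $u$ is the wave field, $k$ the wavenumber, $f$ a source term, and $A$ and $n$ are coefficients describing the medium.

```latex
\Delta u + k^{2} u = -f
\quad\text{(homogeneous medium)},
\qquad
\nabla\cdot\bigl(A(x)\,\nabla u\bigr) + k^{2}\, n(x)\, u = -f
\quad\text{(heterogeneous medium)}.
```

In the random setting, $A$ and $n$ would be random fields, so the solution $u$ is itself random and its statistics become the quantities of interest.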

### Higher-order DG methods for atmospheric modelling, Jack Betteridge

Supervisors: Eike Müller and Ivan Graham

One technique for solving partial differential equations numerically is the Discontinuous Galerkin (DG) method. This method has high spatial locality, which improves parallel scalability and allows it to take greater advantage of modern (many-core) high-performance computing architectures. A hybrid multigrid approach has already been successfully used for elliptic PDEs arising from subsurface flow. Similar methods can also be applied to atmospheric modelling problems, for instance solving the Navier-Stokes equations in a thin spherical shell. Over the course of the project, Jack is looking at the computational and algorithmic aspects of implementing a solver for these atmospheric models and at various preconditioners to speed up the solution.

### Modelling and optimised control of macro-parasitic diseases, Beth Boulton

Supervisor: Jane White

Macro-parasites cause a variety of diseases throughout the world, including many neglected tropical diseases. The SIS models so often used to model the spread of bacterial or viral diseases do not capture some of the crucial ways in which macro-parasitic diseases differ. By considering a combination of ODE, probabilistic and hybrid models, Beth will attempt to formulate mathematical models which capture the dynamics of host-parasite relationships and macro-parasitic infections, and then make use of these to research how best to optimise the treatment of macro-parasitic infections in both people and animals.

### Automatic diagnosis of psoriatic arthritis (xAPAD), Adwaye Rambojun

Supervisors: Neill Campbell, Tony Shardlow, Gavin Shaddick and Will Tillett

Patients with psoriatic arthritis are graded according to the extent of damage by scoring X-rays. Currently, this is a painstaking and time-consuming process that has to be performed manually. In collaboration with the Bath Royal National Hospital of Rheumatic Diseases, Adwaye is working on automating this scoring process by exploring machine learning techniques from the computer vision community. He is working towards building a statistical model of a healthy hand that can be compared to a diseased hand, enabling the scoring process to be automated. This would enable scoring to be performed on a large scale, which will ultimately increase the understanding of how the disease progresses within patients.

### Condensation in reinforced branching processes with fitness, Anna Senkevich

Supervisors: Peter Mörters and Cécile Mailler

Anna is studying a stochastic model for the evolution of a structured population of particles equipped with fitness values. Each particle reproduces independently, with rate given by its fitness, and its offspring either inherits the fitness with some probability, or gets a new fitness value drawn from some probability distribution, independent of everything else. Particles of the same fitness are referred to as families. This is a stochastic version of Kingman’s model for a population undergoing selection and mutation. However, this framework also covers a dynamic random graph model, the preferential attachment tree with fitness of Bianconi and Barabási, which is suitable for describing growth characteristics of real-life networks, such as social networks. There are two growth scenarios of the system: growth driven by bulk behaviour and growth driven by extremal behaviour (the condensation case). Furthermore, there are two types of condensation: non-extensive, when no individual family makes an asymptotically positive contribution to the population, and macroscopic, when the proportion of individuals in the largest family is asymptotically positive. The behaviour of the system is largely determined by properties of the chosen probability distribution. So far, a broad class of bounded fitness distributions with polynomial behaviour at the tail has been analysed. In this project, Anna is focusing on the asymptotic behaviour of maximal families for bounded fitness distributions with a faster decay at the maximal fitness value. She is going to establish which of the above scenarios prevails by drawing links with extreme value theory.

### Accelerating Bayesian sampling, Gianluca Detommaso

Supervisor: Rob Scheichl

Gianluca's research aims to bring together techniques from statistics, numerical analysis and applied mathematics to accelerate Bayesian sampling. In particular, he deals with computationally expensive high-dimensional problems, trying to beat down the cost per iteration and developing algorithms that scale well in high dimensions. Gianluca is interested in developing interactions among different research fields, bringing together knowledge and experimenting with new ideas. He also tries out new potential sampling accelerations, or applies his machinery to other topics. His current research involves multilevel methods, MCMC algorithms, transport maps and Bayesian inverse problems.

### Seamless and overarching approaches for optimising over the phases of drug development, Robbie Peck

Supervisors: Chris Jennison and Alun Bedding

This project, in collaboration with Roche, concerns the optimisation of the drug development process at a program level. This involves considering multiple phases of treatment refinement and dose selection together. While individual phases of drug development have been studied in depth, there has been relatively little work that looks at two or more phases jointly. Robbie’s project uses numerical computations and simulations to model different designs, which may involve computational challenges. These include trial designs which use a form of gain function, or “net present value”, in order to optimise decision making throughout phases; seamless Phase II/III designs that may use data from Phase II in the final analysis, possibly through use of a combination test; and the realistic incorporation of beliefs about drug safety and tolerability into the program-level decision-making process.

### Modelling the surge phenomenon within turbomachinery, Kate Powers

Supervisors: Chris Budd, Chris Brace, Colin Copeland and Paul Milewski

Turbochargers are used in internal combustion engines in order to get a better power output for smaller engines and to get better fuel efficiency. Turbochargers work by compressing air. In order to get the most out of a turbocharger the air before and after the compressor needs a high pressure ratio for a relatively low massflow. If the massflow is too low, the air flow can reverse direction and cause surge. Surge is a difficult phenomenon to model because it exhibits chaotic behaviour. Kate is working jointly with the Mechanical Engineering department with the aim of finding a model that can (i) give a better prediction of the onset of surge and (ii) describe what happens to the air flow during surge. This will involve analysis of experimental data as well as a combination of theory from compressible fluid dynamics, rotating flows, dynamical systems and bifurcations.

### Topics in optimal stopping and optimal transport, Ben Robinson

Supervisor: Alex Cox

Ben is studying various stochastic optimisation problems and the connections between them. Recent work on optimal stopping problems has investigated imposing a constraint on the expected value of the stopping time in these problems to obtain so-called constrained optimal stopping problems. Ben plans to build on this work, making use of a connection to stochastic optimal control problems. This approach requires developing an understanding of the modern theory of stochastic optimal control, including the theory of weak solutions to partial differential equations in the viscosity sense. Certain problems of this type can be represented in terms of Monge-Ampère equations, a highly non-linear class of PDEs, which arise in the classical Monge-Kantorovich optimal transport problem. Ben is interested in this problem, as well as the recent variation, martingale optimal transport, in which additional constraints are imposed. Methods of martingale optimal transport have also been used in the Skorokhod embedding problem, a classical problem in probability theory. Each of these classes of problems has a financial motivation. Ben is particularly interested in how these problems are related.

### Attribution of large scale drivers for environmental change, Aoibheann Brady

Supervisors: Ilaria Prosdocimi and Julian Faraway

Several large flood events have hit the UK in recent years, and there is growing concern among the public and policy makers about whether the current level of protection of cities and infrastructure is appropriate. In particular, there is a concern that climate change and its impacts might result in increased flood risks: climate change projections seem to indicate that flooding risk might increase, but this is not fully validated by the observed river flow data, for which there is no strong evidence of increasing trends. Further, due to the short period of river flow records, the testing methods routinely used to assess whether change can be detected in observed data are typically not very powerful (in a statistical sense) and cannot fully differentiate between possible confounders. Aoibheann is aiming to develop methods to detect and attribute changes in flooding and other environmental variables. This will result in methods for the detection of spatially coherent trends in environmental data. The project is also investigating methods to assess the main drivers of higher river flows and flooding at a regional or national scale.

### Mixing times and general behaviour of random walks on changing environments, Andrea Lelli

Supervisor: Alexandre Stauffer

Random walks in random environments have become a classical model for random motion in random media, and this model has been the source of many mathematical investigations over the years. More recently, researchers have started to look at random walks in an environment which changes at the same time that the particle is moving. It is believed that when the environment is ‘well behaved’ (e.g. uniformly elliptic) and changes quickly enough, the random walk will behave in a way that is similar to a random walk on the underlying (non-changing) graph. This has been quantified, especially in the case of the d-dimensional infinite lattice, by the derivation of a law of large numbers and central limit theorems under some conditions related to the mixing time of the environment. Andrea is interested in understanding the effect of a slowly changing environment on the behaviour of simple random walks, e.g. the impact of the environment on the recurrence/transience property of the random walk and the mixing time of the random walk inside a finite, but changing, graph.

### Two-species contact processes, Sam Moore

Supervisors: Tim Rogers and Peter Mörters

Recent work in the physics literature has explored the ‘two-species contact process’ as a model of staged infections. The work has a biological interpretation in terms of host-parasite invasions, for example, when a growing colony of bacteria is under threat from a developing bacteriophage infection. Past studies have focused mainly on simulations on ℤ². Sam is interested in exploring the possibility of obtaining mathematically rigorous results for models of this type evolving on random graphs instead. He further aims to make use of existing branching methods as a novel approach to the problem.

### Inverse problems for brain imaging, Shaerdan Shataer

Supervisor: Chris Budd

Imaging is a fast-growing area driven by its importance in real-life applications as well as its mathematical challenges. In the field of brain research, imaging brain activity serves as part of the ambition to understand some fundamental questions about cognition and perception. Mathematically, the problem can be viewed as an inverse problem on two levels: first, to recover the source intensity image from the scalp measurements; second, to infer the cause of the source activity from the source intensity image recovered in the first step. Shaerdan is aiming to locate the active sources of brainwaves, given measurements of EEG on the surface of the scalp.