Joshua A. Bull, Philip S. Macklin, Tom Quaiser, Franziska Braun, Sarah L. Waters, Chris W. Pugh, Helen M. Byrne

Fortunately, pathology is becoming increasingly digitised. As slide scanning becomes more widespread, it creates an opportunity to use computer vision algorithms to analyse extremely high-resolution images. Many algorithms have been designed to find positively stained cells in immunohistochemistry (IHC) images. While this is not a trivial task, image analysis is becoming more accessible to non-coders (including via open source software such as QuPath) and the performance of cell identification algorithms is constantly improving.

So, if we assume that we can identify the locations of individual immune cells – say, macrophages – from an IHC image, then the question becomes: now what? What can we do with this information? Using the (x,y)-coordinates of macrophages we can certainly calculate statistics like cell counts or densities more accurately than via manual assessment, but these summaries ignore how the cells are arranged in space. More detailed analyses might involve spatial statistics, which can provide information about relationships between pairs of points, or techniques such as topological data analysis (TDA), which can describe the structure of a dataset.
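To make the contrast between a density and a pairwise spatial statistic concrete, here is a minimal sketch in Python. It is not the code used in the paper: the function names, the unit-square window and the uniformly random test points are all assumptions made for illustration, and Ripley's K-function is used here simply as a familiar example of a statistic built from pairs of points. The estimator applies no edge correction, so it underestimates K slightly at larger radii.

```python
import numpy as np

def cell_density(points, width, height):
    """Number of cells per unit area in a width x height window."""
    return len(points) / (width * height)

def ripley_k(points, width, height, radii):
    """Naive estimate of Ripley's K-function (no edge correction).

    For a completely random pattern K(r) is close to pi * r**2;
    larger values suggest clustering at scale r, smaller values
    suggest that points avoid each other.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    area = width * height
    # Pairwise distances, keeping each unordered pair once.
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    pair_dists = dists[np.triu_indices(n, k=1)]
    # Count ordered pairs, so each unordered pair contributes twice.
    counts = np.array([2 * np.count_nonzero(pair_dists <= r) for r in radii])
    return area * counts / (n * (n - 1))

# Hypothetical input: 500 uniformly random "cell centres" in a unit square.
rng = np.random.default_rng(0)
points = rng.random((500, 2))
radii = np.array([0.03, 0.05, 0.1])

print("density:", cell_density(points, 1.0, 1.0))
print("K estimate:", ripley_k(points, 1.0, 1.0, radii))
print("K under complete randomness:", np.pi * radii ** 2)
```

Two patterns with the same density can give very different K estimates, which is the extra information that counts and densities alone do not carry.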

While there is an array of mathematical and statistical techniques which could be used to describe this type of point cloud data, it is unclear which should be prioritised. The current state-of-the-art description of immune cell infiltration is manual evaluation by a pathologist; no mathematical descriptor is quite able to capture this yet. Our aim in this paper was to explore ways in which different spatial statistics could be combined to approximate pathologists' evaluations.

We decided to focus on macrophage localisation within tumour or stroma, using only the (x,y) coordinates of the macrophages (i.e., no labelling to distinguish tumour from stroma, as this is generally unavailable). In an ideal world, we'd calculate different spatial statistics from a huge dataset of manually labelled point patterns, and use that to infer the labelling. Unfortunately, such datasets are difficult to come by. So, being mathematical modellers at heart, we decided to make one.

We generated synthetic point patterns that imitated point patterns from real IHC images: each pattern was based on an underlying "tumour/stroma" map, which could be varied programmatically to produce regions with small and highly mixed tumour and stroma areas, or regions with large, distinct areas. We placed points within these regions to mimic the locations of macrophages in the real images. A key parameter, ρ, describes the ratio of macrophages in the simulated tumour to those in the stroma. Varying ρ produces images which appear to have different degrees of macrophage infiltration into the tumour regions. This suggests that identifying the value of ρ used to generate a point pattern by observing a range of spatial statistics could be a stepping stone to describing infiltration in IHC images.
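The idea can be illustrated with a small Python sketch. This is not the generator used in the paper: the block-based map, the function names and the reading of ρ as a ratio of macrophage densities (tumour relative to stroma) are all assumptions made for this example. The number of blocks controls how finely the two region types are mixed.

```python
import numpy as np

def make_region_map(n_blocks, rng, tumour_fraction=0.5):
    """Hypothetical stand-in for a tumour/stroma map: an n_blocks x n_blocks
    grid over the unit square, True for tumour blocks, False for stroma."""
    return rng.random((n_blocks, n_blocks)) < tumour_fraction

def sample_macrophages(region_map, rho, n_points, rng):
    """Place n_points in the unit square so that the point density in tumour
    blocks is rho times the density in stroma blocks (assumed meaning of rho).

    Uniform proposals are thinned with region-dependent acceptance
    probabilities whose ratio is rho.
    """
    if rho < 0:
        raise ValueError("rho must be non-negative")
    n_blocks = region_map.shape[0]
    scale = max(rho, 1.0)
    p_tumour, p_stroma = rho / scale, 1.0 / scale
    coords, in_tumour = [], []
    while len(coords) < n_points:
        xy = rng.random(2)
        # Block indices; the minimum guards against floating-point rounding.
        i, j = np.minimum((xy * n_blocks).astype(int), n_blocks - 1)
        tumour = bool(region_map[i, j])
        if rng.random() < (p_tumour if tumour else p_stroma):
            coords.append(xy)
            in_tumour.append(tumour)
    return np.array(coords), np.array(in_tumour)

rng = np.random.default_rng(1)
region_map = make_region_map(8, rng)   # 8 x 8 blocks: moderately mixed regions
coords, in_tumour = sample_macrophages(region_map, rho=4.0, n_points=2000, rng=rng)
print(coords.shape, in_tumour.mean())
```

In the study itself only the coordinates would be passed to the spatial statistics; the region labels are returned here purely so that the sketch can be checked.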

**Figure 1**. (a) Regions taken from head and neck cancer IHC slides showing macrophage locations, evaluated as having differing levels of macrophage infiltration into tumour nests. Features of spatial statistics like the pair-correlation function (b) and J-function (c) correlate with pathologists' evaluations, and so can be used to identify a metric which is predictive of these descriptions.
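
For readers unfamiliar with the two statistics named in the caption, their standard textbook definitions for a stationary point pattern are given below. The notation is the conventional one from the spatial statistics literature and is not taken from the paper: $K$ is Ripley's K-function, $G$ is the nearest-neighbour distance distribution function and $F$ is the empty-space function.

```latex
% Pair-correlation function in two dimensions
g(r) = \frac{1}{2\pi r}\,\frac{\mathrm{d}K(r)}{\mathrm{d}r}

% J-function, defined wherever F(r) < 1
J(r) = \frac{1 - G(r)}{1 - F(r)}

% For a completely random (homogeneous Poisson) pattern:
K(r) = \pi r^{2}, \qquad g(r) = 1, \qquad J(r) = 1
```

Values of $g(r)$ above 1 or of $J(r)$ below 1 indicate clustering at distance $r$; values of $g(r)$ below 1 or of $J(r)$ above 1 indicate that points are more regularly spaced than a random pattern.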

While the combination of three statistics that we considered isn't powerful enough to define a new metric for describing immune cell infiltration, our approach is a proof-of-concept which can be extended to combine a much wider array of statistics. Our next steps will be to identify combinations of statistics which can more accurately reproduce pathologists' classifications, and to explore links between the classifications and patient outcomes.

- Bull, J.A., Macklin, P.S., Quaiser, T. et al. Combining multiple spatial statistics enhances the description of immune cell localisation within tumours. Sci Rep 10, 18624 (2020). https://doi.org/10.1038/s41598-020-75180-9