IRIS-5D

ML and Virtual Cells

The IRIS-5D hardware platform can produce massive, high-dimensional datasets: large 5D volumes (x, y, z, time, channels) of living cells under many conditions. Analysing this data at scale requires advanced machine learning (ML) and AI models that go beyond simple quantification, enabling us to extract hidden patterns, predict behaviours, and ultimately generate virtual representations of cells (“Virtual Cells”).

An ideal computational pipeline should:

  • Automate experiment setup and imaging decisions.
  • Denoise and enhance image quality for gentle, long-term live-cell imaging.
  • Segment and quantify subcellular structures across thousands of cells.
  • Build generative models that learn the rules of cell behaviour.
  • Create “Virtual Cells” that allow prediction, visualisation, and simulation of cellular processes.

We have designed such a pipeline, organised into the stages described below:

ML in the Imaging Pipeline

1. Experiment Automation

  • Cell detection and localisation: Neural networks operate directly on raw volumetric OPM (oblique plane microscopy) data to find cells of interest.
  • Automated cell selection: Using user-provided labels (e.g. healthy/unhealthy, cell-cycle stage, receptor localisation), ML decision boundaries are fine-tuned so that the most valuable cells can be selected for imaging in real time.
  • Adaptive parameter optimisation: Models predict the best laser power, exposure, and acquisition strategy for each cell, maximising image quality and throughput.
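
The automated cell selection step above can be illustrated with a minimal sketch. It assumes each detected cell has already been reduced to a small feature vector and that a user has labelled some cells as healthy (1) or unhealthy (0). The logistic-regression boundary, the two features, and the function names `fit_logistic` and `select_cells` are illustrative assumptions, not the platform's actual models or API.

```python
import numpy as np

def fit_logistic(features, labels, lr=0.1, steps=500):
    """Fit a logistic-regression decision boundary by gradient descent."""
    X = np.hstack([features, np.ones((len(features), 1))])  # append bias column
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))           # predicted P(label = 1)
        w -= lr * X.T @ (p - labels) / len(labels)  # mean cross-entropy gradient
    return w

def select_cells(features, w, k):
    """Score candidate cells and return the indices of the k highest scores."""
    X = np.hstack([features, np.ones((len(features), 1))])
    scores = 1.0 / (1.0 + np.exp(-X @ w))
    order = np.argsort(-scores)                     # descending score
    return order[:k], scores

rng = np.random.default_rng(0)
# Hypothetical standardised features per cell (e.g. volume, mean intensity).
healthy = rng.normal(loc=[1.0, 1.0], scale=0.3, size=(50, 2))
unhealthy = rng.normal(loc=[-1.0, -1.0], scale=0.3, size=(50, 2))
features = np.vstack([healthy, unhealthy])
labels = np.concatenate([np.ones(50), np.zeros(50)])

w = fit_logistic(features, labels)

# Four newly detected candidate cells; image only the two best-scoring ones.
candidates = np.array([[0.9, 1.1], [-1.2, -0.8], [1.1, 0.8], [0.0, 0.0]])
chosen, scores = select_cells(candidates, w, k=2)
```

In a real acquisition loop the features would presumably come from the detection network, and the boundary would be updated as new user labels arrive; the ranking step itself stays the same.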

2. Image Quality and Denoising

  • Generative models of image formation: Deep generative networks trained on photon-counting data denoise extremely low-light images, allowing gentle imaging that preserves cell viability.
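
To make the photon-counting setting concrete, the sketch below simulates Poisson shot noise on a low-light volume and evaluates the Poisson negative log-likelihood that a denoising model could be scored against. It is not the deep generative network described above; the function names, the volume shape, and the photon rates are assumptions chosen for illustration.

```python
import numpy as np

def simulate_photon_counts(expected_photons, rng):
    """Draw photon counts under a pure Poisson (shot-noise) model."""
    return rng.poisson(expected_photons)

def poisson_nll(counts, rate, eps=1e-8):
    """Poisson negative log-likelihood, omitting the log(counts!) constant."""
    return float(np.sum(rate - counts * np.log(rate + eps)))

rng = np.random.default_rng(0)

# Hypothetical low-light volume (z, y, x): 0.5 to 5 expected photons per voxel.
true_rate = np.broadcast_to(np.linspace(0.5, 5.0, 32), (8, 32, 32))
counts = simulate_photon_counts(true_rate, rng)

# A structure-free estimate (global mean) explains the counts worse than the
# true underlying rate, which is what a good denoiser should approach.
flat_rate = np.full(true_rate.shape, counts.mean())
nll_true = poisson_nll(counts, true_rate)
nll_flat = poisson_nll(counts, flat_rate)
```

A lower likelihood cost for an estimate that preserves structure is the basic signal a likelihood-based denoiser can exploit; real detectors add read noise and offsets that this sketch ignores.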

3. Segmentation and Quantification

  • Adaptation: Pretrained segmentation networks are tuned to novel microscopy data using efficient adaptation techniques.
  • Multi-structure segmentation: Plasma membrane, endosomes, Golgi, ER, nucleus, mitochondria, and more are segmented in full 5D volumes.
  • Population statistics: Quantitative descriptors (shape, size, localisation, dynamics) are extracted from populations of thousands of cells and distilled into population-level metrics.
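
The population statistics step can be sketched as follows, assuming segmentation has already produced an integer label volume in (z, y, x) order with 0 as background. The descriptor set (volume and centroid only), the voxel size, and the function names are illustrative assumptions.

```python
import numpy as np

def describe_labels(label_volume, voxel_size=(1.0, 1.0, 1.0)):
    """Compute volume and centroid for every non-zero label in a 3D volume."""
    voxel_size = np.asarray(voxel_size, dtype=float)
    voxel_volume = float(np.prod(voxel_size))
    labels = np.unique(label_volume)
    descriptors = {}
    for lab in labels[labels != 0]:                 # skip background
        coords = np.argwhere(label_volume == lab)   # (n_voxels, 3) indices
        descriptors[int(lab)] = {
            "volume": coords.shape[0] * voxel_volume,
            "centroid": coords.mean(axis=0) * voxel_size,
        }
    return descriptors

def population_summary(descriptors):
    """Distil per-object descriptors into population-level metrics."""
    volumes = np.array([d["volume"] for d in descriptors.values()])
    return {
        "n_objects": int(len(volumes)),
        "mean_volume": float(volumes.mean()),
        "std_volume": float(volumes.std()),
    }

# Toy label volume with two segmented objects.
vol = np.zeros((10, 10, 10), dtype=int)
vol[1:3, 1:3, 1:3] = 1   # 2 x 2 x 2 = 8 voxels
vol[5:9, 5:9, 5:9] = 2   # 4 x 4 x 4 = 64 voxels

# Hypothetical anisotropic voxel size in micrometres (z, y, x).
desc = describe_labels(vol, voxel_size=(0.5, 0.1, 0.1))
summary = population_summary(desc)
```

Applying the same two functions per time point and per channel would extend the sketch towards the full 5D case, with dynamics derived from how the descriptors change over time.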

Virtual Cells

1. Latent Representations of Cell States

  • Each cell state is represented as a vector in a learned latent space, with structure induced by the distributions of subcellular structures in the cell population.
  • Generative models capture the co-dependence between structures, allowing us to predict unlabelled organelles from labelled ones.
  • Representative cells can be reconstructed and visualised, enabling human-interpretable summarisation of population-level information.
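
As a simplified illustration of a latent representation and of picking a representative cell, the sketch below uses PCA as a linear stand-in for the learned latent space. The real models are generative networks; the synthetic descriptors, the latent dimension, and the function names here are assumptions.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit a linear latent space: return the mean and the top principal axes."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def encode(X, mean, components):
    """Map descriptors to latent vectors."""
    return (X - mean) @ components.T

def decode(Z, mean, components):
    """Reconstruct descriptors from latent vectors."""
    return Z @ components + mean

def representative_index(Z):
    """Index of the cell closest to the population centre in latent space."""
    centre = Z.mean(axis=0)
    return int(np.argmin(np.linalg.norm(Z - centre, axis=1)))

rng = np.random.default_rng(0)
# Synthetic population: 200 cells, 6 descriptors driven by 2 hidden factors.
hidden = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = hidden @ mixing + 0.01 * rng.normal(size=(200, 6))

mean, components = fit_pca(X, n_components=2)
Z = encode(X, mean, components)          # one latent vector per cell
X_rec = decode(Z, mean, components)      # reconstruction from the latent code
rep = representative_index(Z)            # a real cell that summarises the population
```

Choosing an actual cell near the latent centre, rather than decoding the mean vector, keeps the summary tied to something that was really observed; decoding the mean is the alternative when a synthetic representative is acceptable.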

2. Hierarchical Generative Models

  • Variational hierarchical generative models of subcellular structures and whole cells are trained to represent individual cell states and their distributions across populations.
  • This allows probing of novel cell states sampled from the learned distribution, providing insights at the scale of both cell populations and individual cells.
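
The idea of sampling novel cell states from a hierarchy can be sketched with a toy two-level Gaussian model: a population-level latent is drawn first, cell-level latents are drawn around it, and each cell latent is decoded to observable descriptors. This shows only ancestral sampling from an assumed model, not variational training; the linear decoder, the scales, and the function name are hypothetical.

```python
import numpy as np

def sample_hierarchical(n_populations, cells_per_population, rng,
                        latent_dim=2, obs_dim=5,
                        pop_scale=2.0, cell_scale=0.5, noise_scale=0.1):
    """Ancestral sampling: population latent -> cell latent -> descriptors."""
    # Hypothetical fixed linear decoder standing in for a trained network.
    decoder = rng.normal(size=(latent_dim, obs_dim))

    # Level 1: one latent vector per population (e.g. per condition).
    pop_latents = rng.normal(scale=pop_scale, size=(n_populations, latent_dim))

    # Level 2: each cell's latent scatters around its population latent.
    cell_latents = pop_latents[:, None, :] + rng.normal(
        scale=cell_scale,
        size=(n_populations, cells_per_population, latent_dim),
    )

    # Observation level: decode each cell latent and add measurement noise.
    observations = cell_latents @ decoder
    observations = observations + rng.normal(
        scale=noise_scale, size=observations.shape
    )
    return pop_latents, cell_latents, observations

rng = np.random.default_rng(0)
pop_z, cell_z, obs = sample_hierarchical(3, 500, rng)
```

Sampling at the top level probes new population-level states, while fixing a population latent and resampling the second level probes cell-to-cell variability within one population.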

3. Dynamics and Time Evolution

  • Generative temporal models learn the temporal evolution of cells through trajectories in the latent state space.
  • Models capture both short-term transitions (Markovian) and long-range dependencies (non-Markovian).
  • These models enable predictive simulations of cell behaviour, including responses to stimuli.
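
The difference between Markovian and non-Markovian latent dynamics can be shown with a minimal linear sketch: in the first rollout the next state depends only on the current one, in the second it also depends on the previous state. The linear form, the matrices `A` and `B`, and the function names are assumptions for illustration; the actual temporal models are learned.

```python
import numpy as np

def rollout_markov(z0, A, n_steps, rng=None, noise_scale=0.0):
    """Markovian rollout: z[t+1] = A z[t] (+ optional Gaussian noise)."""
    traj = [np.asarray(z0, dtype=float)]
    for _ in range(n_steps):
        z_next = A @ traj[-1]
        if rng is not None and noise_scale > 0:
            z_next = z_next + rng.normal(scale=noise_scale, size=z_next.shape)
        traj.append(z_next)
    return np.stack(traj)

def rollout_with_memory(z0, A, B, n_steps):
    """Rollout with one step of memory: z[t+1] = A z[t] + B z[t-1]."""
    z0 = np.asarray(z0, dtype=float)
    traj = [z0, z0]                      # pad the history so that z[-1] = z[0]
    for _ in range(n_steps):
        traj.append(A @ traj[-1] + B @ traj[-2])
    return np.stack(traj[1:])            # drop the padding entry

# Hypothetical 2D latent dynamics: a slowly decaying rotation.
theta = 0.3
A = 0.95 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
B = 0.1 * np.eye(2)
z0 = np.array([1.0, 0.0])

markov_traj = rollout_markov(z0, A, n_steps=20)
memory_traj = rollout_with_memory(z0, A, B, n_steps=20)
```

Each row of the returned arrays is one point on a trajectory through latent space; decoding those points with a generative model of cell states would turn the rollout into a predicted sequence of cell appearances.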

Outputs and Sharing

  • Software: ML pipelines, APIs, and GUIs for automated imaging, segmentation, and generative modelling.
  • Data: Large-scale 5D datasets with full metadata, available via open repositories.
  • Models: Pre-trained generative models of cell states and dynamics for reuse by the bioimaging and AI communities.