The SPA+ project, funded by the German Federal Ministry of Education and Research under grant 05M20MOA, ran from 1 April 2020 to 31 March 2023 and was led by Prof. Dr. Jörg Lücke at the Carl von Ossietzky University of Oldenburg. The consortium included the Universities of Bremen and Siegen, with Oldenburg taking the lead on Part 3, “Model‑based data and label generation with probabilistic generative networks.” The project’s aim was to improve deep‑learning methods for histopathological tumour diagnosis by stabilising training on small, sparsely labelled data sets and by developing mathematically grounded data‑augmentation and label‑generation techniques.
Part 3 focused on probabilistic generative models that learn data representations from limited annotations. The team extended regularised deep mixture models (DMMs) with position and intensity invariance, building on earlier work (FS18). In experiments on both medical and non‑medical data sets, however, standard DMMs that had been scaled up outperformed the invariance‑augmented variants. In the second half of the project, the scalable standard models were combined with label‑propagation techniques, which improved runtimes and classification accuracy on benchmark data sets. Sublinear learning algorithms for Gaussian mixture models, described in a paper submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence, were integrated into the generative framework. This further reduced training time and made it feasible to use more complex probabilistic models as the basis of the label‑propagation scheme.
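The idea of propagating a few labels through a mixture model can be illustrated with a minimal sketch. It is not the project's implementation: it assumes a plain spherical Gaussian mixture fitted with ordinary EM (not the sublinear algorithm and not a deep mixture model), and the function names `responsibilities`, `fit_gmm`, `propagate_labels` and `demo` are hypothetical. Each mixture component receives a soft class profile from the few labelled points it explains, and every unlabelled point inherits a class through its component responsibilities.

```python
import numpy as np


def responsibilities(X, means, variances, weights):
    """Posterior probability of each mixture component for each data point."""
    d = X.shape[1]
    sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    log_p = (np.log(weights)
             - 0.5 * d * np.log(2.0 * np.pi * variances)
             - 0.5 * sq_dist / variances)
    log_p -= log_p.max(axis=1, keepdims=True)  # avoid overflow in exp
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)


def fit_gmm(X, k, n_iter=50):
    """Fit a spherical GMM with plain EM (assumption: illustrative stand-in only)."""
    n, d = X.shape
    # Farthest-point initialisation: deterministic and spreads the initial means.
    means = [X[0]]
    for _ in range(k - 1):
        dist = np.min([((X - m) ** 2).sum(axis=1) for m in means], axis=0)
        means.append(X[dist.argmax()])
    means = np.array(means)
    variances = np.full(k, X.var())
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        resp = responsibilities(X, means, variances, weights)   # E-step
        nk = resp.sum(axis=0) + 1e-12                           # M-step
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        variances = (resp * sq_dist).sum(axis=0) / (nk * d) + 1e-6
    return means, variances, weights


def propagate_labels(X, labeled_idx, labels, n_classes, k):
    """Assign a class to every point using only a few labelled points."""
    means, variances, weights = fit_gmm(X, k)
    resp = responsibilities(X, means, variances, weights)
    onehot = np.eye(n_classes)[np.asarray(labels)]
    # Soft class profile per component, estimated from the labelled points only.
    comp_class = resp[np.asarray(labeled_idx)].T @ onehot + 1e-9
    comp_class /= comp_class.sum(axis=1, keepdims=True)
    return (resp @ comp_class).argmax(axis=1)


def demo(seed=0):
    """Toy data: two well-separated blobs with one labelled point per class."""
    rng = np.random.default_rng(seed)
    a = rng.normal(loc=-5.0, scale=1.0, size=(100, 2))
    b = rng.normal(loc=5.0, scale=1.0, size=(100, 2))
    X = np.vstack([a, b])
    y_true = np.repeat([0, 1], 100)
    pred = propagate_labels(X, labeled_idx=[0, 100], labels=[0, 1],
                            n_classes=2, k=2)
    return (pred == y_true).mean()
```

Calling `demo()` returns the fraction of toy points that receive the correct class from just two labelled examples. In this sketch the mixture is fitted on all data without labels, so a faster or richer model could be swapped in without changing the propagation step.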
A supplemental application, approved before project start, directed the same generative framework toward denoising and detail extraction in electron‑microscopy (EM) images of SARS‑CoV‑2. The denoising algorithm offered two reconstruction modes: one replaces each pixel with the mean of the model estimates for that pixel, the other with the variance of those estimates, providing a novel visualisation of infection scenes. The results were documented in a bioRxiv preprint titled “Visualization of SARS‑CoV‑2 Infection Scenes by ‘Zero‑Shot’ Enhancements of Electron Microscopy Images”. Benchmarking used established fluorescence‑microscopy and TEM data sets of cilia, and the denoising method was further refined in a major revision of a journal submission (SD23). The project also released open‑source software for denoising, label and data generation, and efficient optimisation, making the tools available to the wider scientific and commercial community.
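The two reconstruction modes can be sketched as follows. The report does not say where the multiple estimates per pixel come from, so this sketch assumes they arise from overlapping image patches, each reconstructed separately. The function `aggregate_patch_estimates` and the placeholder estimator `patch_mean_estimator` are hypothetical; in the project the per‑patch estimate would come from the trained generative model.

```python
import numpy as np


def aggregate_patch_estimates(image, patch_size, estimate_patch):
    """Combine overlapping per-patch estimates into mean and variance images.

    estimate_patch is any callable mapping a (patch_size, patch_size) array
    to an array of the same shape (assumption: stands in for the model).
    """
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    total = np.zeros((h, w))
    total_sq = np.zeros((h, w))
    count = np.zeros((h, w))
    for i in range(h - patch_size + 1):
        for j in range(w - patch_size + 1):
            window = (slice(i, i + patch_size), slice(j, j + patch_size))
            est = estimate_patch(image[window])
            total[window] += est
            total_sq[window] += est ** 2
            count[window] += 1
    mean_img = total / count
    # Variance of the estimates per pixel; clip tiny negative rounding errors.
    var_img = np.maximum(total_sq / count - mean_img ** 2, 0.0)
    return mean_img, var_img


def patch_mean_estimator(patch):
    """Crude placeholder estimator: replace the patch by its average value."""
    return np.full_like(patch, patch.mean())
```

With `mean_img, var_img = aggregate_patch_estimates(noisy, 5, patch_mean_estimator)`, `mean_img` plays the role of the denoised reconstruction, while `var_img` highlights pixels where the overlapping estimates disagree, which tends to happen at edges and fine detail.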
Overall, the SPA+ effort achieved its dual objectives: it demonstrated that probabilistic generative models can generate high‑quality labels from few annotations, and it produced a denoising pipeline, built on the same scalable generative framework, that enhances EM images of viral particles. The collaboration among the three universities under Oldenburg’s leadership and the support of the Federal Ministry of Education and Research were essential to these outcomes, which are now disseminated through peer‑reviewed publications and open‑source releases.
