The diversity of backgrounds and research interests in our group is reflected in the wide range of technologies that we use and develop. The vignettes below give an overview of tools we have developed. Particular areas of expertise are network-based methods, Virtual Reality technology, multi-omics data integration and image analysis. More information on specific projects, as well as links to code and webapps, can be found under resources.
Our team is pioneering the application of Virtual Reality (VR) technology for exploring large and diverse biological data. On the surface, VR drastically increases the amount of information that can be displayed. On a deeper level, and perhaps more importantly, the immersive 3D space also represents the natural environment in which evolution has shaped human cognition. We perceive and interact with the world in 3D, and basic neurological processes of pattern recognition, learning, and even social behavior are intimately linked to this fundamental experience. Exploring data in VR thus offers unique opportunities for integrating powerful machine learning with innately human capabilities, such as intuition and the ability to generalize from experience and from incomplete or noisy information.
NATURE COMMUNICATIONS (2021)
We built a multi-media kitchen as a central infrastructure for developing and experimenting with new technologies for exploring data. The room is equipped with a range of state-of-the-art 3D technologies. Multiple Virtual Reality stations allow several users to explore a complex dataset simultaneously and collaboratively in a shared virtual environment. The green screen and custom-built video and audio recording equipment allow us to create mixed reality videos and explore new formats for science communication and remote teaching.
Example videos produced in our multi-media kitchen.
In addition to our multi-media kitchen, we also set up a creative workshop for producing physical objects. The workshop is equipped with a 3D printer, a laser cutter, an electronics workbench, and other tools. We use it to develop experimental user interfaces for virtual, augmented and mixed reality (VR/AR/XR) applications, such as data gloves or AR panels, and to produce a wide variety of physical artifacts, such as 3D protein models and art pieces.
Networks offer an intuitive visual representation of complex systems. Important network characteristics can often be recognized by eye and, in turn, patterns that stand out visually often have a meaningful interpretation. In conventional network layout algorithms, however, the precise determinants of a node’s position within a layout are difficult to decipher and to control. Here we propose an approach for directly encoding arbitrary structural or functional network characteristics into node positions. We introduce a series of two- and three-dimensional layouts, benchmark their efficiency for model networks, and demonstrate their power for elucidating structure-to-function relationships in large-scale biological networks.
NATURE COMPUTATIONAL SCIENCE (2022)
NATURE RESEARCH BRIEFING (2022)
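The idea of encoding network characteristics directly into node positions can be illustrated with a minimal sketch. This is not one of the published layouts: the function name `feature_layout`, the choice of degree for the radial coordinate, and the group labels used for the angular coordinate are all assumptions made for illustration.

```python
import math

def feature_layout(edges, groups):
    """Place nodes in 2D so that each polar coordinate encodes one chosen feature.

    radius: shrinks with degree, so the highest-degree node sits at the center
    angle:  set by the node's group label (a hypothetical functional annotation)
    """
    # Degree of every node, computed from an undirected edge list.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    max_degree = max(degree.values())
    labels = sorted(set(groups.values()))
    sector = 2 * math.pi / len(labels)  # each group gets one angular sector

    # Collect the members of each group so they can be spread within the sector.
    members = {label: [] for label in labels}
    for node in sorted(degree):
        members[groups[node]].append(node)

    positions = {}
    for i, label in enumerate(labels):
        nodes = members[label]
        for j, node in enumerate(nodes):
            radius = 1.0 - degree[node] / max_degree
            angle = i * sector + (j + 0.5) * sector / len(nodes)
            positions[node] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions

# Toy network with made-up node names and annotations.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e")]
groups = {"a": "kinase", "b": "kinase", "c": "receptor", "d": "receptor", "e": "receptor"}
positions = feature_layout(edges, groups)
```

Because every coordinate has an explicit meaning here, a node's position can be read back as "how connected" (distance from the center) and "which group" (direction), which is the property that conventional force-directed layouts lack.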
High-content imaging screens provide a cost-effective and scalable way to assess cell states across diverse experimental conditions. The analysis of the acquired microscopy images involves assembling and curating raw cellular measurements into morphological profiles suitable for testing biological hypotheses. Despite being a critical step, general-purpose and adaptable tools for morphological profiling are lacking and no solution is available for the high-performance Julia programming language. Here, we introduce BioProfiling.jl, an efficient end-to-end solution for compiling and filtering informative morphological profiles in Julia. The package contains all the necessary data structures to curate morphological measurements and helper functions to transform, normalize and visualize profiles. Robust statistical distances and permutation tests enable quantification of the significance of the observed changes despite the high fraction of outliers inherent to high-content screens. This package also simplifies visual artifact diagnostics, thus streamlining a bottleneck of morphological analyses.
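The combination of a robust distance with a permutation test can be sketched in a few lines. The example below is written in Python and does not use or reproduce the BioProfiling.jl API; the function names, the median/MAD-based distance, and the single-feature setting are assumptions chosen to keep the sketch short, whereas real morphological profiles are multivariate.

```python
import random
import statistics

def robust_distance(a, b):
    """Difference of medians, scaled by the pooled median absolute deviation."""
    pooled = list(a) + list(b)
    center = statistics.median(pooled)
    mad = statistics.median([abs(x - center) for x in pooled])
    return abs(statistics.median(a) - statistics.median(b)) / (mad or 1.0)

def permutation_p_value(treated, control, n_permutations=2000, seed=0):
    """Estimate how often random relabeling gives a distance at least as large."""
    rng = random.Random(seed)
    observed = robust_distance(treated, control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        permuted = robust_distance(pooled[:n_treated], pooled[n_treated:])
        if permuted >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)

# Made-up measurements of a single morphological feature.
control = [i / 10 for i in range(10)]
treated = [5 + i / 10 for i in range(10)]
p_value = permutation_p_value(treated, control)
```

Medians and the MAD are used instead of means and standard deviations because a handful of extreme cells, which are common in high-content screens, barely shift them.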
Rare genetic diseases are typically caused by a single gene defect. Despite this clear causal relationship between genotype and phenotype, identifying the pathobiological mechanisms at various levels of biological organization remains a practical and conceptual challenge. Here, we introduce a network approach for evaluating the impact of rare gene defects across biological scales. We construct a multiplex network consisting of over 20 million gene relationships that are organized into 46 network layers spanning six major biological scales between genotype and phenotype. A comprehensive analysis of 3,771 rare diseases reveals distinct phenotypic modules within individual layers. These modules can be exploited to mechanistically dissect the impact of gene defects and accurately predict rare disease gene candidates. Our results show that the disease module formalism can be applied to rare diseases and generalized beyond physical interaction networks. These findings open up new avenues to apply network-based tools for cross-scale data integration.
NATURE COMMUNICATIONS (2021)
Drug combinations provide effective treatments for diverse diseases, but also represent a major cause of adverse reactions. Currently there is no systematic understanding of how the complex cellular perturbations induced by different drugs influence each other. Here, we introduce a mathematical framework for classifying any interaction between perturbations with high-dimensional effects into 12 interaction types. We apply our framework to a large-scale imaging screen of cell morphology changes induced by diverse drugs and their combination, resulting in a perturbome network of 242 drugs and 1832 interactions. Our analysis of the chemical and biological features of the drugs reveals distinct molecular fingerprints for each interaction type. We find a direct link between drug similarities on the cell morphology level and the distance of their respective protein targets within the cellular interactome of molecular interactions. The interactome distance is also predictive for different types of drug interactions.
NATURE COMMUNICATIONS (2019)
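The notion of a distance between the protein targets of two drugs within the interactome can be made concrete with a small sketch. The definition used below (the average distance from each target to the nearest target of the other drug) is one plausible choice and is an assumption here, not necessarily the measure used in the study; all names are hypothetical.

```python
from collections import deque

def shortest_path_lengths(adjacency, source):
    """Breadth-first search: hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def mean_closest_distance(adjacency, targets_a, targets_b):
    """Average, over all targets of both drugs, of the distance to the
    nearest target of the other drug (assumed definition)."""
    closest = []
    for sources, others in ((targets_a, targets_b), (targets_b, targets_a)):
        for source in sorted(sources):
            dist = shortest_path_lengths(adjacency, source)
            reachable = [dist[t] for t in others if t in dist]
            if reachable:
                closest.append(min(reachable))
    return sum(closest) / len(closest) if closest else float("inf")

# Toy interactome: a chain of five made-up proteins p1 - p2 - p3 - p4 - p5.
interactome = {
    "p1": {"p2"},
    "p2": {"p1", "p3"},
    "p3": {"p2", "p4"},
    "p4": {"p3", "p5"},
    "p5": {"p4"},
}
distance = mean_closest_distance(interactome, {"p1", "p2"}, {"p4"})
```

In this toy chain the three closest distances are 3, 2 and 2, so the two hypothetical drugs are on average 7/3 steps apart.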
Gene expression data are routinely used to identify genes that on average exhibit different expression levels between a case and a control group. Yet, very few of such differentially expressed genes are detectably perturbed in individual patients. Here, we develop a framework to construct personalized perturbation profiles for individual subjects, identifying the set of genes that are significantly perturbed in each individual. This allows us to characterize the heterogeneity of the molecular manifestations of complex diseases by quantifying the expression-level similarities and differences among patients with the same phenotype. We show that despite the high heterogeneity of the individual perturbation profiles, patients with asthma, Parkinson’s and Huntington’s disease share a broad pool of sporadically disease-associated genes, and that individuals with statistically significant overlap with this pool have an 80–100% chance of being diagnosed with the disease. The developed framework opens up the possibility to apply gene expression data in the context of precision medicine, with important implications for biomarker identification, drug development, diagnosis and treatment.
npj Systems Biology and Applications (2017)
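A personalized perturbation profile of this kind can be sketched as a per-gene comparison of one subject against the control group. The z-score formulation, the threshold of 2.5, the Jaccard comparison, and all function and gene names below are assumptions for illustration, not a reproduction of the published pipeline.

```python
import statistics

def perturbation_profile(subject, controls, z_threshold=2.5):
    """Return {gene: z} for genes whose expression in one subject deviates from
    the control distribution by more than `z_threshold` standard deviations."""
    profile = {}
    for gene, value in subject.items():
        reference = [control[gene] for control in controls]
        mean = statistics.mean(reference)
        sd = statistics.stdev(reference)
        if sd == 0:
            continue  # gene is constant across controls, z-score undefined
        z = (value - mean) / sd
        if abs(z) > z_threshold:
            profile[gene] = z
    return profile

def jaccard(profile_a, profile_b):
    """Overlap between the perturbed gene sets of two subjects."""
    a, b = set(profile_a), set(profile_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Made-up expression values for two genes in four controls and two patients.
controls = [
    {"G1": 5.0, "G2": 3.0},
    {"G1": 5.2, "G2": 3.1},
    {"G1": 4.8, "G2": 2.9},
    {"G1": 5.1, "G2": 3.0},
]
patient_1 = perturbation_profile({"G1": 7.0, "G2": 3.05}, controls)
patient_2 = perturbation_profile({"G1": 5.0, "G2": 1.0}, controls)
overlap = jaccard(patient_1, patient_2)
```

In this toy example each patient has exactly one perturbed gene and the two profiles do not overlap, which mirrors the heterogeneity described above: two subjects with the same diagnosis need not share any individually perturbed gene.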
The observation that disease-associated proteins often interact with each other has fueled the development of network-based approaches to elucidate the molecular mechanisms of human disease. Such approaches build on the assumption that protein interaction networks can be viewed as maps in which diseases can be identified with localized perturbations within a certain neighborhood. The identification of these neighborhoods, or disease modules, is therefore a prerequisite of a detailed investigation of a particular pathophenotype. While numerous heuristic methods exist that successfully pinpoint disease-associated modules, the basic underlying connectivity patterns remain largely unexplored. In this work we aim to fill this gap by analyzing the network properties of a comprehensive corpus of 70 complex diseases. We find that disease-associated proteins do not reside within locally dense communities and instead identify connectivity significance as the most predictive quantity. This quantity inspires the design of a novel Disease Module Detection (DIAMOnD) algorithm to identify the full disease module around a set of known disease proteins. We study the performance of the algorithm using well-controlled synthetic data and systematically validate the identified neighborhoods for a large corpus of diseases.
PLOS COMPUTATIONAL BIOLOGY (2015)
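The role of connectivity significance can be illustrated with a simplified sketch of seed-based module growth. It assumes that significance is measured by a hypergeometric tail probability and that the most significant node is added to the module at each step; it is a toy version for illustration, not the reference DIAMOnD implementation, and the function names and example network are made up.

```python
from math import comb

def connectivity_p_value(n_nodes, n_seeds, degree, links_to_seeds):
    """Hypergeometric tail: chance that a node with `degree` neighbors has at
    least `links_to_seeds` of them among the `n_seeds` module nodes."""
    total = comb(n_nodes, degree)
    upper = min(degree, n_seeds)
    tail = sum(
        comb(n_seeds, i) * comb(n_nodes - n_seeds, degree - i)
        for i in range(links_to_seeds, upper + 1)
    )
    return tail / total

def grow_module(adjacency, seeds, n_steps):
    """Repeatedly add the node whose links to the current module are least
    likely to have arisen by chance. Ties are broken alphabetically."""
    module = set(seeds)
    added = []
    n_nodes = len(adjacency)
    for _ in range(n_steps):
        best = None
        for node in sorted(adjacency):
            if node in module:
                continue
            neighbors = adjacency[node]
            links = len(neighbors & module)
            if links == 0:
                continue
            p = connectivity_p_value(n_nodes, len(module), len(neighbors), links)
            if best is None or p < best[0]:
                best = (p, node)
        if best is None:
            break  # no remaining node touches the module
        module.add(best[1])
        added.append(best[1])
    return added

# Toy network: "a" has two links, both to seeds; "b" is a hub with one seed link.
network = {
    "s1": {"s2", "a", "b"},
    "s2": {"s1", "a"},
    "a": {"s1", "s2"},
    "b": {"s1", "c", "d", "e", "f"},
    "c": {"b", "d"},
    "d": {"b", "c"},
    "e": {"b"},
    "f": {"b"},
}
ranking = grow_module(network, {"s1", "s2"}, n_steps=3)
```

In the toy network the low-degree node `a` is added before the hub `b`, even though both touch the seed set: having two of two links inside the module is far less likely by chance (1/28) than having one of five.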