Explanatory Data Analysis group

Research projects

Below is a (non-exhaustive) list of concrete research projects in which our group is currently involved; also see the overview of research themes and valorisation examples.


Data Science for State-of-the-Art Blood Banking
This project is part of the university-wide Data Science Research Programme.
Project members
Marieke Vinkenoog (Sanquin, LIACS), Matthijs van Leeuwen (LIACS), Aske Plaat (LIACS), Mart Janssen (Sanquin, LIACS; PI)
External partner
Sanquin (Amsterdam)
Period
2018 – 2022
Description
The mission of Sanquin is to be a knowledge-driven organization that provides lifesaving products while committing itself to careful, responsible and efficient processing of the free and voluntary donor gift. At present, however, around 10% of donors are still deferred when tested at the donation site. Apart from a substantial loss of effectiveness and productivity, this leads to a substantial loss of donors. Recent advances in data science have the potential to substantially improve the understanding and control of blood bank processes by uncovering and utilizing known and unknown patterns in historical donation data. We will use (recurrent) neural networks to capture complex association structures within historical data and apply these to predict, for instance, Hb levels and/or no-show probabilities for future donations. A key challenge will be to correctly interpret and explain the obtained models and predictions.
Dementia back in the heart of the community
This project is part of the university-wide Data Science Research Programme.
Project members
Daniela Gawehns (LIACS), Matthijs van Leeuwen (LIACS), Martine Huygens (NIVEL), Sandra van Beek (NIVEL), Peter Groenewegen (NIVEL), Joost Kok (UT), Janke de Groot (NIVEL; PI)
External partners
NIVEL (Utrecht), Stichting Maasduinen (Kaatsheuvel)
Period
2018 – 2022
Description
Park Vossenberg is a long-term care organization that is currently rebuilding its facilities and redesigning the surrounding park area so that residents with dementia can freely use the outdoor park area. There will be no gates, and people from the surrounding residential area will also use the park. This project accompanies this change in care for people with dementia by monitoring activities and changes in persons with dementia, family members, nursing staff, volunteers, and people from the local community. We will do this through a process evaluation and a before-after design, with comparison to other locations where these changes have not yet been implemented. Specifically, we will analyse the activity patterns of persons with dementia as measured by sensors and observations. The results will provide a deeper understanding of interaction patterns between persons with dementia and their environment, and of how these interactions relate to the health and quality of life of people with dementia and to social cohesion in the local community.
The international tax system as a complex system
This project is part of the university-wide Data Science Research Programme.
Project members
Manon Wintgens (Leiden Law School), Matthijs van Leeuwen (LIACS), Irma Mosquera Valderrama (Leiden Law School), Rex Arendsen (Leiden Law School; PI)
Academic partner
Leiden Law School (Leiden)
Period
2017 – 2021
Description
The international tax system is composed of multiple layers, i.e., laws and regulations, jurisdictions, and businesses. Previously, these inherently different layers were often analysed from a fiscal perspective. In contrast, this data-driven research project aims to study the international tax system in its entirety from a complex systems perspective. The main goal will be to investigate whether and how the international tax system can be defined and modelled as a complex system. By approaching the international tax system from this perspective, the project aims to gain new insights into all layers, e.g., 1) the effect of new and modified tax treaties; 2) the interaction between jurisdictions; and 3) the behaviour of business strategies over time. In addition, the project will address questions related to the existence of, e.g., tax gaps, legislative patterns, and tax havens, and how these are reflected in observational data. By applying, e.g., network modelling and pattern discovery, the researchers aim to understand the behaviour of the international tax system as a complex system.
Meta-modelling for privacy-preserving mining of medical data
This project is part of the university-wide Data Science Research Programme.
Project members
Shannon Kroes (Sanquin, LIACS), Matthijs van Leeuwen (LIACS), Rolf Groenewold (LUMC), Rutger Middelburg (LUMC), Mart Janssen (Sanquin, LIACS; PI)
External partner
Sanquin (Leiden)
Period
2017 – 2020
Description
In many domains, and in the medical domain in particular, it is important to protect the privacy of individuals. This implies that medical data often cannot be shared, even when scientific progress could strongly benefit from this. The goal of this ambitious project is to develop methods that construct meta-models of the data that 1) allow data analysis tasks, such as building predictive models, to be performed, while 2) not containing any sensitive information, thus guaranteeing privacy. By publishing these meta-models instead of the data, the scientific community can exploit the data without breaching the privacy of the individuals represented in the data.
SAPPAO – A Systems Approach towards Data Mining and Prediction in Airlines Operations
Joint project with the Natural Computing Group.
Project members
Hugo Manuel Proença (LIACS), Sarang Kapoor (IIT Roorkee), Matthijs van Leeuwen (LIACS), Dhish Saxena (IIT Roorkee), Michael Emmerich (LIACS), Divyam Aggarwal (IIT Roorkee), Thomas Bäck (LIACS; PI)
Industrial partner
GE Aviation (Bangalore, India)
Period
2016 – 2020
Description
By analysing historical flight data and data on the associated disruptive events on the flight network, the NWO-DeitY SAPPAO project aims to optimise the accuracy and reliability of predicting scheduled flight times, thereby potentially saving millions of euros through better utilisation of airplanes, decreased fuel consumption, decreased CO2 emissions, reduced ambient noise and better use of time for passengers and airports. At LIACS we will focus on feature construction for improved flight predictability and reduced airline operating cost. The challenge in this prediction is that it is not clear which features should be used to obtain the best estimates. There is a wide range of available data, including network data, time series data, and so on, which cannot be used straightforwardly in existing attribute-value-based machine learning and statistical techniques. This project will deal with these challenges.

DAMIOSO – Data Mining on High Volume Simulation Output
Joint project with the Natural Computing Group.
Project members
Sander van Rijn (LIACS), Matthijs van Leeuwen (LIACS), Stefan Manegold (LIACS, CWI), Michael Lew (LIACS), Thodoris Georgiou (LIACS), Pedro Holanda (CWI), Thomas Bäck (LIACS; PI)
Other partners
Honda Research Institute Europe (Offenbach, Germany)
Period
2016 – 2020
Description
The DAMIOSO project, funded by NWO and Honda Research Institute Europe, focuses on developing algorithms and tools for data management, data mining and knowledge extraction from the massive volumes of data generated by modern simulation tools. Such tools are used in a wide range of industries (aerospace, automotive, shipping, and others). The aim is to deliver advanced design and process optimisation that supports engineers in their design processes.