The added value and simplicity of data management: the how, the why and what sound data management can bring to your lab

Bianca-Olivia Nita

Transparent, reproducible, re-usable – the benefits of good data management are manifold. In Monique den Boer’s lab at the Princess Máxima Center for pediatric oncology, structured data management practices are now part of the lab’s culture. This has brought not only a wealth of data that everyone can use, but also various international collaborations and publications. We spoke to Judith Boer, long-time data steward within the lab, about the simple but efficient structure she devised and implemented, and the challenges and long-term benefits of storing data well.

The change started more than five years ago, when the lab went looking for a structured way to store the ever-growing amount of data it was producing. “We wanted to make sure that when people left the lab, others could still find and understand their data. In the beginning we only requested them to make folders for published papers, so that others can see how they went from raw data to processed data and the actual figures or tables in the paper” says Judith Boer.

But that was just the beginning. “We further figured that we have many data collections that are continuously being used, and a lot of primary patient materials. We had a collection of over 1,000 patients with different kinds of molecular data – microarray, sequencing, proteome data. And we understood that, for example, a PhD student generally contributes by running a few hundred samples, so we always make sure we can combine them in one cohort by using the same methods and reference samples. When new people come to our lab they can use data that has been collected. At the same time, when they leave, their own data is also collected” she explains.

It all works like a snowball, and for that to happen it is important to implement consistent data storage practices, so that everyone knows where to find the data, which version it is and how it was pre-processed. “This is when we started with the idea to link the experiment you write in your lab notebook – or more recently type in your e-journal – with the generated data stored on a network drive. The notebook experiment name corresponds with the name of the data folder, and in addition, a ‘data storage box’ in the experiment has a link to the data folder. It is very simple, no technology, no science, but it works. Now everyone does it. You can just point to the folder location and everyone reading the experiment can go back to the data, so there is a link between how the data was made and the data itself. This is the most important thing. Everyone can go look at other people’s folders and find and copy the data and use it for themselves” explains Boer.
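For labs that want to monitor this convention automatically, the link between notebook and network drive could be checked with a few lines of Python. This is only a minimal sketch and not the tool used in the den Boer lab: the `Experiment` record, its field names and the `check_links` function are hypothetical, and it assumes that each experiment's data folder sits directly under one shared root folder and carries exactly the experiment's name.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Experiment:
    """One lab notebook entry (hypothetical structure, for illustration only)."""
    name: str       # experiment name as written in the notebook
    data_link: str  # folder path recorded in the entry's 'data storage box'


def check_links(experiments, data_root):
    """Return (experiment name, problem) pairs; an empty list means every link holds."""
    problems = []
    for exp in experiments:
        # Assumed convention: the data folder is named after the experiment.
        expected = Path(data_root) / exp.name
        if not expected.is_dir():
            problems.append((exp.name, "no data folder with the experiment name"))
        elif Path(exp.data_link) != expected:
            # Literal path comparison; symlinks and drive aliases are not resolved.
            problems.append((exp.name, "data storage box points elsewhere"))
    return problems
```

Calling `check_links` with the notebook entries and the root of the network share returns the experiments whose data cannot be traced. The check only reports problems and never moves or renames anything, so it can be run as often as needed.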

According to her, achieving structure requires that the Principal Investigator sees the value of it. “We have this in Monique, our group leader” she says. “She really stands for high quality data that you know you can trust. And the only way that she can guarantee that you can trust the data is by implementing such a system. If the PI doesn’t want to put time in it, no one in the lab will be motivated to do it. In our group doing this is already part of the culture. And it takes a couple of years to come to this” she adds.

The system she devised and implemented within their lab is simple and efficient, but she would rather not take credit for it. “We had a technician who had worked with a similar system in another lab. It’s the whole idea of linking your data to your experiment. Otherwise, people call their projects names that are hard to guess, they have places on their network folders that are difficult to find and so on. Ours is just a structured way to store data and it makes life easier. Also archiving when one leaves is much easier. I developed it within our lab, but I cannot say we totally developed it ourselves. We mainly saw the value of it and implemented it” she says. “What is really important after implementation is teaching new group members and monitoring to keep up the good practice” she adds.

During Oncode’s Annual Meeting in June, Boer presented this approach to data management by drawing a parallel with the happiness hypothesis described by Sonja Lyubomirsky, a professor of psychology at the University of California, Riverside. “I found it interesting that my presentation then appeared on LinkedIn with the idea that research data management makes you happy. Well, that is not exactly what I said” she laughs. “But perhaps it can be so”.

It all started with a pep talk for the team in the lab, but what came out is a full analogy with our ability to influence happiness, or in this case good data management within a lab. The happiness hypothesis states that 50% of the variability in happiness between people is genetic, 10% is due to circumstances and 40% depends on what we think and what we do. This 40% is something we can influence. When it comes to archiving, “probably 50% of your tendency/ability to archive is genetic” says Boer half-jokingly. “10% is circumstances (e-lab notebook, IT) and 40% is what we think and what we do, therefore it is something we can change. Guidelines, monitoring, and understanding the benefits of such practices make the lab culture. And the culture of our lab is to try to come to good archiving” she adds.

Judith Boer at her desk at the Princess Máxima Center for pediatric oncology

Slide from Judith Boer's presentation during the Oncode CGC Annual Meeting in June 2021

At the Princess Máxima Center, some of the practices have become centralized in recent years. “Now we have a good data management structure at the Máxima, and that really helps. A lot of things I used to tell people in their introduction, now is centrally explained. But the specifics are still up to each lab. What is centrally explained are the network shares type, who has access, how to set those up. Where to put your raw data and shared data. Where you put data to keep private. No patient identifiers, so the data integrity part”.

But the rest depends on each lab. “Building a culture of good data management starts with understanding that you have people with different levels of being organized, and you need guidelines to help bring all of them to an agreed level of archiving, while also monitoring them over time to see if they keep using those. And the benefits are manifold: transparent, reproducible, re-usable, collaboration. The result is a large data collection that can be used either in the country or internationally” she explains.

Some are reluctant to implement structured data management, mostly for fear of the time it takes, but the benefits clearly outweigh the effort. “If you ask people in our lab what they think of this, well, it is a little bit more work, but it is part of the job”.

By now, everyone in the group can instruct a new lab member on how the lab works and stores its data. They also hold monthly lab discussions directly from the lab notebook. “You don’t do a presentation; you just open an experiment. And then people switch and look at someone else’s experiment and see if they can follow it or have any tips to improve it. This keeps the system alive” says Boer. They also practice what they call an internal audit: asking a group member to reproduce a figure from a paper about to be published.

In a nutshell, the secret of good data management in a lab is simple and easily achievable, and the benefits are clear. The PI sets the tone, creating a group culture. A bottom-up, practical approach makes it a group effort in which everyone is involved. The data steward ensures the structure. “It’s an experience that in practice this works and helps. Rare disease in an international setting is our type of work. For example, childhood ALL (acute lymphoblastic leukaemia) is quite a rare diagnosis and there are all sorts of subtypes that make an even smaller group. It is very helpful to share our data with different countries to be able to say something about a subtype of the disease that is present in only 1% of the cases. It has brought us many international collaborations and papers by contributing our sound data. And when this practice becomes culture, the data get wings” adds Boer.

If you want to know more about data management or need advice getting started, you can contact Inga Tharun, programme manager Open Science and FAIR data.

Inga Tharun

Programme Manager

Inga has a background in biochemistry and acceleration of health innovation. She obtained her PhD from the TU Eindhoven after studying the effects of estrogen-receptor post-translational modifications on protein-protein interactions. Since then she has gained many years of experience in accelerating health innovation by managing public-private partnerships, advising the Dutch minister of Health on innovation policy and serving as program manager for the Personal Health Train. Since 2020, she has been responsible for the Oncode Open Science program. Inga: “I am honored to lead the Oncode Open Science program since: Open science is the act of taking responsibility of scientists to return insights into society. This strongly aligns with my personal motivation and is right at the heart of the Oncode ambition: Outsmarting cancer.”


