Research Data Management: Case Studies

This guide provides a comprehensive overview of Research Data Management (RDM) best practices, resources, and services available at the University of Waterloo.

I submitted the wrong paper?!

Rahul is a new professor at his university. He’s been busy working on a paper for a journal while balancing an intense teaching semester. 
 
His paper has gone through several drafts, and he’s been working on it across various devices. He’s almost ready to submit a draft of the paper to a journal, but he can’t remember which device has the latest version, so he ends up rewriting some sections. 
 
Having done his best to bridge the different drafts, he attaches “articlefinalversion” and sends it off to the journal, before running off to another meeting. 
 
A few weeks go by, and Rahul receives an email from the journal he submitted to. The editors explain that the article they received did not appear to be fully finished and as a result they will not be reviewing it further. 
 
Upon seeing this message, Rahul checks the file that he attached and notices that it was in fact the wrong file, and that the latest version he had been working on was the similarly titled “articlefinal”. All the time spent working on the paper is now jeopardized by a simple mistake, which could have been avoided if Rahul had begun the writing process with a more clearly defined system for naming his files. If he had used short descriptive titles with version numbers, like “ACME_Paper_v03”, he would have been better able to identify the correct file.
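As a rough illustration of how a versioned naming convention makes the latest draft easy to find, here is a minimal Python sketch. It assumes file names follow the "ACME_Paper_v03" pattern from the example above (a short description, then "_v" and a zero-padded number, with an optional extension). The function names parse_version, latest_version, and next_name are hypothetical and chosen only for this example.

```python
import re
from typing import List, Optional

# Assumed convention: <short description>_v<digits>, optionally followed by an extension,
# e.g. "ACME_Paper_v03" or "ACME_Paper_v03.docx".
VERSIONED_NAME = re.compile(r"^(?P<stem>.+)_v(?P<version>\d+)(?P<ext>\.\w+)?$")


def parse_version(filename: str) -> Optional[int]:
    """Return the version number in a file name, or None if it does not follow the convention."""
    match = VERSIONED_NAME.match(filename)
    return int(match.group("version")) if match else None


def latest_version(filenames: List[str]) -> Optional[str]:
    """Return the file name with the highest version number, ignoring unversioned names."""
    versioned = [(parse_version(name), name) for name in filenames]
    versioned = [(version, name) for version, name in versioned if version is not None]
    if not versioned:
        return None
    return max(versioned)[1]


def next_name(filename: str) -> str:
    """Build the name for the next draft, keeping the same zero padding."""
    match = VERSIONED_NAME.match(filename)
    if match is None:
        raise ValueError(f"{filename!r} does not follow the <name>_v<NN> convention")
    width = len(match.group("version"))
    new_version = int(match.group("version")) + 1
    ext = match.group("ext") or ""
    return f"{match.group('stem')}_v{new_version:0{width}d}{ext}"


drafts = ["ACME_Paper_v01.docx", "ACME_Paper_v03.docx", "ACME_Paper_v02.docx", "articlefinal.docx"]
print(latest_version(drafts))            # the highest-numbered draft
print(next_name(latest_version(drafts))) # the name to use for the next draft
```

Names such as "articlefinal" and "articlefinalversion" carry no version number, so a check like this cannot order them. That is the same ambiguity that tripped Rahul up.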

Relevant data lifecycle stages: Plan, Share

Related RDM Concepts: File Naming, Data Management Plan

My work speaks for itself...

Jim is a well-established researcher. He is recognized as a world-leading expert in his field, but recently there have been questions circulating about the validity of some of his work. 
 
There is one article from early in his career that is drawing scrutiny due to an issue with the research methodology, which has been complicated by the fact that the dataset used for the paper is nowhere to be found. A lot of Jim's current work has been built on earlier findings from this paper, which is causing serious concern at the institution. 
 
Jim has found it hard to rebut the accusations because his only response is that the research methodology is evident from reading the paper, and he has not provided access to the original dataset beyond what is presented in the paper. 
 
His department, and the university at large, are under pressure to clarify the situation or risk damage to their reputations. If only Jim had followed better data management practices, like data documentation and data deposit, this controversy might never have begun.

Relevant data lifecycle stages: Collect, Process, Analyze

Related RDM Concepts: Data Deposit, Data Management Plan

There's no such thing as too much data...

Omaima is a history student working with archival records. 
 
When she started her research, she was worried that she wouldn't have many records to go through, but she has recently gained access to a large archive of state records. She has arranged to make a trip to visit the archive to look at the documents for the first time. 
 
At the archive, Omaima finds lots of interesting and useful materials for her research. There's only one problem... 
 
None of the collection has been digitized and the location is not easy to access for a return visit. If she had been better prepared, she might have been able to bring additional tools to help her digitize some of the documents while she was visiting.

Relevant data lifecycle stages: Plan, Collect, Preserve

Related RDM Concepts: Data Size, File Types, Data Management Plan

Thanks for Sharing!

Maya is a grad student working on a study related to social determinants of health in big cities and she has been collecting a sizable amount of data through surveys. She is particularly interested in comparing her data with similar studies in other locations, but she is having trouble making this comparison with only the information found (and accessible) in published papers. 
 
Frustrated with these results, Maya books an appointment with the library. With the guidance of a librarian, she gets redirected to several databases where she can access the raw data collected by some of the researchers whose work she was interested in. 
 
Maya finds most of the data that she was looking for, though there are institutions and researchers whose data she would have loved to include but couldn’t access. Pressed for time to finish her study, and feeling like she wouldn’t get a response if she emailed them directly, she works with the data that was most easily available. 
 
Her paper is a big success in the discipline and the researchers whose data she used are happy that their work was reusable and included in another study. Going forward, Maya makes sure to make her own data as accessible as possible, ensuring that there’s no disappointment when another researcher steps into a similar situation.  

Relevant data lifecycle stages: Collect, Share, Reuse

Related RDM Concepts: Data Deposit, Repository Selection

Unknown Acronyms: Who (or what) is "MATT"?

Lynn is a postdoctoral fellow, working in a biology lab. 
 
While settling into the new lab environment, Lynn has started going through some work done by a recent master’s student in the department. From what she can tell, the work they did was really good and could contribute to the research that Lynn was hoping to do. She only has one problem: the previous student did not do a good job of documenting their files and Lynn is having a hard time deciphering what certain words mean. 
 
In several documents, there are notes referring to "MATT" and Lynn has been trying to figure out whether this is an acronym, the name of another researcher in the department, or something else entirely. She has gone to every "Matt" in the department to try and figure this out, but no one has any information. 
 
The previous student did leave their contact information, but they haven't been replying to emails. Lynn may be forced to work with only the documented portion of the data that the previous student collected.

Relevant data lifecycle stages: Plan, Process, Analyze, Reuse

Related RDM Concepts: Data Documentation, README File

Our project's going to last forever!

Sara has been leading a large-scale project at the university, with collaborators from various universities across Canada and the United States. 
 
The project is expected to take around 10 years to complete and is currently in year three. According to the stipulations of the grant they've received, the project has to place an emphasis on knowledge mobilization and making the results of the research available and accessible to the public. 
 
Sara's team has been working on a website to host the first phase of research findings. While the construction of the website has been going smoothly, Sara has come to the realization that the project did not take into consideration the costs of hosting the website beyond a couple of years. She is trying to figure out a way to ensure that the research won't simply disappear from the public after the grant funding runs out. If she can't find a solution, her work, and that of her research team, might go to waste. 

Relevant data lifecycle stages: Plan, Preserve

Related RDM Concepts: Data Management Plan, Repository Selection