Step 4 - Documentation and writing it down
Last updated on 2024-11-26
What is documentation
Testimonial
“Documentation is a love letter to your future self” - Damian Conway
We’ve talked about writing down or noting a lot of information, but where do we do this? How? What exactly should we write down?
Some researchers may think of a lab handbook, a group instruction guide, a team wiki or a knowledge repository. In the IT and computation space, the word “Documentation” refers to the place where all information about a project is stored.
For our purposes today, we will use “Documentation” as a place to write down notes about your work, link to important documents or share information across a research group.
Examples include:
These are living documents - processes may improve or lessons are learnt, and this should be reflected in your updates to your manual.
Think about this scenario:
Challenge
If a key member of your team became unavailable suddenly, what would you need to know?
Do you know where their work is stored?
Their data?
Do you know where their procedures and protocols are stored?
How about their research contract? Their data custodianship details?
How to start documenting
Starting can be as simple as opening a Word document and starting to type.
Your institute may have eLabNotebooks, a cloud platform or something similar available. You can even just have a text file or Word document that you store in the same place as your data.
You may want to share it with your supervisor or trusted colleague, or just keep it to yourself.
Consider your audience
Are you writing for:
Yourself?
Your supervisors?
Auditing purposes?
Reviewers and external researchers wanting to replicate your work?
Industry partners whom you may have a contract with?
Your research group?
Staff onboarding and offboarding
Onboarding
Got a new team member?
If you want to document workings across members of a research group, the perfect person to do it is a new student or staff member. Not only does this make them familiar with your procedures, but they don’t have any background knowledge and are viewing everything with fresh eyes.
This is a good time to set expectations in your research group, such as:
Expected contact and work hours?
Best methods of communication across the team?
What are the expected research culture and values?
How to contact IT/Library/Researcher support/campus security/safety officer?
Where do you expect research data is stored?
How do they find out about restrictions and rules around custodianship and sharing of data?
How is authorship and author order decided when submitting papers?
Does the research group post papers as preprints?
Do you expect data to be published as open or FAIR? Is there commercial interest around the data? What licences are expected for research data/code?
How does someone learn about data sensitivity and what they need to be aware of?
Examples of previous ethics, governance, grant applications
What technologies are usually used in the group?
What analysis tools or methodologies are usually used in the group?
How does hardware hire/booking work?
Offboarding
As researchers or research students leave for their next adventure, the knowledge that they’ve collated is also leaving. It’s helpful to have an offboarding checklist to ensure your group is across this information.
Checklist items could include:
Is a copy of their data being stored at your institute? Where?
Are there any colleagues in the research group or supervisors who currently also have access to the data?
Is this raw data, processed data and final data? Or just one of these? Which is which?
What are the details on ownership, custodianship and reuse of this data?
Who were/are the collaborators?
What is the retention date of this data?
Is a copy of the data also going to another institute (whether it be a collaborator or the staff member’s future institute)?
Have you got grant IDs, publication links and any other public information associated with this data? Is the dataset published or stored in an external repository?
Where are any ethics or governance approvals for this data?
What information is there available on analysis, methodology and protocols?
What tools were used for analysis? What software and software versions? What hardware and hardware models? (if applicable)
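If some of the analysis was done in Python, the software and version details can be captured by a short script instead of being typed from memory. The following is a minimal sketch under that assumption; the function name describe_environment and the package list are made up for illustration and should be replaced with whatever your group actually uses.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages):
    """Collect the Python version, operating system and package versions."""
    lines = [
        f"Python: {sys.version.split()[0]}",
        f"Operating system: {platform.platform()}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages too, so the gap is visible later.
            version = "not installed"
        lines.append(f"{name}: {version}")
    return lines


# Hypothetical list: replace with the packages your analysis actually uses.
report = describe_environment(["numpy", "pandas"])
print("\n".join(report))
```

The printed lines can be pasted into the offboarding notes or saved next to the data. Other tools, such as R, SPSS or Stata, have their own ways of reporting versions, which this sketch does not cover.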
Challenge
Let’s document how to make a pizza!
Get a piece of paper and write down how you would do it. You’ll need to list the ingredients you used and describe how you made it.
Give yourself 3 minutes.
Okay, so firstly, let’s talk about our Base.
Presumably, you included a base. But did you describe it? Was it thin or thick pan? Extra wide New York style?
What was the base made out of?
Now, the default answer here is dough. BUT here’s the thing - not everyone may have the same idea about what ‘could’ constitute a pizza base.
Story Time
By @amandamiotto
So one day during a work lunch, a colleague made a passing comment about having a cauliflower pizza. This sparked a spicy debate on ‘Is this a pizza?’.
Further questions revealed they weren’t using cauliflower as a topping - Their version of a pizza was a roasted slice of cauliflower as a base, topped with tomato sauce, pizza toppings and cheese.
There was much debate - Surely this wasn’t a pizza, there was no base! But can a base be a slice of cauliflower? Should a base be a dough? Was this just a vegetable stack?
This led to the conversation of what a base must consist of.
Does a base need to contain flour? If so, does a mix of egg, parmesan cheese and almond flour meet the criteria? Almond flour is a flour after all.
Does a base need to be a yeast-based dough? If so, take the scenario that someone used a naan as a base - if they made a yeast-based naan, it would be pizza, but a yogurt-based naan wouldn’t?
Details are important! This shows that while we have assumptions on what is ‘normal’, this is skewed by our life experience, and others may not have the same ‘normal’.
The Curse of Knowledge
Also referred to as Expert Bias or the Curse of Expertise.
When explaining a concept, we assume that others have the same background knowledge and life experiences that we do.
However, this is rarely true.
Challenge
Let’s stand on the other side:
Now assume you are making a pizza someone else has ordered. They’ve asked for egg on their pizza. How do you do it?
Challenge (continued)
Do they mean just crack a whole egg on top as a ‘sunny side up’ option?
Do you scramble the egg and swirl it around, like a sauce swirl?
Do you cook the egg first and chop it up, then sprinkle it like a topping?
You would have to talk to the client to clarify what exactly they are after.
Future you
Another added benefit to documentation is for yourself. How often have you gone to redo something, or reference something you did a month ago, but you can’t remember the details?
Equally, did you do something yesterday and not write down the details because ‘it was easy/made sense’ and you believe you will just know how to do it later? You don’t have the time, after all.
It may feel like it chews into a lot of time to be so detailed, but you save that time back when you need to retrieve those details.
When you are specific and detailed in your notes, Future You will really appreciate how thoughtful Past You was.
What do I include in my documentation?
To start with, we’ve already talked about the following:
Data Management Plans - As discussed in lesson 1.
What and where you completed PreRegistration of your research project - As discussed in lesson 3.
What and where you published your protocols - As discussed in lesson 3
How you handle missing data - As discussed in lesson 2
Primarily, this documentation is here for you. It is to:
Help you track what you are doing
Have a point of reference for future you
To alleviate some of the mental load of remembering everything
If you realise you’ve taken a wrong turn, this can also act as a ‘History’ of changes.
From here, we can include anything you feel is important. You can just write what you are working on every day, like a diary.
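A diary like this does not need special software. As one possible approach, the minimal sketch below appends a dated line to a plain-text file. The function name add_entry, the file name research_log.txt and the example notes are all made up for illustration.

```python
import tempfile
from datetime import date
from pathlib import Path


def add_entry(logbook, note, day=None):
    """Append one dated line to a plain-text logbook, creating it if needed."""
    day = day or date.today()
    entry = f"{day.isoformat()}  {note}\n"
    with Path(logbook).open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry


# Demo in a temporary folder so nothing is left behind; in practice you would
# point this at a file stored next to your data.
with tempfile.TemporaryDirectory() as folder:
    logbook = Path(folder) / "research_log.txt"
    add_entry(logbook, "Received survey export from collaborator.", date(2024, 4, 15))
    add_entry(logbook, "Removed duplicate rows; see 2_cleandata.csv.", date(2024, 4, 16))
    contents = logbook.read_text(encoding="utf-8")

print(contents)
```

Because entries are only ever appended, the file doubles as a simple history of what changed and when.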
Documentation can also include:
Name and contact of all collaborators
Ethics and Governance details
Grant Details
Contracts and agreements around data ownership, custodianship and permitted uses
Data Transfer Agreements between institutes, partners and consultants
Where has the data come from?
How were participants recruited?
Participant information sheets & consent forms
Survey questions
What settings did your equipment use? What brand and model was the machine that produced the output?
Safety precautions needed
A copy of your Literature Review
Reasoning for experimental design
Reasoning for chosen sample sizes
How results were calculated and reported, including statistical analysis
Any rules for predefining and documenting excluded data to avoid bias
Detailed procedures
What didn’t work in your project?
Some of this, we would have discussed in folder structures in Step 1. You may also have some of this information in your Data Management Plans.
Standard Operating Procedure
A standard operating procedure (SOP) is a set of step-by-step instructions to help carry out routine operations. SOPs aim to achieve efficiency, quality output, and uniformity of performance, while reducing miscommunication and failure to comply with industry regulations.
“Ten simple rules on how to write a standard operating procedure” (Hollmann et al., 2020) is a great start to writing an SOP, and comes with a useful template.
Workflow mapping / Analysis Pipelines
How did you clean and analyse your data?
If you’re using an analysis program like SPSS, Stata or even just Excel, you may be able to export the list of functions or syntax you ran (for example from SPSS or NVivo), so that you keep a record of the code used to create your results. Even just writing down the steps and taking screenshots of the analysis screens is somewhere to start.
Challenge
Can you determine which file(s) was/were used as input in a process that produced a derived file?
A common way is to name your files with a number at the front, so 1_rawdata.csv, 2_cleandata.csv, 3_analysisdata.csv etc.
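If your files follow this numbered convention, a few lines of code can put them in pipeline order and suggest which file fed into which. This is a minimal sketch that assumes every step produces exactly one file named "<step number>_<description>"; the function names are made up for illustration.

```python
import re
from pathlib import Path

# Assumed naming convention: "<step number>_<description>.<extension>"
STEP_PATTERN = re.compile(r"^(\d+)_(.+)$")


def step_number(filename):
    """Return the leading step number of a file name, or None if it has none."""
    match = STEP_PATTERN.match(Path(filename).name)
    return int(match.group(1)) if match else None


def order_by_step(filenames):
    """Sort numbered files by step number, ignoring files without a prefix."""
    numbered = [f for f in filenames if step_number(f) is not None]
    return sorted(numbered, key=step_number)


def likely_input(filename, filenames):
    """Return the file from the previous step, or None for the first step."""
    ordered = order_by_step(filenames)
    position = ordered.index(filename)
    return ordered[position - 1] if position > 0 else None


files = ["3_analysisdata.csv", "1_rawdata.csv", "notes.txt", "2_cleandata.csv"]
print(order_by_step(files))
print(likely_input("3_analysisdata.csv", files))
```

Sorting on the number rather than the text means 10_final.csv comes after 2_cleandata.csv, which a plain alphabetical listing in a file browser would get wrong. The file name only hints at the input, so it is still worth writing the actual inputs down in your notes.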
We’ll talk about automation and testing in the following lessons.
To fully automate your pipeline, you can start to work in a data science language such as R or Python. The Carpentries data science lessons are a great place to start, made for researchers who have never coded before. Workshops are held across the world.
Things to consider
Media work
If you are working with newspapers, artwork or other GLAM (Galleries, Libraries, Archives and Museums) media, have you noted what articles were around it? What context did location add?
Passwords
You may want to create a team password manager, or even just a personal one, to track accounts and passwords. This should prevent you from losing the logins to accounts where data is held or analysis has occurred.
Thankfully, we can now do this! We can refer to our notes, which hold our detailed methods and analysis pipelines and link to where our data is backed up (from Lesson 1).
The work you have done here can also help you with any audits.
A great example of this is “Sorting Out the FACS: A Devil in the Details”. Two laboratories in the USA had contradicting results when replicating fluorescence-activated cell sorting (FACS) profiles of primary breast cells. They worked together for over a year, swapping machines and samples and even working side by side to find the difference. In the end, the methods for stirring a liquid were different, which caused different results.
William C. Hines, Ying Su, Irene Kuhn, Kornelia Polyak, Mina J. Bissell, (2014) Sorting Out the FACS: A Devil in the Details, Cell Reports, Volume 6, Issue 5, Pages 779-781, https://doi.org/10.1016/j.celrep.2014.02.021.
What is your next step?
Useful Resources
Wikipedia contributors. (2024, March 11). List of electronic laboratory notebook software packages. In Wikipedia, The Free Encyclopedia. Retrieved 00:58, April 15, 2024, from https://en.wikipedia.org/w/index.php?title=List_of_electronic_laboratory_notebook_software_packages&oldid=1213192113
Wikipedia contributors. (2024, February 20). Standard operating procedure. In Wikipedia, The Free Encyclopedia. Retrieved 01:20, April 15, 2024, from https://en.wikipedia.org/w/index.php?title=Standard_operating_procedure&oldid=1209098321
Andrey Andreev, Valerie Komatsu, Paula Almiron, Kasey Rose, Alexandria Hughes, Maurice Y Lee (2022) Research Culture: Welcome to the lab eLife 11:e79627. https://doi.org/10.7554/eLife.79627
CESSDA Training Team (2017 - 2022). CESSDA Data Management Expert Guide. Bergen, Norway: CESSDA ERIC. Retrieved from https://dmeg.cessda.eu/. Licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
References
Hollmann S, Frohme M, Endrullat C, Kremer A, D’Elia D, Regierer B, et al. (2020) Ten simple rules on how to write a standard operating procedure. PLoS Comput Biol 16(9): e1008095. https://doi.org/10.1371/journal.pcbi.1008095 published with a CC-BY licence.
Image:
Sonja Bezjak, April Clyburne-Sherin, Philipp Conzett, Pedro Fernandes, Edit Görögh, Kerstin Helbig, Bianca Kramer, Ignasi Labastida, Kyle Niemeyer, Fotis Psomopoulos, Tony Ross-Hellauer, René Schneider, Jon Tennant, Ellen Verbakel, Helene Brinken, & Lambert Heller. (2018). Open Science Training Handbook (1.0) - File:Open Science Training Handbook 03.png[Computer software]. Zenodo. https://doi.org/10.5281/zenodo.1212496 licenced as CC0 Public Domain.
A shout out to the people in the Cauliflower Pizza story - Thanks for letting me share it! @amandamiotto
In this lesson, we have learnt:
What documentation is and how it can help us
How to consider the audience you are writing for
What to document for staff onboarding and offboarding
The curse of knowledge and how that skews our perception
What to include for documentation
We build trust in our knowledge by:
When asked, we can refer to our notes for further information on what we did, how we did it and why