Spreadsheets are widely used as a digital tool to capture, organize, process and archive information. However, organizing that information properly and logically can be a complex task and does not always come naturally. A recent article therefore approached the topic systematically, presenting 12 practical recommendations to consider when working with spreadsheets (each followed by comments from the PAASP team):
- Be consistent with data entry across a research unit, since this eases understanding and analysis, e.g., agree on how to refer to cell lines or how to abbreviate plasmid names, and stick to it.
- Choose good names for things to be able to understand file names and data entries quickly and allow for error-free processing.
- Write dates like YYYY-MM-DD since this follows the international ISO 8601 standard and makes it easier to sort documents.
- Do not leave any cells of your data set empty, to be able to “distinguish between truly missing values and unintentionally missing values”. This means that placeholders such as “NA” or “-” should be introduced.
- Put just one thing in a cell to avoid any confusion and allow for an automated processing of the data.
- Organize the data as a single rectangle for an easy overview and (referring to point 4) fill any empty cells.
- Create a data dictionary to make it possible for others to understand the data. Such a document for the entire research unit could already cover a lot of information and would require only little modification for the individual experiment.
- Do not include calculations in the raw data files since it is considered best practice to keep all raw data in a “read only” data file to avoid (un)intentional deletion of information.
- Do not use font color or highlighting as data since that cannot be read and used by any automated process and is prone to errors.
- Make backups to ensure that all your data are secured.
- Use data validation to avoid data entry errors, e.g., with the Excel functionality “Data Validation”.
- Save the data in plain text files, e.g., as a CSV file, so that they remain legible in the future: the data sets are then independent of any special formatting or software version.
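Several of the recommendations above (dates written as YYYY-MM-DD, no empty cells, a single rectangle, plain-text storage) lend themselves to an automated check. The following Python sketch is one possible way to do this and is not taken from the article: the function name `check_table`, the column names and the sample values are all invented for illustration, and “NA” is assumed to be the agreed placeholder for missing values.

```python
import csv
import io
import re
from datetime import date

# Strict YYYY-MM-DD pattern (zero-padded month and day).
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_table(csv_text, date_columns=()):
    """Return a list of (row_number, column_name, problem) tuples.

    Hypothetical helper: checks that the table is one rectangle,
    that no cell is empty, and that date columns use YYYY-MM-DD.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    problems = []
    for row_number, row in enumerate(body, start=2):  # row 1 is the header
        if len(row) != len(header):
            # The data no longer form a single rectangle.
            problems.append((row_number, None, "wrong number of cells"))
            continue
        for name, value in zip(header, row):
            if value.strip() == "":
                problems.append((row_number, name, "empty cell"))
            elif name in date_columns and value != "NA":
                try:
                    if not ISO_DATE.fullmatch(value):
                        raise ValueError(value)
                    date.fromisoformat(value)  # rejects e.g. 2021-02-30
                except ValueError:
                    problems.append((row_number, name, "date is not YYYY-MM-DD"))
    return problems


# Invented example data, stored as plain CSV text.
sample = """sample_id,cell_line,collected,viability_pct
S01,HEK293,2021-03-12,93.5
S02,HeLa,12/03/2021,88.1
S03,HEK293,2021-03-14,
S04,HeLa,NA,NA
"""

for problem in check_table(sample, date_columns={"collected"}):
    print(problem)
```

With this sample, the check flags the date in row 3 and the empty cell in row 4, while row 4 of the data (S04), which uses the “NA” placeholder, passes. A check like this complements spreadsheet-side data validation rather than replacing it, since it only runs after the data have been exported.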
These are very helpful and practical recommendations which should be considered by every bench scientist. We, at PAASP, would even suggest agreeing on these recommendations at the level of the research unit, since they will help to reduce errors and make spreadsheets easier for colleagues to understand, ultimately supporting the reproducibility of research results and outcomes.