Tools for Reproducible Science | astrobites

In some of our daily summaries you may see “open access” written in brackets next to the journal name. In other bites it says “closed access” instead. That’s because some journals, like Nature or Science, don’t offer free access to their articles: readers have to pay if they want to read the latest science news. Luckily, we can get around closed access thanks to arXiv.org, a site where you can find articles for free, but only if the authors have posted them there. Fortunately, most astronomers do this and maximize the accessibility of their work.

Science can only improve if it is open and the results of the work are transparent. Besides easily accessible written articles, the ideal open science world would also let you easily reproduce the results of any paper. This would be easier if authors shared the code they used for the analysis or for creating the figures on an accessible platform. Some scientists are reluctant to do this, as it often means a lot of extra work, and most scientists already have plenty of that. A 2016 Nature survey asked 1,500 scientists about the reproducibility of research results. A third of the respondents had never even considered developing techniques to check data for reproducibility, and only 40% reported using such techniques regularly. Also, up to 70% of researchers had encountered experiments and results that could not be reproduced, not only by other groups of scientists but also by the authors and co-authors of the published papers themselves! The current focus on quantity rather than quality encourages articles with high-profile headlines and equally inflated results, which could lead to further bias.

Let’s break down some key reasons why reproducible and open science matters:

  • Leads to collaboration and thus to improvements: Science is done in collaboration! That’s why we hold conferences – we want to hear what others are working on, exchange ideas and ultimately collaborate with the right people to publish more exciting results. When people can reproduce, build on, and maintain your work, collaboration becomes more effective!
  • Builds trust in the community: In astronomy we build various models and tools to analyze data and develop theories. There are now so many tools that it is hard to keep track of them all, which is exciting! But it can be difficult to know which tool to choose – transparency in the field builds trust between colleagues!
  • Creates interesting debates/discussions: Other astronomers might come to different conclusions with different models or tools – this is where the interpretation of the physics becomes very important. Controversial results are actually exciting too! We can have a lot more confidence in a result when two groups arrive at it using different methods. However, if the results differ, there is more work to be done (e.g. one group might have made a mistake, the methods might be biased, or there may be more than one answer).
  • Helps you work more efficiently: I recently came across a blog post on the website of Professor Lorena Barba’s group. They share an anecdote about how much less time it took them to build on their own earlier results for a new project because those results were reproducible.

In this bite I want to talk about some great tools that can help you make your own science reproducible!

GitHub – a very popular site for collaborative programming. There you can create both public and private repositories (people usually make a repository public once the code is ready). The site’s other powerful feature is version control – it lets you see the changes made to your code and collaborate seamlessly without touching the original code. An example of a GitHub profile is that of Professor Michael Zingale, who is a strong advocate for open science (permission granted).

Zenodo – another popular tool, this one for storing your papers, data files, research software and other research-related artifacts. It’s free to upload to and freely accessible, and this general-purpose repository makes your work citable and shareable.

show your work! – a workflow created by Dr. Rodrigo Luger. It builds on Snakemake, another great workflow tool for reproducible data analysis. The philosophy behind the workflow is that “anyone should be able to generate the article PDF from scratch at the click of a button.” show your work! is integrated with GitHub, Zenodo and Overleaf and can save you a ton of time answering questions about how you got your results, because you can simply share the GitHub repository that the workflow creates for you!
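
To give a feel for what “from scratch at the click of a button” means, here is a minimal Python sketch of a rule-based pipeline. It is only an illustration of the general idea (each output declares its inputs, and a step is rerun only when its output is missing or older than an input). It is not the actual API or implementation of Snakemake or show your work!, and the names `Rule`, `build`, `example_rules` and the file names are hypothetical.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """One step of the pipeline: how to make `output` from `inputs`."""
    output: str
    inputs: List[str]
    action: Callable[[], None]


def is_stale(rule: Rule) -> bool:
    """A target must be rebuilt if it is missing or older than any input."""
    if not os.path.exists(rule.output):
        return True
    built = os.path.getmtime(rule.output)
    return any(os.path.getmtime(path) > built for path in rule.inputs)


def build(target: str, rules: Dict[str, Rule], log: List[str]) -> None:
    """Bring `target` up to date, building its dependencies first."""
    rule = rules.get(target)
    if rule is None:
        # No rule: the file must be a source file that already exists.
        if not os.path.exists(target):
            raise FileNotFoundError(f"no rule to make {target}")
        return
    for dependency in rule.inputs:
        build(dependency, rules, log)
    if is_stale(rule):
        rule.action()
        log.append(target)  # record which steps actually ran


def example_rules(workdir: str) -> Dict[str, Rule]:
    """Hypothetical three-step project: data -> figure -> paper."""
    data = os.path.join(workdir, "data.txt")
    figure = os.path.join(workdir, "figure.txt")
    paper = os.path.join(workdir, "paper.txt")

    def write(path: str, text: str) -> None:
        with open(path, "w") as handle:
            handle.write(text)

    def read(path: str) -> str:
        with open(path) as handle:
            return handle.read()

    return {
        data: Rule(data, [], lambda: write(data, "1 2 3")),
        figure: Rule(figure, [data],
                     lambda: write(figure, f"plot of {read(data)}")),
        paper: Rule(paper, [figure],
                    lambda: write(paper, f"article with {read(figure)}")),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        rules = example_rules(workdir)
        steps: List[str] = []
        # The first run starts from an empty directory, so every step runs.
        build(os.path.join(workdir, "paper.txt"), rules, steps)
        print([os.path.basename(path) for path in steps])
```

Running `build` a second time without changing anything reruns no steps, while changing the data file triggers the figure and the paper again. Real workflow tools add much more on top of this (environments, caching, remote data), but the dependency idea is the same.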

Reproducible workflow in a public cloud for Computational Fluid Dynamics – a workflow developed by Professor Lorena Barba’s group. It lets you carry out your computational studies on a public cloud, Microsoft Azure. The main benefit of the workflow is its speed: “Public cloud resources today are capable of similar performance to a university-managed cluster and can therefore be considered a suitable solution for research computing.” (quoted from the paper)

Professor Lorena Barba’s research group cares deeply about reproducibility and writes about it on their blog. I recommend checking it out!

Making science reproducible might seem like a lot of work – and that’s true, but only at the beginning! As mentioned above, it actually saves a lot of time in the long run. Luckily, the community of people interested in open software is happy to help. For example, the Flatiron Institute hosted an Astronomical Software Development Workshop where people shared their thoughts on open science and on how to keep building and nurturing the community. More workshops will follow in the future (stay tuned!).

Finally, if you look at the numbers from the above-mentioned Nature paper, you will see that, fortunately, most of the factors that contribute to irreproducible research can be readily eliminated, depending on how the scientific community evolves (e.g. by making code and papers available)!

Author’s note: I was not involved in the development of the above workflows. The individuals mentioned in this bite are among the gems of the open science community and are credited solely for their work on reproducible science.

Astrobite edited by Jana Steuer

Featured image credit: Stanford Medicine

About Sabina Sagynbayeva

I am a graduate student at Stony Brook University and my main area of research is planet formation. I am currently working on planetary migration using hydrodynamic simulations. I’m also interested in protoplanetary disks, but almost any topic related to planets fascinates me! Besides research, I’m also a singer-songwriter. I LOVE writing songs and you can find them on any streaming platform.
