Open notebook science thoughts inspired by the biomedical research symposium

First let me say that my presentation went amazingly. I spoke for ten minutes with a five-minute Q&A session after. The audience consisted of IMSD students, their faculty advisors, and their friends and family. It was also easily the largest audience I’ve spoken in front of. I estimate about 70-100 people.

The Q&A was quite exciting. I received questions from every audience group: students, faculty, and nontechnical attendees. This is very important to me because it tells me that my presentation was engaging and impactful. The questions filled the entire allotted time and spilled over into the break period.

During the break I spoke with students interested in open notebook science, participating in it, and reaching out to others who could be interested in it. I also spoke with nontechnical audience members (the friends and family) about the core values of ONS. This was very exciting to me because it shows me that taxpayers and the general public care much more about science and information than we give them credit for. I always say ONS is more than just data provenance; it is also scientific outreach, and the conversations I had today demonstrate that.

I also received some interesting comments from faculty. I always receive comments from this crowd regarding two things: 1) information signal-to-noise and 2) data protection. Here are my thoughts regarding these two issues:

1) Scientists already publish a ton, and there is already a decent amount of bad science. Open notebook science does not worsen this perceived signal-to-noise problem. In this case signal-to-noise refers to the amount of relevant scientific information (for your interests) in the sea of all publications. By providing open access to your data and keeping a complete account of your research you are actually making it easier to find the information you want and need.

If you are reading a journal article, you have to sift through the document and spend time trying to understand what the author is proposing with their conclusions. With open notebooks that information is laid out for you in plain English with minimal effort required. You need a protocol? Here are my steps. You want the raw data? Here you go.

While it is true that open notebooks will increase the amount of published scientific information in the world, in this case that overload enhances the discovery process and minimizes the time required to follow up on prior experiments. I can’t count the number of times I’ve saved oodles of time by finding information in my notebook and the informally published documentation of others.

2) When it comes to data protection, I will admit that I don’t know everything there is to know about this. I also have never been a victim of data theft (scooping) myself. But I don’t see how open notebooks can increase the frequency of data thievery.

Firstly, bad scientists are trained to be that way. There aren’t many people who want to be a bad or ethically incompetent scientist. Those who are were trained to be that way. In graduate school you are prepared to lead the next generation of science and you learn from those around you. If your PI follows a negative code of ethics you won’t have all the necessary tools for success upon graduation. Luckily even those who are raised in this environment, like my advisor Steve, have the choice to follow those guides or choose a different path. In the case of Steve, he was so uncomfortable in his environment that he chose to be an open scientist. This decision ultimately led me to become an open notebook science evangelist, a career path that has led me on a wonderful adventure.

Aside: I really hope I can convince Steve to air his grievances publicly because his experiences are something that all scientists can learn from.

Secondly, data thievery already occurs in a closed environment. I’m sure someone one day will be a victim of these circumstances in an open environment (if it hasn’t already happened), but being open won’t open the floodgates on scooping and unethical science. In fact, I believe that open science can help minimize it.

By being open you are essentially prepublishing and staking a claim on your research domain. In the event of catastrophe, you can point to the fact that you have been working on this project and that your information is valuable and full of integrity (integrous?). Peers may also be able to back you up and essentially police the situation accordingly. Being open makes your research transparent and could help prevent tragedy. Why would you choose to steal information that others already know exists?

And to that point, why would you choose to steal information that is encouraging reuse? By participating in open science and open notebook science and publishing your data with open access you are encouraging data sharing and reuse. You can’t steal information that is being given away.

I think the issue is that scientists who work in the closed environment think their research is theirs and they protect it like it is an extension of themselves. Open scientists, on the other hand, view their data as something to be shared with the world. Their data is not theirs but something that they produced and that should be consumed and developed by others in ways they can’t imagine. Traditional scientists who think about open science continue to view their data as theirs in the open environment. The truth is the two systems are fundamentally different and require a mental reprogramming in order to go from closed to open science.

As I always say, education is the required mental reprogramming and when this happens we will see a much faster shift from one system to the other. I envision a world that is based on the quality of research you produce, not the quantity or the perceived impact of that research.

Personally speaking, I feel that younger scientists can embrace open science much better than the older, more established generation. The older generation was scientifically raised in a system that embraced a different set of core values. New scientists haven’t been influenced as much by the current system and are more open to change because of this lack of influence.

With regard to data theft, I want to add that I wouldn’t mind being scooped. It may sound strange to invite unethical conduct, but how can anyone understand the negative aspects of the system without experiencing them firsthand? Unfortunately all my data is in the public domain, so data theft is essentially impossible. Perhaps I don’t get a credit or a citation if someone reuses my data, and that would basically be the most unethical thing that could happen to me. And if that is in fact the worst-case scenario then I would say the system works pretty well.