The entire story of my scientific career

This article is actually the introduction to my dissertation and I thought I’d share it with the world officially rather than let it die in an electronic archive somewhere. I’ve shared this story in some form or another several times already, but I’ve never provided the entire account like this. And so, it is with great pleasure that I share with you the story of how I became the scientist that I am today…

I joined the KochLab in the Spring of 2007. It was a brand new lab that, at the time, consisted of Dr. Koch, myself, and my best friend Larry Herskowitz (who is now Dr. Herskowitz). In our first lab meeting, Dr. Koch discussed his scientific endeavors up to that point (some of which are continued in this dissertation) and introduced the concept of open science.

Open science was, and still is, an emerging paradigm, and is not to be confused with a particular field of science. The core concept of open science is providing access to information, and it is through the opening of scientific research that many new endeavors have become possible. Many of these endeavors have changed the way scientists approach research and acquire data. Citizen science, for instance, has brought a mass scale of human analysis to previously unsolvable problems. Even sharing data has led to new forms of collaboration. Data repositories have allowed scientists to share data with the world in hopes of finding new uses for the shared data. Tools like DataONE have emerged to provide some organization to the new data. Meanwhile, open notebook science has emerged to open the entire scientific process: practitioners make every stage of research, including protocols, raw data, data analysis, and much more, accessible and open to scrutiny.

My Dissertation

It’s time to get this party started! So here is my dissertation, edited in real time. Feel free to refresh periodically over the next several months to witness the evolution, or just come back in May when it’s all done!

In Google Docs

And feel free to leave a comment in the document with a correction or question or anything!

Open Notebook Science – SACNAS Poster

I finished my poster for SACNAS on Friday and it is getting printed today. You can check it out via Slideshare and/or figshare depending on your preference. I’ve made it downloadable via both links, but the figshare one won’t alter the original file so I would download from there. Anyways here it is:

The Open IGERT: Internships @figshare @benchfly @plosone and more!

As I’ve said before, education is an important part of the IGERT training program. But there isn’t just a course component to the IGERT. The Open IGERT is working with the BEST open science companies on the planet to provide the best educational environment that I can possibly create. The internship and collaboration component of the Open IGERT is to me the most exciting aspect of this or any other IGERT, and I’m very happy about all the people who are willing to collaborate with us to teach future scientists the importance and merits of being open.

Here are the opportunities that we’re working on:

  • figshare – What better way to educate IGERT trainees on open science and data management than by giving them an opportunity to work with one of the best open data repositories on the planet? With figshare, students will get to work with data of all different types, sizes, and colors and learn how to manage that data for the future. Not only that, but they’ll learn how scientists interact with and use the different data types, and will get to develop user-friendly software. Because figshare is located in the UK, we’ll be able to fund a few students, lodging included, to work with Mark in person, and any other interested parties can work remotely from ABQ.
  • BenchFly – To me, video documentation of scientific experiments is going to be the future of publication. Why present a paper when you can show exactly how you do your experiments, how you acquired the data, and how your results affect the world? Being able to develop a killer video protocol will be important to getting your science out there and doing it effectively. BenchFly is also a data repository, just for videos, and they need to store, share, and analyze their data like any other scientific endeavor. As with the figshare experience, learning how to handle data of all different varieties and sizes will be crucial to students’ scientific careers, and developing an understanding of long-term data management is exactly the kind of education the Open IGERT is hoping to provide.
  • PLOS One – I can’t see a better training opportunity for open scientists than working with the leading open access publisher. Open scientists will need to understand how open science can integrate with current scientific practices and publication. PLOS One is looking to innovate scientific publishing and the students I hope to train will be active innovators. To me there is no better pairing than that! And in the spirit of data management, students will get to see exactly how publishers safeguard the data that is entrusted to them by scientists from around the globe. Students will also get to see important aspects of publishing that many don’t get a chance to witness, and having an understanding of current publishing models can help push open science to the forefront of science.
  • Universidad Tecnica Particular de Loja – We are partnering with the Computer Science Department at UTPL in Ecuador to host a student swap program. Open IGERT students may get to choose between a 5-week immersion program and a 10-week full study program. The idea behind this is that some students may be hesitant to go to a foreign-language country for an extended period of time, and a 5-week option could be more appealing to them. Aspects of the projects at UTPL are unknown at this point, but they will have an open data perspective.

All of these internship opportunities are of course contingent on the Open IGERT proposal being funded, but I’m very excited about the opportunities I was able to secure working with some of the best open science tools and enterprises in existence today!

The Open IGERT: Outreach

Outreach is an important aspect of any IGERT program. As an educational training initiative, the IGERT is responsible for educating not just students in the program but also members of the local community, other students, and anyone who can benefit from general science education. The Open IGERT will look to provide educational opportunities in the form of live scientific demonstrations, hands-on mentoring, and online educational material.

The most important feature of the Open IGERT is that ALL course material from the IGERT courses will be made available online so that educators from anywhere on the planet could build their own open research focused courses without having to start from scratch. We will also make ourselves available to explain aspects of the courses that work well and those that don’t so that others don’t have to struggle with the growing pains of developing a course from scratch.

The IGERT trainees will also play a major role in the outreach arm of the program. Students will help spread the word of open science and train other students how to be active, efficient, and good open scientists. The IGERT will host workshops to provide students (and faculty) around the University an educational forum for open science initiatives like open notebook science, science blogging, citizen science, crowdfunding for science, etc. IGERT students will lead the workshops and provide details and experiences from the courses, labs, and personal accounts.

In addition to training university students, the Open IGERT fellows will work with students in high school, junior high, and elementary school. The fellows will be active participants in science fair projects and student mentoring, and at higher levels may host students in their home lab to develop interest in a scientific career at a young age. The fellows will also give scientific demonstrations both in classrooms and at local scientific venues such as Explora!, the New Mexico Museum of Natural History and Science, and the National Museum of Nuclear Science and History.

I also have a couple of unique ideas that weren’t included in the proposal but could be worked out later, and that would be tremendously beneficial and in line with the IGERT’s core outreach values. The first idea would be to develop educational demos/labs in partnership with Vernier. A friend of mine works for Vernier and from what she has told me about the company and their ideals, they seem like a perfect fit to collaborate with the Open IGERT program. The second idea is an extension of another project I worked on: developing educational projects/demos with Arduino. In the Junior Lab course I taught, students developed Arduino projects completely in the open, and I think this could be replicated much better through the IGERT program and even incorporated into some of the classes (the capstone course or even the weekly seminar).

The final component of the Open IGERT outreach effort will be the Open IGERT Blog. I’ve been blogging for 5 years and science blogging for 1 year (if you include prior open notebooks, then 3 years). Blogging can be a powerful form of communication in general, and it provides a way for science to be accessible to the general public. The Open IGERT blog will provide updates about the IGERT program, new IGERT initiatives, access to educational material, details about outreach events, and much more. The blog will mostly be maintained by me and perhaps Rob Olendorf. The fellows will also be expected to post information about their projects and whatever tickles their fancy in an effort to keep the public informed about what research is going on at the University and in the IGERT program.

And since the University has hosted several successful IGERT programs in the past, we have access to outreach initiatives that were successful and can build on those. It is my hope that the Open IGERT benefits more than just the very local community. Because it is an open educational training opportunity, the Open IGERT has a chance to benefit the country and potentially impact the world.

The Open IGERT: The Open Repository

The IGERT program is a cooperative program. Faculty receive support in the form of students provided with stipend and tuition, students get to brag about how much more they are paid than the other grad students, and the IGERT gets recognition. The Open IGERT requires one thing from students and faculty: open research.

As part of the deal for receiving funding, we are going to require that students participate in open research. The minimum will be to have openly accessible data with as short a lag time (between production and publication) as possible. I will encourage students to be fully open, essentially open notebook scientists, but I understand that some faculty may be hesitant and resistant to this.

In every conversation I’ve had about open science and open notebook science, students are very willing to share their research, data, protocols, etc., but their advisers are the ones preventing this sharing of information. By requiring open research, we hope to attract professors who are more in line with open research values. We may also acquire some professors who are desperate for funding and willing to change their attitude in exchange for financial support.

There are many places online that support open research, but for this IGERT to impact the University and the globe, we propose to build an open repository. The framework for the repository will be open access and shared (probably via GitHub) so that other universities can build on our model. We hope to include measures that allow for:

  • Archival – After my invite to the Library of Congress, I’ve been made aware of the effort and difficulty involved in archiving digital and online science. The Open Repository will be kept up to date with archival standards to ensure the research stored will not be lost.
  • Open Access – What kind of open repository wouldn’t allow for open access to the research that it contains? The information in the repository will be sharable, citable, licensed, and downloadable.
  • Tagging – Metadata! Metadata will be crucial for the repository. Without the ability to add/provide metadata to data sets, information won’t be searchable. The main reason for providing an open repository is so that research groups around the University and the nation/world will be able to search for data that is useful. Metadata provides the backbone for data accessibility.
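
To make the tagging idea a little more concrete, here is a minimal sketch in Python of how metadata could make data sets searchable. Everything in it is hypothetical and only for illustration: the `DatasetRecord` class, its fields, the `search_by_tag` helper, and the made-up sample records are my assumptions, not a design that has been settled for the repository.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical metadata record for one item in the repository."""
    title: str
    author: str
    license: str
    tags: set = field(default_factory=set)  # free-form keywords


def search_by_tag(records, tag):
    """Return the records whose tags include `tag` (case-insensitive)."""
    wanted = tag.lower()
    return [r for r in records if wanted in {t.lower() for t in r.tags}]


# Made-up example records, not real data sets.
records = [
    DatasetRecord("Example calibration data", "A. Student", "CC0",
                  {"optical tweezers", "calibration"}),
    DatasetRecord("Example conference poster", "B. Student", "CC-BY",
                  {"poster", "open notebook science"}),
    DatasetRecord("Example untagged data", "C. Student", "CC0"),
]

matches = search_by_tag(records, "Calibration")
print([r.title for r in matches])
```

Note that the third record, which has no tags, can never be returned by any tag search. That is the point of the bullet above: without metadata, the data is effectively invisible.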

The IGERT will encourage use of the repository for any purpose imaginable. Students can host data, notebooks, or any scientific information they wish. They will also be able to store posters, publications, presentations, educational material, etc. And we’ll work with the students to ensure that the repository is useful and usable for them and others. Hopefully the students will be part of the development of the repository, and this will provide them with a tangible output from the IGERT program.