Category Archives: The Open IGERT

The Open IGERT: Review of the Reviews – Grant Declined

Sorry for the weird title; I wanted to describe as much about this post as possible in the title without making it super huge. Long story short, the IGERT grant that my peers and I submitted in August has been reviewed and the reviews are in.

The Open IGERT has been declined.

I’m not at all surprised. I knew we were a long shot. And I knew that we weren’t focusing on the issues that the NSF, without requiring or even mentioning them, would want us to focus on. But I thought we put together a powerful program. Unfortunately, the NSF and their army of anonymous peer reviewers thought otherwise.

So in an effort to improve upon the program, I am going to share the reviews with you all and comment on them. The reviews and the review summary can be found in my Google Drive Folder. Feel free to poke around. For reference, here is the original NSF call for proposals. And so you can skip the link, here are the main objectives of the IGERT call:

  • …NSF recognizes the need to educate and support a next generation of researchers able to address fundamental challenges in
    1. core techniques and technologies for advancing big data science and engineering;
    2. analyzing and dealing with challenging computational and data enabled science and engineering (CDS&E) problems, and
    3. researching, providing, and using the cyberinfrastructure that makes cutting-edge CDS&E research possible in any and all disciplines.

On to the reviews:

The Open IGERT: Internships @figshare @benchfly @plosone and more!

As I’ve said before, education is an important part of the IGERT training program. But there isn’t just a course component to the IGERT. The Open IGERT is working with the BEST open science companies on the planet to provide the best educational environment that I can possibly create. The internship and collaborative component of the Open IGERT is, to me, the most exciting aspect of this or any other IGERT, and I’m very happy about all the people who are willing to collaborate with us to teach future scientists the importance and merits of being open.

Here are the opportunities that we’re working on:

  • figshare – What better way to educate IGERT trainees on open science and data management than by giving them an opportunity to work with one of the best open data repositories on the planet! With figshare, students will get to work with data of all different types, sizes, and colors and learn how to manage that data for the future. Not only that, but they’ll learn how scientists interact with and use the different data types, and they will get to develop user-friendly software. Because figshare is located in the UK, we’ll be able to fund a few students to work with Mark in person, with lodging, and any other interested parties can work remotely from ABQ.
  • BenchFly – To me, video documentation of scientific experiments is going to be the future of publication. Why present a paper when you can show exactly how you do your experiments, how you acquired the data, and how your results affect the world? Being able to develop a killer video protocol will be important to getting your science out there and doing it effectively. But BenchFly is also a data repository, just for videos, and it needs to store, share, and analyze its data like any other scientific endeavor. As with the experience at figshare, learning how to handle data of all different varieties and sizes will be crucial to students’ scientific careers, and developing an understanding of long-term data management is exactly the kind of education the Open IGERT is hoping to provide.
  • PLOS One – I can’t see a better training opportunity for open scientists than working with the leading open access publisher. Open scientists will need to understand how open science can integrate with current scientific practices and publication. PLOS One is looking to innovate scientific publishing and the students I hope to train will be active innovators. To me there is no better pairing than that! And in the spirit of data management, students will get to see exactly how publishers safeguard the data that is entrusted to them via scientists from around the globe. Students will also get to see important aspects of publishing that many don’t get a chance to witness, and having an understanding of current publishing models can help push open science into the forefront of science.
  • Universidad Tecnica Particular de Loja – We are partnering with the Computer Science Department at UTPL in Ecuador to host a student swap program. Open IGERT students may get to choose between a 5-week immersion program and a 10-week full study program. The idea behind this is that some students may be hesitant to go to a foreign-language country for an extended period of time, and the 5-week option could be a more comfortable length of stay. Aspects of the projects at UTPL are unknown at this point, but they will have an open data perspective.

All of these internship opportunities are of course contingent on the Open IGERT proposal being funded, but I’m very excited about the opportunities I was able to secure working with some of the best open science tools and enterprises in existence today!

The Open IGERT: Outreach

Outreach is an important aspect of any IGERT program. As an educational training initiative, the IGERT is responsible for educating not just students in the program but also members of the local community, other students, and anyone who can benefit from general science education. The Open IGERT will look to provide educational opportunities in the form of live scientific demonstrations, hands-on mentoring, and online educational material.

The most important feature of the Open IGERT is that ALL course material from the IGERT courses will be made available online so that educators from anywhere on the planet could build their own open research focused courses without having to start from scratch. We will also make ourselves available to explain aspects of the courses that work well and those that don’t so that others don’t have to struggle with the growing pains of developing a course from scratch.

The IGERT trainees will also play a major role in the outreach arm of the program. Students will help spread the word of open science and train other students how to be active, efficient, and good open scientists. The IGERT will host workshops to provide students (and faculty) around the University an educational forum for open science initiatives like open notebook science, science blogging, citizen science, crowdfunding for science, etc. IGERT students will lead the workshops and provide details and experiences from the courses, labs, and personal accounts.

In addition to training students, the Open IGERT fellows will work with students in high school, junior high, and elementary school. The fellows will be active participants in science fair projects, student mentoring, and at higher levels may host students in their home lab to develop interest in a scientific career at a young age. The students will also provide scientific demonstrations both in classrooms and at local scientific venues such as Explora!, the New Mexico Museum of Natural History and Science, and the National Museum of Nuclear Science and History.

I also have a couple of unique ideas that weren’t included in the proposal but could be worked out later, and that would be tremendously beneficial and in line with IGERT core outreach values. The first idea would be to develop educational demos/labs in partnership with Vernier. A friend of mine works for Vernier, and from what she has told me about the company and their ideals, they seem like a perfect fit to collaborate with the Open IGERT program. The second idea is an extension of another project I worked on: developing educational projects/demos with Arduino. In the Junior Lab course I taught, students developed Arduino projects completely in the open, and I think this could be replicated much better through the IGERT program and even incorporated into some of the classes (the capstone course or even the weekly seminar).

The final component of the Open IGERT outreach effort will be the Open IGERT Blog. I’ve been blogging for 5 years and science blogging for 1 year (if you include prior open notebooks, then 3 years). Blogging can be a powerful form of communication in general, and it provides a way for science to be accessible to the general public. The Open IGERT blog will provide updates about the IGERT program, new IGERT initiatives, access to educational material, details about outreach events, and much more. Mostly the blog will be maintained by me and perhaps Rob Olendorf. The fellows will also be expected to post information about their projects and whatever tickles their fancy in an effort to keep the public informed about what research is going on at the University and in the IGERT program.

And since the University has hosted several successful IGERT programs in the past, we have access to outreach initiatives that were successful and can build on those. It is my hope that the Open IGERT benefits more than just the very local community. Because it is an open educational training opportunity, the Open IGERT has a chance to benefit the country and potentially impact the world.

The Open IGERT: The Open Repository

The IGERT program is a cooperative program. Faculty receive support in the form of students provided with stipend and tuition, students get to brag about how much more they are paid than the other grad students, and the IGERT gets recognition. The Open IGERT requires one thing from students and faculty: open research.

As part of the deal for receiving funding, we are going to require that students participate in open research. The minimum will be to have openly accessible data with as short a lag time (between production and publication) as possible. I will encourage students to be fully open, essentially open notebook scientists, but I understand that some faculty may be hesitant about and resistant to this.

In every encounter I’ve had with people regarding open science and open notebook science, students are very willing to share their research, data, protocols, etc., but their advisers are the ones preventing this sharing of information. By requiring open research, we hope to attract professors who are more in line with open research values. We may also acquire some professors who are desperate for funding and willing to change their attitude in exchange for financial support.

There are many places online that support open research, but for this IGERT to impact the University and the globe, we propose to build an open repository. The framework for the repository will be open access and shared (probably via GitHub) so that other universities can build on our model. We hope to include measures that allow for:

  • Archival – After my invite to the Library of Congress, I’ve been made aware of the effort and difficulty involved in archiving digital and online science. The Open Repository will be kept up to date with archival standards to ensure the research stored will not be lost.
  • Open Access – What kind of open repository won’t allow for open access to the research that it contains? The information in the repository will be sharable, citable, licensed, and downloadable.
  • Tagging – Metadata! Metadata will be crucial for the repository. Without the ability to add/provide metadata to data sets, information won’t be searchable. The main reason for providing an open repository is so that research groups around the University and the nation/world will be able to search for data that is useful. Metadata provides the backbone for data accessibility.
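
To make the tagging point concrete, here is a minimal sketch in Python of how tagged records could be searched. Everything in it is hypothetical: the `Record` fields, the `search` helper, and the sample entries are assumptions made up for illustration, not the design of the proposed repository.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Record:
    """One item in a hypothetical repository: a data set, poster, notebook, etc."""
    title: str
    author: str
    license: str
    tags: Set[str] = field(default_factory=set)


def search(records: List[Record], *wanted: str) -> List[Record]:
    """Return the records whose tags include every requested tag (case-insensitive)."""
    wanted_tags = {t.lower() for t in wanted}
    return [r for r in records if wanted_tags <= {t.lower() for t in r.tags}]


# Made-up sample entries, for illustration only.
records = [
    Record("Optical tweezers calibration", "A. Student", "CC0", {"physics", "raw-data"}),
    Record("Junior Lab poster", "B. Student", "CC-BY", {"physics", "poster"}),
    Record("Untagged spreadsheet", "C. Student", "CC0"),
]

hits = search(records, "Physics", "raw-data")
print([r.title for r in hits])
```

The untagged spreadsheet never shows up in a tag search, which is the point of the bullet above: data without metadata is effectively invisible to anyone looking for it.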

The IGERT will encourage use of the repository for any purpose imaginable. Students can host data, notebooks, or any scientific information they wish. They will also be able to store posters, publications, presentations, educational material, etc. And we’ll work with the students to ensure that the repository is useful to and usable by them and others. Hopefully the students will be part of the development of the repository, which will give them a tangible output from the IGERT program.

The Open IGERT: Courses

IGERT Programs are designed to be innovative training programs for graduate students. This training involves lab experience, workshops, career development, and coursework. The Open IGERT (which is officially titled “Creating the New Scientist – Training Graduate Students in Open Science and Informatics”) is built around several core training initiatives that focus on each of those areas and include some new innovative ones. In this post I’m going to discuss the center of the educational component: The Open Courses.

The Open IGERT will feature 3 core courses designed to teach students the principles of open science and data management. Each course’s material will be presented in a way that is applicable to multiple disciplines. The courses are built around a couple of core concepts that we feel apply broadly to data management:

  • Data comes in a variety of forms: software, files, numbers, images, documents, etc. And it is not enough to simply back up your data; scientists must be able to access the data at any point in time, secure it, and provide access to those who may come later.
  • Data is useful to others even after it is useful to yourself. Collaboration will be key and online collaboration is essentially the flood gate control of scientific discovery.

The first course is the capstone course titled Collaborative and Open Research in Practice which will teach students (and really whoever wants to sign up for the course) principles and tools for open science and collaborative science. The information attained in this class will be used to develop a proposal to be submitted for the Open Research Challenge (to be explained in another post). Also the training from this course will be directly applicable to students for use in their home labs as it will be geared toward providing access to data and marketing that data to ensure it reaches scientists who would be interested in the findings.

Course two will focus on data management and is titled Data Management and Curation. In this course students will learn the fundamentals of the data life cycle, which we outlined as Acquisition, Processing, Analysis, and Dissemination. Metadata will play a major role in the course because in order to find relevant data online, that data needs to be tagged and metadata is the way to do it. Also metadata provides supplemental information that could be crucial to experimental repeatability. The NSF is beginning to require grant applications to have a data management plan attached, and current scientists are ill equipped to deal with this. My collaborator Rob Olendorf currently writes data management plans for many faculty at the University of New Mexico, and it’s time future scientists are taught the important features of a successful data management plan. Not only is this useful for grant applications but also data management plans will be crucial as labs become more digital. Having information and protocols for data protection, archival, and security (not just from potential theft but also from hardware failures) will be very beneficial for science in the long run.

The third core course will focus on data visualization and presentation and is titled Data Analysis and Visualization. I’m a firm believer that scientific data should be easy to understand and be visually appealing. Far too many scientists put little to no effort into making their data readable and publish complicated plots that require a lot of time to consume, interpret, and understand. This class will teach students various analysis techniques and outline effective methods for data presentation and present case studies of both good and bad examples of visualized data. Either as part of the course or as a supplement to it, I will get to teach the students the way of graphic design to enhance their data presentation prowess.

In addition to the core courses, we will feature a seminar series every semester covering topics that won’t be addressed in the courses or will only be glossed over. The seminar will also feature career development education: oral and poster presentation design and speaking, scientific writing for publications and for less formal venues (open notebooks and blogs), grant writing tips, preparing a CV, etc. The seminar will be kept very flexible so that it can provide whichever educational components are most needed at the time.

In addition to the weekly seminar, we will also teach an ethics course titled Ethical Issues of Online Collaborative Research once every other year. By offering this course once every 4 semesters, the course will be available to all students in the IGERT program at some point in their funding period. The course will discuss relevant topics in open science such as: data use, reuse, licensing; social media use; conversations in public online; trolling, how to avoid it, and how to deal with it; protecting yourself and your data online; scientific communication – the responsibility of maintaining facts while providing accessibility; and daily lab ethics – working with others, using common areas and tools, publication authorship, etc.

The final educational component is an optional elective credit. We have compiled a list of courses that are taught in various disciplines that are applicable to our curriculum. Students may choose to supplement their education with one of these courses designed to provide a new wrinkle to their research experience.

The core courses of the IGERT along with the ethics course will be the core of a new Informatics program at the University. IGERT students will typically be from other disciplines and their reward for completing the courses (aside from receiving the IGERT stipend) will be a minor in Informatics or Data Management (still undecided).

The benefit of this curriculum is that students would be learning about tools, techniques, and practices that are applicable in just about every discipline. In every IGERT I’ve interacted with, the educational component is very specific, and unless you are directly involved in the research focus, the courses may not be relevant. My hope is to change this, and by teaching some of the courses myself (definitely the capstone course, part of the data visualization course, and shared responsibilities with the ethics course and seminar series) I want to prevent students from replicating my IGERT educational experience.

The Open IGERT

Ladies and Gentlemen, our IGERT proposal that is centered around data management with an open science approach is finally completed and submitted! I’m linking the 99.9% completed proposal (as it was in Google Docs) below. The final version was amended in Word to meet NSF guidelines and modified for spelling/grammar errors.

I would like to thank Alan Marnett of BenchFly, Mark Hahnel of figshare, Kristen Ratan of PLOS, the Computer Science department at Universidad Tecnica Particular de Loja, Heather Armstrong and Abhaya Datye of the NSMS IGERT, Linda Bugge and Marek Osinski of the INCBN IGERT, Johannes Van Reenen, and especially Monica Fishel for all your help and support in putting this proposal together. Together we built a program that I would have been extremely excited to be a part of had it been around when I was an IGERT fellow.

I would also like to thank my Co-PIs Robert Olendorf, Lori Townsend, Kathleen Keating, Julie Coonrod, and Tim Lowrey. Without them, I would not have had this opportunity to make such an impact on science. I would just be a lowly grad student wishing I could do something, but this team has made the dream a hopeful reality.

In celebration of the submission, I will highlight the most exciting aspects of this proposal and show you all exactly what I have in mind in terms of educating future scientists to be open scientists. There are so many innovations included, and I am 110% confident I can make all of them a reality. I just hope our NSF reviewers can see that.

Without further ado, here is the full proposal to hold you over until I get to break it down piece by piece:


Creating the new scientist – Training graduate students in open science and informatics