ONS and Intellectual Property

This post is an excerpt from my dissertation which can be found here via figshare.

Note: The contained information pertains strictly to the US legal system, and is based on information I (Anthony Salvagno) alone researched. I am in no way a lawyer and offer no legal advice, but thought it would be foolish to not share basic copyright and patent law policy for scientific consideration.

 One of the biggest arguments I hear against open research is the fear about not being able to protect your intellectual property, also known as the fear of being scooped. The biggest oversight in that argument is that IP violations occur in traditional scientific culture both accidentally and maliciously. In an open environment, however, there is a greater risk of attracting this behavior if only because scientific research is made publicly available. With that said, there is nothing about being open that is any more inviting of harmful activity than in the traditional system. In fact, because of the current US legal system, being open may be more beneficial to protecting scientific information.

With regard to the US legal system, there are two primary protections available to scientists: (1) copyright law would protect recorded scientific information, for example the way data and ideas are expressed in figures and writing, while (2) patent law would protect scientific processes, production, procedures, etc.

Despite what is commonly believed, in no way does open notebook science prevent either protection from applying to scientific intellectual property. Open notebook science can actually stake your claim on IP and provide immediate protection. For patent law, a public disclosure starts a one-year grace period in which a patent can still be filed. If a patent is not filed within that year, the IP becomes public domain and a patent can never be filed. In the case of copyright law, copyright applies from the moment of fixation (the moment scientific information is documented). In both cases, open notebook science can be used either as a defensive tactic to protect IP, or as an offensive tactic to prevent others from profiting from scientific IP.

Copyright Law

Copyright law is essentially very simple, and has been made increasingly simple since it was first provided for in the US Constitution. The most recent general revision came about in the 1976 Copyright Act, which defined the rights of copyright holders (exclusive rights), how copyright is achieved, and even what does/does not constitute infringement (fair use).

 While the law is simple in principle, copyright infringement is not necessarily black and white. In some instances it is questionable as to what is even copyrightable. In others, the matter of fair use is debatable. Even when there is infringement, it can be tough to prove because there are varying degrees of copying or “borrowing.”

The bare-essential rules of copyright law can be seen in Table 1:

  1. Copyright is applied immediately from the moment any work is tangibly recorded, both publicly and privately.
  2. To be protected a work needs to be original (not novel) and there needs to be a minimum element of creativity (known as expression).
  3. The exclusive rights provided to copyright holders are reproduction, distribution, derivation, performance, and display.
  4. Copyright infringement is a federal offense!
  5. Even though copyright is applied immediately, in order to file suit for infringement a copyright needs to be registered with the US Copyright Office.
  6. A copyright is not violated if it has been determined that the infringer has a fair use of the material. Fair use is a broad definition and is only created as a defense in infringement suits.

Table 1: Bare-essentials of copyright law.

Rule 2 from Table 1 may reveal that copyright law doesn’t apply to most scientific intellectual property, because science is fact based and process driven. Patent law was developed for this very reason. While there are no statutes against having dual protection in the form of patents and copyrights, a work is not likely to receive copyright protection if it has patent protection, since the copyright lasts much longer than the patent. But that’s not to say none of science is copyrightable.

In fact, journal articles are copyrighted. It can be interpreted that there is creative expression in organizing scientific discoveries (which are fact based) and that would make them copyrightable. Journals hold the copyrights for publications and have the exclusive right to copy and distribute the articles and any material contained within. And there are cases where they’ve tried to enforce it.

In that link, the author tries to distribute (via publishing in her blog) figures from a publication and receives a cease and desist letter. Unfortunately it will never be known if there was a violation because the infringement never went to trial. She made an argument for fair use, which probably has some grounds, but skirted around the issue by recreating the figures using the original data (which is NOT copyrightable), thus making her own original figures which are therefore copyrightable. There is a chance that she has no fair use argument since her reuse (even through attribution) is a clear violation of distribution rights and can be viewed as falling within the same scope of the original publication.

 In the case of publications, scientists waive their copyright upon submission and acceptance for publication and dissemination, and grant that copyright to the journal. Not all scientific output is formatted for publication, or released at all. In that case, it would greatly benefit scientists to publish their figures via an open notebook to provide copyright protection for their research (if that is in fact the goal).

With regard to the traditional science system, scientists are offered protection from the moment they record their data and create figures based on that data. They are even protected at conferences where they present their research (either via an oral or poster format). This is specifically useful in the case of scientific scooping, which isn’t as rampant as we make it out to be but is still a major fear in the community. If there is a case of potential copyright infringement, you have the right to file suit (once you register the copyright). If you can prove there was access to your research findings and there is substantial copying you may even win your case.

If you are an open scientist, in that you publish your research findings online before peer reviewed publication, you may be in an even better position. You are granted the same rights as a traditional scientist. In the open case, however, the proof of access is much easier to demonstrate since a simple Google search can turn up your findings. The burden is then that you prove there is evidence of copying, which is hard enough as it is.

Because of all the possible interpretations of copyright application to science, I highly advocate the use of the Creative Commons licenses. The CC0 (public domain), CC-BY (use with attribution), and CC-BY-SA (use with attribution and share alike) licenses afford the copyright owner the ability to share their research findings with the community and in turn allow the community to share, use, and reuse those findings without fear of retaliation. It is incredibly important to note that using the CC licenses (with the exception of the CC0) does NOT waive all exclusive rights as a copyright holder. They allow you to waive your rights as long as the reuser of the original work attributes, shares, etc (per terms of the license) in turn. If those stipulations are infringed, you are free to take action. In fact, there is legal precedent for such action.

The licenses provide a means for others to use information and data without worrying about moral ambiguities or legal issues, and in turn promote a culture of sharing and attribution. With the CC licenses there will be more societal pressure to do the right thing. When credibility is involved social pressure can work wonders.

For more information, please refer to the US Copyright Office website.

Patent Law

The America Invents Act was enacted in 2011 and institutes some changes to patent law. The biggest change is that patents are now given based on a first-to-file system, whereas previously they were given through a first-to-invent system. This change was implemented on March 16, 2013 as a way to conform to international policy, but also to decrease the burden on the US Patent Office of identifying the first inventor, which can be extremely complicated and arduous.

 In a first-to-file system, a patent will be granted to the first person to file a patent for a given invention. While the system is as simple as it sounds, it tends to give advantages to larger entities with the resources and efficiency to file patents for every invention conceived. It is outside the scope of this writing to argue the merits of a first-to-file or first-to-invent system, but this is mentioned because there are a couple of workarounds to the first-to-file mandate. The first is through the filing of a provisional application, and the second is through public disclosure. In both cases, there is a one-year grace period under which a patent must be filed lest it become public domain.
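
The one-year window described above is simple date arithmetic. As a rough sketch only (the function name and the example date are hypothetical, and how a real filing deadline is computed is a question for a patent attorney), the end of the grace period could be estimated like this:

```python
from datetime import date


def grace_period_deadline(start: date) -> date:
    """Return the date one calendar year after a disclosure or provisional filing.

    Assumption: the grace period is treated as exactly one calendar year.
    """
    try:
        return start.replace(year=start.year + 1)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year, so fall back to Feb 28.
        return start.replace(year=start.year + 1, day=28)


# Hypothetical example: a notebook entry posted the day first-to-file took effect.
print(grace_period_deadline(date(2013, 3, 16)))
```

The same calculation applies to either workaround, since both the provisional application and the public disclosure start a one-year clock.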

The provisional application is a low cost option that grants an inventor protection from competitive patent filings. The fee is $125 for small entity inventors, such as individuals, and $250 for large entities like corporations. The intellectual property remains a secret during the provisional period until patent. Public disclosure is a free alternative to the provisional patent, in the sense that there is nothing to file with the patent office. With this method, the details of an invention become public information, but no competitor may file a patent.

Scientifically speaking, patentable items include processes, designs, and technology of all sorts (although computer programs are hard to patent or copyright). It is usually advantageous to maintain secrecy when dealing with intellectual property, and this culture is especially prevalent in science. As such many universities and institutions have legal services that aid scientists in patent filings. In an effort to maintain confidentiality, it is highly suggested by these services to file provisional applications for all inventions.

Much like copyright, the ultimate goal of a patent is to prevent competitors from stealing and reproducing a work without the inventor benefitting. It is a little-known fact that patents become public information after filing, generally 18 months after the earliest filing date. It is entirely possible for competitors to analyze a patent and create a “non-obvious” derivation of the work that can then be patented. In this scenario the benefit of the patent application is essentially lost.

Open notebook science can be a major benefit to the new patent process. Since it does cost money to file a provisional application, ONS (or other web disclosure) would provide a free alternative. The only difference between the two routes is that through ONS the invention is immediately public information, while the provisional application maintains invention secrecy. Because the patent will eventually be public information anyway, the provisional process only delays the incentive for others to innovate.

While ONS publicly discloses a scientific creation and encourages potential modification, it does not promote/encourage stealing the idea. Scientists are still protected from patent infringement. Now, if a competitor sees the notebook entries and makes non-obvious changes to the idea, then they can be granted a new patent, if filed. That is no different from how the patent process currently operates; it simply speeds up the process.

Filing a provisional application for every idea ever produced and paying $125 every time is a waste of money and resources. It is highly unlikely that every idea/invention will come to fruition. It also gives the US Patent Office a lot of unnecessary paperwork, and could actually stifle innovation and creativity. ONS would in turn allow a researcher to disseminate their ideas and protect the best ones for the original creator. Resources could be better used to fight for the best ideas and allow others to develop the ideas that won’t necessarily get the same level of attention or ever be produced.

In this way ONS could be used as a defensive tactic to protect a scientist from losing his/her best ideas. It is also possible for open notebook science to be used as an offensive tactic. In this maneuver, the documentation of ideas born from discussions or other endeavors creates prior art (which is essentially the same as public disclosure). An invention disclosed in prior art is exempt from patent protection. So in the case of public disclosure via ONS, competitors would be blocked from filing a patent on the disclosed invention. Hypothetically, a researcher could publish any and all ideas, techniques, or technologies and prevent all competitors (and peers) from filing for patent.
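
To put a number on that, here is a back-of-the-envelope sketch using the fees quoted earlier in this post. The function name and the count of 20 ideas are made up for illustration, and the fees are only the figures I cited above, not a current fee schedule:

```python
SMALL_ENTITY_FEE = 125  # dollars per provisional application, as quoted in this post
LARGE_ENTITY_FEE = 250  # dollars per provisional application, as quoted in this post


def provisional_filing_cost(num_ideas: int, fee: int = SMALL_ENTITY_FEE) -> int:
    """Total cost in dollars of filing one provisional application per idea."""
    if num_ideas < 0:
        raise ValueError("num_ideas must be non-negative")
    return num_ideas * fee


# Hypothetical lab that files on 20 ideas in a year.
print(provisional_filing_cost(20))
print(provisional_filing_cost(20, LARGE_ENTITY_FEE))
```

Disclosure through an open notebook has no filing fee, so the comparison is between that total and zero.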

In the interest of sharing research information, open notebook science may be the best protection against impediments in the scientific process. 

Notes on Intellectual Property: Copyright Law

In the quest to discover how a scientist may protect their intellectual property with regards to open access to that IP, I’ve decided to do some research. The notes contained here come from:

Intellectual Property: Patents, Trademarks, and Copyright in a Nutshell (4th edition) by Arthur Miller and Michael Davis

In the interest of time and sanity, I’m going to focus on copyright law. Generally when providing open access and CC licensing, only copyright applies since nothing contained is trademarked or patented (except in the case where patents are filed). Hopefully the information I document here is useful to those who want to follow the model I have used, and maybe it’ll be useful to scientists who pursue other avenues of scientific discovery.

Foundations of Copyright Protection

  • first it should be said that copyrights pertain to “written” works, which have come to include other works of art and computer programs, and in our case scientific data/research.
  • originally copyright law’s jurisdiction was from the moment of publication, but it was amended to the moment of fixation – that is the moment a work becomes transcribed into a tangible form. In our case that means once data/methods are acquired and stored.
  • typically, registration of a copyrighted work is important, but “the basic doctrine of this country’s copyright law is to protect authors without requiring it.” That is especially important for science because information and conclusions are being produced all the time and it would be nearly impossible to register all of that scientific work constantly.
  • The Copyright Clause of the US Constitution: “To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.”
    • Basically Congress has the power to create legislation dealing with copyrights, has chosen to do so since 1790, and has amended the law several times since then.
    • A 1976 revision to the law was created as the Copyright Act of 1976, which applied copyright to moment of fixation, like I stated before.
  • Prior to the 1976 Act copyright fell under two distinctions (not sure if that’s the right term): (1) there was common law copyright and (2) statutory copyright
    • common law gave authors the ability to protect their work from being copied forever as long as the work was unpublished.
    • once the work was published then statutory copyright law took over. This copyright was limited (unlike common law which was perpetual). The benefit was that authors could publish their work and claim a monopoly over their work and receive compensation while being protected by the law.
    • the problem with this system was that there was a gray period when common law copyright would end and statutory copyright would begin. To complicate matters new methods of communication made it hard to classify the concept of “publication.”
  • the 1976 Act essentially eliminates the concept of common law copyright and protects the author from the moment a work is recorded in some concrete way. For research I assume that would be from the moment notes are taken, but I can see a case to say that this moment is actually when a grant for research is written. Some articles in the act:
    • Section 102 is pretty important in that it defines the moment of copyright and what a work of authorship is. Interestingly, subsection (b) states: “In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work.” Despite the fact that copyright was specifically created to aid science, the wording of that section seems contradictory. More information will be needed.
    • Section 106 gives the author exclusive rights to produce copies of the work, and any person who makes copies without the author’s consent is subject to an infringement suit and can be arrested (Section 506). Yikes! Derivative works are also protected.
    • The author is protected when displaying/performing the work publicly. This seems to be applicable to open science, allowing scientists to publish their research without fear of data misuse/thievery.
    • It seems copyright applies to the publication of science (data, journal articles, etc) but patents provide protection of the actual process of discovery. So the application of the law to open science would be a mixture of the two law regimes.
    • The basis of copyright protection lies in expression and originality. Since facts and ideas aren’t copyrightable the way an idea is expressed becomes important. So for science, data probably isn’t very protectable, but the way you display that data (interpretation) probably is copyrightable. Originality here becomes important. A work doesn’t need to be new or novel, it just needs to be proven that it wasn’t copied or derived from someone else.

The Subject Matter of Copyrights

  • The key aspect of copyright is originality. According to the authors, “an author can claim copyright … as long as he created it himself, even if a thousand people created it before him.”
    • This is especially interesting in the open publication world, and to me, makes Creative Commons licensing all the more important. With access to works (via the web) copyright violations can become more of an issue. The CC license essentially allows you to keep your copyright, but provide would-be authors the chance to adapt a work without fear of infringement (and likewise, authors won’t have to fear plagiarism).
    • Because of the simple concept of originality, there has been some interpretation as to what exactly can be copyrighted:
    • Burrow-Giles Lithographic Co v. Sarony (1884) established that artistic consideration and creative effort is enough for photographs to be copyrightable.
    • But in 1903 Bleistein v. Donaldson Lithographing Co declared that a work had originality if it was “one man’s alone.” At that point artistic merit was not to be considered by the court.
    • Artistic reproductions became copyrightable after Alfred Bell & Co. v. Catalda Fine Arts, Inc. (1951) because the reproduction can be considered an original work. Essentially the reproducer is protected from someone making copies of his reproduction. (This probably only applies to reproductions of works that are in the public domain, since only the copyright holder can allow reproductions of a work.) Also it must be demonstrated that the reproducer has contributed something more than trivial to the reproduction.
    • The “sweat of the brow” doctrine gave originality to works that were not artistic in nature. For instance, aggregations of public domain information were protected if the author demonstrated some investment of original work.
    • Feist Publications v. Rural Telephone Service (1991) rejected the “sweat of the brow” doctrine on the premise that there should be “some minimal degree of creativity.”
      • Basically simple information aggregation, or fact compiling, isn’t enough for copyright. But this shouldn’t exclude scientific data from being copyrightable since the collection of the data is a creative process and the data analysis is highly nontrivial.
      • Interestingly computer databases may fall into the category of non-copyrightable works and as such sui generis protections are required. This is interesting because of the involvement of data and may become an umbrella for scientific research.
      • As a result of this trial, there remains a lot of controversy as to how much creativity is required for copyright protection.
  • To determine what categories of works can be included for copyright protection see 17 USCA 102 (linked above). But the wording of that section suggests that copyrightable material need not fall under those categories specifically. Those are provided as a guide.
    • Works of utility (functional objects) are generally not granted copyright protection because that is what patents are for. But there are exceptions in the case of works that are non-functional, or for portions of functional objects that are non-functional (ie designs). For example Mazer v. Stein (1954) allowed the copyright of lamp bases.
    • When the idea and its expression are inseparable, copyright is generally denied. This affects things like forms, systems, software, and potentially scientific data. Blueprints on the other hand are copyrightable, and until recently the buildings themselves were not. Now buildings are copyright protected, but not functional components like doors and windows. Fashion designs fall into both realms: patterns are copyrightable but the design of the clothes themselves is tough to copyright.
    • The availability of patent protection makes it hard to attain copyright, even though nothing is explicitly written to prevent this. In fact there has been a case to determine that patents and copyright can both exist in the same work (In re Yardley (1974)).
  • intangible expression is not protected under copyright since there is no fixation of the expression. Choreography is an example of this. Speeches are another, but presentations with PowerPoint should be copyrighted because the presentation has been “scribed.” Likewise, audio recordings of a speech are copyrighted.
  • the term “writings” (as said in the Constitution) and the more narrow “works of authorship” (as written in the 1976 act) are incredibly hard to limit in scope. The authors note that it is “difficult to identify those works that would constitute writings but that would not be original works of authorship.”
  • computer programs are copyrightable, but may be denied copyright if they “lack minimal originality… or constitute the only way of accomplishing a particular result.” The second part is essentially phrased so that the program is itself an idea and no longer the expression of an idea that can be expressed in other ways.
    • when dealing with programs it seems there are two components, literal and nonliteral:
      • literal components refer to the programming code and have been copyright protected
      • nonliteral components refer to the organization and the user-interface (among others) and are harder to copyright. This is especially true when the interface is dependent on user-interaction.
  • The Berne Convention has complicated the legality of copyright. Through signature, the US recognizes the copyright of all other countries that have also signed.
    • “the copyright formalities…have lost almost all of their legal significance”
    • “notice of copyright… has virtually no legal significance.”
    • “similarly, registration has almost no legal significance” –> “the only remaining procedural effect of registration is that US authors must register before bringing suit.”

Exclusive Rights

  • see section 106 of the 1976 Act for the exclusive rights of authors. Most of these rights are upheld only publicly, but 2 (reproduction and derivative work) are subject to infringement both publicly and privately. Note that public is defined as “a performance or display to a ‘substantial number of persons’ outside of family and friends.”
  • reproduction allows the copyright owner to exclude all others from reproduction of the work
    • a copy is defined as “any material object from which, either with the naked eye or other senses, or with the aid of a machine or other device, the work can be perceived, reproduced, or communicated.”
    • phonorecords are specifically excluded from the definition of copies, so they have been separately added to the description of reproduction
  • derivative works (works based on the original work) are also under protection for a copyright owner
    • this is defined as “translations, arrangements, dramatizations, fictionalizations, films, recordings, abridgements, condensations, ‘or any other form in which a work may be recast, transformed, or adapted.'”
  • the right to distribute to the public “by sale or other transfer of ownership, or by rental, lease, or lending…”
    • called the first-sale doctrine
    • copyright owner has the right to prohibit others from distribution of work, until the ownership is sold/transferred. At this point, the new owner has this exclusive right.
    • designed to prevent restraints on alienation, “attempts to make an actual sale resemble something less than that… will be unsuccessful.”
    • it is possible for a third party to be held liable if there was no first sale
  • the right to perform work publicly is also provided to copyright owners, but excludes purely graphical works and I feel scientific data falls into this category.
  • the right to display a copyrighted work is also exclusive to a copyright holder.
    • owners of a copy of work are permitted to display one image of the copy and this includes digital transmission (internet, network, etc)


  • infringement occurs when any of the exclusive rights of the copyright owner are violated – makes sense
    • doesn’t need to be intentional
    • it can even be unconscious – an author produces work that he conceives is original but is actually unintentionally borrowed from another author
    • indirect infringement – “one who actively and knowingly encourages another to infringe”
    • contributory infringement – producing a work/device that can be used to infringe on copyrights (see A&M Records v. Napster, 2001), but note that if there are substantial non-infringing uses then contributory infringement is not applied
    • vicarious/related infringement – seems similar to indirect inf. “a person who profits from an infringing performance, AND who somehow supervises or has the right to control or supervise the performance”
  • “to prove infringement, a party must establish ownership of the copyright and impermissible copying”
    • usually determined via circumstantial evidence
      • substantial similarity – remarkable resemblance to original work
      • proof of access – opportunity for contact with original work prior to creating work
    • literal copies lessen the proof of access requirement
    • similarity and access are not required proofs, but merely an evidentiary method

Fair Use

  • “a balancing process by which a complex of variables determine whether other interests should override the rights of creators” – there are 4 interests:
    1. purpose and character of the use, including commercial uses
    2. the nature of the copyrighted work
    3. the proportion of the work that was used
    4. the economic impact of the use
  • seems like a very sticky thing to prove in cases of infringement, and all cases involving fair use are ruled based on the interests listed above. Seems like cases where indirect infringement occurs are the most likely to use a fair use defense.
  • Purpose and Character:
    • commercial vs noncommercial
    • public vs private – private nature of use can be favorable in fair use defenses
    • educational and nonprofit (especially together) are favored for fair use, but not always grounds against infringement
  • nature of the work plays a role in determining fair use
    • ex: educational works may not fall into fair use if the original work is educational itself, because of the economic impact of the use (the works are in the same area of economic potential)
    • consent issue – would the author give consent for uncompensated use if the author can use the work for their own benefit?
    • unpublished nature of work may be within fair use, but prior cases have precedent for barring the defense
  • amount of the work used (proportion) is important in determining fair use
    • proportionality is to be measured with respect to the original (copyrighted) work, not the potentially infringing work
    • quantitative, qualitative, and reverse proportionality can all be used to determine fair use, but only the first two are specifically mentioned in law
  • economic impact is particularly important when determining fair use – this should be obvious since copyright is designed to provide an author protection to profit from their work


  • it is important to realize the physical work and the creative property are two separate entities. A transfer of the physical work does not constitute copyright transfer. This is important when considering communications between two parties: an email or letter for instance. The information in the communique is copyrighted and protected, but the actual paper/message itself means nothing for copyright.
  • copyright must be transferred in writing
  • multiple authorship makes copyright ownership complicated and occurs when:
    1. work consists of material made by more than one person (joint works)
    2. work is made by one and published by another (work for hire)
    3. work can be neither joint nor work for hire and is classified as collective works
    4. work based on prior author is derivative
  • in cases of coauthors, each owner has the right to use the work for their own purposes, but neither can prevent the other from doing the same.
    • neither author is allowed to destroy the value of the work


  • copyright protection is automatic – as soon as a work is fixed (written, drawn, etc) copyright is applied
  • for clarification: copyright is designed to prevent copying, as an author you don’t need to find works that are similar to one you wish to create if you are creating something independently.
  • but registration of a copyright is required if legal action is to be taken – ie if you want to sue for infringement
    • you can register a copyright after finding an infringement but before filing suit
  • notice is optional (for works authored after 1989), but when it is applicable there are 3 rules. Notice of copyright must be affixed with:
    1. copyright symbol (letter, symbol, word, or abbreviation)
    2. the date of first publication
    3. the name of the copyright owner


Repeating Crumley Publication Prep

  1. I need a title for the paper. I’ve always called it Repeating Crumley, and maybe it makes sense to continue that trend, but is there a more fitting/descriptive name? Does it even matter?
  2. I think it makes sense to create .gifs from all the plant germination images for each sample of each experiment.
    1. From RC1-4 I had slideshows, which allowed you to click through each sample at your own pace. After that I started making .gifs (especially since that was around the time of memes on the web).
    2. I still think it makes sense to have all the data as pictures as well. If they aren’t already there, I will upload all the images to figshare, and have a separate dataset as gifs.
  3. Should the gifs be stored via my notebook (and thus the Winnower), or figshare?
    1. Both?
    2. Since the Winnower can actively display the .gifs, this has my preference, but I’m not sure. Maybe both… just because.
  4. Making a citation list for every notebook entry may be tiring, but it must be done.
  5. I’ll have to go through my figshare profile to see what data is currently up there.
  6. I worry that I don’t remember some of the data analysis methods. I think the only one I have absolutely no recollection of is the root length vs time graph. I remember it happening but I don’t remember going from Point A to B. I think of this like getting in the car and driving to work. You remember getting in the car, but you have no recollection of the in-between time because you were lost in thought. This is what happens when your brain is in Dissertation/Defense mode.
  7. The primary focus of this paper is going to be the replication of the Crumley experiment through my methods and the difference in our results. I will include some of the cooler data, but won’t be able to write a follow-up (yet) since there is insufficient data on some of the cooler experiments. But I can show preliminary stuff!

I think that’s all I got now. I’ll keep adding notes like this when I get more ideas, come across roadblocks, or something else.

The Repeating Crumley-ONS Project: Next Steps

Slightly over a month ago, I came across the Winnower and began a project in open notebook science. The concept was to upload notes from my notebook to the Winnower, archive the notes, and get DOIs for each post. Then I would write 2 papers: one to summarize the experiment and the other to theorize a complete publication system that would incentivize open documentation of real-time research (open notebook science). I chose the Repeating Crumley experiment for this experiment in ONS, and you can read about the reasoning here.

Well I’m happy to say that I’ve completed Steps 1, 2, and 3! I’ve posted every notebook entry in the RC series (there’s a physics pun there somewhere) to the Winnower and received DOIs for almost every post. A few posts didn’t translate at all on the platform. They are uploaded, but I didn’t bother with the DOI. Regardless, you can go on any of my Winnower posts and get a DOI (or click through to my notebook), or look through the RC entries and click the DOI to get to the Winnower archive of that post.

One cool side effect of this project was that a Twitter friend noticed a post that had embedded .gifs and I think I am now credited with being the first to publish a scientific paper with embedded .gifs.

Now it’s time to write the paper based on all this research. I got the process started a couple years ago with a Google Doc about the project. I think I never followed through, because I didn’t value the traditional publication process. I think open science and peer review publication are on a course to merge and the incentives for ONS will shift, but this is a topic for another time.

Anyway, here is the previous write-up, which I’ll work on, merge with some info from my dissertation, and add some new thoughts to.

This part may take some time…

Small-ish issue with digital object identifiers

I’m no expert in this space, but I came across an issue with digital object identifiers because of my annoyingly persistent use (overuse? hahaha) of figshare. What happens if the archive tool you use for your data switches from one permanent link system to another?

Back in the early days of figshare, they used the handle system to provide a permanent link for data stored in their system. At some point they switched to using the DOI system. I have no idea when it happened and I don’t even think I noticed the change. The only thing I know now is that my older figshare datasets are full of dead links.

The point of using a permanent link, i.e. a handle or a DOI, is to maintain a connection to the source even if the URL or data at that source changes. Any such change is recorded in the metadata behind the permanent link, which lets the link keep pointing to the correct location. This allows you to change the URL for a dataset on figshare, for instance, and the DOI link will point you to the updated location.
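
To make that indirection concrete, here is a toy sketch in Python. It is not how the DOI or handle systems are actually implemented; the class name PersistentLinkResolver, the identifier, and the example.org URLs are all made up. It only shows the idea that the identifier stays fixed while the record behind it gets updated.

```python
class PersistentLinkResolver:
    """Toy in-memory resolver: identifier -> current URL (illustrative only)."""

    def __init__(self):
        self._records = {}

    def register(self, identifier: str, url: str) -> None:
        # First time an identifier is minted for a resource.
        self._records[identifier] = url

    def update(self, identifier: str, new_url: str) -> None:
        # The resource moved: only the record changes, not the identifier.
        if identifier not in self._records:
            raise KeyError(f"unknown identifier: {identifier}")
        self._records[identifier] = new_url

    def resolve(self, identifier: str) -> str:
        # Anyone holding the identifier is sent to the current location.
        return self._records[identifier]


resolver = PersistentLinkResolver()
resolver.register("10.1234/example.1", "https://example.org/old/dataset")
resolver.update("10.1234/example.1", "https://example.org/new/dataset")
print(resolver.resolve("10.1234/example.1"))
```

A link written down years ago with the identifier keeps working after the move, because readers never stored the URL itself.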

In my case old projects that were linked via the handle system are all updated with DOIs. Since the two systems are different, I have the unique situation of having broken permanent links! Obviously, this defeats the purpose of a permanent link. So it seems I have some work to do to find all the outdated figshare sets and update them, which presents a very tedious set of challenges.
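
One way to take some of the tedium out of that hunt would be a small script that scans notebook text for old handle-style links. This is only a sketch under assumptions: it assumes the old links were written in the hdl.handle.net URL form, and the function name find_handle_links and the sample text are hypothetical.

```python
import re

# Assumption: old permanent links appear as http(s)://hdl.handle.net/... URLs.
HANDLE_LINK = re.compile(r"https?://hdl\.handle\.net/[^\s\"'<>)]+")


def find_handle_links(text: str) -> list[str]:
    """Return the distinct handle-style links in text, in order of appearance."""
    found = []
    for match in HANDLE_LINK.finditer(text):
        link = match.group(0).rstrip(".,;")  # drop trailing sentence punctuation
        if link not in found:
            found.append(link)
    return found


# Made-up notebook snippet, for illustration only.
sample = (
    "Data: http://hdl.handle.net/12345/abc. "
    "Figure: https://doi.org/10.1234/example.1 "
    "Again: http://hdl.handle.net/12345/abc and http://hdl.handle.net/12345/def"
)
print(find_handle_links(sample))
```

The script only lists candidates; each one would still need to be matched by hand to the DOI that replaced it.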

Has anyone ever experienced anything like this? I’m not familiar with the internal workings of permanent link systems, but is there a way to easily move from one system to another? Does this present an issue for the future of web science where DOIs or handles are obsolete? I imagine in that world there would need to be a system wide effort to ensure everything is upgraded properly (like switching from paper to electronic records).

100% Real-time publication: an experiment in #opennotebookscience

I’ve long been an advocate of open notebook science. In my advocacy, I am always looking for new ways to encourage fellow researchers to pursue this methodology for their own research. The latest pertains to archiving and citability.

The ability to receive credit for your research has been a requirement of science culture for quite some time, and is presently essential to an academic career. The altmetrics movement has been a valuable way to track and receive academic credit for new and nontraditional publication methods. Online tools like Impactstory help to track these activities, while tools like Figshare help propagate data and track your online impact as well.

This has always been missing from open notebooks.

I’ve always advocated against the need for a singular open notebook platform for the reason that ONS needs to have the flexibility to meet the needs of the scientists who use it. I’ve also never actively pursued a tool that can provide that formal citation credit since there are APA, MLA, etc rules for citing websites and other online resources. But the success of Figshare and other software has made me rethink this approach.

If open notebooks could have an automatic way to apply either a handle or a DOI, and could be archived, I think people would pay attention. If there was a publishing platform that could freely contain all the information of an open notebook, give the notebook a DOI (for instance) for each entry, and then host the final publication for peer review, there would be an even bigger incentive for ONS. And obviously there would be more transparency in the research process.

Where am I going with this?

Well a few days ago, I did a search for “DOI for WordPress” and came up with this, a plugin for a website called The Winnower. I had never heard of this organization so I went to the website and found a world of opportunity.

The Winnower, in case you are unfamiliar, is self-labeled as a DIY science publication platform that features a post-publication peer-review process to expedite and lower the entry barrier for publication. Once you submit your manuscript you can request a DOI for your article, which will undergo changes as you receive feedback for the publication.

The aforementioned plugin allows you to post blog entries (self-hosted WordPress blogs only for now) to the Winnower and receive DOIs, and with it the easy ability to be cited, for those entries. Integration between an open notebook and the Winnower (or a platform like it) could be a huge step forward for the ONS movement.

Imagine being able to see the entire scientific record for a study contained in the same system. Even better, imagine being able to witness the development of the study in real-time, providing feedback to the experiment, and being active in its development. When it comes time for peer-review, the process should theoretically be quick, because the work should have been vetted. If it hasn’t already, then it is relatively easy to review the prior work summarized in the publication, because it is all self-contained on the publishing platform (or the open notebook where the publication is).

In the interest of open science, I will perform an experiment. I will re-publish a series of notebook entries pertaining to one experiment and will write a paper based on that experiment. All of that will be published on the Winnower, since the mechanism is in place to cross-post from this notebook to that site.

The experiment I have in mind is the Repeating Crumley experiment that was the basis for my work on deuterium depleted water. It is the perfect experiment for this trial in ONS publication because the work turned out to reveal a mistake in the original study from the 1960s, and I also propose a correction to the methods.

The key to this ONS experiment would be to understand what would be required of an open notebook or publication system to be able to provide a complete, organized, and user-friendly documentation system, or at least what is required for proper interaction between an open notebook and a publication platform. Additionally I hope to demonstrate another benefit to open notebook science in an effort to encourage others to participate in ONS.

In the spirit of open notebook science, I will document my interactions here and possibly also on the Winnower, and then write another publication on the Winnower about ONS and the peer-review system.

You can follow the documentation process through my Winnower profile.

UNM’s panel discussion about the use of Open Data #oaweek2014

I’m a bit slow to catch up on these things, but UNM has been holding a series of conversations about open access. For instance, today Mark Hahnel discussed the growth of figshare and how much has changed since the organization began. Here are the collected tweets from a panel discussion regarding open access data sharing. Definitely worth the read.

Storify and tweets by Steve Koch


Do you?
