Tag Archives: impact factor

Impact: Analytics for October

I was originally thinking of just posting the analytics for last week, but I’ve been receiving WordPress stats since this notebook started and I implemented Google Analytics towards the end of September so I might as well show the results of the whole month. Also I was going to post on Friday, but realized that stats taken for that day weren’t finished and decided to wait until this week to post concluded analytics.

Since I don’t know how to get data out of WordPress stats and only just learned that GA data can be exported, I took screenshots of the data. Unfortunately that means the data isn’t interactive (though it isn’t that way in a paper anyway). Hopefully I’ll learn how to deal with that in the coming weeks.

Finally I just want to say that from here on out data will only be of the previous week, and maybe a quick monthly report. In weekly data sets I’ll be looking at page views, visitors, and navigation. In monthly reports I’ll be looking at stat summaries, but will include popular pages, world maps, and whatever else is fun at the time. Today I’ve got a bunch of stuff so let’s jump right in!

[portfolio_slideshow]

On top of what I’ve written in the captions of the pictures above, I looked at some other details of the stats available to me and jotted a few things down:

  • The average visitor time on site is around 2 minutes. This means people are staying and reading content. If visitors weren’t interested and left immediately this number would be lower. What’s more is that the average number of pages visited is 2, which means there is some site navigation. Since most information is posted to the main page there doesn’t need to be much navigation to determine usefulness. Having any in the context of a blog format (to me) is a pretty big deal.
  • With that said I noted the average time spent on 2 popular posts. For the ONS vs traditional scientific infrastructure post visitors spent an average of 2:48 (min:sec) on the page. This post is rather long which means visitors skimmed, left immediately, or read until they found relevant information and then left the site/page. On the first Impact Factor post (posted Friday) visitors spent an average of over 9min on the page (9:23)! I can’t explain that at all.
  • It should be noted that my visits are supposed to be blocked so they don’t get tracked in the data. I say supposed to be because there are 13 visits in GA from CHTM. At least 90% of those are me (accounting for friends and Koch). I usually mess in the back end of the site and almost never see the actual content (except in preview modes) so there is very minimal effect on my part anyway. It also should be noted that 13 out of 480 views is 2.7%, so my impact is negligible.
  • I am a firm believer in the uselessness of Twitter. Well, Twitter is showing me, because it is my biggest referrer. In fact, because of this, whenever I post something new and want to ensure its success I ask Koch to retweet my post. Its impact snowballs from there and within 24 hours I will have a considerable number of hits (compared to my daily average). Social networking works!
  • The average page load time is 30 seconds. That is staggeringly high, but I think I can attribute it to one main cause: The water evaporation experiment. The slideshow script takes a while to load and I had 8 on the main page for about 2 weeks. This would definitely explain the high avg load time.
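
For anyone who wants to check the arithmetic in the list above, here is a minimal Python sketch. It uses only the two numbers quoted there (13 visits from CHTM out of 480 views); the function name is my own placeholder and not part of any analytics tool.

```python
def share_of_traffic(own_visits: int, total_visits: int) -> float:
    """Return own_visits as a percentage of total_visits."""
    if total_visits <= 0:
        raise ValueError("total_visits must be positive")
    return 100.0 * own_visits / total_visits


# Numbers quoted in the post: 13 visits from CHTM out of 480 in total.
chtm_share = share_of_traffic(13, 480)
print(f"{chtm_share:.1f}% of traffic")  # prints "2.7% of traffic"
```

The same function could be reused each week to keep an eye on how much of the traffic is my own.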

Now for the moment we’ve all been waiting for (well at least I was). I did a comparison of the number of page views for various dates from both GA and WordPress. I’ve been complaining about inconsistencies, but it appears that they are actually pretty similar with a couple of exceptions:

If you have any suggestions for tracking, making the data open, things you’d like to see specific to the site stats, or whatever else let me know in the comments.

Web Analytic Tools for Impact Factor

I’m going to publish my web analytics data weekly on this site. WordPress has its own limited site stats. On top of that I linked my Google Analytics account to this notebook, and I’m looking to check out one more analytics package: either Piwik (which is free, open source software that you install on your server) or CrazyEgg (which was suggested to me by Alan Marnett of BenchFly fame). Ideally I’d like to do both, but the downside of installing all this analytics software is that it may bog down the site and cause lengthy loading times, which would drive away traffic. Maybe scientists are different and will wait to see content, but I don’t want to take that chance.

Why so many different analytics?

Well, from what I can tell, the data that is generated isn’t consistent. I experienced this on my own personal blog, which is hosted by Google (Blogger). There are site stats built into the Blogger software itself, and I initially set up Google Analytics to analyze this blog (before Blogger added analytics). The two sources report different information (hits, links, navigation, etc.) and the strange thing is that both are Google Analytics. I can’t tell if they are different versions of Analytics or the same version, but the fact is that the same piece of software somehow generates two different results.

So far in my own experience here the same rule applies: Google Analytics tells me one thing and WordPress tells me another. So by adding at least one more piece of software I’ll hopefully be able to get a better account of how my notebook is being used by the public. After all having more data is better than not having enough!
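
One way to put numbers on that disagreement is to line up the daily page views from two tools and flag the days where they differ by more than some tolerance. The sketch below is only an illustration: the function name, the 10% tolerance, and the sample counts are all assumptions I made up, and none of them are this notebook’s real stats.

```python
def compare_daily_views(source_a, source_b, tolerance=0.10):
    """Return days where two tools disagree by more than `tolerance`.

    source_a, source_b: dicts mapping a day label to a page-view count.
    The relative difference is measured against the larger of the two counts.
    """
    flagged = {}
    for day in sorted(set(source_a) & set(source_b)):
        a, b = source_a[day], source_b[day]
        larger = max(a, b)
        if larger == 0:
            continue  # both tools saw nothing, so there is nothing to compare
        rel_diff = abs(a - b) / larger
        if rel_diff > tolerance:
            flagged[day] = (a, b, round(rel_diff, 2))
    return flagged


# Made-up counts purely for illustration; NOT the notebook's real stats.
google_analytics = {"day 1": 40, "day 2": 55, "day 3": 20}
wordpress_stats = {"day 1": 42, "day 2": 54, "day 3": 35}

print(compare_daily_views(google_analytics, wordpress_stats))
```

Only days present in both sources are compared, so a day missing from one tool would need to be handled separately.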

I will publish various pieces of information from all sources like hits for the week, top visited page, relevant references, or whatever else I determine is useful to the cause. Since most of this data is presented to me as charts and graphs, I will upload those graphs here for you to enjoy as well!

This week’s data should be particularly enjoyable since I published that ONS vs science infrastructure article. I bet you can’t wait to see the data!

 

Publishing Openly on the Web: Impact Factor

It occurred to me, sadly not too long ago, that the web is way different than a book. Traditionally in science the only way to gauge your presence besides having a network of colleagues is through peer review citations. By this I mean having your publications cited in other publications.

Back before computers I have no idea how people measured how many times they were cited. Did they even care? Were the publication places so few that it was easy? I wish I could ask Gilbert Lewis!

But now, this is rather easy thanks to all the search algorithms used around the internet.

Typically this method of measuring scientific worth is known as impact factor. While it has its uses (it is easy to gauge how prominent someone is if the number is high), it has several flaws, the worst of which is that it is used to prejudge papers submitted for peer review.

Several publications (Science and Nature being the proclaimed best) accept papers for print based on perceived impact factor, meaning reviewers and editors judge papers based on whether or not they think a paper will have a big impact in the field. While this has its own merit, the biggest problem with that is there are just as many papers in those publications that have low citations as there are in any other journal.

While open access journals have begun a fight against this thinking, and journals such as PLoS One review potential articles without perceiving impact, I feel that there is a whole lot more that can be done to truly weigh impact factor. What’s more is that I want to apply this to open notebook science to (hopefully) show that open notebook articles can carry just as much weight, if not more, as a peer reviewed article in any journal.

And just how do I intend to do all that?

With website analytics!

Because of search engines, you no longer need to go to the library to find articles relevant to your field of study. You can just type something vague into a search engine bar, like tobacco seeds d2o – which has the top site as this notebook – and you can find all manner of information. Google Scholar allows you to narrow the search to just scholarly articles, and hopefully one day lab notebooks can be included in that search algorithm.

It is because of the ability to search that people decided they needed some way to measure visitors to their sites. Someone created/developed/invented web analytics to track all sorts of fancy information about a visitor’s use of a website. Just about everyone on the web does this, and I thought “Why not scientists?”

Not all website analytics software is the same but generally speaking they all have the same basic functions:

  • track hits – number of visitors to the site
  • record movement – where in your site visitors navigate, and how long they stay there
  • referrers – did a visitor click a link from another website to get to yours? Was it a search engine, RSS reader, colleague, etc?
  • search results – which search engine a visitor used, and what they searched to find your page
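
To make the four functions above a little more concrete, here is a rough Python sketch of how they could be computed from a log of page views. This is not how Google Analytics or WordPress stats actually work internally; the `PageView` record, its fields, and the sample pages are all hypothetical, and “hits” follows my definition above (number of visitors).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageView:
    visitor_id: str                     # hypothetical anonymous visitor identifier
    page: str                           # path viewed on the site
    seconds_on_page: int                # how long the visitor stayed on the page
    referrer: Optional[str]             # site that linked here, None for direct visits
    search_terms: Optional[str] = None  # query typed into a search engine, if any


def summarize(views):
    """Compute the four basic measures from a list of PageView records."""
    time_per_page = Counter()
    views_per_page = Counter()
    for v in views:
        time_per_page[v.page] += v.seconds_on_page
        views_per_page[v.page] += 1
    return {
        # track hits: number of distinct visitors
        "hits": len({v.visitor_id for v in views}),
        # record movement: average seconds spent on each page
        "avg_seconds": {p: time_per_page[p] / views_per_page[p] for p in views_per_page},
        # referrers: where visitors came from
        "referrers": Counter(v.referrer for v in views if v.referrer),
        # search results: what visitors searched for
        "search_terms": Counter(v.search_terms for v in views if v.search_terms),
    }


# Made-up log entries for illustration only.
views = [
    PageView("a", "/impact-factor", 120, "twitter.com"),
    PageView("a", "/ons-vs-infrastructure", 60, None),
    PageView("b", "/impact-factor", 240, "google.com", "tobacco seeds d2o"),
]
summary = summarize(views)
print(summary["hits"], summary["avg_seconds"]["/impact-factor"])
```

Real analytics tools collect far more than this, but these four tallies cover the basics listed above.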

While most of this is used for marketing purposes, it is still relevant for science. Being able to judge how many people consistently read your open notebook seems to be something that scientists would like to know. Seeing that people are able to access your data through search results is useful and can help scientists optimize their results to target specific people. And finding out how people are coming across your information from sites that aren’t search engines seems like it could be the new impact factor.

Twitter and Facebook have revolutionized how data is shared. I have a network of followers and friends, and if I post something useful in my blog I share it to Twitter/Facebook. My network finds and reads it and then shares that with their network. That, to me, is just like traditional citations: the people I’m targeting with my notebook entries are my peers, they read it, offer feedback and comments, and then share it with their peers. The act of sharing is the modern citation.

Not only that, but my commenting plugin (Disqus) actually counts the number of social media interactions and links to my notebook as “Reactions.” This gives me a solid number to say my notebook has been referenced “so many” times. Sounds like a citation to me.

Of course, people may not weigh this data the same, but I’m hoping that by acquiring actual numbers to associate to my notebook to go along with real world success stories, people will begin to weigh these citations similarly. I mean if all the major businesses in the world can count hits and use that as their metric for popularity and impact, why can’t science?