Tag Archives: impact factor

Quick analytics of the most popular pages in my notebook

According to WordPress stats the greatest post I’ve ever written has received almost 1000 visits (973 as of right now). That is astounding when compared to the homepage, which has been visited 1667 times and has existed for longer than 3 weeks (5 months to be more accurate). The next most visited page after those two is my Experiments page (click link above), which has been visited 254 times. That number alone is pretty successful, but it’s peanuts compared to the number of people who care/complain about the peer-review article I wrote (which itself is peanuts compared to much more popular scientific writings).

Just saying…

Monday Analytics on Tuesday: 2011 in Review

There is a new feature in the WordPress stats that allows me to see a summary of the best stats for 2011 (for this site). I’ve decided to make it public (because that’s what I do) and share it with you.

You can see the most visited posts and more by clicking this link:

Monday Analytics Reviewed

By the way, I’ll be bringing back the site stats after ScienceOnline 2012.

Analytics Monday: Week of 12/5/11

The big news of this week is the addition of FigShare to my open notebook repertoire. I uploaded the Crumley data there for all to access. While technically the data was already open and accessible through this notebook, having it in more than one location is better!

I think I’m losing the goal of analytics. While I love looking at the hits and it is a very short-term reward, I think the important information is not how many people are visiting the site, but where the people are coming from. By knowing where the audience is visiting from, you can better gauge the level of impact you may have on the scientific community. As an example, a lot of my traffic comes from Google searches – I still get a ton of hits for “Open PCR” (and will probably get more because of that mention there) – which is great, but probably not measurable right now in terms of traditional impact. But every now and again I get a visitor who is referred from a site that links to my blog. While right now this is kinda small potatoes, eventually (hopefully) someone will link to a protocol or a data set, which to me is just as good as a paper citation. That to me says “this person has a well-written protocol that you can trust and follow” or “here is some interesting data based on a similar set of experiments.”

When that happens (and it will) and it happens to others (and it will as well) then ONS will become a viable outlet for more than just a handful of scientists. So today let’s look at some referrers:

Figure of visits by source (from Google Analytics).
  • A good number of visitors came from search results and referrals. The referrals are listed as: Facebook (predominantly), Google Plus, LinkedIn, Andy’s Notebook, and Wikipedia. I’ve noticed that I’ve been getting some hits from the ONS Wikipedia page which warms my heart.
  • As you can see, I’ve been hitting the social media outlets pretty hard. I actually don’t use twitter that frequently, but I have my posts auto-populate twitter and they go viral from there. I’m pretty amazed because I’ve always said Twitter is useless. It works pretty well for a few hours and then goes dead; that’s how fast information moves nowadays. As for the rest of the social media, I only use it because how are other scientists supposed to come across things that may interest them if I don’t do some form of promotion?
  • A surprise to me is that most of my hits are getting tracked as “campaign” and I don’t know what that means! I know one component of campaigns has to do with visitors from RSS feeds and another source of campaign traffic is hits from Twitter. I would have assumed twitter would go under referrals, since other social media is sourced as that. I’ll have to investigate further since I don’t understand the associations, but it is interesting that I could even have a campaign association for an open notebook.
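A possible explanation for the “campaign” bucket, offered as an assumption rather than something confirmed for this site: Google Analytics files a visit under a campaign when the landing URL carries `utm_` query parameters, and feed or auto-posting services often append them to links. A minimal Python sketch of spotting such tags follows; the URLs and the `campaign_tags` helper are hypothetical and only illustrate the idea.

```python
from urllib.parse import urlparse, parse_qs

def campaign_tags(url):
    """Return the utm_* query parameters found in a landing URL."""
    query = parse_qs(urlparse(url).query)
    # parse_qs returns a list per key; keep the first value of each utm_ key.
    return {key: values[0] for key, values in query.items() if key.startswith("utm_")}

# Hypothetical landing URLs, invented for illustration only.
tagged = "http://example.org/post?utm_source=twitterfeed&utm_medium=twitter"
plain = "http://example.org/post"

print(campaign_tags(tagged))  # tagged link: would be counted as campaign traffic
print(campaign_tags(plain))   # no tags: falls back to referral, search, or direct
```

If the links that reach Twitter or the RSS feed turn out to carry tags like these, that would account for those visits showing up as campaigns instead of referrals.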

UPDATE: I removed the section that links to the FigShare Crumley data. I thought the embed box on FigShare linked to the data, but it instead linked to the site itself. Oh well.

Analytics Monday: Week of 11/28/11

This was a big week for my notebook. For some reason on Friday I had 100 page views! WP stats won’t give me enough detail to track what people were looking at, but GA gave me a little more insight. WP told me that there were over 50 views to the main page, but Analytics actually says there were fewer visits to that page and that there was more surfing than usual.

Interestingly there were only 18 visitors, which means that several people (or one person) spent a considerable time browsing. And I know that it isn’t anyone from the lab because there are no views from anywhere in NM! It’s also good to know that the visitors were able to navigate easily enough, especially because the day before I posted an article about the potential importance of user experience in an open notebook.

With that said here are the pageviews for the week:

I’m looking to expand this study a bit so if anyone has any ideas I’d love to hear them.

Analytics Monday on Tuesday: 11/14 – 11/27

Well it looks like my notebook took a hit while I was away because there was no new content. What intrigues me is that I usually have steady traffic from Google searches relating to open pcr, open notebook science, tobacco seeds, etc., but for some reason no one searched this past week and a half. I guess I answered everyone’s questions, ha!

As usual here are the analytics reports from my two tools, and it looks like the correlation is pretty remarkable. Check it out:

 

Analytics Monday: Week of 11/7/11

This week was a very strange week indeed. It started off normally and I received generic traffic from people following my notebook. Then I created Google+ and Facebook pages to promote my notebook through other media. Then I blogged about the motivations for doing so and the failures of each. Then for some reason I got a ton of traffic, and none of it makes sense. To make matters worse, Google Analytics and WordPress Stats don’t correlate at all during these record-breaking numbers, and my social media tracking tools either don’t work at all or fail because twitter doesn’t work. Let’s talk about this:

  • I published the article about Google+ and Facebook Pages on Thursday. Then on Friday I had a huge spike in numbers. The traffic on this day doesn’t demonstrate anything relevant to that topic, yet I still had the most page views ever in the short history of this notebook. This day also marks the end of the correlation between the two analytics services. WP says 85 views and GA says 81. The most viewed page was the main page, with the Macrophotography post coming in second.
  • Apparently on Friday there was a lot of browsing: GA reports 8 pages per visit, meaning people surfed the site. Great!
  • Saturday was real weird because WP says there was almost no traffic, while GA says there was a lot. Sunday, on the other hand, got a ton of traffic on WP but about half as much on GA. I’m like 99% sure this discrepancy is due to a time error in WP. I’ve noticed that some posts will be dated the next day if I publish around 5pm (not sure what the exact time is). And according to GA the first big spike in traffic on Saturday was at 7pm MT, which would be after WP switches dates. That would explain the large number of visitors on Sunday for WP and the split amount on Sat and Sun for GA.
  • A new referrer has joined the crowd, Networked Blogs. I had to sign up for that so I could import my notebook posts into the new IheartAnthony’s Research Facebook Page. They of course redirect you through their site to my site, which I don’t think I’m too keen on right now, but it works so I won’t complain… yet.

So now that I figured out the discrepancy between the WP and GA stats for the weekend (time difference issues) the numbers between the two continue to correlate well, not perfectly, but well. That is a positive.
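The date-switch explanation can be sketched in a few lines of Python. This assumes (the post does not confirm it) that WP assigns hits to UTC calendar days while GA reports in Mountain Time, and it uses a fixed UTC-7 offset for Mountain Standard Time. The calendar date below is a placeholder; only the 7pm local time comes from the post.

```python
from datetime import datetime, timedelta, timezone

# Assumed local zone: Mountain Standard Time as a fixed UTC-7 offset.
# During daylight saving it would be UTC-6; a 7pm hit rolls over either way.
MST = timezone(timedelta(hours=-7), "MST")

def report_dates(local_time):
    """Return the (local date, UTC date) a single hit would be counted under."""
    utc_time = local_time.astimezone(timezone.utc)
    return local_time.date().isoformat(), utc_time.date().isoformat()

# Placeholder Saturday, 7pm local: the hour of the first big spike per GA.
spike = datetime(2011, 11, 12, 19, 0, tzinfo=MST)
local_day, utc_day = report_dates(spike)
print(local_day, utc_day)  # the hit lands on Saturday locally but Sunday in UTC
```

Under that assumption, any hit after 5pm MST falls on the next UTC day, which matches the observation about posts published around 5pm.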

Unfortunately all this new traffic has not resulted in any positive measurable effects outside of web traffic stats. I’ve received a few comments from outside the lab (thanks Bill Hooker!). For me, the purpose of open notebook science isn’t just about data transparency and archiving, but it is also about engaging the community. And until I’ve got some success stories to share, measuring analytics won’t convince anyone that open notebooks are going to be the future of science.

I need a community project…

Impact: Analytics for week of 10/31/11

I’m still trying to figure out how to direct this experiment. I think right now I’m just going to compare the analytics tools to ensure they are similar in results and are reliable. If something prominent ever happens (like the research here gets cited or something) I’ll try and show proof through the analytics. Right now this is just raw data that is noteworthy, but not for any particular reason.

Here is the Site Stats Comparison (in Page Views):

Notes:

  • There was a noticeable amount of traffic last week from searches that involved OpenPCR. I wrote a series of posts about building the device and troubleshooting. The guys over at OpenPCR loved it, and I’ve been getting hits sporadically relating to those posts, but this past week almost every day I had a hit relating to those posts. Hopefully they are useful to people.
  • Someone also searched “eee transformer useful” and came to this site. I most certainly didn’t say in my post on the tablet (Asus Eee Pad Transformer) that the device was useful so I’m assuming they got that message. I still don’t like tablets and feel they have a long way to go. In related news I learned that the chromebooks have been on sale for a little while. I would much rather have one of those. Maybe I should try it out.
  • Someone also searched “tobacco seeds microscope.” That warms my heart because that size comparison study was pretty well done (with rulers and everything) and I hope the person who searched that got what they were looking for. I wish people would leave comments though so I had some sort of feedback.

There isn’t much else to report this week. I got a lot of hits from twitter again and some from Steve’s blog. The main page was the most looked at page, but a close second was the setup for RC4, and surprisingly Alex was not the majority of those hits. Wednesday saw a huge spike in traffic that I can’t explain, and Google Analytics and WordPress stats don’t correlate very well on that day either (46 vs 71 views).

Impact: Analytics for October

I was originally thinking of just posting the analytics for last week, but I’ve been receiving WordPress stats since this notebook started and I implemented Google Analytics towards the end of September so I might as well show the results of the whole month. Also I was going to post on Friday, but realized that stats taken for that day weren’t finished and decided to wait until this week to post concluded analytics.

Since I don’t know how to get data out of WordPress stats and only just learned that GA data can be exported, I took screen shots of the data. Unfortunately that means the data isn’t interactive (but it isn’t that way in a paper anyway), but hopefully I’ll learn how to deal with it in the coming weeks.

Finally I just want to say that from here on out data will only be of the previous week, and maybe a quick monthly report. In weekly data sets I’ll be looking at page views, visitors, and navigation. In monthly reports I’ll be looking at stat summaries, but will include popular pages, world maps, and whatever else is fun at the time. Today I’ve got a bunch of stuff so let’s jump right in!

Slideshow of analytics screenshots, with captions.

On top of what I’ve written in the captions of the pictures above, I looked at some other details of the stats available to me and jotted a few things down:

  • The average visitor time on site is around 2 minutes. This means people are staying and reading content. If visitors weren’t interested and left immediately this number would be lower. What’s more is that the average number of pages visited is 2, which means there is some site navigation. Since most information is posted to the main page there doesn’t need to be much navigation to determine usefulness. Having any in the context of a blog format (to me) is a pretty big deal.
  • With that said I noted the average time spent on 2 popular posts. For the ONS vs traditional scientific infrastructure post visitors spent an average of 2:48 (min:sec) on the page. This post is rather long which means visitors skimmed, left immediately, or read until they found relevant information and then left the site/page. On the first Impact Factor post (posted Friday) visitors spent an average of over 9min on the page (9:23)! I can’t explain that at all.
  • It should be noted that my visits are supposed to be blocked so they don’t get tracked in the data. I say supposed to be because there are 13 visits in GA from CHTM. At least 90% of those are me (accounting for friends and Koch). I usually mess in the back end of the site and almost never see the actual content (except in preview modes) so there is very minimal effect on my part anyway. It also should be noted that 13 views out of 480 views is 2.7%, so my impact is negligible.
  • I am a firm believer in the uselessness of Twitter. Well they are showing me, because Twitter is my biggest referral. In fact, because of this, whenever I post something new and want to ensure its success I ask Koch to retweet my post. Its impact snowballs from there and within 24 hours I will have a considerable number of hits (compared to my daily average). Social networking works!
  • The average page load time is 30 seconds. That is staggeringly high, but I think I can attribute it to one main cause: The water evaporation experiment. The slideshow script takes a while to load and I had 8 on the main page for about 2 weeks. This would definitely explain the high avg load time.
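The 2.7% figure in the notes above is straightforward to check. A minimal Python sketch, using only the two numbers quoted in the list (13 CHTM visits and 480 total views); the helper name `share_percent` is just for illustration:

```python
def share_percent(part, total):
    """Return part as a percentage of total."""
    return 100 * part / total

# 13 views from CHTM out of 480 total views, as quoted in the notes.
print(f"{share_percent(13, 480):.1f}%")  # prints 2.7%
```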

Now for the moment we’ve all been waiting for (well at least I was). I did a comparison of the number of page views for various dates from both GA and WordPress. I’ve been complaining about inconsistencies, but it appears that they are actually pretty similar with a couple of exceptions:

If you have any suggestions for tracking, making the data open, things you’d like to see specific to the site stats, or whatever else let me know in the comments.

Web Analytic Tools for Impact Factor

I’m going to publish my web analytics data weekly on this site. WordPress has its own limited site stats. On top of that I linked my Google Analytics account to this notebook, and I’m looking to check out one more analytics package: either Piwik (which is free open source software that you install on your server) or CrazyEgg (which was suggested to me by Alan Marnett of BenchFly fame). Ideally I’d like to do both, but the downside of installing all this analytics software is that it may bog down the site and cause lengthy loading times, which would drive away traffic. Maybe scientists are different and will wait to see content, but I don’t want to take that chance.

Why so many different analytics?

Well from what I can tell, the data that is generated isn’t consistent. I experienced this on my own personal blog which is hosted by Google (Blogger). There are site stats that are incorporated with the Blogger software itself and I initially set up Google Analytics to analyze this blog (before Blogger added analytics). Both sources reveal different information (hits, links, navigation, etc.) and the strange thing is that both are Google Analytics. I can’t tell if they are different versions of Analytics or the same version, but the fact is that the same piece of software somehow generates two different results.

So far in my own experience here the same rule applies: Google Analytics tells me one thing and WordPress tells me another. So by adding at least one more piece of software I’ll hopefully be able to get a better account of how my notebook is being used by the public. After all having more data is better than not having enough!

I will publish various pieces of information from all sources like hits for the week, top visited page, relevant references, or whatever else I determine is useful to the cause. Since most of this data is presented to me as charts and graphs, I will upload those graphs here for you to enjoy as well!

This week’s should be particularly enjoyable since I published that ONS vs science infrastructure article. I bet you can’t wait to see the data!

 

Publishing Openly on the Web: Impact Factor

It occurred to me, sadly not too long ago, that the web is way different than a book. Traditionally in science the only way to gauge your presence besides having a network of colleagues is through peer review citations. By this I mean having your publications cited in other publications.

Back before computers I have no idea how people measured how many times they were cited. Did they even care? Were the publication places so few that it was easy? I wish I could ask Gilbert Lewis!

But now, this is rather easy thanks to all the search algorithms used around the internet.

Typically this method of measuring scientific worth is known as impact factor. While it has its uses (it is easy to gauge how prominent someone is if the number is high), it has several flaws, the worst of which is that it is used for prejudging papers submitted for peer review.

Several publications (Science and Nature being the proclaimed best) accept papers for print based on perceived impact factor, meaning reviewers and editors judge papers based on whether or not they think a paper will have a big impact in the field. While this has its own merit, the biggest problem with that is there are just as many papers in those publications that have low citations as there are in any other journal.

While open access journals have begun a fight against this thinking, and journals such as PLoS One review potential articles without perceiving impact, I feel that there is a whole lot more that can be done to truly weigh impact factor. What’s more is that I want to apply this to open notebook science to (hopefully) show that open notebook articles can carry just as much weight, if not more, as a peer reviewed article in any journal.

And just how do I intend to do all that?

With website analytics!

Because of search engines, you no longer need to go to the library to find articles relevant to your field of study. You can just type something vague into a search engine bar, like tobacco seeds d2o – which has the top site as this notebook – and you can find all manner of information. Google Scholar allows you to narrow the search to just scholarly articles, and hopefully one day lab notebooks can be included in that search algorithm.

It is because of the ability to search, that people decided they needed some way to measure visitors to their sites. Someone created/developed/invented web analytics to track all sorts of fancy information about a visitor’s use of a website. Just about everyone on the web does this, and I thought “Why not scientists?”

Not all website analytics software is the same but generally speaking they all have the same basic functions:

  • track hits – number of visitors to the site
  • record movement – where in your site visitors navigate, and how long they stay there
  • referrers – did a visitor click a link from another website to get to yours? Was it a search engine, RSS reader, colleague, etc?
  • search results – which search engine a visitor used, and what they searched to find your page
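As a rough illustration of the “referrers” function listed above, here is a minimal Python sketch that sorts a referrer URL into direct, search, social, or referral traffic. The host lists and sample hits are illustrative assumptions, not how Google Analytics or WordPress stats actually classify visits.

```python
from collections import Counter
from urllib.parse import urlparse

# Illustrative host lists only; real analytics tools use far longer ones.
SEARCH_HOSTS = {"google.com", "bing.com", "yahoo.com"}
SOCIAL_HOSTS = {"facebook.com", "twitter.com", "plus.google.com", "linkedin.com"}

def classify_referrer(referrer):
    """Classify a referrer URL as direct, social, search, or referral."""
    if not referrer:
        return "direct"  # no referrer header: typed URL, bookmark, etc.
    host = urlparse(referrer).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host in SOCIAL_HOSTS:
        return "social"
    if host in SEARCH_HOSTS:
        return "search"
    return "referral"  # any other site that links to yours

# Hypothetical hits, invented for illustration.
hits = [
    "http://www.google.com/search?q=open+pcr",
    "https://plus.google.com/123",
    "http://en.wikipedia.org/wiki/Open_notebook_science",
    "",
]
print(Counter(classify_referrer(hit) for hit in hits))
```

Tallying the categories this way gives the kind of visits-by-source breakdown that an analytics dashboard reports.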

While most of this is used for marketing purposes, it is still relevant for science. Being able to judge how many people read your open notebook consistently seems to be something that scientists would like to know. Seeing that people are able to access your data through search results is useful and can help scientists optimize their results to target specific people. And finding out how people are coming across your information from sites that aren’t search engines seems like it could be the new impact factor.

Twitter and Facebook have revolutionized how data is shared. I have a network of followers and friends and if I post something useful in my blog I share it to Twitter/Facebook. My network finds and reads it and then shares that with their network. That, to me, is just like traditional citations because the people I’m targeting with my notebook entries are my peers: they read it, offer feedback and comments, and then share it with their peers. The act of sharing is the modern citation.

Not only that, but my commenting plugin (Disqus) actually counts the number of social media interactions and links to my notebook as “Reactions.” This gives me a solid number to say my notebook has been referenced “so many” times. Sounds like a citation to me.

Of course, people may not weigh this data the same, but I’m hoping that by acquiring actual numbers to associate to my notebook to go along with real world success stories, people will begin to weigh these citations similarly. I mean if all the major businesses in the world can count hits and use that as their metric for popularity and impact, why can’t science?