what is total-Impact?
Total-Impact is a website that makes it quick and easy to view the impact of a wide range of research
output. It goes beyond traditional measurements of research output -- citations to papers -- to embrace
much broader evidence of use across a wide range of scholarly output types. The system aggregates
impact data from many sources and displays it in a single report, which is given a permanent URL for
dissemination and can be updated at any time.
who is it for?
- researchers who want to know how many times their work has been downloaded, bookmarked, and
blogged
- research groups who want to look at the broad impact of their work and see which outputs have
attracted interest
- funders who want to see what sort of impact they may be missing when only considering
citations to papers
- repositories who want to report on how their research artifacts are being discussed
- all of us who believe that people should be rewarded when their work (no matter what the
format) makes a positive impact (no matter what the venue). Aggregating evidence of impact will
facilitate appropriate rewards, thereby encouraging additional openness of useful forms of research
output.
how should it be used?
Total-Impact data can be:
- highlighted as indications of the minimum impact a research artifact has made on the
community
- explored more deeply to see who is citing, bookmarking, and otherwise using your work
- run to collect usage information for mention in biosketches
- included as a link in CVs
- analyzed by downloading detailed metric information
how shouldn’t it be used?
Some of these caveats stem from total-Impact's early stage of development, some reflect our
still-early understanding of altmetrics, and some are just common sense. Total-Impact reports shouldn't be
used:
- as indication of comprehensive impact
Total-Impact is in early development. See limitations and take it all
with a grain of salt.
- for serious comparison
Total-Impact is currently better at collecting comprehensive metrics for some artifacts than
others, in ways that are not obvious from the report, so extreme care should be taken in comparisons.
Numbers should be considered minimums. Even more care should be taken in comparing collections
of artifacts, since total-Impact is currently better at identifying artifacts specified in
some ways than in others. Finally, some of these metrics can be easily gamed; this is one reason we
believe having many metrics is valuable.
- as if we knew exactly what it all means
The meaning of these metrics is not yet well understood; see the section
below.
- as a substitute for personal judgement of quality
Metrics are only one part of the story. Look at the research artifact for yourself and talk about
it with informed colleagues.
what do these numbers actually mean?
The short answer is: probably something useful, but we’re not sure what. We believe that dismissing the
metrics as “buzz” is short-sighted: surely people bookmark and download things for a reason. The long
answer, as well as a lot more speculation on the long-term significance of tools like total-Impact, can
be found in the nascent scholarly literature on “altmetrics.”
The Altmetrics Manifesto is a good, easily-readable
introduction to this literature, while the proceedings of the recent altmetrics11 workshop goes into more detail. You can
check out the shared altmetrics
library on Mendeley for even more relevant research. Finally, the poster Uncovering impacts: CitedIn and
total-Impact, two new tools for gathering altmetrics, recently submitted to the 2012
iConference, describes a case study using total-Impact to evaluate a set of research papers funded by
NESCent; it has some brief statistical analysis and some visualisations of the results.
what kind of research artifacts can be tracked?
Total-Impact currently tracks a wide range of research artifacts, including papers, datasets, software,
preprints, and slides.
Because the software is in early development it has limited robustness for input variations: please pay
close attention to the expected format and follow it exactly. For example, inadvertently including a
"doi:" prefix, or omitting "http" from a url may render the IDs unrecognizable by the system. Add each
ID on a separate line in the input box.
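As a purely hypothetical illustration of that cleanup (the `normalize_id` helper below is our own sketch, not part of total-Impact), a client-side check like this could catch the two mistakes mentioned above before submission:

```python
# Sketch of a client-side cleanup for the two common input mistakes noted
# above: a stray "doi:" prefix, and a URL missing its "http" scheme.
# This helper is illustrative only; it is not part of total-Impact itself.
def normalize_id(raw_id):
    """Return an identifier in the bare format the input box expects."""
    cleaned = raw_id.strip()
    # DOIs should be entered alone: strip an accidental "doi:" prefix.
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[len("doi:"):].strip()
    # URLs must start with http; restore the scheme if it was omitted.
    if cleaned.startswith("www."):
        cleaned = "http://" + cleaned
    return cleaned

# One ID per line, as the input box expects:
ids = ["doi:10.1371/journal.pcbi.1000361",
       "www.slideshare.net/phylogenomics/eisenall-hands"]
print("\n".join(normalize_id(i) for i in ids))
```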
artifact type | host | supported ID format | example
a published paper | any journal that issues DOIs | DOI (simply the DOI alone) | 10.1371/journal.pcbi.1000361
a published paper | PubMed | PubMed ID (no prefix) | 17808382
a published paper | Mendeley | Mendeley UUID | ef35f440-957f-11df-96dc-0024e8453de8
dataset | Genbank | accession number | AF313620
dataset | PDB | accession number | 2BAK
dataset | Gene Expression Omnibus | accession number | GSE2109
dataset | ArrayExpress | accession number | E-MEXP-88
dataset | Dryad | DOI | 10.5061/dryad.1295
software | GitHub | URL (starting with http) | https://github.com/mhahnel/total-Impact
software | SourceForge | URL | http://sourceforge.net/projects/aresgalaxy
slides | SlideShare | URL | http://www.slideshare.net/phylogenomics/eisenall-hands
generic url | a conference paper, website resource, etc. | URL | http://opensciencesummit.com/program/
Identifiers are automatically exploded to include synonyms when possible (PubMed IDs to DOIs, DOIs to
URLs, etc).
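This synonym expansion might be pictured with a sketch like the following (our own illustration of the idea, not total-Impact's actual code; the dx.doi.org resolver mapping is the one well-known case shown):

```python
# Illustration of identifier "explosion": one known ID is expanded to the
# synonyms that other APIs understand. This is our own sketch of the idea,
# not total-Impact's internal implementation.
def expand_synonyms(namespace, value):
    """Return a set of (namespace, value) pairs naming the same artifact."""
    synonyms = {(namespace, value)}
    if namespace == "doi":
        # Every DOI has a resolver-URL synonym.
        synonyms.add(("url", "http://dx.doi.org/" + value))
    return synonyms

print(expand_synonyms("doi", "10.5061/dryad.1295"))
```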
Stay tuned, we expect to support more artifact sources soon! Want to see something included that isn't
here? See the How can I help section below.
which metrics are measured?
Metrics are computed based on the following data sources:
CrossRef An official Digital Object Identifier (DOI) Registration Agency of the International DOI Foundation.
- authors: the authors of the publication
- journal: the journal where the paper was published
- year: the year of the publication
- title: the title of the publication
Mendeley A research management tool for desktop and web.
- groups: the number of Mendeley groups that include the article
- readers: the number of Mendeley readers of the article
Slideshare The best way to share presentations, documents and professional videos.
- downloads: the number of downloads of the presentation
- favorites: the number of times the presentation has been favorited
- views: the number of views of the presentation
Dryad An international repository of data underlying peer-reviewed articles in basic and applied biology.
- most downloaded file: the number of downloads of the most commonly downloaded data package component
- total downloads: the combined number of downloads of the data package and data files
- package views: the number of views of the main package page
- file views: the combined number of views of the data package and data files
PLoSALM PLoS article-level metrics.
- blogs: the number of blog posts about this article recorded in Research Blogging (API now defunct)
- blogs: the number of times this article was mentioned on Postgenomic blogs; this service was discontinued by Nature Publishing Group in 2009
- blogs: the number of blog articles in Nature Blogs that have mentioned the article
- blogs: the number of times this article was mentioned on Bloglines blogs (API now defunct)
- bookmarks: the number of times a user has bookmarked the article in CiteULike
- bookmarks: the number of bookmarks to the article in Connotea (API almost defunct)
- citations: the citation data reported for the article from CrossRef
- citations: the citation data reported for the article from PubMed Central
- citations: the citation data reported for the article from Scopus
- citations: the citation data reported for the article from Web of Science
- citations: the number of times the article has been cited by other articles in PubMed Central (confirm)
- cited by: the citation data reported for the article from PubMed Central
- abstract views: the number of times the abstract has been viewed at PubMed Central (confirm)
- html views: the number of downloads of the PLoS HTML article
- html views: the number of times the full text has been viewed at PubMed Central (confirm)
- pdf views: the number of downloads of the PLoS PDF article
- pdf views: the number of times the PDF has been viewed at PubMed Central (confirm)
- xml views: the number of downloads of the PLoS XML article
- figure views: the number of times the figures have been viewed at PubMed Central, if applicable (confirm)
- supp data views: the number of times the supplementary material has been viewed at PubMed Central (confirm)
- scanned page views: the number of times the scanned pages have been viewed at PubMed Central, if applicable (confirm)
- scanned summary views: the number of times the scanned summary has been viewed at PubMed Central, if applicable (confirm)
- unique ip views: the number of unique IP addresses that have viewed the artifact at PubMed Central (confirm)
PLoSsearch PLoS full-text search.
- mentions: the number of mentions in PLoS article full text
Facebook A social networking service.
- shares: the number of users who shared a post about the object
- likes: the number of users who liked a post about the object
- comments: the number of users who commented on a post about the object
- clicks: the number of users who clicked on a post about the object
CiteULike A free service to help you to store, organise and share the scholarly papers you are reading.
- bookmarks: the number of times a user has bookmarked this artifact
Wikipedia The free encyclopedia that anyone can edit.
- mentions: the number of articles that mention this artifact
Delicious The tastiest bookmarks on the web.
- bookmarks: The number of bookmarks to this artifact
(maximum=100)
PubMed PubMed comprises more than 21 million citations for
biomedical literature from MEDLINE, life science journals, and online books.
- citations: The number of times this DOI has been cited in
papers in PubMed Central
Topsy Real-time search for the social web.
Research Blogging Allows readers to easily find blog posts about serious peer-reviewed research.
- blogs: the number of blogs about this research article indexed by Research Blogging
GitHub Social Coding.
- watchers: the number of people who are watching the GitHub repository
SourceForge Find, Create, and Publish Open Source software for free.
- recommenders: the number of times a user has recommended this software package
where is the journal impact factor?
We do not include the Journal Impact Factor (or any similar proxy) on purpose. As has been repeatedly shown, the Impact
Factor is not appropriate for judging the quality of individual research artifacts. Individual article
citations reflect much more about how useful papers actually were. Better yet are article-level metrics,
as initiated by PLoS, in which we examine traces of impact beyond citation. Total-Impact broadens this
approach to artifact-level metrics, by including preprints, datasets, presentation
slides, and other research output formats.
where is my other favourite metric?
We only include open metrics here, and so far only a selection of those. We welcome contributions of
plugins. Your plugin need not reside on our server: you can host it yourself, as long as we can call it
through our REST interface. Write your own and tell us about it.
You can also check out these similar tools:
what are the current limitations of the system?
Total-Impact is in early development and has many limitations. Some of the ones we know about:
Gathering IDs and quick reports sometimes miss artifacts
- misses papers in Mendeley profiles that aren't returned in a title/author/year search
- Mendeley groups detail page only shows public groups
- seeds only first 100 artifacts from Mendeley groups
- doesn’t handle DOIs for books properly
Artifacts are sometimes missing metrics
- doesn’t display metrics with a zero value, though this information is included in raw data for
download
- sometimes the artifacts were received without sufficient information to use all metrics. For
example, the system sometimes can't figure out the DOI from a Mendeley UUID or URL.
Metrics sometimes have values that are too low
- some sources have multiple records for a given artifact. Total-Impact only identifies one copy and
so only reports the impact metrics for that record. It makes no current attempt to aggregate across
duplications within a source.
Other
- max of 250 artifacts in a report; artifact lists that are too long are truncated and a note is
displayed on the report.
Tell us about bugs!
@totalImpactdev (or via email to
total-Impact@googlegroups.com)
is this data Open?
We’d like to make all of the data displayed by total-Impact available under CC0. Unfortunately, the
terms-of-use of most of the data sources don’t allow that. We're trying to figure out how to handle
this.
An option to restrict the displayed reports to Fully Open metrics — those suitable for commercial use —
is on the To Do list.
The total-Impact software itself is fully open source under an MIT license; the code is on GitHub.
does total-Impact have an api?
yes! We have a full
roadmap of an api spec and have implemented the main piece. Please don’t use it heavily or in
production yet; we haven't implemented good caching. It is still early days: we welcome your feedback on
how to make it useful and easy.
Initial implementation includes:
- GET /items/ID1,ID2,ID3 or GET /items/ID1,ID2,ID3.html
- returns html for those IDs, as it would appear on the total-impact website.
- GET /items/ID1,ID2,ID3.json
- all metrics info in json format
- GET /items/ID1,ID2,ID3.xml
- all metrics info in xml format
- GET /items/ID1,ID2,ID3.json?fields=biblio,aliases,metrics,debug
- allows subsetting the metrics info returned
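For instance, a client might assemble request URLs for the endpoints above like this (a sketch: the base URL is our placeholder assumption, not a documented endpoint, and, as noted, heavy use is discouraged until caching is in place):

```python
# Sketch of building GET /items/... request URLs for the endpoints listed
# above. BASE_URL is a placeholder assumption, not the documented endpoint.
BASE_URL = "http://total-impact.org"

def items_url(artifact_ids, fmt="json", fields=None):
    """Build a URL for GET /items/ID1,ID2,ID3.<fmt>[?fields=...]."""
    url = "%s/items/%s.%s" % (BASE_URL, ",".join(artifact_ids), fmt)
    if fields:  # optional subsetting, e.g. ["biblio", "metrics"]
        url += "?fields=" + ",".join(fields)
    return url

print(items_url(["10.1371/journal.pcbi.1000361", "17808382"],
                fields=["biblio", "metrics"]))
```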
who developed total-Impact?
Concept originally hacked at the Beyond Impact Workshop.
Contributors.
Continued development effort on this skunkworks project was done on personal time, plus some
discretionary time while funded through DataONE (Heather Piwowar)
and a UNC Royster Fellowship (Jason Priem).
what have you learned?
- the multitude of IDs for a given artifact is a bigger problem than we guessed. Even articles
that have DOIs often also have urls, PubMed IDs, PubMed Central IDs, Mendeley IDs, etc. There is
no one place to find all synonyms, yet the various APIs often only work with a specific one or
two ID types. This makes comprehensive impact-gathering time consuming and error-prone.
- some data is harder to get than we thought (wordpress stats without requesting consumer key
information)
- some data is easier to get than we thought (vendors willing to work out special agreements,
permit web scraping for particular purposes, etc)
- lack of an author-identifier makes us reliant on user-populated systems like Mendeley for
tracking author-based work (we need ORCID and we need it now!)
- API limits like those on PubMed Central (3 requests per second) make their data difficult to
incorporate in this sort of application
how can I help?
- can you write code? Dive in! github url: https://github.com/mhahnel/total-Impact.
- do you have data? If it is already available in some public format, let us know so we can
add it. If it isn’t, either please open it up or contact us to work out some mutually beneficial
way we can work together.
- do you have money? We need money :) We need to fund future development of the system and
are actively looking for appropriate opportunities.
- do you have ideas? Maybe enhancements to total-Impact would fit in with a grant you are
writing, or maybe you want to make it work extra-well for your institution’s research outputs.
We’re interested: please get in touch (see bottom).
- do you have energy? We need better “see what it does” documentation, better lists of
collections, etc. Make some and tell us, please!
- do you have anger that your favourite data source is missing? After you confirm that its
data isn't available for open purposes like this, write to them and ask them to open it up... it
might work. If the data is open but isn't included here, let us know to help us prioritize.
- can you email, blog, post, tweet, or walk down the hall to tell a friend? See the “this is so cool” section for your vital role....
this is so cool.
Thanks! We agree :)
You can help us. We are currently trying to a) win the PLoS/Mendeley Binary Battle because that
sounds fun, b) raise funding for future total-Impact development, and c) justify spending more time
on this ourselves.
Buzz and testimonials will help. Tweet your reports. Sign up for Mendeley, add public publications to
your profile, and make some public groups. Tweet, blog, send email, and show off total-Impact at
your next group meeting to help spread the word.
Tell us how cool it is at @totalImpactdev (or via
email to total-Impact@googlegroups.com) so we can consolidate the feedback.
I have a suggestion!
We want to hear it. Send it to us at @totalImpactdev (or via email to
total-Impact@googlegroups.com). Total-Impact development will slow for a bit while we get back to
our research-paper-writing day jobs, so we aren’t sure when we’ll have another spurt of time for
implementation.... but we want to hear your idea now so we can work on it as soon as we can.