Enabling time travel for the scholarly web


An international team of information scientists has begun a two-year study to investigate how web links in scientific and other academic articles fail to lead to the resources being referenced.
This is the focus of the Hiberlink project, in which a team from Los Alamos National Laboratory and the University of Edinburgh will assess the extent of “reference rot” using a vast corpus of online scholarly work. The project is funded by a $500,000 grant from the US-based Andrew W. Mellon Foundation and coordinated by EDINA, the designated online services center at the University of Edinburgh, which serves universities and colleges across the UK.
“Increasingly, scientific papers contain links to web pages containing, for example, project descriptions, demonstrations, and software. But, as we all know, web pages change or disappear,” said Herbert Van de Sompel, the Los Alamos principal investigator on the project. “Currently, there is no archival infrastructure to safeguard such pages and hence revisiting them some time after they were linked from a paper is many times impossible. The result is a broken scholarly record.”
Web-based scholarship increasingly includes links to resources needed or created in research activity, including software, datasets, websites, presentations, blogs, and videos, as well as scientific workflows and ontologies. Unlike traditional scholarly articles, these referenced resources often evolve over time. The reference-rot problem occurs whenever the original version of a linked resource is no longer available.
The problem has two aspects. First, the http:// link that references a resource may no longer function. Second, the content at the end of the link may have evolved, perhaps becoming dramatically different from what was originally referenced. So when a researcher later revisits an online scholarly work and double-checks referenced resources to confirm evidence or establish context, the original online information may have changed or ceased to exist.
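The two aspects can be told apart mechanically. The following is a minimal Python sketch, not the project's method: it assumes a fingerprint of the page was recorded when the paper cited it, and the function names and the labels "broken_link", "changed_content", and "intact" are hypothetical, chosen only for illustration. It performs no network access; the caller supplies the HTTP status and the page body.

```python
import hashlib
from typing import Optional


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest used to compare two versions of a page."""
    return hashlib.sha256(content).hexdigest()


def classify_reference(
    status_code: Optional[int],
    cited_fingerprint: str,
    current_content: Optional[bytes],
) -> str:
    """Classify a cited link (hypothetical helper, labels are illustrative).

    status_code is None when the request failed outright (for example,
    the host no longer resolves). cited_fingerprint is assumed to have
    been recorded at the time the paper referenced the resource.
    """
    # First aspect: the link itself no longer functions.
    if status_code is None or status_code >= 400 or current_content is None:
        return "broken_link"
    # Second aspect: the link works, but the content differs from what was cited.
    if fingerprint(current_content) != cited_fingerprint:
        return "changed_content"
    return "intact"


# Example: a page that still resolves but whose text has changed.
original = b"Project description, version 1"
revised = b"Project description, version 2"
print(classify_reference(200, fingerprint(original), revised))
```

A byte-for-byte fingerprint is a deliberately strict choice: it flags even trivial edits as changes, so a real assessment would likely need a more tolerant comparison.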
The Hiberlink project builds directly on a pilot study from Los Alamos, powered by its Memento “Time Travel for the Web” technology. The study confirmed that as much as 30 percent of the http:// links in a selection of 400,000 arXiv.org papers did not function, and that 65 percent of the remaining links referred to a resource that was not archived and was hence in danger of disappearing without a trace.
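Because the 65 percent figure applies only to the links that still worked, it helps to restate both numbers as shares of all links. The short Python sketch below does that arithmetic, taking the reported percentages at face value (the 30 percent figure is an upper bound, so the results are approximate). The function name and dictionary keys are hypothetical.

```python
from fractions import Fraction


def link_breakdown(broken: Fraction, unarchived_of_working: Fraction) -> dict:
    """Express the pilot-study figures as shares of all links (illustrative only)."""
    working = 1 - broken
    return {
        "broken": broken,
        "working_but_unarchived": working * unarchived_of_working,
        "working_and_archived": working * (1 - unarchived_of_working),
    }


shares = link_breakdown(Fraction(30, 100), Fraction(65, 100))
for label, share in shares.items():
    # Convert each exact fraction to a percentage for display.
    print(f"{label}: {float(share * 100):.1f}%")
```

On these figures, roughly 45.5 percent of all links were working but unarchived, and only about 24.5 percent were both working and archived.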