Overview
The Citation Metadata Repository (CMR) is an open database that others can build upon. It is separate from the archive maintained by each government agency. The CMR would be run by a third party, use an open format, and contain the following metadata:
- the full text of the citation
- the URL of the text in the original context on the agency's archive
- the URLs where the citation occurs on downstream blogs or reports
- the number of times the citation is clicked on the downstream blog
- the number of times the citation is displayed on the downstream blog
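A minimal sketch of how one CMR record could be represented, assuming one record per citation. The class name, field names, and example URLs are illustrative assumptions, not a published Citability.org schema:

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class CitationRecord:
    """One hypothetical CMR entry; field names are assumptions."""
    text: str                    # full text of the citation
    source_url: str              # text in its original context on the agency's archive
    downstream_urls: List[str] = field(default_factory=list)  # blogs/reports citing it
    click_count: int = 0         # times the citation was clicked downstream
    display_count: int = 0       # times the citation was displayed downstream


# Placeholder values only; the .example domains are not real services.
record = CitationRecord(
    text="Example cited passage.",
    source_url="https://archive.agency.example/doc/123#p4",
    downstream_urls=["https://blog.example/post/1"],
)

# Serializing to JSON is one possible "open format" for the repository.
record_json = json.dumps(asdict(record), indent=2)
```

JSON is used here only because it is a widely readable open format; the source does not specify one.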
Workflow: Inserting a Citation into a Blog
- A blogger selects text on a government Web site and clicks the Citability logo when it appears next to the selected text. (This requires the Web site to include Citability.org JavaScript with added functionality.) Because the logo appears whenever text is selected, this also introduces many users to Citability.org
- The blogger can then copy and paste two items into their blog: the selected text and the corresponding Citability URL. They can choose between two Citability URLs: one that creates a full record in the CMR and one that creates a record that skips data-copying into the CMR. By default nothing connects the citation to the blogger, so the citation is citer-agnostic. However, the expectation is that most bloggers will choose to be connected with the citation to increase interest in their blog
- Each time the link is clicked (or possibly even displayed), the CMR will record that information (and perhaps associated metadata such as ________), and the user who clicks is taken to the archived version of the text hosted on the [government archive server]. The CMR will not record any information unless the blog's author has given informed consent to make this readership data available
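The click step above could be sketched as follows. This is a sketch under stated assumptions, not the Citability.org implementation: the `record_readership` flag, the in-memory `CMR` dictionary, the citation ids, and the example URL are all hypothetical stand-ins for whatever mechanism captures the blogger's informed consent.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Citation:
    archive_url: str                 # archived text on the agency's server
    record_readership: bool = False  # assumed flag: blogger gave informed consent
    clicks: int = 0


# Hypothetical in-memory stand-in for the CMR, keyed by citation id.
CMR: Dict[str, Citation] = {}


def handle_click(citation_id: str) -> str:
    """Count the click only with consent, then return the redirect target."""
    citation = CMR[citation_id]
    if citation.record_readership:
        citation.clicks += 1
    # The reader is always sent to the archived text, consent or not.
    return citation.archive_url


ARCHIVE_URL = "https://archive.agency.example/doc/123#p4"  # placeholder
CMR["full"] = Citation(ARCHIVE_URL, record_readership=True)
CMR["minimal"] = Citation(ARCHIVE_URL)

handle_click("full")     # recorded: consent was given
handle_click("minimal")  # redirected only: nothing recorded
```

The redirect happens in both cases, so a reader's experience does not depend on whether the blogger opted in to readership data.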
Workflow: Clicking on Citation
- A reader who sees a citation on a blog can click on it and be taken to the citation in context in the agency's archive
Advantages
- Researchers who are interested in a phrase such as "stem cells" can be notified whenever someone cites a portion of a government document containing that term
- The CMR will help publishers (bloggers, researchers, etc.) reach a broader audience through notifications
- The CMR will enable semantic analysis of citations, especially across documents
- Everyone will have access to the CMR, which will provide analytics data that would normally be available only to search engines or URL shortening services
- The CMR will be an open platform upon which other applications can be built
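As an illustration of the notification advantage listed first above, here is a minimal sketch of phrase-based subscriptions. It assumes matching is a case-insensitive substring test against the cited text itself; the class, method names, and subscriber ids are hypothetical, and a real service might match against the whole source document instead.

```python
from collections import defaultdict
from typing import Dict, Set


class NotificationIndex:
    """Maps subscribed phrases to the subscribers who want to hear about them."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, Set[str]] = defaultdict(set)

    def subscribe(self, subscriber: str, phrase: str) -> None:
        # Store phrases lowercased so matching is case-insensitive.
        self._subscribers[phrase.lower()].add(subscriber)

    def match(self, citation_text: str) -> Set[str]:
        """Return every subscriber whose phrase occurs in the cited text."""
        text = citation_text.lower()
        matched: Set[str] = set()
        for phrase, subscribers in self._subscribers.items():
            if phrase in text:
                matched |= subscribers
        return matched


index = NotificationIndex()
index.subscribe("researcher-1", "stem cells")
to_notify = index.match("Funding for research on Stem Cells was expanded.")
```

A linear scan over all phrases is enough for a sketch; a larger repository would likely need a proper text index.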
Issues and Solutions
- Privacy Concerns
- Solution: Citability will not collect identifiable information (including IP address) about bloggers who insert citations or readers who click through on citations contained in the CMR
- Agencies are prevented by law from collecting/storing certain types of data about use of their Web sites
- Solution: The data that would be collected is about the blogs that refer to the original government web source rather than data about traffic to government web sites
- Solution: Citability.org (or a third party) would host the data, rather than the government