Wikidata + Citoid = <3

Talk by Marielle Volz

[[User:Mvolz_(WMF)]]
[[User:Mvolz]]
@mariellevolz
mvolz@wikimedia.org
marielle.volz@gmail.com
CC-by-4.0

What is citoid?

  • RESTful API / web scraper that returns metadata for a given identifier (typically a URL; it also accepts a QID, full citation, PMID, etc.).
  • Publicly available at https://en.wikipedia.org/api/rest_v1/#!/Citation/getCitation
  • Optimised towards traditional citation styles (MLA, APA, Chicago).
  • Currently used to generate citation templates on some Wikipedias.
  • Returns JSON: all leaf values are strings (some fields, such as author and ISBN, are arrays of strings).

Sample request

https://en.wikipedia.org/api/rest_v1/data/citation/mediawiki/https%3A%2F%2Flink.springer.com%2Fchapter%2F10.1007%2F978-3-7908-2628-9_7

[
  {
    "url": "https://link.springer.com/chapter/10.1007/978-3-7908-2628-9_7",
    "itemType": "bookSection",
    "place": "Heidelberg",
    "publisher": "Physica-Verlag HD", 
    "pages": "97–111",
    "title": "On Implementation of the Markov Chain Monte Carlo Stochastic Approximation Algorithm",
    "bookTitle": "Advances in Directional and Linear Statistics",
    "date": "2010-09-27",
    "ISBN": [ "9783790826272", "9783790826289"],
    "language": "en",
    "abstractNote": "The Markov Chain Monte Carlo Stochastic Approximation Algorithm (MCMCSAA) was developed to compute estimates of parameters with incomplete data. In theory this algorithm guarantees convergence to the...",
    "accessDate": "2018-10-09",
    "author": [
      ["Yihua", "Jiang"],
      ["Peter", "Karcher"],
      ["Yuedong", "Wang"]
    ],
    "DOI": "10.1007/978-3-7908-2628-9_7",
    "source": [ "Crossref", "citoid"]
  }
]
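
A request like the one above can be assembled by percent-encoding the identifier and appending it to the REST endpoint. The sketch below is a minimal illustration, not citoid's own client code: the helper name `citoidUrl` is hypothetical, the base path and the `mediawiki` format segment are copied from the sample URL, and the network call is left commented out.

```javascript
// Base path and "mediawiki" format segment are taken from the sample request above.
const CITOID_BASE = "https://en.wikipedia.org/api/rest_v1/data/citation";

// Hypothetical helper: builds the request URL for an identifier (URL, DOI, PMID, ...).
function citoidUrl(identifier, format = "mediawiki") {
  // The identifier travels as a single path segment, so it is percent-encoded.
  return `${CITOID_BASE}/${format}/${encodeURIComponent(identifier)}`;
}

const requestUrl = citoidUrl(
  "https://link.springer.com/chapter/10.1007/978-3-7908-2628-9_7"
);
console.log(requestUrl);

// With network access, the response could be read like this (not run here):
// const [citation] = await (await fetch(requestUrl)).json();
// console.log(citation.itemType, citation.title);
```

The response is an array, which is why the commented-out lines destructure its first element.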
					

Knowledge Integrity: Citing sources on Wikidata

https://meta.wikimedia.org/wiki/Knowledge_Integrity

Challenges

  • How nested the reference is depends on item type.
  • De-duplication depends on item type.
  • When is the source the website itself, rather than the object it is about?
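
One way to picture the de-duplication challenge: the field that identifies "the same source" differs by item type. The sketch below is only an illustration. The function name `dedupKey`, the choice of ISBN, DOI and URL as keys, and the example book URL are assumptions, not a description of how citoid or Wikidata handle duplicates.

```javascript
// Hypothetical helper: pick a key for spotting duplicates of a citoid result.
function dedupKey(citation) {
  const isbn = Array.isArray(citation.ISBN) ? citation.ISBN[0] : undefined;
  // An ISBN identifies a whole book, so it is only used for itemType "book";
  // a chapter (bookSection) that shares the ISBN is a different source.
  if (citation.itemType === "book" && isbn) {
    return `isbn:${isbn}`;
  }
  // DOIs are case-insensitive, so normalise before comparing.
  if (citation.DOI) {
    return `doi:${citation.DOI.toLowerCase()}`;
  }
  // Fall back to the URL when no stronger identifier is present.
  return `url:${citation.url}`;
}

// A chapter and its enclosing book share an ISBN but get different keys.
const chapterResult = {
  itemType: "bookSection",
  ISBN: ["9783790826272", "9783790826289"],
  DOI: "10.1007/978-3-7908-2628-9_7",
  url: "https://link.springer.com/chapter/10.1007/978-3-7908-2628-9_7",
};
const bookResult = {
  itemType: "book",
  ISBN: ["9783790826272", "9783790826289"],
  url: "https://example.org/book", // placeholder URL for illustration
};

console.log(dedupKey(chapterResult));
console.log(dedupKey(bookResult));
```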

Traditional reference

Jiang, Y., Karcher, P., & Wang, Y. (2010). "On Implementation of the Markov Chain Monte Carlo Stochastic Approximation Algorithm." Advances in Directional and Linear Statistics edited by Martin T. Wells. Physica-Verlag HD, pp. 97–111.
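
To show how a flat citoid result maps onto a traditional reference like the one above, here is a rough sketch. The helper names are hypothetical and the format is a simplified approximation of the slide's example, not a full APA implementation. The sample response carries no editor field, so the "edited by" part is left out.

```javascript
// Subset of the sample citoid response shown earlier in the talk.
const citation = {
  itemType: "bookSection",
  title: "On Implementation of the Markov Chain Monte Carlo Stochastic Approximation Algorithm",
  bookTitle: "Advances in Directional and Linear Statistics",
  publisher: "Physica-Verlag HD",
  pages: "97–111",
  date: "2010-09-27",
  author: [["Yihua", "Jiang"], ["Peter", "Karcher"], ["Yuedong", "Wang"]],
};

// Hypothetical helper: ["Yihua", "Jiang"] becomes "Jiang, Y."
function formatAuthor([given, family]) {
  return `${family}, ${given.charAt(0)}.`;
}

// Hypothetical helper: joins names as "A, B, & C".
function formatAuthors(authors) {
  const names = authors.map(formatAuthor);
  if (names.length < 2) {
    return names.join("");
  }
  return `${names.slice(0, -1).join(", ")}, & ${names[names.length - 1]}`;
}

// Hypothetical helper: formats a bookSection result as a single reference string.
function formatBookSection(c) {
  const year = c.date.slice(0, 4); // dates arrive as strings such as "2010-09-27"
  return `${formatAuthors(c.author)} (${year}). "${c.title}." ${c.bookTitle}. ${c.publisher}, pp. ${c.pages}.`;
}

const formatted = formatBookSection(citation);
console.log(formatted);
```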

Structured references

Flat reference

Two layers

Reference

Two layers

Book

Three layers

Reference

Three layers

Chapter

Three layers

Book

Four layers?

Passage and page level citations
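
The slides behind these headings are diagrams, so the sketch below restates the layering idea as plain data. It is an illustration under assumptions: the property labels (`stated in`, `published in`, `page(s)`, and so on) and the placeholder item IDs `Q_BOOK` and `Q_CHAPTER` are hypothetical stand-ins, not the exact mapping shown in the talk.

```javascript
// All property labels and the Q_BOOK / Q_CHAPTER identifiers are placeholders.

// Flat: every value sits on the reference itself; no separate item is involved.
const flat = {
  reference: {
    "reference URL": "https://link.springer.com/chapter/10.1007/978-3-7908-2628-9_7",
    "title": "On Implementation of the Markov Chain Monte Carlo Stochastic Approximation Algorithm",
    "retrieved": "2018-10-09",
  },
  items: {},
};

// Two layers: the reference points at an item for the book.
const twoLayers = {
  reference: { "stated in": "Q_BOOK", "page(s)": "97–111" },
  items: {
    Q_BOOK: {
      "title": "Advances in Directional and Linear Statistics",
      "publisher": "Physica-Verlag HD",
    },
  },
};

// Three layers: the reference points at the chapter, which points at the book.
const threeLayers = {
  reference: { "stated in": "Q_CHAPTER" },
  items: {
    Q_CHAPTER: {
      "title": "On Implementation of the Markov Chain Monte Carlo Stochastic Approximation Algorithm",
      "published in": "Q_BOOK",
      "page(s)": "97–111",
    },
    Q_BOOK: {
      "title": "Advances in Directional and Linear Statistics",
      "publisher": "Physica-Verlag HD",
    },
  },
};

// Hypothetical helper: count layers by following item links from the reference.
function countLayers({ reference, items }) {
  const seen = new Set();
  let node = reference;
  let depth = 1;
  while (true) {
    // A value that names a not-yet-visited item is treated as a link to the next layer.
    const nextId = Object.values(node).find(
      (v) => Object.prototype.hasOwnProperty.call(items, v) && !seen.has(v)
    );
    if (nextId === undefined) {
      return depth;
    }
    seen.add(nextId);
    node = items[nextId];
    depth += 1;
  }
}

console.log(countLayers(flat), countLayers(twoLayers), countLayers(threeLayers));
```

In this picture, a fourth layer for passage or page level citations would be one more hop in the same chain.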

Gadgets

Katie Filbert's "autofill" tool

Original tool: https://github.com/filbertkm/wikidata-citetool/

Fork with more features: https://github.com/mvolz/wikidata-citoid-gadget

To enable, add this to your common.js:

	// Load the 'wikibase' module first, then fetch the gadget script.
	mw.loader.using(['wikibase'], function() {
		$.getScript( 'https://www.wikidata.org/w/index.php?title=User:Mvolz_(WMF)/CiteTool.js&action=raw&ctype=text/javascript', function() {
			// The constructor takes the URL of the CiteProperties.json file.
			var citeTool = new wb.CiteTool( 'https://www.wikidata.org/w/index.php?title=User:Mvolz_(WMF)/CiteProperties.json&action=raw&ctype=application/json' );
			citeTool.init();
		});
	});
	
  • Flat / two-layer references
  • Does not create new items.

DEMO

wikidata.org

https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002947

Please visit this RFC!

Referencing websites: How do we store the website title?

TODO:

Design Wikibase / Wikidata format for citoid

phab: T208213

Let users easily create new items as needed for references.