NifWrapper is a Python library that makes it practical to work with the NLP Interchange Format (NIF), an RDF-based vocabulary for representing text annotations such as named entities, mentions and links to knowledge bases like DBpedia or Wikidata. NIF is the de facto interchange format in the Entity Linking community, but its RDF nature means that even simple operations (load a document, list its mentions, filter by surface form) require boilerplate code using a triple store API.
The package wraps that complexity behind a small, Pythonic API and is available on PyPI.
Built on rdflib, NifWrapper reads NIF documents in any RDF serialisation supported by rdflib (Turtle, N-Triples, RDF/XML, JSON-LD) and exposes them as plain Python objects.
Documents are represented as NifContext instances containing ordered NifMention objects with character offsets, surface form, an optional taIdentRef link target and any custom annotations.
It is packaged as a setuptools distribution and published to PyPI; tests run with pytest against the included sample corpora.
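The context/mention data model described above can be illustrated with plain dataclasses. This is only a sketch of the idea: `SketchContext`, `SketchMention` and their methods are hypothetical names chosen for illustration and are not NifWrapper's actual classes or signatures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SketchMention:
    """Hypothetical stand-in for a mention: offsets, surface form, optional link."""
    begin: int
    end: int
    surface: str
    ta_ident_ref: Optional[str] = None
    extra: Dict[str, str] = field(default_factory=dict)  # custom annotations


@dataclass
class SketchContext:
    """Hypothetical stand-in for a context: the text plus its ordered mentions."""
    uri: str
    text: str
    mentions: List[SketchMention] = field(default_factory=list)

    def add(self, begin, end, ta_ident_ref=None, **extra):
        # Offsets are zero-based with an exclusive end, like Python slices.
        if not 0 <= begin < end <= len(self.text):
            raise ValueError(f"offsets {begin},{end} fall outside the context text")
        mention = SketchMention(begin, end, self.text[begin:end], ta_ident_ref, dict(extra))
        self.mentions.append(mention)
        # Keep mentions ordered by position regardless of insertion order.
        self.mentions.sort(key=lambda m: (m.begin, m.end))
        return mention

    def by_surface(self, surface):
        return [m for m in self.mentions if m.surface == surface]


ctx = SketchContext("http://example.org/doc1", "Berlin lies on the Spree")
ctx.add(19, 24)  # unlinked mention
ctx.add(0, 6, "http://dbpedia.org/resource/Berlin", confidence="0.9")
print([m.surface for m in ctx.mentions])
```

Deriving the surface form from the offsets, rather than accepting it as a separate argument, keeps the two from drifting apart.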
Reading and writing NIF correctly is surprisingly error-prone — offset semantics,
URI minting, the difference between a context and a mention, and the multiple
prefixes (nif:, itsrdf:) all conspire against quick
iteration. NifWrapper consolidates the conventions that emerged from
Fine-Grained Entity Linking, VoxEL and the NIFify tool
into a single dependency so other researchers can focus on their algorithms instead
of on RDF plumbing.
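Two of the pitfalls named above, offset semantics and URI minting, can be sketched in a few lines. The example assumes zero-based offsets with an exclusive end counted in Python string positions, and the RFC 5147 style `#char=begin,end` fragment; the helper names `mint_uri` and `check_mention` are hypothetical and not part of any library.

```python
def mint_uri(base, begin, end):
    """Mint a URI for a text span using an RFC 5147 style fragment (assumed convention)."""
    return f"{base}#char={begin},{end}"


def check_mention(text, begin, end, anchor_of):
    """Return None if the offsets match the anchor text, else a short error message."""
    if not 0 <= begin <= end <= len(text):
        return f"offsets {begin},{end} out of range for text of length {len(text)}"
    if text[begin:end] != anchor_of:
        return f"anchor mismatch: offsets select {text[begin:end]!r}, expected {anchor_of!r}"
    return None


text = "Berlin lies on the Spree"
base = "http://example.org/doc1"

# The context spans the whole text; a mention reuses the same base URI.
print(mint_uri(base, 0, len(text)))
print(mint_uri(base, 19, 24))

# Correct offsets: the end index is exclusive.
print(check_mention(text, 19, 24, "Spree"))
# A typical off-by-one error from treating the end index as inclusive.
print(check_mention(text, 19, 23, "Spree"))
```

Running such a check on load or before serialising catches off-by-one errors before they reach an evaluation.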