Data Production

LJ Data Cycle

Musician URIs

For every musician discovered in a transcript, we generate a URI for that person, formed from our namespace followed by the person’s name without spaces.

If the same individual already has a URI in DBpedia, we create a sameAs triple that connects that URI to our Linked Jazz URI.
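The URI scheme and the sameAs link described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the namespace `http://example.org/resource/` is a placeholder and not the project's real namespace, replacing spaces with underscores is one plausible reading of "without spaces", and the helper names `mint_uri` and `same_as_triple` are hypothetical. The `owl:sameAs` property IRI is the standard one from the OWL vocabulary.

```python
from urllib.parse import quote

# Placeholder namespace (assumption); substitute the real project namespace.
NAMESPACE = "http://example.org/resource/"
# Standard OWL property used to state that two URIs identify the same thing.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def mint_uri(name, namespace=NAMESPACE):
    """Return namespace + name, with runs of whitespace replaced by underscores."""
    local_name = "_".join(name.split())
    # Percent-encode any characters that are not safe inside a URI path.
    return namespace + quote(local_name)


def same_as_triple(local_uri, external_uri):
    """Return an N-Triples line linking our URI to an external one."""
    return f"<{local_uri}> <{OWL_SAME_AS}> <{external_uri}> ."


uri = mint_uri("Mary Lou Williams")
# The DBpedia URI below is an illustrative value, not taken from the dataset.
print(same_as_triple(uri, "http://dbpedia.org/resource/Mary_Lou_Williams"))
```

In practice the lookup that decides whether a matching DBpedia URI exists would happen before `same_as_triple` is called; that step is outside this sketch.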

Relationship Data

Given an interview transcript, the Analyzer breaks the text into question-and-answer blocks and uses this structure to automatically identify connections between individuals. It works on the assumption that if the interviewee mentions a person, the interviewee must at least know of that person. An RDF triple is then generated that states this basic relationship:

<interviewee URI> <knows-of predicate> <mentioned person URI> .
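A minimal sketch of this idea follows; it is not the Analyzer's actual implementation. It assumes a transcript whose lines start with `Q:` or `A:`, matches names by plain substring search against a list of already-known names, uses a placeholder namespace, and assumes the `knowsOf` term from the RELATIONSHIP vocabulary as the predicate, since the source does not name one here. All function names are hypothetical.

```python
# Placeholder namespace (assumption), not the project's real one.
NAMESPACE = "http://example.org/resource/"
# Assumed predicate: the "knowsOf" term of the RELATIONSHIP vocabulary.
KNOWS_OF = "http://purl.org/vocab/relationship/knowsOf"


def split_blocks(transcript):
    """Split a transcript into (question, answer) pairs.

    Assumes each question line starts with 'Q:' and each answer with 'A:'.
    """
    blocks = []
    question = None
    for line in transcript.splitlines():
        line = line.strip()
        if line.startswith("Q:"):
            question = line[2:].strip()
        elif line.startswith("A:") and question is not None:
            blocks.append((question, line[2:].strip()))
            question = None
    return blocks


def uri_for(name):
    """Build a URI from the namespace and the name with spaces replaced."""
    return NAMESPACE + "_".join(name.split())


def knows_of_triples(transcript, interviewee, known_names):
    """Emit one knows-of triple per person the interviewee mentions."""
    triples = []
    seen = set()
    for _question, answer in split_blocks(transcript):
        # Only answers count: they are the interviewee's own words.
        for name in known_names:
            if name != interviewee and name in answer and name not in seen:
                seen.add(name)
                triples.append(
                    f"<{uri_for(interviewee)}> <{KNOWS_OF}> <{uri_for(name)}> ."
                )
    return triples


transcript = (
    "Q: Who did you play with?\n"
    "A: I played with Art Blakey for years.\n"
    "Q: Did you know Dizzy Gillespie?\n"
    "A: No, we never met.\n"
)
for triple in knows_of_triples(
    transcript, "Mary Lou Williams", ["Art Blakey", "Dizzy Gillespie"]
):
    print(triple)
```

Splitting into question-and-answer blocks is what lets the sketch ignore names that appear only in the interviewer's questions, so only the Art Blakey mention produces a triple in this example.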

In addition to automatic processing, we rely on human contributors to generate data. Through our crowdsourcing tool, 52nd Street, volunteers are shown snippets of interview text and asked to assign a more specific term describing the relationship between the interviewee and the person mentioned. When a volunteer chooses a relationship from the list of options, that choice generates a triple that is stored in our dataset.
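The step from a volunteer's choice to a stored triple might look something like the sketch below. The option labels, the predicate IRIs under `http://example.org/vocab/`, and the function `record_choice` are all hypothetical stand-ins; the actual list of relationships offered by 52nd Street is not given here.

```python
# Hypothetical option list: label shown to the volunteer -> predicate IRI.
# Neither the labels nor the IRIs are taken from the real tool.
RELATIONSHIP_OPTIONS = {
    "friend of": "http://example.org/vocab/friendOf",
    "mentor of": "http://example.org/vocab/mentorOf",
    "played with": "http://example.org/vocab/playedWith",
}


def record_choice(interviewee_uri, mentioned_uri, choice):
    """Turn a volunteer's selection into an N-Triples line.

    Raises ValueError if the choice is not one of the offered options.
    """
    if choice not in RELATIONSHIP_OPTIONS:
        raise ValueError(f"Unknown relationship option: {choice!r}")
    predicate = RELATIONSHIP_OPTIONS[choice]
    return f"<{interviewee_uri}> <{predicate}> <{mentioned_uri}> ."


dataset = []  # stands in for the project's triple store
dataset.append(
    record_choice(
        "http://example.org/resource/Mary_Lou_Williams",
        "http://example.org/resource/Art_Blakey",
        "played with",
    )
)
print(dataset[0])
```

Restricting volunteers to a fixed list keeps the resulting predicates consistent, which is why the sketch rejects any label outside the mapping.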
