Language Data Commons RO-Crate Profile

This document is a DRAFT RO-Crate profile for Language Data resources. The profile specifies the contents of RO-Crate Metadata Documents for language resources and gives guidance on how to structure language data collections both at the RO-Crate package level and in a repository containing multiple packages.

This profile assumes that the principles and standards set out in the PILARS protocols, or similar compatible approaches, are being used.

The core metadata vocabularies for this profile are:

- RO-Crate recommendations for data packaging and basic discoverability metadata, which are mostly Schema.org terms with a handful of additions. Following RO-Crate practice, basic metadata terms such as "who, what, where" and bibliographic-style descriptions are chosen from Schema.org (in preference to other vocabularies such as Dublin Core or FOAF) where possible, with domain-specific vocabularies used for things which are not common across domains (such as types of language).
- An updated version of the Open Language Archives Community (OLAC) vocabularies, which were originally expressed as XML schemas. The new vocabulary is under development here: https://w3id.org/ldac/terms

Audience

This document is primarily for use by tool developers, data scientists and metadata specialists developing scripts or systems for user communities. It is not intended for use by non-specialists.

Just as we would not expect repository users to type Dublin Core metadata in XML format by hand, we do not expect our users to have to deal directly with the JSON-LD presented here. This document is for tool developers to build systems that crosswalk data from existing systems, or allow for user-friendly data entry.

About this Profile

This profile covers various kinds of crate metadata:

- Structural RO-Crate metadata: how the root dataset links to files, and the abstract structure of nested collections (e.g. collections/corpora or other curated datasets) and objects of study (linguistic Items, Sessions or Texts). This profile assumes that a repository (for example, an OCFL storage root with an API for accessing it) exists and that it can, at a minimum, support:
  (a) listing all items in the repository and returning their RO-Crate metadata, and
  (b) retrieving an item given its ID.
- Types of language data: is this resource a dialogue? A written text? A transcript or other annotation? Which file has which kind of data in it? What is inside CSV and other structured files? The vocabulary used for language-specific data is the Language Data Commons vocabulary, which is being developed alongside this profile.
- Contextual metadata: how to link people who had speaking, authoring or collection roles, as well as places and subjects.
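
The two minimum repository operations, (a) and (b), can be pictured as a very small interface. The following is a minimal sketch only: the class name InMemoryRepository and its method names are hypothetical and are not defined by this profile or by OCFL. A real repository would sit behind an API and a storage layer.

```python
from typing import Iterator


class InMemoryRepository:
    """Hypothetical stand-in for a repository such as an OCFL storage root behind an API."""

    def __init__(self) -> None:
        # Maps an item ID to its RO-Crate metadata document (a dict with "@graph").
        self._crates: dict[str, dict] = {}

    def add(self, item_id: str, crate: dict) -> None:
        """Store the RO-Crate metadata for one item."""
        self._crates[item_id] = crate

    def list_items(self) -> Iterator[tuple[str, dict]]:
        """(a) List every item together with its RO-Crate metadata."""
        yield from self._crates.items()

    def get_item(self, item_id: str) -> dict:
        """(b) Retrieve one item's RO-Crate metadata given its ID."""
        return self._crates[item_id]
```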

Structural Metadata

The structural elements of a Language Data Commons RO-Crate are:

A Collection / Object hierarchy to allow language data to be grouped. For example, a corpus with sub-corpora, or collections of items (objects) from a particular region.
Dataset and File entities (as per RO-Crate). Files may be referenced locally or via URI, for example, from an API. If an RO-Crate contains files, they MUST be linked to the root dataset as per the RO-Crate specification using either:
- hasPart relationships on the object(s), or
- isPartOf relationships on the file(s).

NOTE: The terms Collection and Object are encoded in RO-Crate metadata using RepositoryCollection and RepositoryObject types respectively. These in turn are re-named versions of the Portland Common Data Model types, pcdm:Collection and pcdm:Object.
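
As an illustration of the file-linking rule, here is a minimal sketch of a crate with one Object and one file, plus a helper that reports File entities that are not linked by either hasPart or isPartOf. The file name, the entity names and the helper unlinked_files are hypothetical. The helper only checks that some link exists; it does not verify that the chain of links reaches the root dataset.

```python
def as_list(value):
    """Normalise a JSON-LD value (missing, single or list) to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            "@id": "./",
            "@type": ["Dataset", "RepositoryObject"],
            "name": "Example interview",  # hypothetical
            "hasPart": [{"@id": "interview.wav"}],
        },
        # The link may be expressed from either side; both are shown here.
        {"@id": "interview.wav", "@type": "File", "isPartOf": {"@id": "./"}},
    ],
}


def unlinked_files(crate: dict) -> list[str]:
    """Return the @ids of File entities with no hasPart or isPartOf link."""
    graph = crate["@graph"]
    linked = set()
    for entity in graph:
        for ref in as_list(entity.get("hasPart")):
            linked.add(ref["@id"])
    missing = []
    for entity in graph:
        if "File" not in as_list(entity.get("@type")):
            continue
        if entity["@id"] not in linked and not as_list(entity.get("isPartOf")):
            missing.append(entity["@id"])
    return missing
```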

A conformant RO-Crate:

Class: Dataset #class_Dataset

A body of structured information describing some topic(s) of interest.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://schema.org/Dataset
accountablePerson ⓘ #prop_accountablePerson_Dataset	Yes	The person or organisation who is the data steward for this resource.	Person, Organization
author ⓘ #prop_author_Dataset	Yes	The person or organisation responsible for creating this collection of data. Authors should be identified using URIs such as ORCiD or ROR.	Person, Organization
dct:rightsHolder ⓘ #prop_dct:rightsHolder_Dataset	Yes	The person or organisation owning or managing rights over the resource.	http://schema.org/Text, Person, Organization
publisher ⓘ #prop_publisher_Dataset	Yes	The organisation responsible for releasing this dataset.	Organization
citation ⓘ #prop_citation_Dataset	No	Associated publications.	CreativeWork
creditText ⓘ #prop_creditText_Dataset	No	A free text bibliographic citation for this material, e.g. 'Cite as: Musgrave (2023). Title of work. DOI'.	http://schema.org/Text
funder ⓘ #prop_funder_Dataset	No	The organisation(s) responsible for funding the creation or collection of this dataset.	Organization
hasPart ⓘ #prop_hasPart_Dataset	No	An item or CreativeWork that is part of this item, or CreativeWork (in some sense).	CreativeWork, File, Dataset
isAccessibleForFree ⓘ #prop_isAccessibleForFree_Dataset	No	This is available under an Open Access license.	http://schema.org/Boolean
isBasedOn ⓘ #prop_isBasedOn_Dataset	No	Link to or description of an original resource.	http://schema.org/Text, http://schema.org/URL, CreativeWork, Dataset, File
isPartOf ⓘ #prop_isPartOf_Dataset	No	An item or CreativeWork that this item, or CreativeWork (in some sense), is part of.	http://schema.org/URL, CreativeWork
ldac:annotationOf ⓘ #prop_ldac:annotationOf_Dataset	No	This resource contains some kind of description that adds information to the resource it references.	PrimaryMaterial
ldac:annotator ⓘ #prop_ldac:annotator_Dataset	No	The participant produced an annotation of this or a related resource.	Person, Organization
ldac:compiler ⓘ #prop_ldac:compiler_Dataset	No	The participant is responsible for collecting the sub-parts of the resource together.	Person, Organization
ldac:consultant ⓘ #prop_ldac:consultant_Dataset	No	The participant contributes expertise to the creation of a work, for example by contributing knowledge of their native language.	Person, Organization
ldac:dataInputter ⓘ #prop_ldac:dataInputter_Dataset	No	The participant responsible for entering, re-typing, and/or structuring the data contained in the resource.	Person, Organization
ldac:depositor ⓘ #prop_ldac:depositor_Dataset	No	The participant responsible for depositing the resource in an archive.	Person, Organization
ldac:developer ⓘ #prop_ldac:developer_Dataset	No	The participant developed the methodology or tools (including software) that constitute the resource, or that were used to create the resource.	Person, Organization
ldac:doi ⓘ #prop_ldac:doi_Dataset	No	A Digital Object Identifier, e.g. https://doi.org/10.1000/182.	http://schema.org/Text
ldac:editor ⓘ #prop_ldac:editor_Dataset	No	The participant reviewed, corrected, and/or tested the resource.	Person, Organization
ldac:hasCollectionProtocol ⓘ #prop_ldac:hasCollectionProtocol_Dataset	No	A link to a CollectionProtocol object with (at least) a summary of how resources were selected or elicited for this collection/sub-collection.	ldac:CollectionProtocol
ldac:illustrator ⓘ #prop_ldac:illustrator_Dataset	No	The participant contributed drawings or other illustrations to the resource.	Person, Organization
ldac:interpreter ⓘ #prop_ldac:interpreter_Dataset	No	The contributor renders the discourse recorded in the resource into another language in real time, or the contributor explains the discourse recorded in the resource.	Person, Organization
ldac:interviewee ⓘ #prop_ldac:interviewee_Dataset	No	The participant was a respondent in an interview.	Person, Organization
ldac:interviewer ⓘ #prop_ldac:interviewer_Dataset	No	The participant conducted an interview that forms part of the resource.	Person, Organization
ldac:participant ⓘ #prop_ldac:participant_Dataset	No	The participant was present during the creation of the resource, but did not contribute substantially to its content.	Person, Organization
ldac:performer ⓘ #prop_ldac:performer_Dataset	No	The participant performed some portion of a recorded, filmed, or transcribed resource. It is recommended that this term be used only for creative participants whose role is not better indicated by a more specific term, such as 'speaker', 'signer', or 'singer'.	Person, Organization
ldac:photographer ⓘ #prop_ldac:photographer_Dataset	No	The participant took the photograph, or shot the film, that appears in or constitutes the resource.	Person, Organization
ldac:recorder ⓘ #prop_ldac:recorder_Dataset	No	The participant operated the recording machinery used to create the resource.	Person, Organization
ldac:researcher ⓘ #prop_ldac:researcher_Dataset	No	The resource was created as part of the participant's research, or the research presents interim or final results from the participant's research.	Person, Organization
ldac:researchParticipant ⓘ #prop_ldac:researchParticipant_Dataset	No	The participant acted as a research subject or responded to a questionnaire, the results of which study form the basis of the resource.	Person, Organization
ldac:responder ⓘ #prop_ldac:responder_Dataset	No	The participant was an interlocutor in some sort of discourse event, but only reacted to the contributions of others.	Person, Organization
ldac:signer ⓘ #prop_ldac:signer_Dataset	No	The contributor was a principal signer in a resource that consists of a recording, a film, or a transcription of a recorded resource. Signers are those whose gestures predominate in a recorded or filmed resource. (The resource may be a transcription of that recording).	Person, Organization
ldac:singer ⓘ #prop_ldac:singer_Dataset	No	The participant sang, either individually or as part of a group, in a resource that consists of a recording, a film, or a transcription of a recorded resource.	Person, Organization
ldac:speaker ⓘ #prop_ldac:speaker_Dataset	No	The contributor was a principal speaker in a resource that consists of a recording, a film, or a transcription of a recorded resource. Speakers are those whose voices predominate in a recorded or filmed resource. (The resource may be a transcription of that recording).	Person, Organization
ldac:sponsor ⓘ #prop_ldac:sponsor_Dataset	No	The participant contributed financial support to the creation of the resource.	Person, Organization
ldac:transcriber ⓘ #prop_ldac:transcriber_Dataset	No	The participant produced a transcription of this or a related resource.	Person, Organization
ldac:translator ⓘ #prop_ldac:translator_Dataset	No	The participant produced a translation of this or a related resource.	Person, Organization
pcdm:hasMember ⓘ #prop_pcdm:hasMember_Dataset	No	The sub-collections, if any, associated with this collection.	RepositoryCollection, RepositoryObject
pcdm:memberOf ⓘ #prop_pcdm:memberOf_Dataset	No	Links from a Repository Object or Collection to a containing Repository Object or Collection.	RepositoryCollection
spatialCoverage ⓘ #prop_spatialCoverage_Dataset	No	The place(s) that are the focus of the content. It is a sub-property of contentLocation intended primarily for more technical and detailed materials. For example, with a dataset, it indicates areas that the dataset describes: a dataset of Cape York languages would have a spatialCoverage value that is a Place describing the outline of the Cape.	Place
temporalCoverage ⓘ #prop_temporalCoverage_Dataset	No	The range of years of creation for items in this dataset using a slash, e.g. 1900/1945. If there are sub-collections with different coverages put this on the sub-collections not the top-level.	http://schema.org/DateTime, http://schema.org/Text
usageInfo ⓘ #prop_usageInfo_Dataset	No	Additional information on licensing options for using the data, e.g. 'Contact the Data Steward to discuss license terms'.	http://schema.org/Text
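
The Required column of the table above lends itself to a mechanical check. The sketch below assumes a Dataset entity is held as a Python dict and treats a property as missing when it is absent or empty. The function name missing_required and the example IDs are hypothetical; this is not the validator used by the profile.

```python
# Properties marked "Yes" in the Required column of the Dataset table.
REQUIRED_DATASET_PROPERTIES = (
    "accountablePerson",
    "author",
    "dct:rightsHolder",
    "publisher",
)


def missing_required(entity: dict, required=REQUIRED_DATASET_PROPERTIES) -> list[str]:
    """Return the required property names that are absent or empty on the entity."""
    return [prop for prop in required if entity.get(prop) in (None, "", [], {})]


dataset = {
    "@id": "./",
    "@type": "Dataset",
    "author": {"@id": "#person-1"},        # hypothetical local ID
    "publisher": {"@id": "#organisation-1"},  # hypothetical local ID
}
```

Calling missing_required(dataset) on this example reports accountablePerson and dct:rightsHolder as still to be supplied.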

Structure of collections that conform to the Language Data Commons Profile

A collection such as a corpus may be stored in a repository or transmitted either as:

- A distributed collection: a set of individual RO-Crates, each describing one Object or one Collection, which reference separate collection records.
- A bundled single crate: one crate that contains all the Collection and Object data.

Distributed collections may reference member Collections or Objects in a pcdm:hasMember property but should not include descriptions of Objects that are stored elsewhere in the repository.

Classes

In linked data, a class is a resource that represents a concept or entity. Classes specific to the Language Data Commons Schema include:

Class	Description
CollectionEvent	A description of an event at which one or more PrimaryMaterials were captured, e.g. as video or audio.
CollectionProtocol	A description of how this Object or Collection was obtained, such as the strategy used for selecting written source texts, or the prompts given to participants.
DataDepositLicense	A license document setting out terms for deposit into a repository.
DataLicense	A license document for data licensing. This is a superclass of DataReuseLicense and DataDepositLicense.
DataReuseLicense	A license document, setting out terms for reuse of data.

Bidirectional Relationships

The relational hierarchy between Collections, Objects and Files is represented bidirectionally in an RO-Crate by the terms hasPart/isPartOf and pcdm:hasMember/pcdm:memberOf.

Superset Term	Inverse Of	Subset Term
`pcdm:hasMember`	⟷	`pcdm:memberOf`
`hasPart`	⟷	`isPartOf`

Objects are placed in a Collection using the pcdm:memberOf property, which is required. The inverse will be encoded automatically using the pcdm:hasMember property on a Collection. Similarly, if using pcdm:hasMember, pcdm:memberOf will also be automatically encoded.

The same relationship applies for hasPart and isPartOf at the Object and File levels.

Superset Level		Relationship		Subset Level
Collection	→	`pcdm:hasMember`	→	Object
Collection	←	`pcdm:memberOf`	←	Object
Object	→	`hasPart`	→	File
Object	←	`isPartOf`	←	File

Depending on the data, using one term over another may be preferable when creating the hierarchical relationship. For example, if you are describing multiple files in a spreadsheet, it is easier to use isPartOf at the File level referencing the Object it belongs to, rather than listing all the hasPart entries at the Object level.
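
The automatic encoding of inverse relationships described above could be implemented along the following lines. This is a sketch under assumptions, not the profile's tooling: it assumes the graph is a list of dicts, that references take the form {"@id": ...}, and that targets outside the current crate are simply skipped. The function name add_inverse_links is hypothetical.

```python
INVERSES = {
    "hasPart": "isPartOf",
    "isPartOf": "hasPart",
    "pcdm:hasMember": "pcdm:memberOf",
    "pcdm:memberOf": "pcdm:hasMember",
}


def as_list(value):
    """Normalise a JSON-LD value (missing, single or list) to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def add_inverse_links(graph: list[dict]) -> None:
    """For every link in INVERSES, add the inverse link on the target entity (in place)."""
    by_id = {entity["@id"]: entity for entity in graph}
    for entity in graph:
        for prop, inverse in INVERSES.items():
            for ref in as_list(entity.get(prop)):
                if not isinstance(ref, dict):
                    continue  # plain strings are not treated as references here
                target = by_id.get(ref.get("@id"))
                if target is None:
                    continue  # the target is described in another crate
                back_refs = as_list(target.get(inverse))
                if {"@id": entity["@id"]} not in back_refs:
                    back_refs.append({"@id": entity["@id"]})
                target[inverse] = back_refs


graph = [
    {"@id": "#collection", "@type": ["Dataset", "RepositoryCollection"]},
    {
        "@id": "#object1",
        "@type": ["Dataset", "RepositoryObject"],
        "pcdm:memberOf": {"@id": "#collection"},
    },
    {"@id": "audio.wav", "@type": "File", "isPartOf": {"@id": "#object1"}},
]
add_inverse_links(graph)
```

After the call, the Collection carries a pcdm:hasMember link to the Object and the Object carries a hasPart link to the file, so the data can be entered from whichever side is more convenient.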

The following diagram shows how these relationships are encoded in a single "bundled" RO-Crate.

Self-contained collection crate with all resources

The next diagram shows how distributed crates (with one RO-Crate per Object and Collection) are linked.

Distributed crate with links to object crates

Which linking strategy is used is an implementation choice for repository developers.

When to choose collection-as-crate ("bundled") vs collection-in-multiple-crates ("distributed")

Use a single bundled crate for a collection when all of these conditions are true:
- The collection is final and is expected to be stable, i.e. there is negligible chance of having to withdraw any of its contents or files.
- The collection and all its files can easily be transferred in a single transaction - say 20 GB total.
- All the material in the corpus shares the same license for reuse.
Split a collection into distributed RepositoryCollection and RepositoryObject crates, with one crate per repository object, when any of these conditions are true:
- The collection is not yet stable:
  - New items are being added or changed.
  - There is a chance that some data may have to be taken down or withdrawn at the request of participants.
- The total size of the collection will present challenges for data transfer.
- There is more than one data reuse license applicable.
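
The conditions above amount to a simple decision rule, sketched here. The 20 GB threshold comes from the "say 20 GB" guidance in the text and should be adjusted to local infrastructure; the names CollectionFacts and packaging_strategy are hypothetical.

```python
from dataclasses import dataclass

# Taken from the "say 20 GB total" guidance; an assumption, not a hard limit.
MAX_SINGLE_TRANSFER_GB = 20.0


@dataclass
class CollectionFacts:
    is_stable: bool               # final, with negligible chance of withdrawals
    total_size_gb: float          # size of the collection and all its files
    distinct_reuse_licenses: int  # number of different data reuse licenses


def packaging_strategy(facts: CollectionFacts) -> str:
    """Return "bundled" only when all bundling conditions hold, else "distributed"."""
    bundled = (
        facts.is_stable
        and facts.total_size_gb <= MAX_SINGLE_TRANSFER_GB
        and facts.distinct_reuse_licenses <= 1
    )
    return "bundled" if bundled else "distributed"
```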

Collection

A collection is a group of related Objects. Examples of collections include corpora and sub-corpora, as well as aggregations of cultural objects such as PARADISEC collections, which bring together items collected in a region or during a session with informants. This follows the Alveo usage:

Items [Objects in this model] are grouped into collections which might correspond to curated corpora such as ACE or informal collections such as a sample of documents from the AustLit archive.

When an RO-Crate is used to package a collection that is part of another Collection, its root dataset has a pcdm:memberOf property which references a resolvable ID (within the context of a repository or service) of the parent Collection. The Collection may also list its members in a pcdm:hasMember property, but this is not required.

The root dataset must have at least these @type values: ["Dataset", "RepositoryCollection"]
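
For illustration, the root dataset of a sub-collection crate might look like the sketch below, together with a small check of the required @type values. The IDs reuse the Sydney Speaks examples given in the Identifiers section; the name, the language ID "#language-example" and the helper has_required_types are hypothetical.

```python
def has_required_types(entity: dict, required=("Dataset", "RepositoryCollection")) -> bool:
    """Check that the entity's @type includes every required type."""
    types = entity.get("@type", [])
    if isinstance(types, str):
        types = [types]
    return all(t in types for t in required)


sub_collection_root = {
    "@id": "arcp://name,http://www.dynamicsoflanguage.edu.au/sydney-speaks/corpus/collection/SSP",
    "@type": ["Dataset", "RepositoryCollection"],
    "name": "Example sub-corpus",  # hypothetical
    "inLanguage": {"@id": "#language-example"},  # hypothetical Language entity
    # Reference to the parent Collection, resolvable within the repository.
    "pcdm:memberOf": {
        "@id": "arcp://name,http://www.dynamicsoflanguage.edu.au/sydney-speaks/corpus/"
    },
}
```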

A RepositoryCollection:

Class: RepositoryCollection #class_RepositoryCollection

A Collection is a group of resources. Collections have descriptive metadata, access metadata, and may link to works and/or collections. By default, member works and collections are an unordered set, but can be ordered using the ORE Proxy class.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://pcdm.org/models#Collection
inLanguage ⓘ #prop_inLanguage_RepositoryCollection	Yes	The language in which the resource is written.	Language
conformsTo ⓘ #prop_conformsTo_RepositoryCollection	No	A link to the language data commons RO-Crate profile for collections.	Values for conformsTo
contentLocation ⓘ #prop_contentLocation_RepositoryCollection	No	The location depicted or described in the content. For example, the location in a photograph or painting.	Place
dateCreated ⓘ #prop_dateCreated_RepositoryCollection	No	The (earliest) date the data in this dataset were created.	http://schema.org/Date
holdingArchive ⓘ #prop_holdingArchive_RepositoryCollection	No	Organisation where the original of this work or collection is housed.	Organization, http://schema.org/Text
ldac:dateFreeText ⓘ #prop_ldac:dateFreeText_RepositoryCollection	No	Date information which cannot be put in one of the standard date formats, e.g. 'mid-1970s', or it is not clear, for example, if it is a creation or publication date.	http://schema.org/Text
ldac:itemLocation ⓘ #prop_ldac:itemLocation_RepositoryCollection	No	Current location of the item, e.g. where a set of audio tapes are stored.	Place, Organization
ldac:subjectLanguage ⓘ #prop_ldac:subjectLanguage_RepositoryCollection	No	The languages that the materials in the collection are about (not the language that it is in).	Language

Object

An Object is a single unit linked to tightly related files, for example, a dialogue or session in a speech study, or a work (document) in a written corpus. This is based on the use of the term Item in Alveo:

The data model that we have developed for the storage of language resources is built around the concept of an item which corresponds (loosely) to a record of a single communication event. An item is often associated with a single text, audio or video resource but could include a number of resources, for example, the different channels of audio recording, or an audio recording and associated textual transcript. Items are grouped into collections which might correspond to curated corpora such as ACE or informal collections such as a sample of documents from the AustLit archive. https://www.researchonline.mq.edu.au/vital/access/services/Download/mq:37347/DS01

The definition of an object is necessarily loose and needs to reflect what data owners have chosen to do with their collections in the past.

If an RO-Crate contains a single Object, the Root Dataset would have a @type property of ["Dataset", "RepositoryObject"] with a conformsTo property pointing to the Language Data Commons Object profile https://w3id.org/ldac/profile#Object (this document).

If an RO-Crate contains an entire collection, each Object has a @type property of ["Dataset", "RepositoryObject"] and a conformsTo property referencing this document.

Objects SHOULD have files (which may be included in an RO-Crate for the object, or as part of a collection crate).

In this example, the Object in question is an interview from a speech corpus with three files. The diagram shows the relationships between the object and its files, and the contextual metadata of a Person who takes the role of the speaker/informant (discussed in more detail below).

Structure of an Object crate
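
The structure of such an Object can also be sketched as data. The example below is an assumption-laden illustration, not a copy of the crate in the diagram: the file names and the person's name are hypothetical, the person ID reuses the example from the Identifiers section, and the helper names_for is hypothetical.

```python
PERSON_ID = "arcp://name,http://www.dynamicsoflanguage.edu.au/sydney-speaks/corpus/person/54"

object_graph = [
    {
        "@id": "./",
        "@type": ["Dataset", "RepositoryObject"],
        "conformsTo": {"@id": "https://w3id.org/ldac/profile#Object"},
        "name": "Example interview",  # hypothetical
        "ldac:speaker": [{"@id": PERSON_ID}],
        "hasPart": [
            {"@id": "interview.wav"},
            {"@id": "interview.mp3"},
            {"@id": "interview.csv"},
        ],
    },
    {"@id": "interview.wav", "@type": "File", "name": "Original recording"},
    {"@id": "interview.mp3", "@type": "File", "name": "Compressed copy"},
    {"@id": "interview.csv", "@type": "File", "name": "Transcript"},
    {"@id": PERSON_ID, "@type": "Person", "name": "Example Speaker"},
]


def names_for(graph: list[dict], entity_id: str, prop: str) -> list[str]:
    """Resolve the references in entity[prop] to entity names within the same graph."""
    by_id = {entity["@id"]: entity for entity in graph}
    refs = by_id[entity_id].get(prop, [])
    if isinstance(refs, dict):
        refs = [refs]
    return [
        by_id[ref["@id"]].get("name", ref["@id"])
        for ref in refs
        if ref["@id"] in by_id
    ]
```

For instance, names_for(object_graph, "./", "ldac:speaker") follows the speaker link from the Object to the Person entity described as context in the same crate.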

There are a number of terms that can be used to characterise resources - these use the Schema.org mechanism of DefinedTerm and DefinedTermSet.

A RepositoryObject:

Class: RepositoryObject #class_RepositoryObject

An Object is an intellectual entity, sometimes called a "work", "digital object", etc. Objects have descriptive metadata, access metadata, may contain files and other Objects as member "components". Each level of a work is therefore represented by an Object instance, and is capable of standing on its own, being linked to from Collections and other Objects. Member Objects can be ordered using the ORE Proxy class.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://pcdm.org/models#Object
conformsTo ⓘ #prop_conformsTo_RepositoryObject	No	A link to the Language Data Commons RO-Crate profile for objects.	http://schema.org/Text
creator ⓘ #prop_creator_RepositoryObject	No	The creator/author of this CreativeWork. This is the same as the Author property for CreativeWork.	Person
dateCreated ⓘ #prop_dateCreated_RepositoryObject	No	The date on which the CreativeWork was created or the item was added to a DataFeed.	http://schema.org/Text
description ⓘ #prop_description_RepositoryObject	No	A description of the item.	http://schema.org/Text
identifier ⓘ #prop_identifier_RepositoryObject	No	The identifier property represents any kind of identifier for any kind of Thing, such as ISBNs, GTIN codes, UUIDs etc. Schema.org provides dedicated properties for representing many of these, either as textual strings or as URL (URI) links. See background notes for more details.	http://schema.org/PropertyValue, http://schema.org/Text, http://schema.org/URL
ldac:hasAnnotation ⓘ #prop_ldac:hasAnnotation_RepositoryObject	No	This resource is referenced by another resource that adds information to it such as a translation, transcription or other analysis.	Annotation
license ⓘ #prop_license_RepositoryObject	No	A license document that applies to this content, typically indicated by URL.	DataReuseLicense
temporalCoverage ⓘ #prop_temporalCoverage_RepositoryObject	No	The temporalCoverage of a CreativeWork indicates the period that the content applies to, i.e. that it describes, either as a DateTime or as a textual string indicating a time period in ISO 8601 time interval format. In the case of a Dataset it will typically indicate the relevant time period in a precise notation (e.g. for a 2011 census dataset, the year 2011 would be written "2011/2012"). Other forms of content, e.g. ScholarlyArticle, Book, TVSeries or TVEpisode, may indicate their temporalCoverage in broader terms - textually or via well-known URL. Written works such as books may sometimes have precise temporal coverage too, e.g. a work set in 1939 - 1945 can be indicated in ISO 8601 interval format via "1939/1945". Open-ended date ranges can be written with ".." in place of the end date. For example, "2015-11/.." indicates a range beginning in November 2015 and with no specified final date. This is tentative and might be updated in future when ISO 8601 is officially updated.	http://schema.org/Text

Files

There are three important types of files (or references to other works) that may be included: ldac:PrimaryMaterial which is a recording or original text, or a citation of or proxy for it, ldac:DerivedMaterial which has been generated or sampled from primary material by a process such as format conversion or digitization, and ldac:Annotation, which contains one or more types of analysis of the ldac:PrimaryMaterial or ldac:DerivedMaterial.

A File:

Class: File #class_File

A media object, such as an image, video, audio, or text object embedded in a web page or a downloadable dataset i.e. DataDownload.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://schema.org/MediaObject
contentSize ⓘ #prop_contentSize_File	No	File size in (mega/kilo)bytes.	http://schema.org/Text
encodingFormat ⓘ #prop_encodingFormat_File	No	The media type typically expressed using a MIME format.	http://schema.org/Text, http://schema.org/WebPage, http://schema.org/CreativeWork
hasPart ⓘ #prop_hasPart_File	No	An item or CreativeWork that is part of this item, or CreativeWork (in some sense).	CreativeWork, File
ldac:derivationOf ⓘ #prop_ldac:derivationOf_File	No	This property references another resource from which the current resource is derived, e.g. downsampling audio or video files, or extracting text from a PDF.	Annotation, PrimaryMaterial
ldac:hasDerivation ⓘ #prop_ldac:hasDerivation_File	No	This property references another resource that is derived from it, such as a downsampled audio or video file, or text extracted from a PDF.	DerivedMaterial
ldac:materialType ⓘ #prop_ldac:materialType_File	No	Indicates whether the material in a file is the original (primary) source or is derived from it or describes it via annotation.	MaterialTypes

ldac:PrimaryMaterial

ldac:PrimaryMaterial may be a video or audio file if it is available, or may be a ContextualEntity referencing a primary text such as a book.

ldac:DerivedMaterial

ldac:DerivedMaterial is a non-analytical derivation from ldac:PrimaryMaterial, for example, downsampled video or excerpted text.

ldac:Annotation

ldac:Annotation is a description or analysis of other material. More than one type of annotation may be present in a file.
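
The three material types and the links between them can be sketched as follows. This assumes the material type is recorded with the ldac:materialType property and referenced by prefixed IDs such as ldac:PrimaryMaterial; the file names and the helper files_by_material_type are hypothetical.

```python
from collections import defaultdict

files = [
    {
        "@id": "interview.wav",
        "@type": "File",
        "ldac:materialType": {"@id": "ldac:PrimaryMaterial"},
        "ldac:hasDerivation": {"@id": "interview.mp3"},
    },
    {
        "@id": "interview.mp3",
        "@type": "File",
        "ldac:materialType": {"@id": "ldac:DerivedMaterial"},
        "ldac:derivationOf": {"@id": "interview.wav"},
    },
    {
        "@id": "interview.csv",
        "@type": "File",
        "ldac:materialType": {"@id": "ldac:Annotation"},
        "ldac:annotationOf": {"@id": "interview.wav"},
    },
]


def files_by_material_type(entities: list[dict]) -> dict[str, list[str]]:
    """Group file @ids by the @id of their ldac:materialType value."""
    grouped = defaultdict(list)
    for entity in entities:
        material = entity.get("ldac:materialType")
        key = material["@id"] if isinstance(material, dict) else "unspecified"
        grouped[key].append(entity["@id"])
    return dict(grouped)
```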

Describing the columns in CSV or other tabular data

CSV or similar tabular files are often used to represent transcribed speech or sign language data, sometimes also with time codes. To enable automated location of which column is which, use a frictionless Table Schema described by a File entity in the crate.
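
A minimal sketch of this idea follows. The column names, the sample CSV content and the helper functions are hypothetical; only the "fields" list with "name", "type" and "description" keys follows the frictionless Table Schema format. How the CSV File entity points to its schema in the crate is not shown here.

```python
import csv
import io

# Hypothetical Table Schema for a time-aligned transcript.
table_schema = {
    "fields": [
        {"name": "start", "type": "number", "description": "Start time in seconds"},
        {"name": "end", "type": "number", "description": "End time in seconds"},
        {"name": "speaker", "type": "string", "description": "Speaker label"},
        {"name": "text", "type": "string", "description": "Transcribed utterance"},
    ]
}

# Hypothetical CSV content, inlined so the sketch is self-contained.
csv_text = "start,end,speaker,text\n0.0,1.8,A,hello there\n1.8,3.2,B,hi\n"


def header_matches_schema(csv_source: str, schema: dict) -> bool:
    """Check that the CSV header row lists exactly the schema's field names, in order."""
    header = next(csv.reader(io.StringIO(csv_source)))
    return header == [field["name"] for field in schema["fields"]]


def column_values(csv_source: str, schema: dict, name: str) -> list[str]:
    """Locate a column by its schema field name and return its raw cell values."""
    index = [field["name"] for field in schema["fields"]].index(name)
    rows = list(csv.reader(io.StringIO(csv_source)))[1:]  # skip the header row
    return [row[index] for row in rows]
```

With the schema in hand, a tool can find the transcription column by name rather than guessing from its position.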

Places

The place in which data was collected may be indicated using the contentLocation property.

Identifiers

Identifiers for Objects and Collections MUST be URIs.

Internally, identifiers for all entities that do not have their own URIs may use the Archive and Packaging identifier scheme (ARCP), which allows for a DNS-like namespacing of identifiers. For example, the Sydney Speaks corpus top-level collection would have the ID:

arcp://name,http://www.dynamicsoflanguage.edu.au/sydney-speaks/corpus/

A sub-corpus (collection) would have an ID like:

arcp://name,http://www.dynamicsoflanguage.edu.au/sydney-speaks/corpus/collection/SSP

An object:

arcp://name,http://www.dynamicsoflanguage.edu.au/sydney-speaks/corpus/object/331

A person:

arcp://name,http://www.dynamicsoflanguage.edu.au/sydney-speaks/corpus/person/54
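
Identifiers following the pattern above can be generated consistently with a small helper. This is a sketch that mirrors the examples in this section; the function name arcp_id is hypothetical, and path segments are percent-encoded as a precaution.

```python
from urllib.parse import quote

SYDNEY_SPEAKS = "http://www.dynamicsoflanguage.edu.au/sydney-speaks/corpus"


def arcp_id(namespace: str, *segments) -> str:
    """Build an arcp://name, identifier from a namespace and optional path segments."""
    base = f"arcp://name,{namespace.rstrip('/')}"
    path = "/".join(quote(str(segment), safe="") for segment in segments)
    # With no segments the result ends in "/", matching the top-level collection ID.
    return f"{base}/{path}" if path else f"{base}/"
```

For example, arcp_id(SYDNEY_SPEAKS, "object", 331) produces the object identifier shown above.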

How to record people's contributions

Some corpora express ages and other demographics of participants - this presents a data modelling challenge, as age and some other variables change over time, so if the same person appears over time then we need to have a base Person with date of birth etc. as well as time-based instances of the person with an age, social status, gender etc. at that time.

There are three levels at which contributions to an object can be modelled:

- Include one or more Person items as context in a crate and reference them with properties such as creator, or with Language Data Commons Vocabulary properties such as ldac:compiler or ldac:depositor. The @id of the person MUST be a URI and SHOULD be re-used where the same person appears in multiple objects in a collection or repository.
- For longitudinal studies where it is important to record changing demographic information for a Person, or where precision is required in listing contributions to a work, use ldac:PersonSnapshot.
- If it is important to record many contributions to a work (e.g. in analysis of a joint work), or if more precision is required in describing the provenance of items (e.g. this work on The Declaration of Rights of Man and of the Citizen (Lorber-Kasunic & Sweetapple)), use Action.

NOTE: If this approach is used, special care will have to be taken in developing user interfaces and/or training communities to use this way of modelling metadata; the user need not see the underlying structure. This profile does not give advice about how to do this as we have not seen a use case that requires it.

Collection events such as "Sessions"

Where data is collected from participants in a speech study with elicitation tasks such as "sessions" (see this IMDI document) or field interviews, this can be recorded in metadata via the CollectionEvent class.

The indirection in this conformsTo relationship allows multiple objects to declare, via a conformsTo property, that they conform to the same schema while each keeps a local copy of that schema. This follows the RO-Crate best practice of keeping all the context needed to use a data package inside the package where possible.

Examples

https://www.mpi.nl/ISLE/documents/docs_frame.html

Defined Term Sets

Defined Term Set: AccessTypes

ID: ldac:AccessTypes

Set of defined terms for ldac:access

Term	Description
AuthorizedAccess	Indicates that a DataReuseLicense requires some kind of authorization step, from SelfAuthorization (click-through) to processes that require a data steward to grant permission.
OpenAccess	Data covered by this license may be accessed as long as the license is served alongside it, and does not require any specific authorization step.

Defined Term Set: AnnotationTypeTerms

ID: ldac:AnnotationTypeTerms

Set of defined terms for ldac:annotationType

Term	Description
Gestural	The resource describes the gestural content of the resource it annotates.
Orthographic	The resource contains annotations using orthography (a writing system) as opposed to a coded representation such as a phonetic transcription.
PartOfSpeech	An annotation that assigns lexical elements of language to classes on the basis of their distributional properties (for sign languages, the term 'sign class' is appropriate).
Phonemic	An annotation that represents speech in terms of the sound contrasts made in a language.
Phonetic	A representation of speech in terms of the sounds produced, typically using the International Phonetic Alphabet.
Phonological	An annotation that includes information about the sound system of a language, such as the contrasts between sounds which make up the sound system and the locally conditioned realisations of sounds which characterise speech in the language.
Prosodic	An annotation that provides a symbolic record of intonation, stress, tone or other suprasegmental features, which is expressed independently of regular phonetic transcription.
Semantic	The resource includes annotation or analysis concerning the encoding of meaning.
Syntactic	The resource contains annotation or analysis describing the combinatorial patterns of words in another resource.
Transcription	The resource contains a transcription, which is a written representation (orthographic or coded) of an audio or visual signal.
Translation	This is a translation of a resource in another language.

Defined Term Set: AuthorizationWorkflows

ID: ldac:AuthorizationWorkflows

Set of defined terms for ldac:authorizationWorkflow

Term	Description
AccessControlList	License grants access to data based on a list of approved users, specified using the property accessControlList.
AgreeToTerms	A user is expected to explicitly agree to a set of license terms. This may be combined with AccessControlList to indicate that even a user who has been pre-approved for a license must still agree to the license terms.
AuthorizationByApplication	Users may apply for a license via some workflow, such as a form, with the decision being made by a DataSteward or their delegate about whether to grant the license.
AuthorizationByInvitation	A data steward or administrator is expected to use an access control system to invite users, for example, participants, collaborators or students.
SelfAuthorization	A user can be authorised to access data by clicking that they agree to a license, or filling out a form to check their understanding, which can be validated by a machine and does not require human intervention.

Defined Term Set: CollectionEventTypeTerms

ID: ldac:CollectionEventTypeTerms

Set of defined terms for ldac:collectionEventType

Term	Description
Session	A collection event that is a recording or elicitation session with participants.

Defined Term Set: CollectionProtocolTypeTerms

ID: ldac:CollectionProtocolTypeTerms

Set of defined terms for ldac:collectionProtocolType

Term	Description
ElicitationTask	The collection protocol includes a task-based prompt to participants.
MaterialSelectionCriteria	A description of the criteria used to select texts in a collection.

Defined Term Set: CommunicationModeTerms

ID: ldac:CommunicationModeTerms

Set of defined terms for ldac:communicationMode

Term	Description
Coded	The resource contains an analysis or annotations represented by a code (such as the International Phonetic Alphabet).
Gesture	The resource contains non-linguistic gestural communication (i.e. not sign language).
SignedLanguage	The resource contains data for which the medium of interaction was signing.
Song	The resource contains data for which the medium of interaction was song.
SpokenLanguage	The resource contains data for which the medium of interaction was speech.
WhistledLanguage	The resource contains data for which the medium of interaction was whistling.
WrittenLanguage	The resource contains data for which the medium of interaction was writing.

Defined Term Set: IndexTypes

ID: ldac:IndexTypes

Set of defined terms for ldac:openAccessIndex

Term	Description
FullText	A text index that makes the full text of a data resource findable via a search interface.

Defined Term Set: LinguisticGenreTerms

ID: ldac:LinguisticGenreTerms

Set of defined terms for ldac:linguisticGenre

Term	Description
Dialogue	An interactive discourse with two or more participants. Examples of dialogues include conversations, interviews, correspondence, consultations, greetings and leave-takings.
Drama	A planned, creative rendition of discourse with two or more participants intended for presentation to an audience.
Formulaic	The resource is a ritually or conventionally structured discourse.
Informational	Discourse whose primary purpose is to inform the audience about the natural or social world.
Interview	The resource is a conversation where one or more speakers are directing the conversation.
Lexicon	The resource includes a systematic listing of lexical items.
Ludic	Language whose primary function is to be part of play, or a style of speech that involves a creative manipulation of the structures of the language. Examples of ludic discourse are play languages, jokes, secret languages, and speech disguises.
Narrative	A discourse, monologic or co-constructed, which represents temporally organised events. Types of narratives include historical, traditional, and personal narratives, myths, folktales, fables, and humorous stories.
Oratory	The art of public speaking, or of speaking eloquently according to rules or conventions. Examples of oratory include sermons, lectures, political speeches, and invocations.
Procedural	An explanation or description of a method, process, or situation having ordered steps.
Report	A factual account of some event or circumstance.
Thesaurus	The resource contains a list or data structure consisting of words or concepts arranged according to sense.

Defined Term Set: MaterialTypes

ID: ldac:MaterialTypes

Set of defined terms for ldac:materialType

Term	Description
Annotation	The resource includes material that adds information to some other linguistic record.
DerivedMaterial	The resource is derived from another source, such as a PrimaryMaterial, via some process. Examples include a downsampled video, a sample, or an abstract of a resource. Derived material is not an annotation (that is, an analysis or description).
PrimaryMaterial	The object of study, such as a literary work, film, or recording of natural discourse.

Defined Term Set: WrittenLanguageTypeTerms

ID: ldac:WrittenLanguageTypeTerms

Set of defined terms for ldac:writtenLanguageFormat

Term	Description
Handwritten	The resource was written using a writing implement such as a pen, pencil, brush or computer stylus (except where the digital handwriting is converted to standard text).
Typeset	The resource has been formatted for printing or display.
Typewritten	The resource contains text produced on a typewriter.

30 classes · 99 properties

Classes

CollectionEvent — A description of an event at which one or more PrimaryMaterials were captured, e.g. as video or audio #class_CollectionEvent
CreativeWork — The most generic kind of creative work, including books, movies, photographs, software programs, etc #class_CreativeWork
DataDepositLicense — A license document setting out terms for deposit into a repository #class_DataDepositLicense
DataLicense — A license document for data licensing #class_DataLicense
DataReuseLicense — A license document, setting out terms for reuse of data #class_DataReuseLicense
Dataset — A body of structured information describing some topic(s) of interest #class_Dataset
dct:Collection — An aggregation of resources #class_dct:Collection
dct:Dataset — Data encoded in a defined structure #class_dct:Dataset
dct:Event — A non-persistent, time-based occurrence #class_dct:Event
dct:Image — A visual representation other than text #class_dct:Image
dct:InteractiveResource — A resource requiring interaction from the user to be understood, executed, or experienced #class_dct:InteractiveResource
dct:MovingImage — A series of visual representations imparting an impression of motion when shown in succession #class_dct:MovingImage
dct:PhysicalObject — An inanimate, three-dimensional object or substance #class_dct:PhysicalObject
dct:Service — A system that provides one or more functions #class_dct:Service
dct:Software — A computer program in source or compiled form #class_dct:Software
dct:Sound — A resource primarily intended to be heard #class_dct:Sound
dct:StillImage — A static visual representation #class_dct:StillImage
dct:Text — A resource consisting primarily of words for reading #class_dct:Text
File — A media object, such as an image, video, audio, or text object embedded in a web page or a downloadable dataset i #class_File
Geometry — A coherent set of direct positions in space #class_Geometry
Language — Natural languages such as Spanish, Tamil, Hindi, English, etc #class_Language
ldac:CollectionProtocol — A description of how this Object or Collection was obtained, such as the strategy used for selecting written source texts, or the prompts given to participants #class_ldac:CollectionProtocol
Organization — An organization such as a school, NGO, corporation, club, etc #class_Organization
Person — A person (alive, dead, undead, or fictional) #class_Person
Place — Entities that have a somewhat fixed, physical extension #class_Place
README Entity — A Data Package MUST contain a README file that describes the contents of the package #README_Entity
RepositoryCollection — A Collection is a group of resources #class_RepositoryCollection
RepositoryObject — An Object is an intellectual entity, sometimes called a "work", "digital object", etc #class_RepositoryObject
RO-Crate Metadata Descriptor — An RO-Crate @graph must contain an entity of Type @CreativeWork which is known as the RO-Crate Metadata descriptor #RO-Crate_Metadata_Descriptor
Root Data Entity — The Root Data Entity for an RO-Crate #Root_Data_Entity

Properties

@id — README file #README.id
@id — The RO-Crate Metadata file identifier #RO-Crate_Metadata_Descriptor.id
about — This property on the RO-Crate Metadata Descriptor references the Root Data Entity #RO-Crate_Metadata_Descriptor.about
accountablePerson — The person or organisation who is the data steward for this resource #prop_accountablePerson_Dataset
address — The physical address of the place #prop_address_Place
affiliation — The organisation that this person is affiliated with #prop_affiliation_Person
author — The person or organisation responsible for creating this work #prop_author_CreativeWork
author — The person or organisation responsible for creating this collection of data #prop_author_Dataset
citation — Associated publications #prop_citation_Dataset
conformsTo — A link to the language data commons RO-Crate profile for collections #prop_conformsTo_RepositoryCollection
conformsTo — A link to the language data commons RO-Crate profile for collections #prop_conformsTo_RepositoryObject
contentLocation — The location depicted or described in the content #prop_contentLocation_RepositoryCollection
contentSize — File size in (mega/kilo)bytes #prop_contentSize_File
creator — The creator/author of this CreativeWork #prop_creator_RepositoryObject
creditText — A free text bibliographic citation for this material, e.g. 'Cite as: Musgrave (2023). Title of work. DOI' #prop_creditText_Dataset
dateCreated — The (earliest) date the data in this dataset were created #prop_dateCreated_RepositoryCollection
dateCreated — The date on which the CreativeWork was created or the item was added to a DataFeed #prop_dateCreated_RepositoryObject
datePublished — A date that this collection was published #prop_datePublished_Dataset
dct:rightsHolder — The person or organisation owning or managing rights over the resource #prop_dct:rightsHolder_Dataset
description — An abstract of the collection #prop_description_Dataset
description — A description of the item #prop_description_RepositoryObject
encodingFormat — The media type typically expressed using a MIME format #prop_encodingFormat_File
funder — The organisation(s) responsible for funding the creation or collection of this dataset #prop_funder_Dataset
geo — The geographic coordinates of the place #prop_geo_Place
geosparql:asWKT — The WKT serialisation of the geometry #prop_geosparql:asWKT_Geometry
hasPart — An item or CreativeWork that is part of this item, or CreativeWork (in some sense) #prop_hasPart_Dataset
hasPart — An item or CreativeWork that is part of this item, or CreativeWork (in some sense) #prop_hasPart_File
holdingArchive — Organisation where the original of this work or collection is housed #prop_holdingArchive_RepositoryCollection
identifier — The identifier property represents any kind of identifier for any kind of Thing, such as ISBNs, GTIN codes, UUIDs etc #prop_identifier_RepositoryObject
inLanguage — The language in which the resource is written #prop_inLanguage_RepositoryCollection
isAccessibleForFree — This is available under an Open Access license #prop_isAccessibleForFree_Dataset
isBasedOn — Link to or description of an original resource #prop_isBasedOn_Dataset
isbn — The ISBN for this work, if applicable #prop_isbn_CreativeWork
isPartOf — An item or CreativeWork that this item, or CreativeWork (in some sense), is part of #prop_isPartOf_Dataset
issn — The ISSN for this publication #prop_issn_CreativeWork
ldac:access — Whether this is an open or restricted access license #prop_ldac:access_DataReuseLicense
ldac:accessControlList — When a license has an authorizationWorkflow property with a value of the DefinedTerm AccessControlList this property has a URI value that points to a list of userIDs #prop_ldac:accessControlList_DataReuseLicense
ldac:age — The age of a person #prop_ldac:age_Person
ldac:annotationOf — This resource contains some kind of description that adds information to the resource it references #prop_ldac:annotationOf_Dataset
ldac:annotationType — The type of an Annotation resource #prop_ldac:annotationType_CreativeWork
ldac:annotator — The participant produced an annotation of this or a related resource #prop_ldac:annotator_Dataset
ldac:authorizationWorkflow — By what process a user is granted authorization to a license #prop_ldac:authorizationWorkflow_DataReuseLicense
ldac:channels — The number of audio channels this resource contains (e.g. 1, 2, 5.1) #prop_ldac:channels_CreativeWork
ldac:collectionEventType — A kind of CollectionEvent characterised by some specific procedures, e.g. a psycholinguistic experiment #prop_ldac:collectionEventType_CollectionEvent
ldac:collectionProtocolType — A description of the process used to collect or collate data, such as prompts given to participants, or how texts are selected for inclusion in a collection #prop_ldac:collectionProtocolType_ldac:CollectionProtocol
ldac:communicationMode — The mode (spoken, written, signed etc.) of this resource #prop_ldac:communicationMode_CreativeWork
ldac:compiler — The participant is responsible for collecting the sub-parts of the resource together #prop_ldac:compiler_Dataset
ldac:consultant — The participant contributes expertise to the creation of a work, for example by contributing knowledge of their native language #prop_ldac:consultant_Dataset
ldac:dataInputter — The participant responsible for entering, re-typing, and/or structuring the data contained in the resource #prop_ldac:dataInputter_Dataset
ldac:dateFreeText — Date information which cannot be put in one of the standard date formats, e #prop_ldac:dateFreeText_RepositoryCollection
ldac:depositor — The participant responsible for depositing the resource in an archive #prop_ldac:depositor_Dataset
ldac:derivationOf — This property references another resource from which the current resource is derived, e #prop_ldac:derivationOf_File
ldac:developer — The participant developed the methodology or tools (including software) that constitute the resource, or that were used to create the resource #prop_ldac:developer_Dataset
ldac:doi — A Digital Object Identifier, e.g. https://doi.org/10.1000/182 #prop_ldac:doi_Dataset
ldac:editor — The participant reviewed, corrected, and/or tested the resource #prop_ldac:editor_Dataset
ldac:hasAnnotation — This resource is referenced by another resource that adds information to it such as a translation, transcription or other analysis #prop_ldac:hasAnnotation_RepositoryObject
ldac:hasCollectionProtocol — A link to a CollectionProtocol object with (at least) a summary of how resources were selected or elicited for this collection/sub-collection #prop_ldac:hasCollectionProtocol_Dataset
ldac:hasDerivation — This property references another resource that is derived from it, such as a downsampled audio or video file, or text extracted from a PDF #prop_ldac:hasDerivation_File
ldac:illustrator — The participant contributed drawings or other illustrations to the resource #prop_ldac:illustrator_Dataset
ldac:indexableText — One or more target File(s) that together contain the full text of an item – each file should indicate its language #prop_ldac:indexableText_CreativeWork
ldac:interpreter — The contributor renders the discourse recorded in the resource into another language in real time, or the contributor explains the discourse recorded in the resource #prop_ldac:interpreter_Dataset
ldac:interviewee — The participant was a respondent in an interview #prop_ldac:interviewee_Dataset
ldac:interviewer — The participant conducted an interview that forms part of the resource #prop_ldac:interviewer_Dataset
ldac:isDeIdentified — The data in this item has had potentially identifying information removed, which may include replacing names with pseudonyms #prop_ldac:isDeIdentified_CreativeWork
ldac:itemLocation — Current location of the item, e #prop_ldac:itemLocation_RepositoryCollection
ldac:linguisticGenre — A linguistic classification of the genre of this resource #prop_ldac:linguisticGenre_CreativeWork
ldac:material — Description of the original media, e.g. audio cassette tapes, participant questionnaires, field notes #prop_ldac:material_CreativeWork
ldac:materialType — Indicates whether the material in a file is the original (primary) source or is derived from it or describes it via annotation #prop_ldac:materialType_File
ldac:openAccessIndex — One or more public index types allowed by a license, e.g. FullText indexing may be allowed for discovery even when access to the item itself is not open #prop_ldac:openAccessIndex_CreativeWork
ldac:participant — The participant was present during the creation of the resource, but did not contribute substantially to its content #prop_ldac:participant_Dataset
ldac:performer — The participant performed some portion of a recorded, filmed, or transcribed resource #prop_ldac:performer_Dataset
ldac:photographer — The participant took the photograph, or shot the film, that appears in or constitutes the resource #prop_ldac:photographer_Dataset
ldac:recorder — The participant operated the recording machinery used to create the resource #prop_ldac:recorder_Dataset
ldac:register — The type of register (any of the varieties of a language that a speaker uses in a particular social context [Merriam-Webster]) of the contents of a language resource #prop_ldac:register_CreativeWork
ldac:researcher — The resource was created as part of the participant's research, or the research presents interim or final results from the participant's research #prop_ldac:researcher_Dataset
ldac:researchParticipant — The participant acted as a research subject or responded to a questionnaire, the results of which study form the basis of the resource #prop_ldac:researchParticipant_Dataset
ldac:responder — The participant was an interlocutor in some sort of discourse event, but only reacted to the contributions of others #prop_ldac:responder_Dataset
ldac:reviewDate — The date that this license should be reviewed #prop_ldac:reviewDate_DataLicense
ldac:signer — The contributor was a principal signer in a resource that consists of a recording, a film, or a transcription of a recorded resource #prop_ldac:signer_Dataset
ldac:singer — The participant sang, either individually or as part of a group, in a resource that consists of a recording, a film, or a transcription of a recorded resource #prop_ldac:singer_Dataset
ldac:speaker — The contributor was a principal speaker in a resource that consists of a recording, a film, or a transcription of a recorded resource #prop_ldac:speaker_Dataset
ldac:sponsor — The participant contributed financial support to the creation of the resource #prop_ldac:sponsor_Dataset
ldac:subjectLanguage — The languages that the materials in the collection are about (not the language that it is in) #prop_ldac:subjectLanguage_RepositoryCollection
ldac:transcriber — The participant produced a transcription of this or a related resource #prop_ldac:transcriber_Dataset
ldac:translator — The participant produced a translation of this or a related resource #prop_ldac:translator_Dataset
ldac:writtenLanguageFormat — The format of the resource resulting from the way the text was produced (handwritten, typeset, typewritten) #prop_ldac:writtenLanguageFormat_CreativeWork
license — A license document that applies to this content, typically indicated by URL #prop_license_Dataset
license — A license document that applies to this content, typically indicated by URL #prop_license_RepositoryObject
location — A location for the organisation, e #prop_location_Organization
name — The name of this data collection #prop_name_Dataset
pcdm:hasMember — The sub-collections, if any, associated with this collection #prop_pcdm:hasMember_Dataset
pcdm:memberOf — Links from a Repository Object or Collection to a containing Repository Object or Collection #prop_pcdm:memberOf_Dataset
publisher — The organisation that published this work #prop_publisher_CreativeWork
publisher — The organisation responsible for releasing this dataset #prop_publisher_Dataset
recipient — The person or organisation who received this work, for example the addressee of a letter #prop_recipient_CreativeWork
spatialCoverage — The place(s) that are the focus of the content #prop_spatialCoverage_Dataset
temporalCoverage — The range of years of creation for items in this dataset using a slash, e.g. 1900/1945 #prop_temporalCoverage_Dataset
temporalCoverage — The temporalCoverage of a CreativeWork indicates the period that the content applies to, i #prop_temporalCoverage_RepositoryObject
usageInfo — Additional information on licensing options for using the data, e.g. 'Contact the Data Steward to discuss license terms' #prop_usageInfo_Dataset

Types of entities (specializations of Classes) and expected Properties

Class: CollectionEvent #class_CollectionEvent

A description of an event at which one or more PrimaryMaterials were captured, e.g. as video or audio.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			https://w3id.org/ldac/terms#CollectionEvent
ldac:collectionEventType ⓘ #prop_ldac:collectionEventType_CollectionEvent	No	A kind of CollectionEvent characterised by some specific procedures, e.g. a psycholinguistic experiment.	CollectionEventTypeTerms
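
As a rough illustration of the table above, a CollectionEvent entity could be assembled as follows. This is a sketch under stated assumptions rather than a prescribed form: it assumes the crate's `@context` maps the `ldac:` prefix to https://w3id.org/ldac/terms#, and that the Session term is referenced by `@id`. The identifier and the function name `make_collection_event` are hypothetical.

```python
import json


def make_collection_event(entity_id: str) -> dict:
    """Build a CollectionEvent entity typed as a recording or elicitation Session."""
    return {
        "@id": entity_id,
        # Assumption: ldac: expands to https://w3id.org/ldac/terms# via the crate context.
        "@type": "ldac:CollectionEvent",
        # Session is the only term currently listed in CollectionEventTypeTerms.
        "ldac:collectionEventType": {"@id": "ldac:Session"},
    }


# Hypothetical local identifier, for illustration only.
event = make_collection_event("#session-01")
print(json.dumps(event, indent=2))
```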

Class: CreativeWork #class_CreativeWork

The most generic kind of creative work, including books, movies, photographs, software programs, etc.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://schema.org/CreativeWork
author ⓘ #prop_author_CreativeWork	No	The person or organisation responsible for creating this work. Authors should be identified using URIs such as ORCiD or ROR.	http://schema.org/Text, Person, Organization
isbn ⓘ #prop_isbn_CreativeWork	No	The ISBN for this work, if applicable.	http://schema.org/Text
issn ⓘ #prop_issn_CreativeWork	No	The ISSN for this publication.	http://schema.org/Text
ldac:annotationType ⓘ #prop_ldac:annotationType_CreativeWork	No	The type of an Annotation resource.	AnnotationTypeTerms
ldac:channels ⓘ #prop_ldac:channels_CreativeWork	No	The number of audio channels this resource contains (e.g. 1, 2, 5.1).	http://schema.org/Text
ldac:communicationMode ⓘ #prop_ldac:communicationMode_CreativeWork	No	The mode (spoken, written, signed etc.) of this resource. There may be more than one value for this property.	CommunicationModeTerms
ldac:indexableText ⓘ #prop_ldac:indexableText_CreativeWork	No	One or more target File(s) that together contain the full text of an item – each file should indicate its language.	http://schema.org/MediaObject
ldac:isDeIdentified ⓘ #prop_ldac:isDeIdentified_CreativeWork	No	The data in this item has had potentially identifying information removed, which may include replacing names with pseudonyms.	http://schema.org/Boolean
ldac:linguisticGenre ⓘ #prop_ldac:linguisticGenre_CreativeWork	No	A linguistic classification of the genre of this resource.	LinguisticGenreTerms
ldac:material ⓘ #prop_ldac:material_CreativeWork	No	Description of the original media, e.g. audio cassette tapes, participant questionnaires, field notes.	http://schema.org/Text
ldac:openAccessIndex ⓘ #prop_ldac:openAccessIndex_CreativeWork	No	One or more public index types allowed by a license, e.g. FullText indexing may be allowed for discovery even when access to the item itself is not open.	IndexTypes
ldac:register ⓘ #prop_ldac:register_CreativeWork	No	The type of register (any of the varieties of a language that a speaker uses in a particular social context [Merriam-Webster]) of the contents of a language resource.	http://schema.org/Text
ldac:writtenLanguageFormat ⓘ #prop_ldac:writtenLanguageFormat_CreativeWork	No	The format of the resource resulting from the way the text was produced (handwritten, typeset, typewritten).	WrittenLanguageTypeTerms
publisher ⓘ #prop_publisher_CreativeWork	No	The organisation that published this work.	http://schema.org/Text, Organization
recipient ⓘ #prop_recipient_CreativeWork	No	The person or organisation who received this work, for example the addressee of a letter. Recipients should be identified using URIs such as ORCiD or ROR.	http://schema.org/Text, Person, Organization
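
Because properties such as ldac:communicationMode may carry more than one value, code that reads a crate generally needs to accept both a single value and an array. The sketch below shows one way to normalise this. The helper names `as_list` and `communication_modes`, the item identifier, and the use of `ldac:`-prefixed `@id` references for terms are assumptions made for the example.

```python
def as_list(value) -> list:
    """Return a JSON-LD property value as a list, whether absent, single, or an array."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def communication_modes(entity: dict) -> list[str]:
    """Collect the @id of every ldac:communicationMode value on an entity."""
    return [
        v["@id"]
        for v in as_list(entity.get("ldac:communicationMode"))
        if isinstance(v, dict) and "@id" in v
    ]


# Hypothetical item: a sung performance with a spoken introduction.
item = {
    "@id": "#item-42",
    "@type": "CreativeWork",
    "ldac:communicationMode": [
        {"@id": "ldac:Song"},
        {"@id": "ldac:SpokenLanguage"},
    ],
    "ldac:linguisticGenre": {"@id": "ldac:Narrative"},
    "ldac:isDeIdentified": False,
}

print(communication_modes(item))  # ['ldac:Song', 'ldac:SpokenLanguage']
```

The same `as_list` pattern applies to any other property in this profile that may be repeated.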

Class: DataDepositLicense #class_DataDepositLicense

A license document setting out terms for deposit into a repository.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			https://w3id.org/ldac/terms#DataDepositLicense
No properties defined for this class

Class: DataLicense #class_DataLicense

A license document for data licensing. This is a superclass of DataReuseLicense and DataDepositLicense.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			https://w3id.org/ldac/terms#DataLicense
ldac:reviewDate ⓘ #prop_ldac:reviewDate_DataLicense	No	The date that this license should be reviewed.	http://schema.org/Text

Class: DataReuseLicense #class_DataReuseLicense

A license document, setting out terms for reuse of data.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			https://w3id.org/ldac/terms#DataReuseLicense
ldac:access ⓘ #prop_ldac:access_DataReuseLicense	No	Whether this is an open or restricted access license.	AccessTypes
ldac:accessControlList ⓘ #prop_ldac:accessControlList_DataReuseLicense	No	When a license has an authorizationWorkflow property with a value of the DefinedTerm AccessControlList this property has a URI value that points to a list of userIDs.	http://schema.org/URL
ldac:authorizationWorkflow ⓘ #prop_ldac:authorizationWorkflow_DataReuseLicense	No	By what process a user is granted authorization to a license.	AuthorizationWorkflows
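
The dependency between ldac:authorizationWorkflow and ldac:accessControlList described above can be checked mechanically. The following is a minimal sketch of such a check, not a complete license validator. It assumes terms are referenced by `@id` with the `ldac:` prefix; the license identifier, the example URL, and the function name `check_reuse_license` are hypothetical.

```python
def as_list(value) -> list:
    """Return a JSON-LD property value as a list, whether absent, single, or an array."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def check_reuse_license(license_entity: dict) -> list[str]:
    """Report problems with the access control properties of a DataReuseLicense."""
    problems = []
    workflows = {
        w.get("@id")
        for w in as_list(license_entity.get("ldac:authorizationWorkflow"))
        if isinstance(w, dict)
    }
    # An AccessControlList workflow needs a URI pointing to the list of user IDs.
    if "ldac:AccessControlList" in workflows and not license_entity.get("ldac:accessControlList"):
        problems.append("AccessControlList workflow declared but ldac:accessControlList is missing")
    return problems


# Hypothetical restricted license combining a user list with explicit agreement to terms.
license_entity = {
    "@id": "#restricted-license",
    "@type": "ldac:DataReuseLicense",
    "ldac:access": {"@id": "ldac:AuthorizedAccess"},
    "ldac:authorizationWorkflow": [
        {"@id": "ldac:AccessControlList"},
        {"@id": "ldac:AgreeToTerms"},
    ],
}

print(check_reuse_license(license_entity))  # reports one problem

# Adding the (made-up) list URL resolves it.
license_entity["ldac:accessControlList"] = "https://example.org/acl/project-members"
print(check_reuse_license(license_entity))  # []
```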

Class: Dataset #class_Dataset

A body of structured information describing some topic(s) of interest.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://schema.org/Dataset
accountablePerson ⓘ #prop_accountablePerson_Dataset	Yes	The person or organisation who is the data steward for this resource.	Person, Organization
author ⓘ #prop_author_Dataset	Yes	The person or organisation responsible for creating this collection of data. Authors should be identified using URIs such as ORCiD or ROR.	Person, Organization
dct:rightsHolder ⓘ #prop_dct:rightsHolder_Dataset	Yes	The person or organisation owning or managing rights over the resource.	http://schema.org/Text, Person, Organization
publisher ⓘ #prop_publisher_Dataset	Yes	The organisation responsible for releasing this dataset.	Organization
citation ⓘ #prop_citation_Dataset	No	Associated publications.	CreativeWork
creditText ⓘ #prop_creditText_Dataset	No	A free text bibliographic citation for this material, e.g. 'Cite as: Musgrave (2023). Title of work. DOI'.	http://schema.org/Text
funder ⓘ #prop_funder_Dataset	No	The organisation(s) responsible for funding the creation or collection of this dataset.	Organization
hasPart ⓘ #prop_hasPart_Dataset	No	An item or CreativeWork that is part of this item, or CreativeWork (in some sense).	CreativeWork, File, Dataset
isAccessibleForFree ⓘ #prop_isAccessibleForFree_Dataset	No	This is available under an Open Access license.	http://schema.org/Boolean
isBasedOn ⓘ #prop_isBasedOn_Dataset	No	Link to or description of an original resource.	http://schema.org/Text, http://schema.org/URL, CreativeWork, Dataset, File
isPartOf ⓘ #prop_isPartOf_Dataset	No	An item or CreativeWork that this item, or CreativeWork (in some sense), is part of.	http://schema.org/URL, CreativeWork
ldac:annotationOf ⓘ #prop_ldac:annotationOf_Dataset	No	This resource contains some kind of description that adds information to the resource it references.	PrimaryMaterial
ldac:annotator ⓘ #prop_ldac:annotator_Dataset	No	The participant produced an annotation of this or a related resource.	Person, Organization
ldac:compiler ⓘ #prop_ldac:compiler_Dataset	No	The participant is responsible for collecting the sub-parts of the resource together.	Person, Organization
ldac:consultant ⓘ #prop_ldac:consultant_Dataset	No	The participant contributes expertise to the creation of a work, for example by contributing knowledge of their native language.	Person, Organization
ldac:dataInputter ⓘ #prop_ldac:dataInputter_Dataset	No	The participant responsible for entering, re-typing, and/or structuring the data contained in the resource.	Person, Organization
ldac:depositor ⓘ #prop_ldac:depositor_Dataset	No	The participant responsible for depositing the resource in an archive.	Person, Organization
ldac:developer ⓘ #prop_ldac:developer_Dataset	No	The participant developed the methodology or tools (including software) that constitute the resource, or that were used to create the resource.	Person, Organization
ldac:doi ⓘ #prop_ldac:doi_Dataset	No	A Digital Object Identifier, e.g. https://doi.org/10.1000/182.	http://schema.org/Text
ldac:editor ⓘ #prop_ldac:editor_Dataset	No	The participant reviewed, corrected, and/or tested the resource.	Person, Organization
ldac:hasCollectionProtocol ⓘ #prop_ldac:hasCollectionProtocol_Dataset	No	A link to a CollectionProtocol object with (at least) a summary of how resources were selected or elicited for this collection/sub-collection.	ldac:CollectionProtocol
ldac:illustrator ⓘ #prop_ldac:illustrator_Dataset	No	The participant contributed drawings or other illustrations to the resource.	Person, Organization
ldac:interpreter ⓘ #prop_ldac:interpreter_Dataset	No	The contributor renders the discourse recorded in the resource into another language in real time, or the contributor explains the discourse recorded in the resource.	Person, Organization
ldac:interviewee ⓘ #prop_ldac:interviewee_Dataset	No	The participant was a respondent in an interview.	Person, Organization
ldac:interviewer ⓘ #prop_ldac:interviewer_Dataset	No	The participant conducted an interview that forms part of the resource.	Person, Organization
ldac:participant ⓘ #prop_ldac:participant_Dataset	No	The participant was present during the creation of the resource, but did not contribute substantially to its content.	Person, Organization
ldac:performer ⓘ #prop_ldac:performer_Dataset	No	The participant performed some portion of a recorded, filmed, or transcribed resource. It is recommended that this term be used only for creative participants whose role is not better indicated by a more specific term, such as 'speaker', 'signer', or 'singer'.	Person, Organization
ldac:photographer ⓘ #prop_ldac:photographer_Dataset	No	The participant took the photograph, or shot the film, that appears in or constitutes the resource.	Person, Organization
ldac:recorder ⓘ #prop_ldac:recorder_Dataset	No	The participant operated the recording machinery used to create the resource.	Person, Organization
ldac:researcher ⓘ #prop_ldac:researcher_Dataset	No	The resource was created as part of the participant's research, or the research presents interim or final results from the participant's research.	Person, Organization
ldac:researchParticipant ⓘ #prop_ldac:researchParticipant_Dataset	No	The participant acted as a research subject or responded to a questionnaire, the results of which study form the basis of the resource.	Person, Organization
ldac:responder ⓘ #prop_ldac:responder_Dataset	No	The participant was an interlocutor in some sort of discourse event, but only reacted to the contributions of others.	Person, Organization
ldac:signer ⓘ #prop_ldac:signer_Dataset	No	The contributor was a principal signer in a resource that consists of a recording, a film, or a transcription of a recorded resource. Signers are those whose gestures predominate in a recorded or filmed resource. (The resource may be a transcription of that recording).	Person, Organization
ldac:singer ⓘ #prop_ldac:singer_Dataset	No	The participant sang, either individually or as part of a group, in a resource that consists of a recording, a film, or a transcription of a recorded resource.	Person, Organization
ldac:speaker ⓘ #prop_ldac:speaker_Dataset	No	The contributor was a principal speaker in a resource that consists of a recording, a film, or a transcription of a recorded resource. Speakers are those whose voices predominate in a recorded or filmed resource. (The resource may be a transcription of that recording).	Person, Organization
ldac:sponsor ⓘ #prop_ldac:sponsor_Dataset	No	The participant contributed financial support to the creation of the resource.	Person, Organization
ldac:transcriber ⓘ #prop_ldac:transcriber_Dataset	No	The participant produced a transcription of this or a related resource.	Person, Organization
ldac:translator ⓘ #prop_ldac:translator_Dataset	No	The participant produced a translation of this or a related resource.	Person, Organization
pcdm:hasMember ⓘ #prop_pcdm:hasMember_Dataset	No	The sub-collections, if any, associated with this collection.	RepositoryCollection, RepositoryObject
pcdm:memberOf ⓘ #prop_pcdm:memberOf_Dataset	No	Links from a Repository Object or Collection to a containing Repository Object or Collection.	RepositoryCollection
spatialCoverage ⓘ #prop_spatialCoverage_Dataset	No	The place(s) that are the focus of the content. It is a sub-property of contentLocation intended primarily for more technical and detailed materials. For example, with a dataset, it indicates areas that the dataset describes: a dataset of Cape York languages would have a spatialCoverage whose value is a Place describing the outline of the Cape.	Place
temporalCoverage ⓘ #prop_temporalCoverage_Dataset	No	The range of years of creation for items in this dataset, written with a slash, e.g. 1900/1945. If there are sub-collections with different coverages, put this on the sub-collections, not on the top-level collection.	http://schema.org/DateTime, http://schema.org/Text
usageInfo ⓘ #prop_usageInfo_Dataset	No	Additional information on licensing options for using the data, e.g. 'Contact the Data Steward to discuss license terms'.	http://schema.org/Text
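
The slash-separated year range described in the temporalCoverage row above can be checked mechanically. The following is a minimal sketch, assuming four-digit years only; the helper name `parse_year_range` is hypothetical and not part of the profile.

```python
import re

# Hypothetical helper: accepts only the "YYYY/YYYY" form shown in the table (e.g. 1900/1945).
_YEAR_RANGE = re.compile(r"^(\d{4})/(\d{4})$")


def parse_year_range(value):
    """Return (start_year, end_year) as ints, or raise ValueError."""
    match = _YEAR_RANGE.match(value)
    if match is None:
        raise ValueError(f"not a YYYY/YYYY range: {value!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start > end:
        raise ValueError(f"start year {start} is after end year {end}")
    return start, end
```

A crosswalk script could call this on each collection and sub-collection before writing the value, so that malformed ranges are caught at ingest time rather than at display time.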

Class: dct:Collection #class_dct:Collection

An aggregation of resources.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://purl.org/dc/terms/Collection
No properties defined for this class

Class: dct:Dataset #class_dct:Dataset

Data encoded in a defined structure.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://purl.org/dc/terms/Dataset
No properties defined for this class

Class: dct:Event #class_dct:Event

A non-persistent, time-based occurrence.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://purl.org/dc/terms/Event
No properties defined for this class

Class: dct:Image #class_dct:Image

A visual representation other than text.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://purl.org/dc/terms/Image
No properties defined for this class

Class: dct:InteractiveResource #class_dct:InteractiveResource

A resource requiring interaction from the user to be understood, executed, or experienced.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://purl.org/dc/terms/InteractiveResource
No properties defined for this class

Class: dct:MovingImage #class_dct:MovingImage

A series of visual representations imparting an impression of motion when shown in succession.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://purl.org/dc/terms/MovingImage
No properties defined for this class

Class: dct:PhysicalObject #class_dct:PhysicalObject

An inanimate, three-dimensional object or substance.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://purl.org/dc/terms/PhysicalObject
No properties defined for this class

Class: dct:Service #class_dct:Service

A system that provides one or more functions.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://purl.org/dc/terms/Service
No properties defined for this class

Class: dct:Software #class_dct:Software

A computer program in source or compiled form.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://purl.org/dc/terms/Software
No properties defined for this class

Class: dct:Sound #class_dct:Sound

A resource primarily intended to be heard.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://purl.org/dc/terms/Sound
No properties defined for this class

Class: dct:StillImage #class_dct:StillImage

A static visual representation.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://purl.org/dc/terms/StillImage
No properties defined for this class

Class: dct:Text #class_dct:Text

A resource consisting primarily of words for reading.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://purl.org/dc/terms/Text
No properties defined for this class

Class: File #class_File

A media object, such as an image, video, audio, or text object embedded in a web page or a downloadable dataset i.e. DataDownload.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://schema.org/MediaObject
contentSize ⓘ #prop_contentSize_File	No	File size in (mega/kilo)bytes.	http://schema.org/Text
encodingFormat ⓘ #prop_encodingFormat_File	No	The media type typically expressed using a MIME format.	http://schema.org/Text, http://schema.org/WebPage, http://schema.org/CreativeWork
hasPart ⓘ #prop_hasPart_File	No	An item or CreativeWork that is part of this item, or CreativeWork (in some sense).	CreativeWork, File
ldac:derivationOf ⓘ #prop_ldac:derivationOf_File	No	This property references another resource from which the current resource is derived, e.g. downsampling audio or video files, or extracting text from a PDF.	Annotation, PrimaryMaterial
ldac:hasDerivation ⓘ #prop_ldac:hasDerivation_File	No	This property references another resource that is derived from it, such as a downsampled audio or video file, or text extracted from a PDF.	DerivedMaterial
ldac:materialType ⓘ #prop_ldac:materialType_File	No	Indicates whether the material in a file is the original (primary) source or is derived from it or describes it via annotation.	MaterialTypes
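
As an illustration of how the File properties above fit together, the sketch below describes a primary recording and a downsampled copy of it. All `@id` values, file names and sizes are invented. It also assumes that the `ldac:` prefix is mapped in the crate's `@context` and that the MaterialTypes terms are identified as `ldac:PrimaryMaterial` and `ldac:DerivedMaterial`, following the range names in the table. The profile text does not say that both link directions must be recorded, so the reciprocity check is only a convenience.

```python
# Illustrative entities only: the @id values, file names and sizes are invented.
primary = {
    "@id": "audio/session1.wav",
    "@type": "File",
    "encodingFormat": "audio/wav",
    "contentSize": "50 MB",
    "ldac:materialType": {"@id": "ldac:PrimaryMaterial"},
    "ldac:hasDerivation": [{"@id": "audio/session1.mp3"}],
}
derived = {
    "@id": "audio/session1.mp3",
    "@type": "File",
    "encodingFormat": "audio/mpeg",
    "contentSize": "5 MB",
    "ldac:materialType": {"@id": "ldac:DerivedMaterial"},
    "ldac:derivationOf": [{"@id": "audio/session1.wav"}],
}


def ids(entity, prop):
    """Collect the @id targets of a property, accepting a single object or a list."""
    value = entity.get(prop, [])
    if isinstance(value, dict):
        value = [value]
    return {v["@id"] for v in value}


def unreciprocated_derivations(entities):
    """List (source, target) pairs where hasDerivation has no matching derivationOf."""
    by_id = {e["@id"]: e for e in entities}
    missing = []
    for entity in entities:
        for target in sorted(ids(entity, "ldac:hasDerivation")):
            other = by_id.get(target, {})
            if entity["@id"] not in ids(other, "ldac:derivationOf"):
                missing.append((entity["@id"], target))
    return missing
```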

Class: Geometry #class_Geometry

A coherent set of direct positions in space. The positions are held within a Spatial Reference System (SRS).

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://www.opengis.net/ont/geosparql#Geometry
geosparql:asWKT ⓘ #prop_geosparql:asWKT_Geometry	No	The WKT serialisation of the geometry.	http://schema.org/Text
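
To show how a Place can carry coordinates through a Geometry entity, here is a small sketch. The `@id` values are hypothetical and the coordinates are approximate, for illustration only. It assumes the GeoSPARQL default of longitude-then-latitude order when no reference system is given, and the parser handles only the simple `POINT(x y)` form, not polygons or other WKT types.

```python
import re

# Illustrative entities: @id values are invented, coordinates are approximate.
place = {
    "@id": "#place-cape-york",
    "@type": "Place",
    "name": "Cape York",
    "geo": {"@id": "#geo-cape-york"},
}
geometry = {
    "@id": "#geo-cape-york",
    "@type": "Geometry",
    "geosparql:asWKT": "POINT(142.53 -10.69)",
}

_NUMBER = r"(-?\d+(?:\.\d+)?)"
_WKT_POINT = re.compile(
    r"^\s*POINT\s*\(\s*" + _NUMBER + r"\s+" + _NUMBER + r"\s*\)\s*$",
    re.IGNORECASE,
)


def parse_wkt_point(wkt):
    """Return (longitude, latitude) from a 'POINT(x y)' string, or raise ValueError."""
    match = _WKT_POINT.match(wkt)
    if match is None:
        raise ValueError(f"not a simple WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))
```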

Class: Language #class_Language

Natural languages such as Spanish, Tamil, Hindi, English, etc.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://schema.org/Language
No properties defined for this class

Class: ldac:CollectionProtocol #class_ldac:CollectionProtocol

A description of how this Object or Collection was obtained, such as the strategy used for selecting written source texts, or the prompts given to participants.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			https://w3id.org/ldac/terms#CollectionProtocol
ldac:collectionProtocolType ⓘ #prop_ldac:collectionProtocolType_ldac:CollectionProtocol	No	A description of the process used to collect or collate data, such as prompts given to participants, or how texts are selected for inclusion in a collection.	CollectionProtocolTypeTerms

Class: Organization #class_Organization

An organization such as a school, NGO, corporation, club, etc.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://schema.org/Organization
location ⓘ #prop_location_Organization	No	A location for the organisation, e.g. a city for a publisher.	http://schema.org/Text

Class: Person #class_Person

A person (alive, dead, undead, or fictional).

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://schema.org/Person
affiliation ⓘ #prop_affiliation_Person	No	The organisation that this person is affiliated with. For example, a university or school.	Organization
ldac:age ⓘ #prop_ldac:age_Person	No	The age of a person. If an age is specified, a specializationOf pointing to a 'canonical' ageless version of that Person can also be included.	http://schema.org/Text

Class: Place #class_Place

Entities that have a somewhat fixed, physical extension.

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://schema.org/Place
address ⓘ #prop_address_Place	No	The physical address of the place.	http://schema.org/Text
geo ⓘ #prop_geo_Place	No	The geographic coordinates of the place.	Geometry

Class: RepositoryCollection #class_RepositoryCollection

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://pcdm.org/models#Collection
inLanguage ⓘ #prop_inLanguage_RepositoryCollection	Yes	The language in which the resource is written.	Language
conformsTo ⓘ #prop_conformsTo_RepositoryCollection	No	A link to the language data commons RO-Crate profile for collections.	Values for conformsTo
contentLocation ⓘ #prop_contentLocation_RepositoryCollection	No	The location depicted or described in the content. For example, the location in a photograph or painting.	Place
dateCreated ⓘ #prop_dateCreated_RepositoryCollection	No	The (earliest) date the data in this dataset were created.	http://schema.org/Date
holdingArchive ⓘ #prop_holdingArchive_RepositoryCollection	No	Organisation where the original of this work or collection is housed.	Organization, http://schema.org/Text
ldac:dateFreeText ⓘ #prop_ldac:dateFreeText_RepositoryCollection	No	Date information which cannot be put in one of the standard date formats, e.g. 'mid-1970s', or it is not clear, for example, if it is a creation or publication date.	http://schema.org/Text
ldac:itemLocation ⓘ #prop_ldac:itemLocation_RepositoryCollection	No	Current location of the item, e.g. where a set of audio tapes are stored.	Place, Organization
ldac:subjectLanguage ⓘ #prop_ldac:subjectLanguage_RepositoryCollection	No	The languages that the materials in the collection are about (not the language that it is in).	Language
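
The difference between inLanguage (the language the material is written in) and ldac:subjectLanguage (the language the material is about) is easy to blur in a crosswalk. The sketch below uses a hypothetical collection written in English about Warlpiri; the `@id` values are local placeholders, not identifiers taken from the profile.

```python
# Hypothetical collection: written in English, about Warlpiri.
collection = {
    "@id": "#collection-example",
    "@type": "RepositoryCollection",
    "inLanguage": {"@id": "#language-english"},
    "ldac:subjectLanguage": [{"@id": "#language-warlpiri"}],
}


def language_ids(entity, prop):
    """Return the @id targets of a language property as a list."""
    value = entity.get(prop, [])
    if isinstance(value, dict):
        value = [value]
    return [v["@id"] for v in value]


def missing_in_language(entity):
    """True when the required inLanguage property has no value."""
    return not language_ids(entity, "inLanguage")
```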

Class: RepositoryObject #class_RepositoryObject

Instances of this type MAY be present in the crate.

Min Count	Max Count
N/A	N/A

Property	Required	Description	Range	Value
@type	Yes			http://pcdm.org/models#Object
conformsTo ⓘ #prop_conformsTo_RepositoryObject	No	A link to the language data commons RO-Crate profile for collections.	http://schema.org/Text
creator ⓘ #prop_creator_RepositoryObject	No	The creator/author of this CreativeWork. This is the same as the Author property for CreativeWork.	Person
dateCreated ⓘ #prop_dateCreated_RepositoryObject	No	The date on which the CreativeWork was created or the item was added to a DataFeed.	http://schema.org/Text
description ⓘ #prop_description_RepositoryObject	No	A description of the item.	http://schema.org/Text
identifier ⓘ #prop_identifier_RepositoryObject	No	The identifier property represents any kind of identifier for any kind of [[Thing]], such as ISBNs, GTIN codes, UUIDs etc. Schema.org provides dedicated properties for representing many of these, either as textual strings or as URL (URI) links. See background notes for more details.	http://schema.org/PropertyValue, http://schema.org/Text, http://schema.org/URL
ldac:hasAnnotation ⓘ #prop_ldac:hasAnnotation_RepositoryObject	No	This resource is referenced by another resource that adds information to it such as a translation, transcription or other analysis.	Annotation
license ⓘ #prop_license_RepositoryObject	No	A license document that applies to this content, typically indicated by URL.	DataReuseLicense
temporalCoverage ⓘ #prop_temporalCoverage_RepositoryObject	No	The temporalCoverage of a CreativeWork indicates the period that the content applies to, i.e. that it describes, either as a DateTime or as a textual string indicating a time period in ISO 8601 time interval format. In the case of a Dataset it will typically indicate the relevant time period in a precise notation (e.g. for a 2011 census dataset, the year 2011 would be written "2011/2012"). Other forms of content, e.g. ScholarlyArticle, Book, TVSeries or TVEpisode, may indicate their temporalCoverage in broader terms - textually or via well-known URL. Written works such as books may sometimes have precise temporal coverage too, e.g. a work set in 1939 - 1945 can be indicated in ISO 8601 interval format via "1939/1945". Open-ended date ranges can be written with ".." in place of the end date. For example, "2015-11/.." indicates a range beginning in November 2015 and with no specified final date. This is tentative and might be updated in future when ISO 8601 is officially updated.	http://schema.org/Text

Class: RO-Crate Metadata Descriptor #RO-Crate_Metadata_Descriptor

An RO-Crate @graph MUST contain an entity with an @type of CreativeWork, known as the RO-Crate Metadata Descriptor.

At least 1 instance of this type MUST be present in the crate.

A maximum of 1 instance of this type MAY be present in the crate.

Min Count	Max Count
1	1

Property	Required	Description	Range	Value
@type	Yes			http://schema.org/CreativeWork
@id #RO-Crate_Metadata_Descriptor.id	Yes	The RO-Crate Metadata file identifier	Root Data Entity	ro-crate-metadata.json
about ⓘ #RO-Crate_Metadata_Descriptor.about	Yes	This property on the RO-Crate Metadata Descriptor references the Root Data Entity. In a SoSS+ profile there may be Schemas present for more than one 'flavour' of Root Data Entity with different @type arrays or `@conformsTo` references (or other specializations).	Root Data Entity

Class: README Entity #README_Entity

A Data Package MUST contain a README file that describes the contents of the package. This entity represents the README.html file.

At least 1 instance of this type MUST be present in the crate.

A maximum of 1 instance of this type MAY be present in the crate.

Min Count	Max Count
1	1

Property	Required	Description	Range	Value
@type	Yes			http://schema.org/MediaObject
@id #README.id	Yes	There must be a file with the path `README.html` in the root of the RO-Crate, and it must be described by an entity of type README_Entity with an @id of `README.html`.	README Entity	README.html

Class: Root Data Entity #Root_Data_Entity

The Root Data Entity for an RO-Crate. This is the main entity of the RO-Crate and is the one that is referenced by the RO-Crate Metadata Descriptor. In this profile, it is a Dataset and RepositoryCollection.

At least 1 instance of this type MUST be present in the crate.

A maximum of 1 instance of this type MAY be present in the crate.

Min Count	Max Count
1	1

Property	Required	Description	Range	Value
@type	Yes			http://schema.org/Dataset, http://pcdm.org/models#Collection
datePublished ⓘ #prop_datePublished_Dataset	Yes	A date that this collection was published. This should be the date that the collection was first made available.	http://schema.org/Date
description ⓘ #prop_description_Dataset	Yes	An abstract of the collection. Include as much detail as possible about the motivation and use of the collection.	http://schema.org/Text
license ⓘ #prop_license_Dataset	Yes	A license document that applies to this content, typically indicated by URL.	DataReuseLicense, http://schema.org/URL, http://schema.org/Text
name ⓘ #prop_name_Dataset	Yes	The name of this data collection.	http://schema.org/Text
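
The three classes above with a Min Count and Max Count of 1 (the RO-Crate Metadata Descriptor, the README Entity and the Root Data Entity) can be checked together. The following is a rough sketch rather than a conformance validator: the function name `check_crate`, the example values and the root `@id` of `./` (the usual RO-Crate convention) are assumptions, and the short type names `File`, `Dataset` and `RepositoryCollection` assume the standard RO-Crate JSON-LD context.

```python
# Required Root Data Entity properties, taken from the table above.
REQUIRED_ROOT_PROPS = ("datePublished", "description", "license", "name")


def as_list(value):
    """Wrap a single value in a list so @type can be handled uniformly."""
    return value if isinstance(value, list) else [value]


def check_crate(graph):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    by_id = {entity["@id"]: entity for entity in graph}

    descriptor = by_id.get("ro-crate-metadata.json")
    if descriptor is None:
        return ["missing RO-Crate Metadata Descriptor"]
    if "README.html" not in by_id:
        problems.append("missing README.html entity")

    # The descriptor's 'about' property must reference the Root Data Entity.
    root = by_id.get(descriptor.get("about", {}).get("@id"))
    if root is None:
        problems.append("descriptor 'about' does not reference an entity in the graph")
        return problems

    types = as_list(root.get("@type", []))
    for expected in ("Dataset", "RepositoryCollection"):
        if expected not in types:
            problems.append(f"root is missing @type {expected}")
    for prop in REQUIRED_ROOT_PROPS:
        if prop not in root:
            problems.append(f"root is missing {prop}")
    return problems


# Invented example graph for illustration.
graph = [
    {"@id": "ro-crate-metadata.json", "@type": "CreativeWork", "about": {"@id": "./"}},
    {"@id": "README.html", "@type": "File"},
    {
        "@id": "./",
        "@type": ["Dataset", "RepositoryCollection"],
        "name": "Example collection",
        "description": "A placeholder abstract for an example collection.",
        "datePublished": "2024-01-01",
        "license": {"@id": "#license"},
    },
]
```

Calling `check_crate(graph)` on the example returns an empty list; removing `name` from the root entity yields a single "root is missing name" message.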

All Properties

Property: @id #README.id

Description	Range	Occurs in Domain(s)
There must be a file with the path `README.html` in the root of the RO-Crate, and it must be described by an entity of type README_Entity with an @id of `README.html`.	README Entity	README Entity

Property: @id #RO-Crate_Metadata_Descriptor.id

Description	Range	Occurs in Domain(s)
The RO-Crate Metadata file identifier	Root Data Entity	RO-Crate Metadata Descriptor

Property: about ⓘ #RO-Crate_Metadata_Descriptor.about

Description	Range	Occurs in Domain(s)
This property on the RO-Crate Metadata Descriptor references the Root Data Entity. In a SoSS+ profile there may be Schemas present for more than one 'flavour' of Root Data Entity with different @type arrays or `@conformsTo` references (or other specializations).	Root Data Entity	RO-Crate Metadata Descriptor

Property: accountablePerson ⓘ #prop_accountablePerson_Dataset

Description	Range	Occurs in Domain(s)
The person or organisation who is the data steward for this resource.	Person, Organization	Dataset

Property: address ⓘ #prop_address_Place

Description	Range	Occurs in Domain(s)
The physical address of the place.	http://schema.org/Text	Place

Property: affiliation ⓘ #prop_affiliation_Person

Description	Range	Occurs in Domain(s)
The organisation that this person is affiliated with. For example, a university or school.	Organization	Person

Property: author ⓘ #prop_author_CreativeWork

Description	Range	Occurs in Domain(s)
The person or organisation responsible for creating this work. Authors should be identified using URIs such as ORCiD or ROR.	http://schema.org/Text, Person, Organization	CreativeWork

Property: author ⓘ #prop_author_Dataset

Description	Range	Occurs in Domain(s)
The person or organisation responsible for creating this collection of data. Authors should be identified using URIs such as ORCiD or ROR.	Person, Organization	Dataset

Property: citation ⓘ #prop_citation_Dataset

Description	Range	Occurs in Domain(s)
Associated publications.	CreativeWork	Dataset

Property: conformsTo ⓘ #prop_conformsTo_RepositoryCollection

Description	Range	Occurs in Domain(s)
A link to the language data commons RO-Crate profile for collections.	Values for conformsTo	RepositoryCollection

Property: conformsTo ⓘ #prop_conformsTo_RepositoryObject

Description	Range	Occurs in Domain(s)
A link to the language data commons RO-Crate profile for collections.	http://schema.org/Text	RepositoryObject

Property: contentLocation ⓘ #prop_contentLocation_RepositoryCollection

Description	Range	Occurs in Domain(s)
The location depicted or described in the content. For example, the location in a photograph or painting.	Place	RepositoryCollection

Property: contentSize ⓘ #prop_contentSize_File

Description	Range	Occurs in Domain(s)
File size in (mega/kilo)bytes.	http://schema.org/Text	File

Property: creator ⓘ #prop_creator_RepositoryObject

Description	Range	Occurs in Domain(s)
The creator/author of this CreativeWork. This is the same as the Author property for CreativeWork.	Person	RepositoryObject

Property: creditText ⓘ #prop_creditText_Dataset

Description	Range	Occurs in Domain(s)
A free text bibliographic citation for this material, e.g. 'Cite as: Musgrave (2023). Title of work. DOI'.	http://schema.org/Text	Dataset

Property: dateCreated ⓘ #prop_dateCreated_RepositoryCollection

Description	Range	Occurs in Domain(s)
The (earliest) date the data in this dataset were created.	http://schema.org/Date	RepositoryCollection

Property: dateCreated ⓘ #prop_dateCreated_RepositoryObject

Description	Range	Occurs in Domain(s)
The date on which the CreativeWork was created or the item was added to a DataFeed.	http://schema.org/Text	RepositoryObject

Property: datePublished ⓘ #prop_datePublished_Dataset

Description	Range	Occurs in Domain(s)
A date that this collection was published. This should be the date that the collection was first made available.	http://schema.org/Date	Root Data Entity

Property: dct:rightsHolder ⓘ #prop_dct:rightsHolder_Dataset

Description	Range	Occurs in Domain(s)
The person or organisation owning or managing rights over the resource.	http://schema.org/Text, Person, Organization	Dataset

Property: description ⓘ #prop_description_Dataset

Description	Range	Occurs in Domain(s)
An abstract of the collection. Include as much detail as possible about the motivation and use of the collection.	http://schema.org/Text	Root Data Entity

Property: description ⓘ #prop_description_RepositoryObject

Description	Range	Occurs in Domain(s)
A description of the item.	http://schema.org/Text	RepositoryObject

Property: encodingFormat ⓘ #prop_encodingFormat_File

Description	Range	Occurs in Domain(s)
The media type typically expressed using a MIME format.	http://schema.org/Text, http://schema.org/WebPage, http://schema.org/CreativeWork	File

Property: funder ⓘ #prop_funder_Dataset

Description	Range	Occurs in Domain(s)
The organisation(s) responsible for funding the creation or collection of this dataset.	Organization	Dataset

Property: geo ⓘ #prop_geo_Place

Description	Range	Occurs in Domain(s)
The geographic coordinates of the place.	Geometry	Place

Property: geosparql:asWKT ⓘ #prop_geosparql:asWKT_Geometry

Description	Range	Occurs in Domain(s)
The WKT serialisation of the geometry.	http://schema.org/Text	Geometry

Property: hasPart ⓘ #prop_hasPart_Dataset

Description	Range	Occurs in Domain(s)
An item or CreativeWork that is part of this item, or CreativeWork (in some sense).	CreativeWork, File, Dataset	Dataset

Property: hasPart ⓘ #prop_hasPart_File

Description	Range	Occurs in Domain(s)
An item or CreativeWork that is part of this item, or CreativeWork (in some sense).	CreativeWork, File	File

Property: holdingArchive ⓘ #prop_holdingArchive_RepositoryCollection

Description	Range	Occurs in Domain(s)
Organisation where the original of this work or collection is housed.	Organization, http://schema.org/Text	RepositoryCollection

Property: identifier ⓘ #prop_identifier_RepositoryObject

Description	Range	Occurs in Domain(s)
The identifier property represents any kind of identifier for any kind of [[Thing]], such as ISBNs, GTIN codes, UUIDs etc. Schema.org provides dedicated properties for representing many of these, either as textual strings or as URL (URI) links. See background notes for more details.	http://schema.org/PropertyValue, http://schema.org/Text, http://schema.org/URL	RepositoryObject

Property: inLanguage ⓘ #prop_inLanguage_RepositoryCollection

Description	Range	Occurs in Domain(s)
The language in which the resource is written.	Language	RepositoryCollection

Property: isAccessibleForFree ⓘ #prop_isAccessibleForFree_Dataset

Description	Range	Occurs in Domain(s)
This is available under an Open Access license.	http://schema.org/Boolean	Dataset

Property: isBasedOn ⓘ #prop_isBasedOn_Dataset

Description	Range	Occurs in Domain(s)
Link to or description of an original resource.	http://schema.org/Text, http://schema.org/URL, CreativeWork, Dataset, File	Dataset

Property: isbn ⓘ #prop_isbn_CreativeWork

Description	Range	Occurs in Domain(s)
The ISBN for this work, if applicable.	http://schema.org/Text	CreativeWork

Property: isPartOf ⓘ #prop_isPartOf_Dataset

Description	Range	Occurs in Domain(s)
An item or CreativeWork that this item, or CreativeWork (in some sense), is part of.	http://schema.org/URL, CreativeWork	Dataset

Property: issn ⓘ #prop_issn_CreativeWork

Description	Range	Occurs in Domain(s)
The ISSN for this publication.	http://schema.org/Text	CreativeWork

Property: ldac:access ⓘ #prop_ldac:access_DataReuseLicense

Description	Range	Occurs in Domain(s)
Whether this is an open or restricted access license.	AccessTypes	DataReuseLicense

Property: ldac:accessControlList ⓘ #prop_ldac:accessControlList_DataReuseLicense

Description	Range	Occurs in Domain(s)
When a license has an authorizationWorkflow property with a value of the DefinedTerm AccessControlList this property has a URI value that points to a list of userIDs.	http://schema.org/URL	DataReuseLicense

Property: ldac:age ⓘ #prop_ldac:age_Person

Description	Range	Occurs in Domain(s)
The age of a person. If an age is specified, a specializationOf pointing to a 'canonical' ageless version of that Person can also be included.	http://schema.org/Text	Person

Property: ldac:annotationOf ⓘ #prop_ldac:annotationOf_Dataset

Description	Range	Occurs in Domain(s)
This resource contains some kind of description that adds information to the resource it references.	PrimaryMaterial	Dataset

Property: ldac:annotationType ⓘ #prop_ldac:annotationType_CreativeWork

Description	Range	Occurs in Domain(s)
The type of an Annotation resource.	AnnotationTypeTerms	CreativeWork

Property: ldac:annotator ⓘ #prop_ldac:annotator_Dataset

Description	Range	Occurs in Domain(s)
The participant produced an annotation of this or a related resource.	Person, Organization	Dataset

Property: ldac:authorizationWorkflow ⓘ #prop_ldac:authorizationWorkflow_DataReuseLicense

Description	Range	Occurs in Domain(s)
By what process a user is granted authorization to a license.	AuthorizationWorkflows	DataReuseLicense

Property: ldac:channels ⓘ #prop_ldac:channels_CreativeWork

Description	Range	Occurs in Domain(s)
The number of audio channels this resource contains (e.g. 1, 2, 5.1).	http://schema.org/Text	CreativeWork

Property: ldac:collectionEventType ⓘ #prop_ldac:collectionEventType_CollectionEvent

Description	Range	Occurs in Domain(s)
A kind of CollectionEvent characterised by some specific procedures, e.g. a psycholinguistic experiment.	CollectionEventTypeTerms	CollectionEvent

Property: ldac:collectionProtocolType ⓘ #prop_ldac:collectionProtocolType_ldac:CollectionProtocol

Description	Range	Occurs in Domain(s)
A description of the process used to collect or collate data, such as prompts given to participants, or how texts are selected for inclusion in a collection.	CollectionProtocolTypeTerms	ldac:CollectionProtocol

Property: ldac:communicationMode ⓘ #prop_ldac:communicationMode_CreativeWork

Description	Range	Occurs in Domain(s)
The mode (spoken, written, signed etc.) of this resource. There may be more than one value for this property.	CommunicationModeTerms	CreativeWork

Property: ldac:compiler ⓘ #prop_ldac:compiler_Dataset

Description	Range	Occurs in Domain(s)
The participant is responsible for collecting the sub-parts of the resource together.	Person, Organization	Dataset

Property: ldac:consultant ⓘ #prop_ldac:consultant_Dataset

Description	Range	Occurs in Domain(s)
The participant contributes expertise to the creation of a work, for example by contributing knowledge of their native language.	Person, Organization	Dataset

Property: ldac:dataInputter ⓘ #prop_ldac:dataInputter_Dataset

Description	Range	Occurs in Domain(s)
The participant responsible for entering, re-typing, and/or structuring the data contained in the resource.	Person, Organization	Dataset

Property: ldac:dateFreeText ⓘ #prop_ldac:dateFreeText_RepositoryCollection

Description	Range	Occurs in Domain(s)
Date information which cannot be put in one of the standard date formats, e.g. 'mid-1970s', or it is not clear, for example, if it is a creation or publication date.	http://schema.org/Text	RepositoryCollection

Property: ldac:depositor ⓘ #prop_ldac:depositor_Dataset

Description	Range	Occurs in Domain(s)
The participant responsible for depositing the resource in an archive.	Person, Organization	Dataset

Property: ldac:derivationOf ⓘ #prop_ldac:derivationOf_File

Description	Range	Occurs in Domain(s)
This property references another resource from which the current resource is derived, e.g. downsampling audio or video files, or extracting text from a PDF.	Annotation, PrimaryMaterial	File

Property: ldac:developer ⓘ #prop_ldac:developer_Dataset

Description	Range	Occurs in Domain(s)
The participant developed the methodology or tools (including software) that constitute the resource, or that were used to create the resource.	Person, Organization	Dataset

Property: ldac:doi ⓘ #prop_ldac:doi_Dataset

Description	Range	Occurs in Domain(s)
A Digital Object Identifier, e.g. https://doi.org/10.1000/182.	http://schema.org/Text	Dataset

Property: ldac:editor ⓘ #prop_ldac:editor_Dataset

Description	Range	Occurs in Domain(s)
The participant reviewed, corrected, and/or tested the resource.	Person, Organization	Dataset

Property: ldac:hasAnnotation ⓘ #prop_ldac:hasAnnotation_RepositoryObject

Description	Range	Occurs in Domain(s)
This resource is referenced by another resource that adds information to it such as a translation, transcription or other analysis.	Annotation	RepositoryObject

Property: ldac:hasCollectionProtocol ⓘ #prop_ldac:hasCollectionProtocol_Dataset

Description	Range	Occurs in Domain(s)
A link to a CollectionProtocol object with (at least) a summary of how resources were selected or elicited for this collection/sub-collection.	ldac:CollectionProtocol	Dataset

Property: ldac:hasDerivation ⓘ #prop_ldac:hasDerivation_File

Description	Range	Occurs in Domain(s)
This property references another resource that is derived from it, such as a downsampled audio or video file, or text extracted from a PDF.	DerivedMaterial	File

Property: ldac:illustrator ⓘ #prop_ldac:illustrator_Dataset

Description	Range	Occurs in Domain(s)
The participant contributed drawings or other illustrations to the resource.	Person, Organization	Dataset

Property: ldac:indexableText ⓘ #prop_ldac:indexableText_CreativeWork

Description	Range	Occurs in Domain(s)
One or more target File(s) that together contain the full text of an item – each file should indicate its language.	http://schema.org/MediaObject	CreativeWork

Property: ldac:interpreter ⓘ #prop_ldac:interpreter_Dataset

Description	Range	Occurs in Domain(s)
The contributor renders the discourse recorded in the resource into another language in real time, or the contributor explains the discourse recorded in the resource.	Person, Organization	Dataset

Property: ldac:interviewee ⓘ #prop_ldac:interviewee_Dataset

Description	Range	Occurs in Domain(s)
The participant was a respondent in an interview.	Person, Organization	Dataset

Property: ldac:interviewer ⓘ #prop_ldac:interviewer_Dataset

Description	Range	Occurs in Domain(s)
The participant conducted an interview that forms part of the resource.	Person, Organization	Dataset

Property: ldac:isDeIdentified ⓘ #prop_ldac:isDeIdentified_CreativeWork

Description	Range	Occurs in Domain(s)
The data in this item has had potentially identifying information removed, which may include replacing names with pseudonyms.	http://schema.org/Boolean	CreativeWork

Property: ldac:itemLocation ⓘ #prop_ldac:itemLocation_RepositoryCollection

Description	Range	Occurs in Domain(s)
Current location of the item, e.g. where a set of audio tapes are stored.	Place, Organization	RepositoryCollection

Property: ldac:linguisticGenre ⓘ #prop_ldac:linguisticGenre_CreativeWork

Description	Range	Occurs in Domain(s)
A linguistic classification of the genre of this resource.	LinguisticGenreTerms	CreativeWork

Property: ldac:material ⓘ #prop_ldac:material_CreativeWork

Description	Range	Occurs in Domain(s)
Description of the original media, e.g. audio cassette tapes, participant questionnaires, field notes.	http://schema.org/Text	CreativeWork

Property: ldac:materialType ⓘ #prop_ldac:materialType_File

Description	Range	Occurs in Domain(s)
Indicates whether the material in a file is the original (primary) source or is derived from it or describes it via annotation.	MaterialTypes	File

Property: ldac:openAccessIndex ⓘ #prop_ldac:openAccessIndex_CreativeWork

Description	Range	Occurs in Domain(s)
One or more public index types allowed by a license, e.g. FullText indexing may be allowed for discovery even when access to the item itself is not.	IndexTypes	CreativeWork

Property: ldac:participant ⓘ #prop_ldac:participant_Dataset

Description	Range	Occurs in Domain(s)
The participant was present during the creation of the resource, but did not contribute substantially to its content.	Person, Organization	Dataset

Property: ldac:performer ⓘ #prop_ldac:performer_Dataset

Description	Range	Occurs in Domain(s)
The participant performed some portion of a recorded, filmed, or transcribed resource. It is recommended that this term be used only for creative participants whose role is not better indicated by a more specific term, such as 'speaker', 'signer', or 'singer'.	Person, Organization	Dataset

Property: ldac:photographer ⓘ #prop_ldac:photographer_Dataset

Description	Range	Occurs in Domain(s)
The participant took the photograph, or shot the film, that appears in or constitutes the resource.	Person, Organization	Dataset

Property: ldac:recorder ⓘ #prop_ldac:recorder_Dataset

Description	Range	Occurs in Domain(s)
The participant operated the recording machinery used to create the resource.	Person, Organization	Dataset

Property: ldac:register ⓘ #prop_ldac:register_CreativeWork

Description	Range	Occurs in Domain(s)
The type of register (any of the varieties of a language that a speaker uses in a particular social context [Merriam-Webster]) of the contents of a language resource.	http://schema.org/Text	CreativeWork

Property: ldac:researcher ⓘ #prop_ldac:researcher_Dataset

Description	Range	Occurs in Domain(s)
The resource was created as part of the participant's research, or the resource presents interim or final results from the participant's research.	Person, Organization	Dataset

Property: ldac:researchParticipant ⓘ #prop_ldac:researchParticipant_Dataset

Description	Range	Occurs in Domain(s)
The participant acted as a research subject or responded to a questionnaire, and the results of that study form the basis of the resource.	Person, Organization	Dataset

Property: ldac:responder ⓘ #prop_ldac:responder_Dataset

Description	Range	Occurs in Domain(s)
The participant was an interlocutor in some sort of discourse event, but only reacted to the contributions of others.	Person, Organization	Dataset

Property: ldac:reviewDate ⓘ #prop_ldac:reviewDate_DataLicense

Description	Range	Occurs in Domain(s)
The date that this license should be reviewed.	http://schema.org/Text	DataLicense

Property: ldac:signer ⓘ #prop_ldac:signer_Dataset

Description	Range	Occurs in Domain(s)
The contributor was a principal signer in a resource that consists of a recording, a film, or a transcription of a recorded resource. Signers are those whose gestures predominate in a recorded or filmed resource. (The resource may be a transcription of that recording).	Person, Organization	Dataset

Property: ldac:singer ⓘ #prop_ldac:singer_Dataset

Description	Range	Occurs in Domain(s)
The participant sang, either individually or as part of a group, in a resource that consists of a recording, a film, or a transcription of a recorded resource.	Person, Organization	Dataset

Property: ldac:speaker ⓘ #prop_ldac:speaker_Dataset

Description	Range	Occurs in Domain(s)
The contributor was a principal speaker in a resource that consists of a recording, a film, or a transcription of a recorded resource. Speakers are those whose voices predominate in a recorded or filmed resource. (The resource may be a transcription of that recording).	Person, Organization	Dataset

Property: ldac:sponsor ⓘ #prop_ldac:sponsor_Dataset

Description	Range	Occurs in Domain(s)
The participant contributed financial support to the creation of the resource.	Person, Organization	Dataset

Property: ldac:subjectLanguage ⓘ #prop_ldac:subjectLanguage_RepositoryCollection

Description	Range	Occurs in Domain(s)
The languages that the materials in the collection are about (not the languages that the materials are in).	Language	RepositoryCollection

Property: ldac:transcriber ⓘ #prop_ldac:transcriber_Dataset

Description	Range	Occurs in Domain(s)
The participant produced a transcription of this or a related resource.	Person, Organization	Dataset

Property: ldac:translator ⓘ #prop_ldac:translator_Dataset

Description	Range	Occurs in Domain(s)
The participant produced a translation of this or a related resource.	Person, Organization	Dataset
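
The contributor role properties listed in this section (ldac:interviewer, ldac:speaker, ldac:translator and so on) all point from a Dataset to Person or Organization entities. The following is a minimal sketch, not part of the profile, of how a tool might collect those roles from a crate's @graph. The entity identifiers and names are hypothetical, and the list of roles checked is an illustrative subset.

```python
# Illustrative subset of the role properties described in this section.
ROLE_PROPERTIES = [
    "ldac:interviewer",
    "ldac:interviewee",
    "ldac:speaker",
    "ldac:transcriber",
    "ldac:translator",
]


def contributors_by_role(entity, graph):
    """Map each role property present on `entity` to the names of its contributors."""
    # Fall back to the @id when a referenced entity has no name or is missing.
    names = {e["@id"]: e.get("name", e["@id"]) for e in graph}
    roles = {}
    for prop in ROLE_PROPERTIES:
        value = entity.get(prop)
        if value is None:
            continue
        # JSON-LD allows a single reference or a list of references.
        refs = value if isinstance(value, list) else [value]
        roles[prop] = [names.get(ref["@id"], ref["@id"]) for ref in refs]
    return roles


# Hypothetical crate fragment: one item with an interviewer and two speakers.
graph = [
    {
        "@id": "#interview-01",
        "@type": "Dataset",
        "name": "Interview 1",
        "ldac:interviewer": {"@id": "#person-a"},
        "ldac:speaker": [{"@id": "#person-b"}, {"@id": "#person-c"}],
    },
    {"@id": "#person-a", "@type": "Person", "name": "Person A"},
    {"@id": "#person-b", "@type": "Person", "name": "Person B"},
    {"@id": "#person-c", "@type": "Person", "name": "Person C"},
]

roles = contributors_by_role(graph[0], graph)
```

Normalising single values to lists keeps the rest of the code independent of whether a role has one contributor or several.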

Property: ldac:writtenLanguageFormat ⓘ #prop_ldac:writtenLanguageFormat_CreativeWork

Description	Range	Occurs in Domain(s)
The format of the resource resulting from the way the text was produced (handwritten, typeset, typewritten).	WrittenLanguageTypeTerms	CreativeWork

Property: license ⓘ #prop_license_Dataset

Description	Range	Occurs in Domain(s)
A license document that applies to this content, typically indicated by URL.	DataReuseLicense, http://schema.org/URL, http://schema.org/Text	Root Data Entity

Property: license ⓘ #prop_license_RepositoryObject

Description	Range	Occurs in Domain(s)
A license document that applies to this content, typically indicated by URL.	DataReuseLicense	RepositoryObject
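
For a RepositoryObject the range of license is DataReuseLicense only, whereas the Root Data Entity may also use a URL or text value. The sketch below shows one way a validator might check that an object's license resolves to a described license entity in the same crate. It is an assumption that the license entity's @type is spelled either "DataReuseLicense" or "ldac:DataReuseLicense"; the identifiers are hypothetical.

```python
# Assumption: the type may appear with or without the ldac: prefix.
LICENSE_TYPES = {"DataReuseLicense", "ldac:DataReuseLicense"}


def license_is_described(entity, graph):
    """Return True if entity's license references a license entity in graph."""
    by_id = {e["@id"]: e for e in graph}
    lic = entity.get("license")
    # A bare string (URL or text) is not a reference to a described entity.
    if not isinstance(lic, dict) or "@id" not in lic:
        return False
    target = by_id.get(lic["@id"])
    if target is None:
        return False
    types = target.get("@type", [])
    types = types if isinstance(types, list) else [types]
    return bool(LICENSE_TYPES.intersection(types))


# Hypothetical crate fragment.
license_graph = [
    {"@id": "#item-1", "@type": "RepositoryObject", "license": {"@id": "#licence-a"}},
    {"@id": "#item-2", "@type": "RepositoryObject", "license": "Ask the data steward"},
    {"@id": "#licence-a", "@type": "ldac:DataReuseLicense", "name": "Example licence"},
]

item_1_ok = license_is_described(license_graph[0], license_graph)
item_2_ok = license_is_described(license_graph[1], license_graph)
```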

Property: location ⓘ #prop_location_Organization

Description	Range	Occurs in Domain(s)
A location for the organisation, e.g. a city for a publisher.	http://schema.org/Text	Organization

Property: name ⓘ #prop_name_Dataset

Description	Range	Occurs in Domain(s)
The name of this data collection.	http://schema.org/Text	Root Data Entity

Property: pcdm:hasMember ⓘ #prop_pcdm:hasMember_Dataset

Description	Range	Occurs in Domain(s)
The sub-collections, if any, associated with this collection.	RepositoryCollection, RepositoryObject	Dataset

Property: pcdm:memberOf ⓘ #prop_pcdm:memberOf_Dataset

Description	Range	Occurs in Domain(s)
Links from a Repository Object or Collection to a containing Repository Object or Collection.	RepositoryCollection	Dataset
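
pcdm:hasMember and pcdm:memberOf describe the same containment relationship from opposite ends. This section does not say whether both directions must be recorded, so the sketch below only reports members that lack a matching back-link and leaves the policy decision to the caller. The function name and the example identifiers are hypothetical.

```python
def as_list(value):
    """Normalise a JSON-LD property value to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def missing_member_of(graph):
    """Return (parent_id, child_id) pairs where the parent lists the child via
    pcdm:hasMember but the child has no pcdm:memberOf link back to the parent."""
    by_id = {e["@id"]: e for e in graph}
    missing = []
    for parent in graph:
        for ref in as_list(parent.get("pcdm:hasMember")):
            child = by_id.get(ref["@id"])
            if child is None:
                # The member is described elsewhere (e.g. in another crate).
                continue
            parents = {p["@id"] for p in as_list(child.get("pcdm:memberOf"))}
            if parent["@id"] not in parents:
                missing.append((parent["@id"], child["@id"]))
    return missing


# Hypothetical crate fragment: #item-2 lacks the back-link.
member_graph = [
    {
        "@id": "#corpus",
        "@type": "RepositoryCollection",
        "pcdm:hasMember": [{"@id": "#item-1"}, {"@id": "#item-2"}],
    },
    {"@id": "#item-1", "@type": "RepositoryObject", "pcdm:memberOf": {"@id": "#corpus"}},
    {"@id": "#item-2", "@type": "RepositoryObject"},
]

gaps = missing_member_of(member_graph)
```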

Property: publisher ⓘ #prop_publisher_CreativeWork

Description	Range	Occurs in Domain(s)
The organisation that published this work.	http://schema.org/Text, Organization	CreativeWork

Property: publisher ⓘ #prop_publisher_Dataset

Description	Range	Occurs in Domain(s)
The organisation responsible for releasing this dataset.	Organization	Dataset

Property: recipient ⓘ #prop_recipient_CreativeWork

Description	Range	Occurs in Domain(s)
The person or organisation on the receiving end of this work, for example the addressee of a letter. Recipients should be identified using URIs such as ORCID or ROR.	http://schema.org/Text, Person, Organization	CreativeWork

Property: spatialCoverage ⓘ #prop_spatialCoverage_Dataset

Description	Range	Occurs in Domain(s)
The place(s) that are the focus of the content. It is a sub-property of contentLocation intended primarily for more technical and detailed materials. For example, with a dataset, it indicates the areas that the dataset describes: a dataset of Cape York languages would have a spatialCoverage whose value is a Place describing the outline of the Cape.	Place	Dataset

Property: temporalCoverage ⓘ #prop_temporalCoverage_Dataset

Description	Range	Occurs in Domain(s)
The range of years of creation for items in this dataset, written with a slash, e.g. 1900/1945. If sub-collections have different coverages, put this property on the sub-collections rather than on the top-level collection.	http://schema.org/DateTime, http://schema.org/Text	Dataset

Property: temporalCoverage ⓘ #prop_temporalCoverage_RepositoryObject

Description	Range	Occurs in Domain(s)
The temporalCoverage of a CreativeWork indicates the period that the content applies to, i.e. that it describes, either as a DateTime or as a textual string indicating a time period in ISO 8601 time interval format. In the case of a Dataset it will typically indicate the relevant time period in a precise notation (e.g. for a 2011 census dataset, the year 2011 would be written "2011/2012"). Other forms of content, e.g. ScholarlyArticle, Book, TVSeries or TVEpisode, may indicate their temporalCoverage in broader terms - textually or via well-known URL. Written works such as books may sometimes have precise temporal coverage too, e.g. a work set in 1939 - 1945 can be indicated in ISO 8601 interval format via "1939/1945". Open-ended date ranges can be written with ".." in place of the end date. For example, "2015-11/.." indicates a range beginning in November 2015 and with no specified final date. This is tentative and might be updated in future when ISO 8601 is officially updated.	http://schema.org/Text	RepositoryObject
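
The interval notation described above ("1939/1945", "2015-11/..") can be checked mechanically before a value is written to a crate. The following is a minimal sketch under stated assumptions: it accepts only calendar dates at year, year-month or year-month-day precision, and it does not handle times, durations, or the broader textual and URL forms that the description also allows. The function name is hypothetical.

```python
import re

# One calendar date at year, year-month, or year-month-day precision.
_DATE = r"\d{4}(?:-\d{2}(?:-\d{2})?)?"
# A single date, or "start/end" where end may be ".." for an open-ended range.
_COVERAGE = re.compile(rf"(?P<start>{_DATE})(?:/(?P<end>{_DATE}|\.\.))?")


def parse_temporal_coverage(value):
    """Return (start, end) for a temporalCoverage string.

    end is None for an open-ended range and equals start for a single date.
    Raises ValueError if the value does not match the supported forms.
    """
    match = _COVERAGE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a recognised temporalCoverage value: {value!r}")
    start, end = match.group("start"), match.group("end")
    if end is None:
        return start, start
    if end == "..":
        return start, None
    # Coarse sanity check on the years only.
    if int(end[:4]) < int(start[:4]):
        raise ValueError(f"end precedes start in {value!r}")
    return start, end


closed = parse_temporal_coverage("1939/1945")
open_ended = parse_temporal_coverage("2015-11/..")
```

Note that a human-readable range such as "1939 - 1945" is rejected by this sketch; it would need to be rewritten as "1939/1945" first.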

Property: usageInfo ⓘ #prop_usageInfo_Dataset

Description	Range	Occurs in Domain(s)
Additional information on licensing options for using the data, e.g. 'Contact the Data Steward to discuss license terms'.	http://schema.org/Text	Dataset