
Transforms data from a Movebank dataset (formatted as a Frictionless Data Package) to Darwin Core. The resulting CSV and EML files can be uploaded to an IPT for publication to GBIF and/or OBIS. A meta.xml file is not created.

Usage

write_dwc(
  package,
  directory = ".",
  doi = package$id,
  contact = NULL,
  rights_holder = NULL,
  study_id = NULL
)

Arguments

package

A Frictionless Data Package of Movebank data, as read by frictionless::read_package().

directory

Path to local directory to write file(s) to.

doi

DOI of the original dataset, used to get metadata.

contact

Person to be set as resource contact and metadata provider. To be provided as a person().

rights_holder

Acronym of the organization owning or managing the rights over the data.

study_id

Identifier of the Movebank study from which the dataset was derived (e.g. 1605797471 for this study).

Value

CSV (data) and EML (metadata) files written to disk.

Details

See Get started for examples.
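As a rough illustration of the arguments described above, a call could look like the sketch below. The file path, contact person, rights holder acronym and output directory are hypothetical placeholders, not values from a real dataset. The call is guarded so that it only runs when the packages and the file are available, and the metadata lookup by DOI is assumed to need internet access.

```r
# Hypothetical inputs: replace with your own values.
datapackage_path <- "path/to/datapackage.json"
contact <- person("Jane", "Doe", email = "jane.doe@example.org")

# Only attempt the transformation if the required packages and file exist.
can_run <- requireNamespace("movepub", quietly = TRUE) &&
  requireNamespace("frictionless", quietly = TRUE) &&
  file.exists(datapackage_path)

if (can_run) {
  package <- frictionless::read_package(datapackage_path)
  movepub::write_dwc(
    package,
    directory = "dwc_output",   # hypothetical output directory
    contact = contact,
    rights_holder = "ORG"       # hypothetical organization acronym
  )
}
```

Here doi is left at its default (package$id) and study_id at NULL, so the external link would be derived from the related identifiers of the original dataset, as described under Metadata.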

Metadata

Metadata are derived from the original dataset by looking up its doi in DataCite and transforming the retrieved record to EML. This uses datacite_to_eml() under the hood. The following properties are set:

  • title: Original title + [subsampled representation].

  • description: Automatically created first paragraph stating that this is a derived dataset, followed by the original dataset description.

  • license: License of the original dataset.

  • creators: Creators of the original dataset.

  • contact: contact, or the first creator of the original dataset if contact is not provided.

  • metadata provider: contact, or the first creator of the original dataset if contact is not provided.

  • keywords: Keywords of the original dataset.

  • alternative identifier: DOI of the original dataset. This way, no new DOI will be created when publishing to GBIF.

  • external link and alternative identifier: URL created from study_id or the first "derived from" related identifier in the original dataset.
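A few of the rules in this list can be sketched in plain R. This is not the code used by datacite_to_eml(). The original metadata below are made up rather than retrieved from DataCite, the list structure is hypothetical, and the single space before the title suffix is an assumption.

```r
# Hypothetical metadata of the original dataset (not a real DataCite record).
original <- list(
  title = "Example tracking dataset",
  creators = list(person("Jane", "Doe"), person("John", "Smith")),
  doi = "10.1234/example"
)

# The contact argument of write_dwc(); NULL is its default.
contact <- NULL

# Fall back to the first creator when no contact is provided.
resource_contact <- if (is.null(contact)) original$creators[[1]] else contact

derived <- list(
  title = paste(original$title, "[subsampled representation]"),
  creators = original$creators,
  contact = resource_contact,
  metadata_provider = resource_contact,
  alternative_identifier = original$doi
)
```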

To be set manually in the GBIF IPT: type, subtype, update frequency, and publishing organization.

Not set: geographic, taxonomic and temporal coverage, associated parties, project data, sampling methods, and citations. Not applicable: collection data.

Data

package is expected to contain a reference-data and gps resource. Data are transformed into an Occurrence core. This follows recommendations discussed and created by Peter Desmet, Sarah Davidson, John Wieczorek and others. See the SQL file(s) used by this function for details.

Key features of the Darwin Core transformation:

  • Deployments (animal+tag associations) are parent events, with tag attachment (a human observation) and GPS positions (machine observations) as child events. No information about the parent event is provided other than its ID, meaning that data can be expressed in an Occurrence Core with one row per observation and parentEventID shared by all occurrences in a deployment.

  • The tag attachment event often contains metadata about the animal (sex, life stage, comments) and the deployment as a whole.

  • No event/occurrence is created for the deployment end, since the end date is often undefined, unreliable and/or does not represent an animal occurrence.

  • Only visible (non-outlier) GPS records that fall within a deployment are included.

  • GPS positions are downsampled to the first GPS position per hour, to reduce the size of high-frequency data. It is possible for a deployment to contain no GPS positions, e.g. if the tag malfunctioned right after deployment.
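The filtering and downsampling rules in the last two points can be illustrated with a minimal base R sketch on toy data. This is not the SQL used by the function. The column names are illustrative rather than the Movebank schema, and applying the filter before the hourly downsampling is an assumption.

```r
# Toy GPS records (illustrative column names).
gps <- data.frame(
  deployment_id = c("d1", "d1", "d1", "d1", "d2"),
  timestamp = as.POSIXct(
    c("2021-05-01 10:05:00", "2021-05-01 10:40:00",
      "2021-05-01 11:02:00", "2021-05-01 11:30:00",
      "2021-05-01 10:15:00"),
    tz = "UTC"
  ),
  visible = c(TRUE, TRUE, FALSE, TRUE, TRUE)
)

# Toy deployment windows; an NA end means the deployment is open-ended.
deployments <- data.frame(
  deployment_id = c("d1", "d2"),
  start = as.POSIXct(c("2021-05-01 10:00:00", "2021-05-01 11:00:00"), tz = "UTC"),
  end = as.POSIXct(c("2021-05-02 00:00:00", NA), tz = "UTC")
)

# Keep only visible records that fall within their deployment window.
x <- merge(gps, deployments, by = "deployment_id")
keep <- x$visible &
  x$timestamp >= x$start &
  (is.na(x$end) | x$timestamp <= x$end)
x <- x[keep, ]

# Keep the first remaining record per deployment and clock hour.
x <- x[order(x$deployment_id, x$timestamp), ]
x$hour <- format(x$timestamp, "%Y-%m-%d %H", tz = "UTC")
first_in_hour <- !duplicated(x[c("deployment_id", "hour")])
downsampled <- x[first_in_hour, c("deployment_id", "timestamp")]
```

In this toy data, deployment d2 ends up with no GPS positions because its only record precedes the deployment start, which mirrors the situation described above.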