This repository provides an end-to-end solution for managing specimen information. It covers three processes:
- Data Collection: Collect information about specimens using an Excel file.
- Data Transformation: Convert the collected information into XML and Turtle (TTL) formats using R functions.
- Data Presentation: Translate the transformed data into HTML landing pages via XSLT.
Implementation Standards
The implementation adheres to several standards and recommendations to ensure compliance and interoperability.
- ESIP: the Physical Sample Curation recommendations;
- SOSA Ontology: the main version of the SOSA ontology (19-10-2017), see W3C SOSA Ontology;
- SOSA Ontology extension: the extension of the SOSA ontology (05-09-2024), see the W3C SOSA-SSN Draft and its GitHub repository;
- SESAR (System for Earth Sample Registration) XSD schema solutions [1];
- IGSN (International Generic Sample Number) CSIRO (Commonwealth Scientific and Industrial Research Organisation) schema compliance [2];
- TDWG MIDS (Minimum Information about a Digital Specimen) specification at level 1.
Adhering to these standards and schemas keeps the solution compliant with DataCite requirements and makes the specimen data standardized and interoperable.
Production Flow
The production flow of this implementation is illustrated in the figure below:
The workflow of this application consists of the following steps:
- Fill the Spreadsheet: complete the specimen_template.xlsx spreadsheet with the relevant specimen information;
- Generate XML and TTL: use the specimen_catalogue() function to generate IGSN CSIRO XML and TTL (Turtle) files based on the SOSA Ontology for each record present in the spreadsheet.
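
To make the record-to-Turtle step more concrete, here is a minimal sketch in base R of how one spreadsheet record could be described as a sosa:Sample. It is not the code behind specimen_catalogue(), whose arguments and output layout are not shown here. The column names (id, label, feature), the example.org IRIs, and the helper functions escape_ttl() and record_to_ttl() are all assumptions made for illustration.

```r
# Hypothetical records standing in for rows of the spreadsheet.
# Column names and values are assumptions made for this sketch.
specimens <- data.frame(
  id = c("EXAMPLE-0001", "EXAMPLE-0002"),
  label = c("Soil core, plot A", "Leaf sample \"B\""),
  feature = c("https://example.org/feature/plot-a",
              "https://example.org/feature/plot-b"),
  stringsAsFactors = FALSE
)

# Escape backslashes and double quotes for a Turtle string literal.
escape_ttl <- function(x) {
  x <- gsub("\\", "\\\\", x, fixed = TRUE)
  gsub("\"", "\\\"", x, fixed = TRUE)
}

# Describe one record as a sosa:Sample in Turtle.
# The base IRI is a placeholder, not the repository's identifier scheme.
record_to_ttl <- function(id, label, feature,
                          base = "https://example.org/specimen/") {
  paste(
    "@prefix sosa: <http://www.w3.org/ns/sosa/> .",
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
    "",
    sprintf("<%s%s> a sosa:Sample ;", base, id),
    sprintf("    rdfs:label \"%s\" ;", escape_ttl(label)),
    sprintf("    sosa:isSampleOf <%s> .", feature),
    sep = "\n"
  )
}

# One Turtle document per record, as in the "for each record" step above.
ttl_docs <- mapply(record_to_ttl, specimens$id, specimens$label,
                   specimens$feature, USE.NAMES = FALSE)

cat(ttl_docs[1], "\n")
```

Building the Turtle text with sprintf() keeps the sketch free of dependencies. A real implementation would more likely read the spreadsheet with a package such as readxl and serialize through an RDF library, which also takes care of IRI validation.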