Data Plans: Metadata Standards

Properly describing and documenting data allows users (yourself included) to understand and track important details of the work. Standardization enables interoperability between systems and greatly increases the ability of others to find and understand your data and to know the requirements for access and reuse. It also enables your data to be connected to relevant resources, such as publications, citations, indexes, measures of impact, and grants.

Metadata can take many different forms, from free text in a readme file to standardized, structured, and machine-readable information that follows a metadata schema.

Which metadata schema should you use?

Most data repositories use only one or two metadata schemas, depending on the type and subject of the data they receive. Check with the repository where you might deposit your data before you begin outlining the metadata information in your data management plan.

Some repositories provide metadata assistance and other data curation support to data depositors, either free or for a fee (e.g., Inter-university Consortium for Political and Social Research – ICPSR).

If a standard has not been defined for your discipline or you’re not sure what repository you will use, contact lib-data@uiowa.edu.

For a general data-centric metadata standard, a good starting place is the DataCite schema.

  • It has mandatory fields (e.g., creator, title of the dataset), and
  • recommended and optional fields, some of which funders might require (e.g., funding source and grant number), along with other elements (e.g., subject, description, geolocation, format, version, rights information).
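
As a rough illustration of what a structured, machine-readable record can look like, the sketch below stores a dataset description as a Python dictionary and checks it for the six properties that DataCite Metadata Schema 4.x marks as mandatory (identifier, creator, title, publisher, publication year, resource type). The key names are simplified stand-ins loosely modeled on DataCite property names, the helper missing_mandatory is hypothetical, and every value is a placeholder. Your repository's actual submission format may differ, so treat this as a sketch rather than a template.

```python
# Simplified, assumed key names for the six properties that DataCite
# Metadata Schema 4.x marks as mandatory. A real DataCite record is
# richer (nested names, identifier types, and so on).
MANDATORY = (
    "identifier",
    "creators",
    "titles",
    "publisher",
    "publicationYear",
    "resourceType",
)


def missing_mandatory(record):
    """Return the mandatory property names that are absent or empty."""
    return [name for name in MANDATORY if not record.get(name)]


# Placeholder record: none of these values refer to a real dataset.
record = {
    "identifier": "10.1234/example-doi",
    "creators": ["Doe, Jane"],
    "titles": ["Example stream temperature dataset"],
    "publisher": "Example Data Repository",
    "publicationYear": 2024,
    "resourceType": "Dataset",
    # Recommended and optional elements that a funder might ask for.
    "subjects": ["hydrology"],
    "fundingReferences": [
        {"funderName": "Example Funder", "awardNumber": "ABC-123"}
    ],
}

# An empty list means every mandatory property is present.
print(missing_mandatory(record))
```

Running a check like this before deposit can catch an empty title or a missing publication year early, while the same details are still easy to look up.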

Templates and Tools:

Some repositories and research domains have developed templates and minimum information standards for metadata so information is uniformly available following a common structure and format.

For example: The Gene Expression Omnibus (GEO) database uses the MIAME (Minimum Information About a Microarray Experiment) and MINSEQE (Minimum Information About a Next-generation Sequencing Experiment) guidelines which outline the minimum information that should be included when describing a microarray or sequencing study. Many journals and funding agencies require microarray data to comply with MIAME and MINSEQE standards.
– from GEO MIAME and MINSEQE guidelines.

In some domains, there are software programs to assist with creating metadata. For example, Morpho is used to create Ecological Metadata Language (EML) records for ecological data, and Nesstar is used to create metadata for social science data.

If your discipline or repository does not require a specific metadata standard, or you would like assistance, contact us at lib-data@uiowa.edu.