Any digital product is only as good as its documentation. One can develop the best algorithm in the world, but it only becomes useful to others if it is accompanied by an explanation of how to use it. One can publish an expensive dataset, but others can only benefit from it if its key characteristics are documented. Good documentation is crucial, especially in a technically and ethically complex domain like machine learning. Over the past months we have therefore conducted a case study on the best standards for technical and ethical documentation. The results of our study are concrete templates which others can use for reporting technical and ethical information on models and datasets. A summary of our research is published in a paper for the IEEE RO-MAN 2022 conference, and more details are available in the full FlexiGroBots deliverable D2.6. We invite other projects to re-use and adapt our templates for their use-cases.
Existing standards for model and dataset documentation
In recent years, several companies have proposed ways of reporting on machine learning models. The most prominent standard is ‘model cards’, essentially a list of questions for model developers, proposed by researchers at Google (Mitchell et al. 2019). The overall objective of a model card is:
- to provide prospective (external) users of a model with technical, legal and ethical information to help them decide if they should use a specific model; and
- to help (internal) model developers adhere to a set of standards before publication.
Based on our review, the original model card standard provides the most comprehensive list of key questions to answer before publication. We therefore build closely upon the original standard from Mitchell et al. (2019) in our work. Several organisations have adopted variations of the model card standard, for example model repositories such as TensorFlow Hub, the Hugging Face Model Hub, and PyTorch Hub. In practice, however, the information provided is often quite sparse due to a lack of enforcement. In FlexiGroBots, each published model will be accompanied by a comprehensive model card.
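To make the idea concrete, the nine section headings from the original model card proposal can be captured in a small data structure with a simple completeness check. This is a hypothetical sketch for illustration only, not the FlexiGroBots template itself; the section names follow Mitchell et al. (2019), while the `ModelCard` class and the example entry are invented here.

```python
from dataclasses import dataclass, field

# The nine section headings proposed in Mitchell et al. (2019).
MODEL_CARD_SECTIONS = [
    "Model Details",
    "Intended Use",
    "Factors",
    "Metrics",
    "Evaluation Data",
    "Training Data",
    "Quantitative Analyses",
    "Ethical Considerations",
    "Caveats and Recommendations",
]

@dataclass
class ModelCard:
    """Minimal container: one free-text entry per model card section."""
    sections: dict = field(
        default_factory=lambda: {s: "" for s in MODEL_CARD_SECTIONS}
    )

    def missing_sections(self) -> list:
        """Sections still left empty -- a simple pre-publication check."""
        return [name for name, text in self.sections.items() if not text.strip()]

# Hypothetical usage: fill in one section, then list what remains.
card = ModelCard()
card.sections["Model Details"] = "Pest-detection CNN, trained June 2022."
print(card.missing_sections())  # every section except "Model Details"
```

A check like `missing_sections` is what makes such a standard enforceable internally: a model is only published once the list comes back empty.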
Similarly, several standards exist for documenting technical and ethical information on datasets. The FAIR data principles, for instance, provide general guidelines for making data Findable, Accessible, Interoperable and Reusable, while the ‘datasheets’ standard provides a more concrete list of questions for dataset developers (Gebru et al. 2021). The datasheets standard was developed by private-sector and academic researchers and, based on our literature review, provides the most practical basis for dataset reporting.
The purpose of datasheets is similar to that of model cards:
- It provides prospective dataset users with the technical, legal and ethical information necessary for deciding on the usefulness of the dataset for their specific use-case;
- It provides dataset creators with a checklist of key questions to support internal quality control and standardisation.

There are several dataset repositories and platforms which provide the technical infrastructure for sharing data, with varying reporting standards (e.g. Zenodo, Harvard Dataverse, TensorFlow Datasets, Kaggle, Hugging Face Datasets, Papers With Code). In FlexiGroBots, we build upon the datasheets standard from Gebru et al. (2021), and all published datasets are accompanied by a datasheet.
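The datasheet standard can be sketched the same way: Gebru et al. (2021) group their questions into seven stages of a dataset's lifecycle, which a small helper can turn into an empty reporting skeleton. The section names follow the published paper; the helper function and the dataset name are illustrative assumptions, not part of the actual FlexiGroBots templates.

```python
# The seven question groups from the datasheets standard (Gebru et al. 2021).
DATASHEET_SECTIONS = [
    "Motivation",
    "Composition",
    "Collection Process",
    "Preprocessing/Cleaning/Labeling",
    "Uses",
    "Distribution",
    "Maintenance",
]

def datasheet_skeleton(dataset_name: str) -> str:
    """Render an empty datasheet as plain text, one heading per question group."""
    lines = [f"Datasheet: {dataset_name}", ""]
    for section in DATASHEET_SECTIONS:
        lines += [section, "TODO", ""]
    return "\n".join(lines)

# Hypothetical usage with an invented dataset name:
print(datasheet_skeleton("crop-weed-images"))
```

The ordering mirrors the dataset lifecycle, so filling the skeleton top to bottom naturally walks a creator from why the data was collected through to how it will be maintained.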
Model card and datasheet templates
Standardisation is essential. We have therefore adopted these two standards and adapted them to our use-cases in autonomous robotics. To enable others to build upon this work, we have created templates for model cards and datasheets, which can be used for non-commercial and educational purposes. We hope that this research inspires others to use similar documentation standards for machine learning models and datasets to help us achieve our common goal: building safe and trustworthy machine learning systems.