This work aims to define the key concepts that determine whether a system for collecting, aggregating and processing routine clinical data for research is fit for purpose. We drew on a literature review and on shared experiential learning from research using routinely collected data. We excluded socio-cultural issues and privacy and security issues, because our focus was on linking clinical data. Six key concepts describe data: (1) Data quality: the core overarching concept, which asks whether the data are fit for purpose. (2) Data provenance: how data came to be, incorporating the concepts of lineage and pedigree. Mapping this process requires metadata, and new variables derived during data analysis have their own provenance. (3) Data extraction errors and (4) data processing errors: these are the responsibility of the investigator extracting the data, and need quantifying. (5) Traceability: the capability to identify the origins of any data cell within the final analysis table, which is essential for good governance and almost impossible without a formal system of metadata. (6) Curation: storing data and look-up tables in a way that allows future researchers to carry out further research or review earlier findings. Processing data involves common, distinct steps, and the quality of any metadata may be predictive of the quality of the process. Outputs based on routine data should include a review of the process from data origin to curation, and should publish information about data provenance and processing methods.
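
As an illustration of how provenance metadata can support traceability, the following is a minimal sketch and not a method described in the article. It assumes each variable in the analysis table has one metadata record naming its source and any parent variables; the `Provenance` class, the `trace` function, the file name `gp_extract.csv` and the variable names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    """Metadata describing how one variable (column) came to be."""
    variable: str
    source: str               # hypothetical source system or file name
    derived_from: tuple = ()  # names of parent variables, if any
    method: str = "extracted"  # how the value was produced


def trace(variable, registry):
    """Return the lineage of `variable`, starting with the variable itself."""
    lineage = []
    stack = [variable]
    seen = set()
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        record = registry[name]
        lineage.append(record)
        # Push parents in reverse so they are visited in declared order.
        stack.extend(reversed(record.derived_from))
    return lineage


# Hypothetical registry: two extracted variables and one derived variable.
registry = {
    "weight_kg": Provenance("weight_kg", source="gp_extract.csv"),
    "height_m": Provenance("height_m", source="gp_extract.csv"),
    "bmi": Provenance(
        "bmi",
        source="analysis",
        derived_from=("weight_kg", "height_m"),
        method="weight_kg / height_m ** 2",
    ),
}

for record in trace("bmi", registry):
    print(record.variable, "<-", record.source, "|", record.method)
```

In this sketch the derived variable `bmi` carries its own provenance record, so tracing it leads back to the two extracted variables and their source file. Without such records there would be nothing to walk back through.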

Original publication




Journal article


Yearbook of medical informatics

Publication Date





112 - 120