Background: We have used routinely collected clinical data in epidemiological and quality improvement research for over 10 years. We extract, pseudonymise and link data from heterogeneous distributed databases, and inevitably encounter errors and problems. Objective: To develop a solution-orientated system of error reporting that enables appropriate corrective action. Method: We reviewed the 94 errors that occurred in 2008/9. Previously we had described failures in terms of the data missing from our response files; however, this provided little information about causation. We therefore developed a taxonomy based on the IT component limiting data extraction. Results: Our final taxonomy categorised errors as: (A) Data Extraction Method and Process; (B) Translation Layer and Proxy Specification; (C) Shape and Complexity of the Original Schema; (D) Communication and System (mainly Software-based) Faults; (E) Hardware and Infrastructure; (F) Generic/Uncategorised and/or Human Errors. We found 79 distinct errors among the 94 reported, and the categories were generally predictive of the time needed to develop fixes. Conclusions: Taking a systematic approach to errors and linking them to problem solving has improved project efficiency and enabled us to better predict any associated delays. © 2010 IMIA and SAHIA. All rights reserved.
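As an illustration only, the six-category taxonomy in the Results could be represented as a small data structure for tallying error reports. The sketch below is not the authors' system: the `ErrorReport` fields, the `signature` key used to decide when two reports describe the same underlying error, and the demo records are all hypothetical assumptions introduced for this example.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ErrorCategory(Enum):
    """The six categories (A-F) named in the abstract."""
    A = "Data extraction method and process"
    B = "Translation layer and proxy specification"
    C = "Shape and complexity of the original schema"
    D = "Communication and system (mainly software-based) faults"
    E = "Hardware and infrastructure"
    F = "Generic/uncategorised and/or human errors"


@dataclass(frozen=True)
class ErrorReport:
    report_id: int           # hypothetical identifier for one reported error
    signature: str           # hypothetical key: equal signatures mean the same error
    category: ErrorCategory


def summarise(reports):
    """Return (reports received, distinct errors, distinct errors per category)."""
    distinct = {}
    for report in reports:
        # Keep the category of the first report seen for each signature.
        distinct.setdefault(report.signature, report.category)
    per_category = Counter(distinct.values())
    return len(reports), len(distinct), per_category


# Made-up demo records, not data from the study.
demo_reports = [
    ErrorReport(1, "timeout-on-extract", ErrorCategory.D),
    ErrorReport(2, "timeout-on-extract", ErrorCategory.D),
    ErrorReport(3, "unmapped-code-table", ErrorCategory.B),
]
total, distinct_count, per_category = summarise(demo_reports)
```

Separating the count of reports from the count of distinct errors mirrors the abstract's distinction between the 94 reported errors and the 79 distinct ones; how the authors actually identified duplicates is not stated.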

Original publication

Conference paper

Pages

724 - 728