Collaborative Studies Coordinating Center
Department of Biostatistics
 

Data Management

Since its inception more than 30 years ago, the UNC CSCC has been at the leading edge of research data management system development. Our first systems used the traditional architecture of paper-form data collection with centralized data processing, and included features designed to bridge the gap between commercial database management systems and the requirements of health research projects. These systems were the first to associate a data quality field with each data value, which makes it possible to record out-of-range (yet correct) values, different varieties of missing data, and similar conditions. This feature, among others, has since become standard in software for the research data management market.

In the mid-1980s, we developed a system for the ARIC project that was the first application of electronic data capture (EDC) in an NIH-sponsored epidemiologic study. In the early 1990s, we developed a system for the Longscan project that uses computer-based self-administration of questionnaires by participants. Because the study includes young children, we incorporated audio administration as an option for those with limited reading skills. Several of our more recent studies have used web-based software, which allows data collection, entry, and validation from any computer with a standard internet connection and a web browser.

In aggregate, these systems have been used to collect and process over ten million pages of data (or equivalent electronic records) from over 300 clinical centers and central agencies. The web systems have been used for seven large clinical trials or epidemiologic studies over the past six years.
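
For readers unfamiliar with the idea of a data quality field attached to each data value, the following Python sketch shows one way such a pairing might look. It is an illustration only and not the CSCC's actual implementation: the names `Quality`, `QualifiedValue`, and `validate`, the specific quality codes, and the range limits are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Quality(Enum):
    """Hypothetical quality codes; the real systems' codes are not given in the text."""
    OK = "ok"
    CONFIRMED_OUT_OF_RANGE = "confirmed_out_of_range"  # unusual but verified as correct
    MISSING_REFUSED = "missing_refused"                # participant declined to answer
    MISSING_NOT_APPLICABLE = "missing_not_applicable"  # question did not apply
    MISSING_NOT_COLLECTED = "missing_not_collected"    # measurement was not taken


@dataclass(frozen=True)
class QualifiedValue:
    """A data value stored together with its own quality field."""
    value: Optional[float]
    quality: Quality = Quality.OK


def validate(item: QualifiedValue, low: float, high: float) -> List[str]:
    """Return a list of problems found for one value (empty if none)."""
    is_missing_code = item.quality.name.startswith("MISSING")
    if is_missing_code:
        # A missing-data code should not be accompanied by a value.
        if item.value is not None:
            return ["missing-data code set but a value is present"]
        return []
    if item.value is None:
        return ["no value and no missing-data code"]
    out_of_range = not (low <= item.value <= high)
    if out_of_range and item.quality is not Quality.CONFIRMED_OUT_OF_RANGE:
        return ["value outside expected range and not confirmed"]
    return []
```

In this sketch, a value of 250 checked against an assumed range of 60 to 220 is flagged unless its quality field is `CONFIRMED_OUT_OF_RANGE`, and a missing value is accepted only when the quality field records why it is missing. Keeping the quality code beside the value, rather than encoding it in sentinel values such as -9, is the design choice the example is meant to illustrate.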