MANUSCRIPT: Clinical epidemiology in the era of big data: new opportunities, familiar challenges
Routinely recorded health data have evolved from mere by-products of health care delivery or billing into a powerful research tool for studying and improving patient care through clinical epidemiologic research. Big data in the context of epidemiologic research means large interlinkable data sets within a single country or networks of multinational databases. Several Nordic, European, and other multinational collaborations are now well established. Advantages of big data for clinical epidemiology include improved precision of estimates, which is especially important for reassuring (“null”) findings; ability to conduct meaningful analyses in subgroup of patients; and rapid detection of safety signals. Big data will also provide new possibilities for research by enabling access to linked information from biobanks, electronic medical records, patient-reported outcome measures, automatic and semiautomatic electronic monitoring devices, and social media. The sheer amount of data, however, does not eliminate and may even amplify systematic error. Therefore, methodologies addressing systematic error, clinical knowledge, and underlying hypotheses are more important than ever to ensure that the signal is discernable behind the noise.