A good researcher seeks multiple sources of data to answer a research question. These data may be results from clinical trials, financial transactions, or even survey data. Often, they do not stem from a single source, and the research methods behind them differ, whether slightly or substantially. How does one make sense of these disparate pieces of information and bring them together for better research? To integrate disparate data and provide better analyses, researchers often rely on a process called data harmonization.
What is data harmonization and why is it important?
Data harmonization is the process by which data from heterogeneous sources are combined into a unified, cohesive data product. It is inherently a research activity and success depends on an understanding of the research objectives, the idiosyncrasies of the data, and how these data will be translated into actionable results for specific target audiences.
Data harmonization is a powerful research tool because it maximizes analytical capacity: it lets researchers analyze together data that could not otherwise be compared. This is important when, for example, attempting to examine historical trends for which no consistent tracking surveys exist. Similarly, data harmonization may be highly important when one is researching a long-standing survey whose methods (and hence metric comparability) may have changed. Without these practices, researchers would be unable to examine certain issues or topics scientifically, or would at least be hindered in doing so.
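To make the case of a survey whose methods changed more concrete, the following is a minimal sketch, not a description of any particular team's method. It assumes two hypothetical survey waves that measure satisfaction under different variable names and response scales, and it records the recoding decisions in a single crosswalk so that the same rules are applied every time. All survey names, variable names, and recoding rules are invented for illustration.

```python
# Hypothetical crosswalk: maps each source's variable name and response coding
# onto one shared target variable. All names and rules here are illustrative.
CROSSWALK = {
    "survey_2005": {
        "source_var": "satisf",
        "target_var": "satisfaction",
        # Assumed: the 2005 wave used a 3-point scale.
        "recode": {1: "low", 2: "medium", 3: "high"},
    },
    "survey_2015": {
        "source_var": "q12_satisfaction",
        "target_var": "satisfaction",
        # Assumed: the 2015 wave used a 5-point scale, collapsed to 3 categories.
        "recode": {1: "low", 2: "low", 3: "medium", 4: "high", 5: "high"},
    },
}


def harmonize(records, source):
    """Apply the documented recoding rule for one source to its records."""
    rule = CROSSWALK[source]
    harmonized = []
    for record in records:
        raw_value = record.get(rule["source_var"])
        harmonized.append({
            # Keep the origin of each row so later analysis can account for it.
            "source": source,
            # Missing or unmapped responses become None rather than a guess.
            rule["target_var"]: rule["recode"].get(raw_value),
        })
    return harmonized


# Made-up example responses for each wave.
wave_2005 = [{"satisf": 1}, {"satisf": 3}]
wave_2015 = [
    {"q12_satisfaction": 2},
    {"q12_satisfaction": 5},
    {"q12_satisfaction": None},
]

combined = harmonize(wave_2005, "survey_2005") + harmonize(wave_2015, "survey_2015")
```

Keeping the recoding rules in one data structure, rather than scattered through analysis code, is one way to make each harmonization decision visible and repeatable. Whether collapsing a 5-point scale to three categories is defensible depends on the research question and on the documentation of each source.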
Common obstacles and FMG’s approach
A significant obstacle to successful and sophisticated data harmonization is a lack of consistency in applying decision-making rules to preserve data quality. Good data harmonization requires a consistent protocol because the work involves many complex and ambiguous situations. Often, researchers are concerned only with the end goal and do not lay a sound foundation before beginning data harmonization. Without a protocol, researchers may arrive at different decisions or results on separate occasions or with a different team of analysts. Furthermore, it is essential to take into account the conventions surrounding each data source.
The FMG Team has designed, and applies, a rigorous approach to data harmonization tasks. This approach, composed of four pillars and termed ARMA, guides decision-making at the various stages and crossroads of data harmonization. Its consistency helps ensure reproducible and defensible findings.
The value of the ARMA methodology is that it brings the FMG Team and the client together to deal with data harmonization challenges in unison. Most importantly, it provides a blueprint for tests, both analytical and methodological, that can be used to resolve data harmonization challenges systematically. The ARMA methodology is adaptable to each client and helps FMG move quickly through the various stages of data harmonization. All of these stages rest on our ARMA methodology and focus on understanding, cataloguing, and using metadata, as metadata are the best source of information for making effective data harmonization decisions.
The best advice the FMG Team can offer with regard to data harmonization is to be methodical, detailed, and deliberate. First, it is highly important to establish and consistently follow a method for arriving at data harmonization decisions. Second, these decisions should be guided by a detailed understanding of the limitations and caveats surrounding the data sources. Finally, these decisions should be subject to sufficient deliberation, as they are bound to have significant impacts on potential findings.
Ultimately, data harmonization is often very challenging. Success depends on good planning, attention to detail, and above all an appreciation of how compiling data from different sources may or may not present an opportunity to expand the knowledge base for a particular topic.