September 22, 2013


SDTM: A Standard for Clinical Trial Data

The Study Data Tabulation Model (SDTM) is defined by the Clinical Data Interchange Standards Consortium (CDISC) as a standard structure for human clinical trial (study) data tabulations that are to be submitted to a regulatory authority such as the US Food and Drug Administration (FDA).

SDTM is the standard format recommended by the FDA for tabulation data. It is a content standard maintained by CDISC that describes how to organize subject information into variables and domains for use as a standardized submission dataset format. SDTM is based on the concept of observations (described by variables) made on the subjects who participate in a clinical study. The collected data is classified into a series of domains, and the domains are in turn grouped into the Findings, Interventions, Events, and Special Purpose classes.
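As a rough illustration of this idea, the sketch below models one observation as a set of variables and groups a few well-known domain codes by class. The domain codes and variable names (STUDYID, DOMAIN, USUBJID, AESEQ, AETERM) follow common SDTM usage, but the `DOMAIN_CLASS` table and the `classify` helper are hypothetical names made up for this example, the table covers only a handful of domains, and the record values are invented.

```python
# Illustrative only: a small subset of SDTM domains grouped by class.
DOMAIN_CLASS = {
    "DM": "Special Purpose",  # Demographics
    "AE": "Events",           # Adverse Events
    "MH": "Events",           # Medical History
    "CM": "Interventions",    # Concomitant Medications
    "EX": "Interventions",    # Exposure
    "LB": "Findings",         # Laboratory Test Results
    "VS": "Findings",         # Vital Signs
}


def classify(observation):
    """Return the class of the domain an observation belongs to."""
    domain = observation["DOMAIN"]
    if domain not in DOMAIN_CLASS:
        raise KeyError(f"Domain {domain!r} is not in this illustrative table")
    return DOMAIN_CLASS[domain]


# One observation: a made-up adverse event record for one subject.
ae_record = {
    "STUDYID": "STUDY01",
    "DOMAIN": "AE",
    "USUBJID": "STUDY01-001",
    "AESEQ": 1,
    "AETERM": "HEADACHE",
}

print(classify(ae_record))
```

Each key in `ae_record` is a variable, the record as a whole is an observation, and the DOMAIN value determines which class the observation falls under.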

SDTM is a standard that improves process efficiency and a model that provides flexibility. It has the following advantages:

  • Provides a uniform standard for clinical trial data to ease data exchange
  • Facilitates communication between CROs, sponsors, and regulators
  • Improves viewing and analysis by streamlining the flow of data in a clinical trial process and facilitating data interchange between partners and providers
  • Facilitates data management by consolidating the data collected from multiple CRFs
  • Improves the effectiveness of reviewers and reduces their preparation time, by providing standardized datasets and standard software tools
  • Ensures a more comprehensive, timely and efficient FDA review process, by providing the reviewer with standard tools and checks
  • Facilitates meta-analysis of safety across new drug entities from multiple companies by enabling the FDA to develop a repository of all submitted data and standard review tools and to access, manipulate and view the tabulations using standardized datasets
  • Reduces the number of submission queries to pharmaceutical companies by leveraging the standard structure provided by SDTM
  • Allows companies to add domains and variables beyond the CDISC controlled domains and variables
  • Increases programming consistency and study efficiency by providing the same data structure to studies with different designs
  • Assists in the creation of analysis data sets by making it practical to develop reusable macros
  • Facilitates the development of commercial reporting and analysis tools
  • Simplifies cross-study analysis

While SDTM provides a standard and ample flexibility, implementing it can be tedious and lengthy. For instance, converting a clinical database may be difficult because of the large number of CDISC domains and variables. Similarly, data conversion may suffer errors and delays because many ETL (Extract, Transform, Load) programmers lack CDISC domain expertise. The conversion can also produce large amounts of code that are difficult to understand and reuse. In addition, every sponsor company implements SDTM with some variation, because the model is subject to interpretation and allows some flexibility. Therefore, additional documentation such as the ‘Define’ (metadata definitions) document is required to support the data sets. Moreover, the SDTM standard is still evolving, and new guidance updates may affect submissions and require restructuring of data. Consequently, conversion to the SDTM format requires extra effort, time, and cost.
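To make the conversion step more concrete, here is a minimal sketch of mapping one raw vital-signs record to an SDTM-style row. The raw field names (`subj`, `sbp`, `visit_date`), the assumed DD/MM/YYYY input date format, and the `to_vs_row` function are all hypothetical. The target variable names follow common SDTM usage for the VS domain, but a real conversion involves many more variables, controlled terminology checks, and sponsor-specific rules.

```python
from datetime import datetime


def to_vs_row(raw, study_id, seq):
    """Map one raw CRF record (hypothetical layout) to an SDTM-style VS row."""
    # Raw dates are assumed to be DD/MM/YYYY; SDTM date variables use ISO 8601.
    iso_date = datetime.strptime(raw["visit_date"], "%d/%m/%Y").date().isoformat()
    return {
        "STUDYID": study_id,
        "DOMAIN": "VS",
        "USUBJID": f"{study_id}-{raw['subj']}",
        "VSSEQ": seq,
        "VSTESTCD": "SYSBP",
        "VSTEST": "Systolic Blood Pressure",
        "VSORRES": str(raw["sbp"]),  # result as originally collected, kept as text
        "VSORRESU": "mmHg",
        "VSDTC": iso_date,
    }


# A made-up raw record as it might come out of a clinical database.
raw_record = {"subj": "001", "sbp": 120, "visit_date": "22/09/2013"}
row = to_vs_row(raw_record, study_id="STUDY01", seq=1)
print(row["USUBJID"], row["VSTESTCD"], row["VSORRES"], row["VSDTC"])
```

Even this single-test mapping has to settle naming, date format, and unit questions, which hints at why converting a full clinical database takes significant effort and domain knowledge.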

The SDTM standard has been endorsed by the FDA and embraced by the pharmaceutical industry. It has improved the FDA data submission and review process. The Center for Drug Evaluation and Research (CDER) also encourages its use to ensure efficient, high-quality reviews. The standard has improved data management, data integrity checking, data and cross-study analysis, and reporting. The widespread acceptance of SDTM will benefit both the industry and the regulators through a more efficient data conversion process and lower associated costs.