A Guide to SDTM Mapping

Excerpt from the paper: A Guide to SDTM Mapping: The Process, Typical Scenarios, & Best Practices.

SDTM Implementation Guide

Have a good understanding of SDTM domains and their structure. The SDTM Implementation Guide (SDTMIG) is there to help with this.

Read the SDTMIG. It will make the SDTM mapping process much smoother.

The overall data flow looks like this:

  • Build the EDC from the CRF
  • Get raw datasets (source data) from the EDC
  • Map the raw datasets (source data) to SDTM datasets

6 key steps in a typical mapping process:

  1. Identify all the datasets you want to map.
  2. Identify all the SDTM datasets that correlate with those datasets.
  3. Get the dataset metadata, i.e. the variable names, labels, types, lengths and formats of the source datasets.
  4. Get the SDTM dataset metadata that corresponds to Step 3.
  5. Map the variables in the datasets identified in Step 1 to the SDTM domain variables.
  6. Create custom domains for any other datasets that don't have corresponding SDTM datasets.
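
As a rough illustration of the first two steps and the last one, the sketch below keeps a small lookup from raw dataset names to SDTM domains and flags the datasets that would need a custom domain. The raw dataset names (`demog`, `vitals`, `adverse`, `device_log`) and the helper `plan_mapping` are made up for this example and are not from the paper.

```python
# Hypothetical raw dataset names and the SDTM domain each one feeds.
# None means no standard SDTM domain was identified for that dataset.
RAW_TO_SDTM = {
    "demog": "DM",
    "vitals": "VS",
    "adverse": "AE",
    "device_log": None,
}


def plan_mapping(raw_to_sdtm):
    """Split raw datasets into those with a standard target domain
    and those that would need a custom domain."""
    standard = {raw: dom for raw, dom in raw_to_sdtm.items() if dom is not None}
    custom = sorted(raw for raw, dom in raw_to_sdtm.items() if dom is None)
    return standard, custom


standard, custom = plan_mapping(RAW_TO_SDTM)
print(standard)
print(custom)
```

In practice this lookup usually lives in a mapping specification rather than in code; the dictionary here only stands in for that document.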

There are 9 likely scenarios in a typical SDTM mapping process. Get to grips with these, and SDTM mapping becomes much more achievable.

  • The direct carry forward

    Variables that are already SDTM compliant can be carried forward directly to the SDTM datasets. They don't need to be modified. (Nothing needs to be done; just copy them as they are.)

  • The variable rename

    You need to rename some variables to be able to map to the corresponding SDTM variable. For example, if the original variable is GENDER, it should be renamed SEX to comply with SDTM standards.
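
A minimal sketch of a rename, assuming each raw record is a plain dictionary. The `GENDER` to `SEX` pair comes from the example above; `SUBJID` and the helper `rename_variables` are only illustrative.

```python
# Raw variable name -> SDTM variable name.
RENAME_MAP = {"GENDER": "SEX"}


def rename_variables(record, rename_map):
    """Return a copy of the record with raw names swapped for SDTM names.
    Variables not listed in rename_map keep their original name."""
    return {rename_map.get(name, name): value for name, value in record.items()}


raw_record = {"SUBJID": "001", "GENDER": "F"}
renamed = rename_variables(raw_record, RENAME_MAP)
print(renamed)
```

Only the variable name changes here; the value `"F"` is carried over untouched.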

  • The variable attribute change

    Variable attributes must be mapped as well as variable names. Attributes like label, type, length and format must comply with the SDTM attributes.
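
One way to picture this is a small table of target attributes that each value is checked against. The sketch below is an assumption-laden illustration: the `VariableSpec` class, the `apply_attributes` helper and the lengths shown are invented for the example, and the real attributes should be taken from the SDTMIG and your own specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableSpec:
    name: str
    label: str
    dtype: type   # target Python type standing in for Char/Num
    length: int   # maximum length; only checked for character variables here


# Illustrative target attributes, not an authoritative copy of the standard.
SPECS = {
    "SEX": VariableSpec("SEX", "Sex", str, 1),
    "AGE": VariableSpec("AGE", "Age", int, 8),
}


def apply_attributes(record, specs):
    """Convert each value to its target type and enforce the character length."""
    out = {}
    for name, value in record.items():
        spec = specs[name]
        converted = spec.dtype(value)
        if spec.dtype is str and len(converted) > spec.length:
            raise ValueError(f"{name}={converted!r} exceeds length {spec.length}")
        out[name] = converted
    return out


typed = apply_attributes({"SEX": "F", "AGE": "34"}, SPECS)
print(typed)
```

A value such as `"FEMALE"` for `SEX` would fail the length check, which is a hint that a value map (covered further down) is needed before the attributes can be applied.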

  • The reformat

    The format in which a value is stored changes, but the value itself does not. For example, converting a SAS date to an ISO 8601 character string. (Only the representation changes; the date it stands for stays the same.)
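
To make the SAS date example concrete: a SAS date value is a count of days since 1 January 1960, so the reformat is a matter of adding that many days to the epoch and printing the result as `YYYY-MM-DD`. The function name `sas_date_to_iso8601` is just a label chosen for this sketch.

```python
from datetime import date, timedelta

# SAS stores dates as the number of days since 1960-01-01.
SAS_EPOCH = date(1960, 1, 1)


def sas_date_to_iso8601(sas_days):
    """Convert a numeric SAS date value to an ISO 8601 date string."""
    return (SAS_EPOCH + timedelta(days=sas_days)).isoformat()


print(sas_date_to_iso8601(0))      # 1960-01-01
print(sas_date_to_iso8601(21915))  # 2020-01-01
```

The underlying date is the same before and after; only the stored form goes from a number to a character string.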

  • The combine

    Sometimes multiple variables must be combined to form a single SDTM variable. (In other words, some SDTM variables can't be carried forward directly and have to be built from several source variables.)
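
A common case, used here purely as an illustration, is a date and a time collected in separate raw fields that end up in one ISO 8601 date/time variable. The sketch assumes both parts are already ISO 8601 strings; the helper name `combine_date_time` is hypothetical.

```python
def combine_date_time(date_part, time_part):
    """Join a date string and a time string into one ISO 8601 value.
    A missing time yields the date alone; a missing date yields an empty string."""
    if not date_part:
        return ""
    if not time_part:
        return date_part
    return f"{date_part}T{time_part}"


print(combine_date_time("2021-03-05", "14:30"))  # 2021-03-05T14:30
print(combine_date_time("2021-03-05", ""))       # 2021-03-05
```

Keeping the date-only value when the time is missing avoids inventing a time that was never collected.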

  • The split

    A non-SDTM variable might need to be split into 2 or more SDTM variables to comply with SDTM standards. (This is the reverse of the combine scenario.)
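
As an example of a split, suppose a raw field holds a result and its unit together, such as `72 kg`. The sketch below separates it into the Vital Signs result and unit variables (`VSORRES`, `VSORRESU`). The raw layout, with a single space between value and unit, is an assumption made for the example.

```python
def split_result_and_unit(raw_value):
    """Split a 'value unit' string into separate result and unit variables.
    If there is no unit, VSORRESU is left empty."""
    result, _, unit = raw_value.strip().partition(" ")
    return {"VSORRES": result, "VSORRESU": unit.strip()}


print(split_result_and_unit("72 kg"))
print(split_result_and_unit("98.6"))
```

Real data is rarely this tidy, so a production mapping would normally list the patterns it expects and flag anything that doesn't match.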

  • The derivation

    Some SDTM variables are obtained by deriving a result from data in the non-SDTM dataset. For example, instead of manually entering a patient's age, you use the date of birth and the study start date to derive it.
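
The age example can be sketched as below, counting completed years between the two dates. Both inputs are assumed to be full ISO 8601 dates; treating the study start date as the subject's reference start date is an assumption of this sketch, and `derive_age` is a name chosen for illustration.

```python
from datetime import date


def derive_age(birth_iso, reference_start_iso):
    """Return age in completed years at the reference start date."""
    birth = date.fromisoformat(birth_iso)
    ref = date.fromisoformat(reference_start_iso)
    years = ref.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (ref.month, ref.day) < (birth.month, birth.day):
        years -= 1
    return years


print(derive_age("1980-06-15", "2021-06-14"))  # 40
print(derive_age("1980-06-15", "2021-06-15"))  # 41
```

Partial dates (for example, a birth year only) need their own rules, which this sketch does not attempt to cover.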

  • The variable value map and new code list application

    Some variable values need to be recoded or mapped to match the values of the corresponding SDTM variable. This mapping is recommended for variables with a code list attached that has controlled terminology that can't be extended. You should map all values in the controlled terminology, not just the values present in the dataset. This covers values that are not in the dataset currently but may come in during future dataset updates.
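
Here is a small sketch of a value map, using a yes/no style code list as the target. The terms in `TARGET_CODELIST` and the raw spellings in `RAW_TO_TERM` are illustrative assumptions, so check them against the published controlled terminology before relying on them. The last line shows the point about coverage: every term in the code list has a mapping, not just the ones seen so far.

```python
# Illustrative target terms for a non-extensible yes/no code list (assumed values).
TARGET_CODELIST = ("N", "Y", "U", "NA")

# Hypothetical raw spellings -> controlled term.
RAW_TO_TERM = {
    "no": "N",
    "yes": "Y",
    "unknown": "U",
    "not applicable": "NA",
}


def recode(raw_value, value_map):
    """Map a raw value to its controlled term, failing loudly on unmapped values."""
    key = raw_value.strip().lower()
    if key not in value_map:
        raise KeyError(f"no mapping defined for {raw_value!r}")
    return value_map[key]


# Terms in the code list that no raw value maps to yet (ideally empty).
unmapped_terms = set(TARGET_CODELIST) - set(RAW_TO_TERM.values())

print(recode(" Yes ", RAW_TO_TERM))
print(unmapped_terms)
```

Raising an error on an unmapped raw value is a deliberate choice here: with a code list that can't be extended, a silent pass-through would put a non-compliant value into the SDTM dataset.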

  • The horizontal-to-vertical data structure transpose

    There are situations where the structure of the non-CDISC dataset is completely different to its corresponding SDTM dataset. In such cases you need to transform its structure to one that is SDTM-compliant.

    For example, take the Vital Signs dataset. When data is collected in wide form, every test and recorded value is stored in a separate variable. SDTM requires data to be stored in lean (vertical) form. Therefore, the dataset must be transposed so that the tests, values and units sit under 3 variables. Any variables that can't be mapped to an SDTM variable go into supplemental qualifiers. (This is the same idea as a wide-to-long pivot.)
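
The Vital Signs transpose can be sketched as follows: one wide row per subject becomes one row per test, with the test code, value and unit in `VSTESTCD`, `VSORRES` and `VSORRESU`. The raw column names, the units attached to them and the helper `wide_to_long` are assumptions made for the example.

```python
# Hypothetical wide raw column -> (test code, unit) for the long structure.
TESTS = {
    "SYSBP": ("SYSBP", "mmHg"),
    "DIABP": ("DIABP", "mmHg"),
    "PULSE": ("PULSE", "beats/min"),
}


def wide_to_long(wide_rows, tests):
    """Turn one row per subject into one row per subject and test.
    Tests with no recorded value are skipped."""
    long_rows = []
    for row in wide_rows:
        for column, (testcd, unit) in tests.items():
            value = row.get(column)
            if value is None:
                continue
            long_rows.append({
                "USUBJID": row["USUBJID"],
                "VSTESTCD": testcd,
                "VSORRES": str(value),
                "VSORRESU": unit,
            })
    return long_rows


wide = [{"USUBJID": "S-001", "SYSBP": 120, "DIABP": 80, "PULSE": None}]
long_rows = wide_to_long(wide, TESTS)
for r in long_rows:
    print(r)
```

The sketch leaves out supplemental qualifiers and the other variables a real VS dataset needs; it only shows the change of shape.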


There are things you can do to make SDTM mapping easier.

  • Part of the trouble is that SDTM mapping is typically done at the end of the clinical trial process, once patient data has been collected. Retrospectively trying to make your results data fit the SDTM structure takes a lot of time and effort.
  • For this reason, it's best practice to align raw datasets with CDISC standards before collecting any patient data.
  • That means implementing SDTM right from the start, when designing CRFs. Doing it this way makes it much easier to convert your datasets. It also saves time later in the process, when you're pulling your submission deliverables together, so you can submit your study much more quickly.

Everything above is excerpted from the paper. If you have any questions, please read the original paper.

Please indicate the source: http://www.bioinfo-scrounger.com