The Supplemental Security Income (SSI) Public-Use Microdata File contains an extract of data fields from the Social Security Administration's (SSA) Supplemental Security Record file. This dataset is used to demonstrate how to document variables and datasets in DDI 3.
The SSI dataset consists of a 5 percent random, representative sample of persons who received a federal SSI benefit in December 2001. This file contains approximately 320,000 records, with 13 data fields on each record. The SSI Public-Use Microdata file is distributed by data.gov and the Social Security Administration.
The following DDI 3.1 file was created using Colectica Designer.
The following files were automatically generated by Colectica Designer from the above DDI 3.1 file.
RTF | R Command File | SAS Command File | Stata Command File | SPSS Command File
The sections below show sample DDI 3 snippets.
Each variable is described by several pieces of metadata. The following sample DDI describes the variable named STAT, which records the recipient's state of residence.
<l:Variable id="d9017f32-bc88-4072-8627-75a6e86dab2e" version="7.0.0" versionDate="2010-07-14T04:35:35" xmlns:l="ddi:logicalproduct:3_1" xmlns:r="ddi:reusable:3_1">
<r:UserID type="11179-IRDI">example.org:d9017f32-bc88-4072-8627-75a6e86dab2e:7</r:UserID>
<l:VariableName xml:lang="en-US">STAT</l:VariableName>
<r:Label xml:lang="en-US">State of residence of recipient</r:Label>
<r:Description xml:lang="en-US">This field indicates the state of residence on record as of December 2001. Recipients from the outlying area of the Northern Mariana Islands are not included on this file. </r:Description>
<l:ResponseUnit />
<r:AnalysisUnit />
<l:Representation>
<l:CodeRepresentation blankIsMissingValue="true">
<r:CodeSchemeReference>
<r:ID>36e8f115-824b-44c6-9ac4-2835fe540850</r:ID>
<r:IdentifyingAgency>example.org</r:IdentifyingAgency>
<r:Version>8</r:Version>
</r:CodeSchemeReference>
</l:CodeRepresentation>
</l:Representation>
</l:Variable>
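As a rough illustration of how such a variable description might be consumed, the sketch below reads the name, label, and code scheme reference with Python's standard `xml.etree.ElementTree`. The XML is an abbreviated copy of the snippet above; the function name `summarize_variable` and the shape of the returned dictionary are assumptions made for this example, not part of DDI or Colectica.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are copied from the snippet above; the prefixes are local choices.
NS = {"l": "ddi:logicalproduct:3_1", "r": "ddi:reusable:3_1"}

# Abbreviated copy of the STAT variable (description and empty elements omitted).
VARIABLE_XML = """
<l:Variable id="d9017f32-bc88-4072-8627-75a6e86dab2e" version="7.0.0"
            xmlns:l="ddi:logicalproduct:3_1" xmlns:r="ddi:reusable:3_1">
  <l:VariableName xml:lang="en-US">STAT</l:VariableName>
  <r:Label xml:lang="en-US">State of residence of recipient</r:Label>
  <l:Representation>
    <l:CodeRepresentation blankIsMissingValue="true">
      <r:CodeSchemeReference>
        <r:ID>36e8f115-824b-44c6-9ac4-2835fe540850</r:ID>
        <r:IdentifyingAgency>example.org</r:IdentifyingAgency>
        <r:Version>8</r:Version>
      </r:CodeSchemeReference>
    </l:CodeRepresentation>
  </l:Representation>
</l:Variable>
"""


def summarize_variable(xml_text):
    """Return the name, label, and code scheme reference of a Variable.

    Hypothetical helper for illustration only.
    """
    var = ET.fromstring(xml_text)
    ref = var.find(
        "l:Representation/l:CodeRepresentation/r:CodeSchemeReference", NS
    )
    return {
        "name": var.findtext("l:VariableName", namespaces=NS),
        "label": var.findtext("r:Label", namespaces=NS),
        # A DDI reference is identified by agency, ID, and version together.
        "code_scheme": (
            ref.findtext("r:IdentifyingAgency", namespaces=NS),
            ref.findtext("r:ID", namespaces=NS),
            ref.findtext("r:Version", namespaces=NS),
        ),
    }


summary = summarize_variable(VARIABLE_XML)
print(summary["name"], "-", summary["label"])
```

The variable does not list its codes directly; it points to a code scheme by agency, ID, and version, so the referenced scheme has to be looked up separately.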
DDI also provides a standard way to describe codes and categories. The following DDI sample documents the code scheme that contains the Male, Female, and Unspecified codes.
<l:CodeScheme id="91e0117b-4c91-4b3b-8f08-95fc03cf33c9" version="9.0.0" versionDate="2010-07-14T04:35:35" agency="example.org" xmlns:l="ddi:logicalproduct:3_1" xmlns:r="ddi:reusable:3_1">
<l:CodeSchemeName xml:lang="en-US">Sex Codes</l:CodeSchemeName>
<l:Code>
<l:CategoryReference>
<r:ID>55bd77ed-5f1c-41ae-9c94-b20790ae9e7e</r:ID>
<r:IdentifyingAgency>example.org</r:IdentifyingAgency>
<r:Version>2</r:Version>
</l:CategoryReference>
<l:Value>M</l:Value>
</l:Code>
<l:Code>
<l:CategoryReference>
<r:ID>6e5addb3-a882-4af4-88ac-e418c3c22a77</r:ID>
<r:IdentifyingAgency>example.org</r:IdentifyingAgency>
<r:Version>2</r:Version>
</l:CategoryReference>
<l:Value>F</l:Value>
</l:Code>
<l:Code>
<l:CategoryReference>
<r:ID>5e0c5123-6c00-43af-a1d4-e5851704e488</r:ID>
<r:IdentifyingAgency>example.org</r:IdentifyingAgency>
<r:Version>2</r:Version>
</l:CategoryReference>
<l:Value>U</l:Value>
</l:Code>
</l:CodeScheme>
The categories are described with the following DDI.
<l:CategoryScheme id="2b9c796c-3071-4c36-a113-ab0479b436b7" version="3.0.0" versionDate="2010-07-14T04:26:20" agency="example.org">
<l:CategorySchemeName xml:lang="en-US">Sex</l:CategorySchemeName>
<l:Category id="55bd77ed-5f1c-41ae-9c94-b20790ae9e7e" version="2.0.0" versionDate="2010-07-14T04:26:12" missing="false">
<r:UserID type="11179-IRDI">example.org:55bd77ed-5f1c-41ae-9c94-b20790ae9e7e:2</r:UserID>
<r:Label xml:lang="en-US">Male</r:Label>
</l:Category>
<l:Category id="6e5addb3-a882-4af4-88ac-e418c3c22a77" version="2.0.0" versionDate="2010-07-14T04:26:12" missing="false">
<r:UserID type="11179-IRDI">example.org:6e5addb3-a882-4af4-88ac-e418c3c22a77:2</r:UserID>
<r:Label xml:lang="en-US">Female</r:Label>
</l:Category>
<l:Category id="5e0c5123-6c00-43af-a1d4-e5851704e488" version="2.0.0" versionDate="2010-07-14T04:26:12" missing="false">
<r:UserID type="11179-IRDI">example.org:5e0c5123-6c00-43af-a1d4-e5851704e488:2</r:UserID>
<r:Label xml:lang="en-US">Unspecified</r:Label>
</l:Category>
</l:CategoryScheme>
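Codes and categories live in separate schemes, so a reader has to join them through the category references. The following sketch shows one way to do that in Python. The `Fragment` wrapper element, its namespace declarations, and the function name `code_labels` are assumptions introduced so the example is a single well-formed document; the IDs, values, and labels are taken from the two snippets above, with agencies and versions omitted for brevity.

```python
import xml.etree.ElementTree as ET

NS = {"l": "ddi:logicalproduct:3_1", "r": "ddi:reusable:3_1"}

# "Fragment" is a hypothetical wrapper, not a DDI element.
DDI_XML = """
<Fragment xmlns:l="ddi:logicalproduct:3_1" xmlns:r="ddi:reusable:3_1">
  <l:CodeScheme id="91e0117b-4c91-4b3b-8f08-95fc03cf33c9">
    <l:Code>
      <l:CategoryReference><r:ID>55bd77ed-5f1c-41ae-9c94-b20790ae9e7e</r:ID></l:CategoryReference>
      <l:Value>M</l:Value>
    </l:Code>
    <l:Code>
      <l:CategoryReference><r:ID>6e5addb3-a882-4af4-88ac-e418c3c22a77</r:ID></l:CategoryReference>
      <l:Value>F</l:Value>
    </l:Code>
    <l:Code>
      <l:CategoryReference><r:ID>5e0c5123-6c00-43af-a1d4-e5851704e488</r:ID></l:CategoryReference>
      <l:Value>U</l:Value>
    </l:Code>
  </l:CodeScheme>
  <l:CategoryScheme id="2b9c796c-3071-4c36-a113-ab0479b436b7">
    <l:Category id="55bd77ed-5f1c-41ae-9c94-b20790ae9e7e">
      <r:Label xml:lang="en-US">Male</r:Label>
    </l:Category>
    <l:Category id="6e5addb3-a882-4af4-88ac-e418c3c22a77">
      <r:Label xml:lang="en-US">Female</r:Label>
    </l:Category>
    <l:Category id="5e0c5123-6c00-43af-a1d4-e5851704e488">
      <r:Label xml:lang="en-US">Unspecified</r:Label>
    </l:Category>
  </l:CategoryScheme>
</Fragment>
"""


def code_labels(root):
    """Map each code value to the label of the category it references."""
    # Index categories by their id attribute.
    labels = {
        cat.get("id"): cat.findtext("r:Label", namespaces=NS)
        for cat in root.iterfind(".//l:Category", NS)
    }
    # Follow each code's CategoryReference to its label.
    return {
        code.findtext("l:Value", namespaces=NS): labels[
            code.findtext("l:CategoryReference/r:ID", namespaces=NS)
        ]
        for code in root.iterfind(".//l:Code", NS)
    }


mapping = code_labels(ET.fromstring(DDI_XML))
print(mapping)
```

This sketch matches on the ID alone. A fuller implementation would presumably also compare the identifying agency and version carried by each reference.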
The first step in documenting a data file in DDI is to describe the relationships of variables that appear in the dataset. In this case there is only one record type (that is, the dataset is not hierarchical), so the DataRelationship definition is quite simple: it includes all the variables in the specified VariableScheme.
<l:DataRelationship id="8cf3212a-0b2b-44fd-82d0-d06d23ff365b" version="1.0.0" versionDate="2010-07-14T04:27:19">
<r:UserID type="11179-IRDI">colectica.com:8cf3212a-0b2b-44fd-82d0-d06d23ff365b:1</r:UserID>
<l:DataRelationshipName xml:lang="en-US">SSIDataRelationship</l:DataRelationshipName>
<l:LogicalRecord id="ac510f0e-ae0f-47eb-a9ca-0c9e3eced5d2" hasLocator="false">
<r:UserID type="11179-IRDI">colectica.com:ac510f0e-ae0f-47eb-a9ca-0c9e3eced5d2:3</r:UserID>
<l:LogicalRecordName xml:lang="en-US">Recipient</l:LogicalRecordName>
<r:Label xml:lang="en-US">SSI Recipient</r:Label>
<l:VariablesInRecord allVariablesInLogicalProduct="false">
<l:VariableSchemeReference>
<r:ID>62e3a2fc-facc-4ce3-ae9b-cee96938c2b3</r:ID>
<r:IdentifyingAgency>example.org</r:IdentifyingAgency>
<r:Version>4</r:Version>
</l:VariableSchemeReference>
</l:VariablesInRecord>
</l:LogicalRecord>
</l:DataRelationship>
To describe the file format in DDI, a PhysicalDataProduct is required. A PhysicalDataProduct has two main parts: the physical structure definition and the record layout definition.
The PhysicalStructure describes a file's format. In this case, the DDI describes a comma-separated ASCII file.
<p:PhysicalStructure id="aa5540eb-41c6-4e7d-b648-f34eb7ae8792" version="1.0.0" versionDate="2010-07-14T04:27:57">
<r:UserID type="11179-IRDI">colectica.com:aa5540eb-41c6-4e7d-b648-f34eb7ae8792:1</r:UserID>
<p:Format>Delimited</p:Format>
<p:DefaultDelimiter>,</p:DefaultDelimiter>
</p:PhysicalStructure>
The RecordLayout describes the order and characteristics of the data items that appear in the dataset.
<p:RecordLayout id="48989d91-594a-44d7-a289-fbd4f1ea0e26" version="1.0.0" versionDate="2010-07-14T04:27:57" namesOnFirstRow="false">
<r:UserID type="11179-IRDI">colectica.com:48989d91-594a-44d7-a289-fbd4f1ea0e26:1</r:UserID>
<p:PhysicalStructureReference>
<r:ID>aa5540eb-41c6-4e7d-b648-f34eb7ae8792</r:ID>
<r:IdentifyingAgency>colectica.com</r:IdentifyingAgency>
<r:Version>1</r:Version>
<p:PhysicalRecordSegmentUsed>ShouldNotBeRequiredForSingleSegmentProducts</p:PhysicalRecordSegmentUsed>
</p:PhysicalStructureReference>
<p:CharacterSet>UTF-8</p:CharacterSet>
<p:ArrayBase>0</p:ArrayBase>
<p:DataItem>
<p:VariableReference>
<r:ID>49465b27-86dc-472c-93a0-f74d8ea837fe</r:ID>
<r:IdentifyingAgency>example.org</r:IdentifyingAgency>
<r:Version>0</r:Version>
</p:VariableReference>
<p:PhysicalLocation>
<p:StorageFormat>Integer</p:StorageFormat>
<p:ArrayPosition>0</p:ArrayPosition>
</p:PhysicalLocation>
</p:DataItem>
<p:DataItem>
<p:VariableReference>
<r:ID>d9017f32-bc88-4072-8627-75a6e86dab2e</r:ID>
<r:IdentifyingAgency>example.org</r:IdentifyingAgency>
<r:Version>7</r:Version>
</p:VariableReference>
<p:PhysicalLocation>
<p:StorageFormat>Integer</p:StorageFormat>
<p:ArrayPosition>1</p:ArrayPosition>
</p:PhysicalLocation>
</p:DataItem>
...
<p:DataItem>
<p:VariableReference>
<r:ID>8e237353-a149-47c4-89fb-0ac98ec778b7</r:ID>
<r:IdentifyingAgency>example.org</r:IdentifyingAgency>
<r:Version>7</r:Version>
</p:VariableReference>
<p:PhysicalLocation>
<p:StorageFormat>Integer</p:StorageFormat>
<p:ArrayPosition>13</p:ArrayPosition>
</p:PhysicalLocation>
</p:DataItem>
</p:RecordLayout>
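To show how a record layout could drive the reading of a delimited file, the sketch below builds a column-to-variable map from the `ArrayPosition` values and applies it to a few rows. Several things here are assumptions: the namespace URI bound to `p` (the snippet above does not declare it; `ddi:physicaldataproduct:3_1` is assumed), the helper names `column_map` and `read_records`, and the sample rows, which are made up because the article does not show any data values. Only the first two data items are included.

```python
import csv
import io
import xml.etree.ElementTree as ET

# The "p" namespace URI is an assumption; it is not declared in the snippet.
NS = {"p": "ddi:physicaldataproduct:3_1", "r": "ddi:reusable:3_1"}

LAYOUT_XML = """
<p:RecordLayout xmlns:p="ddi:physicaldataproduct:3_1" xmlns:r="ddi:reusable:3_1">
  <p:ArrayBase>0</p:ArrayBase>
  <p:DataItem>
    <p:VariableReference><r:ID>49465b27-86dc-472c-93a0-f74d8ea837fe</r:ID></p:VariableReference>
    <p:PhysicalLocation><p:ArrayPosition>0</p:ArrayPosition></p:PhysicalLocation>
  </p:DataItem>
  <p:DataItem>
    <p:VariableReference><r:ID>d9017f32-bc88-4072-8627-75a6e86dab2e</r:ID></p:VariableReference>
    <p:PhysicalLocation><p:ArrayPosition>1</p:ArrayPosition></p:PhysicalLocation>
  </p:DataItem>
</p:RecordLayout>
"""


def column_map(layout_xml):
    """Map zero-based column index -> referenced variable ID."""
    layout = ET.fromstring(layout_xml)
    # ArrayBase says whether positions count from 0 or 1; normalise to 0.
    base = int(layout.findtext("p:ArrayBase", default="0", namespaces=NS))
    columns = {}
    for item in layout.iterfind("p:DataItem", NS):
        position = int(
            item.findtext("p:PhysicalLocation/p:ArrayPosition", namespaces=NS)
        )
        columns[position - base] = item.findtext(
            "p:VariableReference/r:ID", namespaces=NS
        )
    return columns


def read_records(text, columns, delimiter=","):
    """Yield one dict per row, keyed by variable ID."""
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        yield {var_id: row[index] for index, var_id in columns.items()}


columns = column_map(LAYOUT_XML)
# Made-up rows for illustration; these are not values from the SSI file.
records = list(read_records("1,2\n3,4\n", columns))
print(records[0])
```

Keying each record by variable ID rather than by position keeps the data tied to the variable descriptions, so labels and code schemes can be looked up afterwards.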
The data file itself is documented in DDI with the PhysicalInstance. This contains a citation for the data file, points to the appropriate record layout definition, and describes the files that hold the data.
<pi:PhysicalInstance id="0d241021-4a5c-4943-a08f-029a8b5e646f" version="4.0.0" versionDate="2010-07-14T04:32:31" agency="colectica.com">
<r:Citation>
<r:Title xml:lang="en-US">SSI-2001</r:Title>
<r:SubTitle xml:lang="en-US">SSI</r:SubTitle>
<r:Creator xml:lang="en-US">Social Security Administration</r:Creator>
<r:Publisher xml:lang="en-US">Social Security Administration</r:Publisher>
<r:Contributor xml:lang="en-US">data.gov</r:Contributor>
<r:PublicationDate>
<r:SimpleDate>2001-12-02T00:00:00</r:SimpleDate>
</r:PublicationDate>
<r:Copyright xml:lang="en-US">United States Government</r:Copyright>
<dc:DCElements>
<dc2:title xml:lang="en-US">SSI-2001</dc2:title>
<dc2:creator xml:lang="en-US">Social Security Administration</dc2:creator>
<dc2:publisher xml:lang="en-US">Social Security Administration</dc2:publisher>
<dc2:contributor xml:lang="en-US">data.gov</dc2:contributor>
<dc2:date>2001-12-02T00:00:00</dc2:date>
<dc2:rights xml:lang="en-US">United States Government</dc2:rights>
</dc:DCElements>
</r:Citation>
<pi:RecordLayoutReference>
<r:ID>48989d91-594a-44d7-a289-fbd4f1ea0e26</r:ID>
<r:IdentifyingAgency>colectica.com</r:IdentifyingAgency>
<r:Version>1</r:Version>
</pi:RecordLayoutReference>
<pi:DataFileIdentification id="e792296e-465d-4a54-a348-c89a003acb2e">
<pi:URI isPublic="false">SSI-2001.Recipient.dat</pi:URI>
</pi:DataFileIdentification>
<pi:Statistics>
...
</pi:Statistics>
</pi:PhysicalInstance>
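The three parts of the PhysicalInstance (citation, record layout reference, and data file location) can be pulled out with a few lookups, as in the sketch below. The namespace URI bound to `pi` is an assumption (`ddi:physicalinstance:3_1`; the snippet above does not declare it), as is the helper name `describe_instance`. The XML is an abbreviated copy of the snippet, without the Dublin Core block and statistics.

```python
import xml.etree.ElementTree as ET

# The "pi" namespace URI is an assumption; it is not declared in the snippet.
NS = {"pi": "ddi:physicalinstance:3_1", "r": "ddi:reusable:3_1"}

INSTANCE_XML = """
<pi:PhysicalInstance id="0d241021-4a5c-4943-a08f-029a8b5e646f"
                     xmlns:pi="ddi:physicalinstance:3_1"
                     xmlns:r="ddi:reusable:3_1">
  <r:Citation>
    <r:Title xml:lang="en-US">SSI-2001</r:Title>
    <r:Creator xml:lang="en-US">Social Security Administration</r:Creator>
  </r:Citation>
  <pi:RecordLayoutReference>
    <r:ID>48989d91-594a-44d7-a289-fbd4f1ea0e26</r:ID>
    <r:IdentifyingAgency>colectica.com</r:IdentifyingAgency>
    <r:Version>1</r:Version>
  </pi:RecordLayoutReference>
  <pi:DataFileIdentification id="e792296e-465d-4a54-a348-c89a003acb2e">
    <pi:URI isPublic="false">SSI-2001.Recipient.dat</pi:URI>
  </pi:DataFileIdentification>
</pi:PhysicalInstance>
"""


def describe_instance(xml_text):
    """Collect the title, creator, layout reference ID, and data file URI."""
    inst = ET.fromstring(xml_text)
    return {
        "title": inst.findtext("r:Citation/r:Title", namespaces=NS),
        "creator": inst.findtext("r:Citation/r:Creator", namespaces=NS),
        "record_layout_id": inst.findtext(
            "pi:RecordLayoutReference/r:ID", namespaces=NS
        ),
        "data_file": inst.findtext(
            "pi:DataFileIdentification/pi:URI", namespaces=NS
        ),
    }


info = describe_instance(INSTANCE_XML)
print(info["title"], "->", info["data_file"])
```

The `record_layout_id` returned here is the same identifier that appears as the `id` of the RecordLayout shown earlier, which is how the data file is tied to its layout.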
DDI can also be used to document summary statistics for a dataset. These appear within PhysicalInstance. The following sample DDI shows the frequencies for the SEX variable, which can take the values M (male), F (female), or U (unspecified).
<pi:VariableStatistics>
<pi:VariableReference>
<r:ID>352c8e80-9902-4566-a7ff-ddd84a96e2a2</r:ID>
<r:IdentifyingAgency>example.org</r:IdentifyingAgency>
<r:Version>8</r:Version>
</pi:VariableReference>
<pi:CategoryStatistics>
<pi:CategoryValue>M</pi:CategoryValue>
<pi:CategoryStatistic>
<pi:CategoryStatisticType>Frequency</pi:CategoryStatisticType>
<pi:Weighted>false</pi:Weighted>
<pi:Value>133557</pi:Value>
</pi:CategoryStatistic>
</pi:CategoryStatistics>
<pi:CategoryStatistics>
<pi:CategoryValue>F</pi:CategoryValue>
<pi:CategoryStatistic>
<pi:CategoryStatisticType>Frequency</pi:CategoryStatisticType>
<pi:Weighted>false</pi:Weighted>
<pi:Value>186905</pi:Value>
</pi:CategoryStatistic>
</pi:CategoryStatistics>
<pi:CategoryStatistics>
<pi:CategoryValue>U</pi:CategoryValue>
<pi:CategoryStatistic>
<pi:CategoryStatisticType>Frequency</pi:CategoryStatisticType>
<pi:Weighted>false</pi:Weighted>
<pi:Value>11</pi:Value>
</pi:CategoryStatistic>
</pi:CategoryStatistics>
</pi:VariableStatistics>
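As a small worked example, the frequencies above can be read back and turned into shares of the total. The namespace URI bound to `pi` is again an assumption (`ddi:physicalinstance:3_1`), and the helper name `frequencies` is hypothetical; the counts are the ones shown in the snippet, with the variable reference and `Weighted` flags omitted.

```python
import xml.etree.ElementTree as ET

# The "pi" namespace URI is an assumption; it is not declared in the snippet.
NS = {"pi": "ddi:physicalinstance:3_1"}

STATS_XML = """
<pi:VariableStatistics xmlns:pi="ddi:physicalinstance:3_1">
  <pi:CategoryStatistics>
    <pi:CategoryValue>M</pi:CategoryValue>
    <pi:CategoryStatistic>
      <pi:CategoryStatisticType>Frequency</pi:CategoryStatisticType>
      <pi:Value>133557</pi:Value>
    </pi:CategoryStatistic>
  </pi:CategoryStatistics>
  <pi:CategoryStatistics>
    <pi:CategoryValue>F</pi:CategoryValue>
    <pi:CategoryStatistic>
      <pi:CategoryStatisticType>Frequency</pi:CategoryStatisticType>
      <pi:Value>186905</pi:Value>
    </pi:CategoryStatistic>
  </pi:CategoryStatistics>
  <pi:CategoryStatistics>
    <pi:CategoryValue>U</pi:CategoryValue>
    <pi:CategoryStatistic>
      <pi:CategoryStatisticType>Frequency</pi:CategoryStatisticType>
      <pi:Value>11</pi:Value>
    </pi:CategoryStatistic>
  </pi:CategoryStatistics>
</pi:VariableStatistics>
"""


def frequencies(stats_xml):
    """Return {category value: frequency count} for one variable."""
    root = ET.fromstring(stats_xml)
    counts = {}
    for cat in root.iterfind("pi:CategoryStatistics", NS):
        value = cat.findtext("pi:CategoryValue", namespaces=NS)
        for stat in cat.iterfind("pi:CategoryStatistic", NS):
            kind = stat.findtext("pi:CategoryStatisticType", namespaces=NS)
            # Other statistic types may be present; keep frequencies only.
            if kind == "Frequency":
                counts[value] = int(stat.findtext("pi:Value", namespaces=NS))
    return counts


counts = frequencies(STATS_XML)
total = sum(counts.values())
shares = {value: count / total for value, count in counts.items()}

print("total records:", total)
for value, share in shares.items():
    print(f"{value}: {share:.2%}")
```

The three counts sum to 320,473, which is consistent with the approximately 320,000 records stated for the file.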
DDI 3 captures all aspects of the data lifecycle. To see all the information about the Supplemental Security Income dataset captured using DDI, please download the accompanying DDI 3 file above, or view the documentation that was automatically generated by Colectica using this DDI description.