The Supplemental Security Income (SSI) Public-Use Microdata File contains an extract of data fields from the Social Security Administration's (SSA) Supplemental Security Record file. This dataset is used to demonstrate how to document variables and datasets in DDI 3.
The SSI dataset consists of a 5 percent random, representative sample of persons who received a federal SSI benefit in December 2001. This file contains approximately 320,000 records, with 13 data fields on each record. The SSI Public-Use Microdata file is distributed by data.gov and the Social Security Administration.
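As a rough sense of scale, the sample size can be projected back to the full recipient population. The sketch below assumes a simple random sample in which every record carries the same weight; that weighting is an assumption made for illustration, not something the file's documentation states here.

```python
# Rough scale-up of the 5 percent sample.
# Assumption: each record has an equal weight of 1 / 0.05 = 20.
SAMPLE_FRACTION = 0.05
APPROX_RECORDS = 320_000  # "approximately 320,000 records"

weight = 1 / SAMPLE_FRACTION
estimated_recipients = round(APPROX_RECORDS * weight)

print(f"Each record represents about {weight:.0f} recipients")
print(f"Estimated federal SSI recipients, December 2001: about {estimated_recipients:,}")
```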
The following DDI 3.1 file was created using Colectica Designer.
The following files were automatically generated by Colectica Designer from the above DDI 3.1 file.
RTF | R Command File | SAS Command File | Stata Command File | SPSS Command File
The sections below show sample DDI 3 snippets.
Each variable has several pieces of metadata that form its description. The following sample DDI describes the variable STAT, which records the recipient's state of residence.
example.org:d9017f32-bc88-4072-8627-75a6e86dab2e:7
STAT
State of residence of recipient
This field indicates the state of residence on record as of December 2001. Recipients from the outlying area of the Northern Mariana Islands are not included on this file.
36e8f115-824b-44c6-9ac4-2835fe540850
example.org
8
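The values above can be read as a small record: an identifier of the form agency:id:version, a name, a label, and a description. The following is a minimal sketch of that shape, not Colectica's or DDI's own object model. The class names `DdiIdentifier` and `Variable` are hypothetical, and the agency:id:version reading of the identifier is inferred from the sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DdiIdentifier:
    """Hypothetical holder for an 'agency:id:version' identifier."""
    agency: str
    id: str
    version: int

    @classmethod
    def parse(cls, text: str) -> "DdiIdentifier":
        # Split from the right so the last two parts are always id and version.
        agency, item_id, version = text.rsplit(":", 2)
        return cls(agency, item_id, int(version))

    def __str__(self) -> str:
        return f"{self.agency}:{self.id}:{self.version}"


@dataclass
class Variable:
    """Hypothetical container for the metadata shown in the sample."""
    identifier: DdiIdentifier
    name: str
    label: str
    description: str


stat = Variable(
    identifier=DdiIdentifier.parse(
        "example.org:d9017f32-bc88-4072-8627-75a6e86dab2e:7"
    ),
    name="STAT",
    label="State of residence of recipient",
    description=(
        "This field indicates the state of residence on record "
        "as of December 2001."
    ),
)

print(stat.name, stat.identifier.agency, stat.identifier.version)
```

Keeping the identifier as its own type makes references between items (for example, from a record layout back to this variable) a matter of comparing three fields.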
DDI also provides a standard way to describe codes and categories. The following DDI sample documents the classification that contains the Male, Female, and Unspecified elements.
Sex Codes
55bd77ed-5f1c-41ae-9c94-b20790ae9e7e
example.org
2
M
6e5addb3-a882-4af4-88ac-e418c3c22a77
example.org
2
F
5e0c5123-6c00-43af-a1d4-e5851704e488
example.org
2
U
The categories are described with the following DDI.
Sex
example.org:55bd77ed-5f1c-41ae-9c94-b20790ae9e7e:2
Male
example.org:6e5addb3-a882-4af4-88ac-e418c3c22a77:2
Female
example.org:5e0c5123-6c00-43af-a1d4-e5851704e488:2
Unspecified
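The two samples fit together by reference: each code points at a category by agency, id, and version, and the category carries the human-readable label. A minimal sketch of that lookup follows. The dictionary layout and the `label_for` helper are illustrative assumptions, while the identifiers and labels are taken from the samples above.

```python
# Code scheme "Sex Codes": code value -> referenced category (agency, id, version).
sex_codes = {
    "M": ("example.org", "55bd77ed-5f1c-41ae-9c94-b20790ae9e7e", 2),
    "F": ("example.org", "6e5addb3-a882-4af4-88ac-e418c3c22a77", 2),
    "U": ("example.org", "5e0c5123-6c00-43af-a1d4-e5851704e488", 2),
}

# Category scheme "Sex": category identifier -> label.
sex_categories = {
    ("example.org", "55bd77ed-5f1c-41ae-9c94-b20790ae9e7e", 2): "Male",
    ("example.org", "6e5addb3-a882-4af4-88ac-e418c3c22a77", 2): "Female",
    ("example.org", "5e0c5123-6c00-43af-a1d4-e5851704e488", 2): "Unspecified",
}


def label_for(code: str) -> str:
    """Resolve a code value to its category label via the reference."""
    return sex_categories[sex_codes[code]]


for code in sex_codes:
    print(code, "->", label_for(code))
```

Separating codes from categories in this way lets the same category labels be reused by a different code scheme, for example one that uses 1, 2, and 9 instead of M, F, and U.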
The first step in documenting a data file in DDI is to describe the relationships among the variables that appear in the dataset. In this case there is only one record type (that is, the dataset is not hierarchical), so the DataRelationship definition is quite simple: it includes all the variables in the specified VariableScheme.
colectica.com:8cf3212a-0b2b-44fd-82d0-d06d23ff365b:1
SSIDataRelationship
colectica.com:ac510f0e-ae0f-47eb-a9ca-0c9e3eced5d2:3
Recipient
SSI Recipient
62e3a2fc-facc-4ce3-ae9b-cee96938c2b3
example.org
4
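The idea of a single logical record that takes every variable from one scheme can be sketched as follows. The class names are hypothetical. Reading the last three values above as the reference to the VariableScheme is an assumption based on the surrounding text, and only the two variable names mentioned on this page are listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """Hypothetical (agency, id, version) reference to another DDI item."""
    agency: str
    id: str
    version: int


@dataclass
class LogicalRecord:
    """Hypothetical stand-in for the single record type in the relationship."""
    name: str
    label: str
    variable_scheme: Reference

    def variables(self, schemes: dict) -> list:
        # No per-variable selection: the record takes the whole scheme.
        return list(schemes[self.variable_scheme])


scheme_ref = Reference("example.org", "62e3a2fc-facc-4ce3-ae9b-cee96938c2b3", 4)

# Only the variable names shown on this page; the remaining fields are omitted.
schemes = {scheme_ref: ["STAT", "SEX"]}

recipient = LogicalRecord("Recipient", "SSI Recipient", scheme_ref)
print(recipient.name, recipient.variables(schemes))
```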
To describe the file format in DDI, a PhysicalDataProduct is required. A PhysicalDataProduct has two main parts: the physical structure definition and the record layout definition.
The PhysicalStructure describes a file's format. In this case, the DDI describes a comma-separated ASCII file.
colectica.com:aa5540eb-41c6-4e7d-b648-f34eb7ae8792:1
Delimited
,
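In practice, a "Delimited" structure with "," as the delimiter is what a CSV reader needs to know. The sketch below uses Python's standard `csv` module. The two rows are made-up placeholders, not real SSI records, and the three-column shape is only for illustration.

```python
import csv
import io

DELIMITER = ","  # the delimiter given in the PhysicalStructure

# Placeholder rows for illustration only (not taken from the SSI file).
sample = io.StringIO("1,36,M\n2,6,F\n")

rows = list(csv.reader(sample, delimiter=DELIMITER))
for row in rows:
    print(row)
```

For the real file, the `io.StringIO` object would be replaced by an opened file handle.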
The RecordLayout describes the order and characteristics of the data items that appear in the dataset.
colectica.com:48989d91-594a-44d7-a289-fbd4f1ea0e26:1
aa5540eb-41c6-4e7d-b648-f34eb7ae8792
colectica.com
1
ShouldNotBeRequiredForSingleSegmentProducts
UTF-8
0
49465b27-86dc-472c-93a0-f74d8ea837fe
example.org
0
Integer
0
d9017f32-bc88-4072-8627-75a6e86dab2e
example.org
7
Integer
1
...
8e237353-a149-47c4-89fb-0ac98ec778b7
example.org
7
Integer
13
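A record layout of this kind can be applied to one delimited row by looking up each item's array position and converting the text to the declared type. The following is a sketch under stated assumptions: `DataItem`, `CONVERTERS`, and `decode` are hypothetical names, only the three items visible above are listed (the elided ones stay elided), and the row values are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    """Hypothetical view of one layout entry: variable, type, position."""
    variable_id: str
    storage_format: str
    array_position: int


layout = [
    DataItem("49465b27-86dc-472c-93a0-f74d8ea837fe", "Integer", 0),
    DataItem("d9017f32-bc88-4072-8627-75a6e86dab2e", "Integer", 1),  # STAT
    # ... items elided in the sample above ...
    DataItem("8e237353-a149-47c4-89fb-0ac98ec778b7", "Integer", 13),
]

# Map a declared storage format to a Python conversion function.
CONVERTERS = {"Integer": int}


def decode(fields, layout):
    """Convert the fields named by the layout, keyed by variable id."""
    return {
        item.variable_id: CONVERTERS[item.storage_format](fields[item.array_position])
        for item in layout
    }


# Placeholder row covering positions 0 through 13 (not real data).
fields = [str(n) for n in range(100, 114)]
record = decode(fields, layout)
print(record)
```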
The data file itself is documented in DDI with the PhysicalInstance. This contains a citation for the data file, points to the appropriate record layout definition, and describes the files that hold the data.
SSI-2001
SSI
Social Security Administration
Social Security Administration
data.gov
2001-12-02T00:00:00
United States Government
SSI-2001
Social Security Administration
Social Security Administration
data.gov
2001-12-02T00:00:00
United States Government
48989d91-594a-44d7-a289-fbd4f1ea0e26
colectica.com
1
SSI-2001.Recipient.dat
...
DDI can also be used to document summary statistics for a dataset. These appear within PhysicalInstance. The following sample DDI shows the frequencies for the SEX variable, which can take the values M (male), F (female), or U (unspecified).
352c8e80-9902-4566-a7ff-ddd84a96e2a2
example.org
8
M
Frequency
false
133557
F
Frequency
false
186905
U
Frequency
false
11
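The frequencies above can be checked against the stated file size, and the same kind of summary can be computed from raw values. The sketch below uses the three counts from the sample. The `frequencies` helper is a hypothetical illustration of how such counts might be produced, not how Colectica computes them.

```python
from collections import Counter

# Frequencies for SEX as listed in the sample above.
counts = {"M": 133557, "F": 186905, "U": 11}

total = sum(counts.values())
print(f"Total records: {total:,}")
for code, n in counts.items():
    print(f"{code}: {n:>7,} ({n / total:.3%})")


def frequencies(values):
    """Count how often each code occurs in a column of values."""
    return dict(Counter(values))


# Tiny illustrative column, not real data.
print(frequencies(["M", "F", "F"]))
```

The three counts sum to 320,473, which agrees with the "approximately 320,000 records" stated for the file.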
DDI 3 captures all aspects of the data lifecycle. To see all the information about the Supplemental Security Income dataset captured using DDI, please download the accompanying DDI 3 file above, or view the documentation that was automatically generated by Colectica using this DDI description.
Learn how to create DDI quickly and easily using the Colectica DDI tools.
Or, contact us about our metadata consulting, metadata preparation, or custom development services.