The Framework says its key goal is “making data segment ‘quality’ consistently understandable, transparent, and comparable across vendors.” The term “quality” here refers to how the data was collected, matched and modeled, not to how well the segment performs in the marketplace.
The four components of the Data Transparency Framework 1.0 are:
- ID-level labeling requirements for data sellers.
- Standard audience taxonomy for a data segment naming convention.
- Open-source API to structure and communicate machine-readable label data between supply chain participants.
- A compliance program to acknowledge transparent data sellers.
The first two components are open for public comment through July 16, while the latter two will be released later this year. Along with the Data Transparency Framework, the IAB is also releasing Audience Taxonomy 1.0 for comment. The Taxonomy creates standard audience segment names, so that data sellers can compare apples to apples.
Lotame VP and IAB Tech Lab Data Transparency Standards Working Group co-chair David Justus told me the new Framework will add labels describing how and when data on an individual anonymized user was collected: whether it was collected online or offline, whether it came from a website (and, if so, which site), whether it came from a filled-out form, whether it was inferred from a visit to a web page, and so on. The combined metadata for each individual in an audience segment then defines that segment. Justus described the result as a kind of “ingredients label” covering the provenance, freshness and kind of data inside.
IAB Tech Lab notes that these labels will answer such questions as:
- Data Provenance: where was the data attribute sourced?
- Data Age: how long ago was the data collected, compiled, and then made available for online activation?
- Data Modeling: to what extent was the data manipulated or modeled?
- Data Segmentation Criteria: what are the qualifying business rules for an ID to be included in a segment?
- Data Taxonomy: when can one data segment be evaluated against another like segment?
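To make the five questions concrete, here is a minimal Python sketch of what a machine-readable segment label might carry. It is an illustration only: the class name, field names and JSON layout are assumptions made for this example, not the schema defined in the Framework or its planned API.

```python
from dataclasses import dataclass, asdict
from datetime import date
import json


@dataclass
class SegmentLabel:
    """Hypothetical label; field names are illustrative, not the IAB schema."""
    segment_name: str    # Data Taxonomy: standardized segment name
    provenance: str      # Data Provenance: where the attribute was sourced
    collected_on: date   # Data Age: when the data was collected
    modeled: bool        # Data Modeling: True if inferred or modeled
    inclusion_rule: str  # Data Segmentation Criteria: qualifying business rule

    def age_in_days(self, today: date) -> int:
        """Days elapsed between collection and the given date."""
        return (today - self.collected_on).days


def comparable(a: SegmentLabel, b: SegmentLabel) -> bool:
    """Treat two segments as comparable only if they share a taxonomy name."""
    return a.segment_name == b.segment_name


def to_json(label: SegmentLabel) -> str:
    """Serialize a label to JSON so it can be passed between parties."""
    record = asdict(label)
    record["collected_on"] = label.collected_on.isoformat()
    return json.dumps(record, sort_keys=True)


# Two made-up vendors labeling a segment with the same standardized name.
vendor_a = SegmentLabel("Auto Intenders", "online: website form",
                        date(2019, 5, 2), False, "submitted a quote request")
vendor_b = SegmentLabel("Auto Intenders", "online: page visit",
                        date(2019, 3, 1), True, "visited 3+ car review pages")

print(comparable(vendor_a, vendor_b))
print(vendor_a.age_in_days(date(2019, 6, 1)))
print(to_json(vendor_a))
```

With both vendors using the same segment name, a buyer could line the two labels up and see that one segment is declared from a form and the other is modeled from page visits, which is the kind of comparison the labels are meant to enable.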
The Framework notes the rationale:
“Without a consistent and flexible approach to data organization and labeling throughout the supply chain, the possibility of attribute misclassification increases, producing more intrusive consumer ad experiences, less efficient advertising investment, and diminished monetization opportunities for publishers.”