Module 3: Answer

1. Defining key takeaways

You settled on a set of questions to include in your Data Card, questions that you feel are important for your readers. However, it's not as simple as answering those questions and calling it a Data Card. It takes thoughtful consideration to ensure that your final Data Card is optimized for your readers' experience.

When people read Data Cards, they want to make very specific decisions, such as the following:

  • Is this dataset suitable for my use case?
  • Can I let others use this dataset?
  • How can I safely use this dataset without adding risk to my models?

If readers can access the right information efficiently, they're remarkably adept at making dataset-related decisions within their own contexts. The importance or usefulness of information depends on the type of decision that the reader must make and the reader's background. For example, when deciding whether to use a dataset, a compliance officer might look at the licenses associated with it, while an engineer might look at the technical stack. Both readers ask the same question but expect different answers.

Data Cards should describe your dataset comprehensively so that readers can make decisions confidently. Writing these comprehensive descriptions helps you decide what you want readers to get from your Data Card, and determine the kind of accurate, robust, and organized information to document in it. Of course, the challenge is that it's impossible to anticipate every decision that readers of your Data Card will need to make.

2. Plan your Data Card

  • To determine the decisions that your Data Card readers need to make, and how much detail your Data Card should include, answer the question for each of the following categories:

  • Readers: Who is the primary audience?
    Example: Production software engineers.
  • Decisions: What decisions will they make about the dataset?
    Example: Should I use the dataset to test a machine-learning (ML) model that's in production?
  • Goals: What do they want from the Data Card?
    Example: Give me an overview of the dataset, and tell me how it's implemented.
  • Relevance: What specific content do they need from the Data Card to meet their goals?
    Example: Intended and unsuitable uses, past use, and results on past models.
  • Nuance: Given what you know about the reader, how detailed or nuanced must your content be?
    Example: Highly nuanced, with an emphasis on technical use and usability for integration into production systems.
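
If you track several audience groups, it can help to capture each row of this table in a consistent structure. The following is a minimal Python sketch, not part of the Data Cards method itself; the ReaderProfile name and its fields are hypothetical and simply mirror the table's columns:

```python
from dataclasses import dataclass

# Hypothetical structure mirroring the planning table's columns.
@dataclass
class ReaderProfile:
    readers: str    # Who is the primary audience?
    decisions: str  # What decisions will they make about the dataset?
    goals: str      # What do they want from the Data Card?
    relevance: str  # What specific content do they need to meet their goals?
    nuance: str     # How detailed or nuanced must the content be?

profiles = [
    ReaderProfile(
        readers="Production software engineers",
        decisions="Should I use the dataset to test an ML model in production?",
        goals="An overview of the dataset and how it's implemented.",
        relevance="Intended and unsuitable uses, past use, results on past models.",
        nuance="Highly nuanced; emphasis on technical use and integration.",
    ),
]
```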

You can use your table to evaluate your Data Card and ensure that your high-priority readers find it useful. While there are many approaches to evaluating a Data Card, one that we recommend involves rating the severity of usability issues.

While precise definitions vary, the following severity scale rates how broken something is and the impact of the issue, without regard to prioritization. In this context, the scale rates the usability of your Data Card: usability issues that go unaddressed can erode both the trust that readers place in the Data Card and its overall usefulness.

  • To evaluate how useful the current state of your Data Card is for each audience group in your table from earlier, answer the questions in the following severity scale:

  • Violation: What answers aren't useful for the reader?
  • Severity: How urgently should this be fixed on a scale of 1 to 5? Select the checkbox that applies:
    • ☐ 1 = Catastrophic. Fix this before the Data Card is released.
    • ☐ 2 = Major problem. Important to fix and given high priority.
    • ☐ 3 = Minor problem. Given low priority.
    • ☐ 4 = Cosmetic problem only. Fix if time allows.
    • ☐ 5 = This isn't a problem.
  • Fix: What's the solution?
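
If you log these evaluations for many readers and questions, a small structure makes it easier to triage fixes. Here's a minimal Python sketch; the Severity and UsabilityIssue names and the example issue are hypothetical, chosen only to mirror the scale above:

```python
from dataclasses import dataclass
from enum import IntEnum

class Severity(IntEnum):
    """The 1-5 severity scale from the table above."""
    CATASTROPHIC = 1    # Fix before the Data Card is released.
    MAJOR = 2           # Important to fix; high priority.
    MINOR = 3           # Low priority.
    COSMETIC = 4        # Fix if time allows.
    NOT_A_PROBLEM = 5

@dataclass
class UsabilityIssue:
    audience: str       # Audience group from your planning table.
    violation: str      # What answers aren't useful for the reader?
    severity: Severity  # How urgently should this be fixed?
    fix: str            # What's the solution?

issues = [
    UsabilityIssue(
        audience="Production software engineers",
        violation="No description of past use on production models.",
        severity=Severity.MAJOR,
        fix="Add a section on past use and results, with model references.",
    ),
]

# Surface the most urgent problems first (lower value = more urgent).
for issue in sorted(issues, key=lambda i: i.severity):
    print(f"[{issue.severity.name}] {issue.violation} -> {issue.fix}")
```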

3. Aiming for just enough

More often than not, one of the following two things tends to happen when you create your first Data Card:

  • Too much information overwhelms readers.
  • Too little information confuses readers.

As the creator of a Data Card, you need to curate and prioritize the information in it. A good transparency artifact provides enough context for readers to gain a clear understanding, and when it can't, it tells them where to go next.

You want to provide information that makes the dataset easy to understand and use. As the complexity of your dataset increases, so does the density of the information and explanations that you need to summarize in your Data Card.

Regardless of your readers' expertise level, anyone can experience information overload, so it's important to calibrate the following:

  • The kind of information that you provide.
  • How much information you offer.
  • The level of detail in it.

Your answers should summarize everything without detailing everything, and reflect the context that readers need to gain insight into your dataset.

Heuristics

We created a set of heuristics that you can use to score the overall experience of reading your Data Card. We view these heuristics as objectives that Data Cards must fulfill to be successful and appropriately adopted in practice and at scale. These objectives and their descriptions follow:

  • Consistent: Data Cards must be comparable to one another, regardless of data modality or domain, so that claims are easy to interpret and validate within the context of use. While deploying a one-off Data Card is relatively easy, we find that teams and organizations need to preserve comparability as they scale adoption.

  • Comprehensive: Rather than being created as a last step in a dataset's lifecycle, a Data Card should be easy to create concurrently with the dataset. Further, the responsibility for completing fields in a Data Card should be distributed and assigned to the most appropriate individuals. This requires standardized methods that extend beyond the Data Card and apply to the various reports generated in the dataset's lifecycle.

  • Intelligible and concise: Readers have varying levels of proficiency, which affects their interpretation of the Data Card. In scenarios where stakeholder proficiency differs, the individuals with the strongest mental model of the dataset become de facto decision makers. Tasks that are more urgent or challenging can further reduce the participation of non-traditional stakeholders in decisions, which are then left to "the expert". This risks the omission of critical perspectives that reflect the situated needs of downstream and lateral stakeholders. A Data Card should communicate efficiently to the reader with the least proficiency, and enable readers with greater proficiency to find more information as needed. The content and design should advance a reader's deliberation process without overwhelming them, and encourage stakeholder cooperation toward a shared mental model of the dataset for decision-making.

4. Score your heuristics

  • To review the answers in your Data Card, use the following scorecard to score each heuristic. At the end, you can tally the overall score of your Data Card, which helps you stay on track. You can also include comments to capture additional context and the action items needed to improve each heuristic.

The scorecard has four columns:

  • Heuristic: Self-score your completed Data Card on each of the following heuristics.
  • Criteria: The criteria that make up each heuristic.
  • Comments: Take special note of areas where the Data Card can be improved.
  • Score: Numbers only; self-score each criterion from 0 to 10.

Intelligible: The design and content of your transparency artifact are effective, relevant, and easy to understand for a majority of expert and non-expert agents.

  • Effective: A majority of agents can obtain appropriate answers to reasonable questions about the dataset or model.
  • Relevant: Explanations, visualizations, and results of analyses included are relevant and actionable for a majority of agents.
  • Understandable: Information can be easily understood by expert and non-expert agents.

Comprehensive: The Data Card makes it easy for readers to understand what the dataset or model is about, how it came into being, and what's important to know before using it.

  • Purposeful: Information that establishes context for the dataset and is helpful to all stakeholders is legible.
  • Complete: Information is coherent and complete, appropriately describing all stages in a dataset's lifecycle.
  • In-depth: Summaries are human-readable for general readers, and link to additional information at greater depth or specification for advanced readers.

Consistent: The Data Card follows platform and industry conventions, and maintains consistency within itself and across other similar transparency cards.

  • Recognizable: Sections are organized in a logical order so that readers can recognize where to find information.
  • Standardized: The Data Card uses industry-standard terms, and describes deviations or customizations where relevant.
  • Clear: The same term means the same concept every time that it's used.

Concise: The design and content of the card reduce vast and complex information into meaningful, digestible bits of relative importance that address the needs of novice and experienced readers.

  • Graspable: The relative meaning and importance of keywords, key-value pairs, and visual summaries are easy to grasp.
  • Glanceable: If and how readers can use the dataset to meet their goals is clear at a glance.
  • Contextual: Background knowledge and context are distilled or abstracted for understanding without sacrificing the nature and nuance of the dataset.

Total score = Total points / 120
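
The /120 total is consistent with scoring each of the 12 criteria (4 heuristics with 3 criteria each) from 0 to 10. As a minimal sketch with made-up scores, here's one way to tally the scorecard:

```python
# Hypothetical scores: 4 heuristics x 3 criteria, each scored 0-10.
scorecard = {
    "Intelligible": {"Effective": 8, "Relevant": 7, "Understandable": 9},
    "Comprehensive": {"Purposeful": 6, "Complete": 7, "In-depth": 5},
    "Consistent": {"Recognizable": 9, "Standardized": 8, "Clear": 8},
    "Concise": {"Graspable": 7, "Glanceable": 6, "Contextual": 7},
}

# Validate that every criterion score is within the 0-10 range.
for heuristic, criteria in scorecard.items():
    assert all(0 <= s <= 10 for s in criteria.values()), heuristic

total = sum(sum(criteria.values()) for criteria in scorecard.values())
print(f"Total score = {total}/120")  # With these scores: Total score = 87/120
```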

5. Thoughtful analysis

We know that data is information about people, cultures, or businesses that's been captured in a structured way for a specific purpose. However, as stated repeatedly, datasets are nuanced, entangled across several dimensions to varying degrees. The analysis that you perform on your dataset therefore offers a window into the thought that went into the dataset itself, and helps readers make sense of its intricacies.

For example, an intersectional analysis of people can explore the combinations of human factors within a dataset to identify potential disproportionate outcomes, such as when a model trained on the dataset performs better for one subgroup than for others. A disaggregated analysis breaks the dataset down by different factors to reveal important patterns for subgroups or marginalized populations that larger, aggregate data typically masks, so that readers can anticipate outcomes.
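
To make this concrete, the following sketch uses pandas to disaggregate a model's accuracy by subgroup. The column names (subgroup, label, prediction) and the numbers are hypothetical:

```python
import pandas as pd

# Hypothetical data: "subgroup" is the factor to disaggregate by,
# "label" is ground truth, "prediction" is a model's output.
df = pd.DataFrame({
    "subgroup":   ["A", "A", "A", "B", "B", "B"],
    "label":      [1, 0, 1, 1, 0, 1],
    "prediction": [1, 0, 1, 0, 0, 0],
})

df["correct"] = df["label"] == df["prediction"]

# The aggregate number masks the disparity between slices.
print("Overall accuracy:", df["correct"].mean())  # ~0.67

# Disaggregating reveals it: A scores ~1.0, B scores ~0.33.
print(df.groupby("subgroup")["correct"].mean())
```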

We find that intersectional and disaggregated analyses (IDA) are effective ways to communicate a range of plausible outcomes under different circumstances in a Data Card, because they establish clear relationships in a dataset. IDA can offer readers vital clues about the representation in your dataset, such as how labels correlate with sensitive entities; about gaps in your dataset, such as a dataset that only contains photographs taken during the daytime; and about relationships between variables that can subsequently cause AI models to learn spurious correlations or pick up on proxies. These analyses become even more useful when they're situated in real-world circumstances that reflect the experience impacted users might have with a product or service that uses your dataset.

For example, the presentation of IDA results in a Data Card helps readers proactively build an intuition about how their ML model performs on subsets—also known as slices—in your dataset. While this requires dataset creators to be more diligent in their analyses of the dataset and its presentation in the Data Card, it can ultimately lead to better product outcomes for stakeholders.

IDA can help readers better intuit how to use your dataset in their models. If you have trouble framing your analyses, work with experts, product teams, and individuals with lived experience. IDA is often rooted in contexts that need to be explained to readers, or that require additional support, so that readers can interpret the analyses appropriately.

6. Analyze your data

To analyze your dataset, follow these steps:

  1. Explore before you begin your analysis. Develop an intuition for the skews and imbalances in your dataset with a tool such as TensorFlow Data Validation (TFDV) or the Learning Interpretability Tool (LIT), and use the results to inform your analysis design (see the TFDV sketch after this list).
  2. Design your analysis carefully. The results of an analysis are heavily influenced by the goals of your evaluation, your access to the expertise and resources needed to conduct the analysis, when and where you conduct the analysis, and the contexts of the AI models in which the analysis is conducted.
  3. Start with factors relevant to your intended use. When you create groups of interest, align on the demographic, sociocultural, behavioral, and morphological factors that can most affect your intended use cases, and then expand from there.
  4. Report; don't comment. Factors and assumptions that affect fairness analyses exist within historically and culturally specific social constructs that are hard to quantify, so beware of adding commentary that may confuse the reader. Instead, provide ways to reproduce your analyses so that readers can calibrate the results in their own contexts.
  5. Plan for the future. Account for additional factors that might appear later by examining the representation in your dataset, keeping values constant across different scenarios, or combining your analysis with a range of values for additional factors relevant to your dataset.
  6. Provide more context for non-reproducible results. If downstream stakeholders can't reproduce your metrics, provide enough context around the analysis for readers to weigh the pros and cons of the dataset; doing so builds trust in the dataset.
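
As referenced in step 1, here's a minimal exploration sketch that uses TensorFlow Data Validation. The CSV path is a placeholder, and the visualizations assume a notebook environment:

```python
import tensorflow_data_validation as tfdv

# Placeholder path; point this at your own dataset.
stats = tfdv.generate_statistics_from_csv(data_location="my_dataset.csv")

# Render per-feature distributions to eyeball skews and imbalances
# (displays inline in a notebook).
tfdv.visualize_statistics(stats)

# Infer a baseline schema that documents expected types, ranges, and domains.
schema = tfdv.infer_schema(statistics=stats)
tfdv.display_schema(schema)
```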

7. Congratulations

Congratulations! You have some ways to provide the right answers in your Data Card. Now you're ready to audit them.