Fairness and Algorithmic Decision Making




Introduction

This course examines the greater context under which the practice of Data Science exists and explores concrete ways issues of fairness surface in the technical work of a Data Scientist. Much of the work of a Data Scientist contributes to decision making processes, either through algorithmic systems or by informing policy. The course will survey frameworks for studying the objectives and impacts of such decisions, paying particular attention to how such decisions affect a diverse population of individuals.

The course will ground, motivate, and contextualize these frameworks in the experiences of individuals and communities impacted most by decision making systems. As participants in the course, we will relate to these individuals through the critical lens of our own experience. Concretely, the course will dedicate significant time to the study of particular histories and contexts of marginalized individuals and communities as a necessary component of any analysis of these decision making systems.


Learning Objectives

At the end of the course, students will be able to:

  • Identify sources and impacts of the ‘limits of measurement’ via reasoning about context, considering qualities lost in the data collection process.
  • Critique the strengths and weaknesses of quantitative concepts that attempt to model commonly held values related to fairness.
  • Compare and contrast the implications of considering different contexts under which a decision making system may be judged (social, legal, business) and describe the impacts such decisions have on the people with whom the system interacts.
  • Write a data-driven analysis of a decision making system, critiquing specific limitations of the analysis, and identifying the values that the quantitative analysis attempts to capture.
  • Create, then audit, a decision making system, while considering and evaluating the implications of choices made during the process on those that the system impacts.

Together, these objectives form a toolkit that students can use to critically analyze the impacts of a decision making system, diagnose potential sources of bias, and constructively discuss the implications of the choices made in terms of values and the harms incurred by others.


Topics Covered

The course will broadly cover the following topics:

  1. Frameworks for understanding the impacts of algorithmic decision making systems and the contexts under which they operate. These include a discussion of distributive theories of justice, discrimination, and harms. We will emphasize connections between these frameworks and the limitations of measurement and classification.
  2. Identifying inequity in data; understanding limitations in such identification.
  3. Identifying sources of bias in the Machine Learning pipeline from the point of view of a data scientist and ways of approaching this bias in practice.
  4. Following the impact of decisions once they are made, including the amplification of bias through feedback loops and the reinforcement of power structures.
  5. Identifying representational harms, with particular attention paid to quantitatively identifying such harms in natural language and images.
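
As a rough illustration of what "identifying inequity in data" (topic 2) can look like in code, the sketch below compares the rate of positive decisions across groups. The function names, group labels, and toy data are hypothetical and are not taken from the course materials; the gap it reports is only one of many possible group-level measures, and the course discusses the limits of such measures.

```python
from collections import defaultdict


def selection_rates(groups, decisions):
    """Return the fraction of positive decisions (1s) within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, decision in zip(groups, decisions):
        totals[group] += 1
        positives[group] += decision
    return {g: positives[g] / totals[g] for g in totals}


def parity_gap(rates):
    """Largest difference in selection rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: one group label and one binary decision per person.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
decisions = [1, 1, 1, 0, 1, 0, 0, 0]

rates = selection_rates(groups, decisions)
print(rates)              # {'a': 0.75, 'b': 0.25}
print(parity_gap(rates))  # 0.5
```

A gap of zero would mean every group receives positive decisions at the same rate; whether that is the right notion of fairness depends on context.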

Course Methods

The methods used in the course regularly incorporate experiential and practical approaches to solidify and expand understanding of the course’s lecture-based, topical material. A few of these methods are outlined below:

  1. Weekly reading responses in which you reflect on your experiences in comparison to the experiences told in the readings.
  2. Engaging in weekly discussion with other students (via reading responses) that encourages understanding others’ points of view.
  3. Focusing on the origins and context of marginalized individuals and groups in the United States that are most likely to experience unfair algorithmic decisions.
  4. Engaging in a focused study of a prominent inequity in San Diego, California, or the United States, including a thorough understanding of the context and evaluation of the measurements (data) used to draw conclusions.
  5. Practicing the work of a Data Scientist, critically examining your decision-making system’s impact on its users while weighing the implications of possible design decisions (paper-2).

Course Expectations

This course is interdisciplinary by design. Assignments necessitate reading papers from the humanities and social sciences, deriving probabilistic/statistical models, and writing code. I assume you have some exposure to all of these areas through DSC 80 and standard exposure to critical reasoning typical of an upper-division course. However, much of the material in this course will require pushing yourself to understand topics outside your comfort zone; a curious, open mind and a healthy amount of perseverance are most important.

The goal in this course is to meaningfully engage with and relate to the material; it’s not necessary to understand every line of your reading. Start by carefully reading the abstract, introduction, and conclusion (skimming the rest). Later, read over the work in its entirety. If you still find a topic difficult to understand, ask your classmates, then reach out to the instructor or TA!

Guidelines for Respectful Conversation

This course will also approach high-stakes and controversial topics. These discussions may occur during lecture, section, among student conversations in breakout rooms, or in peer reviews. Learning to have respectful discussions about such topics is a valuable life skill for a data scientist, who tends to interact with a wide variety of people.

Here are a few guidelines for respectful conversation we will follow in this course:

  • Listen respectfully, without interrupting.
  • Listen actively and with an ear to understanding others’ views. (Don’t just think about what you are going to say while someone else is talking.)
  • Criticize ideas, not individuals.
  • Commit to learning, not debating. Comment in order to share information, not to persuade.
  • Avoid blame, speculation, and inflammatory language.
  • Allow everyone the chance to speak.
  • Avoid assumptions about any member of the class or generalizations about social groups. Do not ask individuals to speak for their (perceived) social group.

(Taken from UMich CRLT).

Attendance

This course will be in-person, as listed in the course catalog. For those who cannot attend lecture, podcasts of the lectures will be available after the in-person meeting. If you participate in the class asynchronously, please watch the lecture within 24 hours of the scheduled lecture to stay up-to-date with the weekly readings.


Assignments

Weekly Reading Responses

Each week, complete the reading assignment found in the schedule and write your response to the given prompt by Monday at 11:59PM.

Upon submission of the reading response, you must complete your assigned peer review by the following Wednesday at 11:59PM (48 hours later). These assignments are graded on a 0-2 scale.

Grade: 20% of Total

Identification of Inequity

Write a paper that identifies and analyzes a potential inequity in the world, using data. You may either replicate a known publication or find a topic yourself. See the assignment.

This project may be worked on in pairs.

Grade: 40% of Total

Algorithmic Audit

In this paper, you will:

  1. build an algorithmic decision making system (in the area you wrote about in paper-1) and
  2. audit the system / study the impacts on the population on which it decides.

The audit may lead to studying different approaches to potentially unfair decisions by the system. Your analysis must cover at least 2 of the topics covered in the second half of the course. See the assignment.
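
One small piece of such an audit can be sketched as follows: comparing an error rate, here the false positive rate, across groups. This is a minimal sketch under stated assumptions, not the required method for the paper; the function names, labels, and toy data are hypothetical.

```python
def false_positive_rate(y_true, y_pred):
    """Share of actual negatives (label 0) that were predicted positive (1).

    Returns None when there are no actual negatives to evaluate.
    """
    preds_for_negatives = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not preds_for_negatives:
        return None
    return sum(preds_for_negatives) / len(preds_for_negatives)


def audit_by_group(groups, y_true, y_pred):
    """Compute the false positive rate separately for each group."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, label in enumerate(groups) if label == g]
        result[g] = false_positive_rate(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
    return result


# Hypothetical toy data: group label, true outcome, and system decision.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
y_true = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0, 1, 0]

print(audit_by_group(groups, y_true, y_pred))  # {'a': 0.0, 'b': 0.5}
```

In this toy example the system wrongly flags half of group b's actual negatives and none of group a's; an audit would go on to ask why, and what harm that difference causes.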

This project may be worked on in pairs.

Grade: 40% of Total


Schedule

Week        | Topic | Assignments
Week 1      | Introduction, Frameworks of Distributive Justice |
Week 2      | Measurement, Data, and Decision Making |
Week 3      | Measuring Discrepancies at the Group Level and relationships to Fairness |
Week 4      | Score Functions, Calibration, and Creating ‘Fair’ Classifiers |
Week 5      | Limits of Parity Measures: intersectionality, infra-marginality |
Week 6      | Fairness and Discrimination at the Individual Level | Paper 1 Due
Week 7      | Bias in the ML pipeline: pre-processing, in-processing, post-processing |
Week 8      | Amplification of Bias and Feedback Loops |
Week 9      | Representational Harms I: Stereotyping; Fairness in Feature Space |
Week 10     | Representational Harms II: NLP, 3rd Party APIs |
Finals Week | | Paper 2 Due

Resources

Course Content

Writing about bias and discrimination

Creating a report

  • Stripping input cells from notebooks for pdf conversion (Code).
    • You may also use the nbconvert command-line tool, with its option for stripping code input.
  • LaTeX is a typesetting language for generating nice pdfs with formulae and plots.
  • R Markdown may be used to produce pdf documents from notebook-type interfaces, as well.
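
To make the idea of stripping input cells concrete, here is a minimal sketch that uses only the Python standard library. It is not the code linked above: it assumes the notebook has already been loaded as a dictionary in the standard `.ipynb` JSON layout (version 4), and the helper name `strip_code_inputs` and the example notebook are hypothetical. It empties the source of each code cell while keeping its outputs. On the command line, nbconvert's `--no-input` flag achieves a similar effect during conversion.

```python
import json


def strip_code_inputs(notebook):
    """Return a copy of a notebook dict with code-cell source removed.

    Outputs (plots, tables, printed text) and markdown cells are kept.
    """
    stripped = json.loads(json.dumps(notebook))  # deep copy via JSON round trip
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["source"] = []
    return stripped


# Hypothetical two-cell notebook in the nbformat 4 layout.
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Findings"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 1,
            "source": ["print(2 + 2)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["4\n"]}
            ],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

report = strip_code_inputs(notebook)
print(report["cells"][1]["source"])              # []
print(report["cells"][1]["outputs"][0]["text"])  # ['4\n']
```

Note that converters may still render an empty input area for each emptied cell, so an exporter-level option is usually the cleaner route for a final report.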

Accessibility, Disability, and Remote Learning

I aim to create an environment in which all students can succeed in this course. If you are experiencing obstacles to learning and engagement in the course at any point during the quarter, don’t hesitate to contact me. If you need an accommodation for whatever reason, I will try to work with you to realize the accommodation in some respect.

If you are requesting accommodations for this course due to a disability, you must provide a current Authorization for Accommodation (AFA) letter issued by the Office for Students with Disabilities (OSD). Students are required to present their AFA letters to Faculty (please make arrangements to contact me privately) and to the DSC Student Advisor in advance so that accommodations may be arranged. Contact the OSD for further information: 858.534.4382 (phone) osd@ucsd.edu (email) http://disabilities.ucsd.edu (website)

If you have feedback on how to make the class more accessible and inclusive, please let me know.


Diversity and Inclusion

We are committed to fostering a learning environment for this course that supports a diversity of thoughts, perspectives and experiences, and respects your identities (including race, ethnicity, heritage, gender, sex, class, sexuality, religion, ability, age, educational background, etc.). Our goal is to create a diverse and inclusive learning environment where all students feel comfortable and can thrive.

Our instructional staff will make a concerted effort to be welcoming and inclusive to the wide diversity of students in this course. If there is a way we can make you feel more included please let one of the course staff know, either in person, via email/discussion board, or even in a note under the door. Our learning about diverse perspectives and identities is an ongoing process, and we welcome your perspectives and input.

We also expect that you, as a student in this course, will honor and respect your classmates, abiding by the UCSD Principles of Community (https://ucsd.edu/about/principles.html). Please understand that others’ backgrounds, perspectives and experiences may be different than your own, and help us to build an environment where everyone is respected and feels comfortable.

If you experience any sort of harassment or discrimination, please contact the instructor as soon as possible. If you prefer to speak with someone outside of the course, please contact the Office of Prevention of Harassment and Discrimination: https://ophd.ucsd.edu/.


Academic Integrity

In this course we expect students to adhere to the UC San Diego Integrity of Scholarship Policy. This means that you will complete your work honestly, with integrity. Some examples of specific ways this policy applies to DSC 167 include:

  • All writing must be your own (or that of your group partner).
  • Thoroughly cite all ideas and work that are not your own.
  • Any code that you write must be developed by you.
  • Discussing concepts among classmates is encouraged! However, any work derived from such discussion should come from only you, after a healthy pause to digest the conversation.