Arsenic in My Muffins (with Kasia Chmielinski)

Show Description

Baratunde knows what is healthy to eat or not thanks to the required nutrition labels on our food. But how do we know the ingredients in the algorithms and AI we depend on are safe to use? Baratunde speaks with Kasia Chmielinski about the Data Nutrition Project, which helps data scientists, developers, and product managers assess the viability, health, and quality of the data that feeds their algorithms and influences our decisions daily.

Show Notes & Actions

Show Transcript

Baratunde Thurston  0:02  

Welcome to How to Citizen with Baratunde, a podcast that reimagines citizen as a verb, not a legal status. This season is all about tech and how it can bring us together instead of tearing us apart. We're bringing you the people using technology for so much more than revenue and user growth; they're using it to help us citizen.

Archival (Baratunde Thurston)  0:34  

I have been working over the past year to try to integrate my own thinking around technology, and last year, I wrote a bit of a manifesto. 

Baratunde Thurston  0:44  

Back in 2019, I was invited to speak at Google IO, an annual developer conference held by Google. They wanted me to share my thoughts on what the future of technology could look like. 

Archival (Baratunde Thurston)  0:56  

I went on a journey to try to understand how all my data existed amongst the major platforms, amongst app developers, and what came out of that was a set of principles to help guide us more conscientiously into the future. 

Baratunde Thurston  1:11  

Now, the first principle of my manifesto is all about transparency. I wanted to understand what was going on inside the apps, behind the websites I was spending all my time on. 

Archival (Baratunde Thurston)  1:23  

When I want to know what's in my food, I don't drag a chemistry set to the grocery store and inspect every item point by point, I read the nutrition label, I know the content, the calories, the ratings. I shouldn't have to guess about what's inside the product. I certainly shouldn't have to read a 33,000-word legalese Terms of Service to figure out what's really happening inside. 

Baratunde Thurston  1:50  

It's pretty simple: we make better decisions about the things we consume when we know what's in them. So, if I'm checking out an app on the App Store, and I see up front that it's gonna harvest my data and sling it on some digital street corner. 

Archival  2:04  

Can I interest you in some data? 

Baratunde Thurston  2:09  

I can ask myself, "hey, self, are you okay with this app harvesting your data and slinging it on a digital street corner?" Then, having asked myself that question, I can decide whether or not to download it. I don't have to hope that it won't screw me over. I can know. Check it out, this nutrition label idea hasn't just existed in the vacuum of my own brain. It's a real thing. There are actual people making nutrition labels in the world of tech.

Kasia Chmielinski  2:40  

In the same way that I walk into a bakery, and I see a cake that's been baked and I might think to myself, "I wonder what's in that cake." We would want the same thing for a data set, where even if you encounter that data set in the wild, you as a data practitioner will think to yourself, "I wonder if this is representative."

Baratunde Thurston  2:59  

Kasia Chmielinski is one of those people. These labels are a little different from what I proposed at Google IO. Their data nutrition labels aren't for consumers like me and you at the end of the assembly line; instead, they're for the people at the very beginning: the data scientists. Now, Kasia's data nutrition labels are an easy-to-use tool to help data scientists pick the data that's right for the thing they're making. We interact with algorithms every day, even when we're not aware of it. They affect the decisions we make about hiring, about policing, pretty much everything; and in the same way that we the people ensure our well-being through government standards and regulations on business activities, for example, data scientists need standards too. Kasia is fighting for standards that will make sure that artificial intelligence works for our collective benefit, or at least doesn't undermine it. 

Baratunde Thurston  4:02  

Hello. How are you feeling right now, Kasia?

Kasia Chmielinski  4:05  

I'm feeling pretty good. Beginning of another week...

Baratunde Thurston  4:08  

Kasia is the co-founder and lead of the Data Nutrition Project, the team behind those labels. They've also worked as a digital services technologist in the White House, on COVID analytics at McKinsey, and in communications at Google.

Kasia Chmielinski  4:23  

Yeah, yeah. So, I've kind of jumped around.

Baratunde Thurston  4:29  

So, why don't you introduce yourself and just tell me what you do.

Kasia Chmielinski  4:34  

My name is Kasia Chmielinski, and I am a technologist working on the ethics of data.  I'd say importantly to me, although I have always been a nerd and I studied physics a long time ago, I come from a family of artists. Actually, the painting behind me is by my brother, there's another one in the room by my mom, so I come from a really multidisciplinary group of people who are driven by their passions, and that's what I've tried to do too, and it's just led me on many different paths.

Baratunde Thurston  5:05  

Where does the interest in technology come from for you?

Kasia Chmielinski  5:08  

You know, I don't think that it's really an interest in technology. It's just that we're in a technological time, so when I graduated from university with this physics degree, I had a few options and none of them really seemed great. You know, I could go into defense work, I could become a spy, or I could make weapons, and that really wasn't so interesting to me.

Baratunde Thurston  5:31  

Was spy really an option?

Kasia Chmielinski  5:35  

Yes. So, you know, I could do that, but I didn't, and none of these were really interesting, because I wanted to make an impact and drive change. I think that was around the early 2000s, and technology was the place to be, that's where you could really have the most impact and solve really big problems, so that's where I ended up. So, I actually don't think that it's really about the technology at all. I think that the technology is just a tool that you can use to make an impact in the world.

Baratunde Thurston  6:05  

I love the way you describe the interest in technology as really just an interest in the world. So, do you remember some of the first steps that led you to what you are doing now?

Kasia Chmielinski  6:17  

So when I graduated, I actually applied to many things and didn't get them, and what I realized I really didn't know how to do at all was tell a story. Coming out of a fairly technical path, I couldn't really make eye contact, I hadn't talked to a variety of people--I mean, I was definitely one of the only people who had my identity in that discipline at that time. I went to a school where the head of the school, at the time, was saying that women might not be able to do science because, biologically, they were inferior in some way. 

Baratunde Thurston  6:50  

Oh, that's nice, very welcoming environment. 

Kasia Chmielinski  6:52  

Oh, yeah, super welcoming. I was studying physics and, at the time, I was female-identified, now identify as non-binary, but it wasn't a great place to be doing science. I just felt like coming out of that, I didn't know how to talk to people, I didn't know what it was like to be part of a great community, so I actually went into communications at Google, which was strange. I went from this super nerdy, very male-dominated place to, kind of, like the party wing of technology at the time, right? So, people who are doing a lot of marketing and communications, and talking to journalists and telling stories, and trying to figure out what's interesting, and how does this fit into the greater narratives of our time. So, while at Google, I got to see inside of so many different projects, which I think was a great benefit of being part of that strategy team. So, I got to work on Core Search, I got to work on Image Search, I got to work on Gmail and Calendar, and I started to see the importance of, first of all, knowing why you're building something before you start to build it, right? There are so many times that I saw a really, really cool product at the end of the day, an algorithm or something technical that was just really cool, but there was no reason that it needed to exist from a person perspective, from a society perspective.

Baratunde Thurston  8:26  

I am relieved to hear you say that. That has been one of my critiques of this industry for quite some time. It's like, whose problems are you trying to solve? So, you are at the epicenter of one of the major companies seeing some of this firsthand.

Kasia Chmielinski  8:40  

Yeah, that's exactly right, and it was endemic. I mean, it just happens all the time, and it's not the fault of anyone in particular. You just put a bunch of really smart engineers on a technical problem, and they just find amazing ways to solve that, but then at the end of the day, you say, "well, how are we actually going to use this?" That would fall to the Comms team or the Marketing team to say, "okay, now, what are we going to do with this?" So, that was one thing, and that's why I actually ended up moving into product management, where I could think about why we want to build something to begin with and then make sure we're building the right thing. So, I got closer to the technology after that job. The second thing that I became aware of is the importance of considering the whole pipeline of the thing that you build, because the thing that you build, its DNA is in the initial data that you put into it. I'm talking specifically about algorithmic systems here. So, one example I have from my days when I was at Google, I actually worked out of the London office, and there was a new search capability, and it was trained entirely on one particular accent. Then, when other people tried to use that, if they didn't have that very specific accent, it wasn't working so well. I really didn't know much about AI at the time. I hadn't studied it, but I realized, you know, bias in, bias out, like garbage in, garbage out. You feed this machine something, the machine is going to look exactly like what you fed it, right? You are what you eat.

Baratunde Thurston  10:09  

We'll be right back.

We use these terms: data, algorithm, artificial intelligence. So before we keep going, I'd love for you to pause and explain what these things are and their relationship to each other. Data, algorithms, artificial intelligence: how does Kasia define them?

Kasia Chmielinski  10:44  

Yeah, thank you for taking a moment. I think that something that's so important in technology is that people feel like they aren't allowed to have an opinion or have thoughts about it, because they, quote unquote, don't understand it. 

Kasia Chmielinski  10:57  

You're right, it's just a definitional issue often. So, data is anything that is programmatically accessible, that is probably in enough volume to be used for something by a system. So, it could be records of something, it could be weather information, it could be the notes taken by a doctor that then get turned into something that's programmatically accessible. There's a lot of stuff, and you can feed that to a machine. I'm really interested in algorithms, because it's the practical way of understanding something like AI. It's a mathematical formula, and it takes some stuff, then it outputs something. So, that could be something like you input where you live and your name, then the algorithm will churn and spit out something like what race or ethnicity it thinks you are. That algorithm, in order to make whatever guesses it's making, needs to be fed a bunch of data, so that it can start to recognize patterns. When you deploy that algorithm out in the world, you feed it some data, and it will spit out what it believes is the pattern that it recognizes based on what it knows. You know, there's different flavors of AI. I think a lot of people are very afraid of the Terminator-type AI. 
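Kasia's description, a formula that is fed data, finds patterns, and then outputs what it believes the pattern predicts, can be sketched in code. The class below is a toy illustration under my own assumptions, not any real AI system: it memorizes which output most often followed each input during training, so its answers can only ever reflect the data it was fed.

```python
# Toy sketch of "feed the algorithm data, it learns the pattern,
# then it outputs what it believes the pattern predicts."
from collections import Counter, defaultdict

class PatternGuesser:
    """Remembers which output most often followed each input in training."""
    def __init__(self):
        self.seen = defaultdict(Counter)

    def train(self, examples):
        # examples: list of (input, output) pairs -- the "food" we feed it
        for x, y in examples:
            self.seen[x][y] += 1

    def predict(self, x):
        # Output the most common label observed for this input,
        # or None if this input never appeared in the training data.
        if x not in self.seen:
            return None
        return self.seen[x].most_common(1)[0][0]

# The model can only ever reflect what it was fed: you are what you eat.
model = PatternGuesser()
model.train([("rain", "umbrella"), ("rain", "umbrella"), ("sun", "hat")])
print(model.predict("rain"))   # umbrella
print(model.predict("snow"))   # None -- never saw it, so it has no pattern
```

The same mechanics explain "garbage in, garbage out": if the training pairs are skewed, the predictions are skewed in exactly the same way.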

Archival (Terminator)  12:22  

I'll be back.

Baratunde Thurston  12:24  

As we should be, because the Terminator is very scary. I've seen the documentary many times, and I don't want to live in that world.

Kasia Chmielinski  12:30  

Yeah, legitimately very scary. So, there's this question of, is the AI going to come to eat our lunch, right? Are they smarter than us in all the things that we could do? That's like generalized AI, or even super AI; we're not quite there yet. Currently, we're in the phase where we have discrete AI that makes discrete decisions, and we leverage those to help us in our daily lives or to hurt us sometimes.

Baratunde Thurston  12:57  

Yeah, data as food for algorithms, I think, is a really useful metaphor. A lot of us out in the wild who aren't specialized in this, I think, we're not encouraged to understand that relationship.

Kasia Chmielinski  13:10  

I agree, and I think the relationship between what you feed the algorithm and what it gives you is so direct, and people don't necessarily know that or see that. What you see is the harm or the output that comes out of the system. What you don't see is all the work that went into building that system. You have someone who decided in the beginning they wanted to use AI; and then you have somebody who went and found the data; and you have somebody else who cleaned the data; and you got somebody or somebodies who then built and trained the algorithm; and then you have somebody who coded that up; and then you have somebody that deployed that; and then you have people who are running that. So, when the algorithm comes out the end and there's a decision that's made, you get the loan, you didn't get the loan. The algorithm recognizes your speech, doesn't recognize your speech, sees you, doesn't see you. People think, "oh, just change the algorithm." Oh, no, you have to go all the way back to the beginning because you have that long chain of people who are doing so many different things, and it becomes very complicated to try to fix that. So, the more that we can understand that the process begins with the question, "do I need AI for this?" Then very quickly after, "where are we going to get the data to feed that, so that we make the right decision?" The sooner we understand that as a society, I think, the easier it's going to be for us to build better AI, because we're not just catching the issues at the very end of what can be a years-long process.

Baratunde Thurston  14:42  

Hmm. So, what problems does the Data Nutrition Project aim to tackle?

Kasia Chmielinski  14:48  

We've kind of talked about them all in pieces. At its core, the Data Nutrition Project, which is this research organization that I co-founded with a bunch of very smart people, we were all part of a fellowship that was looking at the ethics and governance of AI. So, when we sat down to say what are the real things that we can do to drive change, as practitioners, as people in this space, as people who had built AI before, we decided let's just go really small. Obviously, it's actually a huge problem and it's very challenging, but instead of saying let's look at the harms that come out of an AI system, let's just think about what goes in. I think we were maybe eating a lot of snacks, we were holed up at the MIT Media Lab, right? So, we were just all in this room for many, many hours, many, many days, and I think somebody at some point picked up a snack package, and we're like, what if you just had a nutritional label, like the one you have on food, you just put that on a data set? What would that do? I mean, is it possible, right? If it is possible, would that actually change things? We started talking it over and we thought it would. In our experience in data science as practitioners, we know that data doesn't come with standardized documentation. Often, you get a data set, and you don't know how you're supposed to use it or not use it. There may or may not be tools that you use to look at things that will tell you whether that dataset is healthy for the thing that you want to do with it. The standard process would be a product manager or a CEO would come over to the desk of a data scientist and say, "look, we have all this information about this new product we want to sell, we need to map the marketing information to the demographics of people who are likely to want to buy our product or click on our product. Go make it happen." The data scientist goes, "okay," and the person goes, "oh, yeah, by Tuesday." 
The data scientist's like, "oh, okay, let me go find the right data for that." There's a whole world, you just Google a bunch of stuff, then you get the data and you kind of poke around and you think it seems pretty good, then you use it. You build your algorithm on that, your algorithm that's going to determine which demographics, or what geographies, or whatever it is you're trying to do. You train it on that data you found, then you deploy that algorithm and it starts to work in production. No fault of anybody really, but the industry has grown up so much faster than the structures and the scaffolding to keep that industry doing the right thing. So, there might be documentation on some of this data, there might not be. In some cases, we were working with a data partner that was very concerned about how people were going to use their data. The dataset documentation was an 80-page PDF, eight-zero. That data scientist who's on deadline for Tuesday is not going to read 80 pages, so our thought was, "hey, can we distill the most important components of a dataset and its usage to something that is maybe one sheet, two sheets, using the analogy of the nutrition label, put it on a dataset, then make that the standard, so that anybody who is picking up a dataset to decide whether or not to use it will very quickly be able to assess, 'is this healthy for the thing I want to do?'"

Baratunde Thurston  18:04  

It's a novel application of a thing that so many of us understand. What are some of the harms you've seen, some of the harms you're trying to avoid, by the data scientists who are building these services not having access to healthy data?

Kasia Chmielinski  18:20  

Yeah, let's say you have a dataset about health outcomes, and you're looking at people who have had heart attacks or something like that, and you realize that the data was only taken from men in their 60s. If you are now going to use this as a dataset to train an algorithm to provide early warning signs for who might have a heart attack, you're going to miss entire demographics of people, which may or may not matter. That's a question. Does that matter? I don't know. Perhaps it matters what the average size of a body is, or the average age of a body is, or maybe there's something that is gender- or sex-related. You will miss all of that if you just take the data at face value and don't think about who's not represented here.
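The representativeness question Kasia raises (was the data only taken from men in their 60s?) is something a practitioner can check mechanically before training anything. Here is a minimal sketch; the field name "sex" and the 10% threshold are illustrative assumptions of mine, not anything from the Data Nutrition Project:

```python
# Sketch: flag a dataset whose demographic coverage is too narrow.
# The field names and the 10% threshold are illustrative, not a standard.
from collections import Counter

def coverage_warnings(records, field, expected_groups, min_share=0.10):
    """Warn when any expected group falls below min_share of the records."""
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    warnings = []
    for group in expected_groups:
        share = counts.get(group, 0) / total if total else 0.0
        if share < min_share:
            warnings.append(f"{field}={group}: only {share:.0%} of records")
    return warnings

# A heart-attack dataset drawn almost entirely from one demographic:
data = [{"sex": "M", "age": 64}] * 95 + [{"sex": "F", "age": 5}] * 5
print(coverage_warnings(data, "sex", ["M", "F"]))
# -> ['sex=F: only 5% of records']
```

A check like this only catches what is visible in the data; as Kasia notes later, who is entirely absent from collection can't be counted this way.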

Baratunde Thurston  19:05  

I remember an example that I used to cite in some talks: the Amazon hiring decisions.

Archival (News)  19:11  

Amazon software engineers recently uncovered a big problem: their new online recruiting tool did not like women.

Baratunde Thurston  19:20  

It had an automated screening system for resumes, and that system ignored all the women because the dataset showed that successful job candidates at Amazon were men. So, the computer, like garbage in, garbage out, the way we've discussed, said, "well, you've defined success as male, you fed me a bunch of female resumes, that's not success, therefore, my formula dictates they get rejected." That affects people's job prospects, that affects people's sense of their self-worth and self-esteem, that could open up the company to liability, all kinds of harms in a system that was supposed to breed efficiency and help.

Kasia Chmielinski  20:02  

Yeah, that's a great example and it's a very true one. I think that one was pretty high-profile. Imagine all the situations that either have never been caught, or were too low-profile to make it into the news. That happens all the time because the algorithm is a reflection of whatever you've fed it. So, in that case, you had historical bias. So, the historical bias in the resumes that they were using to feed the algorithm showed that men were hired more frequently, and that was success. It also comes down to, in terms of the metrics, how you're defining things. If your definition of success is that someone was hired, you're not necessarily saying that your definition is that person ended up being a good worker, or even if you're looking at the person's performance reviews and saying success would be that we hire somebody who performs well, but historically, you hired more men than women. So, even then, if your success metric is someone who performed well, you're already taking into account the historical bias that there are more men than women who were hired. So, there are all different kinds of biases that are being captured in the data. Something that the Data Nutrition Project is trying to do with the label that we've built is highlight these kinds of historical issues, as well as the technical issues in the data, and that is an important balance to strike. It's not just about what you can see in the data, it's also about what you cannot see in the data. So, in the case that you just called out there with the resumes, you would be able to see that's not representative with respect to gender. Maybe you'd be able to see things like these are all English-language resumes, but what you would not be able to see are things like socioeconomic differences, or people who never applied, or what the job market looked like whenever these resumes were collected. 
So, you'll kind of not be able to see any of that if you just take a purely technical approach to what's in the data. So, the dataset nutrition label tries to highlight those things, as well, to data practitioners, to say, "before you use this dataset, here's some things you should consider." Sometimes, we'll even go as far as to say, "you probably shouldn't use this dataset for this particular thing because we just know that it's not good for that." That's always an option to say don't use it. Right? It doesn't mean people won't do it, but at least we can give you a warning, and we hope that people have the best of intentions and are trying to do the right thing. So, it's about explaining what is in the dataset or in the data, so that you can decide as a practitioner whether or not it is healthy for your usage.

Baratunde Thurston  22:42  

After the break, it's snack time. Come back hungry.

So, I'm holding a package of food right now and I'm looking at the nutrition label, nutrition facts. It's got servings per container, the size of a serving, and then numbers and percentages in terms of the daily percent of total fat, cholesterol, sodium, carbohydrates, protein, and a set of vitamins that I can expect in a single serving of this product. Then, I can make an informed choice about whether and how much of that food stuff I want to put in my body. How much garbage I want to let in. In this case, it's pretty healthy stuff. It's dried mangoes, if you're curious. What's on your data nutrition label?

Kasia Chmielinski  23:42  

Yeah, great question, and now I'm like kind of hungry, I'm like, "oh, is it snack time? I feel like it's snack time." The hardest part of this project, to me, is what the right level of metadata is. So, what are the right elements that we want to call out for our nutritional label? What are the fats and the sodiums in these kinds of things? Because you know that the complication here is that there are so many different kinds of datasets. I can have a dataset about trees in Central Park, and I can have a dataset about people in prison, so we've kind of identified that the harms that we're most worried about have to do with people. Not to say that we are, you know, not worried about things like the environment or other things, but when it touches people or communities is when we see the greatest harms from an algorithmic standpoint in society. So, we kind of have a badge system that should be very quick, kind of icon-based, that says this dataset is about people or not. This dataset includes subpopulation data, so, you know, includes information about race or gender, or whatever status, right? This dataset can be used for commercial purposes or not. We've identified, let's say, 10 to 15 things that we think are kind of high-level, almost like little food warning symbols that you would see on something, like, "it's organic..."

Baratunde Thurston  25:05  

The Surgeon General's warning, right?

Kasia Chmielinski  25:08  

Exactly, so at a very high level we have these kinds of icons, then underneath that there are additional very important questions that we've highlighted, which the people who own the dataset will answer; then, finally, there's a section that says, "here's the reason this dataset was made." There's probably an intended use. "Here are some other use cases that are possible or ways that other people have used it, then here's some things that you just shouldn't do."
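The three-layer structure Kasia describes (quick badges, owner-answered questions, intended and discouraged uses) could be represented as simple structured metadata. The field names below are a hypothetical sketch of mine, not the Data Nutrition Project's actual schema:

```python
# Hypothetical sketch of a dataset nutrition label as structured metadata.
# Every field name here is illustrative, not the project's real schema.
label = {
    "badges": {                                # quick, icon-style warnings
        "about_people": True,
        "contains_subpopulation_data": True,   # e.g. race, gender
        "commercial_use_allowed": False,
    },
    "owner_questions": {                       # answered by the dataset owner
        "collection_method": "survey",
        "consent_obtained": True,
        "update_frequency": "annually",
    },
    "intended_use": "epidemiological research",
    "other_known_uses": ["public-health dashboards"],
    "discouraged_uses": ["individual risk scoring"],
}

def quick_assess(label, use_case, commercial=False):
    """A fast gate a practitioner could run before touching the data."""
    if use_case in label["discouraged_uses"]:
        return "do not use"
    if commercial and not label["badges"]["commercial_use_allowed"]:
        return "do not use"
    return "review full label"

print(quick_assess(label, "individual risk scoring"))   # do not use
print(quick_assess(label, "epidemiological research"))  # review full label
```

The point of the structure is exactly what Kasia says next: a practitioner on a Tuesday deadline can read this in seconds instead of an 80-page PDF.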

Baratunde Thurston  25:33  

So, how do we make this approach more mainstream?

Kasia Chmielinski  25:38  

Mainstream is a tough word, because we're talking about people who build AI, and I think that is becoming more mainstream, for sure, but we're really focused on data practitioners, so people who are taking data and then building things on that data. There's kind of a bottom-up approach. It's very anti-establishment in some ways, and very hacker-culture, so we've been working with a lot of data practitioners to say what works? What doesn't? Is this useful? Is it not? Make it open source, right? Open licenses, use it if you want, and just hoping that if we make a good thing, people will use it. A rising tide lifts all boats, we think, so, you know, we're not cagey about it because we just want better data. We want better data out there, and if people have the expectation that they're going to see something like this, that's awesome. There's also the top-down approach, which is regulation policy, and I could imagine a world in which, in the future, if you deploy an algorithm, especially in the public sector, you'd have to include some kind of labeling on that, to talk about the data that it was trained on and provide a label for that. So, it's kind of a two-way approach, you know?

Baratunde Thurston  26:38  

Yeah, I know, I mean, when I think of analogs, most of us don't know civil engineers personally, but we interact with their work on a regular basis through a system of trust, through standards, through approvals, through certifications; and data scientists are on par with a civil engineer in my mind, in that they erect structures that we inhabit on a regular basis, but I have no idea what rules they're operating by. I don't know what's in this algorithm. You know, I don't know what ingredients you used to put this together that's determining whether I get a job or a vaccination. What's your biggest dream for the Data Nutrition Project? Where does it go?

Kasia Chmielinski  27:18  

So, I could easily say, you know, our dream would be that every dataset comes with a label. Cool, but more than that, I think we're trying to drive awareness and change, so even if there isn't a label, you're thinking about, "I wonder what's in this and I wish it had a label on it." In the same way that I walk into a bakery and I see a cake that's been baked, and I might think to myself, "I wonder what's in that cake, and I wonder, you know, if it has this much of something, or maybe I should consider this, when I decide whether to have four or five pieces of cake." We would want the same thing for a dataset, where even if you encounter that dataset in the wild, someone's created it, you just downloaded it from some repository on GitHub, there's no documentation, you as a data practitioner will think to yourself, "I wonder if this is representative. I wonder if the thing I'm trying to do with this data is responsible, considering the data, where it came from, who touched it, who funded it, where it lives, how often it's updated, whether they got consent from people when they took their data." So, we're trying to drive a culture change.

Baratunde Thurston  28:32  

I love that, and I love the idea that when I go to a bakery, one of the questions I'm not asking myself is, "is that muffin safe to eat? Is that cake gonna kill me?" It literally doesn't enter my mind, because there's such a level of earned trust in the system overall, that these people are getting inspected, that there's some kind of oversight, that they were trained in a reasonable way, so I know there's not arsenic in my muffins. So, this brings me, zooming out a little bit further, to artificial intelligence and the idea of standards, because I'm getting this picture from you that there's kind of a Wild West in terms of what we're feeding into the systems that ultimately become some form of AI. What does the world look like when we have more standards in the tools and components that create AI?

Kasia Chmielinski  29:23  

I think that our understanding of what AI is, and what kinds of AI there are, is going to mature. I imagine that there is a system of classification, where some AI is very high-risk, and some AI is less high-risk, and we start to have a stratified view of what needs to occur in each level, in order to reach an understanding that there's no arsenic in the muffins. So, at the highest level when it's super, super risky, maybe we just don't use AI. This seems to be something that people forget, it's that we can decide whether or not to use it. Would you want an AI performing surgery on you with no human around? If it's really, really good, do you want that? Do you want to assume that risk? I mean, that is dealing with your literal organs, your heart. So, I think that, you know, ideally what happens is you've got a good combination of regulation and oversight, which I do believe in, but then also training and, you know, good human intention to do the right thing.

Baratunde Thurston  30:30  

So, when I think about these algorithms, I think of them as automated decision-makers, and I think they can pose a challenge to our ideas of free will and self-determination, because we're increasingly living in this world where we think we're making choices, but we're actually operating within a narrow set of recommendations. What do you think about human agency in the age of algorithms?

Kasia Chmielinski  30:58  

Whoa, these are the big questions. Well, I mean, I think that we have to be careful not to give the machines more agency than they have, and there are people who are making those machines. So, when we talk about, you know, the free will of people versus machines, it's like the free will of people versus the people who made the machines, to me. Technology is just a tool, and I personally don't want to live in a world that has no algorithms and no technology, because these are useful tools, but I want to decide when I'm using them and what I use them for. So, my perspective is really from the point-of-view of a person who has been making the tools, and I think that we need to make sure that those folks have the free will to say, "no, I don't want to make those tools, or this should not be used in this way, or we need to modify this tool in this way, so those tools don't run away from us." So, I guess I kind of disagree with the premise that it's people versus machines, because people are making the machines and we're not at the Terminator stage yet. Currently, it's people and people, right? So, let's work together to make the right things for people.

Baratunde Thurston  32:16  

Yes. Kasia, thank you so much for spending this time with me. I've learned a lot, and now I'm just thinking about arsenic in my muffins.

Kasia Chmielinski  32:25  

Thanks so much for having me, I've really enjoyed it.

Baratunde Thurston  32:31  

Garbage in, garbage out. It's a cycle that doesn't just apply to the world of artificial intelligence, but everywhere. If I feed my body junk, it turns to junk. If I fill my planet with filth, it turns to filth. If I inject my Twitter feed with hatred, that breeds more hatred. It's pretty straightforward, but it doesn't have to be this way. In essence, Kasia fights to standardize thoughtfulness, and that fills me with so much hope. We're all responsible for something or someone, so let's always do our best to really consider what they need to thrive. If we put a little more goodness into our AI, our bodies, our planet, our relationships, and everything else, we'll see goodness come out. And that's a cycle I can get behind. Goodness in, goodness out. This is just one part of the How to Citizen conversation about data.

Krystal Tsosie  33:40  

Who does data ultimately benefit? If the data is not benefiting the people, the individuals, the communities that provided that data, then who are we uplifting at the cost of others' justice?

Baratunde Thurston  33:55  

Next week, we dive deeper into how data is collected in the first place, and we meet an Indigenous geneticist reclaiming data for her people. See you then.

We asked Kasia what we should have you do, and they came up with a lot, so here's a whole bunch of beautiful options for citizening.

Think about this: like people, machines are shaped by the context in which they're created. So, if we think of machines and algorithmic systems as children who are learning from us, with us as the parents, what kind of parents do we want to be? How do we want to raise our machines to be considerate, fair, and to build a better world than the one we're in today?

Watch Coded Bias. It's a documentary that explores the fallout around MIT Media Lab researcher Joy Buolamwini's discovery that facial recognition doesn't see dark-skinned faces well, and the film captures her journey to push for the first-ever U.S. legislation governing bias in the algorithms that impact us all.

Check out the Privacy Not Included Buying Guide. Mozilla built this shopping guide, which tells you the data practices of the app or product you're considering. It's basically the product reviews we need in this hyper-connected era of data theft, hoarding, and non-consensual monetization.

Donate. If you've got money, you can distribute some power through dollars to these groups that are ensuring that the future of AI is human and just: the Algorithmic Justice League, the ACLU, and the Electronic Frontier Foundation.

If you take any of these actions, please brag about yourself online using the hashtag #howtocitizen. Tag us up on Instagram @howtocitizen. We will accept general, direct feedback to our inbox: Make sure you go ahead and visit because that's the brand new kid in town. We have a spanking-new website. It's very interactive. We have an email list you can join. If you liked this show, tell somebody about it. Thanks.

How to Citizen with Baratunde is a production of iHeartRadio Podcasts and Dustlight Productions. Our executive producers are me, Baratunde Thurston, Elizabeth Stewart, and Misha Euceph. Our senior producer is Tamika Adams, our producer is Alie Kilts, and our assistant producer is Sam Paulson. Stephanie Cohn is our editor, Valentino Rivera is our senior engineer, and Matthew Lai is our apprentice. Original music by Andrew Eapen, with additional original music for season 3 from Andrew Clausen. This episode was produced and sound designed by Sam Paulson. Special thanks to Joelle Smith from iHeartRadio and Rachael Garcia at Dustlight Productions.



Go Deeper

Share Some Feedback

Let us know your thoughts about the episode. What did you learn or what surprised you or challenged you?


Spread The Word

Share what you’ve learned. Knowledge is power! Tag #howtocitizen so we can reshare!