Wharton’s Stephanie Creary speaks with Dr. Broderick Turner — a Virginia Tech marketing professor who also runs the school’s Technology, Race, and Prejudice (T.R.A.P.) Lab — and Dr. Karim Ginena — a social scientist and founder of RAI Audit — on how to use AI while thinking critically about its flaws.
This episode is part of the Leading Diversity at Work series. Read an article about this episode here.
Transcript
Stephanie Creary: Hello, my name is Stephanie Creary, and I’m an Assistant Professor of Management at the Wharton School at the University of Pennsylvania. I’m delighted to welcome you to today’s episode of the Knowledge@Wharton Leading Diversity at Work podcast series, which is focused on responsible and fair AI in the workplace and the marketplace. Joining me today are two very special guests. First we have Dr. Broderick Turner, who is an Assistant Professor of Marketing at Virginia Tech’s Pamplin College of Business and a Visiting Fellow at the Harvard Business School’s Institute for Business and Global Society. He also runs the Technology, Race, and Prejudice Lab, or the TRAP Lab, where he and his fellow TRAPpers are pushing the boundaries on understanding how race and racism underlie many consumer and managerial decisions. His main research area focuses on the intersection of marketing, technology, racism, and emotion.
Next we have Dr. Karim Ginena, who is the founder of RAI Audit and an AI governance and research consultant with sixteen years of experience in ethics and governance spanning both industry and academia. He recently served as the founding user experience researcher on Meta’s Responsible AI team, leading AI fairness user research at the company. He has helped Meta’s RAI team scale its products across the company and bolster the responsible adoption of AI. Dr. Ginena holds a PhD in Management from the University of Virginia’s Darden School of Business, specializing in Business Ethics and Organizational Behavior. And I just realized we’ve got a Virginia connection across the two of you here. Who would have thought that that would happen?
But in any case, welcome Broderick and Karim. I am so honored to have you with me today for our conversation on responsible and fair AI in the workplace and in the marketplace. So we’re hoping to cover a lot of interesting ground here today.
I talked to Broderick and Karim about this earlier — I like to think of myself as a novice on these topics, so today I’m going to represent the average consumer or worker who doesn’t have a lot of experience talking about AI, other than what I read about and hear about on mainstream media and from my colleagues. So we’re hoping today that this conversation will be helpful to others of you who are just like me, who are trying to figure out how relevant a conversation about responsible and fair AI is to us as workers, but also as consumers.
What Is Responsible and Fair AI?
So let’s get started talking broadly about responsible and fair AI. Broderick, I’m going to go to you first, and what I’m hoping you can do is share a bit about how and why you became involved in this topic, and summarize for us some of your current work related to this topic.
Broderick Turner: Yes, I got involved in this topic area honestly more than a decade ago, when I was teaching high school math at a public high school in Atlanta, Georgia, where 90% of my students were Black. What I wanted to do while I was there was figure out ways to make those kids’ lives better, fairer, more equitable. And simultaneously, I used to teach linear regression, and students would ask me, “Mr. Turner, when am I ever going to use this?” And now I have an answer. The answer is now.
So I’ve moved into a space where I’m thinking a lot about how race and racism underlie market systems, and technology is a system that matters. Our research group, the Technology, Race, and Prejudice Lab, is doing active research now on what levers can be moved to lead to more equality and to better outcomes in technology systems. This has extended into some company advisory work, where we’re trying to get companies to consider that moving communities earlier into the development process leads to better products. And then on the public knowledge side, we’re writing white papers on how different identity groups may be impacted by generative AI, for instance.
We just finished a report for Hispanic Heritage Month on how folks with Hispanic origins and backgrounds should be thinking about generative AI. How do they fit into this interlocking system of data, classification, and code?
Creary: Thank you so much. Karim, same question to you. Can you share a bit about how and why you became involved in this topic around responsible and fair AI and summarize some of your current work on this topic?
Karim Ginena: Yes, absolutely. First, thank you so much for inviting me. I’m excited about this dialogue with you and Broderick, and I love the Virginia connection. Virginia is well-represented here today. I’ve been involved in ethics and governance for quite some time, since before pursuing my PhD, so I’m not surprised that I ended up in AI governance. What I’m a little bit surprised about is going back to industry after my PhD. Initially, when I started my PhD program, I had no intention of going back to industry, but in the last year of my PhD, which was my sixth year, I received offers from both academia and industry, and I went back to industry.
So firstly, as someone who belongs to a minority group, I was very passionate, and still am, about having a front seat to these conversations that are happening in tech and helping shape the technology using the latest and greatest research, but also interacting with product, policy, and legal teams and so forth to be able to do that.
The other reason that I chose to go back was, as I was thinking through my impact, to be honest: joining a company like Meta’s Responsible AI team, where the company has over 3 billion users, it’s very difficult to argue with the extent of the impact that you can have. Any product changes that you make really have an impact on a lot of people worldwide.
And lastly, honestly, one of the concerns that I’ve had about academia was being stuck in a small college town somewhere, where there’s not very much diversity. That would not be appropriate for bringing up my kids. So that was a very strong concern: not having much autonomy about where I live or where I work. It drove me back to industry.
As far as my research, much of what I do for clients is obviously confidential, but on the public side of things, I’ve been testing out some popular AI image generators, and I’m finding a few patterns. Most of the images being generated, at least by the generators I’ve tested, tend to be of white people. People of color are greatly overlooked. These models tend to associate a professional headshot, a professional dress code, or a professional hairstyle with white people. Women are underrepresented, and while presented as professionals, they are often limited to gendered occupations such as teachers, nurses, and graphic designers. They’re much less likely to appear as CEOs, for example, or medical doctors or lawyers. The images also skew toward the younger side. Senior managers, for example, are associated with having great hair, which is a sign of ageism.
And finally, these models have portrayed certain populations in demeaning ways. For example, if you ask DALL-E on Bing to portray Turks, it will give you a picture of a stern turkey dressed in a turban, whereas if you ask it to produce an image of an American or a French person, it will do a much better job, right? These models might also refuse to produce images of certain populations for no good reason, refusing, say, to produce an image of a Muslim or a Jew while readily presenting followers of other religions.
So definitely, this technology is extremely powerful. I am not a doom-sayer about the technology per se, but I believe that we have to be prudent and do our due diligence in order to direct the trajectory of this technology in ways that maximize the benefits while minimizing the harms.
Creary: Absolutely.
Turner: My favorite example of that generative AI art is when people write prompts for Jesus in the Temple flipping the tables, and they literally get Jesus doing gymnastics, doing a back flip over a table. [LAUGHTER] So it has no meaning. It’s just funny.
Creary: Absolutely. So we’re definitely going to dive into that, because there’s a lot to unpack there, and certainly the implications of this. From where I sit, I can automatically connect those dots, but as we begin to talk about some of these challenges, it’s important to understand what could possibly go wrong when you’re not represented, and how these models are being trained. To me, that’s an obvious answer, but I don’t think it’s obvious to everybody. It’s what happens when you lack representation in the data. Or when you’re misrepresented, right? Who you are and the groups that you’re part of are misrepresented in how the model is being trained.
So we’re going to talk about that in a minute, but let me just — and partly this might be an answer to your question here, Karim. Let’s start with businesses and employers first. I have some sense from my own work that businesses and employers are likely struggling to understand their role in conversations not just about AI use, but about fair and responsible AI use.
And so obviously without saying anything confidential, can you talk generally about how you, through your work, are helping businesses and employers make sense of their roles and responsibilities on this topic?
Ginena: Yes, I think it’s important to be clear that companies are accountable for how their AI systems operate, and that responsibility can’t be off-loaded, right? It can’t be treated as an externality of some sort. They have obligations to their customers, and it’s just a matter of time before legislation comes into effect. Obviously we now have the EU AI Act; it’s just a matter of time before enforcement happens, and there’s a lot of talk about legislation in Congress and in the states. Honestly, if companies want to build successful products in this AI era, they must invest in responsible AI infrastructure, right? Doing so requires a commitment from the board of directors and from senior leadership, but ultimately it pays off in terms of earning customer trust and confidence, right?
It just doesn’t cut it for you to actually have an AI strategy without thinking through the trust layer or the responsible layer — issues of fairness, of privacy, robustness, and so forth. Because if you’re a product manager, for example, you try to produce the best products. No one really wants to deal with a company whose products are breaching their privacy obligations and things like that. If you’re using personal data of your users to train your models, or if your products don’t work for certain segments of the population or expose your users to specific types of harms — cutting corners like that really does a disservice to your stakeholders, of course. But it also exposes you to legal and reputational risk. And it’s just a bad way of doing business. It’s just a matter of time before better companies are able to gain more market share because they’re taking this more seriously and investing in setting up the infrastructure.
Creary: Yes, a lot of what I hear you saying is that sometimes it’s easy to think that ethics and questions of responsibility aren’t the job of a corporation. But the strong stance you take, certainly through your own dissertation work and your own scholarship, is that these are not nice-to-haves. They are must-haves. And while it shouldn’t always come down to this, we do have the legal system, which is set in place to help adjudicate these issues when they do feel like a grey area, when it’s not black and white with respect to businesses and their responsibility to ensure that they’re not inducing harm through their activities.
Broderick, certainly I know that you do work with companies, as well. I would also like you to put on your researcher and educator hat and help us understand the nature of this conversation from a scientific perspective, and also in the work that you’re doing as a professor and as part of your fellowship. I’m actually thinking a lot about ChatGPT these days, certainly because, as an educator, this is where I hear the most conversation: should students be able to use ChatGPT to complete their assignments? And it seems to me like it’s such a polarizing set of issues.
So I am just curious, as you think about the places where you sit, Broderick, how you’re thinking about the issue around businesses, employers, and education’s role in this conversation?
Turner: So I’m thinking a couple of thoughts, right? First, to piggyback on Karim: when I talk to both students and businesses, I think of myself as an educator no matter where I am, and I tell them the same thing, which is: if you get this wrong, if you don’t include a wide swath of human beings in the creation of your technology products, then when you fail — because it’s not an “if” — you will lose money, because you’ve spent all of this money on development without considering the human beings at the end. And when the product gets released, it fails again because people don’t use it.
So as a really simple example, those automatic faucets where you put your hand under it did not include darker, melanated people in the data set, in the training set for when they were testing out this automatic hand washer. And so while it sold fine in North America, when it went further out into places closer to the equator, where people are browner, regardless of ethnic origin, they didn’t sell any because their product did not work. They would install it, and people would put their hands under the thing. And I know if you’re a Black or a brown person, you have done this at the airport bathroom. Enough said, right? But if your country is majority melanated, then no one bought them.
So had they included those folks earlier in the development process, all that money they spent on development would have been worth it, because they could have sold more products. So when I’m talking to my students and business leaders, who in some ways are my students, I go, “Look, this is why we do it. I’m going to speak to the same incentives that you have: You want to make more money. You want to keep your job. You want to do right by your shareholders. Then you need to move the process or move the people earlier into the development of this process.”
Now in terms of this question around ChatGPT and the way that people are using this inside and outside the classroom, we developed in the TRAP lab these three questions, called the 3-D model of equitable tech adoption. I’m going to share those questions with you, and this will help people understand what do I do with any tech that comes into my business or into my classroom? Those three questions. Question 1: What does this product actually do? Not what you want it to do, not what it might do in the future, not what it could do if we had a billion more hours of compute, but what does it actually do today?
Second question is who does this product disempower? Every technology will increase power for some and decrease power for others. So ask yourself: Who does this disempower?
And the third question is: What is the daily use of this product? Not the edge case. Don’t tell me that we’re going to — I’ll give you a pertinent example for Hispanic Heritage Month. There is some question that maybe we should include in facial recognition whether or not this person is Hispanic, or this person is Latino. And the sales pitch for this is that this will help us when we do the census, to identify who is Hispanic or Latino, so we have a better count.
Now I can get into the research end of this and why it’s problematic, or some of the philosophy of how this goes awry, but let’s talk about that case. They’re saying, “We’re going to use this for the census.” The census happens every ten years, right? There is no government service that’s going to buy this expensive technology and then only use it once every ten years. So then what’s the most likely daily use case for technology that might say whether this person is Hispanic or not? Are they going to install it at borders? Probably. Would they install it where you hand over your passport? Probably. All right, and is that problematic? Definitely.
And so this is how I’m thinking about these things, and in the case of ChatGPT, your students should ask themselves, “What does this actually do? Is this writing papers, or is it just creating plausible-sounding sentences?” And if it’s just creating plausible-sounding sentences, that could get real bad real fast, if that plausibility has no relationship to accuracy. So feel free to use it and feel free to fail. That’s on you. That’s where I come down on it.
The Top Issues Facing the Future of AI
Creary: So you did a great job, both of you already, of raising some of these key challenges, and I’d like to explore these more. Certainly we don’t have time for Top 10s, and I’m sure that there are 10 on each of your lists, but let’s think about this in terms of the top two or three issues from where you sit. So Broderick, let’s go back to you. You can include those that you’ve already talked about or invite others into the conversation.
What would you say, from where you sit, are the top two to three issues with respect to responsible and fair AI in the workplace and the marketplace?
Turner: Two things. The number one issue is that the human beings who end up using the AI, the users of machine learning or of this generative AI, aren’t necessarily the people who are included in the data. They aren’t necessarily the people who are included in the classification of the data, and they definitely aren’t the people who are setting the rules and doing the coding of the data. And so the most pressing issue is to get those people into those spaces, right? To get representative data, to get representative classification of that data, to get representativeness among the actual coders who decide the rules for what we see. That way, the stuff that actually comes out will be closer to fair and equitable, because those people will have been in the room. So that’s one.
The second thing is that all of my research in this space is really about demystifying this black box. There is nothing magic going on inside of these statistical models. They are statistical models. And if you learned y = mx + b in high school, then you have the building block that you need to understand how these systems work. We can talk about this if you ever come hang out in the TRAP Lab, but trust me: if you know y = mx + b, then you too can start to understand that it’s not magic going on inside these systems, just a bunch of opinions around commoditized human labor in the data, the classification, and the code.
Creary: It’s interesting, we’ve sort of hinted around language here. I find myself personally feeling like it does sound like magic, because the language of AI and of the technology that we’re using to talk about these things sounds very foreign to me. And I think that is what gives it its mystique. If I can talk about it in a way that doesn’t coincide with the lay terminology that regular people use, then it sounds inaccessible to me, right? Is that a power move? I don’t know. But language is powerful, and language can be used as a way to exclude. I think what you’re helping us understand, Broderick, is that many of us learned y = mx + b; that’s a language we spoke at some point. It’s about giving people a grounding for understanding this new technology through something that they already understand, language that they already possess.
Then I think the mystique around this disappears. The power differential between the creators of the technology and the consumers and employees who are trying to figure out what this means for them — that starts to shrink, as well. Is that along the same lines as what you’re suggesting, Broderick?
Turner: That’s exactly it, right? You hear the term — ChatGPT says they have a billion-parameter model — and you go, “Oh, those are some big words. ‘Billion,’ ‘parameter,’ what does that mean?” Let’s explain it, right? Let’s break it down real quick. So let’s go back to y = mx + b. B is, call it our y-intercept or an error term; we’re not going to worry about that. We’re just going to focus on y = mx. Y is an output. Every computer, every machine, they’re all based on the Turing machine. Nothing really changed. They can only do what they’ve been told to do. Y is going to be the output that comes out, right? That could be plausible-sounding sentences. That could be art of Jesus flipping over a table — whatever.
And then m and x are what matters. X is inputs, so if I’m doing a system that’s going to decide whether or not someone gets paroled, for instance, those x’s might be zip code — problematic. Those x’s might be past criminal history. Those x’s might be height. Those x’s could be anything.
Then the m is what matters. The m is essentially the slope of that line. We learned this in tenth grade, but that slope is an opinion. It is some developer’s opinion, or if it’s unsupervised, it’s still some developer’s opinion on how much that x matters. How much does it matter that I live in this zip code? How much does it matter that I’m 6’6″? How much does it matter that my name is Broderick?
And then that opinion gets added in, so each one of those mx’s — we can call that a parameter. So if it’s a billion things, that’s a billion opinions, but they all get combined and come out as this output y. That’s it. We have now learned machine learning. Congratulations, everybody. Pat yourselves on the back. If you learned y = mx + b, you too can start to understand what’s going on in these systems.
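To make the billion-opinions point concrete, here is a minimal sketch in Python of Turner’s y = mx + b framing. The feature names, weights, and numbers are hypothetical illustrations, not taken from any real model or from the episode.

```python
# A minimal sketch of the y = mx + b framing. The inputs (x's),
# weights (m's), and intercept (b) below are hypothetical illustrations.

# x: the inputs someone chose to include
inputs = {
    "zip_code_risk": 0.8,    # a zip code encoded as a number is already an opinion
    "prior_offenses": 2.0,
    "height_inches": 78.0,
}

# m: one weight per input. Each weight is an opinion (a developer's, or the
# training process's) about how much that input should matter.
weights = {
    "zip_code_risk": 1.5,
    "prior_offenses": 0.7,
    "height_inches": 0.01,
}

b = -2.0  # the intercept, a baseline opinion

# y = (m1 * x1) + (m2 * x2) + ... + b
y = sum(weights[name] * value for name, value in inputs.items()) + b
print(f"output y = {y:.2f}")

# A "billion-parameter" model is this same pattern with a billion weights
# (stacked in layers), but the building block is still y = mx + b.
```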
You can do like Karim or myself and run audits, where you basically test these systems with a bunch of stimuli to see what comes out, to explain how this thing maybe leads to inequality because of this weirdness, right? So let’s just demystify the whole thing. It’s not magic. I’m not pooh-poohing my computer science folks out there, but we can share a language that, inside of this increasing complexity, comes down to a pretty simple building block. And you know the building block already if you made it through high school. And our Wharton grads and our MBAs and undergrads, I know that you learned y = mx + b.
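And here is a minimal sketch of the kind of audit Turner describes: feed the system matched stimuli that differ in only one attribute and see what comes out. The score_application function is a hypothetical stand-in for whatever black-box system is being tested, not any real product.

```python
# A minimal audit sketch: send matched stimuli that differ in only one
# attribute and compare what comes out. score_application is a hypothetical
# stand-in; a real audit would call the deployed system being tested.

def score_application(app: dict) -> float:
    return 0.5 + 0.3 * app.get("zip_code_risk", 0.0)

baseline = {"income": 55_000, "prior_offenses": 0, "zip_code_risk": 0.1}
variant = dict(baseline, zip_code_risk=0.9)  # identical except for the zip code

gap = score_application(variant) - score_application(baseline)
print(f"outcome gap from changing only the zip code: {gap:+.2f}")

# A consistent, unexplained gap across many such matched pairs is the kind
# of "weirdness" an audit is designed to surface.
```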
Creary: I have to tell you, it has been a long time since I learned y = mx + b. I’m not sure my high school teacher was as great as you are at explaining it, but I feel like in the last three minutes I’ve gained a much better understanding of what people are trying to suggest is sophisticated, which often feels inaccessible. I think you’ve made me believe, and I’m sure others as well, that we, too, can understand what the big deal here is and assess for ourselves: Is it a big deal, or is it just more of what we already know?
Karim, let me turn to you, and let’s get your top two to three issues. I don’t know how much they overlap. I know earlier you spoke about issues around representation in the data. I’m not sure if that’s one of your top two or three issues or if you have others. Can you share with us from your vantage point, what are some of the key challenges?
Ginena: Yes. First, I agree with Broderick that transparency is a major issue: this black box problem, trying to understand what companies are doing. Are companies being transparent about how they’re training base models and what data sets they’re using? Are they explaining to their users what is happening behind the scenes? Obviously there is an optimal level of transparency as well; beyond a certain point it gets too in-depth and the average user zones out. You don’t want to inundate your users with so much technical detail that it becomes irrelevant to them.
So for me, the first pressing challenge is addressing unfairness in AI systems. If your training data, as Broderick mentioned, is missing certain populations, or if your data is mislabeled, that can obviously give rise to bias and have adverse effects on certain segments of the population. This becomes particularly problematic in healthcare, employment, and criminal justice, where these decisions are consequential for people. So obviously, if these issues of bias are left unaddressed, they can perpetuate unfairness in society at a very high rate. We’re not just talking about your prototypical kind of bias; we’re talking about bias at an exponential rate with these automated decision systems, which is why they can be very dangerous.
The second problem I see is what is called the “hallucination problem,” which is pretty much making things up. As Broderick mentioned, these LLMs learn to predict the next word or phrase in a sentence, so they can misrepresent facts. They can also tell you a story with a very good narrative that is extremely plausible but misleading. And I think this is a particularly big problem, given that we as human beings have an automation bias, a propensity to favor suggestions from an automated decision system and to ignore contradictory information that we might know. We just defer to the automated decision system.
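A toy sketch of the next-word prediction Ginena is describing may help. Real LLMs use neural networks trained on enormous corpora, but the core mechanism is the same: pick a likely next word, with no built-in check that the resulting sentence is true. The tiny corpus below is invented for illustration.

```python
# A toy next-word predictor (a bigram model). It strings together a fluent,
# plausible-sounding continuation without verifying that it is true.

from collections import Counter, defaultdict

corpus = (
    "the ceo announced record profits . "
    "the ceo announced a merger . "
    "the ceo announced record growth ."
).split()

# Count which word tends to follow which.
next_words = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_words[current][following] += 1

def continue_text(word: str, length: int = 5) -> str:
    out = [word]
    for _ in range(length):
        options = next_words.get(out[-1])
        if not options:
            break
        out.append(options.most_common(1)[0][0])  # pick the most probable next word
    return " ".join(out)

# Fluent and plausible-sounding, but nothing here verifies that any CEO
# actually announced anything.
print(continue_text("the"))
```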
The third one, I would say, is data privacy and security concerns. This involves things like unauthorized access to data; scraping of data that the company or the LLM might not have consent to use; and data leakage, for example, where a large language model might reveal information that a user had put into it. We have cases like that coming up in the media, where Samsung, for example, restricted the use of ChatGPT by its employees because the LLM was revealing some of its trade secrets. Things like malicious use and manipulation attempts by nefarious actors are all very serious concerns, as well.
Tackling AI Privacy Concerns
Creary: As I listen to you both talk about these concerns, it raises the concerns that I’ve had, again, just as an employee and certainly as a consumer. Something that came across my social media feed recently: in the last couple of weeks, a class action lawsuit was filed in the Northern District of California against OpenAI, the maker of ChatGPT, and the Microsoft Corporation. One of the big areas included in this lawsuit is invasion of privacy. It’s pretty compelling. Again, I’m not a lawyer, and none of us knows how this is going to pan out, but just looking at it, the extent to which people don’t understand how their data is being accessed and utilized without their permission can be concerning. It was a nice tutorial, if you will, for me in the privacy concerns around this topic, and I would encourage people to take a look at it.
But I’m going to come back to the two of you. If you’ve had a chance to look at anything in relation to this class action lawsuit, I’m wondering whether you’re surprised by it, or whether you just thought it was a matter of time before it happened? Any thoughts on that, Broderick? And then I’ll come back to you, Karim.
Turner: Yes, so clearly it was only a matter of time. We think about, again, what do these systems actually do? So these large language models train on previous data. And “training” just means that they take in a bunch of data so they then connect it together. But whose data did they take? Where did that data come from?
Now there is some indication that when you’re training on a huge corpus of data, that you’re going to tend towards the cheapest and freest sources. This is why we get a lot of weirdness that comes out of these systems in terms of bias, because they’re trained on the free parts of the internet a lot of the time. And what is overrepresented in the free parts of the internet, Stephanie?
Creary: The free parts of people who — I’m trying to think about social media as an example, as a source of data. And I’m thinking about who the users might be. I’m thinking about — I’m not sure if this is where you’re going — but I think about all the young people who use social media and put all of their information in.
Turner: So we have them. We have an overrepresentation of younger folks, right? So we’re going to have weird age-related things that don’t really exist in the data. The other thing that gets overrepresented on the internet, on the free parts, at least, is propaganda. Some websites where they’ll say really negative things about presidents, for instance, are free, whereas if I go to, I don’t know, The New York Times, I can read four articles before I hit a paywall, and they’re like, “No, sir, you’re done.”
And so there’s going to be some weirdness in the data from that. The other thing that’s free on the internet, or overrepresented on the internet, is pornography. So if I’m taking porn and propaganda into these systems, then I’m going to get all types of weirdness out. And the only way to improve these things is to get access to data that has accuracy, that has time put into it.
If you’re going to read one of my research articles, Stephanie, how much does it cost if you’re not affiliated with a school? Like thousands of dollars, right? And so if you’re one of these companies, and you want to improve your data, how do you do it? Do you spend thousands of dollars per article from Stephanie Creary, or do you steal it?
Creary: Right, is that a rhetorical question?
Turner: I really can’t say, but I’ll answer the question. The answer is you’re probably not going to spend thousands of dollars per article, per professor if, instead, you can steal it. And this goes for any “better data,” because I need a bunch of it to get to some version of — and it’s not accurate — but a better distribution of data. And so, yeah, of course they’re getting sued because to make the system better, they had to take this stuff in. We do have intellectual property laws. Somebody is going to have to pay, or they’re going to have to change the law.
Creary: Yes. Karim, do you want to chime in on this conversation before we move to the “how do we fix it?”
Ginena: Yes, absolutely. I think lawsuits are just going to mushroom from here onwards. It’s just a matter of time, honestly. Every day, we’re hearing about another lawsuit, whether it’s related to privacy or discrimination or other aspects. So I’m not too surprised. Obviously I can’t speak on these companies’ behalf, but they have a responsibility to make sure that they’re training their LLMs on reliable data sources. They need to fine-tune these LLMs. They need some kind of fact-checking mechanism to ensure that if the models are producing garbage, they’re receiving feedback and improving. There have to be human reviewers in the loop who can play a role in correcting the trajectory of these LLMs, perhaps even including confidence scores to allow users to gauge the reliability of the information they’re getting, and even providing users with a mechanism to give feedback on whether the response they got was terrible or not, so that the company is taking another signal from its users.
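As an illustration of a few of the measures Ginena lists, here is a minimal sketch that attaches a confidence score to each response, routes low-confidence answers to a human reviewer, and records user feedback as another signal. All names, values, and thresholds are hypothetical, not any particular company’s pipeline.

```python
# A minimal human-in-the-loop sketch: show a confidence score with each
# response, queue low-confidence answers for human review, and record user
# feedback that could feed back into fine-tuning.

from dataclasses import dataclass, field

@dataclass
class Response:
    prompt: str
    answer: str
    confidence: float              # e.g., derived from the model's uncertainty
    feedback: list = field(default_factory=list)

REVIEW_THRESHOLD = 0.7
review_queue = []                  # items a human reviewer will check

def deliver(resp: Response) -> None:
    # Surface the confidence score so users can gauge reliability.
    print(f"{resp.answer}  (confidence: {resp.confidence:.0%})")
    if resp.confidence < REVIEW_THRESHOLD:
        review_queue.append(resp)

def record_feedback(resp: Response, helpful: bool) -> None:
    # The thumbs-up / thumbs-down signal from users.
    resp.feedback.append(helpful)
    if not helpful and resp not in review_queue:
        review_queue.append(resp)

r = Response("Summarize the contract", "The contract renews annually.", confidence=0.55)
deliver(r)
record_feedback(r, helpful=False)
print(f"{len(review_queue)} item(s) queued for human review")
```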
So I think all of these different measures in conjunction, hopefully over the course of time, and as long as there is strong buy-in from leadership to improve these models, will help companies do a better job of protecting people’s privacy, but also of ensuring that whatever gets propagated as output is more reliable.
How to Use AI in a Fair and Responsible Way
Creary: Any other suggestions, solutions around that or the other challenges that you and Broderick brought up today? Can we think about one or two key things that can be done for various audiences? I’m thinking of lots of audiences. I’m thinking of employers and institutions. I’m thinking of consumers and employees, as well.
Turner: I’ll let Karim go first on that one.
Ginena: I think the grass is so green; there is so much that can be done in this space. Firstly, there needs to be legislation. There needs to be conflict [INAUDIBLE] and enforcement to protect public interests. Voluntary commitments from companies are good, but they’re not sufficient. Secondly, data protection laws need to be strengthened to cover AI systems and to spell out the privacy rights of users and so forth.
We spoke about transparency and companies’ need to disclose how their AI systems have been built and what data sources they’ve been using, to take a crack at the black box problem. Stakeholder engagement is really important. As we were thinking through diverse audiences and data sources, you need to engage with different stakeholders. I always say that when you’re working in this space, you really need to be cross-functional, because there’s a lot of barbed wire in this space, whether it’s privacy, legal, or regulatory. You’re working with ethics teams, with product managers, with data scientists and researchers. It is an extremely cross-functional space, and it has to be that way, because this is a sociotechnical problem that takes all the great minds to attack it from different angles.
You need to have diverse teams, for example, so that you have representation and people can identify and rectify bias effectively, right? Training data needs to be more inclusive. As Broderick was saying, who is missing from this data? Who are we not seeing? I always say that this technology is not neutral. This technology already has a point of view, and its point of view is what it has been trained on. If it has been missing people of color, for example, then that’s its point of view. It’s not like it’s coming as a blank slate, no. It already has a point of view.
So that’s a little bit of myth-busting. We need to take inclusivity into consideration throughout the product development lifecycle. We need audits done on these models, with results that feed into how we improve them. Human oversight, for example, I mentioned earlier: there needs to be a human in the loop who can spot problems and improve the trajectory of outcomes. There’s so much that can be done; I could go on and on and on.
Creary: So much. Broderick, it’s your turn. What would you say are the key specific things that can be done?
Turner: I’m going to break this down into different market segments because that’s how my brain is wired. So I’m going to talk about what can the companies do, what can consumers do, what can researchers do? And then what can your students do?
So first, for companies: consider that 3-D model I laid out earlier before you roll out a product. What does this product do? Who does it disempower? And what is its daily use? If you need help answering these questions, then call RAI Audit, and Karim will pick up the phone. Or you can holler at the folks over here in the TRAP Lab. We can help you think through some of these things as we’re trying to make this technology better and more equitable.
Two, consumers, what can you do? Do not accept that it’s magic. There is nothing magical about these systems. These systems are just people and their opinions of people. If you learn y = mx + b, then you have learned the building blocks of every machine learning system. So if there is some weird outcome that comes out when you’re on Facebook or Twitter, or you notice some weirdness when you put out an application for a loan, trust your gut. It is weird, so say something, because again, the only way to improve these things is for them to update their model. And you are right that something is wrong. It’s not magic. It’s not your fault. It’s them. Don’t accept magic.
For researchers, if you are interested in working on these topics and in this space, go to jointheTRAP.com and come hang out with us. We meet online once a week over Zoom, Wednesdays at 1:00 p.m.
And then finally, for students, this is going to be a weird one, because in some ways this rise of generative AI in writing and art makes people believe that we are this close to having AI that can write beautiful books or make beautiful art. And I’m going to challenge that, right? They also told us, ten years ago, that we were this close to having fully self-driving cars within a year. They say this every year, and it’s not happening any time soon.
I’m going to say the same thing is the case for art and writing and creativity. I think that there will be a premium on people who can actually express themselves clearly and accurately and honestly, and so if you are currently a student anywhere, if you’re in a B-school, if you’re in a college, if you’re in high school, and you’re listening to this — I don’t know why you’re listening to a Wharton podcast in high school, but maybe you’re one of those kids. Get really into building your toolbox of creativity. Get really into creative writing. Get really into making more art because there will be a value on the actual human element that comes out of this if what everyone else is doing is just plugging in some chat bot that gives you plausible sentences that aren’t that good. If you can write better sentences, you’ll win.
Creary: Thank you. And I’m just going to tell you, Broderick, we actually do have Knowledge@Wharton High School, so your high school learners are going to learn a lot from what you both shared today, just as much as our fully serious senior leaders will, as well.
So Broderick and Karim, this has been fantastic. I feel so much more knowledgeable. As somebody who believes I know a lot about a lot of things, I know that I don’t know a lot about this topic, so the past 45 minutes that we’ve been chatting have been amazing in helping me feel more empowered as a researcher, a consumer, and a worker about what exactly is happening and how I need to pay attention to what is being shared.
I want to thank you so much for sharing your insights and your expertise with us, and to all the Leading Diversity at Work podcast listeners. We truly appreciate you for being here. So that’s all for now. Thanks to our audience for joining us and listening to this episode of the Knowledge@Wharton Leading Diversity at Work podcast series. Goodbye for now.