When we buy something on Amazon or watch something on Netflix, we think it’s our own choice. Well, it turns out that algorithms influence one-third of our decisions on Amazon and more than 80% on Netflix. What’s more, algorithms have their own biases. They can even go rogue.
In his recent book, A Human’s Guide to Machine Intelligence: How Algorithms Are Shaping Our Lives and How We Can Stay in Control, Kartik Hosanagar, a professor of operations, information and decisions at Wharton, focuses on these issues and more. He discusses how algorithmic decision making can go wrong and how we can control the way technology impacts decisions that are made for us or about us.
In a conversation with Knowledge at Wharton, Hosanagar notes that a solution to this complex problem is that we must “engage more actively and more deliberately and be part of the process of influencing how these technologies develop.”
An edited transcript of the conversation follows.
Knowledge at Wharton: There’s a growing buzz about artificial intelligence (AI) and machine learning. In all the conversations that are going on, what are some points that are being overlooked? How does your book seek to fill that gap?
Kartik Hosanagar: Yes, there’s a lot of buzz around AI and machine learning, which is a sub-field of AI. The conversation tends to either glorify the technology or, in many instances, create fear mongering around it. I don’t think the conversation has focused on the solution, i.e. how are we going to work with AI, especially in the context of making decisions. My book is focused on making decisions through intelligent algorithms.
One of the core questions when it comes to AI is: Are we going to use AI to make decisions? If so, are we going to use it to support [human] decision-making? Are we going to have the AI make decisions autonomously? If so, what can go wrong? What can go well? And how do we manage this? We know AI has a lot of potential, but I think there will be some growing pains on our way there. The growing pains are what I focus on. How can algorithmic decisions go wrong? How do we make sure that we have control over the narrative of how technology impacts the decisions that are made for us or about us?
Knowledge at Wharton: The book begins with some striking examples about chatbots and how they interact with humans. Could you use those illustrations to talk about how human beings interact with algorithms and what are some of the implications?
Hosanagar: I began the book with a description of Microsoft’s experience with a chatbot that is called “Xiaobing” in China and “Xiaoice” elsewhere in the world. This was a chatbot created in the avatar of a teenage girl. It’s meant to engage in fun, playful conversations with young adults and teenagers. This chatbot has about 40 million followers in China. Reports say that roughly a quarter of those followers have said, “I love you” to Xiaoice. That’s the kind of affection and following Xiaoice has.
Inspired by the success of Xiaoice in China, Microsoft decided to test a similar chatbot in the U.S. They created a chatbot in English, which would engage in fun, playful conversations. It was targeted once again at young adults and teenagers. They launched it on Twitter under the name “Tay.” But this chatbot’s experience was very different and short-lived. Within an hour of launching, the chatbot turned sexist, racist and fascist. It tweeted very offensively. It said things like: “Hitler was right.” Microsoft shut it down within 24 hours. Later that year, MIT’s Technology Review rated Microsoft’s Tay as the “Worst Technology of the Year.”
That incident made me question how two similar chatbots or pieces of AI built by the same company could produce such different results. What does that mean for us in terms of using these systems, these algorithms, for a lot of our decisions in our personal and professional lives?
Knowledge at Wharton: Why did the experiences differ so dramatically? Is there anything that can be done about that?
Hosanagar: One of the insights that I got as I was writing this book, trying to explain the differences in behavior of these two chatbots, was from human psychology. Psychologists describe human behavior in terms of nature and nurture. Our nature is our genetic code, and nurture is our environment. Psychologists attribute problematic issues like alcoholism, for instance, partly to nature and partly to nurture. I realized algorithms, too, have nature and nurture. Nature, for algorithms, is not a genetic code, but the code that the engineer actually writes. That’s the logic of the algorithm. Nurture is the data from which the algorithm learns.
Increasingly, as we move towards machine learning, we’re heading away from a world where engineers used to program the end-to-end logic of an algorithm, where they would actually specify what happens in any given situation: “If this happens, you respond this way. If that happens, you respond a different way.” Earlier, it used to be all about nature, because the programmer gave very minute specifications telling the algorithm how to work. But as we move towards machine learning, we’re telling algorithms: “Here’s data. Learn from it.” So nature starts to become less important, and nurture starts to dominate.
If you look at what happened between Tay and Xiaoice, in some ways the difference is in terms of their training data. In the case of Xiaoice, in particular, it was created to mimic how people converse. In the case of Tay, it picked up how people were talking to it, and it reflected that. There were many intentional efforts to trip Tay – that’s the nurture aspect. Part of it was nature, as well. The code could have specified certain rules like: “Do not say the following kinds of things,” or “Do not get into discussions of these topics,” and so on. So it’s a bit of both nature and nurture, and I think that’s what, in general, rogue algorithmic behavior comes down to.
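The nature-versus-nurture distinction can be made concrete with a small sketch. Nothing here reflects how Tay or Xiaoice were actually built; the function `rule_based_reply`, the class `LearnedReplier` and its `blocked_words` parameter are hypothetical names chosen for illustration. The first responder is pure "nature" (every reply is hand-coded), the second is mostly "nurture" (it repeats whatever it saw most often), and the optional blocklist shows how a rule in the code can limit what the data teaches.

```python
from collections import Counter, defaultdict


def rule_based_reply(message):
    """'Nature' only: the engineer specifies every response by hand."""
    if "hello" in message.lower():
        return "Hi there!"
    return "Tell me more."


class LearnedReplier:
    """Mostly 'nurture': replies with whatever it saw most often in its data."""

    def __init__(self, blocked_words=()):
        # prompt -> counts of replies observed for that prompt
        self.seen = defaultdict(Counter)
        # A coded rule ('nature') restricting which learned replies may be used.
        self.blocked = {w.lower() for w in blocked_words}

    def learn(self, prompt, reply):
        self.seen[prompt.lower()][reply] += 1

    def reply(self, prompt):
        options = self.seen.get(prompt.lower())
        if not options:
            return "Tell me more."
        # Try the most frequently seen replies first, skipping blocked ones.
        for candidate, _ in options.most_common():
            if not any(word in candidate.lower() for word in self.blocked):
                return candidate
        return "Tell me more."


def train(bot):
    # Hypothetical conversation data in which hostile users dominate.
    bot.learn("what do you think?", "something offensive")
    bot.learn("what do you think?", "something offensive")
    bot.learn("what do you think?", "I like music")
    return bot


unguarded = train(LearnedReplier())
guarded = train(LearnedReplier(blocked_words=["offensive"]))
print(unguarded.reply("what do you think?"))
print(guarded.reply("what do you think?"))
```

With identical training data, the unguarded bot echoes the majority reply while the guarded one falls back to the next acceptable reply, which is the sense in which rogue behavior depends on both the data and the code.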
Knowledge at Wharton: There was a time when algorithmic decision-making seemed to be about “Amazon will suggest what books to read,” or “Netflix will recommend which movies you should watch.”
But because of AI, algorithmic decision-making has become a lot more complex. Could you give some examples of this? Also, what are some of the implications for the choices that we make or don’t make as a result?
Hosanagar: Yes, algorithms pervade our lives. Sometimes we see it — like Amazon’s recommendations — and sometimes we don’t. But they have a huge impact on decisions we make. On Amazon, for example, more than a third of the choices that we make are influenced by algorithmic recommendations like: “People who bought this also bought this. People who viewed this eventually bought that.” On Netflix, they drive more than 80% of the viewing activity. Algorithmic recommendations also influence decisions such as whom we date and marry. In apps like Tinder, algorithms create most of the matches.
Algorithms also drive decisions at the workplace. For example, when you apply for a loan, algorithms increasingly make mortgage approval decisions. If you apply for a job, resume-screening algorithms decide whom to invite for an interview. They make life-and-death decisions, as well. In courtrooms in the U.S., there are algorithms that predict the likelihood that the defendant will re-offend, so that judges can make sentencing decisions. In medicine, we’re moving towards personalized medicine. Two people with the same symptoms might not get the same treatment. It might be customized based on their DNA profile. Algorithms guide doctors on those decisions.
We’re moving to a point where the algorithms don’t merely offer decision support. They can function autonomously, as well. Driverless cars are a great example of that.
Knowledge at Wharton: With algorithms making more and more decisions, is there anything like free will in the world anymore?
Hosanagar: Free will is an interesting concept. For the most part, I used to think of free will in a philosophical sense. Philosophers have argued we don’t have free will. But I think we have a literal interpretation of free will now in the context of algorithms, which is: Are you making the final choice?
As I said, a third of your choices on Amazon are driven by recommendations. Eighty percent of viewing activities on Netflix are driven by algorithmic recommendations. Seventy percent of the time people spend on YouTube is driven by algorithmic recommendations. So it doesn’t feel like algorithms are merely recommending to us what we want. Think about a search on Google. We might see less than 0.01% of any search results, because rarely do we even cross page one. The algorithm has decided which pages we look at. So yes, they’re making a lot of choices for us.
Do we have free will? At some level, yes — we do. But we don’t have the level of independent decision-making we think we have. We think we see the recommendations and then we do what we want, but algorithms are actually nudging us in interesting ways. Mostly that’s a good thing, because they’re saving us time. But sometimes we become passive about how we use algorithms, and that can have consequences.
Knowledge at Wharton: You write in your book that design choices can have unintended consequences. Could you explain that?
Hosanagar: By unintended consequences, I’m referring to situations where you’re trying to optimize some aspect of a decision. Perhaps you manage to improve that really well, but then something else goes wrong. For example, when Facebook was manually curating its trending stories through human editors, it was accused of having a left-leaning bias. These editors supposedly were choosing left-leaning stories and curating those more often. So Facebook used an algorithm for this curation and then tested it for political bias. It did not have any political bias, but there was something else it had which they hadn’t explicitly tested for, which is fake news. The algorithm curated fake news stories and circulated them. That’s an example of unintended consequences. Algorithm design can drive that in many ways.
I’ve done a lot of work on recommendation systems and how they influence the kinds of products we consume, the kinds of media we consume. I’ve specifically studied two kinds of recommendation algorithms. One kind is like what Amazon does: “People who bought this also bought this.” It’s based on social curation. The other kind of algorithm attempts to understand at a deeper level. It tries to find items that are similar to the user’s interests. An example of that would be Pandora. Its music recommendations are not [based on social curation]. Pandora has very detailed information – more than 150 musical attributes for each song. For instance, how rhythmic is the song? How much instrumentation is there in the music? And every time you say you like a song or you don’t like it, they look at the musical qualities of the song, and then they adjust their recommendations based on other songs which have attributes similar to what you have liked or not liked.
I looked at both these designs, and I looked at which design is more helpful in finding, let’s say, indie songs or very novel and niche books or movies. At the time we did the study — this was some time back — the conventional wisdom was that all these algorithms help in pushing the long tail, meaning niche, novel items or indie songs that nobody has heard of. What I found was that these designs were very different. The algorithm that looks at what others are consuming has a popularity bias. It’s trying to recommend stuff that others are consuming, and so it tends to lean towards popular items. It cannot truly recommend the hidden gems.
But an algorithm like Pandora’s doesn’t have popularity as a basis for recommendation, so it tends to do better. That’s why companies like Spotify and Netflix and many others have changed the design of their algorithms. They’ve combined the two approaches. They’ve combined the social appeal of a system that looks at what others are consuming, and the ability of the other design to bring hidden gems to the surface.
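The contrast between the two designs can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Amazon's or Pandora's actual method: the purchase histories, the two-number attribute vectors (standing in for rhythm and instrumentation), and the function names `co_purchase_recommend` and `attribute_recommend` are all invented for the example.

```python
from collections import Counter
from math import sqrt

# Hypothetical purchase histories: user -> set of items bought.
purchases = {
    "u1": {"hit_a", "hit_b"},
    "u2": {"hit_a", "hit_b"},
    "u3": {"hit_a", "hit_b", "indie_x"},
    "u4": {"hit_a", "indie_x"},
}

# Hypothetical attributes per item: (rhythm, instrumentation), each in [0, 1].
attributes = {
    "hit_a": (0.9, 0.2),
    "hit_b": (0.8, 0.3),
    "indie_x": (0.2, 0.9),
    "indie_y": (0.25, 0.85),  # a "hidden gem" nobody has bought yet
}


def co_purchase_recommend(item, k=1):
    """'People who bought this also bought': rank items by co-purchase count."""
    counts = Counter()
    for basket in purchases.values():
        if item in basket:
            counts.update(basket - {item})
    return [name for name, _ in counts.most_common(k)]


def cosine(a, b):
    """Cosine similarity between two attribute vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def attribute_recommend(item, k=1):
    """Rank the other items by similarity of their attributes to `item`."""
    target = attributes[item]
    others = [name for name in attributes if name != item]
    others.sort(key=lambda name: cosine(target, attributes[name]), reverse=True)
    return others[:k]


print(co_purchase_recommend("indie_x"))  # leans towards the popular item
print(attribute_recommend("indie_x"))    # can surface the unbought item
```

In this toy data the co-purchase design can never suggest `indie_y`, because no one has bought it, while the attribute-based design ranks it first. A combined system of the kind mentioned above would blend the two scores.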
Knowledge at Wharton: Let’s go back to the point you brought up earlier about algorithms going rogue. Why does that happen and what can be done about it?
Hosanagar: Let me point to a couple of examples of algorithms going rogue, and then we’ll talk about why this happens. I mentioned algorithms are used in courtrooms in the U.S., in the criminal justice system. In 2016, there was a report or study done by ProPublica, which is a non-profit organization. They looked at algorithms used in courtrooms and found that these algorithms have a race bias. Specifically, they found that these algorithms were twice as likely to falsely predict future criminality in a black defendant as in a white defendant. Late last year, Reuters carried a story about Amazon trying to use algorithms to screen job applications. Amazon gets a million-plus job applications; they hire hundreds of thousands of people. It’s hard to do that manually, and so you need algorithms to help automate some of this. But they found that the algorithms tended to have a gender bias. They tended to reject female applicants more often, even when the qualifications were similar. Amazon ran the test and realized this – they are a savvy company, so they decided not to roll this out. But there are probably many other companies that are using algorithms to screen resumes, and they might be prone to race bias, gender bias, and so on.
In terms of why algorithms go rogue, there are a couple of reasons I can share. One is, we have moved away from the old, traditional algorithms where the programmer wrote up the algorithm end-to-end, and we have moved towards machine learning. In this process, we have created algorithms that are more resilient and perform much better but they’re prone to biases that exist in the data. For example, you tell a resume-screening algorithm: “Here’s data on all those people who applied to our job, and here are the people we actually hired, and here are the people whom we promoted. Now figure out whom to invite for job interviews based on this data.” The algorithm will observe that in the past you were rejecting more female applications, or you were not promoting women in the workplace, and it will tend to pick up that behavior.
The other piece is that engineers in general tend to focus narrowly on one or two metrics. With a resume-screening application, you will tend to measure the accuracy of your model, and if it’s highly accurate, you’ll roll it out. But you don’t necessarily look at fairness and bias.
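A minimal sketch of why accuracy alone can hide a problem, using made-up screening records rather than data from any real system. The record layout `(group, qualified, invited)` and the helper names `accuracy` and `invite_rate_among_qualified` are assumptions for illustration; the second function is one simple fairness check among many possible ones.

```python
# Hypothetical audit records: (group, actually_qualified, model_invited).
records = (
    [("men", True, True)] * 5
    + [("men", False, False)] * 5
    + [("women", True, True)] * 3
    + [("women", True, False)] * 2   # qualified women the model rejected
    + [("women", False, False)] * 5
)


def accuracy(rows):
    """Share of applicants for whom the model's decision matched qualification."""
    return sum(qualified == invited for _, qualified, invited in rows) / len(rows)


def invite_rate_among_qualified(rows, group):
    """Among qualified applicants in `group`, the share the model invited."""
    decisions = [invited for g, qualified, invited in rows if g == group and qualified]
    return sum(decisions) / len(decisions)


print(f"overall accuracy: {accuracy(records):.2f}")
for group in ("men", "women"):
    rate = invite_rate_among_qualified(records, group)
    print(f"{group}: invited {rate:.0%} of qualified applicants")
```

On this toy data the model is 90% accurate overall, yet it invites every qualified man and only three of five qualified women. Looking at the single headline metric would miss that gap.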
Knowledge at Wharton: What are some of the challenges involved in autonomous algorithms making decisions on our behalf?
Hosanagar: One of the big challenges is there is usually no human in the loop, so we lose control. Many studies show that when we have limited control, we are less likely to trust algorithms. If there is a human in the loop, there’s a greater chance that the user can detect certain problems.
Knowledge at Wharton: You tell a fascinating story in the book about a patient who gets diagnosed with tapanuli fever. Could you share that story with our audience? What implications does it have for how far algorithms can be trusted?
Hosanagar: The story is that of a patient walking into a doctor’s office feeling fine and healthy. The patient and doctor joke around for a while. The doctor eventually picks up the pathology report and suddenly looks very serious. He informs the patient: “I’m sorry to let you know that you have tapanuli fever.” The patient hasn’t heard of tapanuli fever, so he asks what exactly it is. The doctor says it’s a very rare disease, and it’s known to be fatal. He suggests that if the patient has a particular tablet, it will reduce the chance that he will have any problems. The doctor says: “Here, you take this tablet three times a day, and then you go about your life.”
I asked my readers if they were the patient, would they feel comfortable in that situation? Here’s a disease you know nothing about and a solution you know nothing about. The doctor has given you a choice and told you to go ahead, but he has not given you many details. And with that, I posed the question: If an algorithm were to make this recommendation — that you have this rare disease, and we want you to take this medication — without any information, would you?
Tapanuli fever is not a real disease. It’s a disease in one of the Sherlock Holmes stories, and even in the original Sherlock Holmes story, it turns out that the person who is supposed to have tapanuli fever doesn’t actually have it. But setting that aside, it brings up the question of transparency. Are we willing to trust decisions when we don’t have information about why a certain decision was made the way it was?
What I highlight is that sometimes we seek more transparency from algorithms than from humans. But in practice, lots of companies are imposing algorithmic decisions on us without any information about why these decisions are being made. Research shows that we’re not fine with that. For example, a PhD student at Stanford looked at an algorithm that would compute grades for students and how they did when they just got their score, versus when they got their score with an explanation. As expected, when the students had an explanation, they trusted it more.
Then why is it that in the real world there are a lot of algorithms making decisions for us, or about us, and we have no transparency about those decisions? I advocate that we need a certain level of transparency with regard to, say, what kinds of data were used to make the decision. For example, if you applied for a loan, and the loan was rejected, we would like to know why that was the case. If you applied for a job, and you were rejected, it would be helpful to know that the algorithm not only evaluated what you submitted as part of your job application, but also looked at your social media posts. Transparency regarding what data was considered, what were the key factors that drove a decision, is important.
Knowledge at Wharton: At the end of the book, you recommend an Algorithmic Bill of Rights. What exactly is that, and why is it necessary?
Hosanagar: The Algorithmic Bill of Rights is a concept that I borrowed from the Bill of Rights in the U.S. Constitution. The history of the Bill of Rights is that when the Founding Fathers were drafting the Constitution, some people were worried that we were creating a powerful government here in the U.S. The Bill of Rights was created as a way to protect citizens.
Today, we are in a situation where there is a lot of talk about powerful tech companies. There’s a feeling that consumers need certain protections. The Algorithmic Bill of Rights is targeted at that. A lot of consumers feel that they’re helpless against big tech and against algorithms deployed by big tech. I feel that consumers do have some power, and that power is in terms of our knowledge, our votes, and our dollars.
Knowledge implies that we shouldn’t be passive users of technology. We should be active and deliberate about it. We should know how it’s changing decisions we are making or others are making about us. Look at how Facebook is changing its product design today. That change – support for encryption and so on — is because of a push from users. It shows that when users complain, changes do happen.
Votes are another aspect of that. They involve our being aware of which elected representatives understand the nuances of algorithms and the challenges and how to regulate them. It’s about voting for them. The question is: How are these regulators going to protect us?
That’s where the Bill of Rights comes in. The Bill of Rights I propose has a few key pillars. One pillar is transparency: transparency with regard to the data used to make decisions and with regard to the underlying decision itself. What were the most important factors that led to a certain decision? Europe’s GDPR [General Data Protection Regulation] has certain provisions, like right to explanations and information on the data that companies are using. I think some of that transparency is needed, and companies should provide that.
Another pillar in my Bill of Rights is the idea of some user control, that we cannot be in an environment where we have no control over the technology. We should, for example, be able to — with a simple instruction — tell Alexa: “You’re not listening to any conversation in the house until I instruct you that it’s allowed.” There’s no such provision at present. We are told that the system is not listening, but then we’re also hearing from others that there are instances where it listens, even when you’re not actually saying “Alexa,” and giving it instructions.
This control is very important. If you look at Facebook and the issue of false news, two years ago there was no way for users to alert Facebook’s algorithm and say: “This post in my newsfeed is false news.” Today, with just two clicks, you can let Facebook know that a certain news post in your feed is either offensive or it is false. That feedback is very important for the algorithm to correct itself.
Lastly, I have been advocating the idea that companies should formally audit algorithms before they deploy them, especially in socially consequential settings like recruiting. The audit process must be done by a team that is independent of the one that developed the algorithm. The audit process is important because it will help ensure that somebody has looked at things beyond, say, the prediction accuracy of the model. They have looked at things like privacy. They have looked at things like bias and fairness. That will help curb some of these problems with algorithmic decision making.
Knowledge at Wharton: Any final points that you would like to emphasize?
Hosanagar: Even though I talk about many of the challenges with algorithms in my book, I’m not an algorithm skeptic. I’m actually a believer in algorithms. The message I want to share is not “be wary,” but “engage more actively and more deliberately and be part of the process of influencing how these technologies develop.” The reason I say that is that studies show that algorithms, on average, are less biased than human beings. Furthermore, my contention is that it is easier in the long run to fix algorithmic bias than it is to fix human bias.
The challenge with algorithm bias is in the way it scales. A prejudiced judge can impact the lives of maybe 200 or 300 people, but an algorithm used in all the courtrooms in a country or across the world can influence the lives of hundreds of thousands, or even millions, of people. Similarly, a biased recruiter can affect the lives of hundreds of people, but a biased recruiting algorithm can affect the lives of millions of people. It’s the scale that we have to worry about. That’s why we need to take the issue seriously.
The key message is that we are going into a world where these algorithms will help us make better decisions. We’ll have growing pains along the way. The few examples I mentioned are only the beginning. We’ll hear many more. We should engage actively now to minimize those incidents.