Wharton professor Ethan Mollick joins Eric Bradlow, vice dean of Analytics at Wharton, to discuss AI’s impact on education: How is generative AI being used by students? Should teachers embrace AI in the classroom? Their conversation touches on the latest developments in ChatGPT and other generative AI tools, and how they will affect educators and the workforce at large. This interview is part of a special 10-part series called “AI in Focus.”
Watch the video or read the full transcript below.
Eric Bradlow: Welcome to this edition of the AI at Wharton and Analytics at Wharton podcast series on artificial intelligence. Today’s episode will actually have a dual role. While it says “AI in Education” here, our guest actually has lots of expertise in AI in education and the workforce, and a lot more general topics, as well. I’m joined by my faculty colleague, Ethan Mollick. Ethan is the Ralph J. Roberts Distinguished Faculty Scholar. He is an associate professor in our Management Department. He’s also the academic director of Wharton Interactive. So Ethan, welcome to our podcast series.
Ethan Mollick: I’m thrilled to be here. Thank you.
Bradlow: I don’t even know where to start. I’ll just say, and this is my first question to you: Most of what I’ve learned on AI in education — we’ll start there — is by watching the 5-part series that you and your wife created. Could you tell our listeners here on our AI in education and workforce episode, what was in those five episodes? What do all of us as professors need to know about AI in education?
Mollick: There are at least three different things that matter, right? The first thing that matters is disruption. Homework is over. There is not a homework assignment basically anywhere that a well-prompted AI can’t solve at this point. So that’s a big deal.
Bradlow: And just to be clear, let me take them one at a time. Explain to our listeners. There are multiple versions of even, let’s say, ChatGPT, which is just one of the available AI tools. Some versions can ingest documents. Some versions cannot. Some have just a text prompt. So which version or versions are you referring to when you say, “Homework as we know it is over”?
Mollick: When you think about AI, you want to think about all of what are called “foundation models,” which are Llama and ChatGPT, all these kinds of different models. But you also want to think about what are called “frontier models,” not to create an even more confusing vocabulary. There are really only three frontier models right now. The first is OpenAI’s GPT-4, which is the paid version, but you can also get it for free through Microsoft Bing in creative mode, which turns out to be really important for education, for reasons we’ll talk about. [OVERTALK]
Bradlow: Which is the way that I do right now.
Mollick: It’s a little limited in some ways. It’s weird. It has a personality. We can talk about that. Then there is Google’s Bard, which right now is powered by an underpowered model called PaLM 2, but all the rumors are that it will be upgraded to a model that probably will be the first model to beat GPT-4 in the next couple of months. And then finally there’s a company called Anthropic that has a product called Claude 2.
So when I talk about AI can do something, I’m almost always talking about the frontier models. So GPT 4 currently has a separate mode for vision and pictures. That’s all being united. They’re already rolling it out. So they’ll be able to take in documents, take in images, read PDFs. It already can. It just does it a little bit jankily right now.
Bradlow: Let me ask you a few things. Will all of this — right now there are paid and unpaid versions. What’s your vision, since you also teach innovation? You teach entrepreneurship. Are all these things going to stay free? If they are, what’s their revenue model? Is it advertising? How do you see this playing out from our point of view and from the company’s point of view?
Mollick: Right now OpenAI has announced that they’re on a run rate of 1.2 billion dollars in revenue, less than a year after releasing ChatGPT. Most of that money probably comes through the use of their API, which is their business-to-business solution, right? Less of it is the $20 a month we pay, if you pay for ChatGPT Plus. By the way, if you can, you should. The difference between GPT-4 and GPT 3.5, the paid version and the free version, is so large. It may not appear that way at first, but it’s big enough that it is a hundred percent worth it.
Microsoft releases a bunch of GPT-4 products through Bing, and they’re planning on keeping them free, as far as I know, for the near future. Google is also releasing Bard for free, because it’s part of the search engine fight going on. So we reap the benefits, but it also means that people in 169 countries around the world have access to GPT-4 and will probably have Bard access, as well. That means the model you get access to if you go to Goldman Sachs or McKinsey or Nike is no better than the model that every kid in Uganda and Sri Lanka has access to, which I think is really exciting and interesting, and also really a big deal for education.
Bradlow: You’ve brought me five questions, but let me go one at a time. I’m getting excited here. Obviously you’ve spent time thinking about Wharton Interactive. So this idea of democratizing education through something like this has to be really thrilling and exciting to you as an educator and a scholar, because I know that was part of your mission and still is with Wharton Interactive.
Mollick: Yes, for those of you who don’t know, Wharton Interactive is our attempt to build games and simulations to teach entrepreneurship at scale. Wharton has been incredibly supportive. You’ve been incredibly supportive. We have built these very large games. We have a team of people. We have writers and coders and interactive fiction experts. Once GPT 4 came out, we just tried a little experiment. We tried saying, “What if we just write a paragraph, create a simulation of a negotiation, give me grading on it, make it realistic?” And we’re 80% of the way there with the paragraph. GPT 4 just runs a simulation.
So we’ve now pivoted all of our simulations so that AI is powering everything: we have AIs watching AIs. AIs are instructors, AIs are mentors, all interacting with each other, actually doing the teaching.
Bradlow: So is it writing the code?
Mollick: It doesn’t even need to write the code. It is writing the code, but that turns out to be secondary. It’s creating the images. It’s doing all that stuff. But what really matters is that it’s also the brains of the operation. If we do good prompting, we can tell it, “Here’s your goal. Make sure that you’re keeping students engaged. Change tone if you need to. Here’s your overall — ” And it just does it all.
Bradlow: I’m just in shock because, wow. Let me ask you a question. How does one become — I know this is part of your video series — How does one, I don’t know if I’d call it an “expert,” but how does one become sophisticated, that’s a good word, in prompt engineering? Could I do this? Do I need subject matter expertise to create specific enough prompts, or is it just by — as you and your wife talked about in the video series — you’ll just learn by doing?
Mollick: That’s a really good question. Sort of all of the above. A few things. One is, for those who don’t know, prompt engineering is the practice of writing really good prompts for the AI. It is going to go away. I talk to OpenAI regularly. I talk to Microsoft. I talk to Google. Nobody who is an insider thinks this is going to last, because the AIs are really good at intent. If you say, “I want to write a novel,” fairly soon they’ll just be able to say, “Okay, let’s go through the steps together.” It already kind of does that, right? So you’re 80% of the way there by just interacting with the AI.
Now, there is an exception. If you want to encode your expertise as a subject matter expert, you want this to do a really good marketing analysis, it will do a perfectly fine job. But if you encode your expertise into it by saying, “Here are the angles you take. Here’s the approach.” Do a little bit of prompt engineering. You can give that prompt to anyone, and they’ll get the benefits almost of your experience or wisdom. So there is some value in doing that.
It’s pretty straightforward. Most of it is using it: ten hours with the frontier models, as a minimum rule of thumb. But beyond that, there are a couple of simple tricks. One is you tell the AI who it is. You give it context, and the more context, the better. “You are an expert marketer.” And weirdly, by the way, there is research now suggesting that when you tell the AI it is an expert, under some circumstances it works better. Again, there’s a lot of strange stuff about prompting.
The second thing you want to do is provide a lot of what’s called “few shots.” You want to provide a lot of examples. So when you’re sitting down, you want to say, “Here are some examples of the kind of report that you’ll produce.”
The third thing you want to do is have it do step-by-step thinking. So you want to say, “First do this, then do that, then do this.” Because it only knows what it writes. So you want it to write stuff out and then go back to it, and then build a plan from there. Those three things will make you a better prompt engineer, but it’s not going to be that important in the long-term.
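The three techniques Mollick lists (persona and context, few-shot examples, step-by-step instructions) can be combined into a single prompt. A minimal sketch in Python; the persona, examples, and steps below are invented for illustration, not taken from the conversation:

```python
def build_prompt(persona, examples, steps, task):
    """Assemble a prompt using the three techniques described:
    1. persona/context ("You are an expert marketer"),
    2. few-shot examples of the desired output,
    3. explicit step-by-step instructions."""
    parts = [f"You are {persona}."]
    if examples:
        parts.append("Here are examples of the kind of output to produce:")
        parts.extend(f"Example: {ex}" for ex in examples)
    parts.append("Work step by step:")
    parts.extend(f"{i}. {step}" for i, step in enumerate(steps, 1))
    parts.append(f"Task: {task}")
    return "\n".join(parts)

prompt = build_prompt(
    persona="an expert marketer",
    examples=["A one-paragraph positioning statement for a running shoe."],
    steps=["List the target segments.",
           "Draft the message.",
           "Critique and revise it."],
    task="Write a positioning statement for a new espresso machine.",
)
print(prompt)
```

The resulting string would be sent as the user or system message to whichever frontier model you use; the point is the structure, not any particular API.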
Bradlow: All right, so let’s go back to the focus of AI in education. Let’s talk about the roles that we have as educators. You already said the traditional way of doing homework — let’s start there — is in jeopardy. So given that, what can we do? In my view, I’m teaching next semester, and I’m like, “Use ChatGPT. As a matter of fact, if you know how to use this to solve the problems I’m asking you to do, that’s a skill set.” Or should I be thinking about this differently? Then I want to go to standing in the classroom, and then to other forms of assessment and things that we do.
Mollick: Yes, the problem is — what I feel is everybody is rushing in to do a ChatGPT class, right? And the answer is, “Yes, Chat can do that.” There are very few things at a medium level, like the 80th percentile, that Chat doesn’t do reasonably well right now. So the question is, do we want all of our classes to be “Can you use ChatGPT to solve this problem?” I think we still want to teach the subjects that we’re really good at teaching. We still think people need to learn these things, which means we have to adjust, right? By the way, everybody should know: do not use any kind of AI detector. They do not work. They are biased against people who use English as a second language. That ship has sailed. We cannot detect AI right now.
Bradlow: Okay, so just to be clear: with Canvas at the University of Pennsylvania, there’s Turnitin. Whatever version of that exists for AI, you might as well just forget it. I think you can see where my next question is going. Let’s say Eric Bradlow and Ethan Mollick are both in a class, and they both used GPT-4 to solve some problem. And let’s say by chance they happen to both put in the same prompt. Will they get the same exact text back? And if the answer is no, if both of those were turned in, could Canvas flag it? Not saying it was AI-generated, but would it say, “Hey, wait a second. There’s cheating going on here, because their responses are so similar”?
Mollick: So that’s a really good question on a few things. First of all, they wouldn’t get the same answer, because there’s randomness built in, and there’s a temperature, right? So there’s a random seed initially, and then the words are a little bit randomly different, which diverges over time.
Bradlow: [OVERTALK] So you’re saying large language models are — I’m a statistician — probabilistic models, which means even if we put in the exact same prompt, we’re going to get different things out, because there’s a probability distribution over the next word or the next phrase.
Mollick: And they’re autoregressive, so once they head into one direction or another, they sort of spin off in that direction further. So that’s the first thing. The second thing is, I’ve had an assignment even before ChatGPT came out, using GPT 3, where I had my students cheat in class. So I had them write the best essay they could. Part of the assignment is you have to prompt it at least five times. By the time you’ve prompted the AI two or three or four times, there’s no way that they seem similar anymore.
So yes, if people just paste in the question, they’re not going to get the same answer, but they might have some similarities. If they do any work like, “Make this more vivid.” Or, “Here’s my writing style.” That’s all they need to make it very different. Turnitin will not detect those things. And by the way, I really think it’s unethical to use Turnitin right now. You should be turning it off. It has a high false accusation rate.
By the way, even worse is to ask GPT-4 or ChatGPT, the 3.5 free version, whether something was created by AI. A new study showed that GPT-4 has a 95% rate of just telling you that something is made by AI, if you paste it in and ask. And GPT 3.5 has a 5% rate of telling you. They just randomly say, “This is made by AI,” or not. They have no way of telling.
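The randomness Mollick describes, a temperature setting applied before sampling each next token from a random seed, can be sketched in a few lines. This is a toy illustration of the mechanism, not any vendor’s actual implementation:

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=None):
    """Sample a token index from raw model scores (logits).
    Dividing by the temperature before the softmax is what makes
    two runs of the same prompt diverge: higher temperature
    flattens the distribution, near-zero approaches argmax."""
    rng = rng or random.Random()
    scaled = [l / temperature for l in logits]
    m = max(scaled)                               # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()                              # the "random seed" part
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1
```

Because generation is autoregressive, each sampled token is fed back in as context, so one early random difference compounds into increasingly different outputs.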
Bradlow: So what kinds of things can be — when you presented to the Wharton faculty, which was one of the best, most informative presentations I’ve seen in a long time, I was like, “Okay.” Maybe at the time this was true, but: “All right, so maybe it’s not text data, but maybe I’ll give exam questions that have video. Or maybe I’ll give stuff that has voice. Because you know what? ChatGPT can’t possibly do as well with that.” Am I off-base, or — well, let me just say, I might be correct, but it’s better than you think.
Mollick: No, it’s better at voice than humans are right now. So Whisper, which is the free — it’s built into the ChatGPT app, probably illegally trained — not illegally. I don’t know who’s watching. But trained on YouTube videos, probably, as far as we can tell, has better than human hearing. So accents, mixes of languages. I use it all the time. Actually, my students pitch to it. I have real venture capitalists, and then I have the AI playing a VC. The VCs think that the AI does a better job than they do in giving feedback.
So yes, it can listen. Now it can see things, so any visual problem, you just upload, and it will [OVERTALK].
Bradlow: So if I upload a video or anything?
Mollick: Video, it still has a little bit of trouble with, so you could still pull off video right now. But give it a few days. No, I mean a couple of months, probably.
Bradlow: You actually brought up another topic. We’ve been talking about AI in education. Could you talk to us about the work that you have done on AI in the workforce? I know you’re extremely proud of the work you’re doing, and I just want to see — because a lot of our experts have talked about, “It can’t be AI or humans; it’s got to be AI and humans.” I’m just interested in the angle in which you’ve studied AI in the workforce.
Mollick: Okay, so one example. I have a paper with a whole bunch of great people at Harvard, including Karim Lakhani, Fabrizio Dell’Acqua, and people at MIT, Kate Kellogg — a whole bunch of people on this project. But what we did was we went to BCG, one of the three big elite consulting companies. And the Boston Consulting Group, a lot of our students want to work there. A lot of alumni work there. And we did an experiment.
We created 20 tasks. They were all realistic BCG tasks, actual tasks they use. And we used 8% of their global workforce, which is a lot.
Bradlow: When you say an experiment, do you mean an actual experiment?
Mollick: I mean an actual experiment. Eight percent of their global workforce, and some of them got the help of GPT 4, and some did not. There are a bunch of other conditions. The people who were given GPT 4 to use in business tasks had a 40% improvement in quality. No training, no specialization in the module, just the same ChatGPT all of us have access to. Forty percent increase in quality across 108 regressions.
Bradlow: How was quality measured?
Mollick: Every way we could. We did analytical tasks and marketing tasks and persuasion tasks, and all of them were graded by human PhDs and human MBAs. And then we also used GPT-4, which, by the way, grades just as well as humans do. It’s just a little nicer on the scores, but the relative scores were exactly the same. And then they completed tasks 26% faster — sorry, got 26% more tasks done, 12.5% faster. No training: we only had like five minutes of training for some conditions, and none for others.
Just to put that in context, when steam power was put into a factory in the early 1800s, it improved performance by 18 to 22%. We’ve never seen a 40% improvement. This is not tuned. This is not trained. This is the Chat interface that you’re used to using. So huge, huge performance impacts from just a little experiment.
Bradlow: All right, so usually in academic papers, we have some thesis or hypothesis. And in your case, you have an experiment and results. What did the back end of that paper look like? Let’s imagine you’re now consulting for a company, BCG, or you’re consulting to our students, undergrads, MBAs. Like, “This should be how you think about your training.” What did the back end of the paper look like? What conclusions did you come to as a result of this?
Mollick: We barely scratched the surface — there’s so much else we could talk about here, too: interesting caveats on creativity, who uses which answers, and how people work with AI effectively, which is another thing I’ve been doing a lot of work on. But the back end of the paper is really the idea that, “Look, this is a big enough impact that there should be a red alert everywhere in every organization.” You don’t see these kinds of performance improvements. A lot of people are taking their time on AI. They’re insisting that they do something like integrate their own data with the AI system.
We didn’t have to do that here. The P in GPT stands for “pre-trained.” It knows a lot of stuff already. It’s not clear that you should be waiting to build a large data integration and use RAG and all these other techniques, when you should probably just be using this, and it should be a red alert to figure out how to use it. Another side of this paper was Appendix C, which I don’t always talk about because I don’t quite know what to do with it. It measures what’s called “retainment”: how much of GPT-4’s answer did you just use as your answer? And there’s almost a direct correlation between how much of the answer you use and how successful your results are. I mean, it’s almost a perfect correlation.
So basically the only way to mess up was to change ChatGPT’s answer. And not only that, the performance boost was the largest for everyone in the bottom half of performance. So we measured prior and after performance. A 42% boost in improvement for the bottom half; 18% for the top-half performers. It leveled everybody up to like the 80th percentile of BCG consultants. I don’t even know what to do with that. That’s such a big number.
Bradlow: So given, as you said, this is more impactful than the steam engine, what did BCG now do with this? What’s their plan? Let me just say, do they have any doubt that what you found is generalizable? Here’s an argument. I’ll be a statistician for a moment. Maybe this wasn’t — well, you said 8% of the workforce. Maybe this wasn’t a massive sample size. Maybe it works for these 20 tasks, but not for other tasks. Or maybe it helps in the short run, but you know what? We can also train humans, so maybe the effectiveness is going to decrease over time.
Like if I were a reviewer on a paper, I’m just playing the devil’s advocate. What would be the response to all of this?
Mollick: So a few things. We did manage to create one task the AI couldn’t do. One of the things to know about —
Bradlow: [OVERTALK] Our listeners want to hear what that is.
Mollick: Well, it was hard, right? It was a task where we had to hide data in interviews, and some [were] in a spreadsheet. And this was before ADA, the Advanced Data Analysis module came out. But we managed to find something, right? It took some work. And on that task, what happened was people who used the AI did worse because they were mistaken, because they took in what the ADA [UNINTEL].
So part of what people are asking is, “Well, what’s the border of what AI does and doesn’t do well?” We call that the “jagged frontier.” For example, if you ask GPT-4 to write a 25-word paragraph, it will have trouble doing that, because it doesn’t see words. It sees tokens. But if you ask it to write a sonnet, it will do an amazing job, even though sonnets are harder for humans than 25 words. You have to learn the frontiers of AI. Going back to the point we mentioned earlier, if you use it a lot, that’s how you start to understand it’s going to be good at this task and bad at other tasks.
So to go back to the overall question about what do you do with this, is it generalizable? This is just one piece of result. There’s another study out of MIT that got published in Science that shows similar-sized improvements in business writing tasks, in a completely different sample. There is a study out of GitHub showing the same kind of improvement for programmers. The 30 to 70% number just keeps coming up over and over again, in different samples, in different — There’s a piece on creative work. There’s another paper out of Harvard looking at business proposal writing. There’s our own colleague Christian Terwiesch, Karl Ulrich, and their colleagues’ work, showing innovation. So this is not like a one-time thing. This is a pretty broad-based set of findings.
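Mollick’s token point can be made concrete with a toy greedy subword tokenizer. The vocabulary below is invented for illustration; real models use learned BPE vocabularies (accessible, for example, via OpenAI’s tiktoken library):

```python
def toy_tokenize(text, vocab):
    """Greedy longest-match subword tokenizer, a toy stand-in for BPE.
    At each position, take the longest vocabulary piece that matches,
    falling back to a single character. Shows why a model that counts
    tokens cannot easily count words."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):      # try longest piece first
            piece = text[i:j]
            if piece in vocab or len(piece) == 1:
                tokens.append(piece)
                i = j
                break
    return tokens

vocab = {"writ", "ing", " a", " son", "net", " is", " hard"}
print(toy_tokenize("writing a sonnet is hard", vocab))
```

A five-word sentence comes back as seven tokens, and the word boundaries the model sees do not line up with the ones a human counts, which is why "write exactly 25 words" sits on the wrong side of the jagged frontier.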
Bradlow: Where do you think the — I like the words “jagged edge.”
Mollick: Jagged frontier.
Bradlow: Jagged frontier. Where is the jagged frontier here? Let’s say you’re in the 1/1,000th upper percentile of people in knowledge about AI in general right now. Where do you think that jagged frontier is?
Mollick: It’s hard to explain.
Bradlow: [OVERTALK] Doesn’t it always move?
Mollick: That’s the main thing, that the frontier is moving out. I guarantee in the next month, the frontier is going to move out. For some reasons that I can’t talk about and some reasons that are already public elsewhere, but this is not stopping. There is no indication to me that the jagged frontier is not going to keep moving forward in the next few months, the next year or two. I think ultimately the big question is how far, how fast, and when does it stop?
And I will tell you that people training the AI models, I don’t think, have an answer to that question.
Bradlow: So let’s start with a few things. I saw, I think it was yesterday, President Biden signed a bill or something, an executive order, on artificial intelligence and some sort of safety and security protection. Could you tell our listeners what that is about? What is the policy trying to do?
Mollick: There are a few things in the policy. I have not spent a huge amount of time with the executive order, but what it does that I like about it is there are sort of two stages of threats from AI that people are worried about. One is if you read the press a lot, you may hear about extinction risks, right? What if we make artificial intelligence smarter than a human? And what does it do to us? Does it save us? Does it kill us? What happens if we build a machine god? And by the way, that’s the stated goal of OpenAI, is to build AGI, right? That’s their plan.
Bradlow: By the way, just so I know. I just want to be sure, since I’m a big movie guy. I don’t know how much you watch movies. Wasn’t that some part of The Terminator? In other words, these robots became so smart that in some sense they ended up launching wars. I mean the Schwarzenegger movie, The Terminator. It’s not unrelated, right?
Mollick: No, that is one example of AGI. The people who are in favor of AGI think that this will save us all and redeem humanity and give us all eternal life. The people who don’t like it think it will murder us all. So I think it’s worth worrying about. Enough serious people in computer science are worried about it that I’m glad we’re addressing it. But I think the bigger policy implications in the near term for me, as somebody at the Wharton School, are: look, we’ve got something that’s doing high-end creative work, high-end managerial work. It’s going to get better at this. We’re improvising a chat interface and using that to do consulting work. That’s pretty crazy.
So that means there are going to be widespread implications for work, widespread implications for education, as we were talking about. And also one of the things that is really important, like part of the reason we catch criminals and bad actors so easily, especially when they’re not connected to a state intelligence agency, is most of them aren’t that great, right? Now what happens if the AI brings everybody up to the 80th percentile in biological engineering, the 80th percentile in building chemical weapons? That’s also a concern.
And then there are privacy concerns. Deepfakes are perfect for this thing, right?
Bradlow: Can you define what a deepfake is? What does that term mean?
Mollick: Sure. AI-generated content is basically undetectable. I have made videos of a fake me talking. In my presentations, I always have one real picture. Everything else, I generate on my own. No one can tell which the real picture is, right? So you cannot tell if I create an actor with their voice. I can do this right now for $1.50 with software that anyone can use. It’s not even dark web software. It’s a company that’s VC-backed, and I can use ElevenLabs and D-ID to create a fake video of you talking right now, and it would be pretty realistic.
So we have this issue with this kind of deepfake email, and it’s also perfect for phishing. You shouldn’t trust anything you see online anymore, and that’s not a joke. There literally is no way to — I’ve already talked to banks that have gotten calls from the voice of people who weren’t actually calling them, demanding money for ransoms. Stuff is going crazy already. That ship has already sailed.
So part of the Biden agenda is how do we watermark these things? It’s not going to be possible. That’s not going to happen, really, because even though these frontier models are based in the U.S., there are a whole bunch of open-source models, worldwide models that are not going to have this kind of protection.
So the attempt of this executive order, the way I see it, has both to do — worrying about this sort of future AI and training it, but also trying to think about how do we restore privacy? How do we restore — and I don’t know how much of that is possible, but regulations are probably needed.
Bradlow: A lot of our listeners are probably sitting here saying, “I need to get started. I haven’t even started, but now, clearly, if I’m listening to Professor Mollick, I’ve got to start now.” Where do you suggest that someone starts? You mentioned 10 hours. Let’s say that’s good. What should they spend their 10 hours doing? Where should they go? We’re at Wharton, so we’re fortunate. We have the video series you sent around. But whether it’s you have materials, others have materials. I’m making it up. Can you go to Khan Academy? Where does someone go to get started?
Mollick: So a few things. Those videos are on YouTube. Anyone can see them. [OVERTALK] If you search the name “Ethan Mollick,” you’ll find it. I also have a Substack with a whole bunch of getting started guides called “One Useful Thing,” all free.
Bradlow: Which I just started following today.
Mollick: Excellent. Hopefully you’ll enjoy that. But I think the basics of this are — my principles of AI, my first principle of using AI is just invite it to everything you morally and legally can. Use it for everything. Use it for your job. Like literally, do you want to send an email? See how good the AI is at writing the email. You have to do ideation? You have it do ideation. You’re going to a meeting? Bring it to the meeting. Have it record the meeting and give you advice and feedback on what you should do better next time. Just use it for everything. That’s the only way to figure out where the jagged frontier is in your field.
Nobody knows anything right now. As I said, I talk to all of the major AI companies on a regular basis, and no one has an instruction manual for this thing. No one knows whether it’s going to be good or bad in your subfield, or wherever you’re listening right now. So you can be the world expert by just using it and seeing what that is. So just try it for your work, and then there are a bunch of techniques you’ll start to learn. But the first thing is to try it. It’s really important, though, to use a frontier model, to use the most advanced model available to you right now.
Bradlow: Now you mentioned something else about — let me see if I got this right. I think it was Google. I think you used the term, “They’re coming out with something new that might be better than ChatGPT.” What would “better” mean? When you use the word “better,” I was just intrigued by that. What does it mean to be “better?” One large language model being better than another?
Mollick: Okay, so there are a lot of interesting angles to that. Right now, the two major frontier models are OpenAI’s model, which powers Microsoft’s products as well, and Google’s model. Both have added a whole bunch of capabilities that, if you’re not paying attention, you may have missed. They’re all fully multimodal. You can ask them to create pictures. They also can see the world. So, not doing an image search, but you could literally show a picture and say, “How does this dress fit?” And it will give you reasonable feedback on that. Or, “How do I undo this lock?” Or, “What’s this passcode?” Whatever you want to do.
So they’re all multimodal. They’re all going to be doing voice back and forth. They can all read documents and connect to other materials. So those are the basics. All of that is connected to the large language model itself, which you can kind of think of as the brain. And large language models basically get smarter over time. So if we think about GPT 3.5, the free version you’re using — maybe a high school sophomore. I would say GPT-4, at its best moments, is a first-year grad student.
Bradlow: That’s pretty good.
Mollick: Yes, so part of the question is, what does two or four times better than that look like? We don’t know yet. We could just do raw test scores, right? We went from scoring at the 5th percentile on the bar exam, beating 5% of humans, for free ChatGPT, to beating 95% of humans with GPT-4. What happens with the next one? We don’t really know. So the smarts are the brains behind the whole thing.
Bradlow: This is the question I’ve asked everyone in this series to kind of end the episode. If we’re sitting here 10 years from now, and we’ll make a date. I’m going to interview — I hope to interview the same group of people 10 years from now. What are we talking about, do you think, that has happened over the previous 10 years?
Mollick: I can only think of scenarios at this point, because the only question that kind of matters for this is how fast will these models improve, and when will they hit their limits? And nobody knows the answer to those questions, right? So it’s not going to be static. If you think you have time to wait, you don’t, because these models are advancing very rapidly. The question is, do they stop at the 95th percentile of the best humans in one area? So everybody has got something they’re really good at that they definitely beat the AI on? The 99th percentile? Better than human? We don’t know, right?
And so to me, that’s the only relevant question. Nobody has the answer to that, so we have to prepare for a scenario where, okay, we’re getting close to the top. I haven’t seen evidence of this, but it’s entirely possible that it starts to slow down. Then we still have at least 10 or 15 years of absorbing what GPT 4 can do, because it’s barely connected to anything in the world, right? We’ve got 10 years of disruption ahead of us that’s going to roll ahead, anyway.
If they keep getting better, then we start to think seriously about, if not AGI, what does it mean that it beats every human at writing marketing copy? What do we do with that? And so there are a lot of open questions. I don’t have the easy answers, but I think you’re more likely to see a transformed world in 10 years than in five, even if the technology stops, because it takes a while for systems to absorb change, right? The futurists’ rule is that everyone overestimates short-term change and underestimates long-term change. I think in 10 years, we’re going to see a transformed world in a lot of ways. Some are good, some are bad.
Bradlow: I’d like to thank Professor Ethan Mollick for joining me today on the podcast series on AI in Education and the Workforce. Ethan, thank you for joining me.
Mollick: Thank you for having me.