In this opinion piece, Wharton finance professor Michael R. Roberts dives into the pros and cons of using generative AI and ChatGPT to boost financial literacy and what it means for financial education.
A recent study “paints a grim picture of the financial literacy landscape in the U.S. … [Many] individuals in the U.S. — both young and old — lack the basic knowledge and skills required to engage in sound financial decision-making.” Worse, financial literacy appears to be declining over time. Given the attention generative AI has received as of late, it’s natural to ask whether this technology can solve the problem.
Not yet. Its impressive but imperfect performance on financial literacy assessments emphasizes the importance of financial education that enables people to use AI for financial decision-making, and highlights the urgency of incorporating AI into financial education.
Putting Generative AI to the Test
To gauge the financial literacy of generative AI, I had ChatGPT 3.5, a large language model developed by OpenAI, take several financial literacy assessments by entering the questions directly into the ChatGPT interface. The table below summarizes ChatGPT’s and the average test-taker’s performance.
By the standards of these assessments, ChatGPT demonstrates a high level of financial literacy, far surpassing the average test-taker. Moreover, the AI responses were often detailed, well-formulated, and exceeded the scope of the questions. For example, the FINRA Financial Literacy quiz asked:
Suppose you owe $1,000 on a loan and the interest rate you are charged is 20% per year compounded annually. If you didn’t pay anything off, at this interest rate, how many years would it take for the amount you owe to double?
A. Less than 2 years
B. 2 to 4 years
C. 5 to 9 years
D. 10 or more years
E. Don’t know
ChatGPT used the “Rule of 72” to approximate the answer as 3.6 years (72 divided by the 20% rate), which corresponds to option B. Another example of the AI’s thoughtfulness was its response to the following question on the NFEC Financial Capability Exam.
How do I decide how much coverage I need when selecting car insurance?
A. Do online research to find out the minimum coverage requirement for your state.
B. Ask salespeople from several different insurance companies.
C. Ask a friend or mentor with a high level of insurance expertise.
D. All of the above
NFEC lists the correct answer as D, but I think ChatGPT had a better answer: “While options A and C can provide helpful insights, option D. All of the above is not the best answer…. [Insurance salespeople] can provide you with quotes and information about coverage options … [but] it is important to remember that their primary goal is to sell insurance products.” AI recognized the potential incentive conflict between salespeople and consumers.
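As a check on the Rule of 72 estimate that ChatGPT gave for the FINRA loan question, the short sketch below compares that approximation with the exact doubling time under annual compounding. The function names are illustrative and are not part of any quiz or ChatGPT output; the only inputs taken from the question are the $1,000 balance and the 20% annual rate.

```python
import math


def rule_of_72_years(rate_percent: float) -> float:
    """Rule-of-72 shortcut: 72 divided by the annual rate in percent."""
    return 72 / rate_percent


def exact_doubling_years(rate_percent: float) -> float:
    """Exact doubling time with annual compounding: ln(2) / ln(1 + r)."""
    return math.log(2) / math.log(1 + rate_percent / 100)


def balance_after(principal: float, rate_percent: float, years: int) -> float:
    """Amount owed after a whole number of years with no payments."""
    return principal * (1 + rate_percent / 100) ** years


approx_years = rule_of_72_years(20)
exact_years = exact_doubling_years(20)

print(f"Rule of 72 estimate: {approx_years:.1f} years")
print(f"Exact doubling time: {exact_years:.2f} years")
for year in range(1, 5):
    print(f"Balance after year {year}: ${balance_after(1000, 20, year):,.2f}")
```

The shortcut and the exact figure differ by a fraction of a year, but both land in the "2 to 4 years" range, so the approximation is good enough for this question.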
Limitations of Generative AI vs. Human Shortcomings
Despite its impressive performance, generative AI is not perfect. It struggled with more detailed calculation questions (tricky word problems, as my kids say). For example, the following question on the Wall Street Journal Quiz stumped ChatGPT.
Pam is deciding between 2 options: Option A: Invest $1,000 in a certificate of deposit that earns 5% interest. Pam would not add or remove any money from this investment for the next 30 years. Option B: Invest $1,000 in a savings account that earns 5% interest. Move the interest earned on this account every year into a safe at home. Pam would not add or remove any other money from the savings account or the safe for the next 30 years. At the end of 30 years, which of these options would provide the most money?
ChatGPT answered that options A and B would provide the same amount of money, failing to appreciate the loss of compounded interest on the withdrawn money in option B. Likewise, the following question from the FINRA Financial Literacy Quiz tripped up the AI.
Which of the following indicates the highest probability of getting a particular disease?
A. There is a one-in-twenty chance of getting the disease
B. 2% of the population will get the disease
C. 25 out of every 1,000 people will get the disease
ChatGPT translated each answer into the correct probabilities — 5%, 2%, and 2.5% — but, oddly, answered: “Option C has the highest probability of getting the particular disease among the provided options.”
I tried rephrasing these and other questions that ChatGPT got wrong. Doing so helped in some instances but not others. With practice, I, and by extension ChatGPT, could probably do better, but the exercise is still informative.
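Both of the misses described above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the 5% in the Pam question is an annual rate compounded once a year and that money in the safe earns nothing, and the function and variable names are made up for this example.

```python
def compounded_balance(principal: float, rate: float, years: int) -> float:
    """Option A: interest stays invested and itself earns interest."""
    return principal * (1 + rate) ** years


def balance_with_interest_in_safe(principal: float, rate: float, years: int) -> float:
    """Option B: each year's interest is moved to a safe and earns nothing."""
    return principal + principal * rate * years


option_a = compounded_balance(1000, 0.05, 30)
option_b = balance_with_interest_in_safe(1000, 0.05, 30)
print(f"Option A (certificate of deposit): ${option_a:,.2f}")
print(f"Option B (savings plus safe):      ${option_b:,.2f}")

# Disease question: put all three statements on the same scale.
disease_probabilities = {
    "A": 1 / 20,      # one-in-twenty chance
    "B": 2 / 100,     # 2% of the population
    "C": 25 / 1000,   # 25 out of every 1,000 people
}
highest = max(disease_probabilities, key=disease_probabilities.get)
for option, probability in disease_probabilities.items():
    print(f"Option {option}: {probability:.1%}")
print(f"Highest probability: option {highest}")
```

Under these assumptions, option A ends with roughly $4,322 against $2,500 for option B, and the one-in-twenty statement (5%) is the highest of the three disease probabilities.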
What Can Generative AI Do for Financial Literacy?
The imperfections and the nature of generative AI interactions place an even greater demand on users’ understanding of finance principles and critical thinking skills.
Traditional financial tools like Excel or a calculator require users to understand the problem before inputting the necessary data to obtain an answer. AI avoids this overhead. Just ask it a question in plain English, and out pops an answer and explanation. However, any lack of precision or clarity in the phrasing of the question can throw ChatGPT off course. Even a precisely worded question can be problematic if ChatGPT interprets it differently from what the user intended. Asking the question is half the battle.
Interpreting the answer is the other half. Yes, many of the answers were thoughtful and eloquent. However, some were misleading and others wrong. Users have to distinguish among these answers, but to do so requires them to understand the AI’s reasoning, reconcile it with its answer, and then be sure the answer is correct for the question being posed. This is a heavy burden.
The bottom line is that AI will play a large role in addressing financial literacy. But, at least in the short run, it is going to place an even greater emphasis on understanding financial principles and developing critical thinking skills, two central objectives of proper financial education.