January 8, 2023

ChatGPT takes my finance exam

Like many instructors these days, I am wondering if I should try to prevent my students from using AI to solve their assignments and exams. So I ran the final exam of my course Introduction to Finance through ChatGPT. The course is for master's students in data science who have no prior knowledge of finance. It covers standard, basic concepts in finance. The exam consists entirely of multiple-choice questions, set up so that answering at random gets you an expected score of 30%.

And ChatGPT gets the amazing score of... 20%. (My students got 73% on average.)

What is interesting is that ChatGPT explains its reasoning, so it is possible to check where it gets things wrong.

Sometimes it makes surprisingly basic mistakes such as believing that 96.12 is greater than 97.06. Excerpt:

Present Value of Asset A = 100 / (1 + 0.03)^1 = 97.06
Present Value of Asset B = 100 / (1 + 0.03)^2 = 94.17
Present Value of Asset C = 50 / (1 + 0.03)^1 + 50 / (1 + 0.03)^2 = 96.12

As you can see, Asset C has the highest present value, followed by Asset A and then Asset B.
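As a quick check, the three present values can be recomputed in a few lines of Python. This is a minimal sketch that assumes the cash flows and the 3% discount rate shown in the excerpt; the helper name `present_value` and the `assets` dictionary are mine, for illustration only. The recomputed figures differ slightly from those in the excerpt, but the point is the ranking: Asset A comes first, then Asset C, then Asset B.

```python
def present_value(cash_flows, rate):
    """Discount a list of (year, amount) cash flows at a constant annual rate."""
    return sum(amount / (1 + rate) ** year for year, amount in cash_flows)


rate = 0.03  # discount rate taken from the excerpt

# Cash flows as (year, amount) pairs, read off the excerpt's formulas.
assets = {
    "A": [(1, 100)],
    "B": [(2, 100)],
    "C": [(1, 50), (2, 50)],
}

values = {name: present_value(flows, rate) for name, flows in assets.items()}

# Sort asset names from highest to lowest present value.
ranking = sorted(values, key=values.get, reverse=True)

for name in ranking:
    print(f"Asset {name}: {values[name]:.2f}")
```

Receiving 100 after one year is worth more than receiving the same 100 split between year one and year two, so Asset C cannot beat Asset A at any positive discount rate.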

More often, the AI fails to understand basic concepts in finance or draws incorrect implications from them. For instance, it does not understand diversification. It writes:

A well diversified portfolio eliminates both idiosyncratic risk and systematic risk.

It does not understand that only idiosyncratic risk can be diversified away, whereas systematic risk such as market-wide risk is non-diversifiable.
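A small calculation illustrates the distinction. This is only a sketch under assumptions of my own choosing, not something taken from the exam: an equally weighted portfolio of n stocks, each with a beta of 1 to the market, a market volatility of 20%, and uncorrelated idiosyncratic volatility of 30% per stock. In that simple one-factor setup, portfolio variance is the market variance plus the idiosyncratic variance divided by n.

```python
import math


def portfolio_volatility(n, market_vol=0.20, idio_vol=0.30):
    """Volatility of an equally weighted portfolio of n stocks.

    Assumes a one-factor model in which every stock has beta 1 and
    idiosyncratic shocks are uncorrelated across stocks (illustrative values).
    """
    systematic_var = market_vol ** 2        # unaffected by the number of stocks
    idiosyncratic_var = idio_vol ** 2 / n   # averages out as n grows
    return math.sqrt(systematic_var + idiosyncratic_var)


for n in (1, 10, 100, 10_000):
    print(f"{n:>6} stocks: volatility = {portfolio_volatility(n):.4f}")
```

As n grows, the volatility falls toward the 20% market volatility but never below it: the idiosyncratic part is diversified away, the systematic part is not.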

It does not understand market efficiency either. It writes:

If the stock market is efficient, you expect that the stock price is going up today [in reaction to the announcement of a piece of news] if [the news] was anticipated by investors.

Instead, in an efficient market, if a piece of information is anticipated by investors, say a week before the official announcement, the stock price incorporates the information a week before the announcement and does not react upon the actual announcement.
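A toy timeline makes the point concrete. All numbers here are hypothetical and chosen only for illustration: a stock worth 100, good news worth 5 per share, which investors come to anticipate on day -7 and which is officially announced on day 0. The sketch assumes a fully efficient market, so the price on each day simply reflects everything investors know on that day.

```python
BASE_PRICE = 100.0       # hypothetical price before any information arrives
NEWS_VALUE = 5.0         # hypothetical value of the good news, per share
ANTICIPATION_DAY = -7    # day on which investors come to expect the news
ANNOUNCEMENT_DAY = 0     # day of the official announcement


def efficient_price(day):
    """Price implied by everything investors know on the given day."""
    return BASE_PRICE + (NEWS_VALUE if day >= ANTICIPATION_DAY else 0.0)


def price_change(day):
    """Change in price relative to the previous day."""
    return efficient_price(day) - efficient_price(day - 1)


for day in (ANTICIPATION_DAY, ANNOUNCEMENT_DAY):
    print(f"day {day:+d}: price change = {price_change(day):+.2f}")
```

The whole price move happens on day -7, when the information becomes known; on the announcement day itself the change is zero.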

Final thoughts

It is usually argued that AI does better than humans in areas where the human brain is prone to making mistakes, such as statistics, because statistical reasoning is often counter-intuitive. This advantage does not seem to carry over to economics and finance, which also feature many counter-intuitive notions such as market efficiency. One possible reason is that economics relies on a combination of interpreting practical situations and deductive reasoning, whereas statistics leans much more heavily on deductive reasoning.

The poor performance of ChatGPT might be due to the fact that the public version is a general-purpose AI, whereas one specifically trained to solve finance problems would perhaps do a better job. Still, it is quite striking that the public version cannot even solve the exam for an introductory course. So AI is probably not yet going to substitute for students' intelligence in solving my exams, and it is even less likely to replace skilled humans in the financial sector.

