Slow learning in the market for lemons: a note on reinforcement learning and the winner's curse

Nick Feltovich

Human-subject experiments using markets with asymmetric information typically exhibit a "winner's curse", in which bidders systematically bid more than the optimal amount. The winner's curse is very persistent: even when subjects make decisions repeatedly in the same situation, they continue to overbid. Why do people keep making the same mistakes? In this chapter, we consider a class of one-player decision problems that generalize Akerlof's (1970) market-for-lemons model. We show that if decision makers learn via reinforcement, specifically by the reference point model of Erev and Roth (1996), their behavior typically changes very slowly, and persistent mistakes are likely. We also develop testable predictions regarding when individuals ought to be able to learn more quickly.