Present bias, the tendency to weigh costs and benefits incurred in the present too heavily, is one of the most widespread human behavioral biases, and it has been studied extensively in the behavioral economics literature. While the simplest models assume that decision-making agents are naive, reasoning about the future without taking their bias into account, there is considerable evidence that people often behave in ways that are sophisticated with respect to present bias, making plans based on the belief that they will be present-biased in the future. For example, committing to a course of action in order to reduce future opportunities for procrastination or overconsumption is an instance of sophisticated behavior in everyday life. Models of sophisticated behavior have lacked an underlying formalism that allows one to reason over the full space of multi-step tasks that a sophisticated agent might face, and this has made it correspondingly difficult to make comparative or worst-case statements about the performance of sophisticated agents in arbitrary scenarios. In this paper, we incorporate the framework of sophistication into a graph-theoretic model that we used in recent work for modeling naive agents. This new synthesis of two formalisms, sophistication and graph-theoretic planning, uncovers a rich structure that was not apparent in the earlier behavioral economics work on this problem, including a range of findings that shed new light on sophisticated behavior. In particular, our graph-theoretic model makes two kinds of new results possible. First, we give tight worst-case bounds on the performance of sophisticated agents in arbitrary multi-step tasks relative to the optimal plan, along with worst-case bounds for related questions.
Second, the flexibility of our formalism makes it possible to identify new phenomena about sophisticated agents that had not been seen in prior literature: these include a surprising non-monotonic property in the use of rewards to motivate sophisticated agents; a sharp distinction in the performance of agents who overestimate versus underestimate their level of present bias; and a framework for reasoning about commitment devices that shows how certain classes of commitments can produce large gains for arbitrary tasks.