On expectations, information and dynamic game equilibria

Abstract

Optimization techniques developed for physical and engineering systems are often applied to the control of economic systems. The objective functional of the controller (e.g., the government) is minimized given a passive economic system. However, as Lucas (1976) has stressed, an economy is not a passive system. In an economy there are, in general, several independently acting controllers. The actions of these controllers depend on their expectations about the actions of the other decision-makers. Forward-looking expectations destroy the standard non-anticipation property of a system. As Kydland and Prescott (1977) have noticed, the optimal policy is then time inconsistent. Kydland and Prescott (1977) more or less concluded that the optimal control approach as such had to be reconsidered. Later on, however, it became the common view that the optimal control approach had to be placed in a game-theoretic framework, and this led to a revival of dynamic game theory in the economic literature. The famous result of Simaan and Cruz (1973), that Bellman's principle of optimality does not generalize to the Stackelberg solution, became known as time inconsistency.

What exactly is the problem? The answer to this question has a technical side and a conceptual side. Technically speaking, the problem is that dynamic programming cannot be used to find the closed-loop no-memory Stackelberg solution. Conceptually speaking, the problem is that the global Stackelberg decision model yields policies which become suboptimal in the course of the game when reoptimization is allowed for: in the future there can be an incentive to change the policy that was originally established. Furthermore, it is hard to defend a decision model in which the follower believes such a time-inconsistent announcement by the leader.

The Stackelberg solution concept is a sequential concept: the players act one after another. The follower knows the leader's action when he acts himself. However, the follower does not know the leader's future actions; he has to decide on the basis of the leader's announcement. Incentives to cheat arise for the leader because of the time inconsistency. Two solutions can be distinguished. Firstly, the follower believes the announcement and the leader does not cheat or reoptimize. This situation results in the global Stackelberg outcome. Secondly, a reoptimization in the future is to be expected. In that case the outcome has to be consistent, that is to say, a future reoptimization will not lead to a change in policy; otherwise the announcement will not be believed. The feedback stagewise Stackelberg solution, which is found by means of dynamic programming, is consistent by construction. It is also possible to formulate a consistent open-loop Stackelberg solution. The reputation of the government plays an important role here. It is an interesting idea to try to formalize the concept of reputation [see, e.g., Kreps and Wilson (1982) and Barro and Gordon (1983)]. In dynamic finite-horizon games the loss of reputation in the course of the game can be formalized by an end-penalty.

It is important to note that these problematic aspects of the Stackelberg concept can also occur in a game where the players have to act simultaneously. One of the players (e.g., the government) can try to become a Stackelberg leader by announcing his policy before it is actually played. In this type of game, additional incentives to cheat arise!
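To make the time-inconsistency problem concrete, consider a minimal two-period scalar linear quadratic Stackelberg game; the specification and numbers are purely illustrative and are not the example used in Section 3. Let
\[
x_{t+1} = x_t + u_t + v_t, \qquad t = 0, 1, \qquad x_0 = 1,
\]
where $v_t$ is the leader's instrument and $u_t$ the follower's, and let the players minimize
\[
J_L = x_1^2 + x_2^2 + v_0^2 + v_1^2, \qquad J_F = x_1^2 + x_2^2 + u_0^2 + u_1^2.
\]
Given an announced path $(v_0, v_1)$, the follower's open-loop reaction implies $x_1 = (2 + 2v_0 - v_1)/5$ and $x_2 = (1 + v_0 + 2v_1)/5$. Minimizing $J_L$ subject to this reaction yields the global Stackelberg announcement $(v_0, v_1) = (-1/6, 0)$, with $x_1 = 1/3$. If the leader is allowed to reoptimize at $t = 1$, he minimizes $x_2^2 + v_1^2$ against the follower's one-stage reaction $x_2 = (x_1 + v_1)/2$, which gives $v_1 = -x_1/5 = -1/15 \neq 0$: the announced policy is time inconsistent. The feedback stagewise solution obtained by dynamic programming prescribes the last-stage rule $v_1 = -x_1/5$ for whatever $x_1$ materializes, and is therefore consistent by construction.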
Generally, the leader can gain by cheating on his announcement at the time of action. It is reasonable to assume that, once the leader has cheated, the follower will no longer believe his announcements. Three solutions can be distinguished. Firstly, the follower believes the announcement and the leader does not cheat or reoptimize. This situation results again in the Stackelberg outcome. Secondly, a reoptimization in the future is to be expected. In that case the outcome has to be consistent. Finally, cheating is expected. In that case the announcement has to be ‘cheating-proof’; consistency of the announcement is necessary for this, but not sufficient. It will be shown that the Nash concept seems to be the only reasonable concept for this game. For this reason the Nash announcement can be called a credible announcement. When cheating is not expected but occurs nevertheless, several outcomes are possible. For example, the follower believes the announcement, consistent or not, and the leader cheats on it. It is reasonable to assume that, after this has happened, the follower stops believing. All possible outcomes have to be evaluated by the leader against his expected loss of reputation.

The paper is organized as follows. Section 2 discusses the impact of the information structure on the outcome of Nash and Stackelberg games. The terminology is briefly summarized and some examples are given. Some papers [e.g., Backus and Driffill (1985)] jump very quickly from an open-loop Stackelberg framework to a feedback stagewise or dynamic-programming framework in order to achieve consistency. It is true that the feedback stagewise solution concept has very nice properties (such as subgame perfectness). It presupposes, however, information on the state of the system. A change in information structure should not be justified solely by the striving for time consistency. Section 3 elaborates on announcements, consistency and cheating. A simple linear quadratic example is used for illustration. Definitions and propositions are brought together in an appendix.
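As a rough preview of the kind of comparison made in Section 3, the gain from cheating can be illustrated in the two-period example introduced above (again, the specification is ours and purely illustrative). Under the announcement $(v_0, v_1) = (-1/6, 0)$ the believing follower commits to $(u_0, u_1) = (-1/2, -1/6)$, so that $x_1 = 1/3$. If the leader honours the announcement, his remaining loss at $t = 1$ is $x_2^2 + v_1^2 = 1/36$. If he cheats, he minimizes $(1/6 + v_1)^2 + v_1^2$, plays $v_1 = -1/12$ instead of the announced $v_1 = 0$, and cuts the remaining loss to $1/72$. The announcement is therefore not cheating-proof: a cheating-proof announcement must prescribe exactly the action the leader will want to take once the follower's behaviour has been determined by that same announcement, which is the sense in which the Nash announcement can be called credible. Whether the leader exploits the gain from cheating then depends on how he weighs it against his expected loss of reputation.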