The prevalence of overconfidence, i.e., the human tendency to overestimate our own skills and chances of success, is a classic topic in the behavioral literature. Experimental evidence has confirmed the role of overconfidence in areas as diverse as financial markets, health, driving, insurance markets, job search and consumer behavior. Many anomalies in financial markets, such as excessive volatility, return predictability, excessive trading and under-diversification, have also been attributed to investor overconfidence. Finally, there is a vast body of research on managerial overconfidence. Executives appear to be particularly prone to overconfidence, and the effects attributed to it include mergers and acquisitions and high rates of business failure, among others.

However, when we delve into the concept of overconfidence, things become more complex. Moore and Healy (2008) claim this bias has been studied in inconsistent ways. They identify three different measures of overconfidence that had previously been confounded in the literature. In particular, people may exhibit overconfidence: (1) in estimating their own performance (‘overestimation’, **E**); (2) in estimating their own performance relative to others (‘overplacement’, **P**); and (3) in being excessively precise when estimating future uncertainty (‘overprecision’, **M**). Moore and Healy’s model predicts that overprecision is systematic, that overestimation increases with task difficulty and that overplacement decreases with it. The latter two predictions explain the previously observed hard-easy effect: on easy tasks, people underestimate their performance but overplace themselves relative to others; hard tasks, instead, produce overestimation and underplacement.

To avoid confusing overestimation and overprecision, we study overestimation by measuring perceptions across a set of items, whereas overprecision is analyzed through a series of questions on interval estimates. To elicit parameters **E** and **P**, participants were required to complete a set of 4 trivia quizzes with 10 questions each. To account for the hard-easy effect, two quizzes were easy and two were hard. In each quiz, participants had to mark the correct answer for each item. Then, when they finished a quiz, they were required to estimate their own score, as well as the score of a *randomly selected previous participant* (RSPP).[1] They repeated the same process for all 4 rounds.

**Overestimation** (**E**) is calculated by subtracting a participant’s actual score in each of the 4 trivia quizzes from his or her reported expected score and summing all 4 results. A measure E > 0 means the respondent exhibits overestimation, while E < 0 means underestimation. The hard-easy effect may be tested by computing the same measure separately for the hard and easy tasks.
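The calculation of E can be sketched in a few lines of Python. This is only an illustration: the function name `overestimation`, the `is_hard` flags and all scores below are hypothetical and are not data from the experiment.

```python
def overestimation(expected, actual):
    """Sum over quizzes of (reported expected score - actual score)."""
    if len(expected) != len(actual):
        raise ValueError("need one expected and one actual score per quiz")
    return sum(e - a for e, a in zip(expected, actual))


# Hypothetical participant: two easy quizzes followed by two hard ones,
# each scored out of 10.
expected_scores = [8, 7, 4, 5]
actual_scores = [9, 8, 2, 3]
is_hard = [False, False, True, True]

# Overall measure across the 4 quizzes.
e_total = overestimation(expected_scores, actual_scores)

# Same measure computed separately for easy and hard quizzes.
e_easy = overestimation(
    [e for e, h in zip(expected_scores, is_hard) if not h],
    [a for a, h in zip(actual_scores, is_hard) if not h],
)
e_hard = overestimation(
    [e for e, h in zip(expected_scores, is_hard) if h],
    [a for a, h in zip(actual_scores, is_hard) if h],
)

print(e_total, e_easy, e_hard)  # 2 -2 4
```

In this made-up case the participant underestimates on the easy quizzes (E = -2) and overestimates on the hard ones (E = 4), which is the pattern the hard-easy effect describes.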

**Overplacement** (**P**) is calculated taking into account whether a participant is really better than others. For each quiz we compute (E[X_{i}] – E[X_{j}]) – (x_{i} – x_{j}) and then sum all 4 results. Here E[X_{i}] is the participant’s belief about her expected performance in a particular quiz, E[X_{j}] is her belief about the expected performance of the RSPP, and x_{i} and x_{j} are the actual scores of the individual and the RSPP. A measure P > 0 means the participant exhibits overplacement and P < 0 means underplacement. Again, the hard-easy effect may be tested by computing the same measure separately for the hard and easy tasks.
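As a rough sketch, the per-quiz expression and its sum could be coded as follows in Python. The function name `overplacement` and all the scores are hypothetical, chosen only to show the arithmetic.

```python
def overplacement(own_expected, other_expected, own_actual, other_actual):
    """Sum over quizzes of (E[X_i] - E[X_j]) - (x_i - x_j)."""
    series = (own_expected, other_expected, own_actual, other_actual)
    if len({len(s) for s in series}) != 1:
        raise ValueError("all four score lists must have the same length")
    return sum(
        (ei - ej) - (xi - xj)
        for ei, ej, xi, xj in zip(*series)
    )


# Hypothetical data: quizzes 1-2 are easy, quizzes 3-4 are hard.
own_expected = [8, 7, 4, 5]    # E[X_i]: belief about own score
other_expected = [6, 6, 5, 6]  # E[X_j]: belief about the comparison score
own_actual = [9, 8, 2, 3]      # x_i: own actual score
other_actual = [8, 8, 2, 4]    # x_j: actual comparison score

p_total = overplacement(own_expected, other_expected, own_actual, other_actual)
p_easy = overplacement(own_expected[:2], other_expected[:2],
                       own_actual[:2], other_actual[:2])
p_hard = overplacement(own_expected[2:], other_expected[2:],
                       own_actual[2:], other_actual[2:])

print(p_total, p_easy, p_hard)  # 1 2 -1
```

Subtracting the actual difference (x_i - x_j) is what separates a participant who correctly believes she is better than others from one who merely believes it.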

To elicit parameter **M**, participants were presented with a series of 6 questions on 3 domains (device inventions, mortality rates and walking time between two places). For each question they were asked to give a 3-point estimate (median, 10% fractile and 90% fractile), so we have the lower and upper bounds of an 80% confidence interval. Soll and Klayman (2004) show that overconfidence in interval estimates may result from variability in interval widths. Hence, to disentangle variability from true overprecision, they define the ratio M = MEAD/MAD, where MEAD is the mean of the expected absolute deviations implied by each pair of fractiles a subject gives, and MAD is the observed mean absolute deviation. **Overprecision** (**M**) is calculated by obtaining an estimate M_{i} per domain and then computing M as either the median (M_{med}) or the average (M_{avg}) across the 3 domains. Here M = 1 implies perfect calibration and M < 1 implies overprecision: the lower M is, the greater the overprecision.
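The ratio M = MEAD/MAD can be illustrated with a minimal Python sketch. Two things here are assumptions made for illustration and are not taken from the text: the expected absolute deviation implied by an 80% interval is derived by treating the subjective distribution as normal, and the absolute deviation is measured between the median estimate and the true answer. The function names and all numbers are hypothetical.

```python
from math import pi, sqrt
from statistics import NormalDist, mean, median


def expected_abs_dev(low, high, coverage=0.80):
    """Expected absolute deviation implied by a central interval.

    ASSUMPTION: the subjective distribution is normal, so the interval
    width fixes sigma and E|X - mu| = sigma * sqrt(2 / pi).
    """
    z = NormalDist().inv_cdf(0.5 + coverage / 2)  # about 1.28 for 80%
    sigma = (high - low) / (2 * z)
    return sigma * sqrt(2 / pi)


def overprecision_ratio(answers):
    """M = MEAD / MAD for a list of (low, med, high, truth) tuples."""
    mead = mean(expected_abs_dev(lo, hi) for lo, _, hi, _ in answers)
    # ASSUMPTION: observed deviation is |truth - median estimate|.
    mad = mean(abs(truth - med) for _, med, _, truth in answers)
    if mad == 0:
        raise ValueError("MAD is zero; the ratio is undefined")
    return mead / mad


# Hypothetical answers: (10% fractile, median, 90% fractile, true value).
domains = {
    "inventions": [(1900, 1920, 1940, 1876), (1950, 1960, 1970, 1973)],
    "mortality": [(5, 10, 15, 22), (100, 150, 200, 140)],
    "walking_time": [(20, 30, 40, 55), (10, 15, 20, 14)],
}

# One M_i per domain, then aggregate across the 3 domains.
m_by_domain = {name: overprecision_ratio(ans) for name, ans in domains.items()}
m_med = median(m_by_domain.values())
m_avg = mean(m_by_domain.values())

print(m_by_domain, m_med, m_avg)
```

The median aggregate is less sensitive to a single domain with an unusually large or small ratio, which may be one reason to report both M_{med} and M_{avg}.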

In a subsequent post we will provide the results of the experiment.

[1] In our experiment participants were required to estimate “the average score of other students here today and in similar sessions with students of this University”.

References

Moore, D. A. and P. J. Healy (2008), The trouble with overconfidence, *Psychological Review* 115(2), 502–517.

Soll, J. B. and J. Klayman (2004), Overconfidence in interval estimates, *Journal of Experimental Psychology: Learning, Memory and Cognition* 30(2), 299–314.