A recent article discusses how humans may estimate the value of choices (1). The alternative taken provides feedback that allows the decision-maker to assign it a value for future decisions. However, this also results in a likely biased, low value for the alternative not taken. No feedback can exist for this counterfactual, so the decision-maker simply inverts the outcomes she observes for the alternative selected. This is reasonably intuitive, but the experiments also indicate that the duration and complexity of the decision do not affect this process. Human decision-making in the presence of multiple choices simply reinforces outcomes based on the feedback that is observable.
There could be multiple extensions to this experimental setup. First, in repeated independent decisions, one could ask how long it takes to reverse this apparent reinforcement effect. That is, if the choice is binary (yes/no) and the decision-maker first picked yes and got a positive outcome by chance, how many repeated trials with negative outcomes for the yes choice does it take to neutralize the expectation that this choice is superior? Evolution may have introduced a quirk into the brain here, because only a positive outcome for the choice selected (such as avoiding the location of a predator) allows repetition, whereas a negative outcome at any point in the chain simply terminates the decision-maker. Financial market trading data may provide rich fodder for such an analysis.
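The reversal question can be made concrete with a toy simulation. The sketch below is not the model from the cited article. It assumes a simple delta-rule learner in which the chosen option's value moves toward the observed outcome (+1 or -1) and the unchosen option's value moves toward the inverse of that outcome. The function name and the learning rates `alpha_pos` and `alpha_neg` are hypothetical, introduced only for illustration.

```python
def trials_to_neutralize(n_positive, alpha_pos=0.3, alpha_neg=0.3, max_trials=10_000):
    """Count the negative 'yes' outcomes needed before 'yes' stops looking superior.

    Assumed (hypothetical) model: a delta-rule learner. The chosen option
    ('yes') is updated toward the observed outcome; the unchosen option ('no')
    is updated toward the inverse of that outcome, since it gets no feedback.
    """
    q_yes = 0.0  # learned value of the option taken
    q_no = 0.0   # imputed value of the option not taken

    # Phase 1: 'yes' is chosen and happens to pay off n_positive times.
    for _ in range(n_positive):
        q_yes += alpha_pos * (1.0 - q_yes)
        q_no += alpha_pos * (-1.0 - q_no)   # inverse of the observed outcome

    # Phase 2: 'yes' is chosen again but now yields negative outcomes.
    for trial in range(1, max_trials + 1):
        q_yes += alpha_neg * (-1.0 - q_yes)
        q_no += alpha_neg * (1.0 - q_no)    # inverse of the observed outcome
        if q_yes <= q_no:
            return trial  # expectation of superiority is neutralized

    return None  # not neutralized within max_trials


if __name__ == "__main__":
    print(trials_to_neutralize(1))                   # symmetric learning
    print(trials_to_neutralize(3))                   # longer lucky streak
    print(trials_to_neutralize(1, alpha_neg=0.1))    # losses weighted less than wins
```

Under these assumed rates, a single lucky win is cancelled by one loss when learning is symmetric, but takes three losses when negative outcomes are weighted less than positive ones. An asymmetry of that kind is one way the evolutionary quirk suggested above could be represented, and it is the sort of parameter that trading data might help estimate.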
Second, if the repeated experiments are not independent but related to each other, how does the decision-maker value sequential decisions that progress over time and do not allow backtracking? If decisions are path dependent, the stated process may bias decision-makers in a direction that can be shown mathematically to be suboptimal. Sequential business decision-making often shows this feature, which systematically leads the decision-maker down the wrong path based on previous outcomes. Assigning an ex ante probability to the observed outcome may alleviate this bias.
Finally, this is also important for societal decision-making. If society and policy-makers take a decision, say in the presence of a pandemic with multiple policy choices, feedback is available only for the choice taken, such as social distancing and vaccination. If such decisions have had some negative outcomes, society will automatically value the alternatives higher, even though that experiment was never run. This understanding has policy implications and may need to be studied broadly.
Assigning a value to the counterfactual from direct feedback is not possible. Using the inverse of the outcomes of the option taken appears to be the human brain's default procedure. This is likely suboptimal in most cases.