The Evaluation of Expert Judgment and Why We Should Study It
I believe we have all asked ourselves at least once whether we should carry an umbrella because the weather forecast predicted rain. Should we trust this opinion? How believable are such judgments? On what basis do experts make them? These are the questions we ask before taking an action based on expert opinion. Many human skills are practiced in a social context, such as lie detection, judicial and jury decisions, and the evaluation of medical images. The opinions of such practitioners are often treated as expert judgments and are used to help others choose an appropriate action.
In this system two subsystems can be identified: the expert and the consumer. The expert can be a person, a group of people, or even a computer simulation, whereas the consumer is the person who uses the expert’s judgments to decide on an action. How are these judgments made? Harvey (1992) suggested that when an expert judges whether a specific event has or has not occurred, or will or will not occur, the behaviour is equivalent to that of an observer in a vision experiment who reports whether a stimulus was present on the previous trial. He argued that the same underlying signal detection theory applies in both cases. The version of signal detection theory he proposed is the Dual-Gaussian, Variable-Criterion Model, a more general form of the law of comparative judgment formulated by Louis L. Thurstone (Harvey, 1992). The law of comparative judgment describes the discriminal process that operates when two or more stimuli are compared.
However, because the judgment given by an observer is not consistent from one occasion to another, the discriminal process fluctuates. The theoretical psychological continuum is therefore constructed so that the frequencies of the discriminal processes for any given stimulus form a normal distribution (Thurstone, 1994). In the model, the strength of evidence when the event occurs and when it does not occur is described by two normal distributions, and these determine the probability that the expert says “yes” or “no”. In general, the expert processes a range of information and reduces it to a single value representing the strength of the evidence for the occurrence of the event. If this evidence strength exceeds a set value, the expert makes a positive judgment by saying “yes”; otherwise the expert says “no”.
As mentioned above, the expert compares this value with an internal decision criterion, Xc, and makes a judgment about the situation. The decision can be a simple yes or no, or it can be a rating, in which case the expert holds n-1 criteria that divide the responses into n rating categories (Harvey, 1992). The hit rate corresponds to the area under the signal probability distribution above Xc, and the false alarm rate corresponds to the same area under the noise distribution. The hit rate and false alarm rate for a specific Xc are therefore given by the following two integrals:
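As a sketch only, the two integrals can be written as upper-tail areas of the signal and noise Gaussians. The symbols \mu_s, \sigma_s, \mu_n and \sigma_n (means and standard deviations of the signal and noise distributions) are assumptions introduced here for illustration, and the exact form used by Harvey (1992) may differ:

```latex
\begin{align}
HR  &= p(Y \mid s) = \int_{X_c}^{\infty} \frac{1}{\sigma_s\sqrt{2\pi}}
       \exp\!\left[-\frac{(x-\mu_s)^2}{2\sigma_s^2}\right] dx \\
FAR &= p(Y \mid n) = \int_{X_c}^{\infty} \frac{1}{\sigma_n\sqrt{2\pi}}
       \exp\!\left[-\frac{(x-\mu_n)^2}{2\sigma_n^2}\right] dx
\end{align}
```

Raising Xc shrinks both areas, so a stricter criterion lowers the hit rate and the false alarm rate together.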
Equations (2) and (3) together define the receiver operating characteristic (ROC) curve of the expert. The two integrals may be evaluated by converting Xc to a z-score of the appropriate distribution and looking up the corresponding probability; plotted on z-score coordinates, the ROC curve becomes a straight line. The indices da and Az are introduced to compare the accuracy of experts independently of Xc and of the prior probability of the event. However, these values still do not describe the performance of the expert. Harvey (1992) therefore introduced a further measure, the posterior hit probability p(s|Y), which is the probability that the event occurs given that the expert has said “yes”, and which therefore describes the performance of the expert. By Bayes’ theorem, p(s|Y) = p(Y|s) p(s) / [p(Y|s) p(s) + p(Y|n) p(n)], where p(Y|s) is the hit rate, p(Y|n) is the false alarm rate, p(s) is the prior probability of the event occurring, and p(n) is the prior probability of the event not occurring.
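The posterior hit probability described above can be illustrated with a short Python sketch. It assumes a dual-Gaussian model with hypothetical parameters (a unit-variance noise distribution centred at 0 and a signal distribution centred at 1); the function names and default values are illustrative and are not taken from Harvey (1992).

```python
from statistics import NormalDist


def hit_and_false_alarm(xc, mu_s=1.0, sigma_s=1.0, mu_n=0.0, sigma_n=1.0):
    """Return (hit rate, false alarm rate) for decision criterion xc.

    Both rates are upper-tail areas above xc: the hit rate under the
    signal distribution, the false alarm rate under the noise distribution.
    The distribution parameters are assumed values for illustration.
    """
    signal = NormalDist(mu_s, sigma_s)
    noise = NormalDist(mu_n, sigma_n)
    hit_rate = 1.0 - signal.cdf(xc)
    false_alarm_rate = 1.0 - noise.cdf(xc)
    return hit_rate, false_alarm_rate


def posterior_hit_probability(hit_rate, false_alarm_rate, prior_s):
    """Bayes' theorem: probability the event occurs given a "yes" judgment."""
    prior_n = 1.0 - prior_s
    p_yes = hit_rate * prior_s + false_alarm_rate * prior_n
    if p_yes == 0.0:
        return float("nan")  # the expert never says "yes"
    return hit_rate * prior_s / p_yes


if __name__ == "__main__":
    # Hypothetical criterion and prior, chosen only to show the calculation.
    hr, far = hit_and_false_alarm(xc=0.5)
    post = posterior_hit_probability(hr, far, prior_s=0.3)
    print(f"hit rate={hr:.3f}, false alarm rate={far:.3f}, p(s|Y)={post:.3f}")
```

Because the prior enters the calculation directly, the same expert (same hit and false alarm rates) is less believable when the event is rare.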
Harvey (1992) therefore proposed the critical operating characteristic (COC), which represents the inverse relationship between the hit rate and the posterior probability, as shown in Fig. 1. Two major pieces of information can be extracted from the COC curve: the posterior probability each expert achieves when all experts reach the same hit rate, and the critical hit rate (HRcr) each expert reaches when all are made equally believable. For the four COC curves shown in Fig. 1, when the four experts are all held to the same posterior probability of 0.9, the hit rate of the first expert from the left is only 0.00004. This expert has set the decision criterion so that most of the hit rate is sacrificed to make him or her more believable to the consumer.
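The trade-off traced by a COC curve can be sketched in Python by sweeping the decision criterion and recording the hit rate and posterior probability at each step. The sketch assumes a dual-Gaussian model with a standard normal noise distribution; the two experts, their sensitivities, the prior of 0.5, and the function names are all hypothetical and do not reproduce the experts in Fig. 1.

```python
from statistics import NormalDist


def coc_curve(mu_s, sigma_s, prior_s, criteria):
    """Return (hit rate, posterior probability) pairs, one per criterion.

    Noise is assumed to be standard normal; mu_s and sigma_s describe the
    signal distribution of a hypothetical expert.
    """
    signal = NormalDist(mu_s, sigma_s)
    noise = NormalDist(0.0, 1.0)
    points = []
    for xc in criteria:
        hit_rate = 1.0 - signal.cdf(xc)
        false_alarm_rate = 1.0 - noise.cdf(xc)
        p_yes = hit_rate * prior_s + false_alarm_rate * (1.0 - prior_s)
        if p_yes > 0.0:
            points.append((hit_rate, hit_rate * prior_s / p_yes))
    return points


def critical_hit_rate(points, target_posterior):
    """Largest hit rate at which the posterior reaches the target, or None."""
    eligible = [hr for hr, post in points if post >= target_posterior]
    return max(eligible) if eligible else None


if __name__ == "__main__":
    criteria = [i / 100 for i in range(-300, 501)]  # Xc from -3.00 to 5.00
    for label, mu_s in (("less sensitive expert", 1.0),
                        ("more sensitive expert", 2.0)):
        points = coc_curve(mu_s, 1.0, prior_s=0.5, criteria=criteria)
        print(label, critical_hit_rate(points, target_posterior=0.9))
```

Under these assumptions, the less sensitive expert has to give up far more of the hit rate to reach the same posterior probability, which is the pattern the COC comparison is meant to expose.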
Ideally, the expert should have high believability while maintaining a high hit rate. However, this is not always achievable because of the uncertainty of the situation and the limitations of the experts.
Now we have a better understanding of how experts’ judgments are made. But is this evidence enough for us to decide whether to act on them? In fact, expert judgment can be affected and biased for numerous reasons. Some of the most widely known psychometric analyses involving clinical psychologists took place in the 1950s and 1960s. In 1959, 22 subjects, comprising 4 practicing clinical psychologists, 10 psychology interns, and 8 naive subjects, were asked to evaluate 30 tests for cortical brain damage. The accuracy of all groups ranged from 65% to 70%, indicating no significant difference between groups (Goldberg, 1959). Similar results were observed in studies of other kinds of experts, such as medical doctors (Einhorn, 1974), clinical psychologists (Oskamp, 1962), and court judges (Ebbesen and Konecni, 1975). One might argue that giving the judges more information would increase the accuracy of the experts. However, Oskamp (1965) demonstrated that providing more information to participants increased their confidence ratings but was not reflected in the accuracy of their judgments. Goldberg (1968) concluded, and I quote: “All together, the conclusion from psychometric research is that the experts are lacking in validity and reliability and that more information increases confidence but not accuracy”.
This result runs contrary to the common belief that an expert with better knowledge should be more accurate in their judgments. If all of this is true, why do we still need experts, and why do we still study them? On the one hand, experts can act as a representation of the majority and help psychologists better understand the decision making process of the majority. The common explanation in the literature for the low accuracy of experts is that they rely on heuristics when making judgments (Shanteau and Stewart, 1992). Although this approach sometimes works, it often leads to biases and errors that can affect both experts and naive subjects. Research by Northcraft and Neale (1987) on pricing decisions by real estate agents showed that the agents’ price estimates were all biased towards the initial price given to them. This finding suggests that decision bias and heuristics play a crucial part in everyday decision making and that neither experts nor naive subjects are immune to them. In this way, studying the decision behaviour of experts helps psychologists better understand the same process in the majority. On the other hand, studying experts helps in investigating decision making under uncertainty. However, the results and evidence so far point in the same direction: expert judgments in most clinical and medical domains are no more accurate than those of slightly trained novices (Camerer and Johnson, 1991).
Although the evidence listed so far suggests that expert judgment is no different from, or sometimes worse than, that of naive subjects, there are exceptions. The first notable counterexample came from research on weather forecasters in 1977.
In the research by Murphy and Winkler (1977), the results showed an improved calibration line for probability assessments of precipitation, which had typically been poorly calibrated in earlier related research. This evidence indicates that in some specific fields experts can demonstrate improved performance.
Now the question returns to whether consumers should consider expert opinion and how they should act on the judgments of experts. Most of the time, the goals of the expert and the consumer are not the same; for instance, the goal of the expert might be to maximise the hit rate, whereas the goal of the consumer involves taking social policy into account. So, in order to make the best possible decision for a given goal, the consumer needs knowledge of both the expert and the event. Equipped with this knowledge, the consumer should be able to choose an action that implements the desired social policy or cost-benefit outcome. Shuman et al. (1996) studied how jurors assess an expert’s believability. Using data collected from a phone survey of 156 civil trial jurors, they found that jury members were able to make rational decisions based on the expert’s perceived qualifications, familiarity with the facts, good reasoning, and impartiality, regardless of the occupation of the expert or the characteristics of the jurors.
On the one hand, the decisions made by the jurors seem fair and justifiable. On the other hand, the study raises an obvious concern about manipulation of jurors by the expert, since the jurors’ decisions are based mainly on “rational criteria” that are supplied by the expert. It is rather dangerous for a consumer with little knowledge of the event to decide on an action mainly on the basis of expert judgment.
I conclude from all the evidence that the judgments of experts can sometimes be biased for a variety of reasons, just like those of slightly trained individuals or naive subjects. It is therefore risky to depend fully on an expert opinion instead of studying the event ourselves. Most of the time, expert judgments should serve only as an aid. Nevertheless, the study of expert judgment remains an interesting topic, and many psychologists are intrigued by the process and want to know more about it. Beyond personal interest, the influence of expert judgment is closely related to the decisions we make in daily life, which means it can have a significant effect on the quality of our lives and even on society, and this effect has the potential to become more pronounced in the future.