January 21, 2011

How competent are the competency evaluators?

Largest real-world study finds modest agreement among independent alienists

A bad forensic report by a bad evaluator sets bad events in motion.

True story: A judge ordered a robbery suspect to undergo evaluation. A drive-by psychologist opined that the defendant was incompetent to stand trial due to schizophrenia and attention deficit/hyperactivity disorder (ADHD). The judge rubber-stamped the evaluator's opinion. The defendant was shipped off to the dysfunctional state hospital system for competency restoration treatment. There, the psychologist's diagnoses were rubber-stamped. The unruly defendant was shot full of powerful antipsychotics, given a few months of bus therapy, and proclaimed competent. The defendant had never been psychotic in the first place. Years later, he remained just as mentally retarded as ever.

"Penny-wise, pound-foolish" is the expression that comes to mind. The courts try to save money by appointing only one psychologist per case, and by paying a ludicrously small sum that encourages shoddy practices. But cleaning up the resultant messes is costly, inefficient, and fundamentally unfair.

Competency evaluations are the bread and butter of forensic work. An estimated 60,000 defendants per year -- roughly 5% of the total -- are evaluated to see whether they understand their legal situations and can rationally assist their lawyers in their defense. But for all of the importance of accurate assessments, both to a smoothly running court system and to the rights of the mentally ill to a fair trial, surprisingly little is known about the real-world accuracy of forensic evaluators.

In the case I just outlined, the judge viewed psychologists and psychiatrists as equal and interchangeable, all inherently reliable and trustworthy. At the other extreme, some believe forensic opinions are as random as a chimp with a typewriter.

Hawaii: Exemplar or exception?

Only one U.S. state squarely addresses the problem of reliability in competency evaluations. In the Aloha State, when a doubt is raised as to a defendant's competency, three separate evaluators must conduct independent evaluations. One evaluator is a state employee; the other two are independent. One must be a psychiatrist. By law, the three cannot talk with each other about the case.

This makes Hawaii the perfect setting to examine the real-world reliability of competency evaluators. In a study just accepted for publication in Law and Human Behavior, three investigators took advantage of this opportunity to conduct the largest naturalistic study ever of evaluators' agreement about competency to stand trial.

It should not be a surprise that Daniel Murrie and Marcus Boccaccini are two of the investigators. Not the types to run Psych 101 undergrads through artificial lab experiments, these two are committed to examining forensic practice in the courtroom trenches. I've blogged about their previous work exposing "partisan allegiance" effects in the real-world application of the Psychopathy Checklist (PCL-R). For the current innovative study, they teamed up with W. Neil Gowensmith of the Hawaii courts' forensic services unit.

Examining 729 reports authored by 35 evaluators, they found that all three evaluators agreed in just under three out of four -- or 71 percent -- of initial competency referrals. Agreement was a bit lower -- 61 percent -- in cases where defendants were being reevaluated after undergoing competency restoration treatment.

Consistent with the results of a hot-off-the-press meta-analysis of 50 years of competency research, evaluators believed that the broad majority of defendants referred for evaluation, about 73 percent, were competent to stand trial. This figure was somewhat lower for defendants being reevaluated after an initial finding of incompetency, with evaluators opining competence in about half of such restoration cases.

Why do evaluators differ?

As for why agreement is not higher, the study raised more questions than it answered. The researchers sifted through the data looking for patterns, but none jumped out. Evaluators did not lean one way or the other by discipline (psychologist vs. psychiatrist) or by employer (state vs. private practice). Defendant demographics were not explanatory. Nor were evaluator disagreements about diagnosis.

It would be interesting to conduct qualitative analyses of the 216 cases in this study to see whether those in which evaluators differed were more complex and ambiguous than the others. I suspect that to be the case.

Competency is nebulous. It exists along a continuum, so there is no precise cut point at which a defendant is automatically "competent" or "incompetent" to go forward with his legal case. Thus, evaluator agreement will never be perfect, nor -- necessarily -- should it be.

How did the judges rule?

One of the more intriguing aspects of the study was its exposition of how judges ruled after being presented with three reports. Not surprisingly, when evaluators were unanimous or split 2-1, the judges tended to go with the majority. But unlike the judge in the vignette I described earlier, many Hawaiian judges were independent thinkers who did not just rubber-stamp the evaluators' opinions.

When they disagreed with the opinions of the court-appointed psychologists and psychiatrists, it was typically to find a defendant incompetent. In fact, in a few cases the judges found defendants to be incompetent even when all three evaluators believed a defendant was competent. In this way, they elevated defendants' due-process rights over prosecutorial efficiency. But maybe that's just Hawaii.

Moving forward

I found the results somewhat encouraging. When not subjected to partisan allegiance pressures, forensic practitioners agreed roughly three-fourths of the time on whether a defendant was competent to stand trial.

Still, if these results are generalizable, it means evaluators will disagree in about two or three cases out of every ten. So in jurisdictions that appoint only a single evaluator, the researchers point out, many judges may be unwittingly rubber-stamping an idiosyncratic -- and even patently incorrect -- opinion:

[T]o the extent that there is a factually correct answer to the question of whether or not a defendant is competent to stand trial, relying on one evaluator increases the likelihood that the court reaches an incorrect decision (by following an incorrect single opinion that would have been revealed as a minority opinion if other evaluations were available). In some instances, this may result in delaying a trial while a defendant is unnecessarily hospitalized. In other instances this may result in a defendant proceeding to trial when additional evaluator(s) would have opined the defendant was unable to participate meaningfully in that trial….
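
To put a rough number on the researchers' point, here is a back-of-the-envelope sketch in Python. It is an illustration only, not a calculation from the study, and it rests on assumptions that the study does not state: every case gets exactly three yes-or-no opinions, so any non-unanimous case is a 2-1 split, and a single-evaluator court in effect draws one of those three opinions at random. The function name `minority_share` is hypothetical.

```python
from fractions import Fraction


def minority_share(unanimous_rate: Fraction) -> Fraction:
    """Chance that one randomly chosen evaluator holds the minority opinion.

    Assumption (not from the study): three evaluators, binary opinions,
    so every non-unanimous case is a 2-1 split with exactly one dissenter.
    """
    split_rate = 1 - unanimous_rate      # share of cases with a 2-1 split
    return split_rate * Fraction(1, 3)   # one of the three is the dissenter


# Unanimous-agreement rates reported above: 71% initial, 61% reevaluation.
initial = minority_share(Fraction(71, 100))
reevaluation = minority_share(Fraction(61, 100))

print(f"initial referrals: {float(initial):.1%}")   # 9.7%
print(f"reevaluations: {float(reevaluation):.1%}")  # 13.0%
```

Under those assumptions, a lone evaluator's opinion would have turned out to be the minority view in roughly one of every ten initial referrals. Whether the minority view is actually the wrong one is a separate question that this arithmetic cannot answer.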

The justice system needs to continue to wrestle with how to handle these competing demands -- efficient use of resources versus fair consideration of defendants' right to due process.

Murrie and Boccaccini are on a roll. Let's hope they keep churning out this ground-breaking line of research, examining the real-world vagaries of forensic practice, and that others climb down from the ivory towers and jump on their bandwagon.

As they note, "naturalistic studies of field reliability are an essential first step in gauging wide-scale quality across all manner of forensic practice and targeting areas for improvement."
