January 30, 2014

Research roundup

The articles are flooding in at an alarming rate, threatening to bury me under yet another avalanche. Before I am completely submerged, let me share brief synopses of a few of the more informative ones that I have gotten around to reading.

Assessor bias in high-stakes testing: The case of children’s IQ

I’ve blogged quite a bit about bias in forensic assessment, reporting on problems with such widely used tests as the Psychopathy Checklist and the Static-99R. As I’ve reported, some of the bias can be chalked up to adversarial allegiance, or which side the evaluator is working for, whereas some may be due to personality differences among evaluators. Now, researchers are extending this research into other realms -- with alarming findings.

In a study of intelligence testing among several thousand children at 448 schools, the researchers found significant and nontrivial variations in test scoring that had nothing to do with children’s actual intelligence differences. The findings, reported in the journal Psychological Assessment, are especially curious because scoring of the test in question, the Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV), seems relatively straightforward and objective (at least as compared to inherently subjective tests like the Psychopathy Checklist, for example).

The article is:

  • Whose IQ Is It? Assessor Bias Variance in High-Stakes Psychological Assessment. McDermott, Paul A.; Watkins, Marley W.; Rhoad, Anna M. Psychological Assessment. Published online on Nov. 4, 2013. To request a copy from the first author, click HERE.

Beware pseudo-precision in expert opinions

I’ve never forgotten a video I saw a long time ago, in which the filmmakers drove up to random strangers and asked for directions to a nearby landmark. Some of the good samaritans gave enthusiastic instructions that were completely wrong, while other people gave correct directions but in a more tentative fashion. The trouble is, the more confident someone appears, the more we judge them as knowing what they are talking about.  

One way we gauge a presenter’s confidence, in turn, is by their level of precision. In a new study, researchers found that participants were more likely to rely on advice given by people who provided more precise information. For example, they were more likely to trust someone who said that the Mississippi River was 3,992 miles long, rather than 4,000 miles long.

What this means in the forensic realm is that we should not make claims of false precision when our evidence base is weak. For example, we should not claim to know that someone has a 44 percent chance of violent reoffense within three years. Such misleading claims-making lends an aura of confidence and expertise that is not warranted.

The article is:

Ethics and the DSM-5

Speaking of avalanches, the volume of critical response to the DSM-5 is lessening now that the tome has been on the bookshelves for eight months. Trying to keep my finger on the pulse because of my training activities on the manual’s forensic implications, I found an interesting summary of the ethical dilemmas of the latest trends in psychiatric diagnosis.

The author, Jennifer Blumenthal-Barby, is an ethics professor at Baylor College of Medicine’s Center for Medical Ethics and Health Policy. In her critique, published in the Journal of Medical Ethics, she focuses on consequence-based concerns about the dramatic expansion of psychiatric diagnoses in the latest edition of the American Psychiatric Association’s influential manual. Concerns include:

  • False positives, or over-diagnosis, in clinical (and I would add forensic) practice
  • Risks associated with pharmacological treatments of new conditions
  • Neglect of larger structural issues and reduction of individual responsibility through medicalization
  • Discrediting of psychiatry through the trivialization of mental disorders
  • Efforts to eradicate conditions that are valuable or even desirable

Although her discussion is fairly general, she does mention a few of the proposed diagnostic changes of forensic relevance that I’ve blogged about. These include the proposed hypersexual disorder and a proposal to eliminate the age qualifier (of 18 and above) for antisocial personality disorder, to make it consistent with all of the other personality disorders.

It’s a good, brief overview suitable for assignment to students and professionals alike.

The article is: 
  • Psychiatry’s new manual (DSM-5): ethical and conceptual dimensions. Journal of Medical Ethics. Published online on 10 Dec. 2013. To request a copy, click HERE.

Dual relationships: Are they all bad?

We’ve all seen the memo: Dual relationships are to be avoided.

But is that always true?

Not according to ethics instructor Ofer Zur.

Multiple relationships are situations in which a mental health professional has a professional role with a client and another role with a person closely related to the client. In a new overview, Zur asserts that, not only are some multiple relationships ethical, they may be unavoidable, desirable, or even -- in some cases -- mandated.

In delineating the ethics and legality of 26 different types of multiple relationships, Zur stresses that in forensic settings, most multiple relationships should be avoided.

The article, Not All Multiple Relationships Are Created Equal: Mapping the Maze of 26 Types of Multiple Relationships, is another good teaching tool, and is freely available online at Zur’s continuing education website.

By the way, if you are in California and are looking for more ethics training, Zur and two of my former colleagues from the state psychological association’s Ethics Committee -- Michael Donner, PhD and Pamela Harmell, PhD -- are co-presenting at an interactive ethics session at the upcoming California Psychological Association convention. The convention runs April 9-13 in Monterey, and the ethics conversation -- “Ethics are not Rules: Psych in the Real World” -- is on Saturday, April 12.

January 26, 2014

Psycholegal evaluations in Immigration Court: Free online training series

Feb. 5 UPDATE: The first webinar in this series was a huge success. To register for any or all of the remaining three webinars, click HERE.

Torture victims from El Salvador. Gay people from Uganda. Immigrants with elderly dependents who are U.S. citizens.

In our increasingly multicultural society, more and more people find themselves in U.S. Immigration Court. And, often, psychological evaluations play a role in deciding their fates. Unfortunately, most immigrants applying for political asylum or hardship waivers have very little money, creating an acute need for psychologists willing and able to provide low-fee evaluations.

Working to fulfill this need is my hard-working colleague Anatasia Kim, a professor at the Wright Institute in Berkeley and chair of the Immigration Task Force of the California Psychological Association. Dr. Kim is spearheading a drive to train a cadre of psychologists to conduct these evaluations. In exchange for conducting low-fee or pro bono evaluations, psychologists and students will get free mentorship by expert forensic psychologists and attorneys in the field.  

As part of the campaign, the Immigration Task Force is hosting a four-part Webinar series in February aimed at teaching the basic competencies. Immigration attorneys and psychologists will train virtual attendees on the nuts and bolts of conducting psycholegal evaluations in immigration courts.

Best of all, the series is entirely FREE. You can even earn continuing education credits (one unit per session).

The four workshops, each running from noon to 1:00 p.m. (Pacific Standard Time) on a Tuesday, are:

Feb. 4: Basics of Conducting a Psychological Evaluation for Immigration Court. Nancy Baker, Ph.D., ABPP, Diplomate in Forensic Psychology, Director of Forensic Concentration at Fielding Graduate University

Feb. 11: Legal Relevance of Psychologists’ Opinions in the Immigration Context. Robin Goldfaden, Esq., Senior Attorney, Immigrant Justice, Lawyers’ Committee for Civil Rights of the San Francisco Bay Area, and Lisa Fryman, Esq., Associate Director/Managing Attorney, Center for Gender and Refugee Studies at U.C. Hastings College of Law

Feb. 18: Recommended Immigration Evaluation Process for Hardship Cases. Margaret Lee, Ph.D., Clinical Psychologist and Adjunct Professor, Alliant International University, San Diego and Former Clinical Director at Survivors of Torture International

Feb. 25: Writing Psychological Assessment Reports for Immigration Court. James Livingston, Ph.D., Senior Staff Psychologist, Center for the Survivors of Torture in San Jose.

You can register for the first training HERE.

If you have any questions, email Dr. Kim HERE.  

January 23, 2014

California conference to highlight juvenile treatment

Michael Caldwell, co-founder of the Mendota Juvenile Treatment Center in Wisconsin, will share his Center’s innovative approach to treating hard-core juvenile offenders at this year’s Forensic Mental Health Association of California (FMHAC) conference.

Caldwell, whose research on juvenile risk assessment has been highlighted on this blog, says the Mendota approach has been proven to reduce violent offending among the most intractable juvenile delinquents, who absorb a disproportionate amount of rehabilitation resources and account for a large proportion of violent crimes.

His two workshops are part of a special juvenile track that will also feature a session on introducing the practice of mindfulness to incarcerated juveniles.

The juvenile track is one of five special tracks at this year’s FMHAC conference, coming up March 19 in beautiful Monterey, California. The other tracks are clinical/assessment, legal, psychiatric and, of course, the omnipresent sex offender track.

More details and registration information can be found HERE. The FMHAC's website is HERE.

January 20, 2014

Orange is the New Black -- Read the book!

[Image: Taylor Schilling plays Piper Kerman in the TV series]

Hollywood prison scenes are so revolting. Most revolting are the depictions of women’s prisons. They superimpose onto female prisoners the worst stereotype of male prisoners as hulking, sexually aggressive brutes. And, even more so than for male prisons, the public has little direct information to counter this distorted image.

Blasting apart this image is Piper Kerman’s outstanding memoir. Detailing her year in a minimum-security federal camp, Orange is the New Black is a first-rate effort to educate the public about the realities of women’s prison.

[Image: Promo for blockbuster Netflix spinoff]

Kerman tiptoed into prison with the trepidation one might expect of a white, college-educated woman thrown into the lion’s den. But instead of prisoner-on-prisoner predation, she found a sense of community, where women survived by forging family-like relationships among their “tribes.” The greatest dangers in prison came not at the hands of other women, Kerman found, but from the agents of bureaucracy who wielded the threat of the SHU* (Security Housing Unit) or loss of good-time credits for any petty misstep.

I found myself grateful that, once in a blue moon, a middle-class person with a social conscience is sent to prison. Kerman’s bad luck is the public’s fortune. With the overwhelming mass of prisoners voiceless, who else can speak the truth and be heard? Kerman is the everywoman; through recognizing ourselves in her, we feel the prisoner’s plight as our own.

Her sense of not belonging among the underclass was shared by correctional officers and prisoners alike, who more than once asked the blond-haired, blue-eyed Smith College graduate: “What’s someone like you doing in a place like this?!”

[Image: Laverne Cox as trans prisoner Sophia Burset]

Don’t think that if you’ve seen the blockbuster TV spinoff, you know the story. While colorful, the series is by comparison shallow and exploitive. Netflix does a public service by counteracting Hollywood’s crude stereotypes, portraying incarcerated women as diverse human beings, but the semi-fictional show’s biggest accomplishment may be to steer intelligent viewers toward the book. (As an aside, it has also given greater visibility to the issues of transgender women of color, with trans actress Laverne Cox outstanding in the role of a transwoman prisoner.)

For a real-life visual representation of the lot of the woman prisoner, I recommend the documentary Crime After Crime. The story of battered woman Debbie Peagler’s struggle for justice is far more heart-wrenching than Kerman’s memoir, but both dramatize how a soulless bureaucratic machine chews up and spits out human potential.

[Image: The real-life Piper Kerman]

Kerman is a fluid story-teller, and her saga is intrinsically gripping. But, as writer Mary Karr points out in a recent interview, the "through-line" of an effective memoir is the character’s transformation. Seeing herself through the eyes of other women in the bleak prison milieu, Kerman realizes virtues in herself that she never knew. And she confronts for the first time her own complicity in her comrades' oppression, through her former role as an international heroin smuggler.

The sincerity of Kerman’s transformation is evident in her life since leaving prison nine years ago. She serves on the board of the non-profit prison reform group Women’s Prison Association and does public education on the plight of women prisoners -- especially the two-thirds who are mothers -- through influential media outlets such as National Public Radio. As she writes in a recent op-ed in the New York Times:  
"Harshly punitive drug laws and diminishing community mental health resources have landed many women in prison who simply do not belong there, often for shockingly long sentences. What is priceless about JusticeHome, however, is that it is working not only to rehabilitate women but to keep families together -- which we know is an effective way to reduce crime and to stop a cycle that can condemn entire families to the penal system."

* * * * *

*I listened to the audiobook version. The reader was quite good. Her only false steps came in reading the word "SHU": She read it aloud as "S-H-U," instead of the way it is actually pronounced in prison ("shoe"). The SHU is too ubiquitous to merit three syllables at every utterance.

(c) Copyright Karen Franklin 2014 - All rights reserved

January 12, 2014

Putting the Cart Before the Horse: The Forensic Application of the SRA-FV

As the developers of actuarial instruments such as the Static-99R acknowledge that their original norms inflated the risk of re-offense for sex offenders, a brand-new method is cropping up to preserve those inflated risk estimates in sexually violent predator civil commitment trials. The method introduces a new instrument, the “SRA-FV,” in order to bootstrap special “high-risk” norms on the Static-99R. Curious about the scientific support for this novel approach, I asked forensic psychologist and statistics expert Brian Abbott to weigh in.

Guest post by Brian Abbott, PhD*

NEWS FLASH: Results from the first peer-reviewed study about the Structured Risk Assessment: Forensic Version (“SRA-FV”), published in Sexual Abuse: A Journal of Research and Treatment (“SAJRT”), demonstrate the instrument is not all that it’s cracked up to be.

[Image: Promotional material for an SRA-FV training]

For the past three years, the SRA-FV developer has promoted the instrument for clinical and forensic use despite the absence of peer-reviewed, published research supporting its validity, reliability, and generalizability. Accordingly, some clinicians who have attended SRA-FV trainings around the country routinely apply the SRA-FV in sexually violent predator risk assessments and testify about its results in court as if the instrument has been proven to measure what it intends to assess, has known error rates, retains validity when applied to other groups of sexual offenders, and produces trustworthy results.

Illustrating this rush to acceptance most starkly, within just three months of its informal release (February 2011) and with an absence of any peer-reviewed research, the state of California incredibly decided to adopt the SRA-FV as its statewide mandated dynamic risk measure for assessing sexual offenders in the criminal justice system. This decision was rescinded in September 2013, with the SRA-FV replaced with a similar instrument, the Stable-2007.

The SRA-FV consists of 10 items that purportedly measure “long-term vulnerabilities” associated with sexual recidivism risk. The items are distributed among three risk domains and are assessed using either standardized rating criteria devised by the developer or by scoring certain items on the Psychopathy Checklist-Revised (PCL-R). Scores on the SRA-FV range from zero to six. Some examples of the items from the instrument include: sexual interest in children, lack of emotionally intimate relationships with adults, callousness, and internal grievance thinking. Patients from the Massachusetts Treatment Center in Bridgewater, Massachusetts who were evaluated as sexually dangerous persons between 1959 and 1984 served as members of the SRA-FV construction group (unknown number) and validation sample (N = 418). It was released for use by Dr. David Thornton, a co-developer of the Static-99R, Static-2002R, and SRA-FV and research director at the SVP treatment program in Wisconsin, in December 2010 during training held in Atascadero, California. Since then, Dr. Thornton has held similar trainings around the nation where he asserts that the SRA-FV is valid for predicting sexual recidivism risk, achieves incremental validity over the Static-99R, and can be used to choose among Static-99R reference groups.

A primary focus of the trainings is a novel system in which the total score on the SRA-FV is used to select one Static-99R “reference group” among three available options. The developer describes the statistical modeling underlying this procedure, which he claims increases predictive validity and power over using the Static-99R alone. However, reliability data are not offered to support this claim. In the December 2010 training, several colleagues and I asked for the inter-rater agreement rate but Dr. Thornton refused to provide it.

I was astounded but not surprised when some government evaluators in California started to apply the SRA-FV in sexually violent predator risk assessments within 30 days after the December 2010 training. This trend blossomed in other jurisdictions with sexually violent predator civil confinement laws. Typically, government evaluators applied the SRA-FV to select Static-99R reference groups, invariably choosing to compare offenders with the “High Risk High Needs” sample with the highest re-offense rates. A minority of clinicians stated in reports and court testimony that the SRA-FV increased predictive accuracy over the Static-99R alone but they were unable to quantify this effect. The same clinicians have argued that the pending publication of the Thornton and Knight study was sufficient to justify its use in civil confinement risk assessments for sexually violent predators. They appeared to imply that the mere fact that a construction and validation study had been accepted for publication was an imprimatur that the instrument was reliable and valid for its intended purposes. Now that the research has been peer-reviewed and published, the results reflect that these government evaluators apparently put the proverbial cart before the horse.

David Thornton and Raymond Knight penned an article that documents the construction and validation of the SRA-FV. The publication is a step in the right direction, but by no means do the results justify widespread application of the SRA-FV in sexual offender risk assessment in general or sexually violent predator proceedings in particular. Rather, the results of the study only apply to the group upon which the research was conducted and do not generalize to other groups of sexual offenders. Before discussing the limitations of the research, I would like to point out some encouraging results.

The SRA-FV did, as its developer claimed, account for more sources of sexual recidivism risk than the Static-99R alone. However, it remains unknown which of the SRA-FV’s ten items contribute to risk prediction. The study also found that the combination of the Static-99R and SRA-FV increased predictive power. This improved predictive accuracy, however, must be replicated to determine whether the combination of the two instruments will perform similarly in other groups of sexual offenders. This is especially important when considering that the SRA-FV was constructed and validated on individuals from the Bridgewater sample from Massachusetts who are not representative of contemporary groups of sexual offenders. Thornton and Knight concede this point when discussing how the management of sexual offenders through all levels of the criminal justice system in Massachusetts between 1959 and 1984 was remarkably lenient compared to contemporary times. Such historical artifacts likely compromise any reliable generalization from patients at Bridgewater to present-day sexual offenders.

[Image: Training materials presented four months before the State of California rescinded use of the SRA-FV]

Probably the most crucial finding from the study is the SRA-FV’s poor inter-rater reliability. The authors categorize the 64 percent rate of agreement as “fair.” It is well known that inter-rater agreement in research studies is typically higher than in real-world applications. This has been addressed previously in this blog in regard to the PCL-R. A field reliability study of the SRA-FV among 19 government psychologists rating 69 sexually violent predators in Wisconsin (Sachsenmaier, Thornton, & Olson, 2011) found an inter-rater agreement rate of only 55 percent for the SRA-FV total score, which is considered poor reliability. These data illustrate that 36 percent to 45 percent of an SRA-FV score constitutes error, raising serious concerns over the trustworthiness of the instrument. To their credit, Thornton and Knight acknowledge this as an issue and note that steps should be taken to increase reliable scoring. Nonetheless, the current inter-rater reliability falls far short of the 80 percent floor recommended for forensic practice (Heilbrun, 1992). Unless steps are taken to dramatically improve reliability, the claims that the SRA-FV increases predictive accuracy either alone or in combination with the Static-99R, and that it should be used to select Static-99R reference groups, are moot.
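The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration. It assumes, as the paragraph above does, that the error share of a score is read as the complement of the agreement rate. The function names are hypothetical and do not come from any published scoring software.

```python
# Illustrative sketch only: the function names are hypothetical.
# Assumption (taken from the paragraph above): error share = 1 - agreement rate.

FORENSIC_FLOOR = 0.80  # floor for forensic practice attributed to Heilbrun (1992)


def error_share(agreement_rate: float) -> float:
    """Return the complement of an agreement rate, rounded to two decimals."""
    if not 0.0 <= agreement_rate <= 1.0:
        raise ValueError("agreement_rate must be between 0 and 1")
    return round(1.0 - agreement_rate, 2)


def meets_floor(agreement_rate: float, floor: float = FORENSIC_FLOOR) -> bool:
    """Check whether an agreement rate reaches the recommended floor."""
    return agreement_rate >= floor


# Agreement rates reported in the text above.
reported = {
    "Thornton and Knight research sample": 0.64,
    "Wisconsin field study": 0.55,
}

for label, rate in reported.items():
    print(
        f"{label}: agreement {rate:.0%}, "
        f"error share {error_share(rate):.0%}, "
        f"meets 80% floor: {meets_floor(rate)}"
    )
```

Both reported rates fall below the 80 percent floor, which is the comparison the paragraph draws.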

It is also important to note that, although Thornton and Knight confuse the terms validation and cross-validation in their article, this study represents a validation methodology. Cross-validation is a process by which the statistical properties found in a validation sample (such as reliability, validity, and item correlations) are tested in a separate group to see whether they hold up. In contrast, Thornton and Knight first considered the available research data from a small number of individuals from the Bridgewater group to determine what items would be included in the SRA-FV. This group is referred to as the construction sample. The statistical properties of the newly conceived measure were studied on 418 Bridgewater patients who constitute the validation sample. The psychometric properties of the validation group have not been tested on other contemporary sexual offender groups. Absent such cross-validation studies, we simply have no confidence that the SRA-FV works as it has been designed for groups other than the sample upon which it was validated. To their credit, Thornton and Knight acknowledge this limitation and warn readers not to generalize the validation research to contemporary groups of sexual offenders.

The data on incremental predictive validity, while interesting, have little practical value at this point for two reasons. One, it is unknown whether the results will replicate in contemporary groups of sexual offenders. Two, no data are provided to quantify the increased predictive power. The study does not provide an experience table of probability estimates at each score on the Static-99R after taking into account the effect of the SRA-FV scores. It seems disingenuous, if not misleading, to inform the trier of fact that the combined measures increase predictive power but to fail to quantify the result and the associated error rate.

In my practice, I have seen the SRA-FV used most often to select among three Static-99R reference groups. Invariably, government evaluators in sexually violent predator risk assessments assign SRA-FV total scores consistent with the selection of the Static-99R High Risk High Needs reference group. Only the risk estimates associated with the highest Static-99R scores in this reference group are sufficient to support an opinion that an individual meets the statutory level of sexual dangerousness necessary to justify civil confinement. Government evaluators who have used the SRA-FV for this purpose cannot cite research demonstrating that the procedure works as intended or that it produces a reliable match to the group representing the individual being assessed. Unfortunately, Thornton and Knight are silent on this application of the SRA-FV.

In a recently published article, I tested the use of the SRA-FV for selecting Static-99R reference groups. In brief, Dr. Thornton used statistical modeling based solely on data from the Bridgewater sample to devise this model. The reference group selection method was not based on the actual scores of members from each of the three reference groups. Rather, it was hypothetical, presuming that members of a Static-99R reference group will exhibit a certain range of SRA-FV scores that does not overlap with the range of either of the other two reference groups. To the contrary, I found that the hypothetical SRA-FV reference group system did not work as designed, as the SRA-FV scores between reference groups overlapped by wide margins. In other words, the SRA-FV total score would likely be consistent with selecting two if not all three Static-99R reference groups. In light of these findings, it is incumbent upon the developer to provide research using actual subjects to prove that the SRA-FV total score is a valid method by which to select a single Static-99R reference group and that the procedure can be applied reliably. At this point, credible support does not exist for using the SRA-FV to select Static-99R reference groups.
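The overlap problem can be pictured with a short Python sketch. The score ranges and the two placeholder group labels below are invented purely for illustration. They are not figures from the published article or from the developer's materials. Only the zero-to-six scale and the "High Risk High Needs" label come from the discussion above.

```python
# Illustrative sketch only. The ranges below are INVENTED placeholders,
# not data from Abbott (2013) or from the SRA-FV developer.
from typing import Dict, List, Tuple

Range = Tuple[float, float]  # inclusive (low, high) bounds on the 0-6 scale

HYPOTHETICAL_RANGES: Dict[str, Range] = {
    "Placeholder Group A": (0.0, 3.5),
    "Placeholder Group B": (2.0, 5.0),
    "High Risk High Needs": (3.0, 6.0),
}


def consistent_groups(score: float, ranges: Dict[str, Range]) -> List[str]:
    """List every reference group whose score range contains the score."""
    return [name for name, (low, high) in ranges.items() if low <= score <= high]


def selects_single_group(score: float, ranges: Dict[str, Range]) -> bool:
    """True only when the score points to exactly one reference group."""
    return len(consistent_groups(score, ranges)) == 1


for score in (1.0, 3.2, 5.5):
    matches = consistent_groups(score, HYPOTHETICAL_RANGES)
    print(f"score {score}: consistent with {matches}")
```

If the ranges did not overlap, every score would map to exactly one group. With overlapping ranges like these placeholders, a mid-range score is consistent with several groups at once, so the score alone cannot justify picking one.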

The design, development, validation, and replication of psychological instruments are guided by the Standards for Educational and Psychological Testing (“SEPT” -- American Educational Research Association et al., 1999). When comparing the Thornton and Knight study to the framework provided by SEPT, it is apparent the SRA-FV is in the infancy stage of development. At best, the SRA-FV is a work in progress that needs substantially more research to improve its psychometric properties. Aside from its low reliability and inability to generalize the validation research to other groups of sexual offenders, other important statistical properties await examination, including but not limited to:

  1. standard error of measurement
  2. factor analysis of whether items within each of the three risk domains significantly load in their respective domains
  3. the extent of the correlation between each SRA-FV item and sexual recidivism
  4. which SRA-FV items add incremental validity beyond the Static-99R or may be redundant with it
  5. whether each item has construct validity

It is reasonable to conclude that at its current stage of development the use of the SRA-FV in forensic proceedings is premature and scientifically indefensible. In closing, in their eagerness to improve the accuracy of their risk assessments, clinicians relied upon Dr. Thornton’s claim in the absence of peer-reviewed research demonstrating that the SRA-FV achieved generally accepted levels of reliability and validity. The history of forensic evaluators deploying the SRA-FV before the publication of the construction and validation study raises significant ethical and legal questions:

  • Should clinicians be accountable to vet the research presented in trainings by an instrument’s developer before applying a tool in forensic practice? 

  • What responsibility do clinicians have to rectify testimony where they presented the SRA-FV as if the results were reliable and valid?

  • How many individuals have been civilly committed as sexually violent predators based on testimony that the findings from the SRA-FV were consistent with individuals meeting the legal threshold for sexual dangerousness, when the published data do not support this conclusion?

Answers to these questions and others go beyond the scope of this blog. However, in a recent appellate decision, a Washington Appeals Court questioned the admissibility of the SRA-FV in the civil confinement trial of Steven Ritter. The appellate court determined that the application of the SRA-FV was critical to the government evaluator’s opinion that Mr. Ritter met the statutory threshold for sexual dangerousness. Since the SRA-FV is considered a novel scientific procedure, the appeals court reasoned that the trial court erred by not holding a defense-requested evidentiary hearing to decide whether the SRA-FV was admissible evidence for the jury to hear. The appeals court remanded the issue to the trial court to hold a Kelly-Frye hearing on the SRA-FV. Stay tuned!


Abbott, B.R. (2013). The Utility of Assessing “External Risk Factors” When Selecting Static-99R Reference Groups. Open Access Journal of Forensic Psychology, 5, 89-118.

American Educational Research Association, American Psychological Association and National Council on Measurement in Education. (1999). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association.

Heilbrun, K. (1992). The role of psychological testing in forensic assessment. Law and Human Behavior, 16, 257-272. doi: 10.1007/BF01044769.

In Re the Detention of Steven Ritter. (2013, November). In the Appeals Court of the State of Washington, Division III. 

Sachsenmaier, S., Thornton, D., & Olson, G. (2011, November). Structured risk assessment forensic version (SRA-FV): Score distribution, inter-rater reliability, and margin of error in an SVP population. Presentation at the 30th Annual Research and Treatment Conference of the Association for the Treatment of Sexual Abusers, Toronto, Canada.

Thornton, D. & Knight, R.A. (2013). Construction and validation of the SRA-FV Need Assessment. Sexual Abuse: A Journal of Research and Treatment. Published online December 30, 2013. doi: 10.1177/1079063213511120.
* * *

*Brian R. Abbott is a licensed psychologist in California and Washington who has evaluated and treated sexual offenders for more than 35 years. Among his areas of forensic expertise, Dr. Abbott has worked with sexually violent predators in various jurisdictions within the United States, where he performs psychological examinations, trains professionals, consults on psychological and legal issues, offers expert testimony, and publishes papers and peer-reviewed articles.

(c) Copyright 2013 - All rights reserved

January 9, 2014

Special book offer for blog subscribers

What Would Sherlock Do?

In an intriguing bestseller, journalist Maria Konnikova explores how you can reason more scientifically by applying the exalted sleuth’s techniques of mindfulness, astute observation and logical deduction with a modern twist that incorporates cutting-edge neuroscience and psychology.

The goal of the author, a Scientific American columnist and psychology graduate student, is to teach readers how, with a little self-awareness and practice, you can sharpen your perceptions, solve difficult problems, and enhance your creative powers. The central premise is that Sherlock Holmes is a near-ideal window into the science of how we think and a rare teacher of how to upgrade our default mode of thinking.

"Mastermind: How to Think Like Sherlock Holmes" is getting strong praise and has climbed onto the New York Times bestseller list. Steven Pinker calls it "A delightful tour of the science of memory, creativity, and reasoning … which will help you master your own mind." Illustrated with cases from the annals of Sir Arthur Conan Doyle, this entertaining work could also serve as a perfect introduction to cognitive science for undergraduate students.

In a special offer, Penguin press will send a free copy of Mastermind to two subscribers to this blog. If you would like to be one, email me with your full name and mailing address. The offer is first-come, first-served, with preference given to my paying subscribers and donors. The offer is limited to U.S. residents.

January 5, 2014

New evidence of psychopathy test's poor accuracy in court

Use of a controversial psychopathy test is skyrocketing in court, even as mounting evidence suggests that the prejudicial instrument is highly inaccurate in adversarial settings.

The latest study, published by six respected researchers in the influential journal Law and Human Behavior, explored the accuracy of the Psychopathy Checklist, or PCL-R, in Sexually Violent Predator cases around the United States.

The findings of poor reliability echo those of other recent studies in the United States, Canada and Europe, potentially heralding more admissibility challenges in court. 

Although the PCL-R is used in capital cases, parole hearings and juvenile sentencing, by far its most widespread forensic use in the United States is in Sexually Violent Predator (SVP) cases, where it is primarily invoked by prosecution experts to argue that a person is at high risk for re-offense. Building on previous research, David DeMatteo of Drexel University and colleagues surveyed U.S. case law from 2005-2011 and located 214 cases from 19 states -- with California, Texas and Minnesota accounting for more than half of the total -- that documented use of the PCL-R in such proceedings.

To determine the reliability of the instrument, the researchers examined a subset of 29 cases in which the scores of multiple evaluators were reported. On average, scores reported by prosecution experts were about five points higher than those reported by defense-retained experts. This is a large and statistically significant difference, one that is very unlikely to be due to chance.

Prosecution experts were far more likely to give scores of 30 or above, the cutoff for presumed psychopathy. Prosecution experts reported scores of 30 or above in almost half of the cases, whereas defense witnesses reported scores that high in less than 10 percent.

Looking at interrater reliability another way, the researchers applied a classification scheme from the PCL-R manual in which scores are divided into five discrete categories, from “very low” (0-8) to “very high” (33-40). In almost half of the cases, the scores given by two evaluators fell into different categories; in about one out of five cases the scores were an astonishing two or more categories apart (e.g., “very high” versus “moderate” psychopathy).
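For readers who want to see how such a category comparison works mechanically, here is a minimal Python sketch. It is an illustration only, not the researchers' actual procedure. Only the two end bands (0-8 and 33-40) are stated above; the three middle bands are assumed here to be evenly spaced, and the score pairs are invented for the example rather than drawn from the study.

```python
from bisect import bisect_right

# Lowest score in each band. Only "very low" (0-8) and "very high" (33-40)
# are given in the post; the middle cut-points are ASSUMED to be evenly spaced.
BAND_STARTS = [0, 9, 17, 25, 33]
BAND_NAMES = ["very low", "low", "moderate", "high", "very high"]


def band_index(score):
    """Return the 0-4 index of the band that a 0-40 total score falls into."""
    if not 0 <= score <= 40:
        raise ValueError("score must be between 0 and 40")
    return bisect_right(BAND_STARTS, score) - 1


def category_gaps(score_pairs):
    """For each pair of evaluator scores, count how many bands apart they are."""
    return [abs(band_index(a) - band_index(b)) for a, b in score_pairs]


# Hypothetical score pairs (two evaluators, same examinee); NOT study data.
pairs = [(34, 22), (30, 27), (18, 16), (35, 33)]
gaps = category_gaps(pairs)

share_different = sum(g >= 1 for g in gaps) / len(gaps)
share_two_plus = sum(g >= 2 for g in gaps) / len(gaps)

for (a, b), g in zip(pairs, gaps):
    print(a, BAND_NAMES[band_index(a)], "|", b, BAND_NAMES[band_index(b)], "| gap:", g)
print("different category:", share_different)
print("two or more categories apart:", share_two_plus)
```

Note that a pair such as 18 and 16 lands in different categories despite differing by only two points, while 30 and 27 share a category. That is one reason category agreement is best read alongside the raw score differences rather than in place of them.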

Surprisingly, interrater agreement was even worse among evaluators retained by the same side than among opposing experts, suggesting that the instrument’s inaccuracy is not solely due to what has been dubbed adversarial (or partisan) allegiance.

Despite its poor accuracy, the PCL-R is extremely influential in legal decision-making. The concept of psychopathy is superficially compelling in our current era of mass incarceration, and the instrument's popularity shows no sign of waning. 

Earlier this year, forensic psychologist Laura Guy and colleagues reported on its power in parole decision-making in California. The state now requires government evaluators to use the PCL-R in parole fitness evaluations for “lifers,” or prisoners sentenced to indeterminate terms of up to life in prison. Surveying several thousand cases, the researchers found that PCL-R scores were a strong predictor of release decisions by the Parole Board, with those granted parole scoring an average of about five points lower than those denied parole. Having just conducted one such evaluation, I was struck by the frightening fact -- alluded to by DeMatteo and colleagues -- that the chance assignment of an evaluator who typically gives high scores on the PCL-R “might quite literally mean the difference between an offender remaining in prison versus being released back into the community.”

Previous research has established that Factor 1 of the two-factor instrument – the factor measuring characterological traits such as manipulativeness, glibness and superficial charm – is especially prone to error in forensic settings. This is not surprising, as traits such as “glibness” are somewhat in the eye of the beholder and not objectively measurable. Yet, the authors assert, “it is exactly these traits that seem to have the most impact” on judges and juries.

Apart from the issue of poor reliability, the authors questioned the widespread use of the PCL-R as evidence of impaired volitional control, an element required for civil commitment in SVP cases. They labeled as “ironic, if not downright contradictory” the fact that psychopathy is often touted in traditional criminal responsibility (or insanity) cases as evidence of badness as opposed to mental illness, yet in SVP cases it magically transforms into evidence of a major mental disorder that interferes with self-control. 

The evidence is in: The Psychopathy Checklist-Revised is too inaccurate in applied settings to be relied upon in legal decision-making. With consistent findings of abysmal interrater reliability, its prejudicial impact clearly outweighs any probative value. However, the gatekeepers are not guarding the gates. So long as judges and attorneys ignore this growing body of empirical research, prejudicial opinions will continue to be cloaked in a false veneer of science, contributing to unjust outcomes.

* * * * *
The study is: 

“The Role and Reliability of the Psychopathy Checklist-Revised in U.S. Sexually Violent Predator Evaluations: A Case Law Survey” by DeMatteo, D., Edens, J. F., Galloway, M., Cox, J., Toney Smith, S. and Formon, D. (2013). Law and Human Behavior.

Copies may be requested from the first author (HERE).

The same research team has just published a parallel study in Psychology, Public Policy and Law:

“Investigating the Role of the Psychopathy Checklist-Revised in United States Case Law” by DeMatteo, David; Edens, John F.; Galloway, Meghann; Cox, Jennifer; Smith, Shannon Toney; Koller, Julie Present; Bersoff, Benjamin

My related essays and blog posts (I especially recommend the three marked with asterisks):

(c) Copyright Karen Franklin 2013 - All rights reserved