
March 19, 2013

California high court upholds parolee confidentiality right

Two years ago, I reported on a California appellate opinion upholding the sacredness of patient-therapist confidentiality even for convicted felons who are mandated to treatment as a condition of parole. Today, the California Supreme Court upheld the gist of the ruling -- but with a proviso. Using strained logic, the court held that the breach of confidentiality was not so prejudicial as to merit overturning Ramiro Gonzales's civil commitment, as the Sixth District Court of Appeal had done.

Gonzales is a developmentally disabled man whose therapist turned over prejudicial therapy records to a prosecutor seeking to civilly detain him as a sexually violent predator (SVP). Forensic psychology experts Brian Abbott and Tim Derning testified for the defense; called by the prosecution were psychologists Thomas MacSpeiden and Jack Vognsen.

As I wrote two years ago, the ruling is good news for psychology ethics and should serve as a reminder that we are obligated to actively resist subpoenas requesting confidential records of therapy.

Today's California Supreme Court ruling is HERE. My prior post, with much more detail on the case, is HERE. The Sixth District Court of Appeal opinion from 2011, available HERE, provides a nice overview of both federal and California case law on confidentiality in forensic cases.
 
Hat tip: Adam Alban

March 5, 2013

Remarkable experiment proves pull of adversarial allegiance

 Psychologists' scoring of forensic tools depends on which side they believe has hired them

A brilliant experiment has proven that adversarial pressures skew forensic psychologists' scoring of supposedly objective risk assessment tests, and that this "adversarial allegiance" is not due to selection bias or preexisting differences among evaluators.

The researchers duped about 100 experienced forensic psychologists into believing they were part of a large-scale forensic case consultation at the behest of either a public defender service or a specialized prosecution unit. After two days of formal training by recognized experts on two widely used forensic instruments -- the Psychopathy Checklist-R (PCL-R) and the Static-99R -- the psychologists were paid $400 to spend a third day reviewing cases and scoring subjects. The National Science Foundation picked up the $40,000 tab.

Unbeknownst to them, the psychologists were all looking at the same set of four cases. But they were "primed" to consider the case from either a defense or prosecution point of view by a research confederate, an actual attorney who pretended to work on a Sexually Violent Predator (SVP) unit. In his defense attorney guise, the confederate made mildly partisan but realistic statements such as "We try to help the court understand that ... not every sex offender really poses a high risk of reoffending." In his prosecutor role, he said, "We try to help the court understand that the offenders we bring to trial are a select group [who] are more likely than other sex offenders to reoffend." In both conditions, he hinted at future work opportunities if the consultation went well. 

The deception was so cunning that only four astute participants smelled a rat; their data were discarded.

As expected, the adversarial allegiance effect was stronger for the PCL-R, which is more subjectively scored. (Evaluators must decide, for example, whether a subject is "glib" or "superficially charming.") Scoring differences on the Static-99R only reached statistical significance in one out of the four cases.

The groundbreaking research, to be published in the journal Psychological Science, echoes previous findings by the same group regarding partisan bias in actual court cases. But by conducting a true experiment in which participants were randomly assigned to either a defense or prosecution condition, the researchers could rule out selection bias as a cause. In other words, the adversarial allegiance bias cannot be solely due to attorneys shopping around for simpatico experts, as the experimental participants were randomly assigned and had no group differences in their attitudes about civil commitment laws for sex offenders.

Sexually Violent Predator cases are an excellent arena for studying adversarial allegiance, because the typical case boils down to a "battle of the experts." Often, the only witnesses are psychologists, all of whom have reviewed essentially the same material but have differing interpretations about mental disorder and risk. In actual cases, the researchers note, the adversarial pressures are far higher than in this experiment:
"This evidence of allegiance was particularly striking because our experimental manipulation was less powerful than experts are likely to encounter in most real cases. For example, our participating experts spent only 15 minutes with the retaining attorney, whereas experts in the field may have extensive contact with retaining attorneys over weeks or months. Our experts formed opinions based on files only, which were identical across opposing experts. But experts in the field may elicit different information by seeking different collateral sources or interviewing offenders in different ways. Therefore, the pull toward allegiance in this study was relatively weak compared to the pull typical of most cases in the field. So the large group differences provide compelling evidence for adversarial allegiance."

This is just the latest in a series of stunning findings on allegiance bias from the team of psychologists led by Daniel Murrie of the University of Virginia and Marcus Boccaccini of Sam Houston State University. The tendency of experts to skew data to fit the side that retains them should come as no big surprise. After all, it is consistent with 2009 findings by the National Academy of Sciences calling into question the reliability of all types of forensic science evidence, including supposedly more objective techniques such as DNA typing and fingerprint analysis.

Although the group's findings have heretofore been published only in academic journals and have found a limited audience outside of the profession, this might change. A Huffington Post blogger, Wray Herbert, has published a piece on the current findings, which he called "disturbing." And I predict more public interest if and when mainstream journalists and science writers learn of this extraordinary line of research.

In the latest study, Murrie and Boccaccini conducted follow-up analyses to determine how often matched pairs of experts differed in the expected direction. On the three cases in which clear allegiance effects showed up in PCL-R scoring, more than one-fourth of score pairings had differences of more than six points in the expected direction. Six points equates to about two standard errors of measurement (SEMs), a gap that should happen by chance in only 2 percent of cases. A similar, albeit milder, effect was found with the Static-99R.
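
For readers who want to check the math, here is a quick back-of-the-envelope calculation. It is my own illustration, not the researchers' analysis, and it rests on two simplifying assumptions: that a six-point directional gap amounts to roughly two standard errors, and that measurement error is roughly normal.

```python
# Back-of-the-envelope check on the "2 percent" figure (my own
# illustration, not the study's analysis). Assumptions: a six-point gap
# between opposing experts' scores is roughly a two-standard-error
# event, and measurement error is roughly normal. The chance of a gap
# that large *in the expected direction* is then the one-tailed tail
# area beyond z = 2.
from scipy.stats import norm

z = 2.0  # six points expressed in standard-error units
print(f"P(gap of 2+ SEs in the expected direction) = {norm.sf(z):.1%}")
# -> about 2.3%, in line with the "only 2 percent of cases" figure
```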

Adversarial allegiance effects might be even stronger in less structured assessment contexts, the researchers warn. For example, clinical diagnoses and assessments of emotional injuries involve even more subjective judgment than scoring of the Static-99 or PCL-R.

But ... WHICH psychologists?!


For me, this study raised a tantalizing question: Since only some of the psychologists succumbed to the allegiance effect, what distinguished those who were swayed by the partisan pressures from those who were not?

The short answer is, "Who knows?"

The researchers told me that they ran all kinds of post-hoc analyses in an effort to answer this question, and could not find a smoking gun. As in a previous research project that I blogged about, they did find evidence for individual differences in scoring of the PCL-R, with some evaluators assigning higher scores than others across all cases. However, they found nothing about individual evaluators that would explain susceptibility to adversarial allegiance. Likewise, the allegiance effect could not be attributed to a handful of grossly biased experts in the mix.

In fact, although score differences tended to go in the expected direction -- with prosecution experts giving higher scores than defense experts on both instruments -- there was a lot of variation even among the experts on the same side, and plenty of overlap between experts on opposing sides.

So, on average prosecution experts scored the PCL-R about three points higher than did the defense experts. But the scores given by experts on any given case ranged widely even within the same group. For example, in one case, prosecution experts gave PCL-R scores ranging from about 12 to 35 (out of a total of 40 possible points), with a similarly wide range among defense experts, from about 17 to 34 points. There was quite a bit of variability on scoring of the Static-99R, too; on one of the four cases, scores ranged all the way from a low of two to a high of ten (the maximum score being 12).

When the researchers debriefed the participants, the participants themselves didn't have a clue as to what had caused the effect. That's likely because bias is mostly unconscious, and people tend to recognize it in others but not in themselves. So, when asked about factors that make psychologists vulnerable to allegiance effects, the participants endorsed things that applied to others and not to them: Those who worked at state facilities thought private practitioners were more vulnerable; experienced evaluators thought that inexperience was the culprit. (It wasn't.)

I tend to think that greater training in how to avoid falling prey to cognitive biases (see my previous post exploring this) could make a difference. But this may be wrong; the experiment to test my hypothesis has not been run. 

The study is: "Are forensic experts biased by the side that retained them?" by Daniel C. Murrie, Marcus T. Boccaccini, Lucy A. Guarnera and Katrina Rufino, forthcoming from Psychological Science. Contact the first author (HERE) if you would like to be put on the list to receive a copy of the article as soon as it becomes available.

Click on these links for lists of my numerous prior blog posts on the PCL-R, adversarial allegiance, and other creative research by Murrie, Boccaccini and their prolific team. Among my all-time favorite experiments from this research team is: "Psychopathy: A Rorschach test for psychologists?"

January 27, 2013

Showdown looming over predictive accuracy of actuarials

Large error rates thwart individual risk prediction
Photo credits: Brett Jordan and David Macdonald (Creative Commons license)
If you are involved in risk assessments in any way (and what psychology-law professional is not, given the current cultural landscape?), now is the time to get up to speed on a major challenge that's fast gaining recognition.

At issue is whether the margins of error around scores are so wide as to prevent reliable prediction of an individual's risk, even as risk instruments show some (albeit weak) predictive accuracy on a group level. If the problem is unsolvable, as critics maintain, then actuarial tools such as the Static-99 and VRAG should be barred from court, where they can literally make the difference between life and death.

The debate has been gaining steam since 2007, with a series of back-and-forth articles in academic journals (see below). Now, the preeminent journal Behavioral Sciences and the Law has published findings by two leading forensic psychologists from Canada and Scotland that purport to demonstrate once and for all that the problem is "an accurate characterization of reality" rather than a statistical artifact, as the actuarials' defenders had argued.

So-called actuarial tools have become increasingly popular over the last couple of decades in response to legal demand. Instruments such as the Static-99 (for sexual risk) and the VRAG (for general violence risk) provide quick-and-dirty ways to guess at an individual's risk of violent or sexual recidivism. Offenders are scored on a set of easy-to-collect variables, such as age and number of prior convictions. The assumption is that an offender who attains a certain score resembles the larger group of offenders in that score range, and therefore is likely to reoffend at the same rate as the collective.
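
In code, this actuarial logic is little more than a checklist plus a lookup table. The sketch below is my own bare-bones illustration of the general approach; the items, weights and group rates are invented, not the actual values of the Static-99 or any other instrument.

```python
# Bare-bones sketch of the actuarial approach: sum a few easy-to-collect
# items, then assign the offender the recidivism rate observed for his
# score band in the instrument's development sample. All items, weights
# and rates below are invented for illustration.

def actuarial_score(age_at_release: int, prior_convictions: int,
                    stranger_victim: bool) -> int:
    score = 0
    if age_at_release < 35:
        score += 1
    score += min(prior_convictions, 3)  # item capped at 3 points
    if stranger_victim:
        score += 1
    return score

# (low score, high score, group recidivism rate) -- hypothetical bands
SCORE_BANDS = [(0, 1, 0.05), (2, 3, 0.15), (4, 5, 0.33)]

def group_rate(score: int) -> float:
    for low, high, rate in SCORE_BANDS:
        if low <= score <= high:
            return rate
    return SCORE_BANDS[-1][2]  # any higher score gets the top band's rate

s = actuarial_score(age_at_release=28, prior_convictions=2, stranger_victim=True)
print(f"Score {s} -> assigned group rate {group_rate(s):.0%}")
```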

Responding to criticisms of the statistical techniques they used in their previous critiques, Stephen Hart of Simon Fraser University and David Cooke of Glasgow Caledonian University developed an experimental actuarial tool that worked on par with existing actuarials to separate offenders into high- and low-risk groups.* The odds of sexual recidivism for subjects in the high-risk group averaged 4.5 times that of those in the low-risk group. But despite this large average difference, the researchers established through a traditional statistical procedure, logistic regression, that the margins of error around individual scores were so large as to make risk distinctions between individuals "virtually impossible." In only one out of 90 cases was it possible to say that a subject's predicted risk of failure was significantly higher than the overall baseline of 18 percent. (See figure.)

Vertical lines show confidence intervals for individual risk estimates; ranges this large would be required in order to reach the traditional 95 percent level of certainty.
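
To make the statistical problem concrete, here is a minimal sketch of the kind of analysis at issue, using simulated data rather than Hart and Cooke's: fit a logistic regression of recidivism on a total risk score, then ask for the 95 percent confidence interval around each individual's predicted probability and count how many individuals can be distinguished from the sample base rate.

```python
# Minimal sketch of the margins-of-error problem, on simulated data
# (not Hart and Cooke's data or their exact method). We fit a logistic
# regression of recidivism on a risk-tool total score, then examine the
# 95% confidence interval around each individual's predicted probability.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 90
score = rng.integers(0, 13, size=n)               # risk-tool total scores
true_p = 1 / (1 + np.exp(-(0.35 * score - 3.0)))  # assumed true relationship
recid = rng.binomial(1, true_p)                   # simulated outcomes

X = sm.add_constant(score.astype(float))
fit = sm.GLM(recid, X, family=sm.families.Binomial()).fit()

pred = fit.get_prediction(X).summary_frame(alpha=0.05)
baseline = recid.mean()
# An individual is "distinguishable" only if his entire 95% CI
# sits above the sample base rate.
clearly_above = (pred["mean_ci_lower"] > baseline).sum()
print(f"Base rate: {baseline:.0%}")
print(f"Individuals distinguishable from the base rate: {clearly_above} of {n}")
```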

The brick wall limiting predictive accuracy at the individual level is not specific to violence risk. Researchers in more established fields, such as medical pathology, have also hit it. Many of you will know of someone diagnosed with cancer and given six months to live who managed to soldier on for years (or, conversely, who bit the dust in a matter of weeks). Such cases are not flukes: They owe to the fact that the six-month figure is just a group average, and cannot be accurately applied to any individual cancer patient.

Attempts to resolve this problem via new technical procedures are "a waste of time," according to Hart and Cooke, because the problem is due to the "fundamental uncertainty in individual-level violence risk assessment, one that cannot be overcome." In other words, trying to precisely predict the future using "a small number of risk factors selected primarily on pragmatic grounds" is futile; all the analyses in the world "will not change reality."

Legal admissibility questionable 

The current study has grave implications for the legal admissibility of actuarial instruments in court. Jurisdictions that rely upon the Daubert evidentiary standard should not be allowing procedures for which the margins of error are "large, unknown, or incalculable," Hart and Cooke warn.

By offering risk estimates in the form of precise odds of a new crime within a specific period of time, actuarial methods present an image of certitude. This is especially dangerous when that accuracy is illusory. Being told that an offender "belongs to a group with a 78 percent likelihood of committing another violent offense within seven years" is highly prejudicial and may poison the judgment of triers of fact. More covertly, it influences the judgment of the clinician as well, who -- through a process known as "anchoring bias" -- may tend to judge other information in a case in light of the individual's actuarial risk score.

Classic '56 Chevy in Cuba. Photo credit: Franciscovies
With professional awareness of this issue growing, it is not only irresponsible but ethically indefensible not to inform the courts or others who retain our services about the limitations of actuarial risk assessment. The Ethics Code of the American Psychological Association, for example, requires informing clients of "any significant limitations of [our] interpretations." Unfortunately, I rarely (if ever) see limitations adequately disclosed, either in written reports or court testimony, by evaluators who rely upon the Static-99, VRAG, Psychopathy Checklist-Revised (which Cooke and statistician Christine Michie of Glasgow University tackled in a 2010 study) and similar instruments in forming opinions about individual risk.

In fact, more often than not I see the opposite: Evaluators tout the actuarial du jour as being far more accurate than "unstructured clinical judgment." That's like an auto dealer telling you, in response to your query about a vehicle's gas mileage, that it gets far more miles per gallon than your old 1956 Chevy. Leaving aside Cuba (where a long-running U.S. embargo hampers imports), there are about as many gas-guzzling '56 Chevys on the roads in 2013 as there are forensic psychologists relying on unstructured clinical judgment to perform risk assessments. 

Time to give up the ghost? 

Hart and Cooke recommend that forensic evaluators stop the practice of using these statistical algorithms to make "mechanistic" and "formulaic" predictions. They are especially critical of the practice of providing specific probabilities of recidivism, which are highly prejudicial and likely to be inaccurate.

"This actually isn’t a radical idea; until quite recently, leading figures in the field of forensic mental health [such as Tom Grisso and Paul Appelbaum] argued that making probabilistic predictions was questionable or even ill advised," they point out. “Even in fields where the state of knowledge is arguably more advanced, such as medicine, it is not routine to make individual predictions.”

They propose instead a return to evidence-based approaches that more holistically consider the individual and his or her circumstances:

From both clinical and legal perspectives, it is arbitrary and therefore inappropriate to rely solely on a statistical algorithm developed a priori -- and therefore developed without any reference to the facts of the case at hand -- to make decisions about an individual, especially when the decision may result in deprivation of liberties. Instead, good practice requires a flexible approach, one in which professionals are aware of and rely on knowledge of the scientific literature, but also recognize that their decisions ultimately require consideration of the totality of circumstances -- not just the items of a particular test.

In the short run, I am skeptical that this proposal will be accepted. The foundation underlying actuarial risk assessment may be hollow, but too much construction has occurred atop it. Civil commitment schemes rely upon actuarial tools to lend an imprimatur of science, and statutes in an increasing number of U.S. states mandate use of the Static-99 and related statistical algorithms in institutional decision-making.

The long-term picture is more difficult to predict. We may look back sheepishly on today's technocratic approaches, seeing them as emblematic of overzealous and ignorant pandering to public fear. Or -- more bleakly -- we may end up with a rigidly controlled society like that depicted in the sci-fi drama Gattaca, in which supposedly infallible scientific tests determine (and limit) the future of each citizen.

* * * * *

I recommend the article, "Another Look at the (Im-)Precision of Individual Risk Estimates Made Using Actuarial Risk Assessment Instruments." It's part of an upcoming special issue on violence risk assessment, and it provides a detailed discussion of the history and parameters of the debate. (Click HERE to request it from Dr. Hart.) Other articles in the debate include the following (in rough chronological order):
  • Hart, S. D., Michie, C. and Cooke, D. J. (2007a). Precision of actuarial risk assessment instruments: Evaluating the "margins of error" of group v. individual predictions of violence.  British Journal of Psychiatry, 190, s60–s65. 
  • Mossman, D. and Sellke, T. (2007). Avoiding errors about "margins of error" [Letter]. British Journal of Psychiatry, 191, 561. 
  • Harris, G. T., Rice, M. E. and Quinsey, V. L. (2008). Shall evidence-based risk assessment be abandoned? [Letter]. British Journal of Psychiatry, 192, 154. 
  • Cooke, D. J. and Michie, C. (2010). Limitations of diagnostic precision and predictive utility in the individual case: A challenge for forensic practice. Law and Human Behavior, 34, 259–274. 
  • Hanson, R. K. and Howard, P. D. (2010). Individual confidence intervals do not inform decision makers about the accuracy of risk assessment evaluations. Law and Human Behavior, 34, 275–281. 
*The experimental instrument used for this study was derived from the SVR-20, a structured professional judgment tool. The average recidivism rate among the total sample was 18 percent, with 10 percent of offenders in the low-risk group and 33 percent of those in the high-risk group reoffending. The instrument's Area Under the Curve, a measure of predictive validity, was .72, which is in line with that of other actuarial instruments.
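
For readers unfamiliar with the Area Under the Curve: it has a simple interpretation as the probability that a randomly chosen recidivist scores higher on the instrument than a randomly chosen non-recidivist, with ties counted as half. Here is a toy calculation with invented scores:

```python
# The AUC equals the probability that a randomly drawn recidivist
# outscores a randomly drawn non-recidivist (ties count as half).
# The scores below are invented purely for illustration.
from itertools import product

recidivist_scores = [6, 4, 7, 5, 8, 3]
nonrecidivist_scores = [2, 4, 1, 5, 3, 2, 6, 1]

pairs = list(product(recidivist_scores, nonrecidivist_scores))
wins = sum(1.0 if r > n else 0.5 if r == n else 0.0 for r, n in pairs)
print(f"AUC = {wins / len(pairs):.2f}")  # 0.50 = chance-level discrimination
```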

December 14, 2012

Judge bars Static-99R risk tool from SVP trial

Developers staunchly refused requests to turn over data
For several years now, the developers of the most widely used sex offender risk assessment tool in the world have refused to share their data with independent researchers and statisticians seeking to cross-check the instrument's methodology.

Now, a Wisconsin judge has ordered the influential Static-99R instrument excluded from a sexually violent predator (SVP) trial, on the grounds that failure to release the data violates a respondent's legal right to due process.

The ruling may be the first time that the Static-99R has been excluded altogether from court. At least one prior court, in New Hampshire, barred an experimental method that is currently popular among government evaluators, in which Static-99R risk estimates are artificially inflated by comparing sex offenders to a specially selected "high-risk" sub-group, a procedure that has not been empirically validated in any published research. 

In the Wisconsin case, the state was seeking to civilly commit Homer Perren Jr. as a sexually dangerous predator after he completed a 10-year prison term for an attempted sexual assault on a child age 16 or under. The exclusion of the Static-99R ultimately did not help Perren. This week, after a 1.5-day trial, a jury deliberated for only one hour before deciding that he met the criteria for indefinite civil commitment at the Sand Ridge Secure Treatment Center.*

Dec. 18 note: After publishing this post, I learned that the judge admitted other "actuarial" risk assessment instruments, including the original Static-99 and the MnSOST-R, which is way less accurate than the Static-99R and vastly overpredicts risk. He excluded the RRASOR, a four-item ancestor of the Static-99. In hindsight, for the defense to get the Static-99R excluded was a bit like cutting off one's nose to spite one's face.

The ruling by La Crosse County Judge Elliott Levine came after David Thornton, one of the developers of the Static-99R and a government witness in the case, failed to turn over data requested as part of a Daubert challenge by the defense. Under the U.S. Supreme Court's 1993 ruling in Daubert v. Merrell Dow Pharmaceuticals, judges are charged with the gatekeeper function of filtering evidence for scientific reliability and validity prior to its admission in court.

Defense attorney Anthony Rios began seeking the data a year ago so that his own expert, psychologist Richard Wollert, could directly compare the predictive accuracy of the Static-99R with that of a competing instrument, the Multisample Age-Stratified Table of Sexual Recidivism Rates, or MATS-1. Wollert developed the MATS-1 in an effort to improve the accuracy of risk estimation by more precisely considering the effects of advancing age. It incorporates recidivism data on 3,425 offenders published by Static-99R developer Karl Hanson in 2006, and uses the statistical method of Bayes's Theorem to calculate likelihood ratios for recidivism at different levels of risk.
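
The Bayesian logic is straightforward in outline: convert a base rate to prior odds, multiply by the likelihood ratio attached to the offender's risk level, and convert the result back to a probability. Here is a sketch with invented numbers; these are not the MATS-1's published base rates or likelihood ratios.

```python
# Sketch of likelihood-ratio risk estimation via Bayes's Theorem.
# The base rate and likelihood ratios are invented for illustration;
# they are not the MATS-1's published values.

def posterior_probability(base_rate: float, likelihood_ratio: float) -> float:
    prior_odds = base_rate / (1 - base_rate)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

base_rate = 0.10  # hypothetical sexual recidivism base rate

# Hypothetical likelihood ratios for three age-stratified risk levels
for level, lr in [("low", 0.4), ("moderate", 1.0), ("high", 2.5)]:
    p = posterior_probability(base_rate, lr)
    print(f"{level:>8}: LR = {lr:.1f} -> estimated risk = {p:.0%}")
```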

The state's attorney objected to the disclosure request, calling the data "a trade secret."

Hanson, the Canadian psychologist who heads the Static-99 enterprise, has steadfastly rebuffed repeated requests to release data on which the family of instruments is based. Public Safety Canada, his agency, takes the position that it will not release data on which research is still being conducted, and that "external experts can review the data set only to verify substantive claims (i.e., verify fraud), not to conduct new analyses," according to a document filed in the case.

Thornton estimated that the raw data would remain proprietary for another five years, until the research group finished its current projects and released the data to the public domain.

While declining to release the data to the defense, Hanson agreed to release it to Thornton, the government's expert and a co-developer of the original Static-99, so that Thornton could analyze the relative accuracy of the two instruments. 

The American Psychological Association's Ethics Code requires psychologists to furnish data, after their research results are published, to "other competent professionals who seek to verify the substantive claims through reanalysis" (Section 8.14).

At least five researchers have been rebuffed in their attempts to review Static-99 data over the past few years, for purposes of research replication and reanalysis. As described in their 2008 article, Hanson's steadfast refusals to share data required Wollert and his colleagues, statisticians Elliot Cramer and Jacqueline Waggoner, to perform complex statistical manipulations to develop their alternate methodology. (Correspondence between Hanson and Cramer can be viewed HERE.) Hanson also rejected a request by forensic psychologists Brian Abbott and Ted Donaldson; see comments section, below.


Since the Static-99 family of instruments (which includes the Static-99, Static-99R and Static-2002) began to be developed more than a decade ago, the instruments have been in a near-constant state of flux, with risk estimates and instructions for interpretation subject to frequent and dizzying changes.

It is unfortunate, with the stakes so high, that all of these researchers cannot come together in a spirit of open exchange. I'm sure that would result in more scientifically sound, and defensible, risk estimations in court.

The timing of this latest brouhaha is apropos, as reports of bias, inaccuracy and outright fraud have shaken the psychological sciences this year and led to more urgent calls for transparency and sharing of data by researchers. Earlier this year, a large-scale project was launched to systematically try to replicate studies published in three prominent psychological journals.

A special issue of Perspectives on Psychological Science dedicated to the problem of research bias in psychology is available online for free (HERE).

*Hat tip to blog reader David Thompson for alerting me that the trial had concluded. 

October 27, 2012

Another one bites the dust: Hollow SVP prosecution no match for jurors' common sense

15 minutes.

After a five-week trial, that's how long it took a jury in a rural Northern California county to decide that an openly gay man who had served two years in prison for forcible oral copulation of an acquaintance back in 2003 did not merit civil commitment as a sexually violent predator.

The prosecution's case featured a lone government psychologist whose opinion rested on a hollow combination of homophobia, bogus psychiatric diagnoses and trumped-up risk estimates. The psychologist cited archaic (and discredited) Freudian theory to claim that the ex-offender's crime at age 23 was evidence of an "oral incorporation" fixation caused by a domineering mother and an absent biological father. As a legal basis for civil commitment, he cited the bogus disorder of "paraphilia not otherwise specified-nonconsent," and he used the Static-99R actuarial tool to present a highly inflated estimate of risk.

Testifying for the defense were four psychologists, including two retained by the defense, a government evaluator who had changed her mind (or "flipped," in the current parlance) and the man's treating psychologist at Coalinga State Hospital, who testified in no uncertain terms that "Mr. Smith," as I will call him, is neither mentally disordered nor likely to reoffend.

The defense team had barely left the courthouse when the court clerk summoned them back, saying the jury had reached a verdict. Their astonishingly fast decision suggests that the jurors saw this case as an egregious example of overzealous prosecution and a waste of their valuable time.

Prior to being screened for possible civil commitment, Mr. Smith had been on parole in the community for 14 months without getting into any trouble whatsoever. Indeed, he was busy doing good works. His sexually violent predator screening stemmed from an entirely accidental parole violation connected with his charity work for a local gay rights organization. He had a special parole condition forbidding any contact with children. When a fellow member of the executive board brought his child to an awards ceremony, Mr. Smith was exposed to "incidental contact as one might have while shopping at a market," in the words of the parole hearing officer. Unfortunately for Mr. Smith, this was just one month after California voters enacted Jessica's Law, which allows for civil commitment of sex offenders who have only one qualifying victim rather than the previous minimum of two.

The prosecutor's strategy, as is typical in weak cases, was to hurl as many prejudicial, pseudoscientific labels as possible in Mr. Smith's direction, and hope a few might stick and scare jurors into voting for civil commitment: Psychopath, antisocial, homosexual, paraphilic, high risk, etc.

While licensed as a psychologist, the government's expert had not done what clinical psychologists are trained to do: Psychological testing, individualized case formulation, etc. Rather, as he boldly admitted on the witness stand, he relied on an assistant to cull through Mr. Smith's hospital records and pull out negative behavioral reports for him to review. Wow! Can you spell B-I-A-S?

In my testimony, which stretched over the course of three days, I stressed that Mr. Smith was neither sexually deviant nor likely to reoffend. His risk of sexual reoffense, I testified, was no greater than that of any other garden-variety sex offender. (The base rate of sexual recidivism among convicted sex offenders in California -- similar to the rest of the United States -- hovers around 6 percent or less.) I explained how growing up gay in a homophobic family and community causes sexual identity confusion that can lead to sexual acting out and other delinquent behavior in adolescence and early adulthood, and how Mr. Smith had changed as he matured and accepted his sexuality. I further debunked the accuracy of the Static-99R "actuarial" risk estimates assigned in this case, and the pretextually applied diagnoses of "paraphilia not otherwise specified-nonconsent" (which I've blogged about repeatedly) and antisocial personality disorder, a red herring that was invoked despite Mr. Smith's exceptionally good conduct in the community and while in prison.

Stacking the deck

The prosecutor tried to stack the deck by striking from the jury all gay people or those who admitted having relatives or close friends who are gay; he also challenged those with advanced educational degrees. I guess he thought it would be easier to pull the wool over the eyes of an uneducated jury. It just goes to show that times have changed: Even in a rural county, antigay discrimination is no longer considered acceptable, and jurors don't need PhDs to recognize bias and pseudoscience when they hear it.

The verdict was likely a bittersweet moment for Mr. Smith, who had spent more than four years incarcerated at Coalinga awaiting trial. Luckily, he has close friends to stay with while getting back on his feet.

This is my third SVP case in a row that evaporated when finally exposed to the light of day. Like Mr. Smith's case, one of the other two also featured prominent antigay bias; the other targeted an immigrant. In neither case were the men either pedophiles or rapists.

I suppose I should feel pleased to see such gross miscarriages of justice thwarted. Instead, I find myself horrified by the unfettered power wielded by rogue psychologists, assigned to a case by luck of the draw. Whereas many government evaluators reserve "positive" findings for the rare sex offenders who are truly deviant and at high risk to reoffend, others are just hacks who are raking in obscene amounts of public funds while making little effort to truly understand these men, their motivations, their circumstances, or their pathways to desistance.

Especially frightening is the unconscious bias that creeps into SVP prosecutions. The constructs of "mental disorder" and "risk for reoffense" are malleable, lending themselves to use as pretextual weapons of prejudice wielded against gay men, racial minorities (especially African American men) and immigrants.

Clearly, people shouldn't get away with sexual misconduct. But none of these men had. All had pleaded guilty and served their time, only to be ambushed at the end of their prison terms with misguided efforts to indefinitely detain them based on purported future risk.

As it turned out, each case was about as solid as a house of cards. It didn't take gale-force winds like Hurricane Sandy's to flatten them.

Evaluators flipping like pancakes

The "flipping" of government evaluators illustrated this weak foundation. In two of the three cases, after reading the more thorough and individualized reports of the defense-retained experts, government psychologists abruptly changed their minds and decided that their previously proffered diagnoses of "paraphilia not otherwise-nonconsent" were invalid.

On the one hand, I applaud the openness and ethical backbone such a change of heart signals. But these "flips" also demonstrate the whimsical, nonscientific nature of the commitment process. The longer I work in these trenches, the more I realize that the random assignment of evaluators and attorneys (on both sides) exerts as much influence on the outcome as does the true level of future risk to the community that an ex-offender poses.

Indeed, the real reason Mr. Smith -- clearly not a sexual predator to anyone with a whit of common sense -- was taken to trial, at a total cost to the citizenry of hundreds of thousands of dollars, was not his high risk, but a rigid prosecutor who was blind to the writing on the wall.

In contrast, the government dismissed the other two cases (one in the Midwest and one in the South) on the eve of trial. One case involved a gay man who had a brief sexual interlude with a teenage male relative; the other involved an immigrant who had gone on two dates with an underage teen girl he met on an online dating site (his misconduct never went beyond petting). Both had served substantial prison terms. But, again, garden-variety sex offenders, not the depraved, sex-crazed monsters likely envisioned by jurors when they are told they will be deciding a "sexually violent predator" case.

Bottom line: Should a random clinical psychologist, earning hundreds of thousands of dollars a year churning out boilerplate pseudoscientific garbage, be allowed to decide the fates of others?

At least in this one case, 12 discerning and conscientious jurors answered that question with a resounding "NO."


ON OTHER, TOTALLY UNRELATED NOTES: If you're looking for an intelligent movie in theaters now (always a challenging search), ARGO earns a qualified thumbs-up from me; my review is HERE. (If you find the review helpful, please click on "yes" at the bottom.) I've also just finished reading a thoroughly researched and well-written cultural biography of John Brown, Midnight Rising, that positions his raid on Harpers Ferry as a seminal moment in the lead-up to the Civil War. Tony Horwitz previously wrote Confederates in the Attic, which -- as the descendant of Southerners -- I found spot-on.

July 22, 2012

Aurora massacre: To speak or not to speak?

The blood on the movie theater floor was still tacky when mental health professionals began pontificating on the psychology of the mass murderer. Among the brashest self-promoters was a forensic psychologist who shamelessly asserted his preternatural ability to "look inside the mind" of the Aurora, Colorado massacre suspect.
Much of the psycho-punditry reads like it was pulled from a psychoanalytic fortune cookie:
  • James Holmes is a "deeply disturbed" individual. 
  • He may, or may not, be psychotic and delusional. 
  • He harbors a lot of rage.
Such "armchair psychology" is a natural byproduct of the news media's frenetic competition for online traffic. To object is as pointless as it would have been to stand in the killer's way and shout "stop!" as he opened fire during the Batman movie.

But some are nonetheless voicing criticism, saying it is both misleading and irresponsible to speculate at this early stage about the accused's state of mind. Curtis Brainard of the venerated Columbia Journalism Review goes so far as to call it unethical, a violation of the so-called "Goldwater Rule" of 1973. That principle cautions psychiatrists not to offer a professional opinion without having conducted a psychiatric examination and "been granted proper authorization for such a statement."

While that ethics rule applies only to psychiatrists, the American Psychological Association has a very similar one. Section 9.01 cautions psychologists to "provide opinions of the psychological characteristics of individuals only after they have conducted an examination of the individuals adequate to support their statements or conclusions."

But it is in the gray area of interpreting these ethics rules that reasonable minds differ. Indisputably, we should not attempt to clinically diagnose Mr. Holmes absent a formal evaluation. But must professionals with expertise in the general patterns underlying mass killings stand silently on the sidelines, refraining from offering any collective wisdom to the public?

As a blogger who frequently comments on breaking news stories pertinent to forensic psychology, I have often grappled with this conundrum. When the UK Guardian asked me to write a commentary on Phillip Garrido, the kidnapper and rapist of Jaycee Dugard, I ultimately decided that providing general information about the forensic implications of the case was an appropriate public service that did not violate any ethics rules.

Consider this commentary by high-profile forensic psychiatrist Michael Welner on a Washington Post blog:
Mass shooting cases have the common motive of an attacker seeking immortality. Each of the attackers have different degrees of paranoia and resentment of the broader community. Some are so paranoid that they’re psychotic. Others are paranoid in a generally resentful way but have no significant psychiatric illness. But you have to hate everyone in order to kill anyone. The threshold that the mass shooter crosses is one in which he decides that his righteous indignation and entitlement to destroy is more important than the life of any random person that he might kill. This is why mass shooting are invariably, invariably carried out by people who have had high self esteem. They are people who had high expectations of themselves. It’s not at all surprising to hear about these crimes in people who either valued their own intelligence or their own career prospects at one time. They’re people who are unfailingly unable to form satisfying sexual attachments and their masculinity essentially gets replaced with their fascination for destruction.
Now, I don't always see eye to eye with Dr. Welner, author of the controversial "Depravity Scale." But the above perspective has the potential to contribute to informed discussion of the Aurora tragedy. It doesn't matter whether every single detail turns out to be a precise fit; the comments are general enough to enlighten without stepping over the line to claim an ability to see into Holmes's troubled soul.

One could even argue that we as professionals have an affirmative duty to help offset the inane speculation that pours in to fill any vacuum in the cutthroat world of daily journalism: Portrayals of Holmes as a "recluse" and a "loner" because he didn’t converse with his neighbors; assertions that he "didn’t seem like the type" to massacre a dozen people, because he appeared superficially "normal"; simplistic theories blaming the tragedy on violence in the media or the legality of gun ownership.

Our field is positioned to help the public separate the wheat from the chaff. We can discuss the complex admixture of entitlement, alienation and despair that contributes to these catastrophic explosions. Equally important, we can remind the public that such rampages are rare and unpredictable, and that knee-jerk, "memorial crime control" responses are unwarranted and potentially dangerous. We can urge restraint in jumping to conclusions absent the facts, lest we -- as journalist Dave Cullen, author of the book Columbine, warns in yesterday's New York Times -- contribute to harmful myth-making:  
Over the next several days, you will be hit with all sorts of evidence fragments suggesting one motive or another. Don’t believe any one detail. Mr. Holmes has already been described as a loner. Proceed with caution on that. Nearly every shooter gets tagged with that label, because the public is convinced that that’s the profile, and people barely acquainted with the gunman parrot it back to every journalist they encounter. The Secret Service report determined that it’s usually not true. Resist the temptation to extrapolate details prematurely into a whole…. The killer is rarely who he seems.
But we should also recognize the limitations of our discipline’s micro focus on the individual, and encourage the public to grapple with the larger issues raised by this cultural affliction of the late-20th and early 21st century. As I commented last year in regard to the media coverage of the Jared Loughner shooting rampage in Arizona, journalists need to train a macro lens on the cultural forces that lead disaffected middle-class men -- like canaries in a coal mine -- to periodically self-implode with rage. Disciplines such as sociology, anthropology and cultural studies have much to contribute to this much-needed analysis.

The irony of the Aurora case is hard to miss: an attack in a movie theater showing The Dark Knight Rises, a movie in which a masked villain leads murderous rampages against unsuspecting citizens in public venues, including a packed football stadium and the stock exchange.

As Salon film critic Andrew O'Hehir noted in an insightful essay entitled, "Does Batman Have Blood on his Hands?":
Whether or not Holmes had any particular interest in “The Dark Knight Rises,” he saw correctly that in our increasingly fragmented culture it was the biggest mass-culture story of the year and one of the biggest news stories of any kind. Shoot up a KenTaco Hut or a Dunkin’ Donuts, in standard suburban-nutjob fashion, and you get two or three days of news coverage, tops. Shoot up the premiere of a Batman movie, and you become a symbol and provoke a crisis of cultural soul-searching.
Bottom line: The larger error is not for informed professionals to respond -- cautiously, of course -- to media inquiries but, rather, for the public to settle for facile explanations, in which calling someone crazy or disturbed is mistaken for understanding what is going on. 

POSTSCRIPT: See media critic Gene Lyons's article, linking to this post, at the National Memo. 


January 14, 2012

Martin Luther King Jr. on maladjustment

Last year, in honor of Martin Luther King Day, I excerpted a large portion of a keynote speech the visionary civil rights leader delivered at the 1967 convention of the American Psychological Association, just seven months before he was gunned down and at a time when he was drawing larger connections between racial oppression and the Vietnam War. This year, I am excerpting only one short section, but I have made the entire speech, "The Role of the Behavioral Scientist in the Civil Rights Movement," available for download (HERE). It's 45 years old, but still remarkably relevant today.

There are certain technical words in every academic discipline which soon become stereotypes and even clichés. Every academic discipline has its technical nomenclature. You who are in the field of psychology have given us a great word. It is the word maladjusted. This word is probably used more than any other word in psychology. It is a good word; certainly it is good that in dealing with what the word implies you are declaring that destructive maladjustment should be destroyed. You are saying that all must seek the well-adjusted life in order to avoid neurotic and schizophrenic personalities.

But on the other hand, I am sure that we will recognize that there are some things in our society, some things in our world, to which we should never be adjusted. There are some things concerning which we must always be maladjusted if we are to be people of good will. We must never adjust ourselves to racial discrimination and racial segregation. We must never adjust ourselves to religious bigotry. We must never adjust ourselves to economic conditions that take necessities from the many to give luxuries to the few. We must never adjust ourselves to the madness of militarism, and the self-defeating effects of physical violence....

Thus, it may well be that our world is in dire need of a new organization, The International Association for the Advancement of Creative Maladjustment. Men and women should be as maladjusted as the prophet Amos, who in the midst of the injustices of his day, could cry out in words that echo across the centuries, 'Let justice roll down like waters and righteousness like a mighty stream'; or as maladjusted as Abraham Lincoln, who in the midst of his vacillations finally came to see that this nation could not survive half slave and half free; or as maladjusted as Thomas Jefferson, who in the midst of an age amazingly adjusted to slavery, could scratch across the pages of history, words lifted to cosmic proportions, 'We hold these truths to be self evident, that all men are created equal. That they are endowed by their creator with certain inalienable rights. And that among these are life, liberty, and the pursuit of happiness.' And through such creative maladjustment, we may be able to emerge from the bleak and desolate midnight of man’s inhumanity to man, into the bright and glittering daybreak of freedom and justice.

I have not lost hope. I must confess that these have been very difficult days for me personally. And these have been difficult days for every civil rights leader, for every lover of justice and peace.

November 20, 2011

Psychology rife with inaccurate research findings

The case of a Dutch psychologist who fabricated experiments out of whole cloth for at least a decade is shining a spotlight on systemic flaws in the reporting of psychological research.

Diederik Stapel, a well-known and widely published psychologist in the Netherlands, routinely falsified data and made up entire experiments, according to an investigative committee.

But according to Benedict Carey of the New York Times, the scandal is just one in a string of embarrassments in "a field that critics and statisticians say badly needs to overhaul how it treats research results":
In recent years, psychologists have reported a raft of findings on race biases, brain imaging and even extrasensory perception that have not stood up to scrutiny…. 
Dr. Stapel was able to operate for so long, the committee said, in large measure because he was “lord of the data,” the only person who saw the experimental evidence that had been gathered (or fabricated). This is a widespread problem in psychology, said Jelte M. Wicherts, a psychologist at the University of Amsterdam. In a recent survey, two-thirds of Dutch research psychologists said they did not make their raw data available for other researchers to see. "This is in violation of ethical rules established in the field," Dr. Wicherts said.
In a survey of more than 2,000 American psychologists scheduled to be published this year, Leslie John of Harvard Business School and two colleagues found that 70 percent had acknowledged, anonymously, to cutting some corners in reporting data. About a third said they had reported an unexpected finding as predicted from the start, and about 1 percent admitted to falsifying data.
Also common is a self-serving statistical sloppiness. In an analysis published this year, Dr. Wicherts and Marjan Bakker, also at the University of Amsterdam, searched a random sample of 281 psychology papers for statistical errors. They found that about half of the papers in high-end journals contained some statistical error, and that about 15 percent of all papers had at least one error that changed a reported finding -- almost always in opposition to the authors' hypothesis….
Forensic implications

While inaccurate and even fabricated findings make the field of psychology look silly, they take on potentially far more serious ramifications in forensic contexts, where the stakes can include six-figure payouts or extreme deprivations of liberty.

For example, claims based on fMRI brain-scan studies are increasingly being allowed into court in both criminal and civil contexts. Yet, a 2009 analysis found that about half of such studies published in prominent scientific journals were so "seriously defective" that they amounted to voodoo science that "should not be believed."

Similarly, researcher Jay Singh and colleagues have found that meta-analyses purporting to show the efficacy of instruments used to predict who will be violent in the future are plagued with problems, including failure to adequately describe study search procedures, failure to check for overlapping samples or publication bias, failure to investigate the confound of sample heterogeneity, and use of a problematic statistical technique, the Area Under the Curve (AUC), to measure predictive accuracy.

Particularly troubling to me is a brand-new study finding that researchers' willingness to share their data is directly correlated with the strength of the evidence and the quality of reporting of statistical results. (The analysis is available online from the journal PLoS ONE.)

I have heard about several researchers in the field of sex offender risk assessment who stubbornly resist efforts by other researchers to obtain their data for reanalysis. As noted by Dr. Wicherts, the University of Amsterdam psychologist, this is a violation of ethics rules. Most importantly, it makes it impossible for us to be confident about the reliability and validity of these researchers' claims. Despite this, potentially unreliable instruments -- some of them not even published -- are routinely introduced in court to establish future dangerousness.

Critics say the widespread problems in the field argue strongly for mandatory reforms, including the establishment of policies requiring that researchers archive their data to make it available for inspection and analysis by others. This reform is important for the credibility of psychology in general, but absolutely essential in forensic psychology.

Hat tips: Ken Pope and Jane

New article of related interest:

"False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant"
Psychological Science (November 2011)
Joseph Simmons, Leif Nelson, and Uri Simonsohn (click on any of the authors' names to request a copy)

From the abstract: This article show[s] that despite empirical psychologists' nominal endorsement of a low rate of false-positive findings (≤ .05), flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates. In many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not. We present computer simulations and a pair of actual experiments that demonstrate how unacceptably easy it is to accumulate (and report) statistically significant evidence for a false hypothesis.
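
Their point is easy to demonstrate. Below is a minimal simulation of a single "researcher degree of freedom" -- peeking at the data and collecting more subjects if the first test falls short of significance. It is my own illustration, not the authors' code; both groups are drawn from the same distribution, so every "significant" result is a false positive.

```python
# Simulation of one "researcher degree of freedom" (my own illustration,
# not Simmons et al.'s code): test after 20 subjects per group, and if
# p > .05, add 10 more per group and test again. Both groups come from
# the SAME distribution, so any significant result is a false positive.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)
trials = 10_000
false_positives = 0

for _ in range(trials):
    a, b = rng.normal(size=20), rng.normal(size=20)
    p = ttest_ind(a, b).pvalue
    if p > 0.05:  # not significant? collect 10 more per group and retest
        a = np.concatenate([a, rng.normal(size=10)])
        b = np.concatenate([b, rng.normal(size=10)])
        p = ttest_ind(a, b).pvalue
    if p <= 0.05:
        false_positives += 1

print(f"Observed false-positive rate: {false_positives / trials:.1%}")
# Nominal rate is 5%; optional stopping pushes it noticeably higher.
```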

October 20, 2011

More on test administration issues in Twilight Rapist case

Alan Cohen, the attorney in the Billy Joe Harris case that I blogged about last week, wrote to clarify the unusual test administration procedures of psychiatrist Colin Ross, who testified for the defense. Because his letter (which he posted at my Psychology Today blog, Witness) is of general interest to forensic psychology, I re-post it here, along with my response:

Mr. Cohen wrote:
I found your article of interest and hope this will create a forum for further discussion on DID and its use in the courtroom setting.

The issue of my administering the examination to my client took on a sinister spin from the way it was interpreted by Dr. [Robert] Barden when in fact it was nothing more then my hand-carrying it to the jail and passing the sealed envelope into the hand of a deputy who then gave it to my client. The transaction took less then a minute. I remained in an attorney booth with my client who spent four hours answering the self-administered questions. When he completed the exam he placed the results in an envelope and sealed it. He then handed the envelope to a deputy who then gave it to me. That transaction took less then a minute.

I personally carried the test to the jail so that the contents would not be examined by either the sheriffs department or the prosecutors office since Mr. Harris was under extremely tight surveillance and the results of the test would/could form the basis of our defense. I could not jeopardize the results of the exam being compromised by falling into the "wrong hands."

* * * * *

Mr. Cohen,

Thanks for writing to clarify the circumstances of the test administration. I have seen other cases in which psychologists have had third parties administer psychological tests, or have even given prisoners tests to fill out in their spare time and return at their leisure. While the intermediary who delivers the test is not doing anything sinister, from the standpoint of professional ethics and practice there are several problems with such practices.

First and foremost, if a test is standardized -- that is, if it has norms to which an individual is being compared -- then such procedures violate the standardized administration and may invalidate the results.

Second, such procedures violate test security.

Third, they prevent the expert from ensuring the adequacy of testing conditions, or from observing the individual as he performs the tasks; observation by skilled examiners can be an important component of one's ultimate opinions. Relatedly, sitting with the test-taker allows the examiner to assess for adequate comprehension, and answer any questions that may come up.

When Dr. Barden testified that it was unethical for the attorney to administer the tests, he was likely referring to the Ethics Code for psychologists, as well as the Standards for Educational and Psychological Testing ("The Standards") promulgated by the American Educational Research Association, the American Psychological Association and the National Council on Measurement in Education.

As noted in the introduction to the Standards, which apply to everyone who administers, scores and interprets psychological or educational tests, regardless of whether they are a psychologist:
The improper use of tests can cause considerable harm to test takers and other parties affected by test-based decisions. The intent of the Standards is to promote the sound and ethical use of tests and to provide a basis for evaluating the quality of testing practices.
Collectively, the Ethics Code and the Standards require that:
  • Test administrators receive proper training (Ethics Code 9.07; Standards 12.8)
  • Tests not be administered by unqualified persons (Ethics Code 9.07; Standards 12.8)
  • Examinees receive proper informed consent (Ethics Code 9.03; Standards 12.10)
  • Test data be kept confidential and secure (Ethics Code 9.04; Standards 12.11)
  • Assessment techniques be protected from disclosure to the extent permitted by law (Ethics Code 9.11; Standards 12.11)
Again, I appreciate your taking the time to write.

NOTE: After I posted this exchange, the testifying psychiatrist, Colin A. Ross, posted a comment at my Psychology Today blog. He provided more information about the screening tests for dissociation and why they were administered as they were. He also offered his opinion on the validity of Dissociative Identity Disorder. His comment can be viewed HERE. Please feel free to join in the discussion, either here or (preferably) at my Witness blog, where the conversation began.

August 18, 2011

At long last: New forensic specialty guidelines approved

After a 9-year revision process, the American Psychological Association has finally approved new Specialty Guidelines for Forensic Psychologists. The Guidelines will replace those established in 1991.

The Guidelines are intended for use not only by forensic psychologists, but by any psychologist when engaged in the practice of forensic psychology. Forensic psychology is defined as the application of any specialized psychological knowledge to a legal context, to assist in addressing legal, contractual, and administrative matters. The Guidelines are also meant to provide guidance on professional conduct to the legal system, and other organizations and professions.

Guidelines differ from standards, such as those in the APA's Ethics Code, in that they are aspirational rather than mandatory. They are intended to facilitate the continued systematic development of the profession and facilitate a high level of practice by psychologists, rather than to serve as a basis for disciplinary action or civil or criminal liability.

The revision committee, chaired by Randy Otto, included representatives of the American Psychology-Law Society (Division 41 of the APA) and the American Academy of Forensic Psychology.

The Guidelines will be published shortly in the journal American Psychologist. In the meantime, a draft version is available HERE. I encourage all of you to read and learn its contents. Much of it will sound familiar to those with a working knowledge of the APA's Ethical Principles of Psychologists and Code of Conduct. Although the Guidelines dance around some of the major controversies in our field, there is still plenty to be happy about. By way of whetting your appetite (hopefully), here is a random smattering:
  2.05 Knowledge of the Scientific Foundation for Opinions and Testimony: Forensic practitioners seek to provide opinions and testimony that are sufficiently based upon adequate scientific foundation, and reliable and valid principles and methods that have been applied appropriately to the facts of the case. When providing opinions and testimony that are based on novel or emerging principles and methods, forensic practitioners seek to make known the status and limitations of these principles and methods.
  2.08 Appreciation of Individual and Group Differences: Forensic practitioners strive to understand how factors associated with age, gender, gender identity, race, ethnicity, culture, national origin, religion, sexual orientation, disability, language, socioeconomic status, or other relevant individual and cultural differences may affect and be related to the basis for people's contact and involvement with the legal system.
  6.03 Communication with Forensic Examinee: Forensic practitioners inform examinees about the nature and purpose of the examination, … including potential consequences of participation or non-participation, if known.
  10.01 Focus on Legally Relevant Factors: Forensic practitioners are encouraged to consider the problems that may arise by using a clinical diagnosis in some forensic contexts, and consider and qualify their opinions and testimony appropriately.
  11.04 Comprehensive and Accurate Presentation of Opinions in Reports and Testimony: Forensic practitioners are encouraged to limit discussion of background information that does not bear directly upon the legal purpose of the examination or consultation. Forensic practitioners avoid offering information that is irrelevant and that does not provide a substantial basis of support for their opinions, except when required by law.
Leonard Rubenstein, a senior scholar at the Center for Public Health and Human Rights of the Johns Hopkins Bloomberg School of Public Health, writes in a Huffington Post column that the new Guidelines will prevent psychologists from participating in abusive government interrogations as they did at Guantanamo. I think that's a stretch. These guidelines are not enforceable. And, like all such professional guidelines, they will be subject to diverse interpretations.