
January 12, 2014

Putting the Cart Before the Horse: The Forensic Application of the SRA-FV

As the developers of actuarial instruments such as the Static-99R acknowledge that their original norms inflated the risk of re-offense for sex offenders, a brand-new method is cropping up to preserve those inflated risk estimates in sexually violent predator civil commitment trials. The method introduces a new instrument, the “SRA-FV,” in order to bootstrap special “high-risk” norms on the Static-99R. Curious about the scientific support for this novel approach, I asked forensic psychologist and statistics expert Brian Abbott to weigh in.

Guest post by Brian Abbott, PhD*

NEWS FLASH: Results from the first peer-reviewed study about the Structured Risk Assessment: Forensic Version (“SRA-FV”), published in Sexual Abuse: Journal of Research and Treatment (“SAJRT”), demonstrate the instrument is not all that it’s cracked up to be.
Promotional material for an SRA-FV training
For the past three years, the SRA-FV developer has promoted the instrument for clinical and forensic use despite the absence of peer-reviewed, published research supporting its validity, reliability, and generalizability. Accordingly, some clinicians who have attended SRA-FV trainings around the country routinely apply the SRA-FV in sexually violent predator risk assessments and testify about its results in court as if the instrument has been proven to measure what it intends to assess, has known error rates, retains validity when applied to other groups of sexual offenders, and produces trustworthy results.

Illustrating this rush to acceptance most starkly, within just three months of its informal release (February 2011) and with an absence of any peer-reviewed research, the state of California incredibly decided to adopt the SRA-FV as its statewide mandated dynamic risk measure for assessing sexual offenders in the criminal justice system. This decision was rescinded in September 2013, with the SRA-FV replaced with a similar instrument, the Stable-2007.

The SRA-FV consists of 10 items that purportedly measure “long-term vulnerabilities” associated with sexual recidivism risk. The items are distributed among three risk domains and are assessed using either standardized rating criteria devised by the developer or by scoring certain items on the Psychopathy Checklist-Revised (PCL-R). Scores on the SRA-FV range from zero to six. Some examples of the items from the instrument include: sexual interest in children, lack of emotionally intimate relationships with adults, callousness, and internal grievance thinking. Patients from the Massachusetts Treatment Center in Bridgewater, Massachusetts who were evaluated as sexually dangerous persons between 1959 and 1984 served as members of the SRA-FV construction group (unknown number) and validation sample (N = 418). It was released for use by Dr. David Thornton, a co-developer of the Static-99R, Static-2002R, and SRA-FV and research director at the SVP treatment program in Wisconsin, in December 2010 during training held in Atascadero, California. Since then, Dr. Thornton has held similar trainings around the nation where he asserts that the SRA-FV is valid for predicting sexual recidivism risk, achieves incremental validity over the Static-99R, and can be used to choose among Static-99R reference groups.

A primary focus of the trainings is a novel system in which the total score on the SRA-FV is used to select one Static-99R “reference group” among three available options. The developer describes the statistical modeling underlying this procedure, which he claims increases predictive validity and power over using the Static-99R alone. However, reliability data is not offered to support this claim. In the December 2010 training, several colleagues and I asked for the inter-rater agreement rate but Dr. Thornton refused to provide it.

I was astounded but not surprised when some government evaluators in California started to apply the SRA-FV in sexually violent predator risk assessments within 30 days after the December 2010 training. This trend blossomed in other jurisdictions with sexually violent predator civil confinement laws. Typically, government evaluators applied the SRA-FV to select Static-99R reference groups, invariably choosing to compare offenders with the “High Risk High Needs” sample with the highest re-offense rates. A minority of clinicians stated in reports and court testimony that the SRA-FV increased predictive accuracy over the Static-99R alone but they were unable to quantify this effect. The same clinicians have argued that the pending publication of the Thornton and Knight study was sufficient to justify its use in civil confinement risk assessments for sexually violent predators. They appeared to imply that the mere fact that a construction and validation study had been accepted for publication was an imprimatur that the instrument was reliable and valid for its intended purposes. Now that the research has been peer-reviewed and published, the results reflect that these government evaluators apparently put the proverbial cart before the horse.

David Thornton and Raymond Knight penned an article that documents the construction and validation of the SRA-FV. The publication is a step in the right direction, but by no means do the results justify widespread application of the SRA-FV in sexual offender risk assessment in general or sexually violent predator proceedings in particular. Rather, the results of the study only apply to the group upon which the research was conducted and do not generalize to other groups of sexual offenders. Before discussing the limitations of the research, I would like to point out some encouraging results.

The SRA-FV did, as its developer claimed, account for more sources of sexual recidivism risk than the Static-99R alone. However, it remains unknown which of the SRA-FV’s ten items contribute to risk prediction. The study also found that the combination of the Static-99R and SRA-FV increased predictive power. This improved predictive accuracy, however, must be replicated to determine whether the combination of the two instruments will perform similarly in other groups of sexual offenders. This is especially important when considering that the SRA-FV was constructed and validated on individuals from the Bridgewater sample from Massachusetts who are not representative of contemporary groups of sexual offenders. Thornton and Knight concede this point when discussing how the management of sexual offenders through all levels of the criminal justice system in Massachusetts between 1959 and 1984 was remarkably lenient compared to contemporary times. Such historical artifacts likely compromise any reliable generalization from patients at Bridgewater to present-day sexual offenders.

Training materials presented four months before State of California rescinded use of the SRA-FV

Probably the most crucial finding from the study is the SRA-FV’s poor inter-rater reliability. The authors categorize the 64 percent rate of agreement as “fair.” It is well known that inter-rater agreement in research studies is typically higher than in real-world applications. This has been addressed previously in this blog in regard to the PCL-R. A field reliability study of the SRA-FV among 19 government psychologists rating 69 sexually violent predators in Wisconsin (Sachsenmaier, Thornton, & Olson, 2011) found an inter-rater agreement rate of only 55 percent for the SRA-FV total score, which is considered poor reliability. These data indicate that 36 percent to 45 percent of an SRA-FV score constitutes error, raising serious concerns over the trustworthiness of the instrument. To their credit, Thornton and Knight acknowledge this as an issue and note that steps should be taken to increase reliable scoring. Nonetheless, the current inter-rater reliability falls far short of the 80 percent floor recommended for forensic practice (Heilbrun, 1992). Unless steps are taken to dramatically improve reliability, the claims that the SRA-FV increases predictive accuracy either alone or in combination with the Static-99R, and that it should be used to select Static-99R reference groups, are moot.
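The arithmetic behind these error figures can be sketched in a few lines of Python. This is only a minimal sketch: it assumes, as the paragraph above does, that whatever share of scoring is not agreement counts as error, and the function and variable names are illustrative rather than drawn from either study.

```python
# Agreement rates reported in the text, expressed as proportions.
AGREEMENT = {
    "research (Thornton & Knight)": 0.64,
    "field (Sachsenmaier et al., 2011)": 0.55,
}

# Floor recommended for forensic practice (Heilbrun, 1992), as cited in the text.
FORENSIC_FLOOR = 0.80


def error_share(agreement: float) -> float:
    """Share of a score treated as error: 1 minus the agreement rate (an assumption)."""
    return round(1.0 - agreement, 2)


def shortfall(agreement: float, floor: float = FORENSIC_FLOOR) -> float:
    """How far an agreement rate falls below the floor (0.0 if it meets the floor)."""
    return round(max(0.0, floor - agreement), 2)


for label, rate in AGREEMENT.items():
    print(f"{label}: error {error_share(rate):.0%}, "
          f"{shortfall(rate) * 100:.0f} points below the floor")
```

Under that assumption, the research and field agreement rates correspond to the 36 percent and 45 percent error shares cited above, and both sit well below the 80 percent floor.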

It is also important to note that, although Thornton and Knight confuse the terms validation and cross-validation in their article, this study represents a validation methodology. Cross-validation is a process by which the statistical properties found in a validation sample (such as reliability, validity, and item correlations) are tested in a separate group to see whether they hold up. In contrast, Thornton and Knight first considered the available research data from a small number of individuals from the Bridgewater group to determine what items would be included in the SRA-FV. This group is referred to as the construction sample. The statistical properties of the newly conceived measure were studied on 418 Bridgewater patients who constitute the validation sample. The psychometric properties of the validation group have not been tested on other contemporary sexual offender groups. Absent such cross-validation studies, we simply have no confidence that the SRA-FV works as it has been designed for groups other than the sample upon which it was validated. To their credit, Thornton and Knight acknowledge this limitation and warn readers not to generalize the validation research to contemporary groups of sexual offenders.

The data on incremental predictive validity, while interesting, have little practical value at this point for two reasons. One, it is unknown whether the results will replicate in contemporary groups of sexual offenders. Two, no data are provided to quantify the increased predictive power. The study does not provide an experience table of probability estimates at each score on the Static-99R after taking into account the effect of the SRA-FV scores. It seems disingenuous, if not misleading, to inform the trier of fact that the combined measures increase predictive power but to fail to quantify the result and the associated error rate.

In my practice, I have seen the SRA-FV used most often to select among three Static-99R reference groups. Invariably, government evaluators in sexually violent predator risk assessments assign SRA-FV total scores consistent with the selection of the Static-99R High Risk High Needs reference group. Only the risk estimates associated with the highest Static-99R scores in this reference group are sufficient to support an opinion that an individual meets the statutory level of sexual dangerousness necessary to justify civil confinement. Government evaluators who have used the SRA-FV for this purpose cannot cite research demonstrating that the procedure works as intended or that it produces a reliable match to the group representing the individual being assessed. Unfortunately, Thornton and Knight are silent on this application of the SRA-FV.

In a recently published article, I tested the use of the SRA-FV for selecting Static-99R reference groups. In brief, Dr. Thornton used statistical modeling based solely on data from the Bridgewater sample to devise this model. The reference group selection method was not based on the actual scores of members from each of the three reference groups. Rather, it was hypothetical, presuming that members of a Static-99R reference group will exhibit a certain range of SRA-FV scores that does not overlap with the range of either of the other two reference groups. To the contrary, I found that the hypothetical SRA-FV reference group system did not work as designed, as the SRA-FV scores between reference groups overlapped by wide margins. In other words, the SRA-FV total score would likely be consistent with selecting two if not all three Static-99R reference groups. In light of these findings, it is incumbent upon the developer to provide research using actual subjects to prove that the SRA-FV total score is a valid method by which to select a single Static-99R reference group and that the procedure can be applied reliably. At this point, credible support does not exist for using the SRA-FV to select Static-99R reference groups.
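The overlap problem can be illustrated with a short Python sketch. The score bands below are invented purely for illustration; they are not the bands taught in the SRA-FV trainings or reported in Abbott (2013), and "Group A" and "Group B" are placeholder labels for the two reference groups other than High Risk High Needs. The sketch shows only the logic: when bands overlap, a single total score is consistent with more than one reference group.

```python
from typing import Dict, List, Tuple

Band = Tuple[float, float]  # inclusive (low, high) range of SRA-FV total scores

# HYPOTHETICAL bands on the 0-6 scale, invented for illustration only.
# "Group A" and "Group B" are placeholder labels, not names from the source.
ASSUMED_DESIGN: Dict[str, Band] = {
    "Group A": (0.0, 1.9),
    "Group B": (2.0, 3.9),
    "High Risk High Needs": (4.0, 6.0),
}
ASSUMED_OVERLAP: Dict[str, Band] = {
    "Group A": (0.0, 3.5),
    "Group B": (1.5, 4.5),
    "High Risk High Needs": (2.5, 6.0),
}


def consistent_groups(score: float, bands: Dict[str, Band]) -> List[str]:
    """Return every reference group whose band contains the score."""
    return [name for name, (low, high) in bands.items() if low <= score <= high]


score = 3.0  # an arbitrary example score
print(consistent_groups(score, ASSUMED_DESIGN))   # exactly one group matches
print(consistent_groups(score, ASSUMED_OVERLAP))  # all three groups match
```

With non-overlapping bands the score picks out one group, which is what the selection procedure presumes. With overlapping bands the same score no longer identifies a single reference group.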

The design, development, validation, and replication of psychological instruments are guided by the Standards for Educational and Psychological Testing (“SEPT” -- American Educational Research Association et al., 1999). When comparing the Thornton and Knight study to the framework provided by SEPT, it is apparent the SRA-FV is in the infancy stage of development. At best, the SRA-FV is a work in progress that needs substantially more research to improve its psychometric properties. Aside from its low reliability and inability to generalize the validation research to other groups of sexual offenders, other important statistical properties await examination, including but not limited to:

  1. standard error of measurement
  2. factor analysis of whether items within each of the three risk domains significantly load in their respective domains
  3. the extent of the correlation between each SRA-FV item and sexual recidivism
  4. which SRA-FV items add incremental validity beyond the Static-99R or may be redundant with it
  5. whether each item has construct validity

It is reasonable to conclude that at its current stage of development the use of the SRA-FV in forensic proceedings is premature and scientifically indefensible. In closing, in their eagerness to improve the accuracy of their risk assessments, clinicians relied upon Dr. Thornton’s claims in the absence of peer-reviewed research demonstrating that the SRA-FV achieved generally accepted levels of reliability and validity. The history of forensic evaluators deploying the SRA-FV before the publication of the construction and validation study raises significant ethical and legal questions:

  • Should clinicians be accountable to vet the research presented in trainings by an instrument’s developer before applying a tool in forensic practice? 

  • What responsibility do clinicians have to rectify testimony where they presented the SRA-FV as if the results were reliable and valid?

  • How many individuals have been civilly committed as sexually violent predators based on testimony that the findings from the SRA-FV were consistent with individuals meeting the legal threshold for sexual dangerousness, when the published data does not support this conclusion?

Answers to these questions and others go beyond the scope of this blog. However, in a recent appellate decision, a Washington appeals court questioned the admissibility of the SRA-FV in the civil confinement trial of Steven Ritter. The appellate court determined that the application of the SRA-FV was critical to the government evaluator’s opinion that Mr. Ritter met the statutory threshold for sexual dangerousness. Since the SRA-FV is considered a novel scientific procedure, the appeals court reasoned that the trial court erred by not holding a defense-requested evidentiary hearing to decide whether the SRA-FV was admissible evidence for the jury to hear. The appeals court remanded the issue to the trial court to hold a Kelly-Frye hearing on the SRA-FV. Stay tuned!


Abbott, B.R. (2013). The Utility of Assessing “External Risk Factors” When Selecting Static-99R Reference Groups. Open Access Journal of Forensic Psychology, 5, 89-118.

American Educational Research Association, American Psychological Association and National Council on Measurement in Education. (1999). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association.

Heilbrun, K. (1992). The role of psychological testing in forensic assessment. Law and Human Behavior, 16, 257-272. doi: 10.1007/BF01044769.

In Re the Detention of Steven Ritter. (2013, November). In the Appeals Court of the State of Washington, Division III. 

Sachsenmaier, S., Thornton, D., & Olson, G. (2011, November). Structured risk assessment forensic version (SRA-FV): Score distribution, inter-rater reliability, and margin of error in an SVP population. Presentation at the 30th Annual Research and Treatment Conference of the Association for the Treatment of Sexual Abusers, Toronto, Canada.

Thornton, D. & Knight, R.A. (2013). Construction and validation of the SRA-FV Need Assessment. Sexual Abuse: A Journal of Research and Treatment. Published online December 30, 2013. doi: 10.1177/1079063213511120.
* * *

*Brian R. Abbott is a licensed psychologist in California and Washington who has evaluated and treated sexual offenders for more than 35 years. Among his areas of forensic expertise, Dr. Abbott has worked with sexually violent predators in various jurisdictions within the United States, where he performs psychological examinations, trains professionals, consults on psychological and legal issues, offers expert testimony, and publishes papers and peer-reviewed articles.

(c) Copyright 2013 - All rights reserved

January 5, 2014

New evidence of psychopathy test's poor accuracy in court

Use of a controversial psychopathy test is skyrocketing in court, even as mounting evidence suggests that the prejudicial instrument is highly inaccurate in adversarial settings.

The latest study, published by six respected researchers in the influential journal Law and Human Behavior, explored the accuracy of the Psychopathy Checklist, or PCL-R, in Sexually Violent Predator cases around the United States.

The findings of poor reliability echo those of other recent studies in the United States, Canada and Europe, potentially heralding more admissibility challenges in court. 

Although the PCL-R is used in capital cases, parole hearings and juvenile sentencing, by far its most widespread forensic use in the United States is in Sexually Violent Predator (SVP) cases, where it is primarily invoked by prosecution experts to argue that a person is at high risk for re-offense. Building on previous research, David DeMatteo of Drexel University and colleagues surveyed U.S. case law from 2005-2011 and located 214 cases from 19 states -- with California, Texas and Minnesota accounting for more than half of the total -- that documented use of the PCL-R in such proceedings.

To determine the reliability of the instrument, the researchers examined a subset of 29 cases in which the scores of multiple evaluators were reported. On average, scores reported by prosecution experts were about five points higher than those reported by defense-retained experts. This is a large and statistically significant difference that cannot be explained by chance. 

Prosecution experts were far more likely to give scores of 30 or above, the cutoff for presumed psychopathy. Prosecution experts reported scores of 30 or above in almost half of the cases, whereas defense witnesses reported scores that high in less than 10 percent.

Looking at interrater reliability another way, the researchers applied a classification scheme from the PCL-R manual in which scores are divided into five discrete categories, from “very low” (0-8) to “very high” (33-40). In almost half of the cases, the scores given by two evaluators fell into different categories; in about one out of five cases the scores were an astonishing two or more categories apart (e.g., “very high” versus “moderate” psychopathy).

Surprisingly, interrater agreement was even worse among evaluators retained by the same side than among opposing experts, suggesting that the instrument’s inaccuracy is not solely due to what has been dubbed adversarial (or partisan) allegiance.

Despite its poor accuracy, the PCL-R is extremely influential in legal decision-making. The concept of psychopathy is superficially compelling in our current era of mass incarceration, and the instrument's popularity shows no sign of waning. 

Earlier this year, forensic psychologist Laura Guy and colleagues reported on its power in parole decision-making in California. The state now requires government evaluators to use the PCL-R in parole fitness evaluations for “lifers,” or prisoners sentenced to indeterminate terms of up to life in prison. Surveying several thousand cases, the researchers found that PCL-R scores were a strong predictor of release decisions by the Parole Board, with those granted parole scoring an average of about five points lower than those denied parole. Having just conducted one such evaluation, I was struck by the frightening fact -- alluded to by DeMatteo and colleagues -- that the chance assignment of an evaluator who typically gives high scores on the PCL-R “might quite literally mean the difference between an offender remaining in prison versus being released back into the community.”

Previous research has established that Factor 1 of the two-factor instrument – the factor measuring characterological traits such as manipulativeness, glibness and superficial charm – is especially prone to error in forensic settings. This is not surprising, as traits such as “glibness” are somewhat in the eye of the beholder and not objectively measurable. Yet, the authors assert, “it is exactly these traits that seem to have the most impact” on judges and juries.

Apart from the issue of poor reliability, the authors questioned the widespread use of the PCL-R as evidence of impaired volitional control, an element required for civil commitment in SVP cases. They labeled as “ironic, if not downright contradictory” the fact that psychopathy is often touted in traditional criminal responsibility (or insanity) cases as evidence of badness as opposed to mental illness, yet in SVP cases it magically transforms into evidence of a major mental disorder that interferes with self-control. 

The evidence is in: The Psychopathy Checklist-Revised is too inaccurate in applied settings to be relied upon in legal decision-making. With consistent findings of abysmal interrater reliability, its prejudicial impact clearly outweighs any probative value. However, the gatekeepers are not guarding the gates. So long as judges and attorneys ignore this growing body of empirical research, prejudicial opinions will continue to be cloaked in a false veneer of science, contributing to unjust outcomes.

* * * * *
The study is: 

The Role and Reliability of the Psychopathy Checklist-Revised in U.S. Sexually Violent Predator Evaluations: A Case Law Survey by DeMatteo, D., Edens, J. F., Galloway, M., Cox, J., Toney Smith, S. and Formon, D. (2013). Law and Human Behavior

Copies may be requested from the first author.

The same research team has just published a parallel study in Psychology, Public Policy and Law:

“Investigating the Role of the Psychopathy Checklist-Revised in United States Case Law” by DeMatteo, David; Edens, John F.; Galloway, Meghann; Cox, Jennifer; Smith, Shannon Toney; Koller, Julie Present; Bersoff, Benjamin

(c) Copyright Karen Franklin 2013 - All rights reserved

December 24, 2013

Legal challenge may force changes to Minnesota civil commitment

Guest post by Jon Brandt, MSW, LICSW

It has been 16 years since the U.S. Supreme Court narrowly upheld the constitutionality of controversial preventive detention schemes for dangerous sex offenders. Now, with 20 U.S. states incarcerating many thousands of men at an annual cost of more than $500 million, Minnesota has become Ground Zero for a new round of legal challenges alleging that the state’s treatment program is a sham from which no one is ever released. In this guest post, Jon Brandt gives a first-person report on last week’s momentous federal hearing.

U.S. District Court Judge Donovan Frank
SAINT PAUL, MINNESOTA -- On December 18 at the Federal District Courthouse, Judge Donovan Frank heard motions in a federal lawsuit that promises to dramatically change the civil commitment landscape in Minnesota and, by extension, around the country.

The case began modestly two years ago as a pro se complaint by about a dozen detainees at the Minnesota Sex Offender Program (MSOP).* The Federal District Court for Minnesota determined the case had merit, appointed counsel, and in 2012 Judge Frank certified it as a class action.  At a hearing last Wednesday, Dan Gustafson, lead attorney for the plaintiffs, argued motions alleging that civil commitment as administered in Minnesota is unconstitutional. 

An inauspicious start

When the court convened there was a sparse audience that included a few families of MSOP clients, a handful of reporters, and several professional stakeholders. Conspicuously absent were any plaintiffs. Perhaps there’s some irony in the fact that, in 20 years, not only has no one ever been fully discharged from MSOP, but apparently all current clients are too dangerous for any of them to be shackled and accompanied by security personnel to a federal courtroom to hear arguments on the conditions of their own confinement. Given that courtrooms are designed to contain dangerous people, whether the decision to exclude clients was made by executive or judicial authorities, it seems like a missed opportunity to allow some representative plaintiffs to bear direct witness to the wheels of justice.

The hearing had an inauspicious start for the 698 plaintiffs civilly detained 90 miles away – the audio feed via phone lines failed. So, after waiting 15 years for the courts to reconsider their plight, the plaintiffs missed the first hour of legal arguments. When the audio connection was finally restored, Judge Frank assured wary plaintiffs that the technical problems were not deliberate, and personally took responsibility.

The hearing began with attorney Gustafson arguing for “declaratory judgment,” or a legal finding that the state’s civil commitment program is operating in an unconstitutional manner. He cited case law that clients have a constitutional right to rehabilitation and claimed that the program breaches civil liberties and offers neither adequate rehabilitation nor acceptable living conditions.

No one ever released

Detainee at Moose Lake MSOP facility
The state’s attorney, Assistant Attorney General Nate Brennaman, countered that the program does provide appropriate treatment, that there is no constitutional right to treatment, and that the plaintiffs are basing their entire case on a single fact, “That no one has ever gotten out.”

Gustafson seemed amused that the defense was making his case. The fact that no one is released is strong evidence, he asserted. He pointed out that nearby states have far better track records. Wisconsin, with demographics nearly identical to Minnesota’s, has civilly committed only 351 people, and nearly half are now on either conditional or full release. Iowa has committed only about 103 people, and about 30 of those have been provisionally or fully released. He pointed out that treatment which was originally estimated to be completed in 32 months is now anticipated to last eight to nine years. Not a single one of the more than 700 individuals (including one female) who have been detained has ever completed the treatment program, and only one is on conditional release.

The plaintiffs next argued for a court order mandating that each detainee be individually evaluated to determine whether he might safely be released to a “less restrictive alternative,” or LRA.

Judge Frank peppered the hearing with comments and questions that frequently interrupted attorneys on both legal teams, and also gave clues to his persuasion. Noting that Justice Kennedy was the swing vote in the 5-4 ruling in Kansas v. Hendricks, he read a passage from Kennedy’s concurring opinion in which Kennedy cautioned that “an improvident plea bargain” by the criminal justice system cannot be remedied by the civil commitment system, and that retribution is exclusively within the domain of criminal justice. Judge Frank also raised concerns about 18 infirm clients (one of whom is 91) who require assisted living and questioned the “dangerousness” of such relatively incapacitated clients. He also questioned conditions of confinement that mimic prison. When the state’s attorney argued that conditions of criminal versus civil confinement had been decided by the US Supreme Court in the 1982 Youngberg v. Romeo case, Judge Frank interrupted with, “No, it wasn’t… but continue.”

Judge Frank expressed concern that most of the clients at the MSOP were still in the first phase of treatment, and twice pointed to his understanding that treatment progress is not only slow but that some clients are apparently sent back to redo previous phases. He also seemed concerned that detainees get less treatment than sexual offenders incarcerated in state prisons. He pondered rhetorically, “How much treatment is enough?” and questioned how the “Youngberg standard” of professional judgment might determine completion of treatment.

Motion for federal oversight

Moose Lake
The plaintiffs’ third motion was for the appointment of a “special master” and federal supervision of both the facility and the system. A special master is an administrator who would oversee MSOP operations and implement federal court directives. The state’s attorney responded that clients are getting effective treatment at MSOP, that treatment is subject to quarterly reviews, which is more stringent than other states that only require annual reviews, that MSOP has filled most of its open clinical positions, and that there is nothing that a special master could do that isn’t either already being done, or that DHS couldn’t manage if so directed by the federal court.

If Judge Frank grants the first motion, finding conditions unconstitutional, the other two motions might be automatic -- MSOP could be put under federal supervision, much as the state of Washington’s program was from 1994 to 2007.

Judge Frank confirmed that on December 6 he appointed four sex offender treatment experts to guide the proceedings, under Federal Rule of Evidence 706. The four experts are:
  • Mike Miner, Professor and Research Director of the Program in Human Sexuality at the University of Minnesota Medical School
  • Naomi Freeman, who leads New York’s unit for Strict and Intensive Supervision and Treatment that manages civilly committed individuals outside of secure facilities
  • Deborah McCulloch, director of Wisconsin’s sex offender civil commitment program, and 
  • Robin Wilson, former clinical director at the Florida sex offender civil commitment program from 2006 to 2011, during which time there was a class action and settlement

Judge Frank seems to have exercised judicial restraint over the two years since the original complaint was filed. In an effort to prod state government, in 2012 he ordered the establishment of a special task force to make recommendations to the state legislature. The Task Force held several hearings and collected relevant documents. It issued its first report in December 2012, with general recommendations for public-private partnerships to establish a statewide network of less restrictive alternatives. The report echoed critical findings by the Minnesota Office of the Legislative Auditor in 2011. Unfortunately the state legislature adjourned in May 2013 without enacting legislative changes.

Events may force action

Wednesday’s motions, the critical reports, and two other events in 2013 will likely force Judge Frank to act soon. Last summer Dr. Grant Duwe, chief researcher for the Minnesota Department of Corrections, published research that challenges the government’s foundational claim that civil detainees are “highly likely” to reoffend. Duwe’s research indicates that most of the detainees are highly likely to NOT reoffend.   

Then, last month, Minnesota Governor Mark Dayton issued an executive order that continued the eight-year moratorium of his predecessor -- that there will be no further releases of clients from MSOP, except by court order. With this abdication of executive oversight, all three branches of the state government seem to be in perpetual paralysis. 

Minnesota’s government is managing the Sex Offender Civil Commitment program (SOCC) like someone holding a wolf by the ears -- unwilling to hold on and afraid to let go. Modest reforms that are in progress at MSOP are being sabotaged by systemic failures. Clinical staff have the impossible job of trying to maintain the integrity of endless treatment goals for clients trapped in a treatment paradox, and have come to realize that the promise of rehabilitation is disingenuous.

Legal scholar Eric Janus
One of the highly principled critics of SOCC who is likely to be vindicated by imminent rulings from Judge Frank is Eric Janus. Janus is the President and Dean of the William Mitchell College of Law, and author of “Failure to Protect: America’s Sexual Predator Laws and the Rise of the Preventive State” (Cornell University Press, 2006). Janus led an unsuccessful challenge to SOCC before the Minnesota Supreme Court in the 1990s. Since then, he has been warning that the SOCC, as public policy, is deceptively enticing, deeply flawed, and destined to overreach its stated intent. Janus was also a member of the Minnesota SOCC Task Force.

Judge Frank indicated that he will accept a joint amicus brief from Janus and the ACLU, due Dec. 27, and will rule on the motions within 60 days.

My take is that the federal courts can no longer ignore repeated judicial admonishments; if the SOCC begins to look like retribution or prison in disguise, the courts will intervene. With precedent in the state of Washington, Judge Frank seems poised to put MSOP under federal supervision. Depending on the strength of any finding of “unconstitutional,” the ruling could have far-reaching implications that echo around the United States.

Relevant legal cases:

*Karsjens, et al. v. MN Department of Human Services, et al., CV 11-3659 DWF/JJK

Foucha v. Louisiana, (90-5844), 504 U.S. 71 (1992).

Strutton v. Meade, (10–2029) 668 F.3d 549, US Court of Appeals for the Eighth Circuit (2012)

Youngberg v. Romeo, (80-1429) 457 U.S. 307 (1982)

Call v. Gomez, 535 N.W.2d 312, Supreme Court of Minnesota (1995)

Seling v. Young (99-1185) 531 U.S. 250 (2001)

Jon Brandt is a clinical social worker in Minnesota who has worked for 35 years in the prevention of sexual abuse. He has provided evaluations, treatment, and supervision to several hundred sexual offenders, and provided professional consultation and training to colleagues. He is a Clinical Member of the Association for the Treatment of Sexual Abusers (ATSA) and is a blogger for ATSA’s website, Sexual Abuse: A Journal of Research and Treatment. In February 2012 his post, “Doubts about SVP Programs,” was re-blogged here.

November 17, 2013

Static-99 “norms du jour” get yet another makeover

It would be humorous if the real-world consequences were not so grave.

Every year, at a jam-packed session of the annual conference of the Association for the Treatment of Sexual Abusers (ATSA), the developers of the Static-99 family of actuarial risk assessment tools roll out yet another new methodology to replace the old.

This year, they announced that they are scrapping two of three sets of "non-routine" comparison norms that they introduced at an ATSA conference just four years ago. Stay tuned, they told their rapt audience, for further instructions on how to choose between the two remaining sets of norms. 

To many, this might sound dry and technical. But in the courtroom trenches, sexually violent predator cases often hinge on an evaluator's choice of a comparison group. Should the offender be compared with the full population of convicted sex offenders? Or a subset labeled "high risk/needs" that offended at a rate about 3.5 times that of the more representative group (21 percent versus 6 percent after five years)?

To illustrate, whereas only about 3 percent (4 out of 139) of the men over 70 in the combined Static-99R samples reoffended, invoking the high-risk norms would cause a septuagenarian's risk to skyrocket by 400 percent. It's not hard to see why such an inflated estimate might increase the odds of a judge or jury finding a former offender to be dangerous, and recommending indefinite detention. 
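
For readers who want to check the arithmetic behind these comparisons, here is a minimal sketch in Python. It uses only the figures quoted above; the function names are my own illustrative choices and are not part of any Static-99R material.

```python
def rate_percent(events: int, total: int) -> float:
    """Return events out of total, expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * events / total


def rate_ratio(high_percent: float, low_percent: float) -> float:
    """Return how many times larger one percentage rate is than another."""
    if low_percent <= 0:
        raise ValueError("low_percent must be positive")
    return high_percent / low_percent


# Five-year rates quoted in the text (percent).
HIGH_RISK_NORM = 21.0
FULL_SAMPLE_NORM = 6.0

# Men over 70 in the combined samples: 4 reoffenders out of 139.
over_70_rate = rate_percent(4, 139)

print(f"High-risk vs. full-sample ratio: {rate_ratio(HIGH_RISK_NORM, FULL_SAMPLE_NORM):.1f}x")
print(f"Observed rate for men over 70: {over_70_rate:.1f} percent")
```

The point of the sketch is simply that the choice of comparison group, not anything about the individual, drives the size of the reported estimate.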

The first problem with this method is that the basis for choosing a comparison group is very vague, inviting bias on the part of forensic evaluators. Even more fundamentally, there is not a shred of empirical evidence that choosing the high-risk norms improves decision-making accuracy in sexually violent predator (SVP) cases.

That should come as no surprise. Not one of the six samples that were cobbled together post-hoc to create the high-risk norms included anyone who was civilly committed -- or considered for commitment -- under modern-day SVP laws, which now exist in 20 U.S. states. (Four samples are Canadian, one is Danish, and the only American one is an exceptionally high-risk, archaic and idiosyncratic sample from an infamous psychiatric facility in Bridgewater, Massachusetts.) 

A typical psychological test has a published manual that gives instructions on proper use and clearly describes its norms. In contrast, the Static-99, despite its high-stakes deployment, has no published manual. Its users must rely on a website, periodic conferences and training sessions, and word-of-mouth information. 

High-risk norms based on guesswork, say forensic psychologists

Now, two forensic psychologists have joined a growing chorus of mainstream practitioners cautioning against the use of the high-risk norms, unless and until research proves that they improve evaluators' accuracy in forecasting risk of sexual re-offense.

"There is zero empirical research showing increased accuracy by switching to a non-representative group," note Gregory DeClue and Denis Zavodny in an article just published in the Open Access Journal of Forensic Psychology. "Unless and until such choices are found to increase the accuracy of risk assessments, forensic evaluators should use local norms (if available) or the FULLPOP* comparison group (considered roughly representative of all adjudicated sex offenders)."

The authors critiqued the growing practice of selecting the high-risk norms based on so-called "psychologically meaningful risk factors." The Static-99 developers’ recommendation for this clinical decision-making is based on mere guesswork or speculation that is contradicted by scientific evidence from at least five recent studies, they note.
"In theory, it is possible that a standardized procedure could be developed whereby evaluators would use a dynamic risk-assessment tool in addition to a static-factor tool such as the Static-99R. Next, it could be tested whether carefully trained evaluators in a controlled study, using that combination of tools, arrive at more accurate predictions…. A third step would be field studies to address the practical impact of using the combination procedure in actual cases. Even if well-trained evaluators could use the procedure effectively under controlled conditions, it would be important to explore whether allegiance or other social-psychological factors decrease the accuracy of risk assessments in forensic cases. At present, there is no research showing that incremental validity is added by using clinical judgment regarding ‘external psychologically meaningful risk factors’ to augment or facilitate a statistically based risk-assessment scheme."

Indeed, an empirical study last year of Static-99 risk predictions found that accuracy decreased when evaluators used clinical judgment to override actuarial scores.

"The ratings with overrides predicted recidivism in the wrong direction -- that is, clinical overrides of increased risk were actually associated with lower recidivism rates and vice versa," wrote Jennifer Storey, Kelly Watt, Karla Jackson and Stephen Hart in an article in Sexual Abuse: Journal of Research and Treatment.

DeClue and Zavodny question the Static-99 developers' decision to report only 5-year recidivism data, rather than also include 10-year recidivism rates, for the full sample, even though such information is readily available. This decision may influence some evaluators to go to the high-risk norms, for which 10-year data are reported, as the reference group for an offender.

The absolute best practice, they note, is to compare an offender with the actual recidivism rates in the local jurisdiction. To facilitate this, they provide a chart of contemporary recidivism rates from several U.S. states, including California, Washington, Texas, Florida, Connecticut, New Jersey, Minnesota and South Carolina. Recidivism rates varied from a low of less than 1 percent, among supervised offenders in Texas, all the way up to 25 percent for a group of offenders in Washington.

As I reported last month on the new research out of Florida, a growing body of research is establishing that detected recidivism is far lower than was originally reported by the Static-99 developers. I predict that the high-risk samples will eventually fall by the wayside, as have other scientifically unproven methods.

But even if this suspect procedure is discredited and abandoned by the actuarial gurus who originally introduced it, this will not provide automatic redress for those already detained under the debunked method.

There's got to be a saner way to protect the public from sexual predators.

* * * * *
The articles are:

“Forensic Use of the Static-99R: Part 3. Choosing a Comparison Group” (2013), Gregory DeClue and Denis Zavodny, Open Access Journal of Forensic Psychology, available online (HERE)

“Utilization and implications of the Static-99 in practice” (2012), Jennifer Storey, Kelly Watt, Karla Jackson and Stephen Hart, Sexual Abuse: Journal of Research and Treatment, available by request from Stephen Hart (HERE)

* * * * *

*NOTE: DeClue and Zavodny replaced the developer’s label of the full group as "routine" with the term FULLPOP, for full population, after hearing evaluators testify in court that they did not use the full norms because they did not consider the individual in question to be "a routine sex offender."

October 27, 2013

Black swan crash lands on Florida SVP program

Audit finds low recidivism, critiques reliance on inflated Static-99 risk estimates

Dan Montaldi’s words were prophetic.

Speaking to Salon magazine last year, the former director of Florida's civil commitment program for sex offenders called innovative rehabilitation programs "fragile flowers." The backlash from one bad deed that makes the news can bring an otherwise successful enterprise crashing down.

Montaldi was referring to a community reintegration program in Arizona that was derailed by the escape of a single prisoner in 2010.

But he could have been talking about Florida where, just a year after his Salon interview, the highly publicized rape and murder of an 8-year-old girl is sending shock waves through the treatment community. Cherish Perrywinkle was abducted from a Walmart, raped and murdered, allegedly by a registered sex offender who had twice been evaluated and found not to meet criteria for commitment as a sexually violent predator (SVP).

Montaldi resigned amidst a witch hunt climate generated by the killing and a simultaneous investigative series in the Sun Sentinel headlined "Sex Predators Unleashed." His sin was daring to mention the moral dilemma of locking up people because they might commit a crime in the future, when recidivism rates are very low. Republican lawmakers called his statements supportive of "monsters" and said it made their "skin crawl."

Montaldi's comments were contained in an email to colleagues in the Association for the Treatment of Sexual Abusers, in response to the alarmist newspaper series. He observed that, as a group, sex offenders were "statistically unlikely to reoffend." In other words, Cherish Perrywinkle’s murder was a statistical anomaly (also known as a black swan, or something that is so rare that it is impossible to predict or prevent). He went on to say that in a free society, the civil rights of even "society's most feared and despised members" are an important moral concern. A subscriber to the private listserv apparently leaked the email to the news media.

The Sun Sentinel series had also criticized the decline in the proportion of paroled offenders who were recommended for civil commitment under Montaldi's directorship. "Florida's referral rate is the lowest of 17 states with comparable sex-offender programs and at least three times lower than that of such large states as California, New York and Illinois," the newspaper reported.

Audit finds very low recidivism rates 

In the wake of the Sun Sentinel investigation, the Florida agency that oversees the Sexually Violent Predator Program has released a comprehensive review of the accuracy of the civil commitment selection process. Since Florida enacted its Sexually Violent Predator (SVP) law in 1999, more than 40,000 paroling sex offenders have been reviewed for possible commitment. A private corporation, GEO Care, LLC, runs the state’s 720-bed civil detention facility in Arcadia for the state's Department of Children and Families.

Three independent auditors -- well-known psychologists Chris Carr, Anita Schlank and Karen C. Parker -- reviewed data from both a 2011 state analysis and an internal recidivism study conducted by the SVP program. They also reviewed data on 31,626 referrals obtained by the Sun Sentinel newspaper for its Aug. 18 exposé.

All of the data converged upon an inescapable conclusion: Current assessment procedures are systematically overestimating the risk that a paroling offender will commit another sex offense.

In other words, Montaldi’s controversial email about recidivism rates was dead-on accurate.

First, the auditors examined recidivism data for a set of sex offenders who were determined to be extremely dangerous predators, but who were nonetheless released into a community diversion program instead of being detained.

"This study provided an opportunity to see if offenders who were recommended for commitment as sexually violent predators, actually behaved as expected when they were placed back into the community," they explained.

Of the 140 released offenders, only five were convicted of a new felony sex offense during a follow-up period of up to 10 years. Or, to put it another way, more than 96 percent did not reoffend. "This finding indicates that many individuals who were thought to be at high risk, were not," the report concluded.

Next, they analyzed internal data from the program itself. As of March 2013, 710 of the roughly 1,500 men referred for civil commitment were later released for one reason or another. Of those, only 5.7 percent went on to be convicted of a new sexually motivated crime.

Interestingly, this reconviction rate is not much different than that of a larger group of 1,200 sex offenders who were considered but rejected for civil commitment after a face-to-face evaluation. About 3 percent of those offenders incurred a new felony sex offense conviction after five to 10 years, with about 4 percent being reconvicted over a longer follow-up period of up to 14 years.

Logo on wall of sex offender hearing room in Salem, MA
"The recommended and the non-recommended groups differed by less than 2 percent in the percentage of offenders obtaining a new felony sex offense conviction after release," the investigators found. "Such a minor difference is surprising and indicates that the traditional approach to determining SVP status needs to be improved. There are too many false positives (someone determined to fit the SVP definition when he does not, or someone determined to be likely to re-offend but he is not)."

Overestimation of risk was especially prevalent for older offenders. Only one out of 94 offenders over the age of 60 was arrested on a new sex offense charge, and that charge was ultimately dismissed.

Finally, the auditors reanalyzed the data obtained by the Sun Sentinel newspaper via a public records request. Of this larger group of about 30,000 paroling offenders who were NOT recommended for civil commitment, less than 2 percent were convicted of a new sex offense.

What the public is most concerned about, naturally, is sex-related murders, such as that of young Cherish Perrywinkle. Fourteen of the tens of thousands of men not recommended for civil commitment had new convictions for sexual murders. This is a rate of about 0.047 percent, or less than five one-hundredths of 1 percent -- the very definition of a black swan.
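
The audit's headline percentages are easy to reproduce. The sketch below is only a check on the arithmetic, written in Python. It assumes a denominator of 30,000 for the group not recommended for commitment, because the report's figure is given here only as "about 30,000"; the helper name is hypothetical.

```python
def percent(events: int, total: int) -> float:
    """Return events out of total, expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * events / total


# Diversion group: 5 new felony sex offense convictions among 140 released men.
diversion_no_reoffense = 100.0 - percent(5, 140)

# Sexual murders: 14 among roughly 30,000 men not recommended for commitment.
# ASSUMPTION: 30,000 is the rounded figure quoted in the text, not an exact count.
murder_rate = percent(14, 30_000)

print(f"Diversion group with no new conviction: {diversion_no_reoffense:.1f} percent")
print(f"Sexual murder rate: {murder_rate:.3f} percent")
print(f"Roughly 1 in {round(30_000 / 14):,} men")
```

Note that the murder figure is a percentage, not a proportion: 14 in 30,000 is about 0.047 percent, which works out to roughly one man in every 2,143.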

Static-99R producing epidemic of false positives

Determining which offender will reoffend is extremely difficult when base rates of sex offender recidivism are so low. However, the auditors identified an actuarial risk assessment tool, the widely used Static-99R, as a key factor in Florida’s epidemic of over-prediction. Florida mandates use of this tool in the risk assessment process.
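
Why low base rates make prediction so hard can be shown with Bayes' rule. The following is a minimal sketch in Python, not a model of the Static-99R: the sensitivity and specificity values are hypothetical placeholders that I chose for illustration, and they do not come from the audit or from any validation study. The two base rates are the approximate 4 percent reconviction rate found in the audit and the 21 percent five-year rate attached to the "high risk" norms.

```python
def positive_predictive_value(base_rate: float, sensitivity: float, specificity: float) -> float:
    """Share of people flagged as likely reoffenders who actually reoffend (Bayes' rule)."""
    for name, value in (("base_rate", base_rate),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_positives = sensitivity * base_rate
    false_positives = (1.0 - specificity) * (1.0 - base_rate)
    flagged = true_positives + false_positives
    if flagged == 0.0:
        raise ValueError("no one is flagged under these inputs")
    return true_positives / flagged


# HYPOTHETICAL accuracy figures, chosen only to illustrate the base-rate effect.
ASSUMED_SENSITIVITY = 0.70
ASSUMED_SPECIFICITY = 0.70

for base_rate in (0.04, 0.21):
    ppv = positive_predictive_value(base_rate, ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY)
    print(f"Base rate {base_rate:.0%}: {ppv:.1%} of flagged offenders reoffend")
```

Under these assumed accuracy figures, fewer than one in ten flagged men would go on to reoffend when the true base rate is about 4 percent. The same instrument looks far more impressive if one simply assumes a 21 percent base rate, which is why the choice of norms matters so much.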

Florida Civil Commitment Center
In 2009, government evaluators in Florida and elsewhere in the United States began a controversial practice of comparing some offenders to a select set of norms called "high risk." This practice dramatically inflates risk estimates, thereby alarming jurors in adversarial legal proceedings. The decision rules for using this comparison group are unclear and have not been empirically tested.

The recidivism rate of the Static-99R "high risk" comparison sample is several times higher than the actual recidivism rate of even the highest-risk offenders, the auditors noted. Thus, consistent with research findings from other states, they found that use of these high-risk norms is a major factor in the exaggeration of sex offender risk in Florida.

(It is certainly gratifying to see mainstream leadership in the civil commitment industry coming around to what people like me have been pointing out for years now.)

"The precision once thought to be present in using the Static-99 has diminished," the report states. "It seems apparent that less weight needs to be given to the Static-99R in sexually violent predator evaluations."

What goes around comes around

Due to the identified problems with actuarial tools, and the Static-99R in particular, the independent auditors are recommending that more weight be placed on clinical judgment. 

"It now appears that clinical judgment, guided by the broad and ever-expanding base of empirical data, may be superior to simply quoting 'rates,' which may lack sufficient application to the offenders being evaluated."

Ironically, the subjectivity of clinical judgment was the very practice that the actuarial tools were designed to alleviate. I have my doubts that clinical judgment will end up being all that reliable in adversarial proceedings, either. Perhaps the safest practice would be to "bet the base rate," or estimate risk based on local base rates of reoffending for similar offenders. This, however, would result in far fewer civil commitments.

Consistent with recent research, the auditors also recommended re-examining the practice of mandating lengthy treatment that can lead to demoralization and, in some cases, iatrogenic (or harmful) effects.

Although the detailed report may be helpful to forensic evaluators and the courts, it looks like Florida legislators aiming to appease a rattled public will ignore the findings and move in the opposite direction. Several are now advocating for new black swan legislation to be known as "Cherish’s Law."

As sex offender researcher and professor Jill Levenson noted in a commentary on the website of WLRN in Florida, such an approach is penny-wise but pound-foolish: 

“Every dollar spent on hastily passed sex offender policies is a dollar not spent on sexual assault victim services, child protection, and social programs designed to aid at-risk families…. We need to start thinking about early prevention and fund, not cut, social service programs for children and families. Today's perpetrators are often yesterday's victims."

* * * * *

Photo credit: Mike Stocker, Sun Sentinel
BREAKING NEWS: Montaldi has just been replaced as director of the civil commitment facility by Kristin Kanner, a longtime prosecutor from Broward County, Florida who headed that county's Sexually Violent Predator Unit for almost a decade. Not only does she have a JD from the Florida College of Law, but she holds undergraduate degrees in psychology and public policy from Duke. Word on the street is that she is an extremely competent and ethical person. It will be interesting to see how she will be treated by the media and politicians in the event that any black swan crash lands on the facility during her watch.

 * * * * *

The full report on the Florida SVP program is available HERE.  

Related post: 

Systems failure or black swan? New frame needed to stop "Memorial Crime Control" frenzy (Oct. 19, 2010)