It would be humorous if the real-world consequences were not so grave.
Every year, at a jam-packed session of the annual conference of the Association for the Treatment of Sexual Abusers (ATSA), the developers of the Static-99 family of actuarial risk assessment tools roll out yet another new methodology to replace the old.
This year, they announced that they are scrapping two of three sets of "non-routine" comparison norms that they introduced at an ATSA conference just four years ago. Stay tuned, they told their rapt audience, for further instructions on how to choose between the two remaining sets of norms.
To many, this might sound dry and technical. But in the courtroom trenches, sexually violent predator cases often hinge on an evaluator's choice of a comparison group. Should the offender be compared with the full population of convicted sex offenders? Or with a subset labeled "high risk/needs" that offended at more than 3.5 times the rate of the more representative group (21 percent versus 6 percent after five years)?
To illustrate, whereas only about 3 percent (4 out of 139) of the men over 70 in the combined Static-99R samples reoffended, invoking the high-risk norms would cause a septuagenarian's risk to skyrocket by 400 percent. It's not hard to see why such an inflated estimate might increase the odds of a judge or jury finding a former offender to be dangerous, and recommending indefinite detention.
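To make the arithmetic concrete, here is a minimal sketch of how the choice of comparison group drives the reported figure. The five-year rates for the full and high-risk groups are the ones quoted above; the elderly high-risk figure is a hypothetical placeholder, included only to show how a roughly fourfold inflation can arise for a low-base-rate subgroup.

# Minimal sketch of how the choice of comparison group changes the
# reported risk estimate. The 6% and 21% five-year rates are the figures
# quoted above; ELDERLY_HIGH_RISK is a HYPOTHETICAL placeholder used only
# to illustrate the inflation, not a published norm.

FULLPOP_5YR = 0.06          # ~6% five-year recidivism, full population
HIGH_RISK_5YR = 0.21        # ~21% for the "high risk/needs" subset

ELDERLY_OBSERVED = 4 / 139  # ~3%: men over 70 in the combined Static-99R samples
ELDERLY_HIGH_RISK = 0.12    # hypothetical illustrative value

def rate_ratio(selected: float, baseline: float) -> float:
    """How many times larger the selected group's rate is than the baseline."""
    return selected / baseline

print(f"High-risk vs. full population: {rate_ratio(HIGH_RISK_5YR, FULLPOP_5YR):.1f}x")
print(f"Elderly estimate, high-risk norms vs. observed: "
      f"{rate_ratio(ELDERLY_HIGH_RISK, ELDERLY_OBSERVED):.1f}x")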
The first problem with this method is that the basis for choosing a comparison group is vague, inviting bias on the part of forensic evaluators. More fundamentally, there is not a shred of empirical evidence that choosing the high-risk norms improves decision-making accuracy in sexually violent predator (SVP) cases.
That should come as no surprise. Not one of the six samples that were cobbled together post hoc to create the high-risk norms included anyone who was civilly committed -- or considered for commitment -- under modern-day SVP laws, which now exist in 20 U.S. states. (Four samples are Canadian, one is Danish, and the only American one is an exceptionally high-risk, archaic and idiosyncratic sample from an infamous psychiatric facility in Bridgewater, Massachusetts.)
A typical psychological test has a published manual that gives instructions on proper use and clearly describes its norms. In contrast, the Static-99, despite its high-stakes deployment, has no published manual. Its users must rely on a website, periodic conferences and training sessions, and word-of-mouth information.
High-risk norms based on guesswork, say forensic psychologists
Now, two forensic psychologists have joined a growing chorus of mainstream practitioners cautioning against the use of the high-risk norms, unless and until research proves that they improve evaluators' accuracy in forecasting risk of sexual re-offense.
"There is zero empirical research showing increased accuracy by switching to a non- representative group," note Gregory DeClue and Denis Zavodny in an article just published in the Open Access Journal of Forensic Psychology. "Unless and until such choices are found to increase the accuracy of risk assessments, forensic evaluators should use local norms (if available) or the FULLPOP* comparison group (considered roughly representative of all adjudicated sex offenders)."
The authors critiqued the growing practice of selecting the high-risk norms based on so-called "psychologically meaningful risk factors." The Static-99 developers’ recommendation for this clinical decision-making is based on mere guesswork or speculation that is contradicted by scientific evidence from at least five recent studies, they note.
"In theory, it is possible that a standardized procedure could be developed whereby evaluators would use a dynamic risk-assessment tool in addition to a static-factor tool such as the Static-99R. Next, it could be tested whether carefully trained evaluators in a controlled study, using that combination of tools, arrive at more accurate predictions…. A third step would be field studies to address the practical impact of using the combination procedure in actual cases. Even if well-trained evaluators could use the procedure effectively under controlled conditions, it would be important to explore whether allegiance or other social-psychological factors decrease the accuracy of risk assessments in forensic cases. At present, there is no research showing that incremental validity is added by using clinical judgment regarding ‘external psychologically meaningful risk factors’ to augment or facilitate a statistically based risk- assessment scheme."
Indeed, an empirical study of Static-99 risk predictions published last year found that accuracy decreased when evaluators used clinical judgment to override actuarial scores.
"The ratings with overrides predicted recidivism in the wrong direction -- that is, clinical overrides of increased risk were actually associated with lower recidivism rates and vice versa,” wrote Jennifer Storey, Kelly Watt, Karla Jackson and Stephen Hart in an article in Sexual Abuse: Journal of Research and Treatment.
DeClue and Zavodny also question the Static-99 developers' decision to report only 5-year recidivism data for the full sample, omitting the 10-year rates even though they are readily available. That omission may nudge some evaluators toward the high-risk norms, for which 10-year data are reported, as the reference group for an offender.
The absolute best practice, they note, is to compare an offender with the actual recidivism rates in the local jurisdiction. To facilitate this, they provide a chart of contemporary recidivism rates from several U.S. states, including California, Washington, Texas, Florida, Connecticut, New Jersey, Minnesota and South Carolina. The rates range from a low of less than 1 percent among supervised offenders in Texas to a high of 25 percent for a group of offenders in Washington.
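A minimal sketch of the local-norms approach they recommend, assuming an evaluator has a table of locally observed rates at hand: look up the rate for the offender's own jurisdiction and supervision status, and fall back to the FULLPOP group only when no local figure exists. The Texas and Washington values below are the endpoints quoted above; everything else is a placeholder to be filled in from DeClue and Zavodny's chart or from local follow-up data.

# Sketch of anchoring a risk statement to local recidivism data. Only the
# Texas and Washington endpoints come from the figures quoted above; the
# structure and other entries are placeholders, not published norms.
FULLPOP_5YR = 0.06  # roughly representative five-year rate (quoted above)

local_rates = {
    ("Texas", "supervised"): 0.009,       # "less than 1 percent" (quoted above)
    ("Washington", "unspecified"): 0.25,  # 25 percent (quoted above)
    # other jurisdictions: fill in from the published chart or local data
}

def risk_anchor(state: str, group: str) -> float:
    """Return the locally observed rate if available, else the FULLPOP rate."""
    return local_rates.get((state, group), FULLPOP_5YR)

print(f"Anchor for a supervised Texas offender: {risk_anchor('Texas', 'supervised'):.1%}")
print(f"Anchor with no local data on file:      {risk_anchor('Ohio', 'supervised'):.1%}")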
As I reported last month on the new research out of Florida, a growing body of research is establishing that detected recidivism is far lower than the Static-99 developers originally reported. I predict that the high-risk samples will eventually fall by the wayside, as have other scientifically unproven methods.
But even if the actuarial gurus who introduced this suspect procedure eventually discredit and abandon it, that will not provide automatic redress for those already detained under the debunked method.
There's got to be a saner way to protect the public from sexual predators.
* * * * *
The articles are:

“Forensic Use of the Static-99R: Part 3. Choosing a Comparison Group” (2013), Gregory DeClue and Denis Zavodny, Open Access Journal of Forensic Psychology, available online (HERE)
“Utilization and implications of the Static-99 in practice” (2012), Jennifer Storey, Kelly Watt, Karla Jackson and Stephen Hart, Sexual Abuse: Journal of Research and Treatment, available by request from Stephen Hart (HERE)
* * * * *
*NOTE: DeClue and Zavodny replaced the developers’ label for the full group, "routine," with the term FULLPOP, for full population, after hearing evaluators testify in court that they did not use the full norms because they did not consider the individual in question to be "a routine sex offender."