Showing posts sorted by date for query forensic psychology and civil commitment sex offenders.

October 4, 2024

Junk-science paraphilias remain popular despite official rejection, study finds

Sometimes, you can’t win for losing.

Just over a decade ago, opponents of junk science in court won a hard-fought battle when they succeeded in keeping two unreliable sexual-deviance diagnoses from debuting in the fifth edition of the American Psychiatric Association’s Diagnostic and Statistical Manual of Mental Disorders (DSM).

Now, a new study finds that the rejection did nothing to stop the introduction of these diagnoses in court. Rather, they are being snuck into forensic reports and testimony through the back door, via two vague catchall labels inserted into the DSM in 2013. And although proponents had argued at the time that these residual labels would reduce confusion and improve diagnostic reliability, the study suggests that the opposite has occurred.

Long-time readers of this blog may recall the brouhaha over the two novel conditions of “hebephilia” and “nonconsent.” Both were considered but rejected for the sexual disorders (“paraphilias”) section of the 2013 DSM. Their rejection owed to their lack of proven reliability or scientific validity. Neither condition has a standard definition, which is a basic precursor to accurate scientific measurement. Hebephilia generally references a sexual attraction to youths in the pubertal stage of development, while nonconsent refers to attraction to sexual coercion.

A single niche  


The single niche where the two labels are in widespread use is a forensic one: Sexually violent predator (SVP) litigation. That’s because the indefinite civil confinement of serial sex offenders has been ruled unconstitutional except in cases where an offender poses a substantial future danger to the public due to a formal mental disorder. The lobby to create the new disorders of nonconsent and hebephilia was led by forensic psychologists working in the SVP trenches, along with psychologists at a Canadian clinic with outsized influence over the paraphilias section of the 2013 DSM. The American Psychiatric Association’s refusal to label rapists as mentally ill has encouraged some evaluators to “bend the language of the DSM” to make it work.

The current researchers found that “nonconsent” and “hebephilia” are the two most common bases for invoking an idiosyncratic catchall label of “Other Specified Paraphilic Disorder” (OSPD). Their findings are consistent with a recent review of U.S. legal cases that found that large proportions of civilly committed sex offenders – including about half in California and 43% in Washington – are diagnosed with "OSPD-nonconsent."

The study, published in the journal Sexual Abuse, is the first to systematically analyze the prevalence and patterns of use of OSPD and another vaguely defined label, “Unspecified Paraphilic Disorder” (UPD), in sexually violent predator litigation. It analyzed SVP evaluations in Florida over a four-year period. Because the researchers aimed to calculate the reliability of the disputed labels, only cases in which a convicted sex offender was evaluated by two different psychologists were included. In all, 190 separate cases involving 380 forensic reports were analyzed.

At least one paraphilia was diagnosed in four out of five cases reviewed. Pedophilia was the most invoked, followed by the catchall categories of OSPD and UPD.

OSPD’s reliability – or the agreement between two psychologists evaluating the same man – was abysmal. In cases where one evaluator assigned a diagnosis of OSPD, there was a less-than-chance likelihood that a second evaluator would agree. The kappa reliability statistic was a very poor 0.21, not far above the value of zero that indicates chance-level agreement. Kappas below 0.4 are generally considered to fall short of the minimum reliability threshold in the forensic arena.
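For readers unfamiliar with the kappa statistic, here is a minimal sketch of how Cohen's kappa is computed for two evaluators making a yes/no diagnostic call. The function name and the counts are hypothetical, chosen only for illustration; they are not the study's data. The point is that raw percentage agreement can look respectable while kappa, which subtracts the agreement expected by chance alone, remains poor.

```python
def cohens_kappa(both_yes, only_a, only_b, both_no):
    """Cohen's kappa for two raters and a binary (yes/no) diagnosis.

    both_yes: cases where both evaluators assigned the diagnosis
    only_a:   cases where only evaluator A assigned it
    only_b:   cases where only evaluator B assigned it
    both_no:  cases where neither assigned it
    """
    n = both_yes + only_a + only_b + both_no
    if n == 0:
        raise ValueError("no cases supplied")

    # Observed agreement: the share of cases where the two calls match.
    observed = (both_yes + both_no) / n

    # Agreement expected by chance, from each rater's own base rate.
    a_yes = (both_yes + only_a) / n
    b_yes = (both_yes + only_b) / n
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)

    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")

    # Kappa: agreement beyond chance, as a fraction of the maximum possible.
    return (observed - expected) / (1 - expected)


# Hypothetical counts for 100 dually evaluated cases (illustration only).
kappa = cohens_kappa(both_yes=10, only_a=15, only_b=15, both_no=60)
print(round(kappa, 2))
```

In this made-up example the two evaluators agree in 70 of 100 cases, yet when evaluator A assigns the diagnosis (25 cases), evaluator B concurs in only 10 of them. Kappa lands at about 0.2, in the same poor range the study reported for OSPD.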

Evaluator disagreement was even more profound with Unspecified Paraphilic Disorder, with two psychologists agreeing about its presence only 30% of the time. That comes as no surprise. That label, as critics have long pointed out, is inherently unreliable, in that it is designed to be used in circumstances in which there is not enough information to make a specific diagnosis, or a clinician “chooses not to specify the reason” why it is being assigned, according to the manual’s instructions.

One of forensic psychology’s dirty little secrets is that the assignment of controversial labels often hinges as much on evaluator whims as on the facts of the case. For example, research has found that some evaluators routinely assign higher scores than others on measures of psychopathy, an especially prejudicial label. The current research showed this same problematic pattern with diagnoses of OSPD. Two of the 21 psychologists under study proffered that catchall diagnosis in most of their cases, whereas 38% of the clinicians assigned it in fewer than one out of four cases; one evaluator never used it at all. This suggests that case outcomes are being influenced not only by offender characteristics but by which psychologist happens to be assigned to the case.

Similar evaluator variability was evident when the researchers zoomed in on OSPD diagnoses in which either hebephilia or nonconsent was proffered as its basis. Three evaluators used the term “hebephilia” in half of their OSPD diagnoses, while nine evaluators never used hebephilia-related terminology at all. And evaluators agreed on the hebephilia label in only about one out of four instances. Regarding nonconsent, 13 evaluators invoked it in at least half of their evaluations, whereas five evaluators never used that specifier.

The study’s authors theorized that the widely ranging rates of use of the OSPD and UPD labels likely reflect hesitancy by some psychologists to proffer diagnoses “with vague diagnostic criteria and debatable level of empirical support.”

What all this suggests is that whether an offender is said to have a mental disorder pertaining to an attraction to pubescent minors and/or rape hinges in large part on the luck of the draw as to whether they are assigned to Dr. Jones versus Dr. Smith.

The large variance among evaluators is especially remarkable in that “adversarial allegiance” was not in play. This forensic bias becomes an issue when evaluators’ opinions are influenced by whether they were retained by the prosecution or the defense. Here, all of the evaluators were members of the same ostensibly neutral panel of contracted psychologists. If adversarial allegiance had come into play, the divergences in diagnoses likely would have been even more profound.

Highlighting the higgledy-piggledy nature of any ad-hoc diagnosis, the researchers found that the so-called “specifiers” – or specific rationales – attached to OSPD diagnoses were highly idiosyncratic. Examples included descriptions of behaviors that are illegal but not necessarily evidence of mental disorder, such as “OSPD-Non-Consensual Sexual Activity with Adolescent,” “OSPD-Attraction to Adolescent Females” and an even more bizarre “OSPD-Sexting.”

Custom-tailored labels


“[O]ne may be particularly concerned that several of the labels appear custom to the facts of the specific case rather than resting on any empirically derived diagnosis,” the study’s authors noted.

I witnessed this first-hand last month, when a psychologist testified in federal court that a sex offender the government was aiming to civilly commit had a novel combination of sexual interests that cumulatively rose to the level of a unique mental disorder called “OSPD-deviant sexual interests in hebephilic, sadistic, exhibitionistic and voyeuristic behavior.”

Fortunately, the federal judge at this particular trial was skeptical. Pointing out that “OSPD-hebephilia” was rejected from the DSM and remains controversial in the psychological community, he wrote in his opinion that he was “troubled by the combination of multiple insufficient specifiers, which does not appear to have been contemplated by the DSM-5-TR.”

No matter how nonconsent or hebephilia were defined in the specific psychological reports, the interrater agreement – or concordance between evaluators – remained poor across the board, and far below recommended reliability for diagnoses in routine clinical practice, much less the forensic arena in which precision is especially critical.

"Bad science"


“Relying upon diagnoses with poor empirical support can perpetuate the use of bad science in the courtroom,” the authors concluded. “While it is certainly true that there are high-risk individuals who are likely to sexually recidivate upon their release from prison, providing makeshift diagnoses to satisfy civil commitment criteria significantly questions the ethical practice of psychological decision making.”

A survey of legal cases found a smattering of successful challenges to these controversial diagnoses. These Daubert and Frye evidentiary challenges focused on definitional problems, an absence of substantial research support, and a lack of general acceptance. In State of New York v. Jason C., for example, the court wrote:

“This Court cannot help but ask, if this disorder exists, why isn't there convincing evidence that it exists outside the realm of civil commitment? If this disorder is a matter of the human condition, then shouldn't this paraphilia be seen outside of SVP proceedings?”

The diagnosis was similarly excluded in a Missouri case, In Re: Stanley Williams, on the basis of a high error rate, a dearth of peer-reviewed publications, poor validity, and lack of general acceptance. The judge in that case wrote:

“Using diagnostic language which has been rejected from inclusion in the DSM does not indicate general acceptance by the relevant community, but rather an unwillingness to accept the given methods and language in question.”


The study, "Other Specified Paraphilic Disorder: Patterns of Use in Sexually Violent Predator Evaluations," is authored by Nicole Graham, Cynthia Calkins and Elizabeth Jeglic of the John Jay College of Criminal Justice in New York.

Related reading:


Behavioral Sciences and the Law published an overview of the evidentiary shortcomings of the nonconsent diagnosis, “The admissibility of other specified paraphilic disorder (non-consent) in sexually violent predator,” in 2020. The peer-reviewed article by forensic psychiatrist Brian Holoyda gives a blueprint of how a Daubert evidentiary admissibility challenge to OSPD-nonconsent might be raised due to the purported construct's weak interrater reliability, limited research support and lack of established diagnostic criteria. The same analysis easily applies to hebephilia.

Interested readers can find more background on the history of the term “hebephilia” in a 2010 article by this blogger, "Hebephilia: Quintessence of Diagnostic Pretextuality," also published in Behavioral Sciences and the Law.

August 14, 2016

Hebephilia flunks Frye test

In a strongly worded rejection of hebephilia, a New York judge has ruled that the controversial diagnosis cannot be used in legal proceedings because of “overwhelming opposition” to its validity among the psychiatric community.

Judge Daniel Conviser heard testimony from six experts (including this blogger) and reviewed more than 100 scholarly articles before issuing a long-awaited opinion this week in the case of “Ralph P.,” a 72-year-old man convicted in 2001 of a sex offense against a 14-year-old boy. The state of New York is seeking to civilly detain Ralph P. on the basis of alleged future dangerousness.

State psychologist Joel Lord had initially labeled Ralph P. with the unique diagnosis of sexual attraction to “sexually inexperienced young teenage males,” but later changed his diagnosis to hebephilia, a condition proposed but rejected for the current edition of the American Psychiatric Association’s Diagnostic and Statistical Manual of Mental Disorders (DSM-5).

Under the Frye evidentiary standard, designed to bar novel scientific methods that are not sufficiently validated, a construct must be “generally accepted” by the relevant scientific community before it can be relied upon in legal proceedings.

Judge Conviser found that hebephilia (generally defined as sexual attraction to children in the early stages of puberty, or around the ages of 11 or 12 to 14) is being promoted by a tiny fringe of researchers and in practice is used almost exclusively as a tool to civilly commit convicted sex offenders. Under U.S. Supreme Court rulings, such offenders must have a mental disorder in order to qualify for prolonged detention after they have served their prison terms.

“It is not an accident, as Dr. Franklin outlined, that hebephilia became a prominent diagnosis only with the advent of SVP laws,” the judge wrote in his 75-page opinion. “It is also not a coincidence that each of the three expert witnesses who testified for the State at the instant hearing either work or formerly worked for state [Sexually Violent Predator] programs.”

Conviser’s ruling analyzed both the practical problems in reliably identifying hebephilia and the political controversies swirling around it: Without any standardized criteria, “clinicians are free to assign hebephilia diagnoses in widely disparate ways, many of which are just plainly wrong.” Using age as a proxy for pubertal stage is no guarantee of reliability because pubertal onset is highly variable. Ultimately, he concluded, whether erotic interest in pubescent minors is deemed "pathological" is more about moral values than science.

APA secrecy faulted


The judge was harshly critical of the American Psychiatric Association for its refusal to publicly explain why it rejected hebephilia from the DSM-5 in 2013. The diagnosis was aggressively promoted by a Canadian psychologist, Ray Blanchard, and fellow researchers from Canada’s Centre for Addiction and Mental Health (CAMH), who dominated the DSM-5 subcommittee on paraphilias.

Blanchard rewrote the DSM section on paraphilias (sexual deviances) in a broad way such that virtually all sexual interests other than a narrowly defined “normophilic” pattern became pathological. However, the APA rejected Blanchard’s proposal to expand pedophilia to pathologize adult sexual attractions to pubescent-aged (rather than just prepubescent) minors.

“The proposal was apparently rejected because it was greeted with a firestorm of criticism by the sex offender psychiatric community, which was communicated to the APA board…. As best as this Court can surmise, the APA rejected the pedohebephilia proposal because it was opposed by most of the psychiatrists and psychologists who worked in the field.”

“[S]trikingly,” wrote Judge Conviser, “the process through which proposed new diagnoses are approved or rejected is shrouded in a degree of secrecy which would be the envy of many totalitarian regimes…. With respect to hebephilia, the APA board’s actions will have a direct impact on both public safety and the fundamental liberty interests of hundreds or thousands of people.”

The APA forces those involved in the DSM revision process to sign nondisclosure contracts. That policy came in the wake of a series of published exposés – including Christopher Lane’s Shyness: How Normal Behavior Became a Sickness, Jonathan Metzl's The Protest Psychosis, and Ethan Watters’s Crazy Like Us (to name just a few of my favorites) – that embarrassed the world’s largest psychiatric organization by shining a light inside the often subjective and political process of diagnosis creation and expansion.

“Overwhelming” opposition


Blanchard and his CAMH colleagues’ 2009 proposal to expand pedophilia into a new “pedohebephilia” diagnosis in the DSM-5 spawned a massive outcry, which mushroomed into at least five dozen published critiques.

In preparation for my testimony at this and similar Frye hearings in New York, I expanded on my 2010 article in Behavioral Sciences and the Law tracing hebephilia’s rise from obscurity, to produce an updated chart containing all 116 articles addressing the construct. If one tallies only those articles that take a position (pro or con) on hebephilia and are not written by members of the CAMH team, fully 83% are critical as compared to only 17% that are favorable. This, Judge Conviser noted, is strong evidence against the government’s position that hebephilia is “generally accepted” by the relevant scientific communities.

“The thrust of the evidence at the hearing was … clear: there was overwhelming opposition to the pedohebephilia proposal in the sex offender psychiatric community,” he wrote. “There is overwhelming opposition to the hebephilia diagnosis today.”

Courts scrutinizing nouveau diagnoses


With the APA’s rejection of hebephilia as well as two other proposed sexual disorders (one for preferential rape and another for hypersexuality), government evaluators continue to shoehorn novel, case-specific diagnostic labels into the catchall DSM-5 category of “other specified paraphilic disorder” (OSPD) as a basis for civil commitment.

Under a 2012 New York appellate court ruling in the case of State v. Shannon S., upon a defense request, a Frye evidentiary hearing must be held on any such attempt to introduce an OSPD diagnosis into a Sexually Violent Predator (SVP) case. That has triggered a spate of Frye hearings in the Empire State, affording greater scrutiny and judicial gatekeeping of scientifically questionable diagnoses.

Ironically, although the Shannon S. court upheld hebephilia by a narrow 4-3 margin, Shannon S. would not have met diagnostic criteria under the narrower definitions presented by the government experts at Ralph P.’s Frye hearing four years later, because his victims were older than 14.

“Assuming hebephilia is a legitimate diagnosis, Shannon S., like many SVP respondents, was apparently diagnosed with the condition not based on evidence he was preferentially attracted to underdeveloped pubescent body types but because he offended against underage victims,” Judge Conviser observed in his detailed summary of prior New York cases.

The three dissenting judges in Shannon S. were adamant that hebephilia was “absurd,” and an example of “junk science,” deployed with the pretextual goal of “locking up dangerous criminals” who had committed statutory rapes.

The opening of the Frye floodgates has led to a flurry of sometimes-competing opinions.

In 2015, in State v. Mercado, Judge Dineen Riviezzo ruled against “OSPD--sexually attracted to teenage females” as a legitimate diagnosis. However, she declined to rule on the general acceptance of hebephilia because it was not specifically diagnosed in that case.

A year later, relying on similar evidence, a judge in upstate New York ruled in State v. Paul V. that hebephilia was generally accepted, in large part because it was backed by the APA’s paraphilias sub-workgroup. Judge Conviser found that reasoning unpersuasive, pointing out that the sub-workgroup was dominated by the very same CAMH researchers who were hebephilia’s primary advocates; it was therefore “not a valid proxy” for the scientific community.

In July, another court rejected both hebephilia and “OSPD--underage males” as valid diagnoses, in the cases of Hugh H. and Martello A. The court noted that hebephilia is inconsistently defined, was rejected for the DSM-5, and is primarily advanced by one research group; further, attraction to pubescent minors is not intrinsically abnormal.

Cynthia Calkins, a professor at John Jay College of Criminal Justice in New York, echoed those points in her testimony at Ralph P.'s hearing. She noted that in the United States, the main psychologists advocating for hebephilia are government-retained evaluators in SVP cases, who make up only perhaps one-fourth of one percent of psychologists and psychiatrists in the U.S. and so cannot be a proxy for “general acceptance” in the scientific community.

The government’s choice of experts illustrated Calkins’ point: Testifying for the government were Christopher Kunkle, director of New York’s civil management program for sex offenders, David Thornton of Wisconsin’s civil commitment center, and Robin Wilson, formerly of Florida’s civil commitment center and a protégé of Ray Blanchard’s.

The third expert called by Ralph P.’s attorneys was Charles Ewing, a distinguished professor at the University at Buffalo Law School who is both an attorney and a forensic psychologist and has authored several books on forensic psychology.

Defense attorneys Maura Klugman and Jessica Botticelli of Mental Hygiene Legal Service represented Ralph P. Assistant New York Attorney General Elaine Yacyshyn represented the state.

Ultimately, New York State’s highest court may have to weigh in to resolve once and for all the question of whether novel psychiatric diagnoses like hebephilia are admissible for civil commitment purposes. But that could be years down the road.

----------

The ruling in State v. Ralph P. is HERE. The subsequent order of Sept. 28, 2016 granting Ralph P.'s motion for summary judgment and dismissal of the civil commitment petition is HERE.

A New York Law Journal report on the case, "Judge Rejects Diagnosis for Civil Confinement," is HERE.

A search of this blog site using the term hebephilia will produce my reports on this construct dating all the way back to my original post from 2007, "Invasion of the Hebephile Hunters."

April 19, 2015

Static-99: A bumpy developmental path

By Brian Abbott, PhD and Karen Franklin, PhD* 

The Static-99 is the most widely used instrument for assessing sex offenders’ future risk to the public. Indeed, some state governments and other agencies even mandate its use. But bureaucratic faith may be misplaced. Conventional psychological tests go through a standard process of development, beginning with the generation and refinement of items and proceeding through set stages that include pilot testing and replication, leading finally to peer review and formal publication. The trajectory of the Static-99 has been more haphazard: Since its debut 15 years ago, the tool has been in a near-constant state of flux. Myriad changes in items, instructions, norms and real-world patterns of use have cast a shadow over its scientific validity. Here, we chart the unorthodox developmental course of this tremendously popular tool.
 
 
Static-99 and 99R Developmental Timeline
Date
Event
1990
The first Sexually Violent Predator (SVP) law passes in the United States, in Washington. A wave of similar laws begins to sweep the nation.
1997
The US Supreme Court upholds the constitutionality of preventive detention of sex offenders.
1997
R. Karl Hanson, a psychologist working for the Canadian prison system, releases a four-item tool to assess sex offender risk. The Rapid Risk Assessment for Sex Offence Recidivism (RRASOR) uses data from six settings in Canada and one in California.[1]
1998
Psychologists David Thornton and Don Grubin of the UK prison system release a similar instrument, the Structured Anchored Clinical Judgment (SACJ-Min) scale.[2]
1999
Hanson and Thornton combine the RRASOR and SACJ-Min to produce the Static-99, which is accompanied by a three-page list of coding rules.[3] The instrument's original validity data derive from four groups of sex offenders, including three from Canada and one from the UK (and none from the United States). The new instrument is atheoretical, with scores interpreted based on the recidivism patterns among these 1,208 offenders, most of them released from prison in the 1970s.
2000
Hanson and Thornton publish a peer-reviewed article on the new instrument.[4]
2003
New coding rules are released for the Static-99, in an 84-page, unpublished booklet that is not peer reviewed.[5] The complex and sometimes counterintuitive rules may lead to problems with scoring consistency, although research generally shows the instrument can be scored reliably.
2003
The developers release a new instrument, the Static-2002, intended to "address some of the weaknesses of Static-99."[6] The new instrument is designed to be more logical and easier to score; one item from the Static-99 – pertaining to whether the subject had lived with a lover for at least two years – was dropped due to issues with its reliability and validity. Despite its advantages, Static-2002 never caught on, and did not achieve the popularity of the Static-99 in forensic settings. 
2007
Leslie Helmus, a graduate student working with Karl Hanson, reports that contemporary samples of sex offenders have much lower offense rates than did the antiquated, non-US samples upon which the Static-99 was originally developed, both in terms of base rates of offending and rates of recidivism after release from custody.[7]
September 2008
Helmus releases a revised actuarial table for Static-99, to which evaluators may compare the total scores of their subjects to corresponding estimates of risk.[8] Another Static-99 developer, Amy Phenix, releases the first of several "Evaluators’ Handbooks."[9]
October 2008
At an annual convention of the Association for the Treatment of Sexual Abusers (ATSA), Andrew Harris, a Canadian colleague of Hanson's, releases a new version of the Static-99 with three separate "reference groups" (Complete, CSC and High Risk) to which subjects can be compared. Evaluators are instructed to report a range of risks for recidivism, with the lower bound coming from a set of Canadian prison cases (the so-called CSC, or Correctional Service of Canada group), and the upper bound derived from a so-called "high-risk" group of offenders. The risk of the third, or "Complete," group was hypothesized as falling somewhere between those of the other two groups.[10]
November 2008
At a workshop sponsored by a civil commitment center in Minnesota, Thornton and a government evaluator named Dennis Doren propose yet another new method of selecting among the new reference groups. In a procedure called "cohort matching," they suggest comparing an offender with either the CSC or High-Risk reference group based on how well the subject matches a list of external characteristics they had created but never empirically tested or validated.[11]
December 2008
Phenix and California psychologist Dale Arnold put forth yet another idea for improving the accuracy of the Static-99: After reporting the range of risk based on a combination of the CSC and High-Risk reference groups, evaluators are encouraged to consider a set of external factors, such as whether the offender had dropped out of treatment and the offender's score on Robert Hare's controversial Psychopathy Checklist-Revised (PCL-R). This new method does not seem to catch on.[12] [13]
2009
An official Static-99 website, www.static99.org, debuts.[14]
Winter 2009
The Static-99 developers admit that norms they developed in 2000 are not being replicated: The same score on the Static-99 equates with wide variations in recidivism rates depending on the sample to which it is compared. They theorize that the problem is due to large reductions in Canadian and U.S. recidivism rates since the 1970s-1980s. They call for the development of new norms.[15]
September 2009
Hanson and colleagues roll out a new version of the Static-99, the Static-99R.[16] The new instrument addresses a major criticism by more precisely considering an offender's age at release, an essential factor in reoffense risk. The old Static-99 norms are deemed obsolete. They are replaced by data from 23 samples collected by Helmus for her unpublished Master's thesis. The samples vary widely in regard to risk. For estimating risk, the developers now recommend use of the cohort matching procedure to select among four new reference group options. They also introduce the concepts of percentile ranks and relative risk ratios, along with a new Evaluators’ Workbook for Static-99R and Static-2002R. Instructions for selecting reference groups other than routine corrections are confusing and speculative. Research is lacking to demonstrate that selecting a reference group other than routine corrections produces more accurate risk estimates.[17]
November 2009
Just two months after its introduction, the Evaluators’ Workbook for Static-99R and Static-2002R is withdrawn due to errors in its actuarial tables.[18] The replacement workbook provides the same confusing and speculative method for selecting a nonroutine reference group, a method that lacks scientific validation and reliability.
2010
An international team of researchers presents large-scale data from the United States, New Zealand and Australia indicating that the Static-99 would be more accurate if it took better account of an offender's age.[19] The Static-99 developers do not immediately embrace these researchers' suggestions.
January 2012
Amy Phenix and colleagues introduce a revised Evaluators’ Workbook for Static-99R and Static-2002R.[20] The new manual makes a number of revisions both to the underlying data (including percentile rank and relative risk ratio data) and to the recommended procedure for selecting a reference group. Now, in an increasingly complex procedure, offenders are to be compared to one of three reference groups, based on how many external risk factors they had. The groups included Routine Corrections (low risk), Preselected Treatment Need (moderate risk), and Preselected High Risk Need (high risk). Subsequent research shows that using density of external risk factors to select among the three reference group options is not valid and has no proven reliability.[21] A fourth reference group, Nonroutine Corrections, may be selected using a separate cohort-matching procedure. New research indicates that evaluators who are retained most often by the prosecution are more likely than others to select the high-risk reference group,[22] which has base rates much higher than in contemporary sexual recidivism studies and will thus produce exaggerated risk estimates.[23]
July 2012
Six months later, the percentile ranks and relative risk ratios are once again modified, with the issuance of the third edition of the Static-99R and Static-2002R Evaluators’ Handbook.[24] No additional data is provided to justify that the selection of nonroutine reference groups produces more accurate risk estimates than choosing the routine corrections reference group.
October 2012
In an article published in Criminal Justice & Behavior, the developers concede that risk estimates for the 23 offender samples undergirding the Static-99 vary widely. Further, absolute risk levels for typical sex offenders are far lower than previously reported, with the typical sex offender having about a 7% chance of committing a new sex offense within five years. They theorize that the Static-99 might be inflating risk of reoffense due to the fact that the offenders in its underlying samples tended to be higher risk than average.[25]
2012
The repeated refusal of the Static-99 developers to share their underlying data with other researchers, so that its accuracy can be verified, leads to a court order excluding use of the instrument in a Wisconsin case.[26]
October 2013
At an annual ATSA convention, Hanson and Phenix report that an entirely new reference group selection system will be released in a peer-reviewed article in Spring 2014.[27] The new system will include only two reference groups: Routine Corrections and Preselected High Risk High Need.  An atypical sample of offenders from a state hospital in Bridgewater, Massachusetts dating back to 1958 is to be removed altogether, along with some other samples, while some new data sets are to be added.
October 2014
At the annual ATSA convention, the developers once again announce that the anticipated rollout of the new system has been pushed back pending acceptance of the manuscript for publication. Helmus nonetheless presents an overview.[28] She reports that the new system will abandon two out of the current four reference groups, retaining only Routine Corrections and Preselected High Risk Need. Evaluators should now use the Routine Corrections norms as the default unless local norms (with a minimum of 100 recidivists) are available. Evaluators will be permitted to choose the Preselected High Risk Need norms based on “strong, case-specific justification.” Neither specific guidance nor empirical evidence to support such a procedure is proffered. A number of other new options for reporting risk information are also presented, including the idea of combining Static-99 data with that from newly developed, so-called "dynamic risk instruments."
January 2015
At an ATSA convention presentation followed by an article in the journal Sexual Abuse,[29] the developers announce further changes in their data sets and how Static-99R scores should be interpreted. Only two of the original four "reference groups" are still standing. Of these, the Routine group has grown by 80% (to 4,325 subjects), while the High-Risk group has shrunk by 35%, to a paltry 860 individuals. Absent from the article is any actuarial table on the High-Risk group, meaning the controversial practice by some government evaluators of inflating risk estimates by comparing sex offenders' Static-99R scores with the High-Risk group data has still not passed any formal peer review process. The developers also correct a previous statistical method as recommended by Ted Donaldson and colleagues back in 2012,[30] the effect of which is to further lower risk estimates in the high-risk group. Only sex offenders in the Routine group with Static-99R scores of 10 are now statistically more likely than not to reoffend. It is unknown how many sex offenders were civilly committed in part due to reliance on the now-obsolete data.

References


[1] Hanson, R. K. (1997). The development of a brief actuarial risk scale for sexual offense recidivism. (Unpublished report 97-04). Ottawa: Department of the Solicitor General of Canada.
[2] Grubin, D. (1998). Sex offending against children: Understanding the risk. Unpublished report, Police Research Series Paper 99. London: Home Office.
[3] Hanson, R.K. & Thornton, D. (1999). Static 99: Improving Actuarial Risk Assessments for Sex Offenders. Unpublished paper.
[4] Hanson, R. K., & Thornton, D. (2000). Improving risk assessments for sex offenders: A comparison of three actuarial scales. Law and Human Behavior, 24(1), 119-136.
[5] Harris, A. J. R., Phenix, A., Hanson, R. K., & Thornton, D. (2003). Static-99 coding rules: Revised 2003. Ottawa, ON: Solicitor General Canada.
[6] Hanson, R.K., Helmus, L., & Thornton, D. (2010). Predicting recidivism amongst sexual offenders: A multi-site study of Static-2002. Law & Human Behavior, 34, 198-211.
[7] Helmus, L. (2007). A multi-site comparison of the validity and utility of the Static-99 and Static-2002 for risk assessment with sexual offenders. Unpublished Honour’s thesis, Carleton University, Ottawa, ON, Canada.
[8] Helmus, L. (2008, September). Static-99 Recidivism Percentages by Risk Level. Last Updated September 25, 2008. Unpublished paper.
[9] Phenix, A., Helmus, L., & Hanson, R.K. (2008, September). Evaluators’ Workbook. Unpublished, September 28, 2008.
[10] Harris, A. J. R., Hanson, K., & Helmus, L. (2008). Are new norms needed for Static-99? Workshop presented at the ATSA 27th Annual Research and Treatment Conference on October 23, 2008, Atlanta: GA. Available at www.static99.org.
[11] Doren, D., & Thornton, D. (2008). New Norms for Static-99: A Briefing. A workshop sponsored by Sand Ridge Secure Treatment Center on November 10, 2008. Madison, WI.
[12] Phenix, A. & Arnold, D. (2008, December). Proposed Considerations for Conducting Sex Offender Risk Assessment Draft 12-14-08. Unpublished paper.
[13] Abbott, B. (2009). Applicability of the new Static-99 experience tables in sexually violent predator risk assessments. Sexual Offender Treatment, 1, 1-24.
[14] Helmus, L., Hanson, R. K., & Thornton, D. (2009). Reporting Static-99 in light of new research on recidivism norms. The Forum, 21(1), Winter 2009, 38-45.
[15] Ibid.
[16] Hanson, R. K., Phenix, A., & Helmus, L. (2009, September). Static-99(R) and Static-2002(R): How to Interpret and Report in Light of Recent Research. Paper presented at the 28th Annual Research and Treatment Conference of the Association for the Treatment of Sexual Abusers, Dallas, TX, September 28, 2009.
[17] DeClue, G. & Zavodny, D. (2014). Forensic use of the Static-99R: Part 4. Risk Communication. Journal of Threat Assessment and Management, 1(3), 145-161.
[18] Phenix, A., Helmus, L., & Hanson, R.K. (2009, November). Evaluators’ Workbook. Unpublished, November 3, 2009.
[19] Wollert, R., Cramer, E., Waggoner, J., Skelton, A., & Vess, J. (2010). Recent Research (N = 9,305) Underscores the Importance of Using Age-Stratified Actuarial Tables in Sex Offender Risk Assessments. Sexual Abuse: A Journal of Research and Treatment, 22 (4), 471-490. See also: "Age tables improve sex offender risk estimates," In the News blog, Dec. 1, 2010.
[20] Phenix, A., Helmus, L., & Hanson, R.K. (2012, January). Evaluators’ Workbook. Unpublished, January 9, 2012.
[21] Abbott, B.R. (2013). The Utility of Assessing “External Risk Factors” When Selecting Static-99R Reference Groups. Open Access Journal of Forensic Psychology, 5, 89-118.
[22] Chevalier, C., Boccaccini, M. T., Murrie, D. C. & Varela, J. G. (2014). Static-99R Reporting Practices in Sexually Violent Predator Cases: Does Norm Selection Reflect Adversarial Allegiance? Law & Human Behavior.
[23] Abbott (2013) op. cit.
[24] Phenix, A., Helmus, L., & Hanson, R.K. (2012, July). Evaluators’ Workbook. Unpublished, July 26, 2012.
[25] Helmus, Hanson, Thornton, Babchishin, & Harris (2012), Absolute recidivism rates predicted by Static-99R and Static-2002R sex offender risk assessment tools vary across samples: A meta-analysis, Criminal Justice & Behavior. See also: "Static-99R risk estimates wildly unstable, developers admit," In the News blog, Oct. 18, 2012.
[27] Hanson, R.K. & Phenix, A. (2013, October). Report writing for the Static-99R and Static-2002R. Preconference seminar presented at the 32nd Annual Research and Treatment Conference of the Association for the Treatment of Sexual Abusers, Chicago, IL, October 30, 2013. See also: "Static-99 'norms du jour' get yet another makeover," In the News blog, Nov. 17, 2013.
[28] Helmus, L.M. (2014, October). Absolute recidivism estimates for Static-99R and Static-2002R: Current research and recommendations. Paper presented at the 33rd Annual Research and Treatment Conference of the Association for the Treatment of Sexual Abusers, San Diego, CA, October 30, 2014.
[29] Hanson, R. K., Thornton, D., Helmus, L-M, & Babchishin, K. (2015). What sexual recidivism rates are associated with Static-99R and Static-2002R scores? Sexual Abuse: A Journal of Research and Treatment, 1-35.
[30] Donaldson, T., Abbott, B., & Michie, C. (2012). Problems with the Static-99R prediction estimates and confidence intervals. Open Access Journal of Forensic Psychology, 4, 1-23.

* * * * *

*Many thanks to Marcus Boccaccini, Gregory DeClue, Daniel Murrie and other knowledgeable colleagues for their valuable feedback.  


* * * * *

Related blog posts:
· Static-99 "norms du jour" get yet another makeover (Nov. 17, 2013)
· Age tables improve sex offender risk estimates (Dec. 1, 2010)
· New study: Do popular actuarials work? (April 20, 2010)
· Delusional campaign for a world without risk (April 3, 2010)