February 4, 2014

Research review II: Sexual predator controversies

Following up on last week’s research review, here are some new articles from the ever-controversial practice niche of sexually violent predator cases:

Facts? Who cares about the facts?!

Once a jury is empaneled to decide whether someone with a prior sex offense conviction is so dangerous to the public that he should be civilly detained, the verdict is a foregone conclusion. Dangerousness is presumed based on the prior conviction, rather than having to be proven.


Researchers Nicholas Scurich and Daniel Krauss confirmed this by giving jury-eligible citizens varying degrees of information in a Sexually Violent Predator (SVP) case and asking them to vote. Some mock jurors were told only that the person had a prior conviction for a sex offense. Others were also given information that the person had a mental abnormality that made him likely to engage in future acts of sexual aggression.


It mattered not a whit. The mock jurors voted to civilly commit at the same rate, whether or not they had heard evidence of current dangerousness.


“The mere fact that a respondent had been referred for an SVP proceeding was sufficient for a majority of participants to authorize commitment,” the researchers found. “These findings raise concerns about whether the constitutionally required due process occurs in SVP commitment proceedings.”


No surprise, really. In this practice niche more than others, fear and hype often overshadow reason. Sex offenders are not the most appealing human beings, and no one wants to shoulder the responsibility of voting to release someone who could go out and rape or molest again.


The study is:

The presumption of dangerousness in sexually violent predator commitment proceedings, Nicholas Scurich and Daniel A. Krauss, Law, Probability and Risk. A copy may be requested from the first author (HERE).





Sexual disorder diagnoses not reliable



Meanwhile, even when jurors do hear evidence of mental abnormality, it is not especially accurate.


Examining the diagnoses given to 375 sex offenders referred for civil commitment in New Jersey, researchers found “questionable” diagnostic reliability to be a widespread problem across the range of clinicians.


Pedophilia was the only diagnosis in which two evaluators were likely to agree at a level above chance. The rates of agreement were far worse for other disorders that are typically rendered in SVP cases, including “Paraphilia Not Otherwise Specified,” Sexual Sadism, Antisocial Personality Disorder and Exhibitionism. In fact, among the six cases in which Exhibitionism was diagnosed, there was not a single case in which both clinicians agreed.


The study, by Anthony Perillo of John Jay College and colleagues, adds to a burgeoning body of literature (some of which I’ve previously reported on) suggesting that psychiatric diagnoses in SVP evaluations are often dubious and not to be trusted.


The article is:

Examining the scope of questionable diagnostic reliability in Sexually Violent Predator (SVP) evaluations, Anthony D. Perillo, Ashley H. Spada, Cynthia Calkins and Elizabeth L. Jeglic, International Journal of Law and Psychiatry. A copy may be requested from the first author (HERE).





Race bias in actuarial risk prediction



Okay, so the diagnoses aren’t reliable. But we’ve still got another tool of science up our sleeves -- actuarial risk assessment.


Not so fast.


As I’ve previously reported, the predictive accuracy of actuarial risk assessment tools is pretty wimpy. And now, researchers from Sam Houston State University are finding that the most widely used actuarial tool, the Static-99, doesn’t work at all with Latino offenders.


The findings are based on research with a large sample of about 2,000 sex offenders, almost 600 of whom were Latino.


“Findings have implications for fairness in testing and highlight the need for continued research regarding the potentially moderating role of offender race/ethnicity in risk research,” note researchers Jorge Varela and colleagues.


The study is:

Do the Static-99 and Static-99R Perform Similarly for White, Black, and Latino Sexual Offenders? Jorge G. Varela, Marcus T. Boccaccini, Daniel C. Murrie, Jennifer D. Caperton and Ernie Gonzalez Jr. International Journal of Forensic Mental Health. To request a copy from the first author, click HERE.





How to lie with statistics: “The Area Under the Curve”



Listen to any defender of actuarial risk prediction for a few minutes, and you will likely hear “Receiver Operating Characteristics” and “The Area Under the Curve” touted as indicators of statistical accuracy.


But in a new study in the Journal of Threat Assessment and Management, two European scholars contend that such arguments are “fundamentally misleading.” Using the Risk Matrix 2000 instrument -- widely deployed in the United Kingdom -- as an exemplar, they found that a prediction of reoffense for an offender who scored in the “Very High Risk” range will be wrong an astounding 93 percent of the time.


“The numbers necessary to detain in order to prevent one instance of recidivism are large,” write David Cooke and Christine Michie. “On further reflection, from a statistical rather than a psychological perspective, should we be surprised? It has long been recognized that low-frequency events are hard to predict.”


The authors argue that the weak performance of actuarials is being systematically camouflaged by “statistical rituals” that are confusing and non-transparent, raising fundamental questions of fairness in legal decision-making.
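To make the base-rate point concrete, here is a minimal sketch of the underlying arithmetic. It is not Cooke and Michie's analysis: the sensitivity and specificity values (0.75 each) and the base rates are hypothetical, chosen only to show how the share of correct "will reoffend" predictions collapses as the predicted event becomes rarer, and how the number needed to detain grows in turn.

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """Share of 'will reoffend' predictions that turn out to be correct (Bayes' rule)."""
    true_positives = sensitivity * base_rate
    false_positives = (1.0 - specificity) * (1.0 - base_rate)
    return true_positives / (true_positives + false_positives)


def number_needed_to_detain(ppv):
    """People flagged as high risk who must be detained to prevent one reoffense."""
    return 1.0 / ppv


# Hypothetical instrument: these figures are assumptions for illustration,
# not values reported for the Risk Matrix 2000 or any other tool.
SENSITIVITY = 0.75
SPECIFICITY = 0.75

for base_rate in (0.50, 0.20, 0.05):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, base_rate)
    print(
        f"base rate {base_rate:.0%}: "
        f"prediction wrong {1.0 - ppv:.0%} of the time, "
        f"number needed to detain = {number_needed_to_detain(ppv):.1f}"
    )
```

The instrument's discrimination is held constant across the three rows; only the base rate changes. That is why a summary statistic that ignores the base rate can look respectable while most individual predictions of reoffense are wrong.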


The article is:

The Generalizability of the Risk Matrix 2000: On Model Shrinkage and the Misinterpretation of the Area Under the Curve. David Cooke and Christine Michie. Journal of Threat Assessment and Management. To request a copy from the first author, click HERE.





Counterpoint



Not everyone agrees with Cooke and Michie’s analysis. One detractor is Douglas Mossman, of the Department of Psychiatry at the University of Cincinnati College of Medicine. Using a fictional scenario, he attempts to illustrate how “group data have an obvious application to individual decisions.” His paper goes on to argue that “misinterpretations of mathematical concepts and misunderstanding of the aims of risk assessment have led to mistakes about the applicability of group data to individual instances.”


The paper is:

From Group Data to Useful Probabilities: The Relevance of Actuarial Risk Assessment in Individual Instances. (Unpublished.) Douglas Mossman. Paper available online (HERE).





Who is minding the store?



If nothing else, the above research snippets demonstrate the high level of controversy and complexity in the implementation of Sexually Violent Predator laws. If psychologists -- who must master psychometrics and statistics in order to earn our PhDs -- have a hard time with these concepts, imagine how difficult it is for attorneys. With people’s lives at stake, do they have the knowledge base necessary to avoid being hoodwinked, and to educate jurors and judges?


In a new paper, prolific legal scholars Heather Cucolo and Michael L. Perlin of New York Law School argue that more stringent standards for representation are necessary for effective assistance of counsel in SVP cases.


They propose that counsel should be required to “demonstrate a familiarity with the psychometric tests regularly employed at such hearings, and with relevant expert witnesses who could assist in the representation of the client.” Furthermore, they argue for a pool of court-appointed experts who could be appointed at no cost, similar to those provided in insanity cases.


“There is no question that the population in question is the most despised group of individuals in the nation. Society’s general revulsion towards this population is shared by judges, jurors and lawyers. Although the bar pays lip service to the bromide that counsel is available for all, no matter how unpopular the cause, the reality is that there are few volunteers for the job of representing these individuals, and that the public's enmity has a chilling effect on the vigor of representation in this area.”


The paper is:

'Far from the Turbulent Space': Considering the Adequacy of Counsel in the Representation of Individuals Accused of Being Sexually Violent Predators. Heather Cucolo and Michael L. Perlin. It is available online HERE.


January 30, 2014

Research roundup

The articles are flooding in at an alarming rate, threatening to bury me under yet another avalanche. Before I am completely submerged, let me share brief synopses of a few of the more informative ones that I have gotten around to reading.


Assessor bias in high-stakes testing: The case of children’s IQ


I’ve blogged quite a bit about bias in forensic assessment, reporting on problems with such widely used tests as the Psychopathy Checklist and the Static-99R. As I’ve reported, some of the bias can be chalked up to adversarial allegiance, or which side the evaluator is working for, whereas some may be due to personality differences among evaluators. Now, researchers are extending this research into other realms -- with alarming findings.


In a study of intelligence testing among several thousand children at 448 schools, the researchers found significant and nontrivial variations in test scoring that had nothing to do with children’s actual intelligence differences. The findings, reported in the journal Psychological Assessment, are especially curious because scoring of the test in question, the Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV), seems relatively straightforward and objective (at least as compared to inherently subjective tests like the Psychopathy Checklist, for example).


The article is:

  • Whose IQ Is It? Assessor Bias Variance in High-Stakes Psychological Assessment. McDermott, Paul A.; Watkins, Marley W.; Rhoad, Anna M. Psychological Assessment, Published online on Nov 4, 2013. To request a copy from the first author, click HERE.





Beware pseudo-precision in expert opinions


I’ve never forgotten a video I saw a long time ago, in which the filmmakers drove up to random strangers and asked for directions to a nearby landmark. Some of the Good Samaritans gave enthusiastic instructions that were completely wrong, while other people gave correct directions but in a more tentative fashion. The trouble is, the more confident someone appears, the more we judge them as knowing what they are talking about.


One way we gauge a presenter’s confidence, in turn, is by their level of precision. In a new study, researchers found that participants were more likely to rely on advice given by people who provided more precise information. For example, they were more likely to trust someone who said that the Mississippi River was 3,992 miles long, rather than 4,000 miles long.


What this means in the forensic realm is that we should not make falsely precise claims when our evidence base is weak. For example, we should not claim to know that someone has a 44 percent chance of violent reoffense within three years. Such misleading claims-making lends an aura of confidence and expertise that is not warranted.
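One way to see why a bare "44 percent" overstates what is known is to attach an uncertainty interval to it. The sketch below is an illustration, not something taken from the study discussed above: it assumes, hypothetically, that the 44 percent figure came from 22 reoffenders in a reference sample of 50, and computes a 95 percent Wilson score interval for that proportion.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives about 95%)."""
    p_hat = successes / n
    denominator = 1.0 + z ** 2 / n
    center = (p_hat + z ** 2 / (2.0 * n)) / denominator
    half_width = (
        z * math.sqrt(p_hat * (1.0 - p_hat) / n + z ** 2 / (4.0 * n ** 2)) / denominator
    )
    return center - half_width, center + half_width


# Hypothetical reference sample: 22 of 50 people reoffended (44 percent).
low, high = wilson_interval(successes=22, n=50)
print(f"point estimate: {22 / 50:.0%}")
print(f"95% interval:   {low:.0%} to {high:.0%}")
```

With a sample that small, the interval runs from roughly 31 to 58 percent, which is a far more honest statement than a single two-digit number. The interval also captures only sampling error, not the further question of whether the reference sample resembles the person being evaluated.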


The article is:




Ethics and the DSM-5


Speaking of avalanches, the volume of critical response to the DSM-5 is lessening now that the tome has been on the bookshelves for eight months. Trying to keep my finger on the pulse because of my training activities on the manual’s forensic implications, I found an interesting summary of the ethical dilemmas of the latest trends in psychiatric diagnosis.


The author, Jennifer Blumenthal-Barby, is an ethics professor at Baylor College of Medicine’s Center for Medical Ethics and Health Policy. In her critique, published in the Journal of Medical Ethics, she focuses on consequence-based concerns about the dramatic expansion of psychiatric diagnoses in the latest edition of the American Psychiatric Association’s influential manual. Concerns include:


  • False positives, or over-diagnosis, in clinical (and I would add forensic) practice
  • Risks associated with pharmacological treatments of new conditions
  • Neglect of larger structural issues and reduction of individual responsibility through medicalization
  • Discrediting of psychiatry through the trivialization of mental disorders
  • Efforts to eradicate conditions that are valuable or even desirable


Although her discussion is fairly general, she does mention a few of the proposed diagnostic changes of forensic relevance that I’ve blogged about. These include the proposed hypersexual disorder and a proposal to eliminate the age qualifier (of 18 and above) for antisocial personality disorder, to make it consistent with all of the other personality disorders.


It’s a good, brief overview suitable for assignment to students and professionals alike.


The article is: 
  • Psychiatry’s new manual (DSM-5): ethical and conceptual dimensions. Journal of Medical Ethics. Published online on 10 Dec. 2013. To request a copy, click HERE.




Dual relationships: Are they all bad?


We’ve all seen the memo: Dual relationships are to be avoided.


But is that always true?


Not according to ethics instructor Ofer Zur.


Multiple relationships are situations in which a mental health professional has a professional role with a client and another role with a person closely related to the client. In a new overview, Zur asserts that, not only are some multiple relationships ethical, they may be unavoidable, desirable, or even -- in some cases -- mandated.


In delineating the ethics and legality of 26 different types of multiple relationships, Zur stresses that in forensic settings, most multiple relationships should be avoided.


The article, Not All Multiple Relationships Are Created Equal: Mapping the Maze of 26 Types of Multiple Relationships, is another good teaching tool, and is freely available online at Zur’s continuing education website.

By the way, if you are in California and are looking for more ethics training, Zur and two of my former colleagues from the state psychological association’s Ethics Committee -- Michael Donner, PhD and Pamela Harmell, PhD -- are co-presenting at an interactive ethics session at the upcoming California Psychological Association convention. The convention runs April 9-13 in Monterey, and the ethics conversation -- “Ethics are not Rules: Psych in the Real World” -- is on Saturday, April 12.

January 26, 2014

Psycholegal evaluations in Immigration Court: Free online training series

Feb. 5 UPDATE: The first webinar in this series was a huge success. To register for any or all of the remaining three webinars, click HERE.

Torture victims from El Salvador. Gay people from Uganda. Immigrants with elderly dependents who are U.S. citizens.

In our increasingly multicultural society, more and more people find themselves in U.S. Immigration Court. And, often, psychological evaluations play a role in deciding their fates. Unfortunately, most immigrants applying for political asylum or hardship waivers have very little money, creating an acute need for psychologists willing and able to provide low-fee evaluations.

Working to fulfill this need is my hard-working colleague Anatasia Kim, a professor at the Wright Institute in Berkeley and chair of the Immigration Task Force of the California Psychological Association. Dr. Kim is spearheading a drive to train a cadre of psychologists to conduct these evaluations. In exchange for conducting low-fee or pro bono evaluations, psychologists and students will get free mentorship by expert forensic psychologists and attorneys in the field.  

As part of the campaign, the Immigration Task Force is hosting a four-part Webinar series in February aimed at teaching the basic competencies. Immigration attorneys and psychologists will train virtual attendees on the nuts and bolts of conducting psycholegal evaluations in immigration courts.

Best of all, the series is entirely FREE. You can even earn continuing education credits (one unit per session).

The four workshops, each running from noon to 1:00 p.m. (Pacific Standard Time) on a Tuesday, are:

Feb. 4: Basics of Conducting a Psychological Evaluation for Immigration Court. Nancy Baker, Ph.D., ABPP, Diplomate in Forensic Psychology, Director of Forensic Concentration at Fielding Graduate University

Feb. 11: Legal Relevance of Psychologists’ Opinions in the Immigration Context. Robin Goldfaden, Esq., Senior Attorney, Immigrant Justice, Lawyers’ Committee for Civil Rights of the San Francisco Bay Area, and Lisa Fryman, Esq., Associate Director/Managing Attorney, Center for Gender and Refugee Studies at U.C. Hastings College of Law

Feb. 18: Recommended Immigration Evaluation Process for Hardship Cases. Margaret Lee, Ph.D., Clinical Psychologist and Adjunct Professor, Alliant International University, San Diego and Former Clinical Director at Survivors of Torture International

Feb. 25: Writing Psychological Assessment Reports for Immigration Court. James Livingston, Ph.D., Senior Staff Psychologist, Center for the Survivors of Torture in San Jose.


You can register for the first training HERE.

If you have any questions, email Dr. Kim HERE.  

January 23, 2014

California conference to highlight juvenile treatment

Michael Caldwell, co-founder of the Mendota Juvenile Treatment Center in Wisconsin, will share his Center’s innovative approach to treating hard-core juvenile offenders at this year’s Forensic Mental Health Association of California (FMHAC) conference.

Caldwell, whose research on juvenile risk assessment has been highlighted on this blog, says the Mendota approach has been proven to reduce violent offending among the most intractable juvenile delinquents, who absorb a disproportionate share of rehabilitation resources and account for a large proportion of violent crimes.

His two workshops are part of a special juvenile track that will also feature a session on introducing the practice of mindfulness to incarcerated juveniles.

The juvenile track is one of five special tracks at this year’s FMHAC conference, coming up March 19 in beautiful Monterey, California. The other tracks are clinical/assessment, legal, psychiatric and, of course, the omnipresent sex offender track.

More details and registration information can be found HERE. The FMHAC's website is HERE.

January 20, 2014

Orange is the New Black -- Read the book!

[Image: Taylor Schilling plays Piper Kerman in the TV series]

Hollywood prison scenes are so revolting. Most revolting are the depictions of women’s prisons. They superimpose onto female prisoners the worst stereotype of male prisoners as hulking, sexually aggressive brutes. And, even more so than for male prisons, the public has little direct information to counter this distorted image.

Blasting apart this image is Piper Kerman’s outstanding memoir. Detailing her year in a minimum-security federal camp, Orange is the New Black is a first-rate effort to educate the public about the realities of women’s prison.

[Image: Promo for blockbuster Netflix spinoff]

Kerman tiptoed into prison with the trepidation one might expect of a white, college-educated woman thrown into the lion’s den. But instead of prisoner-on-prisoner predation, she found a sense of community, where women survived by forging family-like relationships among their “tribes.” The greatest dangers in prison came not at the hands of other women, Kerman found, but from the agents of bureaucracy who wielded the threat of the SHU* (Security Housing Unit) or loss of good-time credits for any petty misstep.

I found myself grateful that, once in a blue moon, a middle-class person with a social conscience is sent to prison. Kerman’s bad luck is the public’s fortune. With the overwhelming mass of prisoners voiceless, who else can speak the truth and be heard? Kerman is the everywoman; through recognizing ourselves in her, we feel the prisoner’s plight as our own.

Her sense of not belonging among the underclass was shared by correctional officers and prisoners alike, who more than once asked the blond-haired, blue-eyed Smith College graduate: “What’s someone like you doing in a place like this?!”

[Image: Laverne Cox as trans prisoner Sophia Burset]

Don’t think that if you’ve seen the blockbuster TV spinoff, you know the story. While colorful, the series is by comparison shallow and exploitive. Netflix does a public service by counteracting Hollywood’s crude stereotypes, portraying incarcerated women as diverse human beings, but the semi-fictional show’s biggest accomplishment may be to steer intelligent viewers toward the book. (As an aside, it has also given greater visibility to the issues of transgender women of color, with trans actress Laverne Cox outstanding in the role of a transwoman prisoner.)

For a real-life visual representation of the lot of the woman prisoner, I recommend the documentary Crime After Crime. The story of battered woman Debbie Peagler’s struggle for justice is far more heart-wrenching than Kerman’s memoir, but both dramatize how a soulless bureaucratic machine chews up and spits out human potential.

[Image: The real-life Piper Kerman]

Kerman is a fluid story-teller, and her saga is intrinsically gripping. But, as writer Mary Karr points out in a recent interview, the "through-line" of an effective memoir is the character’s transformation. Seeing herself through the eyes of other women in the bleak prison milieu, Kerman realizes virtues in herself that she never knew. And she confronts for the first time her own complicity in her comrades' oppression, through her former role as an international heroin smuggler.

The sincerity of Kerman’s transformation is evident in her life since leaving prison nine years ago. She serves on the board of the non-profit prison reform group Women’s Prison Association and does public education on the plight of women prisoners -- especially the two-thirds who are mothers -- through influential media outlets such as National Public Radio. As she writes in a recent op-ed in the New York Times:  
"Harshly punitive drug laws and diminishing community mental health resources have landed many women in prison who simply do not belong there, often for shockingly long sentences. What is priceless about JusticeHome, however, is that it is working not only to rehabilitate women but to keep families together -- which we know is an effective way to reduce crime and to stop a cycle that can condemn entire families to the penal system."

* * * * *

*I listened to the audiobook version. The reader was quite good. Her only false steps came in reading the word "SHU": She read it aloud as "S-H-U," instead of the way it is actually pronounced in prison ("shoe"). The SHU is too ubiquitous to merit three syllables at every utterance.


(c) Copyright Karen Franklin 2014 - All rights reserved

January 12, 2014

Putting the Cart Before the Horse: The Forensic Application of the SRA-FV

As the developers of actuarial instruments such as the Static-99R acknowledge that their original norms inflated the risk of re-offense for sex offenders, a brand-new method is cropping up to preserve those inflated risk estimates in sexually violent predator civil commitment trials. The method introduces a new instrument, the “SRA-FV,” in order to bootstrap special “high-risk” norms on the Static-99R. Curious about the scientific support for this novel approach, I asked forensic psychologist and statistics expert Brian Abbott to weigh in.

Guest post by Brian Abbott, PhD*

NEWS FLASH: Results from the first peer-reviewed study about the Structured Risk Assessment: Forensic Version (“SRA-FV”), published in Sexual Abuse: Journal of Research and Treatment (“SAJRT”), demonstrate the instrument is not all that it’s cracked up to be.

[Image: Promotional material for an SRA-FV training]

For the past three years, the SRA-FV developer has promoted the instrument for clinical and forensic use despite the absence of peer-reviewed, published research supporting its validity, reliability, and generalizability. Accordingly, some clinicians who have attended SRA-FV trainings around the country routinely apply the SRA-FV in sexually violent predator risk assessments and testify about its results in court as if the instrument has been proven to measure what it intends to assess, has known error rates, retains validity when applied to other groups of sexual offenders, and produces trustworthy results.

Illustrating this rush to acceptance most starkly, within just three months of its informal release (February 2011) and with an absence of any peer-reviewed research, the state of California incredibly decided to adopt the SRA-FV as its statewide mandated dynamic risk measure for assessing sexual offenders in the criminal justice system. This decision was rescinded in September 2013, with the SRA-FV replaced with a similar instrument, the Stable-2007.

The SRA-FV consists of 10 items that purportedly measure “long-term vulnerabilities” associated with sexual recidivism risk. The items are distributed among three risk domains and are assessed using either standardized rating criteria devised by the developer or by scoring certain items on the Psychopathy Checklist-Revised (PCL-R). Scores on the SRA-FV range from zero to six. Some examples of the items from the instrument include: sexual interest in children, lack of emotionally intimate relationships with adults, callousness, and internal grievance thinking. Patients from the Massachusetts Treatment Center in Bridgewater, Massachusetts who were evaluated as sexually dangerous persons between 1959 and 1984 served as members of the SRA-FV construction group (unknown number) and validation sample (N = 418). The instrument was released for use by Dr. David Thornton, a co-developer of the Static-99R, Static-2002R, and SRA-FV and research director at the SVP treatment program in Wisconsin, in December 2010 during training held in Atascadero, California. Since then, Dr. Thornton has held similar trainings around the nation where he asserts that the SRA-FV is valid for predicting sexual recidivism risk, achieves incremental validity over the Static-99R, and can be used to choose among Static-99R reference groups.

A primary focus of the trainings is a novel system in which the total score on the SRA-FV is used to select one Static-99R “reference group” among three available options. The developer describes the statistical modeling underlying this procedure, which he claims increases predictive validity and power over using the Static-99R alone. However, reliability data is not offered to support this claim. In the December 2010 training, several colleagues and I asked for the inter-rater agreement rate but Dr. Thornton refused to provide it.

I was astounded but not surprised when some government evaluators in California started to apply the SRA-FV in sexually violent predator risk assessments within 30 days after the December 2010 training. This trend blossomed in other jurisdictions with sexually violent predator civil confinement laws. Typically, government evaluators applied the SRA-FV to select Static-99R reference groups, invariably choosing to compare offenders to the “High Risk High Needs” sample, the group with the highest re-offense rates. A minority of clinicians stated in reports and court testimony that the SRA-FV increased predictive accuracy over the Static-99R alone but they were unable to quantify this effect. The same clinicians have argued that the pending publication of the Thornton and Knight study was sufficient to justify its use in civil confinement risk assessments for sexually violent predators. They appeared to imply that the mere fact that a construction and validation study had been accepted for publication was an imprimatur that the instrument was reliable and valid for its intended purposes. Now that the research has been peer-reviewed and published, the results reflect that these government evaluators apparently put the proverbial cart before the horse.

David Thornton and Raymond Knight penned an article that documents the construction and validation of the SRA-FV. The publication is a step in the right direction, but by no means do the results justify widespread application of the SRA-FV in sexual offender risk assessment in general or sexually violent predator proceedings in particular. Rather, the results of the study only apply to the group upon which the research was conducted and do not generalize to other groups of sexual offenders. Before discussing the limitations of the research, I would like to point out some encouraging results.

The SRA-FV did, as its developer claimed, account for more sources of sexual recidivism risk than the Static-99R alone. However, it remains unknown which of the SRA-FV’s ten items contribute to risk prediction. The study also found that the combination of the Static-99R and SRA-FV increased predictive power. This improved predictive accuracy, however, must be replicated to determine whether the combination of the two instruments will perform similarly in other groups of sexual offenders. This is especially important when considering that the SRA-FV was constructed and validated on individuals from the Bridgewater sample from Massachusetts who are not representative of contemporary groups of sexual offenders. Thornton and Knight concede this point when discussing how the management of sexual offenders through all levels of the criminal justice system in Massachusetts between 1959 and 1984 was remarkably lenient compared to contemporary times. Such historical artifacts likely compromise any reliable generalization from patients at Bridgewater to present-day sexual offenders.

[Image: Training materials presented four months before the State of California rescinded use of the SRA-FV]

Probably the most crucial finding from the study is the SRA-FV’s poor inter-rater reliability. The authors categorize the 64 percent rate of agreement as “fair.” It is well known that inter-rater agreement in research studies is typically higher than in real-world applications. This has been addressed previously in this blog in regard to the PCL-R. A field reliability study of the SRA-FV among 19 government psychologists rating 69 sexually violent predators in Wisconsin (Sachsenmaier, Thornton, & Olson, 2011) found an inter-rater agreement rate of only 55 percent for the SRA-FV total score, which is considered poor reliability. These data illustrate that 36 percent to 45 percent of an SRA-FV score constitutes error, raising serious concerns over the trustworthiness of the instrument. To their credit, Thornton and Knight acknowledge this as an issue and note that steps should be taken to increase reliable scoring. Nonetheless, the current inter-rater reliability falls far short of the 80 percent floor recommended for forensic practice (Heilbrun, 1992). Unless steps are taken to dramatically improve reliability, the claims that the SRA-FV increases predictive accuracy either alone or in combination with the Static-99R, and that it should be used to select Static-99R reference groups, are moot.
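The arithmetic behind the 36 to 45 percent figure is simple enough to spell out. The short sketch below follows the reasoning in the paragraph above, treating whatever share of a score is not accounted for by rater agreement as error, and compares each reported figure with the 80 percent floor cited from Heilbrun (1992). The function and variable names are mine, introduced only for illustration.

```python
FORENSIC_FLOOR = 0.80  # minimum reliability recommended for forensic practice (Heilbrun, 1992)


def error_share(agreement):
    """Share of a score attributed to error, following the post's own arithmetic."""
    return 1.0 - agreement


def meets_forensic_floor(agreement, floor=FORENSIC_FLOOR):
    """True if the agreement rate reaches the recommended floor."""
    return agreement >= floor


# Agreement rates as reported in the post.
reported_agreement = {
    "research study (Thornton & Knight)": 0.64,
    "field study (Sachsenmaier, Thornton, & Olson, 2011)": 0.55,
}

for source, agreement in reported_agreement.items():
    print(
        f"{source}: agreement {agreement:.0%}, "
        f"error {error_share(agreement):.0%}, "
        f"meets 80% floor: {meets_forensic_floor(agreement)}"
    )
```

Neither figure comes close to the floor, and the field estimate, the one most relevant to courtroom use, is the weaker of the two.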

It is also important to note that, although Thornton and Knight confuse the terms validation and cross validation in their article, this study represents a validation methodology. Cross-validation is a process by which the statistical properties found in a validation sample (such as reliability, validity, and item correlations) are tested in a separate group to see whether they hold up. In contrast, Thornton and Knight first considered the available research data from a small number of individuals from the Bridgewater group to determine what items would be included in the SRA-FV. This group is referred to as the construction sample. The statistical properties of the newly conceived measure were studied on 418 Bridgewater patients who constitute the validation sample. The psychometric properties of the validation group have not been tested on other contemporary sexual offender groups. Absent such cross-validation studies, we simply have no confidence that the SRA-FV works as it has been designed for groups other than the sample upon which it was validated. To their credit, Thornton and Knight acknowledge this limitation and warn readers not to generalize the validation research to contemporary groups of sexual offenders.

The data on incremental predictive validity, while interesting, have little practical value at this point for two reasons. One, it is unknown whether the results will replicate in contemporary groups of sexual offenders. Two, no data are provided to quantify the increased predictive power. The study does not provide an experience table of probability estimates at each score on the Static-99R after taking into account the effect of the SRA-FV scores. It seems disingenuous, if not misleading, to inform the trier of fact that the combined measures increase predictive power but to fail to quantify the result and the associated error rate.

In my practice, I have seen the SRA-FV used most often to select among three Static-99R reference groups. Invariably, government evaluators in sexually violent predator risk assessments assign SRA-FV total scores consistent with the selection of the Static-99R High Risk High Needs reference group. Only the risk estimates associated with the highest Static-99R scores in this reference group are sufficient to support an opinion that an individual meets the statutory level of sexual dangerousness necessary to justify civil confinement. Government evaluators who have used the SRA-FV for this purpose cannot cite research demonstrating that the procedure works as intended or that it produces a reliable match to the group representing the individual being assessed. Unfortunately, Thornton and Knight are silent on this application of the SRA-FV.

In a recently published article (Abbott, 2013), I tested the use of the SRA-FV for selecting Static-99R reference groups. In brief, Dr. Thornton used statistical modeling based solely on data from the Bridgewater sample to devise this model. The reference group selection method was not based on the actual scores of members from each of the three reference groups. Rather, it was hypothetical, presuming that members of a Static-99R reference group will exhibit a certain range of SRA-FV scores that does not overlap with the range of either of the other two reference groups. To the contrary, I found that the hypothetical SRA-FV reference group system did not work as designed, as the SRA-FV scores between reference groups overlapped by wide margins. In other words, the SRA-FV total score would likely be consistent with selecting two if not all three Static-99R reference groups. In light of these findings, it is incumbent upon the developer to provide research using actual subjects to prove that the SRA-FV total score is a valid method by which to select a single Static-99R reference group and that the procedure can be applied reliably. At this point, credible support does not exist for using the SRA-FV to select Static-99R reference groups.

The design, development, validation, and replication of psychological instruments are guided by the Standards for Educational and Psychological Testing (“SEPT” -- American Educational Research Association et al., 1999). When comparing the Thornton and Knight study to the framework provided by SEPT, it is apparent the SRA-FV is in the infancy stage of development. At best, the SRA-FV is a work in progress that needs substantially more research to improve its psychometric properties. Aside from its low reliability and the inability to generalize the validation research to other groups of sexual offenders, other important statistical properties await examination, including but not limited to:

  1. standard error of measurement
  2. factor analysis of whether items within each of the three risk domains significantly load in their respective domains
  3. the extent of the correlation between each SRA-FV item and sexual recidivism
  4. which SRA-FV items add incremental validity beyond the Static-99R or may be redundant with it
  5. whether each item has construct validity
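
To make the first item concrete, here is a minimal sketch of how a standard error of measurement (SEM) follows from a reliability coefficient under classical test theory, where SEM = SD × √(1 − reliability). Two assumptions are mine and not from the study: the agreement figures quoted earlier are treated as if they were reliability coefficients, and the result is expressed in standard deviation units because no SRA-FV score standard deviation is given here. The function name is hypothetical.

```python
import math


def sem_in_sd_units(reliability):
    """Standard error of measurement as a fraction of the score SD.

    Classical test theory: SEM = SD * sqrt(1 - reliability).
    Dividing by SD gives a unit-free value, so no SRA-FV standard
    deviation has to be assumed.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return math.sqrt(1.0 - reliability)


# Assumption: the agreement figures quoted in this post are treated as
# reliability coefficients; 0.80 is the floor from Heilbrun (1992).
for r in (0.55, 0.64, 0.80):
    print(f"reliability={r:.2f} -> SEM = {sem_in_sd_units(r):.2f} SD")
```

The lower the reliability, the wider the band of uncertainty around any individual's obtained score, which is why this statistic matters when a single score is used to support an opinion in court.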

It is reasonable to conclude that at its current stage of development the use of the SRA-FV in forensic proceedings is premature and scientifically indefensible. In closing, in their eagerness to improve the accuracy of their risk assessments, clinicians relied upon Dr. Thornton’s claim in the absence of peer-reviewed research demonstrating that the SRA-FV achieved generally accepted levels of reliability and validity. The history of forensic evaluators deploying the SRA-FV before the publication of the construction and validation study raises significant ethical and legal questions:

  • Should clinicians be accountable to vet the research presented in trainings by an instrument’s developer before applying a tool in forensic practice? 

  • What responsibility do clinicians have to rectify testimony where they presented the SRA-FV as if the results were reliable and valid?

  • How many individuals have been civilly committed as sexually violent predators based on testimony that the findings from the SRA-FV were consistent with individuals meeting the legal threshold for sexual dangerousness, when the published data do not support this conclusion?

Answers to these questions and others go beyond the scope of this blog. However, in a recent appellate decision, a Washington Appeals Court questioned the admissibility of the SRA-FV in the civil confinement trial of Steven Ritter. The appellate court determined that the application of the SRA-FV was critical to the government evaluator’s opinion that Mr. Ritter met the statutory threshold for sexual dangerousness. Since the SRA-FV is considered a novel scientific procedure, the appeals court reasoned that the trial court erred by not holding a defense-requested evidentiary hearing to decide whether the SRA-FV was admissible evidence for the jury to hear. The appeals court remanded the issue to the trial court to hold a Kelly-Frye hearing on the SRA-FV. Stay tuned!

References

Abbott, B.R. (2013). The Utility of Assessing “External Risk Factors” When Selecting Static-99R Reference Groups. Open Access Journal of Forensic Psychology, 5, 89-118.

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association.

Heilbrun, K. (1992). The role of psychological testing in forensic assessment. Law and Human Behavior, 16, 257-272. doi: 10.1007/BF01044769.

In Re the Detention of Steven Ritter. (2013, November). In the Appeals Court of the State of Washington, Division III. 

Sachsenmaier, S., Thornton, D., & Olson, G. (2011, November). Structured risk assessment forensic version (SRA-FV): Score distribution, inter-rater reliability, and margin of error in an SVP population. Presentation at the 30th Annual Research and Treatment Conference of the Association for the Treatment of Sexual Abusers, Toronto, Canada.

Thornton, D., & Knight, R.A. (2013). Construction and validation of the SRA-FV Need Assessment. Sexual Abuse: A Journal of Research and Treatment. Published online December 30, 2013. doi: 10.1177/1079063213511120.
* * *


*Brian R. Abbott is a licensed psychologist in California and Washington who has evaluated and treated sexual offenders for more than 35 years. Among his areas of forensic expertise, Dr. Abbott has worked with sexually violent predators in various jurisdictions within the United States, where he performs psychological examinations, trains professionals, consults on psychological and legal issues, offers expert testimony, and publishes papers and peer-reviewed articles.



(c) Copyright 2013 - All rights reserved