Academics and policy makers are increasingly calling for better evidence of what works in criminal justice. This has led to a focus on the randomized controlled trial (RCT).
RCTs are considered the “gold standard” for helping us draw causal inferences about program effectiveness. That is, RCTs help answer the question: “Did this program produce this change?” This is known as internal validity.
But for evidence-based policy, we also need to know the extent to which a program’s outcomes may be applicable to other settings and populations. This is where external validity comes in.
Of course, an RCT is not, in itself, a guarantee of validity. For example, problems with drop-out can lead to biased results, particularly in studies of higher-risk groups – like those commonly found in criminal justice research.
This can be addressed, in part, by evaluators making the full results and technical details of their studies available for fellow researchers and policy communities. To do this, they need to be good at presenting the key features of the study clearly – so-called “descriptive validity.”
For descriptive validity, evaluators need to explain how the study was designed, implemented and analyzed so that readers can judge whether its quality is satisfactory. They also need to say who received the intervention, where and when: this helps us to judge whether the outcomes can be generalized for implementation on a wider scale.
So how can descriptive validity be improved in crime and justice assessments? Are there any standards for descriptive validity? In criminology, there are none – but inspiration may be available from the health and medical services. The Consolidated Standards of Reporting Trials (CONSORT) offers a 22-item checklist for the reporting of medical trials. Could CONSORT point the way forward?
In a recent study, Charlotte Gill, a post-doctoral fellow at George Mason University, US, used CONSORT to assess how well criminal justice research is doing on descriptive validity. She divided the CONSORT criteria into two groups according to whether they concern internal or external validity. Like a previous study [see: Controlled Trial and Error] that applied CONSORT to selected criminological trials, Gill's study found room for improvement in descriptive validity.
For internal validity, the focus is on reporting of random assignment sequence, length of follow-up period, number of participants analyzed, use of intent-to-treat (ITT) analysis, attrition, and the authors’ view of how far the trial was affected by bias.
For external validity, the criteria include reporting of eligibility criteria for participants, study setting, participants’ baseline characteristics, and authors’ interpretation of generalizability.
Gill then applied these criteria to 38 crime-related RCTs published in leading journals in the period 2002-2008. For each study, she scored each of the 45 criteria according to whether it was “not reported” (0), “reported partially” (1), or “reported fully” (2).
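To make the scoring scheme concrete, here is a minimal sketch of how a single study might be rated on the 0/1/2 scale and summarized separately for the internal and external validity groups. The item names are an illustrative subset drawn from the criteria listed above, the ratings are invented for a hypothetical study, and the averaging rule is an assumption for illustration rather than Gill's actual procedure.

```python
from statistics import mean

# Reporting scale described in the article.
NOT_REPORTED, PARTIAL, FULL = 0, 1, 2

# Illustrative subset of criteria (hypothetical names, not the full checklist).
INTERNAL_ITEMS = [
    "random_assignment_sequence",
    "follow_up_length",
    "participants_analyzed",
    "itt_analysis",
    "attrition",
]
EXTERNAL_ITEMS = [
    "eligibility_criteria",
    "study_setting",
    "baseline_characteristics",
    "generalizability",
]


def mean_score(ratings, items):
    """Average the 0-2 ratings over `items`; unrated items count as not reported."""
    return mean(ratings.get(item, NOT_REPORTED) for item in items)


# Invented ratings for one hypothetical study (not data from the paper).
example_study = {
    "random_assignment_sequence": FULL,
    "follow_up_length": FULL,
    "participants_analyzed": FULL,
    "itt_analysis": NOT_REPORTED,
    "attrition": PARTIAL,
    "eligibility_criteria": FULL,
    "study_setting": FULL,
    "baseline_characteristics": FULL,
    "generalizability": PARTIAL,
}

internal = mean_score(example_study, INTERNAL_ITEMS)
external = mean_score(example_study, EXTERNAL_ITEMS)
print(f"internal: {internal:.2f}, external: {external:.2f}")
```

Keeping the two groups separate mirrors the article's distinction: a study can describe its setting and participants well while still leaving gaps in how it reports attrition or ITT analysis.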
She concluded that overall the mean scores across the sample were “fairly promising,” with medium descriptive validity on internal validity items and high descriptive validity on external validity items. However, this masked a great deal of variation.
Regarding internal validity, reporting was good on number of participants, dates and timing of follow-up, and numbers assigned and analyzed for primary outcome, but it was less consistent for whether studies used ITT analysis, who enrolled participants and whether there were deviations from planned treatment.
Almost all elements scored highly for external validity, although fewer authors addressed the issue of generalizability in their conclusions.
There was a slight improvement in reporting during the later years, and reporting was also slightly better in traditional criminal justice domains – corrections and courts – than in psychological therapies and treatments. This was surprising “given the psychology field’s closer alliance with health science,” Gill said.
Researchers from medical schools were best at giving internal validity data, while sociologists provided the most detail about external validity, which may be unsurprising.
But studies whose lead authors held PhDs in sociology or criminology had the highest scores. Gill suggests this unexpected finding may reflect the fact that many leading proponents of evidence-based practice and RCTs in those fields “have themselves been directly involved in conducting randomized controlled trials.”
When both types of validity were considered together, eight of the 38 studies (21%) scored “high” on reporting of both internal and external validity, and 26 of the 38 (66%) scored “high” on at least one.
However, “much more needs to be done to improve reporting quality even further,” argues Gill, particularly in relation to internal validity. Participant flow, differential attrition and treatment crossover may be harder to capture than external validity, but they can pose major threats to internal validity.
The study’s limitations include the small subset of criminal justice experiments, a limited timeframe, the exclusion of place-based experiments, and restriction to journal articles. In the future some criteria might be given more weight than others to signify relative importance. It would also be interesting to conduct a similar study in health and compare the results.
Gill concludes that good reporting can help policy makers make sense of research quality. She also argues that there is a moral imperative “not only to produce the best research, but to clearly report it to enhance the objectives of evidence-based crime policy.”
Reference:
Gill, C. E. (2011). Missing links: how descriptive validity impacts the policy relevance of randomized controlled trials in criminology. Journal of Experimental Criminology, 7, 201-224.
