Literature Review

Firearms and Toolmark Error Rates

Introduction

            On January 3, 2022, four statisticians issued a statement entitled "Firearms and Toolmark Error Rates." The four statisticians were Alicia Carriquiry, Heike Hofmann, Kori Khan, and Susan Vanderplas. All of them, except Kori Khan, are part of the Center for Statistics and Applications in Forensic Evidence (CSAFE). The purpose of the statement is to offer the opinion that, in the firearm and toolmark discipline, "error rates established from studies with sampling flaws, methodological flaws, non-response and attrition bias, and inconclusive results are not sufficiently sound to be used in criminal proceedings." I reject that position; in this article I will summarize the statement and provide my own opinion.

Participant Sampling

            They first argue that there is a sampling problem within the studies conducted for the discipline. They state that having examiners volunteer for participation in a study biases the study and produces lower error rates, because examiners who volunteer are more involved in the discipline and tend to have more experience. The announcements for these studies are usually posted on the Association of Firearm and Toolmark Examiners (AFTE) forum, whose members derive most of their income from firearm examination, and the statisticians assume that examiners who belong to this organization are more involved in the field and more experienced. I disagree, because a study has to be announced somewhere the relevant scientific community has the opportunity to volunteer. Examiners who belong to AFTE span all experience levels, and it cannot be assumed that membership excludes examiners with only a few years of experience. In my case, I have only 2 years of experience in this field, and I am an AFTE member with access to the AFTE forum. There are also plenty of published studies whose volunteers had only a couple of years of experience, including a consecutively manufactured Ruger slide study performed by the Miami-Dade Crime Laboratory. I also disagree that volunteering affects the validity of the results, because it would be impossible for a researcher to randomly select participants and then have their laboratory present the study as actual casework. Most laboratories' evidence-intake procedures make this hard to accomplish, and it would be difficult to replicate all the evidence and paperwork needed to make the study appear to be a real case. All other scientific disciplines, including the medical field, rely on volunteers for their studies, so this alone should not be used to invalidate firearm and toolmark studies.

Material Sampling

            The group then argues that the discipline has material sampling problems. Studies in the discipline tend to focus on consecutively manufactured parts, which the statisticians find problematic; they state that such studies lose the ability to make broad, sweeping claims about the discipline. Instead, they recommend a black box study with a large number of firearms and ammunition types, so that the study encompasses more of what is found in actual casework. I disagree, because consecutively manufactured studies create the worst-case scenario for examiners and therefore yield the highest theoretical error rate. Consecutive studies have been done on almost every part of the firearm, for example, the barrel, extractor, ejector, and breech face. In addition to the multiple parts examined, multiple machining methods are examined, for example, broached rifling and hammer-forged rifling. Combined, these studies isolate the different parts of a firearm and the different manufacturing methods. They focus on the machining method rather than on a mass of firearms because there is only a limited number of machining methods that manufacturers can use; examining the machining method is therefore more informative for the examiner than examining random makes and models. I also believe that creating a big study examining multiple firearms, as the statement suggests, would not be useful, because examiners would eliminate samples early in the study due to differences in class characteristics, which would prevent the individual characteristics from ever being compared.

Non-Response Bias

            They then address the problem of missing data and non-response bias. They claim that most studies never disclose their data or their drop-out rate. Their suggestion is that the drop-out rate should be factored into the error rate: a drop-out rate of 20% should be enough to invalidate the study results, and a rate of 5% should be sufficient to cause concern. When the drop-out rate reaches these percentages, they recommend that the missing participants' answers be included and counted as 100% incorrect. They reason that participants can be assumed to have quit the study because of its difficulty or their own poor time management, so their answers would largely have been incorrect. Applying this adjustment could raise low error rates to as much as 16.56%, which they treat as an upper bound for the error rate. This argument does not hold up well, because many people may drop out of a study due to caseload at the laboratory or other responsibilities. A drop-out should not automatically be taken to mean the examiner found the study too hard, especially since the statisticians' earlier assumption was that all volunteers were experienced. Also, assuming a 100% error rate for drop-outs assumes complete incompetence of the examiner and disregards the scientific backing of the discipline and the quality assurance measures of the laboratory. Most laboratories require a second examiner to reach the same conclusion before it can be reported, so this would assume the second examiner also had an error rate of 100%.
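The worst-case adjustment the statisticians propose reduces to simple arithmetic, which can be sketched as follows. The input numbers below are assumed purely for illustration (the statement does not give the inputs behind its 16.56% figure): completers keep their observed error rate, and every drop-out is scored as 100% incorrect.

```python
def worst_case_error_rate(observed_error_rate: float, dropout_rate: float) -> float:
    """Upper-bound error rate when every drop-out is counted as 100% incorrect.

    Completers contribute their observed error rate, weighted by the share
    of participants who finished; drop-outs contribute an assumed rate of 1.0.
    """
    return (1 - dropout_rate) * observed_error_rate + dropout_rate * 1.0

# Illustrative numbers (assumed): 1% observed error rate, 20% drop-out rate
print(round(worst_case_error_rate(0.01, 0.20), 3))  # 0.208, i.e. 20.8%
```

Even a 1% observed error rate balloons to about 21% under this convention, which is why I consider scoring drop-outs as total failures an extreme assumption rather than a measurement.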

Inconclusive

            Their next argument concerns the AFTE Theory of Identification's use of the inconclusive conclusion. The AFTE Theory allows the examiner to conclude identification, inconclusive, or elimination, and it further allows three levels of inconclusive, ranging from close to an identification to close to an elimination; although AFTE allows these three levels, they are seldom used in laboratories. The statisticians believe the inconclusive conclusion is used when a decision is hard and the examiner wants to be right. Because of this disagreement, they want inconclusive conclusions counted as errors, rather than omitted from error rates as is common practice. Counting inconclusives as errors can raise the error rate to around 50%, making the conclusion a "coin toss." The field is seeing many "professionals" speaking out against the inconclusive conclusion, but I disagree with their statements. Inconclusive is a valid conclusion because of the nature of the evidence normally received in the laboratory. For example, many expended bullets that come through the laboratory are damaged, which can cause foreshortening and damage to the underlying toolmarks. Some areas become unusable, leaving the examiner with a limited number of markings. These markings may not meet the examiner's threshold for an identification, but their presence prevents the examiner from eliminating the bullet; the only option left is to report an inconclusive result. Another situation arises when the pressure inside a firearm prevents the head of the casing from making good contact with the breech face, so the primer takes only limited marks from the breech face. This situation is similar to the damaged bullet and in no way suggests that the examiner wants to take the easy way out. The examiner lists the conclusion only to speak properly for the evidence and to avoid misleading anyone reading the report.
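The gap between the two scoring conventions for inconclusives can be made concrete with a small sketch. The counts below are assumed purely for illustration; the statement itself reports only the resulting figure of roughly 50%.

```python
def error_rate(correct: int, incorrect: int, inconclusive: int,
               inconclusive_as_error: bool = False) -> float:
    """Error rate under two scoring conventions.

    Common practice: omit inconclusives from numerator and denominator.
    The statisticians' proposal: count every inconclusive as an error.
    """
    if inconclusive_as_error:
        return (incorrect + inconclusive) / (correct + incorrect + inconclusive)
    return incorrect / (correct + incorrect)

# Assumed counts: 490 correct identifications, 10 hard errors, 500 inconclusives
print(error_rate(490, 10, 500))                              # 0.02  (2%)
print(error_rate(490, 10, 500, inconclusive_as_error=True))  # 0.51  (51%)
```

The same underlying conclusions yield 2% under common practice and 51% under the statisticians' convention, which is the "coin toss" figure; the difference comes entirely from the scoring rule, not from any change in examiner performance.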

Conclusion

            Based on the arguments above, the group of statisticians conclude that they cannot support firearm and toolmark examination as evidence in criminal proceedings. They base most of their findings on the studies conducted in the field rather than on the specific examiners in it. They take a strict stand against the discipline but fail to recognize the complexity and uniqueness of this comparative science, as shown by their misunderstanding of inconclusive results and their importance. Their recommendations are extreme and seem designed simply to raise the error rate of a study, for example, counting drop-outs as 100% errors or counting inconclusive results as errors. The courts should not accept their statement, given their lack of understanding and their extreme views on how firearm-related studies should be conducted. They have little evidence to support their claims and provide very few references. The statement also prompted the FBI to post its own response on May 3, 2022, which will be reviewed in another literature review post.

VincentJ
I graduated with a Bachelor's Degree in Forensic Science from John Jay College of Criminal Justice with a focus in Criminalistics. I started my career as a Forensic Scientist in the Controlled Substance Section of a Police Laboratory. After gaining 2 years of experience with controlled substances, I transferred to the field of firearm and toolmark examination. I have published in a scientific journal and continue to research and improve in the comparative sciences.