Let me start by saying: in calling the beloved drop test into question, I'm not (intentionally) trolling here. I actually love the idea behind this test: an attempt at an objective, scientific means of assessing whether a scope can survive field use and still do its job of staying on target.
It does help us as consumers by providing information to guide our gear selection, but thinking critically about it, there is room for error. This contemplation was triggered by a good friend's die-hard advocacy for, and justification of, his purchase of scope brand X, because at the end of a long justification... "it held up to the drop test".
I think that the scientific methods of these tests could be improved, and that's what I want to talk about.
First, I am by trade and training somewhat of a scientist. I'm not the lab-tech guy in the white coat and goggles, but I am a doctor, a surgeon, and I read and critique scientific papers to evaluate the published studies of our profession. We have a little monthly tradition called 'Journal Club' where we sit around, let the libations flow, and discuss the latest medical paper. We praise it for its strengths and contributions to medical/surgical care, then rip it apart for all of its weaknesses: flawed methods, unrepresentative study populations, and poor design. So, without further ado, crack a cold one and let's dice apart these test methods and how they might be improved.
First, the strengths; well... the test accurately exposes the scopes that don't hold zero. A failure is a 100% true positive: the ones that are dropped and lose zero have genuinely failed. That scope, individually, failed.
However...
The ones that do hold zero: are they then quality scopes? That's what we assume. But did each get dropped on the same impact point, with the same impact force and the same system weight (to equalize momentum), as the one that didn't hold? Dropping a rifle onto matted, tarped, variable surfaces leaves a lot of room for variability between drops. In other words, this test is not truly repeatable. Without consistency in impact point, force, and momentum between drops, there's no solid argument that you've identified all scopes that fail to hold zero up to X amount of force. There are probably scopes that don't hold zero that nonetheless passed the test.
I was reading another scope review, I think of a Maven RS, with a test I really appreciated; I can't remember where it was, but they dropped a stated-weight hammer (28 oz, if I recall) onto the turret, front housing, rear objective, focus/parallax adjustment, etc., using a consistent pendulum drop, creating fairly repeatable force and momentum on the rifle/scope system. I thought this was a more repeatable, consistently designed test than dropping the gun onto matted ground.
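The appeal of the pendulum setup is that the delivered energy is fixed by geometry and mass. A rough back-of-the-envelope sketch below, assuming the 28 oz hammer head from the review plus an arm length and release angle I've made up, since the review didn't state them:

```python
import math

# Back-of-the-envelope impact energy for a pendulum hammer drop.
# Assumed (NOT from the original review): 18 in pendulum arm,
# released from horizontal (90 degrees). The 28 oz head is as stated.
OZ_TO_KG = 0.0283495
IN_TO_M = 0.0254
G = 9.81  # m/s^2

mass_kg = 28 * OZ_TO_KG  # ~0.79 kg hammer head
arm_m = 18 * IN_TO_M     # assumed arm length
drop_height_m = arm_m * (1 - math.cos(math.radians(90)))  # full arm height

energy_j = mass_kg * G * drop_height_m       # potential -> kinetic energy
speed_ms = math.sqrt(2 * G * drop_height_m)  # speed at the bottom of the arc
momentum = mass_kg * speed_ms                # kg*m/s at impact

print(f"Impact energy: {energy_j:.2f} J ({energy_j * 0.7376:.2f} ft-lb)")
print(f"Impact speed:  {speed_ms:.2f} m/s, momentum {momentum:.2f} kg*m/s")
```

Same mass, same arm, same release angle means the same energy every drop, which is exactly the repeatability the tarped-ground method lacks.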
The results of "holding zero": most of the scopes that "pass" still show some variance from true zero. So rather than a binary yes/no, it seems more appropriate to report a value, a degree of variance. Why not measure the deviation from zero after X ft-lb of impact at these specific points? We could also find the force required to knock a given scope off zero. I'm certain every scope has its breaking point.
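Reporting a degree of variance is cheap to do: measure the group shift on paper and convert it to minutes of angle. A minimal sketch of that conversion (the 0.8 in shift at 100 yd is a made-up example, not data from any review):

```python
# Convert a measured point-of-impact shift on paper to minutes of angle.
# 1 MOA subtends ~1.047 inches per 100 yards.
def shift_moa(shift_inches: float, distance_yards: float) -> float:
    return shift_inches / (1.047 * distance_yards / 100.0)

# Hypothetical example: 0.8 in shift measured at 100 yd after an impact.
print(round(shift_moa(0.8, 100), 2))  # ~0.76 MOA
```

A table of "MOA shift per ft-lb of impact, per impact point" would say far more than pass/fail.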
The N. "N" refers to the number of subjects, and it is the biggest factor to consider in study design: you pool data from enough subjects to detect a real difference. A single scope tested is an N of 1. That is not a study but an anecdote. An N of 1 combined with a variable, inconsistent method is, to that scope manufacturer, quite an injustice. Concluding that X brand's Y line of scopes does or doesn't hold zero says a lot about X brand, and definitely influences a lot of consumers; just look at the various scope review threads here, with view counts in the several thousands even for the more obscure models. My point is that a statement of "not holding zero," while true for that individual scope, may not accurately reflect the average quality of that optic line. Equally, because one scope passed a drop test, given the aforementioned variability, the result may overstate that optic line's durability and the average quality of that brand. I have to assume every scope brand produces a few lemons in its lineup. An N of 1 doesn't speak to every scope, or even the average. Get my drift? We need an N greater than 1.
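To put a number on the N-of-1 problem, here is a toy simulation. The 20% lemon rate is invented purely for illustration; the point is that even a brand that bad would still "pass" a single-scope drop test most of the time:

```python
import random

random.seed(1)

def chance_all_pass(lemon_rate: float, n: int, trials: int = 100_000) -> float:
    # Probability that every one of n randomly sampled scopes holds zero,
    # estimated by repeating the n-scope "study" many times.
    passes = sum(
        all(random.random() > lemon_rate for _ in range(n))
        for _ in range(trials)
    )
    return passes / trials

for n in (1, 5, 20):
    print(f"N={n:2d}: chance a 20%-lemon brand passes every drop "
          f"~{chance_all_pass(0.20, n):.0%}")
```

With N = 1 the hypothetical 20%-lemon brand passes about 80% of the time; by N = 20 a clean sweep becomes very unlikely. One scope simply cannot separate a good line from a mediocre one.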
Do any archery hunters watch Lusk Archery Adventures' broadhead reviews on YouTube? Those are great methods: virtually every test is standardized and repeatable, with minimal room for error or variability, and the data he collects is quantitative rather than binary. I know broadheads are much easier and cheaper to test than riflescopes, but it's a good example of repeatable, scientific testing and data collection.
I do appreciate the anecdotes and the spirit/intent of these field tests as genuinely useful information to help guide gear selection. But as for the drop test, its results may be representative or they may not, and I wouldn't put 100% stock in them.
That's about all I've got for this month's Journal Club.