Sheeples much….? This is interesting data, but you can hardly draw any conclusions from this.
How many scopes were tested for each make and model? This is a small, potentially biased sample, and nothing conclusive can be drawn from a spreadsheet of pass-or-fail results.
Once again, this data is useless with such a small sample size. You can’t test 1, 2, or even 10 scopes and claim the results are representative of the entire population…you’d need hundreds if not thousands of samples for each model. People should definitely be testing this stuff for themselves and not believing everything they read just because someone tested a few scopes.
I understand your frustration, but I think you are highlighting aspects that are outside the scope of what Formi has laid out.
Large sample sizes are obviously not realistic for this program, and if you go back and read some of Formi's posts, he's been careful to call these scope "evals" and not "tests". That was smart, and I would say highly commendable.
He did that well before it even became a thing on Rokslide (i.e., he posted them on other forums). Many members believe that "drop testing" started at Rokslide - it was going on well before Rokslide was ever created.
Anyway, I would just think of "drop tests" as a gut-check, spot test, or simply a consumer's proof of concept (prior to incorporating a scope onto a platform) rather than a scientific study.
How is it useful?
You're looking for solid designs. If one sample of a certain model passes, you assume that the actual mechanical design is sound. At least for that application. You perform ongoing monitoring, and ideally add more samples. That's pretty straight forward.
However, if one sample of a certain model fails, then it gets more complicated. If the failure confirms a previous bias, one may see little value in doing more evaluations, especially if the track record from other reliable sources supports that decision.
If subsequent samples continue to fail, why keep going? The root cause might be design and/or build quality, but at that point, who cares?
If instead some of the samples pass and some fail, then you need to look into build quality. The design itself is evidently sound, given that some samples passed.
That's where statistics becomes most relevant - characterizing that one particular model that has some fails. We'll need sufficient samples, repeatable methods, thorough documentation, blah, blah, and blah. But is it worth it, to determine a failure rate? Maybe, but I think time/money would be better invested in other designs.
Limitations...
Unfortunately, we are somewhat stuck relying on track records. So if a certain model passes, ideally with multiple samples, and we have corroborating information from other reliable sources, then it's about as good as it gets without more time/money.
Some people may remember when Frank Galli stated that Nightforce failed the least, Leupold the most, and S&B somewhere in between. How many samples, under what conditions, and what measures? Who knows, but his observation can still be valuable, even taken with a grain of salt. Just remember, things can change!
There's a group of people that would trust their life with a DMR/HDMR anywhere in the world, even today with newer options. I'd be interested in failure rates, infant mortality, and other measures, but some of those scopes saw absolute hell and created fans for life. I can't quantify that, but respect the track record and sources. Especially given the application.