People are missing a critical element of the statistics when they insist on a large sample size. If a scope has a 0.5% failure rate (meaning that in "truly scientific" testing it would fail 0.5% of the time, or 5 times out of 1,000), the odds that TWO tested scopes in a row both PASS are extremely high: 0.995 x 0.995, or roughly 99%. You would expect to need a huge number of tests to find even ONE failure. Now calculate the odds of getting two consecutive failures: 0.005 x 0.005, or about 1 in 40,000. So if you test 2 scopes and BOTH of them fail, that outcome is much, much, much, MUCH less likely than passing twice in a row. Do it three times in a row and, with a truly low failure rate, you are looking at roughly a 1-in-8,000,000 fluke.

So if you test a couple of scopes and see multiple failures, you cannot quantify the failure rate, but you can say pretty confidently that there is a problem. Statistically speaking, a low sample size with a very high observed failure rate is far more meaningful than people are giving it credit for.
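To make the arithmetic concrete, here is a minimal sketch in plain Python using the hypothetical 0.5% failure rate from the example above (and assuming each test is independent), showing just how lopsided those probabilities are:

```python
# Probability sketch for the scope example above.
# Assumes independent tests and a hypothetical 0.5% per-scope failure rate.
failure_rate = 0.005
pass_rate = 1 - failure_rate

p_two_passes = pass_rate ** 2         # ~0.990: two passes in a row is the expected outcome
p_two_failures = failure_rate ** 2    # 0.000025: about 1 in 40,000
p_three_failures = failure_rate ** 3  # 1.25e-07: about 1 in 8,000,000

print(f"Two passes in a row:   {p_two_passes:.4f}")
print(f"Two failures in a row: {p_two_failures:.6f} (1 in {1 / p_two_failures:,.0f})")
print(f"Three failures in a row: {p_three_failures:.2e} (1 in {1 / p_three_failures:,.0f})")
```

In other words, if the failure rate really were that low, seeing back-to-back failures in a tiny sample would be a wildly improbable coincidence.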
You know the saying "a good plan now is better than a perfect plan tomorrow"? The corollary is "some data now is better than perfect data tomorrow."