Cliff Grays Podcast with Aaron Davidson

Do they have their own ballistic software / optics line? Are they really good and using someone’s blank, or their own, chambering it to their own in-house manufactured action, and putting it in a stock they designed for the hunter with an explicit purpose?

Just answer yes or no.


I think the smooth brains really miss the point, argue poorly, and then hurl insults, when this whole thing started over a podcast and a drop test.
YES
 
How about you enlighten me.

And, how about you provide me something better. Because if you don’t have anything better, it doesn’t matter how bad it is. It’s still the best there is. Again, I DO NOT CARE if good scopes fail. As long as bad scopes don’t pass, it has value. What is my alternative?
I got you bby.

“Clearly, Aaron did not read the notes on the scope testing…many variables are either addressed or semi-controlled.”

  • Claim without specifics. Saying variables are “addressed” or “semi-controlled” isn’t the same as demonstrating control. Which variables? How were they measured, bounded, and audited? Without a written protocol, tolerances, and QC checks, this is assertion, not evidence.
  • “Semi-controlled” invites bias. Partial control often shifts variance from random to systematic (operator, setup, environment). That tends to make results look repeatable while actually reflecting a hidden bias in the rig or method.

“Three failures in a row says something different than three passes in a row from small samples.”
  • Only under independence and identical conditions. Run-length in a Bernoulli process is meaningful if trials are i.i.d. If the same test setup systematically induces failure (e.g., impact angle, turret orientation, a stressed ring stack), those three “independent” failures may be three reads of the same bias.
  • Asymmetric inference. Three passes don’t prove reliability; agreed. But three fails don’t cleanly estimate population failure rate either—especially with convenience sampling, no randomization, and operator effects.
  • Math is conditional on unknowns. If a scope truly “passes” with probability p, then three fails in a row occur with (1-p)^3. Example: if p=0.9, (1-0.9)^3=0.1^3=0.001 (0.1%). If p=0.7, it’s 0.3^3=0.027 (2.7%). The point: without a credible estimate of p from unbiased, controlled data, the “wow” factor of a fail-run is hard to interpret.
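As a quick sanity check of that arithmetic, here is a minimal Python sketch of the run-of-failures probability under the i.i.d. assumption; the pass probabilities are illustrative guesses, not estimates from any real data:

```python
# Probability of k consecutive failures under an i.i.d. Bernoulli model.
# The pass probabilities below are assumed purely for illustration.
def prob_fail_run(p_pass: float, k: int = 3) -> float:
    """P(k fails in a row) = (1 - p_pass) ** k, valid only if trials are independent."""
    return (1.0 - p_pass) ** k

for p in (0.9, 0.7, 0.5):
    print(f"p_pass = {p:.1f} -> P(3 fails in a row) = {prob_fail_run(p):.3f}")
# 0.9 -> 0.001, 0.7 -> 0.027, 0.5 -> 0.125; the shared-bias caveat above still applies.
```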


“The drop tests aren’t scientific… but the open ‘available to anyone’ aspect has massive value.”
  • Availability ≠ validity. Openness is great, but decision value comes from measurement quality: calibrated height, measured impact energy, controlled surface durometer, defined orientation, pre-registered pass/fail criteria, and blinded scoring.
  • Construct validity gap. Does this test replicate field-relevant loads? Mixed, unmeasured impact vectors may overweight turret-first impacts and underweight recoil & vibration—skewing failure modes away from what most users experience.
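For a sense of what “measured impact energy” would actually quantify, here is a back-of-the-envelope sketch using potential energy E = m·g·h; the mass and height are hypothetical, and real delivered energy also depends on surface compliance and which part of the package strikes first:

```python
# Nominal drop energy E = m * g * h. Mass and height are assumed values,
# not figures from any published drop-test protocol.
G = 9.81  # gravitational acceleration, m/s^2

def drop_energy_joules(mass_kg: float, height_m: float) -> float:
    return mass_kg * G * height_m

print(f"{drop_energy_joules(4.0, 0.9):.1f} J")  # ~35.3 J for a 4 kg rifle/scope dropped 0.9 m
```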


“If a test is pretty repeatable with similar results…it has some validity.”


  • Repeatable ≠ correct. A biased bathroom scale is “repeatable.” Validity requires accuracy against a traceable reference (e.g., instrumented drop, collimator-based zero shift, tall-target tracking with error bounds), not just consistency.
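As one example of a traceable reference, here is a sketch of how a tall-target tracking check could be scored; the dialed correction and measured displacement are hypothetical, and 1 MOA is taken as roughly 1.047 inches per 100 yards:

```python
# Tall-target tracking sketch: dial a known correction, measure the actual
# point-of-impact displacement on paper, and report the tracking error.
def expected_inches(dialed_moa: float, range_yards: float) -> float:
    return dialed_moa * 1.047 * (range_yards / 100.0)  # 1 MOA ~ 1.047 in at 100 yd

dialed_moa = 20.0    # hypothetical dialed correction
range_yards = 100.0
measured_in = 20.5   # hypothetical displacement measured on the target

expected_in = expected_inches(dialed_moa, range_yards)
error_pct = 100.0 * (measured_in - expected_in) / expected_in
print(f"expected {expected_in:.2f} in, measured {measured_in:.2f} in, error {error_pct:+.1f}%")
```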
“Manufacturers’ proprietary tests don’t help me; I need the same test across brands.”

  • False dilemma. It’s not “this open test or nothing.” The real bar is standardized, audited third-party methods (documented rigs, instrumented impacts, blind labeling). Proprietary data can still be probative if independently verified; open data can still mislead if poorly controlled.

“Unless critics replace it with something better that’s available, they’re blowing hot air.”

  • Burden of proof is on the test. Critique doesn’t require offering a turnkey replacement; it requires showing threats to validity (confounding, bias, poor reliability). “Use it until something better exists” is a policy stance, not a scientific defense.

“This is the ONLY option other than sticking my head in the sand.”

  • Availability bias. Claims of uniqueness ignore other reliability evidence (e.g., warranty/RMA rates, controlled tracking tests, recoil/vibration standards, multi-lab ring-down/box tests). If those aren’t consolidated, that’s a curation problem, not proof they don’t exist.

“Worrying about throwing out good scopes doesn’t matter to me; I just want a higher chance of a reliable one.”
  • Screening math cuts both ways. A harsh, noisy test with unknown specificity can spike false rejects—filtering out many good units and preferentially selecting designs robust to this impact profile, not necessarily to real-world use. Decision quality depends on sensitivity/specificity and the cost ratio of false fail vs false pass, none of which are quantified.
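To show how that screening math plays out, here is a small sketch; the prevalence of truly bad units, the test’s sensitivity, and its specificity are all assumed numbers, since none have been published for the drop test:

```python
# How prevalence, sensitivity, and specificity drive false rejects.
# All inputs are assumptions for illustration, not measured properties of the drop test.
def screening_counts(n_units: int, prevalence: float, sensitivity: float, specificity: float):
    bad = n_units * prevalence
    good = n_units - bad
    true_fails = bad * sensitivity            # bad units correctly flagged
    false_fails = good * (1.0 - specificity)  # good units wrongly flagged
    ppv = true_fails / (true_fails + false_fails)  # P(truly bad | test says fail)
    return true_fails, false_fails, ppv

tf, ff, ppv = screening_counts(1000, prevalence=0.05, sensitivity=0.9, specificity=0.8)
print(f"true fails ~{tf:.0f}, false fails ~{ff:.0f}, P(truly bad | fail) ~{ppv:.2f}")
# With 5% truly bad units and 80% specificity, most "failures" are good scopes rejected.
```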
“I had 3 of 4 scopes from one maker fail; swapping only the scope fixed it.”
  • Confounding remains. Identical torque ≠ identical clamping force (lubricity, screw stretch, torque wrench calibration). Tube OD variances, wall thickness, and ring ovalization cause different stress states for each scope in the same rings. Without ABAB crossover (fail scope → good scope → fail scope again) and independent verification (collimator), you risk mistaking interaction effects for unit defects.
  • Selection & survivorship bias. Four units isn’t a population study. Batch effects, early production, or retailer pre-screening can skew your sample. Your experience is valid for you, but it doesn’t estimate brand-level failure rates.
“It isn’t rocket surgery to narrow it down when swapping scopes flips the result.”
  • Post hoc flip isn’t isolation. Flips can stem from small shifts in eye position, parallax, mounting tension release/re-clamp, rail stress relief, or ring seating. Isolation needs blinded mounting, fixture-based aim (no shooter influence), order randomization, and test-retest to rule out regression to the mean.
“Variables like angle, surface, landing point aren’t controlled—but that’s fine because the test is accessible.”
  • Uncontrolled inputs change the outcome distribution. Without fixed orientation (e.g., turret-first vs eyepiece-first), you’re not comparing like-for-like across designs. A scope robust to side impacts may look “bad” if the test over-represents turret-down hits. Accessibility doesn’t excuse mixing apples and anvils.

“It seems pretty repeatable with similar results.”

  • Where’s the reliability stat? “Seems” needs numbers: intra-rater agreement, test–retest variance, effect sizes with confidence intervals, and inter-lab reproducibility. If two operators can’t reproduce each other’s results under the same protocol, repeatability is illusory.
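For instance, inter-rater agreement could be reported with something as simple as Cohen’s kappa on two operators’ pass/fail calls; the ratings below are invented purely to show the calculation:

```python
# Cohen's kappa for two operators scoring the same scopes pass/fail.
# The ratings are made up to illustrate the statistic, not real test data.
def cohens_kappa(rater_a, rater_b):
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    pa = rater_a.count("pass") / n
    pb = rater_b.count("pass") / n
    expected = pa * pb + (1 - pa) * (1 - pb)  # chance agreement from marginal rates
    return (observed - expected) / (1 - expected)

a = ["pass", "fail", "pass", "pass", "fail", "pass", "fail", "pass"]
b = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass"]
print(f"kappa = {cohens_kappa(a, b):.2f}")  # 1.0 = perfect agreement, ~0 = chance-level
```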
Core methodological gaps (the “why it’s vulnerable” list)

  • No pre-registered protocol: Without a frozen playbook (heights, surfaces, orientations, pass/fail thresholds, sample sizes), it’s easy to unconsciously tune conditions.
  • No instrumentation: Lack of measured acceleration/energy means you don’t know what you actually applied.
  • No blinding/randomization: Brand knowledge and order effects can influence setup, inspection, and interpretation.
  • Small-n with convenience sampling: Results are fragile and prone to runs, selection bias, and overinterpretation.
  • Outcome measure muddiness: Group shift can be shooter-, ammo-, or condition-driven; optical collimation or tall-target tracking would isolate the scope.
  • Unknown error rates: Sensitivity/specificity of the test to true mechanical failure modes are unquantified.
Constructive upgrades (minimal overhead, big payoff)
  • Fix three orientations (turret-down, ocular-down, side-impact) with a simple jig; photograph each setup.
  • Use one standard surface (documented durometer) and a measured drop height.
  • Blind the brand/model (tape the markings); randomize test order.
  • Pre-register pass/fail thresholds (e.g., ≥1.0 MOA zero shift after N drops) and publish all results, not just notable ones.
  • Verify zero shift with a collimator (no shooter noise) and add a tall-target tracking check pre/post.
  • Report CIs for shifts and a simple power analysis for planned sample sizes.
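As a sketch of that last point, here is one way to report a confidence interval on zero shift and back out a precision-based sample size (a simple stand-in for a formal power analysis); the shift values, the 1.96 normal-approximation critical value, and the ±0.1 MOA precision target are all assumptions:

```python
# 95% CI for mean post-drop zero shift, plus the sample size needed for a target precision.
# Shift data, the 1.96 critical value, and the precision target are illustrative assumptions.
import math
import statistics

shifts_moa = [0.2, 0.4, 0.1, 0.8, 0.3, 0.5]   # hypothetical post-drop zero shifts
n = len(shifts_moa)
mean = statistics.mean(shifts_moa)
sd = statistics.stdev(shifts_moa)
half_width = 1.96 * sd / math.sqrt(n)          # small n would really call for a t critical value
print(f"mean shift {mean:.2f} MOA, 95% CI ±{half_width:.2f} MOA (n={n})")

target = 0.1                                   # desired CI half-width in MOA
n_needed = math.ceil((1.96 * sd / target) ** 2)
print(f"~{n_needed} units for ±{target} MOA precision at the same spread")
```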


Bottom line: your policy argument (open, comparable, better than nothing) is understandable. But the scientific argument hinges on control, measurement, and error rates. Until those are nailed down, consecutive failures, personal flip-tests, and “seems repeatable” carry less evidentiary weight than they appear.
 
It’s so interesting to continually hear people talk about something with so much assurance when they haven’t read what is actually done, don’t understand it, and have never attempted to replicate it.

You know, something like the scientific method.
It’s easy to dismiss something when you’ve never actually tried it. Especially with internet confirmation bias from others who also have never tried it.

Honestly makes a guy want to just delete the little bit of internet he actually has (Rokslide).
 
I got you bby.

How about get off the internet and go shoot? 🤓
 
Dude, not here to argue with you. You can drink Aaron’s KOOLAID. I still have my opinion of Aaron and his marketing, and his proclaimed know-it-all status, as nothing but a Carnival Barker. His arrogance precedes him and he offends many. I don’t have the patience to teach you; you are obviously a GW fanboy. Every gun builder seeks out Aaron’s knowledge and expertise. TFF. By the way, your boy Aaron does not own a scope manufacturing company. Also, the builders I mentioned do have proprietary stocks. Most of these builders use refined CRF Mod 70 actions and Granite Mtn Arms Mausers. D’Arcy also has his own action. Again, you can be bedazzled by Aaron’s actions; I’m not. By the way, I checked with D’Arcy and he said he didn’t get any assistance on his design and engineering or consultation from Aaron. LOL
No hard feelings. I will hunt with my Echols, Simillion, Penrod, Buehler, and Heilman rifles, and you have fun with your GWs and whatever else he’s marketing.
 
He asked for links to all the stuff you said the other manufacturers make, not for you to blather on and throw out the names of all the manufacturers you can think of. Talk about coming across as arrogant.
 
I listened to the entire podcast. It is clear that Aaron knows a LOT about shooting. And I too was interested in his spin on lighter calibers, negative comb stocks, and the $50k he spent on a bench to conduct drop testing.

My issue is most all of my rifles are either Tikka, Savage, or Weatherby Vanguard, and they all have factory barrels and Vortex Viper scopes. Someone like Aaron Davidson would laugh me out of his shop. I looked at his website and there was a scope for $2300. I am sure it is worth it too, but that is just so FAR out of my class.
Same here!
 
It’s so interesting to continually hear people talk about something with so much assurance when they haven’t read what is actually done, don’t understand it, and have never attempted to replicate it.

You know, something like the scientific method.
Often it's not what you say but how you say it, to really get the message across.

 