The model ran. The pipeline worked. The dashboard refreshed on schedule. And yet the business decision that followed was wrong, late, misaligned, or quietly ignored. The last one is the most expensive failure of all because it looks like success right up until nothing happens. Executives often assume the issue was execution or tooling. It rarely is.
The real constraint in data science is not algorithms or computations. It is judgment: knowing what question to ask, what level of accuracy is worth paying for, what not to optimize, and when to stop believing your own metrics.
This is about that judgment. Not how to code models, but how to govern, evaluate, and demand good decision-making from the data science work happening inside your organization.
Because if you are a decision maker, your job is not to understand every model. Your job is to recognize when the work is sound, when it is misdirected, and when sophistication is being mistaken for value.
Data Intuition: The Sanity Check That Saves Millions
Strong data scientists develop an internal alarm system. When a result looks “too good,” it usually is. This is not gut feel in the mystical sense. It is pattern recognition built from experience.
A seasoned practitioner constantly asks quiet questions:
- Does this magnitude make sense?
- What would have to be true for this to be real?
- What simple explanation could also produce this result?
If a model claims 99% accuracy, good judgment immediately asks about base rates. If a forecast predicts a 1,500% revenue increase, good judgment pauses before ordering champagne.
This intuition often shows up as quick mental math: order-of-magnitude estimates, rough probability checks, napkin arithmetic. Executives should not expect perfection, but they should expect skepticism. Results should survive basic plausibility tests before they are operationalized.
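To make the base-rate point concrete, here is a minimal sketch with invented numbers showing why "99% accuracy" can be meaningless on imbalanced data:

```python
# Hypothetical fraud-detection scenario: 1% of transactions are fraud.
# All figures are illustrative.
total = 100_000
fraud = 1_000  # 1% base rate

# A "model" that labels every transaction as legitimate is correct on
# all non-fraud cases and wrong on every fraud case.
correct = total - fraud
accuracy = correct / total
print(f"Do-nothing baseline accuracy: {accuracy:.1%}")  # 99.0%

# So a model boasting 99% accuracy may catch zero fraud.
# The better question: what is its recall on the rare class?
```

This is the arithmetic behind the seasoned practitioner's pause: the headline number must beat what doing nothing already achieves.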
When intuition is absent, organizations drift toward two extremes: blind faith in numbers or total cynicism about analytics. Neither scales.
The Economics of Accuracy and the Baseline Imperative
Accuracy is not free. And more accuracy is rarely proportionally more valuable.
Most business problems follow a familiar curve. A simple model delivers most of the value quickly. Additional complexity yields diminishing returns, while costs compound through development time, infrastructure, maintenance, and organizational fragility.
This is where judgment separates disciplined teams from expensive ones.
The Mandatory Baseline (The ‘Dumb’ Model)
Good judgment refuses to build a complex model until a deliberately simple one exists.
This is the Baseline Imperative: Never approve a sophisticated model unless a “dumb” baseline has already been built.
That baseline might be an average, a rule of thumb, or a basic spreadsheet trendline. Its job is not to be impressive. (It will not get you a keynote at a tech conference.) Its job is to establish the denominator.
Without a baseline, there is no ROI calculation. You cannot know whether a deep learning system is “worth it” unless you know that a one-line heuristic already gets you most of the way there.
Sometimes that baseline wins outright. I have seen multi-month modeling efforts lose to a three-line Excel moving average. Not because the data scientists were incompetent, but because the underlying business signal was stable, slow-moving, and well-behaved. In those cases, complexity did not add insight. It added maintenance.
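A moving-average baseline really is only a few lines. The sketch below, with invented sales figures, shows the kind of "dumb" model that establishes the denominator, along with a simple error score to compare any fancier model against:

```python
# Illustrative baseline: trailing moving average as a forecast.
# The numbers are invented; the point is the comparison, not the data.
sales = [100, 104, 98, 110, 107, 112, 115, 111, 118, 120]

def moving_average_forecast(series, window=3):
    """Forecast each point as the mean of the previous `window` points."""
    return [sum(series[i - window:i]) / window
            for i in range(window, len(series))]

forecasts = moving_average_forecast(sales)
actuals = sales[3:]

# Mean absolute error: the number any complex model must beat.
mae = sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)
print(f"Baseline MAE: {mae:.2f}")
```

Whatever the sophisticated model achieves, its value is measured as the gap from this number, not from zero.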
Executives should actively ask:
- What is the baseline?
- How well does it perform?
- What incremental lift does the complex approach deliver?
- What does that lift cost us over the next two years?
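The four questions above reduce to back-of-envelope arithmetic. Here is a sketch with placeholder figures (every number is an assumption to be replaced with your own):

```python
# Back-of-envelope ROI check: complex model vs. the baseline.
# All figures are placeholders for illustration.
baseline_error_cost = 500_000   # annual cost of the baseline's errors
complex_error_cost = 420_000    # annual cost with the complex model
annual_lift = baseline_error_cost - complex_error_cost  # 80,000/yr

build_cost = 150_000            # one-time development
annual_run_cost = 60_000        # infrastructure + maintenance per year

years = 2
net = annual_lift * years - (build_cost + annual_run_cost * years)
print(f"Two-year net value of the complex model: ${net:,.0f}")
# Negative here: in this scenario, the 'dumb' baseline wins.
```

Notice that the calculation is impossible without the baseline's error cost, which is exactly why the baseline must exist first.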
This single discipline eliminates a surprising amount of over-engineering and quietly saves real money.
Problem Framing: Where Most Projects Quietly Fail
Data science does not fail at deployment. It fails earlier, when the problem is framed incorrectly.
Framing determines what data is collected, what metric is optimized, and what decisions are justified. Once the framing is wrong, no amount of modeling rescues the outcome.
Correlation Is Not Permission
Patterns are easy to find. Causes are not.
Observational data can suggest hypotheses, but judgment lies in knowing when evidence is suggestive versus decisive. A relationship between two variables does not grant permission to act as if one causes the other.
Strong teams communicate uncertainty clearly. They articulate assumptions, identify hidden variables, and recommend tests when causal claims matter. Weak teams present correlations as conclusions and let executives infer certainty where none exists.
If you ever hear, “The data proves…,” that is your cue to slow the meeting down.
When Measures Become Targets (Goodhart’s Law)
There is a well-known principle in economics called Goodhart’s Law: when a measure becomes a target, it stops being a good measure. While the phrase gets quoted often, the nuance is frequently missed. In practice, there are two distinct failure modes.
Bad Proxy: The metric was always a bad proxy. Measuring productivity by lines of code rewards verbosity, not outcomes.
Gaming: The metric was initially reasonable, but people figured out how to exploit it. Click-through rate is the canonical example. Once heavily optimized, it invites clickbait that drives short-term engagement and long-term damage.
The executive lesson is subtle but critical. Judgment is not just picking the right metric. It is monitoring that metric for decay over time as incentives and behavior adapt.
Metrics are not dashboard decorations. They are contracts with human beings.
Intellectual Humility and the Pre-Mortem
Experienced data scientists know what their models cannot say. They communicate limits clearly. They resist false precision.
This humility can feel uncomfortable in organizations that reward confidence. But it is one of the strongest predictors of long-term success.
One of the most effective ways to institutionalize humility is not a post-mortem, but a pre-mortem.
The Pre-Mortem Technique
Before a project begins, ask the team a simple question: “Imagine it is six months from now, and this project has failed spectacularly. Write down exactly what happened.”
This exercise surfaces assumptions that would otherwise remain buried. Data quality issues emerge. Dependencies get named. Misaligned expectations show up early.
Most importantly, it gives junior team members permission to say the quiet parts out loud. In standard planning meetings, those concerns often remain politely unspoken.
The Executive Prompt
For executives, the trigger question is even simpler: “Walk me through the worst-case scenario.”
Not rhetorically. Literally. This reframes risk as something discussable, not disloyal (even if it does make the room go quiet for a moment).
Ask your data lead to explain what failure would look like, what would cause it, and how you would know it was happening. Organizations that normalize pre-mortems have fewer surprises and better decision hygiene.
Ethics Is Not Optional (and Not Abstract)
Ethical failures in data science rarely come from bad intent. They come from neglect.
Bias emerges because training data reflects history. Privacy is violated because “we already have the data.” Harm occurs because no one asked who would be affected downstream.
Good judgment treats ethics as part of problem framing, not a compliance checklist. It asks:
- Who could this harm?
- Who might this disadvantage?
- What would we say if this showed up on the front page?
Executives set the tone here. When leaders reward speed without guardrails, corners get cut. When they ask ethical questions early, teams design differently.
Ethics is not a brake on innovation. It is a brake on real harm… and the reputational catastrophe that usually follows.
Adaptability, Generative AI, and Skepticism as a Skill
Generative AI has made sophisticated outputs cheap, fast, and plausible. This makes the Baseline Imperative more important, not less.
When answers arrive instantly and sound confident, the discipline is no longer building something impressive. The discipline is defining what “good enough” actually means before the system speaks.
These tools produce plausible outputs at remarkable speed, yet they break in non-obvious ways. Strong teams treat AI assistance the way they treat junior analysts: fast, helpful, and always reviewed. Weak teams treat outputs as authoritative.
Adaptability means learning new tools. Judgment means understanding how and why they break.
The executive posture should be pragmatic optimism:
- Pilot quickly.
- Compare against simple baselines.
- Validate relentlessly.
- Keep humans accountable.
The goal is not to resist new technology. It is to absorb it without surrendering discernment.
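"Always reviewed" can be partly automated. Here is one hedged sketch of cheap guardrails applied before a generated number is trusted; `generate_forecast` is a hypothetical stand-in for any AI-assisted output, and the thresholds are assumptions:

```python
# Sketch: never operationalize a generated answer unchecked.
# `generate_forecast` is a hypothetical stand-in for an AI-assisted output.
def generate_forecast():
    return {"revenue_growth": 15.2}  # placeholder output, in percent

def plausibility_checks(output, history_max_growth=8.0):
    """Cheap guardrails a reviewer would apply before trusting the number."""
    issues = []
    growth = output.get("revenue_growth")
    if growth is None:
        issues.append("missing field: revenue_growth")
    elif not (-100 <= growth <= 2 * history_max_growth):
        issues.append(f"growth {growth}% outside historical plausibility")
    return issues

problems = plausibility_checks(generate_forecast())
print(problems or "passes basic sanity checks; still needs human review")
```

Passing the checks does not make the output correct; it only earns it a human reviewer's attention, which is the point.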
What ‘Good’ Looks Like
In organizations with strong data science judgment, a few patterns repeat:
- Simple baselines precede complex models
- Metrics are revisited as behavior adapts
- Uncertainty is communicated without apology
- Pre-mortems are routine, not theatrical
- Ethics is embedded, not bolted on
- Leaders ask better questions every quarter
These organizations do not worship data. They use it.
The Executive Takeaway
Data science does not fail because models are weak. It fails because judgment is underdeveloped, under-rewarded, or quietly outsourced to systems that do not understand context.
Your job as a decision maker is not to become technical. It is to demand clarity, baselines, humility, and alignment.
The companies that win with data are not the ones with the most sophisticated models. They are the ones that know when sophistication is unnecessary… a judgment that, inconveniently, cannot be automated.
More columns from Michael Bagalman’s Data Science for Decision Makers series are available at All Things Innovation and All Things Insights.
Contributor
Michael Bagalman brings a wealth of experience applying data science and analytics to solve complex business challenges. As VP of Business Intelligence and Data Science at STARZ, he leads a team leveraging data to inform decision-making across the organization. Bagalman has previously built and managed analytics teams at Sony Pictures, AT&T, Publicis, and Deutsch. He is passionate about translating cutting-edge techniques into tangible insights executives can act on. Bagalman holds degrees from Harvard and Princeton and teaches marketing analytics at the university level. Through his monthly column, he aims to demystify important data science concepts for leaders seeking to harness analytics to drive growth.