What analytics engineering didn’t prepare me for in AI


From the outset, analytics engineering seemed like the final evolutionary stage of business intelligence.

It created order through modern warehouses and declarative modeling tools. Dashboards became trusted, lineage was visible, and metrics were version-controlled.

Over time, it gained traction through communities like dbt Labs, where it was established as a discipline focused on documentation, reproducibility, and testing best practices.

💡
For many, analytics engineering felt closer to software engineering than traditional reporting.

With the arrival of Artificial Intelligence, the focus shifted from describing the past to predicting the future.

Systems began generating new outputs and working with probabilities rather than fixed answers. Instead of deterministic SQL queries, teams began working with uncertainty. Compared to the clean, predictable nature of dashboards, AI systems feel fundamentally different.

This was the turning point.

Analytics engineering prepared me to build reliable reports. AI requires building intelligent systems. That shift demands a full-stack mindset.


Analytics engineering foundations: What we were trained to optimize

The data warehousing tradition was the foundation of analytics engineering.

We learned to prioritize clarity of structure and dimensional modeling, drawing from texts like The Data Warehouse Toolkit. The goal was consistency and trust, where the same SQL query would always produce the same result.

This determinism became the basis of stakeholder confidence. It worked because it created stability and shared understanding across teams.

However, it also introduced a set of assumptions:

  • Transformations are rigid
  • Answers are exact
  • The world is structured

AI challenges each of these assumptions.


The mindset gap: Deterministic pipelines vs probabilistic systems

Machine learning operates on probability, while analytics engineering is built on certainty.

💡
A dashboard might report revenue as $1.2m. A model, on the other hand, might predict a 72% probability that a customer will churn. One is definitive, the other is contextual.

Research from Harvard Business Review reinforces this shift. Thomas H. Davenport and Rajeev Ronanki explain that successful AI systems deliver value within constraints, with usefulness taking priority over perfection.

This reframes what “correct” means.

Instead of asking whether something is correct, teams focus on:

  • Performance improvement
  • Comparison to a baseline
  • Value delivered to users

As a result, fixed validation gives way to experimentation. Metrics become distributions rather than absolutes, and progress is measured iteratively. For engineers used to deterministic systems, this shift can feel unfamiliar, yet it becomes essential.
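The contrast can be made concrete with a small sketch. All names and numbers below are illustrative, not from any real system: a deterministic aggregate answers "is this correct?" exactly, while a model is judged by its lift over a baseline.

```python
# Hypothetical sketch: the same "is this correct?" question answered two ways.
# A dashboard metric is exact; a model is judged against a baseline.

def dashboard_revenue(rows):
    """Deterministic: the same input always yields the same total."""
    return sum(r["amount"] for r in rows)

def evaluate_against_baseline(model_probs, baseline_probs, labels):
    """Probabilistic: 'correct' means better than a baseline, not exact."""
    def accuracy(probs):
        return sum((p >= 0.5) == y for p, y in zip(probs, labels)) / len(labels)
    return accuracy(model_probs) - accuracy(baseline_probs)  # lift over baseline

rows = [{"amount": 500_000}, {"amount": 700_000}]
print(dashboard_revenue(rows))            # always 1200000

labels = [1, 0, 1, 1, 0]                  # did the customer actually churn?
model = [0.72, 0.30, 0.65, 0.55, 0.20]    # predicted churn probabilities
baseline = [0.5, 0.5, 0.5, 0.5, 0.5]      # naive always-uncertain baseline
print(evaluate_against_baseline(model, baseline, labels))
```

The second number is a distribution-dependent lift, not a fixed fact, which is exactly the shift the essay describes.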


The data problem gets harder: Messy inputs, drift, and continuous quality

AI introduces a level of complexity that structured analytics rarely encounters.

Data extends beyond clean tables to include logs, images, and unstructured text, all of which require ongoing interpretation and engineering. This data also evolves over time.

In traditional analytics, issues like null values or broken schemas were visible and relatively easy to diagnose. In AI systems, challenges emerge more subtly. Models can degrade while systems appear to function as expected.

Distributions shift.

User behavior evolves.

Language changes.

Simple checks such as row counts provide limited coverage in this context.

Modern AI systems require continuous monitoring and active data management. As Bernard Marr highlights, value from AI comes from actively governed data.

Data quality becomes an ongoing responsibility.
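One hedged way to operationalize this kind of monitoring is a population-stability-style drift score. The function below is a toy sketch, not a standard library API, and the 0.25 threshold is only a common rule of thumb: a row count would pass in both cases, while the distribution check would not.

```python
# Hypothetical sketch: a minimal population-stability-style drift check.
# Row counts can pass while the distribution of a feature quietly shifts.
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between two samples of one feature."""
    lo, hi = min(expected), max(expected)
    step = (hi - lo) / bins or 1.0
    def frac(sample):
        counts = [0] * bins
        for x in sample:
            i = min(max(int((x - lo) / step), 0), bins - 1)
            counts[i] += 1
        return [max(c / len(sample), 1e-6) for c in counts]  # avoid log(0)
    e, a = frac(expected), frac(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

train = [i / 100 for i in range(100)]        # training data: uniform on [0, 1)
live = [0.5 + i / 200 for i in range(100)]   # live data: shifted upward
score = psi(train, live)
print(f"PSI = {score:.2f}")  # > 0.25 is a common 'significant drift' threshold
```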


New responsibilities: From transformation to models and MLOps

In analytics, pipelines end at insight. In AI, they extend to action.

Dashboards support human decision-making. Models automate decisions.

This shift introduces a new set of responsibilities:

  • Model deployment and rollback
  • Training and evaluation
  • Monitoring predictions in production
  • Ensuring training consistency

The lifecycle becomes continuous rather than static.
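As a sketch of what "deployment and rollback" can mean in practice, the toy registry below promotes a candidate model only when it measurably beats the current baseline, and otherwise keeps serving the old version. `ModelRegistry` and its method names are illustrative, not a real MLOps API.

```python
# Hypothetical sketch: a promotion gate that treats deployment as part of the
# model lifecycle. A candidate ships only if it beats the serving baseline;
# otherwise the old version keeps serving (rollback by default).

class ModelRegistry:
    def __init__(self, baseline_name, baseline_score):
        self.serving = baseline_name
        self.score = baseline_score
        self.history = [(baseline_name, baseline_score)]

    def promote(self, name, score, min_lift=0.01):
        """Promote only on measurable improvement; otherwise keep serving."""
        if score >= self.score + min_lift:
            self.serving, self.score = name, score
        self.history.append((name, score))   # every candidate is recorded
        return self.serving

registry = ModelRegistry("churn-v1", baseline_score=0.81)
registry.promote("churn-v2", score=0.79)   # worse: v1 keeps serving
registry.promote("churn-v3", score=0.85)   # better: v3 takes over
print(registry.serving)  # churn-v3
```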

Guidance from Google formalizes this approach under MLOps, where models are treated as production systems.

Frameworks like the ML Test Score, developed by Eric Breck, provide structured ways to assess readiness and manage risk.

The risks are well documented. D. Sculley and colleagues show how quickly complexity builds in machine learning systems when pipelines are fragile or loosely defined.

Over time, shortcuts accumulate and systems become unstable.

ML systems are engineering problems as much as data challenges.


The full-stack reality: Infrastructure, product, and human trust

As soon as models are embedded into applications, the scope expands.

💡
Concerns such as latency, cost, and scalability become central. Real-time systems and APIs become part of the data workflow. At this point, work extends beyond reporting into product development.

Trust also becomes a defining factor.

Dashboards present verifiable numbers. Models make decisions that impact users. This introduces new expectations around transparency, bias, and accountability.

Users want explanations. Regulators expect oversight.

Trust becomes something that is designed, measured, and maintained alongside technical performance.


Conclusion

Analytics engineering provided strong foundations in lineage, reproducibility, testing, and discipline.

AI builds on these foundations while introducing uncertainty, continuous change, and new system-level challenges.

The boundaries between engineering, analytics, and product continue to converge. Data professionals increasingly think across the full stack, from data models to real-world impact.

The goal is to extend analytics engineering.

From clean dashboards to intelligent systems. From static pipelines to adaptive ones.

This is the shift AI demands, and it highlights the gap that analytics engineering alone did not fully address.


References

  • Breck, E., Polyzotis, N., Roy, S., Whang, S., & Zinkevich, M. (2017). The ML Test Score: A Rubric for ML Production Readiness.
  • Davenport, T. H., & Ronanki, R. (2018). Artificial Intelligence for the Real World. Harvard Business Review.
  • Google (2020). MLOps: Continuous Delivery and Automation Pipelines in Machine Learning.
  • Kimball, R., & Ross, M. (2013). The Data Warehouse Toolkit. Wiley.
  • Kleppmann, M. (2017). Designing Data-Intensive Applications. O’Reilly Media.
  • Marr, B. (2021). Data Strategy: How to Profit from a World of Big Data, Analytics and AI. Kogan Page.
  • Sculley, D., et al. (2015). Hidden Technical Debt in Machine Learning Systems. In Advances in Neural Information Processing Systems.

3 easy ways to get the most out of Claude Code


The difference between a developer who gets mediocre results and one who ships faster than ever comes down to one thing:

How well they have set Claude up to succeed. 

Claude Code works best as a smarter collaborator. It’s closer to onboarding a new engineer, one who needs context, structure, and clear boundaries to do their best work.

Here is how to give it exactly that…

Why AI safety breaks at the system level


Two developments in AI have started to reveal a deeper shift in how intelligent systems are built and deployed. 

One model operates behind closed doors, supporting a small group tasked with securing critical infrastructure. Another operates in the open, generating software across extended sessions with minimal supervision.

Same field. Very different philosophies.

For AI professionals, this contrast highlights a more meaningful question than model benchmarks or parameter counts: 

What kind of AI ecosystem is emerging, and how does it shape the way AI systems are designed, deployed, and trusted?


The rise of system-level risk in AI

Recent research explores how AI safety at the model level does not always translate into system-level safety in real-world deployments.

A model can demonstrate strong model alignment during evaluation, yet exhibit entirely different behaviors when embedded within LLM agents. Once connected to tools, APIs, and external environments, the model operates within a broader agentic system that introduces new dynamics.

These dynamics include:

  • Multi-step reasoning across complex workflows
  • Tool use and API integration within agent frameworks
  • Persistent memory in AI systems across sessions
  • Interaction with external and unstructured data sources

Each layer adds complexity. Each interaction expands the AI risk surface.

The result is a shift from isolated model behavior toward emergent system behavior in AI. That shift carries implications for how AI governance and safety are understood and implemented.


So why is model alignment alone not enough?

Model alignment focuses on constraining outputs within acceptable boundaries. Techniques such as reinforcement learning from human feedback (RLHF), constitutional AI, and benchmark-driven evaluation aim to shape responses toward desired behaviors.

💡
Once a model becomes part of an agentic AI system, those constraints operate within a more complex loop. The model plans, acts, observes, and updates. Over time, these cycles create opportunities for unintended outcomes within AI-driven workflows.

Key factors that drive this gap include:

  • Context expansion in large language models. Agents operate across extended contexts, often combining structured and unstructured data. This creates opportunities for subtle inconsistencies to influence decisions.
  • Tool integration and execution risk. Access to external tools introduces operational risk. A safe response at the language level can translate into an unsafe action at the system level.
  • Goal persistence in autonomous agents. AI agents maintain objectives across multiple steps. Small deviations in reasoning can compound over time, leading to outcomes that diverge from initial intent.
  • Evaluation mismatch in AI systems. Many AI evaluation frameworks focus on single-turn interactions. Agent-based systems require multi-step evaluation and scenario testing to reflect real-world usage.

Together, these factors create a gap between how AI safety is measured and how AI systems behave in production.
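The evaluation-mismatch point can be illustrated with a toy sketch (the agent run, tools, and thresholds below are stand-ins, not a real framework): a per-turn check approves every individual action, while a trajectory-level check catches the compound effect of goal persistence across steps.

```python
# Hypothetical sketch of the evaluation mismatch: each step looks safe in
# isolation, but the multi-step trajectory exceeds its overall budget.

def per_turn_ok(action):
    """Single-turn evaluation: no individual action looks risky."""
    return action["cost"] < 10

def trajectory_ok(actions, budget=25):
    """System-level evaluation: judge the whole multi-step run."""
    return sum(a["cost"] for a in actions) <= budget

# An agent pursuing one goal across steps: each call is small, the total is not.
run = [{"tool": "search", "cost": 4},
       {"tool": "fetch",  "cost": 6},
       {"tool": "fetch",  "cost": 7},
       {"tool": "write",  "cost": 9}]

print(all(per_turn_ok(a) for a in run))   # True: every step passes alone
print(trajectory_ok(run))                 # False: the run as a whole does not
```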


The emergence of agentic complexity

Agent-based systems represent a transition from static inference toward dynamic execution. This shift introduces a new category of challenges in AI system architecture and enterprise AI deployment.

In traditional deployments, the model serves as a component within a controlled pipeline. In agentic AI systems, the model takes on a more active role, making decisions that influence future states and downstream actions.

This creates a form of operational complexity that resembles distributed systems engineering more than standalone models.

Core characteristics of agentic complexity in AI include:

  • Stateful AI interactions across time
  • Non-deterministic execution in LLM agents
  • Feedback loops in autonomous AI systems
  • Interdependencies between tools and model reasoning

These characteristics require a different approach to AI orchestration, monitoring, and control.


What this means for enterprise AI system design

As AI systems evolve, design priorities are shifting. Model performance remains important, yet AI system reliability, observability, and governance are gaining equal weight in enterprise environments.

A few principles are starting to define best practice in AI system design:

  • Design for containment in AI systems. Systems benefit from clearly defined boundaries around agent capabilities. Limiting access to sensitive tools and data reduces exposure to system-level risk.
  • Prioritize observability in AI workflows. Detailed logging and monitoring enable teams to understand how decisions are made across multi-step processes. This supports both debugging and AI governance frameworks.
  • Structure AI workflows explicitly. Breaking tasks into defined stages improves reliability. Structured workflows guide the model through complex processes while reducing ambiguity.
  • Align evaluation with real-world AI deployment. Testing frameworks need to reflect real usage conditions. Multi-step evaluation, red teaming, and adversarial testing provide more meaningful insights than static benchmarks.

These principles reflect a broader shift toward system-level thinking in AI engineering. The focus moves from optimizing individual models to managing interactions across the entire AI stack.
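Two of those principles, containment and observability, can be sketched in a few lines. `ToolRunner` and the tool names below are hypothetical, not a real agent framework API: the allowlist is the containment boundary, and the audit log is the observability record.

```python
# Hypothetical sketch: containment via a tool allowlist, observability via a
# decision log that records every attempted tool call, allowed or not.
import time

class ToolRunner:
    def __init__(self, allowed_tools):
        self.allowed = set(allowed_tools)   # containment boundary
        self.audit_log = []                 # observability record

    def call(self, tool, **kwargs):
        entry = {"ts": time.time(), "tool": tool, "args": kwargs}
        if tool not in self.allowed:
            entry["result"] = "denied"
            self.audit_log.append(entry)
            raise PermissionError(f"tool '{tool}' is outside the agent's boundary")
        entry["result"] = "allowed"
        self.audit_log.append(entry)
        # ... dispatch to the real tool implementation here ...

runner = ToolRunner(allowed_tools={"search", "summarize"})
runner.call("search", query="quarterly report")
try:
    runner.call("delete_records", table="users")   # outside the boundary
except PermissionError as e:
    print(e)

print(runner.audit_log[-1]["result"])  # denied
```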


A new layer of responsibility in AI governance

For organizations deploying AI, this shift introduces a new layer of responsibility. AI safety can no longer be treated as a property of the model alone. It becomes a property of the entire AI system architecture.

This includes:

  • How LLM agents are configured and orchestrated
  • What tools and data sources AI systems can access
  • How decisions are monitored, logged, and audited
  • How failures in AI systems are detected and contained

This perspective aligns closely with practices in cybersecurity, risk management, and distributed systems design. It emphasizes defense in depth, continuous monitoring, and controlled deployment environments.


The path forward for agentic AI systems

The evolution of AI systems points toward a more mature phase of development. Early progress focused on expanding model capabilities and scale. The next phase focuses on integrating those capabilities into robust, production-ready AI systems.

This transition creates opportunities for teams that invest in:

  • AI system architecture and orchestration
  • Agent frameworks and workflow design
  • AI governance and compliance

It also raises the bar for what it means to deploy enterprise AI responsibly.

💡
The contrast between controlled and open deployments highlights the range of possible approaches. Some systems prioritize containment, validation, and safety-first deployment. Others prioritize accessibility, speed, and iteration.

Both approaches contribute to the evolving AI ecosystem.


Closing thoughts on AI system reliability

AI is entering a phase where system design defines success. Models continue to improve, yet their impact depends on how they are embedded within complex, real-world systems.

The concept of “safe models” remains important. At the same time, it represents only one layer of a broader challenge.

For AI professionals, the opportunity lies in bridging the gap between model capability and system reliability. That work defines the next frontier of AI engineering and deployment.

It also answers a question that continues to gain relevance: What makes an AI system truly safe at scale?

Behind the Blog: Jazz and Journalism



This is Behind the Blog, where we share our behind-the-scenes thoughts about how a few of our top stories of the week came together. This week, we discuss the Madonna-whore algorithm, reader tips, and jazz.

SAM: Yesterday morning I published a story I started working on weeks ago and only in the last week or so felt enough distance from the topic to be able to articulate it clearly: My year in the wedding planning social media abyss. The piece is a long, more sourced BTB, and I don’t have a ton to add to what’s said in it, but I do want to highlight some of the comments I’ve gotten so far that touch on things the story doesn’t elaborate on.

The Destroyed Remnants of a Lost World Are Falling to Earth, Scientists Discover

🌘
Subscribe to 404 Media to get The Abstract, our newsletter about the most exciting and mind-boggling science news and studies of the week.


The remnants of a bizarre long-lost world that fell apart before our planet was fully formed are falling to Earth in the form of meteorites, according to a new study in Earth and Planetary Science Letters.

For decades, scientists have puzzled over the origin of angrites, a rare class of about 70 meteorites with unique volcanic compositions that suggest they were forged in a large ancient object with differentiated layers, including a metallic core and a magma ocean.

Scientists have long assumed that this object, the so-called angrite parent body (APB), was roughly a few hundred miles across, similar in size to the asteroid 4 Vesta. But researchers recently raised the tantalizing possibility that the APB might have been much larger, perhaps on the scale of Earth’s moon.

Now, a team led by Aaron Bell, an experimental petrologist and an assistant research professor at the University of Colorado, Boulder, has discovered “the first unequivocal evidence supporting the large angrite parent body hypothesis, which posits that the angrites are samples derived from a protoplanet that was catastrophically disrupted during the earliest evolutionary stages of the inner solar system,” according to the new study.

“It probably got destroyed in the early solar system, so [angrites] are remnants of a lost protoplanet,” Bell said in a call with 404 Media. “A few pieces broke off and are now in the asteroid belt, and a few of them have come to Earth, and we’ve picked them up.”

Angrites date back about 4.56 billion years, making them among the oldest known volcanic rocks. They belong to a class of stony “achondritic” meteorites that contain the crystalized signatures of melted rock, such as basalts, hinting that they originate in larger bodies that underwent some degree of planetary processing and layered differentiation, even if those early planetary embryos never accreted into full planets. 

“Angrites are interesting in that they don’t have a known parent body,” Bell said. “It’s never been definitively identified, and that’s one of the mysteries.”

“There are a bunch of arguments about why angrites are so geochemically unusual,” he added. “They’re kind of this oddity.” 

Most models of early planetary accretion predict that relatively small objects formed within the first few million years of the solar system, which is why the APB was assumed to be an asteroid-sized object, rather than a much larger nascent planet.

While working on a previous study, Bell became interested in an aluminum-rich angrite from Northwest Africa, known as NWA 12774, which was classified in 2019. The meteorite is one of a handful of unusual primitive angrites that appear to have been crystallized at high pressure within the APB, indicating that it formed deep under the surface and therefore might shed light on the size of this bygone world.

“Even among angrites, there’s only four or five that have these primitive compositions,” Bell said, adding that the meteorite had “off-the-charts aluminum content, which is really very unusual.”

Bell and his colleagues developed a geobarometer—a tool that calculates the pressures at which rocks and minerals formed—that estimated it would take at least 1.7 gigapascals to account for the rock’s special properties. This pressure corresponds to an object with a minimum radius of 620 miles (1,000 kilometers), which is just under the size of Pluto. The APB may even have been as large as the Moon, which has a roughly 1,000-mile radius.
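As a rough sanity check on those numbers (this back-of-envelope calculation is not from the paper): if 1.7 GPa is treated as approximately the central pressure of a uniform self-gravitating sphere of assumed rocky density, the standard formula P = (2π/3)Gρ²R² recovers a radius of about 1,000 kilometers.

```python
# Back-of-envelope check (not from the study): central pressure of a uniform
# self-gravitating sphere is P = (2*pi/3) * G * rho^2 * R^2. Solving for R
# with an assumed rocky bulk density roughly reproduces the quoted radius.
import math

G = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2
rho = 3300.0     # assumed bulk density of a rocky body, kg/m^3
P = 1.7e9        # formation pressure from the geobarometer, Pa

R = math.sqrt(P / ((2 * math.pi / 3) * G * rho**2))   # radius in meters
print(f"minimum radius ≈ {R / 1000:.0f} km")          # on the order of 1,000 km
```

The actual study's geobarometry is far more detailed; this only shows the quoted pressure and radius are mutually consistent under simple assumptions.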

“Clearly, within the first few million years of solar system evolution, you could grow planetary embryos that were 1,000-plus kilometers” in radius, Bell said. “We’re talking within three million years of the condensation of the first solids in the solar system, so it’s right at the beginning.”

The discovery suggests that the APB may have been a first-generation protoplanet that coalesced and shattered millions of years before the familiar worlds of our solar system took full shape. Judging by the strange properties of angrites, the APB was also on track to be a very different kind of world than Earth and its neighbors, had it survived the chaotic environment of its infancy. 

Angrites are “geochemically fundamentally different, and that’s why people were interested in the first place—because they were odd,” Bell said. “They don’t look like garden-variety basalts you get from Mars or the Moon or Earth.”

“It’s sort of this path not taken—or maybe it was, but we just have a couple pieces of it that tell us something we didn’t know,” he concluded. “There were once large bodies that, maybe, didn’t look like the terrestrial planets.” 


FAA Scraps Civil and Criminal Penalties for Flying Drones Near ICE Vehicles



On Wednesday the Federal Aviation Administration rescinded a temporary flight restriction (TFR) that had created a no-fly zone within 3,000 feet of “Department of Homeland Security facilities and mobile assets.” The replacement advisory softened the language of the original and abandoned the threat of civil or criminal penalties, but added the Department of Justice to the list of protected agencies.

A 2025 TFR restricted the presence of drones around Department of Energy and Pentagon assets. The FAA added ICE and CBP to the list of restricted agencies in January as ICE began operations in Minneapolis. The no-fly zone covered 3,000 feet around any ICE vehicle, and anyone caught violating it could be fined or jailed. Because ICE agents often drive through the city in unmarked vehicles, it was impossible for drone operators to know if they were violating the order, and local journalists who use drones to take pictures and monitor law enforcement activities were grounded.



Earlier this month, Minnesota journalist Rob Levine sued the FAA over the TFR. In a motion filed earlier this week, Levine’s lawyers argued that the FAA had violated his rights and should rescind the restrictions. Core to their argument were the unmarked vehicles, which they said created a “flotilla of invisible, moving bubbles,” according to court documents. “Under any standard, the TFR’s chilling sweep violates the First Amendment as applied to the Petitioner’s use of drones in photojournalism.”

The FAA replaced the TFR this week after Levine’s lawyers filed the motion. The new advisory lessened restrictions, including dropping the language around 3,000 feet and criminal penalties, but expanded the amount of protected assets. 

“UAS operators are advised to avoid flying in proximity to: Department of War, Department of Energy, Department of Justice, and Department of Homeland Security covered mobile assets,” the new TFR said. “UAS operators who fly within this airspace are warned that…DOW, DOE, DOJ, or DHS may take action that results in the interference, disruption, seizure, damaging, or destruction of unmanned [aircraft] deemed to pose a credible safety or security threat to covered mobile assets.”

Despite the threat to shoot journalists’ drones out of the sky, Levine and his lawyers see the new TFR as a victory. “This is a big win. It was heartbreaking to have my drones grounded at a time of such importance to my community, but I’m looking forward to getting back up there and getting back to my journalism as soon as possible,” Levine said in a statement provided to 404 Media.

Grayson Clary, a lawyer with Reporters Committee for Freedom of the Press who took on Levine’s case, said there is still work to do. “We’re glad to see the FAA rescind its original order, which was an egregious overreach that had serious consequences for reporters nationwide. But this kind of arbitrary back-and-forth from the FAA is exactly the problem, and we intend to make clear to the D.C. Circuit that this restriction never should have been implemented in the first place,” he said.