Why Non-Deterministic LLMs Fail in Multi-Decade Manufacturing Codebases

AI-assisted software development is rapidly becoming part of the automotive manufacturing technology landscape.

Development teams are using AI coding assistants to accelerate code generation, support application modernisation, streamline documentation and reduce the work of maintaining ageing technology estates. The promise is compelling. Systems that once required months of manual investigation can seemingly be understood in a fraction of the time, while modernisation initiatives that previously depended on scarce institutional knowledge can now be supported by AI-generated insights.

However, many automotive manufacturers are discovering a significant gap between the promise of AI-assisted development and its practical performance in real-world environments.

The problem is rarely the AI model itself.

The problem is the quality, accessibility and structure of the system knowledge surrounding the code.

This challenge frequently emerges in legacy-heavy manufacturing environments where critical operational systems have evolved over decades. Production planning platforms, manufacturing execution systems, quality management applications, supplier integration environments and inventory systems have often been modified repeatedly to accommodate new vehicle programmes, changing business requirements, evolving compliance obligations and operational improvements.

The code remains. The systems continue to function.

What is often missing is a structured understanding of how those systems work, why they were designed that way and how they interact with the wider operational environment.

This is why many organisations find that AI development tools struggle to deliver reliable outcomes when applied to legacy environments. AI can only reason effectively about the information it can access. If critical knowledge remains fragmented across technical documentation, spreadsheets, archived project records and institutional expertise, even the most advanced coding assistant will operate with an incomplete view of the environment.

The result is not a technology problem.

It is a knowledge problem.

Legacy Complexity Is a Knowledge Problem Before It Is a Technology Problem

Much of the discussion around AI-assisted development assumes that access to source code is sufficient for intelligent optimisation.

In practice, automotive manufacturing environments reveal why this assumption often falls short.

Large language models are highly capable of analysing code structures, generating documentation and identifying patterns. What they cannot automatically reconstruct is decades of undocumented business logic, operational exceptions, integration dependencies and process-specific knowledge that exist outside the application itself.

Most automotive manufacturers operate technology environments that have evolved over many years rather than being designed as unified architectures.

A production scheduling platform may interact with ERP systems, supplier portals, quality applications, inventory management platforms and manufacturing execution systems. A quality management workflow may depend on engineering specifications, inspection procedures, supplier records and operational data from multiple production lines.

Over time, these environments accumulate complexity.

Business rules become embedded within operational processes. Integration logic becomes dispersed across multiple systems. Technical decisions are documented inconsistently or not at all. Knowledge that was once shared among project teams becomes concentrated within a small number of experienced employees.

As a result, organisations frequently possess large amounts of information but lack a structured understanding of how that information connects.

This creates a significant challenge for AI-assisted development.

Before systems can be optimised, modernised or enhanced, the knowledge embedded within them must first be made visible.

Why AI Development Tools Underperform in Legacy Environments

Many AI coding tools perform exceptionally well when operating within environments where relevant system knowledge is readily available and can be supplied as prompt context.

Legacy manufacturing environments rarely provide this advantage.

The challenge is not simply undocumented code. It is what can be described as an In-Context Retrieval Deficit.

An AI coding assistant examining a bare source code repository suffers from a form of localised amnesia. It can analyse the files immediately available to it, infer patterns within modules and generate syntactically correct code. However, much of the intelligence required to understand how legacy manufacturing systems actually operate exists outside the repository itself.

Cross-module dependencies may be documented in obsolete interface specifications. Integration topologies may be scattered across architecture diagrams and project records created years earlier. Historical incident patterns may reside in service tickets, spreadsheets or operational reports. Exception handling procedures may exist only within engineering work instructions or the institutional knowledge of experienced personnel.

Consider a production planning application that has supported multiple vehicle programmes over a fifteen-year period.

The source code reveals scheduling algorithms and workflow structures. What it does not necessarily reveal are the business decisions that shaped those algorithms, the downstream systems that depend upon them, the historical failures that led to particular design choices or the operational exceptions introduced over time.

From the perspective of the AI model, these relationships effectively do not exist.

As a result, recommendations generated from source code alone are inherently constrained because the model is reasoning against only a partial representation of the environment.

The issue is not model capacity.

Modern large language models possess substantial reasoning capabilities. What they lack is access to the surrounding operational context.

Without a machine-readable representation of dependencies, interfaces, process relationships and historical knowledge, there is no graph topology that can be supplied to the model's context window. The AI can analyse individual components, but it cannot reliably reason across the broader system landscape.

This is why organisations often discover that AI-generated recommendations require extensive human validation when applied to legacy environments.

The problem is not that the model lacks intelligence.

The problem is that the model lacks system visibility.

Until fragmented documentation, operational records and historical artefacts are transformed into structured and machine-readable knowledge assets, AI-assisted development will continue to operate with an incomplete understanding of the environments it is attempting to optimise.

Technical Documentation Is No Longer a Project Artefact. It Is a Context Engineering Layer.

Historically, technical documentation was treated primarily as a project deliverable.

Once applications were deployed, documentation often became static while systems continued to evolve around it. As a result, many legacy environments accumulated large collections of specifications, change records and operational documents that were useful for human reference but poorly suited to machine consumption.

In an AI-assisted development environment, this approach creates significant limitations.

The challenge is no longer simply maintaining documentation.

It is performing what can be described as Document-to-Context Engineering.

Large language models do not derive intelligence solely from source code. Their effectiveness depends heavily on the quality of the context supplied to them. This means that interface specifications, change logs, architecture diagrams, configuration maps, operational procedures and historical records can no longer be treated as static documents intended only for human readers.

They must increasingly be transformed into structured, machine-readable assets. Technical specifications describe why systems were designed in particular ways. Interface definitions describe how applications communicate. Configuration maps reveal dependencies and environmental relationships. Change logs capture the historical evolution of business logic.

Operational records expose exception handling patterns and production realities. Individually, these artefacts provide fragments of understanding.

When structured and normalised, they become a context layer that allows AI systems to reason across the broader environment rather than within isolated code repositories.
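
As a minimal sketch of what structuring one such artefact might look like, the Python below turns free-text change-log lines into structured records and serialises them as JSON. The line format, the `ChangeRecord` fields and the sample entries are assumptions made purely for illustration; real change logs will need their own parsing rules.

```python
import json
import re
from dataclasses import asdict, dataclass

# Assumed (hypothetical) change-log line format:
#   "<ISO date> | <system> | <free-text description>"
LINE_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2})\s*\|\s*([^|]+?)\s*\|\s*(.+)$")


@dataclass
class ChangeRecord:
    date: str
    system: str
    description: str


def parse_change_log(text: str) -> list[ChangeRecord]:
    """Return one record per line that follows the assumed format."""
    records = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:  # lines that do not fit the format are skipped, not guessed at
            records.append(ChangeRecord(*match.groups()))
    return records


# Illustrative sample text, not taken from any real system.
sample = """\
2011-03-14 | scheduling | Added sequencing exception for paint shop line 2
Note: see attached spreadsheet
2016-09-02 | supplier-portal | Changed delivery window validation
"""

records = parse_change_log(sample)
print(json.dumps([asdict(record) for record in records], indent=2))
```

Skipping lines that do not match, rather than guessing at their meaning, keeps the structured output trustworthy; unparsed lines can be routed to a human reviewer instead.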

The challenge facing most organisations is that this information remains dispersed across archived documents, spreadsheets, obsolete specifications and disconnected repositories. Although the knowledge exists, it is rarely organised in a form that can be dynamically retrieved and injected into an AI model's prompt context.

As a result, valuable operational intelligence remains invisible to the systems attempting to analyse it.

This is why forward-looking organisations are beginning to treat technical documentation not as a compliance exercise but as context infrastructure.

The objective is no longer simply preserving documents.

It is transforming documentation into machine-readable configuration files and structured knowledge assets that can continuously supply context to both engineers and AI-assisted development tools.

Because in modern software environments, documentation is no longer just something people read.

It is increasingly something machines reason with.

Structuring System Knowledge for Reliable AI Outcomes

The organisations achieving the greatest value from AI-assisted modernisation are increasingly focusing on knowledge structuring before optimisation.

The objective is not simply to organise documentation.

It is to build a Unified Technical Metagraph.

Most organisations already possess much of the knowledge required to understand their legacy environments. Technical specifications, architecture records, process documentation, application repositories, change histories and operational artefacts all contain valuable intelligence. The problem is that these assets exist in fragmented forms across disconnected systems and incompatible formats.

As a result, critical relationships remain hidden.

A source code repository may contain a function that implements a particular business rule. The architectural rationale for that function may exist within an Architecture Decision Record (ADR). Deployment dependencies may reside within configuration files. Historical failures may be documented within operational runbooks. Process exceptions may be described in spreadsheets maintained by production teams.

Viewed independently, these artefacts provide only partial understanding.

The challenge is to programmatically extract and connect these relationships.

This is where structured data extraction, cleansing and normalisation become critical.
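
Normalisation is the step that lets references from different documents resolve to the same entity. The sketch below shows one simple way this could work, assuming a hand-maintained alias table; the `CANONICAL` mapping and the identifiers in it are hypothetical.

```python
import re
from typing import Optional

# Hypothetical alias table mapping cleaned spellings to one canonical identifier.
CANONICAL = {
    "mes": "manufacturing_execution_system",
    "manufacturing execution system": "manufacturing_execution_system",
    "erp": "erp",
    "sap erp": "erp",
}


def normalise_system_name(raw: str) -> Optional[str]:
    """Lower-case the name, collapse punctuation and whitespace, then look it up."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()
    return CANONICAL.get(cleaned)  # None signals an unknown name needing review


for name in ["MES", "Manufacturing-Execution System", "SAP  ERP", "PLM"]:
    print(repr(name), "->", normalise_system_name(name))
```

Returning `None` for unknown names, instead of inventing an identifier, makes gaps in the alias table visible so they can be resolved by someone who knows the environment.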

Rather than simply digitising documents, the objective is to construct a machine-readable graph that links technical entities across multiple dimensions. A code symbol can be connected directly to an Architecture Decision Record, deployment configuration, interface specification, operational runbook and historical change record. Dependencies become explicit rather than implicit. Business logic becomes traceable rather than hidden.
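
The linking described above can be sketched as a small typed graph. This is an illustrative data structure only, not a description of any particular product; the `Node` kinds, relation labels and artefact names (`calculate_sequence`, `ADR-0001`, `scheduler.yaml`) are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    kind: str  # e.g. "code_symbol", "adr", "config", "runbook"
    name: str


class Metagraph:
    """Labelled links between technical artefacts, stored in both directions."""

    def __init__(self) -> None:
        self._links: dict[Node, set[tuple[str, Node]]] = defaultdict(set)

    def link(self, source: Node, relation: str, target: Node) -> None:
        self._links[source].add((relation, target))
        # Store the reverse link so artefacts can be traced back to code.
        self._links[target].add((f"inverse:{relation}", source))

    def related(self, node: Node) -> list[tuple[str, Node]]:
        links = self._links.get(node, set())
        return sorted(links, key=lambda item: (item[0], item[1].kind, item[1].name))


graph = Metagraph()
rule = Node("code_symbol", "calculate_sequence")
graph.link(rule, "justified_by", Node("adr", "ADR-0001"))
graph.link(rule, "deployed_via", Node("config", "scheduler.yaml"))
graph.link(rule, "documented_in", Node("runbook", "line-stoppage-recovery"))

for relation, node in graph.related(rule):
    print(relation, node.kind, node.name)
```

Storing each link in both directions means a question can start from either end: from a function to its rationale, or from a decision record to the code it governs.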

As these relationships are captured, isolated information begins to evolve into a Unified Technical Metagraph representing how systems, processes and operational knowledge interact across the enterprise.

This creates a stronger foundation for both human decision-making and AI-assisted development.

Instead of reasoning only from application code, AI systems can retrieve contextual information from across the knowledge graph, supplying the model with architectural rationale, dependency structures, operational history and process intelligence.

Through retrieval-augmented generation (RAG), the portions of this multi-dimensional context graph that are relevant to a given task can be retrieved and dynamically injected into the model's prompt context. This enables the AI to reason across relationships that would otherwise remain invisible.
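
A deliberately simplified sketch of the retrieval step follows. It ranks stored knowledge snippets by keyword overlap with the question and assembles the top matches into a prompt. Keyword overlap is a stand-in chosen for brevity; production systems typically use embedding search or graph traversal. The snippet texts and source names are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Snippet:
    source: str  # where the knowledge came from, e.g. an ADR or interface spec
    text: str


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(question: str, snippets: list[Snippet], k: int = 2) -> list[Snippet]:
    """Return up to k snippets sharing at least one token with the question."""
    query = tokens(question)
    scored = [(len(query & tokens(s.text)), s) for s in snippets]
    matching = [item for item in scored if item[0] > 0]
    ranked = sorted(matching, key=lambda item: item[0], reverse=True)
    return [snippet for _, snippet in ranked[:k]]


def build_prompt(question: str, context: list[Snippet]) -> str:
    lines = ["Context:"]
    lines += [f"- [{s.source}] {s.text}" for s in context]
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


snippets = [
    Snippet("ADR-0001", "calculate_sequence batches orders by paint colour to reduce changeovers"),
    Snippet("runbook", "restart the label printer service after a network outage"),
    Snippet("interface-spec", "calculate_sequence output is consumed by the supplier call-off interface"),
]
question = "Is it safe to change calculate_sequence batching?"
context = retrieve(question, snippets)
prompt = build_prompt(question, context)
print(prompt)
```

The resulting prompt carries the architectural rationale and the downstream dependency alongside the question, while the unrelated runbook entry is left out of the context window.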

Without this contextual layer, an AI coding assistant remains largely a sophisticated syntax autocomplete tool.

With it, the assistant becomes capable of operating as an architecturally aware modernisation engine, able to understand not only what the code does, but why it exists, how it interacts with surrounding systems and what operational constraints govern its behaviour.

The difference is not the model.

The difference is the quality and structure of the knowledge environment surrounding it.

Making Dependencies and Logic Flows Visible

One of the biggest challenges in legacy environments is understanding how systems and processes depend on one another.

Dependencies often exist across multiple applications, departments and operational workflows. Logic flows may span technical systems, manual processes and external partners.

Unfortunately, much of this information is rarely maintained in a structured form.

Dependency information may be scattered across technical documentation, interface specifications, process diagrams and historical project records. Operational logic may exist within procedures and work instructions rather than within application code.

This fragmentation creates risk during modernisation initiatives because teams often struggle to understand the potential impact of system changes.

By extracting and organising technical information into structured knowledge assets, organisations can improve visibility into how systems support operational processes.
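
Once dependency information is held in a structured form, the question "what could this change affect?" becomes a graph traversal. The sketch below uses a breadth-first search over inverted dependency edges; the system names and the `DEPENDS_ON` table are hypothetical and stand in for data extracted from real interface specifications.

```python
from collections import deque

# Hypothetical "depends on" edges: each key depends on the systems in its list.
DEPENDS_ON = {
    "mes": ["production_scheduling"],
    "supplier_portal": ["production_scheduling", "inventory"],
    "quality_reporting": ["mes"],
    "inventory": [],
    "production_scheduling": [],
}


def impacted_by(changed: str, depends_on: dict[str, list[str]]) -> list[str]:
    """List every system that directly or transitively depends on `changed`."""
    # Invert the edges so each system maps to the systems that rely on it.
    dependants: dict[str, list[str]] = {}
    for system, requirements in depends_on.items():
        for requirement in requirements:
            dependants.setdefault(requirement, []).append(system)

    seen = {changed}
    queue = deque([changed])
    impacted = []
    while queue:
        current = queue.popleft()
        for dependant in dependants.get(current, []):
            if dependant not in seen:  # guards against cycles and duplicates
                seen.add(dependant)
                impacted.append(dependant)
                queue.append(dependant)
    return impacted


print(impacted_by("production_scheduling", DEPENDS_ON))
```

In this illustrative data, a change to production scheduling reaches quality reporting only indirectly, through the MES, which is exactly the kind of relationship that is easy to miss when dependencies live in separate documents.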

This does not simply improve documentation quality.

It improves understanding.

When dependencies, logic flows and technical context become easier to access, organisations can make more informed decisions regarding system enhancement, migration and optimisation.

AI-assisted development initiatives benefit directly from this improved visibility because the surrounding knowledge environment becomes richer, more consistent and easier to interpret.

From Static Documentation to Reusable Operational Knowledge

The most valuable technical documentation is not documentation that sits unused in a repository. It is documentation that actively supports operational decision-making.

Forward-looking automotive manufacturers are increasingly recognising that technical knowledge should be treated as a long-term business asset. The same information that supports a modernisation initiative today may help future engineering teams understand legacy systems, support operational continuity, accelerate onboarding or inform future transformation programmes.

This is particularly important as organisations continue to invest in AI-enabled development capabilities.

AI tools are most effective when operating within environments where knowledge is structured, accessible and reusable. Organisations that invest in creating these foundations position themselves to derive greater value from both modernisation programmes and future AI initiatives.

The goal is not simply to preserve documentation. The goal is to preserve understanding.

Why Merit Data & Technology

Merit Data & Technology helps organisations address one of the most significant barriers to successful legacy modernisation: fragmented system knowledge.

Through data harvesting, intelligent document processing, data extraction, cleansing and normalisation services, Merit helps organisations transform technical documentation and legacy information assets into structured, reusable knowledge resources.

Technical specifications, process documentation, historical records and operational artefacts can be extracted and organised into formats that improve accessibility, consistency and long-term usability. By transforming fragmented information into structured data assets, organisations gain a clearer understanding of the systems, processes and operational knowledge that support their technology environments.

This structured foundation supports legacy modernisation initiatives by making critical information easier to locate, interpret and reuse. It also helps organisations establish the information foundations required for future AI-assisted development and optimisation efforts. Rather than relying solely on application code, organisations gain access to the broader operational context needed to understand how systems function, interact and support business processes.

The result is not simply better documentation. It is a more complete and trustworthy knowledge foundation that supports informed decision-making, modernisation and long-term operational resilience.

AI-assisted development tools are transforming how organisations approach software engineering and legacy modernisation. However, in automotive manufacturing environments, their effectiveness depends heavily on the quality and structure of the knowledge surrounding the systems they analyse.

Source code alone is rarely enough. Without access to structured technical documentation, operational context, dependencies and process knowledge, AI tools operate with an incomplete understanding of the environments they are attempting to optimise.

The manufacturers that achieve the greatest value from AI-assisted development will not necessarily be those deploying the most advanced coding tools. They will be the ones that first invest in making their system knowledge visible, accessible and reusable. Because ultimately, an AI coding tool is only as smart as the documentation, context and operational intelligence it has available to draw on.

- Authored by Rubaina Rauf & Tharun Mathew

Your AI Coding Tool Is Only as Smart as Your System Documentation: The Legacy Complexity Problem