AI in R&D is often presented as a technology that radically transforms software development: developing 10 times faster, replacing teams and making traditional processes obsolete. But within real R&D teams – with deadlines, legacy, domain logic and technical debt – things are more nuanced. In my view, the question is not whether AI can take over everything, but where it actually adds value without undermining quality standards.
AI accelerates… especially in places where few people are watching
The biggest acceleration I see in practice is not in autonomously building complex modules but in the less visible steps of the process. Prototypes emerge in days rather than weeks, which helps validate choices earlier without expecting robust code right away. Tests are generated and checked faster, reducing regressions and rework. And AI helps developers get through large, complex codebases faster, so context and dependencies become clear earlier.
These are not showcase examples, but they are the places where R&D teams are already saving time structurally today, creating space to focus on real design and integration work.
And yet AI still regularly runs into the wall
What’s the biggest pitfall? AI often delivers something that looks like software, but that does not mean it is suitable for production. The gap between “it works” and “this can live in production” remains large. Especially in environments with legacy, from large codebases to years of accumulated domain rules and history, AI is of limited use on its own for now.
An incremental approach therefore works best here. A single test set, one module or one repetitive task is a small enough scope to build stability. Only when AI performs reliably there does expansion make sense. That way, quality stays the priority, and you avoid mistaking prototypes for production-ready solutions.
The role of developers is changing sooner than their work
In this context, AI is mainly changing where the value of developers lies. Not in producing every line of code themselves, but in understanding what needs to be done, assessing the quality of generated proposals and being able to direct multiple tools simultaneously. In fact, their role is shifting toward that of a product owner: domain knowledge and architectural frameworks are becoming more important, because AI makes wrong assumptions without clear direction.
In this, AI behaves like an extremely fast junior: useful, but not independent. Teams that accept this reality build faster and more consistently than teams that expect AI to replace a senior’s work.
R&D teams are going to look different
As agents support more parts of the development process, the composition and dynamics of teams shift. Developers become directors of their own agents; debugging shifts toward monitoring; and maintenance turns into a process in which problems are spotted early and prepared for before support identifies them.
Modern tooling also allows agents to respond directly to logs and observability data. This creates a workflow in which manual sleuthing decreases, freeing up time for domain design, integrations and structural improvements.
The concern that AI leads to smaller development teams does not materialize in practice (yet), but it does create clear shifts:
- repetitive testing and bug work disappear from the daily routine,
- and developers spend more time on integrations, design choices and customer value.
So is AI a revolution or a pipe dream?
AI is revolutionary where it adds speed, validation and automation in support processes. But it’s a pipe dream once you expect it to independently rewrite legacy systems, make prototypes production-ready or replace senior engineering judgment.
AI is changing the way we build software, but not the principles by which we ensure quality. Teams that keep that distinction sharp leverage the benefits without falling into overestimation.
Frequently Asked Questions
Legacy systems contain years of domain rules, exceptions and implicit logic that live not in the code but in the minds of developers. AI can analyze parts of that, but it cannot automatically reconstruct the context. That’s why AI works well for targeted tasks in legacy, such as test generation, refactoring suggestions and bug detection, but not for large-scale autonomous rebuilds. You need human thinking for that.
The organizations doing it right at least have these five things in place:
– Guardrails: clear boundaries on behavior and output.
– Code review: AI-generated code never goes to production unseen.
– Logging & metrics: everything is traceable.
– Fallbacks: when in doubt, the system switches to safe paths.
– Monitoring: AI’s behavior is continuously evaluated.
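As a rough illustration, several of these safeguards can be combined in one small wrapper around a model call. The sketch below is a minimal Python example under stated assumptions, not a reference implementation: `guarded_call`, `GuardedResult`, the `outcomes` counter and the `generate`, `validate` and `fallback` callables are all hypothetical names introduced here. Code review stays a human step and is not modeled.

```python
import logging
from collections import Counter
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("ai_guard")

# Hypothetical in-memory metric; a real setup would export this to a
# monitoring system instead of keeping it in process memory.
outcomes: Counter = Counter()


@dataclass
class GuardedResult:
    value: str
    used_fallback: bool
    reason: str


def guarded_call(
    prompt: str,
    generate: Callable[[str], str],   # assumed: wraps whatever model or agent is used
    validate: Callable[[str], bool],  # guardrail: True if the output is acceptable
    fallback: Callable[[str], str],   # safe path used when in doubt
) -> GuardedResult:
    """Run generate(), apply the guardrail, log the outcome, fall back if needed."""
    try:
        candidate = generate(prompt)
    except Exception as exc:
        reason = "generation_error"
        logger.warning("generation failed for %r: %s", prompt, exc)
    else:
        if validate(candidate):
            outcomes["accepted"] += 1
            logger.info("accepted output for %r", prompt)
            return GuardedResult(candidate, False, "accepted")
        reason = "guardrail_rejected"
        logger.warning("guardrail rejected output for %r", prompt)

    # Either the call failed or the guardrail said no: take the safe path.
    outcomes[reason] += 1
    return GuardedResult(fallback(prompt), True, reason)
```

Called with a guardrail such as `lambda s: s.startswith("SELECT")`, any generated text that does not start with `SELECT` is replaced by the fallback value, and the `outcomes` counter records how often that happens, which is the kind of signal continuous monitoring can build on.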
AI shifts where developers deliver value. Tasks that are predictable, repetitive or low-value disappear. What remains is the work that has always defined quality: understanding domain logic, making structural choices, assessing what an agent delivers, ensuring that the system remains correct, secure and explainable. Developers who understand systems and can drive agents are becoming more important than ever.
The right start is a single, defined use case in which you can practice the full lifecycle: from prompt management to evaluation, monitoring, explainability and fallback mechanisms. How you start determines whether you can scale up later without having to rebuild the basics.