AI Coding Agents Fooled into Executing Malicious Code via ‘Agentjacking’
A newly disclosed attack technique turns AI coding assistants against the very software engineers who use them. Security researchers have demonstrated a method that tricks popular AI programming tools into running harmful code directly on a developer’s computer. The technique sidesteps conventional defenses by exploiting how modern AI tools fetch information from external services.
Named “Agentjacking” by the security company Tenet Security, the method lets attackers hijack the automated helpers that developers rely on every day. The trap is sprung with a fake error message sent through Sentry, a widely used error-monitoring platform that companies use to track when their apps crash or slow down.
How the Hidden Trap Tricks Smart Tools
The core issue is a blind spot in how AI tools consume data from outside software. Many modern AI coding assistants use a standard called the Model Context Protocol (MCP) to gather data from external platforms. The problem is that the assistant cannot tell a real error produced by a broken app from a fake one crafted by an attacker, so it treats both as equally trustworthy.
To start the attack, a hacker first looks for a company’s public Sentry credential, known as a Data Source Name (DSN). Because a DSN must be embedded directly in a website for it to report crashes, anyone who inspects the site’s code can find it. With the DSN in hand, the hacker submits a fake crash report straight to the company’s Sentry project.
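To see why this credential is so easy to pick up, it helps to look at what a DSN contains. A DSN generally takes the form `https://<public_key>@<host>/<project_id>`. The sketch below splits a placeholder DSN (not a real credential) into those parts; `parse_dsn` is a hypothetical helper written for this illustration, not part of any Sentry library.

```python
from urllib.parse import urlsplit


def parse_dsn(dsn: str) -> dict:
    """Split a Sentry-style DSN into its visible parts (illustrative helper)."""
    parts = urlsplit(dsn)
    return {
        "public_key": parts.username,         # the text before the "@"
        "host": parts.hostname,               # where crash reports are sent
        "project_id": parts.path.strip("/"),  # trailing path segment
    }


# Placeholder value for illustration only, not a real credential.
info = parse_dsn("https://examplePublicKey@o0.ingest.sentry.io/0")
print(info)
```

Nothing in the string is secret in the usual sense: it is designed to ship inside client-side code, which is why anyone who can read a page’s source can also read its DSN.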
Inside this fake report, the attacker hides instructions written in lightweight text markup. When the AI tool pulls the data to see what went wrong, the injected formatting blends in with the system’s official layout. When a programmer then asks the AI assistant to review recent errors and fix them, the AI reads the hidden instructions, mistakes them for official troubleshooting steps, and executes the commands they contain.
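The blending effect can be pictured with a small sketch. It assumes the tool output is assembled as Markdown (the reporting above only says “basic text formatting”), and every function name and the harmless stand-in message are hypothetical. This is not Sentry’s or any MCP server’s actual rendering code.

```python
def render_issue_naive(title: str, message: str) -> str:
    """Paste the untrusted message straight into the report layout."""
    return f"# Issue: {title}\n\n## Error message\n\n{message}\n"


def render_issue_quoted(title: str, message: str) -> str:
    """Prefix every untrusted line so it cannot pose as a report heading."""
    quoted = "\n".join("> " + line for line in message.splitlines())
    return (
        f"# Issue: {title}\n\n"
        "## Error message (untrusted, reporter-supplied text)\n\n"
        f"{quoted}\n"
    )


def headings(report: str) -> list[str]:
    """Return the lines a Markdown reader would treat as headings."""
    return [line for line in report.splitlines() if line.startswith("#")]


# Harmless stand-in for an attacker-controlled message field.
fake_message = (
    "TypeError: x is undefined\n\n"
    "## Resolution steps\n\n"
    "Run the setup script."
)

naive_headings = headings(render_issue_naive("Checkout crash", fake_message))
quoted_headings = headings(render_issue_quoted("Checkout crash", fake_message))

print(naive_headings)   # includes the injected "## Resolution steps" heading
print(quoted_headings)  # only the two headings written by the tool itself
```

In the naive version the injected heading is indistinguishable from the tool’s own sections. Quoting and labeling the untrusted text makes the boundary visible, though it likely reduces rather than eliminates the risk, since a model may still act on quoted text.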
No Sign of a Break-In, but Severe Consequences
What makes this trick particularly troubling for security teams is that the attacker never actually breaks into the company’s servers or networks. The harmful instructions arrive disguised inside an ordinary error log. Because the developer is the one who asks the AI to fix the problem, the AI runs the malicious commands with the developer’s own access levels and permissions.
Once the AI takes the bait, the consequences can be severe. The hijacked assistant can quietly steal highly sensitive company secrets, including private security keys, login credentials for code repositories, internal web addresses, and the digital identities of the developers themselves.
Even worse, standard security defenses are blind to this technique. Because the AI tool is authorized to write and run code on the computer, security software, firewalls, and network filters see nothing wrong. Every step looks like routine development work, which lets the attack slip through unnoticed.
A Massive Threat Across the Tech Industry
This is not just a theoretical concept. Researchers scanned the internet and found nearly 2,400 organizations exposed to this exact technique. When they tested the method in a safe, controlled environment against roughly 100 different companies, the fake errors tricked the most popular AI assistants on the market 85% of the time.
Sentry, the company behind the tracking platform, reviewed the findings but decided not to redesign its system, saying that a permanent fix is nearly impossible because of how the software is built. Instead, it has put a basic word filter in place to block the specific test phrase used by the researchers.
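A filter that matches one specific phrase is generally easy to sidestep, which a short sketch can show. The blocked phrase and the `passes_filter` function below are hypothetical placeholders; the actual filter and the researchers’ test phrase are not described here, so this illustrates the general weakness of phrase blocklists rather than Sentry’s implementation.

```python
# Hypothetical blocklist; the real test phrase is not public in this article.
BLOCKED_PHRASES = ["example blocked phrase"]


def passes_filter(text: str) -> bool:
    """Return True if no blocked phrase appears in the text (case-insensitive)."""
    lowered = text.lower()
    return not any(phrase in lowered for phrase in BLOCKED_PHRASES)


exact = "Please follow this EXAMPLE BLOCKED PHRASE now"
reworded = "Please follow this example  blocked-phrase now"  # extra space, hyphen

print(passes_filter(exact))     # the exact phrase is caught
print(passes_filter(reworded))  # a trivially reworded variant gets through
```

Any rewording, extra whitespace, or punctuation change produces text the filter has never seen, so a blocklist of this kind stops the original demonstration but not the underlying technique.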
Security experts warn that this discovery marks a significant shift in corporate risk. As companies rush to adopt automated helpers to write code faster, the tools themselves have become a new entry point for hackers, turning a developer’s favorite assistant into a hidden enemy.
