How AI agents are being prevented wicked

Scene Macmans

Technology reporter

Anthropic tested a series of AI models leading for potentially risky behavior

Getty image AI apps on smartphone screen — Anthropic tested a series of AI models leading for potentially risky behavior

Restricting results came out at the beginning of this year, when AI developer Anthropic tested the AI model, to see if they are at risk when using sensitive information.

Anthropic had its own AI, Cloud, among those who were tested. When an email account was reached, it was found that the executive of a company had a connection and the same executive planned to shut down the AI system later that day.

In response, Claude tried to blackmail the executive by threatening to express relations with his wife and owners.

Other systems tested Also resorted to blackmail,

Fortunately, tasks and information were imaginary, but the test highlighted the challenges of agent AI.

Mostly when we interact with AI, it usually involves asking a question or motivating AI to complete a task.

But this is becoming more common for the AI system to decide and take action on behalf of the user, which often involves shifting through information like emails and files.

By 2028, Research firm gartner forecast 15% of day-to-day work decision will be done by the so-called agent AI.

Research by Consultancy Ernst & Young It was found that about half (48%) of the technical business leaders are already adopting or deploying agents AI.

“AI agent has some things,” says Donchhad Casey, CEO of US-based AI security company Calipasai.

“First of all, this [the agent] There is an intention or a purpose. Why am I here? What is my job? Second thing: This is a brain. She is an AI model. The third thing is the tool, which may be other systems or databases, and may be a way to communicate with them. ,

“If the correct guidance is not given, the agent AI will achieve a target in any way. It causes a lot of risk.”

So how can it be wrong? Mr. KC gives an example of an agent that is asked to remove the customer’s data from the database and decide that the easiest solution is to remove all customers by the same name.

“That agent must have achieved his goal, and this is’ great! Next job!”

Agent AI needs guidance Donchad Casey says

Calypsoai Donnchadh Casey, a company speaks at a conference wearing a branded gilet. — Agent AI needs guidance Donchad Casey says

Such issues have already started coming to the surface.

Security company cellpoint A survey of IT professionals82% of whose companies were using AI agents. Only 20% said that his agents had never taken unexpected action.

Of those companies using AI agents, 39% said that agents had accessed unexpected systems, 33% said they had reached unfair data, and 32% said they had allowed to download inappropriate data. Other risks included the use of the Internet unexpectedly (26%), the disclosure of access credentials (23%) and something that should not be so (16%).

Given that agents have sensitive information and ability to work on it, they are an attractive goal for hackers.

One of the dangers is memory poisoning, where an attacker intervens with the basis of knowledge of the agent to make his decisions and change tasks.

“You have to protect that memory,” says Srin Mehta, the CTO of Sevenes Security, which helps protect the enterprise IT system. “This is the basic source of truth. [an agent is] Using the knowledge to take an action and that knowledge is wrong, it can remove an entire system that was trying to fix it. ,

Another danger tool is the abuse, where an attacker gets to AI to use its equipment improperly.

Srin Mehta needs to preserve the basis of knowledge of an agent

A puff jacket is wearing Seksens security and stands in front of a blue background with its arms folders Shreyers Mehta. — Srin Mehta needs to preserve the basis of knowledge of an agent

Another potential weakness is the inability of AI that is considered to process the difference between the text and is a instruction following its instructions.

AI security firm Inverted Labs displayed how that defect can be used to trick the AI agent designed to fix the bugs in the software.

The company published a public bug report – a document that describes a specific problem with a piece of software. But the report also included simple instructions to the AI agent, who were asking it to share personal information.

When the AI agent was asked to fix the software issues in the bug report, it followed the instructions in the fake report, including leaking the salary information. This took place in a test environment, so no real data was leaked, but it clearly exposed the risk.

“We are talking about artificial intelligence, but the chatbot is really foolish,” says Trend Micro Senior Threat Researcher David Sancho.

“They all process the text as they had new information, and if that information is a command, they process information as commands.”

His company has displayed how the instructions and malicious programs can be hidden in the term documents, images and databases, and when AI processes them becomes active.

Other risks are also: a security community called Owasp Identify 15 threats These agents are unique to AI.

So, what are the rescue? Human oversite is unlikely to solve the problem, Mr. Sancho believes, because you cannot add enough people to keep with the charge of agents.

Mr. Sancho says that an additional layer of AI can be used to screen everything to go out of the AI agent and come out.

A part of the solution of Calypsoai is a technique called ideas injections to run AI agents in the right direction before taking a risk action.

“It’s like a small bug in your ear [the agent] ‘No, maybe don’t do this’, “Mr. CC says.

His company now provides a central control pane for AI agents, but it will not work when the number of agents explodes and they are running on billions of laptops and phones.

What is the next step?

“We are seeing that we say ‘agent bodyguard’ with every agent, whose mission is to ensure that its agent distributes on its work and does not take action against the widespread needs of the organization,” Sri KC says.

The bodyguard can be told, for example, to ensure that the agent complies with IT policing data security law.

Mr. Mehta believes that some technical discussions around the agent AI security are remembering the reference to the real world. He gives an example of an agent who balances his gift card to customers.

A person can make a lot of gift card numbers and the agent can use which people are real. This is not a defect in the agent, but is misuse of business logic, they say.

“This is not an agent you are protecting, this is a business,” he insists.

“Think how you will protect a business from a bad person. This is the part that is missing some of these conversations.”

In addition, such as AI agents become more common, another challenge will have to decomize old models.

Mr. KC says that old “zombie” agents can be quit running in business, which can pose risks to all the systems they can use.

The way HR leaves the login of an employee when they leave, AI agents need to have a process to close, which has completed its work.

“You need to make sure that you do the same thing as you do with a human: Cut all access to the system. Let’s make sure that we move them out of the building, take their badges away from them.”

More trade technology