aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Maintained by

Truong (Jack) Luu

Information Systems Researcher

Browse All

All tracked items across vulnerabilities, news, research, incidents, and regulatory updates.

6364 items

llm 0.29

info | news
industry
Mar 17, 2026

This is a monthly briefing about LLM (large language model) developments from March 2026, curated by Simon Willison. The content appears to be a sponsorship announcement for a paid email digest service rather than a discussion of a specific AI issue or vulnerability.

Simon Willison's Weblog

Arbitrary code execution via crafted project files in Kiro IDE

high | vulnerability
security
Mar 17, 2026

Kiro IDE, an AI-powered development environment for building autonomous software agents, has a vulnerability (CVE-2026-4295) that allows arbitrary code execution (running unintended commands on a system) when users open malicious project files. The flaw exists in versions before 0.8.0 due to improper trust boundary enforcement (failing to verify that data comes from a safe source).

What the EU AI Act Means for Staffing Businesses

info | regulatory
policy
Mar 17, 2026

The EU AI Act, effective August 2, 2026, classifies AI systems used in hiring and employment decisions (such as candidate screening, ranking, and performance monitoring) as high-risk and requires businesses that deploy them to conduct risk assessments, perform bias testing, maintain human oversight, and provide transparency disclosures. Staffing companies, recruitment platforms, and workforce intermediaries are responsible for compliance even if they did not build the technology, and this obligation applies globally if the AI system affects anyone in the EU.

AI Flaws in Amazon Bedrock, LangSmith, and SGLang Enable Data Exfiltration and RCE

high | news
security
Mar 17, 2026

Researchers discovered that Amazon Bedrock AgentCore Code Interpreter allows outbound DNS queries (DNS is the system that translates domain names to IP addresses) even when configured with no network access, letting attackers steal data and run commands by using DNS as a covert communication channel. Amazon says this is intended functionality and recommends that users switch from sandbox mode to VPC mode (Virtual Private Cloud, an isolated network configuration the customer controls) for better isolation. Separately, a flaw in LangSmith (a tool for managing AI language model workflows) allows attackers to steal user login tokens through URL parameter injection (inserting malicious data into web addresses).

Now everyone in the US is getting Google’s personalized Gemini AI

info | news
industry
Mar 17, 2026

Google has expanded access to its Personal Intelligence feature, which connects various Google apps (like YouTube, Gmail, and Google Photos) to give Gemini (Google's AI assistant) more context for better responses. Previously available only to paid subscribers, this feature is now accessible to free-tier users in the US through Search, Chrome, and the Gemini app, though it remains limited to personal accounts and not business or education accounts.

Tech Giants Invest $12.5 Million in Open Source Security

info | news
policy | industry

Microsoft shakes up Copilot AI leadership team, freeing up Suleyman to build new models

info | news
industry
Mar 17, 2026

Microsoft is reorganizing its AI leadership, moving Jacob Andreou into a new executive role overseeing both consumer and commercial Copilot assistants, while freeing up Mustafa Suleyman to focus on building new AI models as part of Microsoft's superintelligence (AI systems intended to exceed human-level capabilities) efforts. The restructuring comes as Microsoft's Copilot adoption lags significantly behind competitors like ChatGPT and Gemini, and as investors pressure the company to show returns on its AI investments.

Microsoft appoints a new Copilot boss after AI leadership shake-up

info | news
industry
Mar 17, 2026

Microsoft is reorganizing its leadership to unify its Copilot assistant (an AI tool that helps users with tasks) across consumer and business products, which have been developed separately. Microsoft AI CEO Mustafa Suleyman will now focus on building Microsoft's own AI models rather than directly managing Copilot's features for individual users.

The future of code is exciting and terrifying

info | news
industry
Mar 17, 2026

AI coding tools like Claude Code are changing how software development works, with more people able to write code and experienced developers spending less time writing code themselves and more time managing AI agents (programs that can act somewhat autonomously) and projects. The article explores what these rapid changes mean for both the code being produced and the people who create it.

Surf AI Raises $57 Million for Agentic Security Operations Platform

info | news
industry
Mar 17, 2026

Surf AI, a company building an agentic security operations platform (software that uses AI agents, or autonomous programs that take actions without human intervention, to handle security tasks), has announced its launch with $57 million in funding from major investors. The article focuses on the company's funding announcement rather than a specific security issue or problem.

Top 5 Things CISOs Need to Do Today to Secure AI Agents

info | news
security | policy

New font-rendering trick hides malicious commands from AI tools

medium | news
security | safety

Microsoft stops force-installing the Microsoft 365 Copilot app

info | news
industry
Mar 17, 2026

Microsoft has temporarily stopped automatically installing the Microsoft 365 Copilot app (an AI assistant integrated with productivity software like Word and Excel) on Windows devices outside the European Economic Area, though the company has not explained why the rollout was halted. When the automatic installation resumes, IT administrators will be able to disable it through the Microsoft 365 Apps admin center by unchecking the automatic installation setting.

Boosting Active Defense Persistence: A Two-Stage Defense Framework Combining Interruption and Poisoning Against Deepfake

info | research | Peer-Reviewed
security

FORCE: Byzantine-Resilient Decentralized Federated Learning via Game-Theoretic Contribution Aggregation

info | research | Peer-Reviewed
security

The Download: OpenAI’s US military deal, and Grok’s CSAM lawsuit

info | news
security | safety

AWS Bedrock’s ‘isolated’ sandbox comes with a DNS escape hatch

high | news
security
Mar 17, 2026

Researchers discovered that AWS Bedrock's Sandbox mode for AI agents isn't as isolated as promised because it allows outbound DNS queries (requests to translate domain names into IP addresses), which attackers can exploit to secretly communicate with external servers, steal data, or run remote commands. AWS acknowledged the issue but decided not to patch it, calling DNS resolution an 'intended functionality' needed for the system to work properly, and instead updated their documentation to clarify this behavior.

Alibaba launches agentic AI tool for businesses with Slack, Teams integration plans

info | news
industry
Mar 17, 2026

Alibaba released Wukong, a new agentic AI tool (software that can take proactive actions on company systems, not just respond to questions) designed to help businesses manage multiple AI agents through a single interface with planned integration into messaging apps like Slack and Microsoft Teams. The platform handles tasks such as document editing, approvals, and meeting transcription, though the company acknowledges that giving AI agents broad access to company data raises privacy and security concerns.

Open, Closed and Broken: Prompt Fuzzing Finds LLMs Still Fragile Across Open and Closed Models

info | news
security | research

Introducing GPT-5.4 mini and nano

info | news
industry
Mar 17, 2026

OpenAI released GPT-5.4 mini and nano, smaller and faster versions of its GPT-5.4 model designed for high-volume tasks where response speed matters. GPT-5.4 mini runs more than 2x faster than GPT-5 mini and approaches the performance of the full GPT-5.4 model on coding and reasoning tasks. GPT-5.4 nano is the smallest and cheapest option, aimed at simpler jobs like classification and data extraction. These models work best in applications like coding assistants, AI subagents (specialized AI components that handle specific subtasks), and systems that interpret screenshots, where being fast and cost-effective is more important than raw capability.

AWS Security Bulletins
EU AI Act Updates

Fix: For Amazon Bedrock: migrate from Sandbox mode to VPC mode, implement a DNS firewall to filter outbound DNS traffic, audit IAM roles to follow the principle of least privilege (giving services only the minimum permissions they need), and use strict security groups and network ACLs. For LangSmith: update to version 0.12.71 or later (released December 2025), which addresses the token theft vulnerability.
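The DNS filtering step in the fix above can be illustrated with a minimal defensive sketch. It flags query names whose subdomain labels are unusually long or random-looking, a common sign that data is being encoded into DNS lookups. The function names and both thresholds (`max_label_len`, `max_entropy`) are assumptions chosen for illustration; they do not come from AWS or from the researchers, and a production DNS firewall would rely on allowlists and vendor rules rather than this heuristic alone.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of a string in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_dns_tunnel(
    qname: str,
    max_label_len: int = 40,   # assumed threshold, tune per environment
    max_entropy: float = 3.8,  # assumed threshold, tune per environment
) -> bool:
    """Heuristically flag a DNS query name that may carry encoded data."""
    labels = [label for label in qname.rstrip(".").split(".") if label]
    # Simplification: treat the last two labels as the registered domain
    # and inspect only the labels to the left of it.
    for label in labels[:-2]:
        if len(label) > max_label_len:
            return True
        if len(label) >= 16 and shannon_entropy(label.lower()) > max_entropy:
            return True
    return False


if __name__ == "__main__":
    for name in (
        "bedrock-runtime.us-east-1.amazonaws.com",
        "k7q2m9x4b1z8w3v6n5t0.attacker.example",
    ):
        print(name, looks_like_dns_tunnel(name))
```

A check like this would sit in a DNS log pipeline or resolver hook; it complements, and does not replace, moving the workload to VPC mode.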

The Hacker News
The Verge (AI)
Mar 17, 2026

Five major technology companies (Anthropic, AWS, Google, Microsoft, and OpenAI) have collectively invested $12.5 million in the Linux Foundation (a nonprofit organization that maintains critical open source software) to support long-term security improvements in open source projects. The funding aims to strengthen the security of widely used software that many other programs depend on.

SecurityWeek
CNBC Technology
The Verge (AI)
The Verge (AI)
SecurityWeek
Mar 17, 2026

AI agents are autonomous software systems that can plan, decide, and act independently across connected systems, often without human oversight, creating significant security risks that traditional guardrails (like prompt filtering) cannot adequately address. The article argues that identity-based access control, rather than prompt restrictions or network controls, is the foundation for securing AI agents. CISOs must treat AI agents as first-class identities, shift from guardrails to strict access control, and eliminate shadow AI (unauthorized agents) through continuous discovery and visibility of agent identities.

BleepingComputer
Mar 17, 2026

Researchers discovered a font-rendering attack that hides malicious commands from AI assistants by using custom fonts and CSS styling to display one message to users while keeping harmless text visible to AI tools analyzing the webpage's HTML. The attack successfully tricked multiple popular AI assistants (like ChatGPT, Claude, and Copilot) into giving false safety assessments, exploiting the gap between what an AI reads in code and what a user actually sees rendered in their browser.

Fix: Microsoft was the only vendor that fully accepted and addressed the issue. LayerX recommends that AI assistants analyze both the rendered visual page and the underlying code and compare the two to better evaluate safety. Additional recommendations to AI vendors include treating fonts as a potential attack surface and extending code parsers to scan for foreground/background color matches, near-zero opacity text, and abnormally small fonts.
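The parser checks named in the recommendations above (matching foreground and background colors, near-zero opacity, abnormally small fonts) can be sketched with Python's standard `html.parser` module. This is a simplified illustration, not LayerX's tooling: the class name `HiddenTextScanner` and its thresholds are hypothetical, it inspects inline `style` attributes only, and it does not detect the custom-font glyph remapping at the center of the attack, which requires comparing the rendered page with the markup.

```python
import re
from html.parser import HTMLParser


def parse_inline_style(style: str) -> dict:
    """Split an inline style string into a {property: value} mapping."""
    props = {}
    for declaration in style.split(";"):
        if ":" in declaration:
            name, value = declaration.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props


def numeric_prefix(value: str):
    """Return the leading number of a CSS value, or None if there is none."""
    match = re.match(r"\s*([0-9]*\.?[0-9]+)", value)
    return float(match.group(1)) if match else None


class HiddenTextScanner(HTMLParser):
    """Collect (tag, reason) pairs for inline styles that can hide text."""

    def __init__(self, min_font_px: float = 4.0, min_opacity: float = 0.1):
        super().__init__()
        self.min_font_px = min_font_px  # assumed threshold
        self.min_opacity = min_opacity  # assumed threshold
        self.findings = []

    def handle_starttag(self, tag, attrs):
        style = dict(attrs).get("style")
        if not style:
            return
        props = parse_inline_style(style)
        color = props.get("color")
        if color and color == props.get("background-color"):
            self.findings.append((tag, "text color matches background"))
        opacity = numeric_prefix(props.get("opacity", ""))
        if opacity is not None and opacity < self.min_opacity:
            self.findings.append((tag, "near-zero opacity"))
        font_size = props.get("font-size", "")
        if font_size.endswith("px"):
            size = numeric_prefix(font_size)
            if size is not None and size < self.min_font_px:
                self.findings.append((tag, "abnormally small font"))


if __name__ == "__main__":
    scanner = HiddenTextScanner()
    scanner.feed('<span style="opacity:0.02">hidden</span>')
    print(scanner.findings)
```

Styles applied through stylesheets or scripts are out of scope for this sketch; covering them would require computed styles from a rendering engine.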

BleepingComputer

Fix: According to the source, when automatic installation resumes, IT administrators can opt out by: signing into the Microsoft 365 Apps admin center, navigating to Customization > Device Configuration > Modern App Settings, selecting the Microsoft 365 Copilot app, and clearing the 'Enable automatic installation of Microsoft 365 Copilot app' checkbox. Additionally, the source mentions that Microsoft is testing a new policy called RemoveMicrosoftCopilotApp that would allow IT admins to uninstall Copilot from devices managed via Microsoft Intune or System Center Configuration Manager (SCCM, software for managing large numbers of computers).

BleepingComputer
research
Mar 17, 2026

This research addresses a weakness in active defense systems against deepfakes (AI-generated fake videos or images): these defenses often fail when attackers retrain their models on protected samples. The authors propose a Two-Stage Defense Framework (TSDF) that uses dual-function adversarial perturbations (carefully designed noise patterns that disrupt both the deepfake output and the attacker's retraining process) to make defenses more persistent by poisoning the data (corrupting the training information) that attackers would use to adapt their models.

Fix: The source describes the proposed defense framework (TSDF) as the solution but does not mention an existing patch, update, or mitigation for current systems. The paper presents the framework as a research contribution rather than a fix for deployed software.

IEEE Xplore (Security & AI Journals)
research
Mar 17, 2026

Decentralized Federated Learning (DFL, a way for multiple computers to train AI models together without a central server) is vulnerable to Byzantine attacks (when malicious participants send bad data to sabotage the learning process). The paper proposes FORCE, a new method that uses game theory concepts (mathematical models of strategy and fairness) to identify and exclude malicious clients by checking their model loss (how well their models perform) instead of checking gradients (the direction to improve), making DFL more resistant to these attacks.
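The general idea of screening peers by model loss rather than by gradients can be sketched as follows. This is not the FORCE algorithm: the paper's game-theoretic contribution aggregation is not reproduced here, and a simple median-based outlier rule is swapped in purely for illustration. The function name, the `tolerance` parameter, and the assumption that each node evaluates peer models on its own local validation data are all hypothetical.

```python
from statistics import median


def filter_peers_by_loss(peer_losses: dict, tolerance: float = 3.0) -> list:
    """Keep peers whose validation loss is not an upward outlier.

    peer_losses maps a peer id to the loss of that peer's model as measured
    on the local node's own validation data (an assumed setup).
    """
    losses = list(peer_losses.values())
    center = median(losses)
    # Median absolute deviation: a spread estimate that a few extreme
    # (possibly malicious) values cannot easily distort.
    spread = median([abs(loss - center) for loss in losses])
    threshold = center + tolerance * max(spread, 1e-12)
    return sorted(peer for peer, loss in peer_losses.items() if loss <= threshold)


if __name__ == "__main__":
    losses = {"a": 0.50, "b": 0.55, "c": 0.52, "d": 4.0}
    print(filter_peers_by_loss(losses))
```

Only the models of the retained peers would then be aggregated; a loss check of this kind needs no access to peers' gradients.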

IEEE Xplore (Security & AI Journals)
Mar 17, 2026

This newsletter roundup covers multiple AI-related developments, including OpenAI's partnership with the US military (potentially for applications like selecting strike targets), xAI's Grok facing a lawsuit over generating non-consensual intimate images (deepfakes, or synthetic media created to impersonate real people), and China approving the world's first commercial brain chip (a BCI, or brain-computer interface that reads signals from the brain) for medical use. The piece also highlights concerns from AI safety experts, including OpenAI's own wellbeing team opposing a new 'adult mode' feature.

MIT Technology Review

Fix: AWS updated documentation to clarify that Sandbox mode permits DNS resolution. Security teams should inventory all active AgentCore Code Interpreter instances and migrate to VPC mode (a more restricted network environment).

CSO Online
CNBC Technology
Mar 17, 2026

Researchers created a genetic algorithm-inspired prompt fuzzing method (automatically generating variations of harmful requests while keeping their meaning) that found significant weaknesses in guardrails (safety systems protecting LLMs) across multiple AI models, with evasion rates ranging from low to high depending on the model and keywords used. The key risk is that while individual jailbreak attempts (tricking an AI to ignore its safety rules) may have low success rates, attackers can automate this process at scale to reliably bypass protections. This matters because LLMs are increasingly used in customer support and internal tools, so guardrail failures can lead to safety incidents and compliance problems.

Fix: The source recommends five mitigation strategies: treating LLMs as non-security boundaries, defining scope, applying layered controls, validating outputs, and continuously testing GenAI with adversarial fuzzing (automated testing with malicious inputs) and red-teaming (simulated attacks to find weaknesses). Palo Alto Networks customers can use Prisma AIRS and the Unit 42 AI Security Assessment products for additional protection.
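The scaling risk noted in the summary, that low per-attempt success rates still add up under automation, follows from basic probability. If each attempt bypasses the guardrail independently with probability p, the chance of at least one bypass in n attempts is 1 - (1 - p)^n. The sketch below computes this; the 1% rate and 500 attempts are illustrative numbers, not figures from the Unit 42 research, and independence between attempts is a simplifying assumption.

```python
def prob_at_least_one_bypass(per_attempt_rate: float, attempts: int) -> float:
    """Probability of one or more successes in independent attempts."""
    if not 0.0 <= per_attempt_rate <= 1.0:
        raise ValueError("per_attempt_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return 1.0 - (1.0 - per_attempt_rate) ** attempts


if __name__ == "__main__":
    # Hypothetical example: a 1% per-attempt rate over 500 automated attempts.
    print(round(prob_at_least_one_bypass(0.01, 500), 3))  # about 0.993
```

Under these assumptions, a guardrail that blocks 99% of individual attempts is bypassed almost surely by a fuzzer that can try a few hundred variations, which is why the fix treats the model as a non-security boundary.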

Palo Alto Unit 42
OpenAI Blog