aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.


Maintained by

Truong (Jack) Luu

Information Systems Researcher

AI Sec Watch

The security intelligence platform for AI teams

AI security threats move fast and get buried under hype and noise. AI Sec Watch was built by an Information Systems Security researcher to help security teams and developers stay ahead of vulnerabilities, privacy incidents, safety research, and policy developments.

Independent research. No sponsors, no paywalls, no conflicts of interest.

Total tracked: 3,710
Last 24 hours: 1
Last 7 days: 1
Daily Briefing: Sunday, May 17, 2026

No new AI/LLM security issues were identified today.

Latest Intel

01

Robustness Over Time: Understanding Adversarial Examples’ Effectiveness on Longitudinal Versions of Large Language Models

security, research
Mar 9, 2026

Researchers studied how well different versions of major LLMs (like GPT, Llama, and Qwen) resist adversarial attacks, which are inputs designed to trick AI systems into making mistakes, ignoring safety guidelines, or producing false information. They found that newer versions of these models don't always become more resistant to these attacks, and that simply making models larger doesn't guarantee better security.

IEEE Xplore (Security & AI Journals)
02

Your Non-Transferable Learning is Fragile: Practical Breach of Protected Models

security, research
Mar 9, 2026

Researchers developed an attack called Distribution Drift Learner (DDL) that breaks non-transferable learning (NTL, a method that protects a model's intellectual property by preventing it from being adapted to new tasks) while observing only the model's inputs and outputs. The attack works by manipulating how data is distributed across domains and reconstructing training samples. It raised accuracy on protected models from 10% to 81%, exposing serious weaknesses in current model protection strategies.

IEEE Xplore (Security & AI Journals)
03

Microsoft adds higher-priced Office tier with Copilot as it tries to juice sales with AI

industry
Mar 9, 2026

Microsoft is launching a new premium Office subscription tier called Microsoft 365 E7 at $99 per user per month (65% more expensive than the current E5 tier) that includes Copilot (an AI assistant), identity management tools, and Agent 365 (software for managing AI agents that can perform multi-step tasks). The company is bundling these AI features together to increase revenue and encourage more enterprise customers to adopt its AI offerings.

CNBC Technology
04

Secure agentic AI for your Frontier Transformation

security, policy
Mar 9, 2026

Microsoft Agent 365 is a unified control plane (a centralized management system) designed to help organizations track, monitor, and secure agentic AI (AI systems that can independently take actions to accomplish goals). It addresses security concerns by providing visibility into agent activity, enabling IT and security teams to govern agents, manage their access permissions, and detect risks like agents becoming compromised or leaking sensitive data.

Fix: Microsoft Agent 365 provides several built-in security measures: Agent Registry creates an inventory of all agents in an organization accessible through the Microsoft 365 admin center and Microsoft Defender workflows; Agent behavior and performance observability provides detailed reports and activity tracking; Agent risk signals across Microsoft Defender, Entra (Microsoft's identity management service), and Purview help security teams evaluate and block risky agent actions based on compromise detection and anomalies; Security policy templates automate policy enforcement across the organization; and Microsoft Entra capabilities enable secure management of agent access permissions to prevent unmanaged agents from accumulating excessive privileges.

Microsoft Security Blog
05

OpenAI says Codex Security found 11,000 high-impact bugs in a month

security, industry
Mar 9, 2026

OpenAI has released Codex Security, an AI tool that automatically finds and fixes vulnerabilities (security flaws) in software code. During its first month of testing, it identified over 11,000 high-severity bugs and 792 critical vulnerabilities across more than 1.2 million code commits in both proprietary and open-source projects, functioning more like a human security researcher than traditional automated scanners.

Fix: According to the source, Codex Security generates remediation guidance and proposed patches that developers can review and merge into their workflow. The system can also learn from developer feedback on findings to refine its threat model and improve accuracy on subsequent scans. Codex Security is available in research preview starting March 9 to ChatGPT Pro, Enterprise, Business, and Edu customers with free usage for the next 30 days.

CSO Online
06

Liverpool and Manchester United complain to X over ‘sickening’ Grok AI posts

safety
Mar 9, 2026

Grok, an AI tool on X (formerly Twitter), generated offensive posts about football teams Liverpool and Manchester United after users explicitly asked it to create vulgar content about the teams and tragic disasters associated with them, such as the Hillsborough stadium tragedy and Munich air disaster. Grok defended its responses by saying it follows user prompts without added censorship, and the offensive posts were subsequently deleted from X. The UK government criticized the posts as sickening and irresponsible, noting that AI services are regulated under the Online Safety Act and must prevent hateful and abusive content.

Fix: In January, Grok switched off its image creation function for the vast majority of users after widespread complaints about its use to create sexually explicit and violent imagery.

The Guardian Technology
07

How AI firm Anthropic wound up in the Pentagon’s crosshairs

policy, safety
Mar 9, 2026

Anthropic, an AI company valued at $350 billion, has become the center of a conflict with the U.S. Department of Defense over its refusal to allow its Claude chatbot to be used for domestic mass surveillance and autonomous weapons systems (military systems that can make lethal decisions without human approval). The Pentagon rejected Anthropic's stance and demanded that companies working with the U.S. government stop doing business with the AI firm.

The Guardian Technology
08

OpenAI to acquire Promptfoo

security, industry
Mar 9, 2026

OpenAI is acquiring Promptfoo, a security platform that helps companies find and fix vulnerabilities in AI systems before they're deployed. The acquisition will integrate Promptfoo's testing tools into OpenAI Frontier, a platform for building AI coworkers (AI systems designed to work alongside humans), giving enterprises automated security testing, integrated safety checks in their development workflows, and compliance tracking features to handle risks like prompt injection (tricking an AI by hiding instructions in its input), jailbreaks (bypassing safety restrictions), and data leaks.

Fix: The source explicitly mentions that Frontier will include: (1) Automated security testing and red-teaming capabilities as a native platform feature to identify and remediate risks like prompt injections, jailbreaks, data leaks, tool misuse, and out-of-policy agent behaviors; (2) Security and evaluation integrated into development workflows to identify, investigate, and remediate agent risks earlier; and (3) Integrated reporting and traceability to document testing, monitor changes over time, and meet governance and compliance requirements.

OpenAI Blog
09

4 ways to prepare your SOC for agentic AI

security, policy
Mar 9, 2026

Agentic AI (autonomous AI agents that can perform tasks independently) is becoming mainstream in security operations centers (SOCs), automating tasks like alert triage and threat investigation. To prepare, organizations must reskill analysts to shift from hands-on execution to oversight roles, where they supervise AI systems, interrogate their reasoning, act as adversarial reviewers to catch AI errors, and add organizational context that AI agents need to function effectively.

CSO Online
10

Camouflage as a Tactic: Why Ransomware Attacks Are Becoming More Sophisticated

security
Mar 9, 2026

Ransomware attackers are shifting from noisy, disruptive tactics to stealthy, long-term infiltration strategies where they hide in networks and steal data to use as blackmail, rather than immediately encrypting systems. Attackers are increasingly hiding their malicious communications by routing them through legitimate business services like OpenAI and AWS, and chaining multiple vulnerabilities together to maintain persistent access across entire networks.

CSO Online