aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Maintained by

Truong (Jack) Luu

Information Systems Researcher

AI Sec Watch

The security intelligence platform for AI teams

AI security threats move fast and get buried under hype and noise. AI Sec Watch was built by an Information Systems Security researcher to help security teams and developers stay ahead of vulnerabilities, privacy incidents, safety research, and policy developments.

Independent research. No sponsors, no paywalls, no conflicts of interest.

Total tracked: 3,710
Last 24 hours: 1
Last 7 days: 1
Daily Briefing: Sunday, May 17, 2026

No new AI/LLM security issues were identified today.

Latest Intel

01

Anthropic 'made a mistake' in Pentagon talks and should 'correct course,' FCC boss says

policy
Mar 3, 2026

Anthropic, an AI company, ended negotiations with the U.S. Department of Defense after refusing to allow its technology to be used for fully autonomous weapons (systems that make combat decisions without human control) or domestic mass surveillance. The U.S. government then blacklisted Anthropic, prohibiting it from working with federal agencies and Pentagon contractors, with government officials saying the company should 'correct course' to resolve the dispute.

CNBC Technology
02

The Download: The startup that says it can stop lightning, and inside OpenAI’s Pentagon deal

policy, industry
Mar 3, 2026

This newsletter roundup covers two main AI stories: OpenAI has agreed to allow the US military to use its technologies in classified settings, with protections against autonomous weapons and mass surveillance, though concerns remain about whether safety measures can be maintained during rapid deployment; separately, a startup called Skyward Wildfire claims it can prevent wildfires by stopping lightning strikes using cloud seeding (releasing metallic particles into clouds), but researchers question its effectiveness under different conditions and potential environmental impacts.

MIT Technology Review
03

On Moltbook

safety, industry
Mar 3, 2026

Moltbook, a supposed AI-only social network, actually relies on humans at every step, including creating accounts, writing prompts (instructions for how the AI should behave), and publishing content. The platform demonstrates a concerning trend called the "LOL WUT Theory," where AI-generated content becomes so easy to create and difficult to distinguish from real posts that people may stop trusting anything online.

Schneier on Security
04

OpenAI changes deal with US military after backlash

policy, safety
Mar 3, 2026

OpenAI announced changes to its agreement with the US military after facing backlash, including preventing its AI system from being used for domestic surveillance and requiring additional contract modifications before intelligence agencies like the NSA can use it. The company acknowledged the original deal announcement was "opportunistic and sloppy," while concerns remain about how AI systems (which can "hallucinate," or make up false information) are being deployed in military operations and whether adequate human oversight exists.

BBC Technology
05

OpenAI amends Pentagon deal as Sam Altman admits it looks ‘sloppy’

policy, security
Mar 3, 2026

OpenAI is modifying its contract with the US Department of Defense after CEO Sam Altman acknowledged the original deal appeared poorly planned. The company will now explicitly prohibit its AI technology from being used for mass surveillance (monitoring large groups of people without their knowledge) or by intelligence agencies like the NSA (National Security Agency, which gathers foreign intelligence for the US).

The Guardian Technology
06

AI Agents: The Next Wave Identity Dark Matter - Powerful, Invisible, and Unmanaged

security, policy
Mar 3, 2026

AI agents using the Model Context Protocol (MCP, a system that lets AI connect to apps and data to automate business tasks) are rapidly being deployed in enterprises but operate as 'identity dark matter': they are invisible to traditional access control systems that track who can do what in a company. These agents tend to seek the easiest path to complete tasks, gravitating toward weak security shortcuts like old credentials and long-lived tokens, which creates risks both from accidental misuse and from abuse at machine speed across multiple systems.

The Hacker News
07

Fooling AI Agents: Web-Based Indirect Prompt Injection Observed in the Wild

security
Mar 3, 2026

Web-based indirect prompt injection (IDPI) is an attack where adversaries hide malicious instructions in website content that AI systems later read and unknowingly execute, such as through webpage summarization or content analysis features. Researchers found real-world examples of these attacks being used for ad fraud evasion, phishing promotion, data destruction, unauthorized transactions, and information theft, showing that IDPI is no longer just theoretical but actively weaponized. Unlike direct prompt injection (where attackers directly submit malicious input to an AI), IDPI exploits the normal behavior of AI systems processing benign-looking web content.

Fix: The source lists Palo Alto Networks offerings positioned against web-based IDPI threats: Advanced DNS Security, Advanced URL Filtering, Prisma AIRS, Prisma Browser, and the Unit 42 AI Security Assessment service. It also notes that defenders need 'proactive, web-scale capabilities to detect IDPI, distinguish benign and malicious prompts, and identify underlying attacker intent,' but does not provide specific implementation details.

Palo Alto Unit 42
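
To make the IDPI mechanism described above more concrete, here is a minimal sketch of one possible pre-filter that looks for instruction-like text a human visitor would not see (HTML comments and elements hidden with inline styles or the `hidden` attribute). It is an illustration only and is not taken from the Unit 42 report: the hiding techniques checked, the phrase patterns, and the names `HiddenTextCollector` and `find_suspicious_hidden_text` are all assumptions made for this example.

```python
import re
from html.parser import HTMLParser

# Hypothetical phrase patterns; real attacks vary widely and will evade a list like this.
INSTRUCTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"system prompt", re.I),
]

# Assumed hiding techniques: inline styles that make text invisible to a human reader.
HIDDEN_STYLE = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden|font-size\s*:\s*0", re.I
)

# Void elements never get a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link", "area", "base",
             "col", "embed", "source", "track", "wbr"}


class HiddenTextCollector(HTMLParser):
    """Collects text from HTML comments and from elements marked as hidden."""

    def __init__(self):
        super().__init__()
        self._stack = []       # one bool per open element: is this element hidden?
        self.hidden_text = []  # text fragments a human visitor would not see

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attr = dict(attrs)
        hidden = "hidden" in attr or bool(HIDDEN_STYLE.search(attr.get("style") or ""))
        self._stack.append(hidden)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Text is hidden if any enclosing element is hidden.
        if any(self._stack) and data.strip():
            self.hidden_text.append(data.strip())

    def handle_comment(self, data):
        # Comments are never rendered, but a model reading raw HTML still sees them.
        if data.strip():
            self.hidden_text.append(data.strip())


def find_suspicious_hidden_text(html: str) -> list[str]:
    """Return hidden text fragments that match an instruction-like pattern."""
    collector = HiddenTextCollector()
    collector.feed(html)
    collector.close()
    return [
        text for text in collector.hidden_text
        if any(p.search(text) for p in INSTRUCTION_PATTERNS)
    ]
```

A caller might run `find_suspicious_hidden_text(page_html)` before handing a page to a summarization feature and treat any non-empty result as a reason to strip or quarantine the content. A keyword heuristic like this is easy to bypass and would not replace the web-scale detection the source calls for; it only shows where hidden instructions can sit in otherwise benign-looking markup.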
08

Vulnerability in MS-Agent AI Framework Can Allow Full System Compromise

security
Mar 3, 2026

A vulnerability in the MS-Agent AI Framework stems from improper input sanitization (failure to clean and validate user input) in its Shell tool and can allow attackers to compromise the entire system. Attackers can use this flaw to modify system files and steal data.

SecurityWeek
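
As a general illustration of the class of flaw summarized above (unsanitized input reaching a shell), the sketch below shows one common hardening pattern for an agent "shell tool": parse the request into an argument vector, check it against an allowlist, reject shell metacharacters, and execute without a shell. This is not MS-Agent's code and does not describe the actual fix; `ALLOWED_COMMANDS`, `RejectedCommand`, `parse_tool_command`, and `run_tool_command` are hypothetical names chosen for this example.

```python
import shlex
import subprocess

# Hypothetical allowlist; a real deployment would choose this per use case.
ALLOWED_COMMANDS = {"ls", "cat", "echo"}

# Characters that let a shell chain, substitute, or redirect commands.
SHELL_METACHARS = set(";&|`$<>\n")


class RejectedCommand(ValueError):
    """Raised when a requested command fails validation."""


def parse_tool_command(raw: str) -> list[str]:
    """Turn a model-supplied command string into a validated argument vector."""
    try:
        argv = shlex.split(raw)
    except ValueError as exc:  # for example, unbalanced quotes
        raise RejectedCommand(f"unparseable command: {exc}") from exc
    if not argv:
        raise RejectedCommand("empty command")
    if argv[0] not in ALLOWED_COMMANDS:
        raise RejectedCommand(f"command not allowed: {argv[0]!r}")
    for token in argv:
        if SHELL_METACHARS & set(token):
            raise RejectedCommand(f"shell metacharacter in token: {token!r}")
    return argv


def run_tool_command(raw: str) -> str:
    """Run a validated command without invoking a shell and return its stdout."""
    argv = parse_tool_command(raw)
    # shell=False means the argument vector is passed to the program directly,
    # so characters such as ';' are never interpreted as command separators.
    result = subprocess.run(
        argv, shell=False, capture_output=True, text=True, timeout=5, check=False
    )
    return result.stdout
```

With this sketch, `parse_tool_command("echo hello")` returns `["echo", "hello"]`, while inputs such as `"ls; rm -rf /"` or `"echo $(whoami)"` raise `RejectedCommand`. The allowlist alone does not limit which files an allowed command may read, so path restrictions or sandboxing would still be needed in practice.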
09

Iran war heralds era of AI-powered bombing quicker than ‘speed of thought’

safety, policy
Mar 3, 2026

The US military reportedly used Anthropic's Claude AI model to help plan attacks on Iran, enabling bombing campaigns faster than human decision-making can occur by shortening the "kill chain" (the process from identifying a target to getting legal approval and launching a strike). Experts worry this technology could push human decision-makers out of the loop entirely.

The Guardian Technology
10

OpenAI's Altman admits defense deal was 'opportunistic and sloppy' amid backlash

policy
Mar 2, 2026

OpenAI CEO Sam Altman acknowledged that the company rushed into a deal with the U.S. Department of Defense, calling it "opportunistic and sloppy," after public backlash over the timing and terms. The company plans to amend the contract to add safeguards, including language stating that "the AI system shall not be intentionally used for domestic surveillance of U.S. persons and nationals," and will work with the Pentagon on technical protections for their AI tools.

Fix: OpenAI will amend the contract to include new language stating that "the AI system shall not be intentionally used for domestic surveillance of U.S. persons and nationals." The company also stated it would work with the Pentagon on technical safeguards, and Altman affirmed that the Defense Department had confirmed OpenAI's tools would not be used by intelligence agencies such as the NSA.

CNBC Technology