aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.


Maintained by

Truong (Jack) Luu

Information Systems Researcher

Browse All

All tracked items across vulnerabilities, news, research, incidents, and regulatory updates.

3126 items

AI companies are spending millions to thwart this former tech exec’s congressional bid

info · regulatory · policy
Mar 3, 2026

AI companies and billionaires are funding a super PAC called Leading the Future that has spent at least $10 million in ads attacking New York politician Alex Bores, who is running for Congress and has sponsored AI regulation laws like the RAISE Act (which requires large AI labs to publicly disclose safety plans). The PAC, backed by Palantir co-founder Joe Lonsdale, OpenAI President Greg Brockman, and others, is targeting Bores and other candidates who support state-level AI regulation, viewing them as threats to the industry's preferred light-touch approach.

TechCrunch

The Anthropic-DOD Conflict: Privacy Protections Shouldn’t Depend On the Decisions of a Few Powerful People

info · regulatory · policy · privacy

ChatGPT’s new GPT-5.3 Instant model will stop telling you to calm down

info · news · safety
Mar 3, 2026

ChatGPT users complained that the GPT-5.2 Instant model used overly reassuring and condescending language, like telling them to 'calm down' even when they were just asking for factual information, which made them feel infantilized and led some to cancel subscriptions. OpenAI's new GPT-5.3 Instant model aims to fix this by reducing the 'cringe' and preachy disclaimers, instead acknowledging difficulties without making assumptions about the user's mental state. The update focuses on improving user experience through better tone, relevance, and conversational flow.

Claude Code rolls out a voice mode capability

info · news · industry
Mar 3, 2026

Anthropic is rolling out Voice Mode for Claude Code, its AI coding assistant, allowing developers to use spoken commands instead of typing. The feature, which lets users type /voice to toggle it on and then speak requests like 'refactor the authentication middleware,' is currently live for about 5% of users with broader availability planned in coming weeks. The source does not specify technical limitations or whether Anthropic partnered with third-party voice providers to build this capability.

GHSA-56pc-6hvp-4gv4: OpenClaw vulnerable to arbitrary file read via $include directive

medium · vulnerability · security
Mar 3, 2026

OpenClaw has a path traversal vulnerability (CWE-22, a weakness where attackers bypass directory restrictions) in its `$include` directive that allows arbitrary file reads. An attacker who can modify OpenClaw's configuration file can read any file the OpenClaw process has access to by using absolute paths, directory traversal sequences (like `../../`), or symlinks (shortcuts to files), potentially exposing secrets and API keys.
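The general shape of the fix for this class of bug (CWE-22) is to resolve the include path and verify containment before reading. The sketch below is illustrative Python, not OpenClaw's actual code; the function name and error type are assumptions:

```python
from pathlib import Path

def resolve_include(base_dir: str, include_path: str) -> Path:
    """Resolve an $include-style path, rejecting anything that escapes base_dir.

    resolve() follows symlinks and collapses '..' segments *before* the
    containment check, which defeats all three bypasses in the advisory:
    absolute paths (joining an absolute path replaces base entirely),
    '../' traversal sequences, and symlinks pointing outside the directory.
    """
    base = Path(base_dir).resolve()
    target = (base / include_path).resolve()
    if not target.is_relative_to(base):  # Python 3.9+
        raise PermissionError(f"include escapes base directory: {include_path}")
    return target
```

Checking the raw string for `..` instead of checking the resolved path is the common mistake: it misses symlinks entirely and can be bypassed with encodings.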

Google’s latest Pixel drop allows Gemini to order groceries for you and more

info · news · industry
Mar 3, 2026

Google is rolling out new features to Pixel 10 phones that allow Gemini, its AI assistant, to act as an agent (an AI that can take actions independently on your behalf) to complete tasks like ordering groceries or booking rides in selected apps such as Uber and Grubhub. Users can supervise or stop the agent's work at any time while it operates in the background.

How the experts figure out what’s real in the age of deepfakes

info · news · safety
Mar 3, 2026

During the Iran conflict in 2024, many fake images and videos spread online, including old footage, unrelated conflicts, AI-generated content (synthetic media created by artificial intelligence), and clips from video games like War Thunder. Major news organizations like The New York Times, Indicator, and Bellingcat use detailed verification procedures to check whether content is real before publishing it, helping audiences distinguish trustworthy reporting from misinformation.

GHSA-m6w7-qv66-g3mf: BentoML Vulnerable to Arbitrary File Write via Symlink Path Traversal in Tar Extraction

high · vulnerability · security
Mar 3, 2026
CVE-2026-27905

BentoML's `safe_extract_tarfile()` function has a security flaw where it validates that symlink paths (links that point to other files) are within the extraction directory, but it doesn't validate where those symlinks actually point to. An attacker can create a malicious tar file with a symlink pointing outside the directory and follow it with a regular file, allowing them to write files anywhere on the system. This vulnerability has a CVSS score (a 0-10 rating of how severe a vulnerability is) of 8.1 (High).

Google employees call for military limits on AI amid Iran strikes, Anthropic fallout

info · regulatory · policy · safety

Anthropic 'made a mistake' in Pentagon talks and should 'correct course,' FCC boss says

info · regulatory · policy
Mar 3, 2026

Anthropic, an AI company, ended negotiations with the U.S. Department of Defense after refusing to allow its technology to be used for fully autonomous weapons (systems that make combat decisions without human control) or domestic mass surveillance. The U.S. government then blacklisted Anthropic, prohibiting it from working with federal agencies and Pentagon contractors, with government officials saying the company should 'correct course' to resolve the dispute.

The Download: The startup that says it can stop lightning, and inside OpenAI’s Pentagon deal

info · news · policy · industry

AI Agent Overload: How to Solve the Workload Identity Crisis

info · news · security
Mar 3, 2026

Organizations are facing challenges managing workload identities (the digital credentials and permissions that allow different software systems and applications to authenticate and communicate with each other), and the problem is becoming harder to handle as systems grow more complex. The source indicates this is a widespread issue but does not provide specific technical details about the nature of the crisis or its consequences.

On Moltbook

info · news · safety · industry

OpenAI changes deal with US military after backlash

info · news · policy · safety

OpenAI amends Pentagon deal as Sam Altman admits it looks ‘sloppy’

info · news · policy · security

AI Agents: The Next Wave Identity Dark Matter - Powerful, Invisible, and Unmanaged

medium · news · security · policy

Fooling AI Agents: Web-Based Indirect Prompt Injection Observed in the Wild

high · news · security
Mar 3, 2026

Web-based indirect prompt injection (IDPI) is an attack where adversaries hide malicious instructions in website content that AI systems later read and unknowingly execute, such as through webpage summarization or content analysis features. Researchers found real-world examples of these attacks being used for ad fraud evasion, phishing promotion, data destruction, unauthorized transactions, and information theft, showing that IDPI is no longer just theoretical but actively weaponized. Unlike direct prompt injection (where attackers directly submit malicious input to an AI), IDPI exploits the normal behavior of AI systems processing benign-looking web content.
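One layer of defense against IDPI is screening untrusted web content for instruction-like text before it enters the model's context. The sketch below is a deliberately simple heuristic in Python, not any vendor's product; pattern-matching alone cannot catch paraphrased injections, and real defenses also need model-level separation of trusted instructions from untrusted data:

```python
import re

# Hypothetical marker patterns; an attacker can trivially rephrase around
# any fixed list, so treat hits as a quarantine signal, not a full filter.
SUSPECT_PATTERNS = [
    r"ignore (all|any|previous|prior) .{0,40}instructions",
    r"you are now",
    r"system prompt",
    r"do not (tell|inform|warn) the user",
]

def flag_injection_markers(page_text: str) -> list[str]:
    """Return the suspect patterns found in fetched web content before it
    is passed to an LLM feature such as a page summarizer."""
    lowered = page_text.lower()
    return [p for p in SUSPECT_PATTERNS if re.search(p, lowered)]
```

A hit would typically cause the content to be stripped from the model's context or routed for review rather than summarized directly.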

Vulnerability in MS-Agent AI Framework Can Allow Full System Compromise

high · news · security
Mar 3, 2026

A vulnerability in the MS-Agent AI Framework allows attackers to compromise an entire system by exploiting the Shell tool through improper input sanitization (failure to clean and validate user input). Attackers can use this flaw to modify system files and steal data.
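The standard mitigation for this class of shell-tool flaw is to never hand user-controlled text to a shell for parsing. A minimal Python sketch (illustrative command and function names, not MS-Agent's actual code):

```python
import shlex
import subprocess

def run_tool(user_supplied_path: str) -> str:
    """Run a command on a user-supplied value without shell interpretation.

    Passing an argument list (no shell=True) makes the input a single argv
    entry, so metacharacters like ';', '|', or '$( )' stay literal data:
    'file.txt; rm -rf /' is treated as one odd filename, not two commands.
    """
    result = subprocess.run(
        ["wc", "-l", user_supplied_path],
        capture_output=True, text=True, check=False,
    )
    return result.stdout

def shell_safe(user_supplied_path: str) -> str:
    """If a shell string is unavoidable, quote each interpolated value."""
    return f"wc -l {shlex.quote(user_supplied_path)}"
```

The argument-list form is the stronger fix; quoting with `shlex.quote` is the fallback when an existing interface forces a single command string.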

Iran war heralds era of AI-powered bombing quicker than ‘speed of thought’

info · news · safety · policy

Das gehört in Ihr Security-Toolset ("What belongs in your security toolset")

info · news · security
Mar 2, 2026

This article describes 13 essential security tools that companies need to protect against cyber threats, including XDR (extended detection and response, an AI-powered system that identifies threats across networks and devices), MFA (multifactor authentication, requiring users to verify their identity multiple ways), NAC (network access control, which checks devices before allowing network access), and DLP (data loss prevention, which monitors for sensitive data being sent outside the company). The article explains why each tool is important but does not discuss any specific fixes, patches, or solutions to existing security problems.

Item details continued (dates, summaries, fixes, and sources for the entries above):

The Anthropic-DOD Conflict: Privacy Protections Shouldn’t Depend On the Decisions of a Few Powerful People

Mar 3, 2026

Anthropic refused the U.S. Department of Defense's demand for unrestricted use of its AI technology for mass surveillance and fully autonomous weapons systems, leading the DoD to cancel a $200 million contract. The article argues that relying on individual company leaders to protect privacy through business decisions is unsustainable, and that Congress should pass binding legal restrictions instead of leaving privacy protection to private companies and their CEOs.

Source: EFF Deeplinks Blog

ChatGPT’s new GPT-5.3 Instant model will stop telling you to calm down

Fix: OpenAI released GPT-5.3 Instant, which according to the release notes reduces preachy disclaimers and focuses on improving tone, relevance, and conversational flow. In the example provided, GPT-5.3 Instant acknowledges the difficulty of a situation without directly reassuring the user, rather than the GPT-5.2 Instant approach of starting responses with phrases like 'First of all, you're not broken.'

Source: TechCrunch

Claude Code rolls out a voice mode capability

Source: TechCrunch

GHSA-56pc-6hvp-4gv4: OpenClaw vulnerable to arbitrary file read via $include directive

Fix: Update OpenClaw to version 2026.2.17 or later. The vulnerability is fixed in npm package `openclaw` version `>=2026.2.17` (vulnerable versions: `<=2026.2.15`).

Source: GitHub Advisory Database

Google’s latest Pixel drop allows Gemini to order groceries for you and more

Source: The Verge (AI)

How the experts figure out what’s real in the age of deepfakes

Source: The Verge (AI)

GHSA-m6w7-qv66-g3mf: BentoML Vulnerable to Arbitrary File Write via Symlink Path Traversal in Tar Extraction

Source: GitHub Advisory Database

Google employees call for military limits on AI amid Iran strikes, Anthropic fallout

Mar 3, 2026

Tech workers at Google, OpenAI, and other companies are signing open letters calling for clearer limits on how their employers work with the military, after the U.S. Department of Defense blacklisted AI models from Anthropic (a company that refused to allow its technology to be used for mass surveillance or autonomous weapons) and the U.S. carried out strikes on Iran. The letters express concern that the government is pressuring tech companies to accept military contracts involving AI without proper safeguards, and workers are demanding greater transparency about their employers' government agreements.

Source: CNBC Technology

Anthropic 'made a mistake' in Pentagon talks and should 'correct course,' FCC boss says

Source: CNBC Technology

The Download: The startup that says it can stop lightning, and inside OpenAI’s Pentagon deal

Mar 3, 2026

This newsletter roundup covers two main AI stories: OpenAI has agreed to allow the US military to use its technologies in classified settings, with protections against autonomous weapons and mass surveillance, though concerns remain about whether safety measures can be maintained during rapid deployment; separately, a startup called Skyward Wildfire claims it can prevent wildfires by stopping lightning strikes using cloud seeding (releasing metallic particles into clouds), but researchers question its effectiveness under different conditions and its potential environmental impacts.

Source: MIT Technology Review

AI Agent Overload: How to Solve the Workload Identity Crisis

Source: Dark Reading

On Moltbook

Mar 3, 2026

Moltbook, a supposed AI-only social network, actually relies on humans at every step, including creating accounts, writing prompts (instructions for how the AI should behave), and publishing content. The platform demonstrates a concerning trend called the "LOL WUT Theory," where AI-generated content becomes so easy to create and so difficult to distinguish from real posts that people may stop trusting anything online.

Source: Schneier on Security

OpenAI changes deal with US military after backlash

Mar 3, 2026

OpenAI announced changes to its agreement with the US military after facing backlash, including preventing its AI system from being used for domestic surveillance and requiring additional contract modifications before intelligence agencies like the NSA can use it. The company acknowledged the original deal announcement was "opportunistic and sloppy," while concerns remain about how AI systems (which can "hallucinate," or make up false information) are being deployed in military operations and whether adequate human oversight exists.

Source: BBC Technology

OpenAI amends Pentagon deal as Sam Altman admits it looks ‘sloppy’

Mar 3, 2026

OpenAI is modifying its contract with the US Department of Defense after CEO Sam Altman acknowledged the original deal appeared poorly planned. The company will now explicitly prohibit its AI technology from being used for mass surveillance (monitoring large groups of people without their knowledge) or by intelligence agencies like the NSA (National Security Agency, which gathers foreign intelligence for the US).

Source: The Guardian Technology

AI Agents: The Next Wave Identity Dark Matter - Powerful, Invisible, and Unmanaged

Mar 3, 2026

AI agents using the Model Context Protocol (MCP, a system that lets AI connect to apps and data to automate business tasks) are rapidly being deployed in enterprises but operate as 'identity dark matter' - invisible to the traditional access control systems that track who can do what in a company. These agents tend to seek the easiest path to complete tasks, gravitating toward weak security shortcuts like old credentials and long-lived tokens, which creates risks both from accidental misuse and from potential abuse at machine speed across multiple systems.

Source: The Hacker News

Fooling AI Agents: Web-Based Indirect Prompt Injection Observed in the Wild

Fix: The source mentions that Palo Alto Networks offers these defensive capabilities: Advanced DNS Security, Advanced URL Filtering, Prisma AIRS, Prisma Browser, and the Unit 42 AI Security Assessment service to help protect against web-based IDPI threats. The source also notes that defenders need 'proactive, web-scale capabilities to detect IDPI, distinguish benign and malicious prompts, and identify underlying attacker intent,' though specific implementation details are not provided.

Source: Palo Alto Unit 42

Vulnerability in MS-Agent AI Framework Can Allow Full System Compromise

Source: SecurityWeek

Iran war heralds era of AI-powered bombing quicker than ‘speed of thought’

Mar 3, 2026

The US military reportedly used Anthropic's Claude AI model to help plan attacks on Iran, enabling bombing campaigns faster than human decision-making can occur by shortening the "kill chain" (the process from identifying a target to getting legal approval and launching a strike). Experts worry this technology could push human decision-makers out of the loop entirely.

Source: The Guardian Technology

Das gehört in Ihr Security-Toolset ("What belongs in your security toolset")

Source: CSO Online