aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Maintained by

Truong (Jack) Luu

Information Systems Researcher

Browse All

All tracked items across vulnerabilities, news, research, incidents, and regulatory updates.

2841 items

How we contain Claude across products

info, news
security, safety
May 30, 2026

Anthropic published documentation explaining how it uses multiple containment techniques to restrict what Claude can do across its products. The techniques are process sandboxes (isolated execution environments), virtual machines (complete simulated computers), filesystem boundaries (limiting file access), and egress controls (preventing unauthorized data transfer). They are meant to keep AI agents from accessing credentials, exfiltrating data (stealing information), or reaching unintended systems, even if a user, the AI model, or an attacker tries to find workarounds.

Fix: Anthropic implements containment through: gVisor for Claude.ai, Seatbelt (macOS) and Bubblewrap (Linux) for Claude Code, and full VMs using Apple's Virtualization framework (macOS) or HCS (Windows) for Claude Cowork. They also prevent credentials from entering sandboxes in the first place, ensuring they cannot be exfiltrated regardless of how an agent tries to access them.

Simon Willison's Weblog
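
As a rough illustration of the process-sandbox idea in the item above, the sketch below builds a Bubblewrap (`bwrap`) command line in Python. It is not Anthropic's configuration: the function name `bwrap_command`, the `/work` mount point, and the choice of mounts are assumptions made for the example. It only constructs the argument list, so it can be inspected without Bubblewrap installed.

```python
import shlex


def bwrap_command(workdir: str, command: list[str]) -> list[str]:
    """Build an illustrative bwrap argv (hypothetical helper, not Anthropic's setup)."""
    return [
        "bwrap",
        "--unshare-all",             # fresh namespaces, including network (no egress)
        "--die-with-parent",         # kill the sandbox if the launcher exits
        "--ro-bind", "/usr", "/usr", # system files are visible but read-only
        "--symlink", "usr/bin", "/bin",
        "--symlink", "usr/lib", "/lib",
        "--proc", "/proc",
        "--dev", "/dev",
        "--tmpfs", "/tmp",           # scratch space that vanishes with the sandbox
        "--bind", workdir, "/work",  # the only writable host directory
        "--chdir", "/work",
        *command,
    ]


if __name__ == "__main__":
    argv = bwrap_command("/home/user/project", ["python3", "task.py"])
    print(shlex.join(argv))
```

Because the home directory is never mounted, files such as SSH keys or cloud credentials are simply absent inside the sandbox, which mirrors the point about keeping credentials out in the first place.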

Model X-Ray: Detection of hidden malware in AI model weights using few shot learning

info, research, Peer-Reviewed
security

Anthropic’s alliance with pope on AI harms: all in good faith or ‘Vatican-washing?’

info, news
policy, industry

A Multi-Frequency Temporal Spatio-Transformer for adversarially robust IoT intrusion detection

info, research, Peer-Reviewed
research

Microsoft and security researcher’s dueling posts about cybersecurity disclosures get nasty

info, news
security
May 29, 2026

A cybersecurity researcher named Nightmare Eclipse and Microsoft had a public conflict over responsible disclosure practices. The researcher published vulnerability details after claiming Microsoft ignored his reports, while Microsoft argued that uncoordinated disclosures (releasing bug information before patches are available) create unnecessary risk for users. Tom Gallagher, a Microsoft security executive, acknowledged the debate over whether current patching practices fit today's landscape but stated that the company is not currently changing its policies, though it will continue to evaluate them.

DNS-AID will make AI agents easier to discover, says Linux Foundation

info, news
industry
May 29, 2026

The Linux Foundation is promoting DNS-AID, a new standard that allows AI agents (autonomous programs that can act independently) to find and communicate with each other using DNS (the system that translates website names into IP addresses) instead of requiring separate proprietary registries. DNS-AID enables agents and MCP (Model Context Protocol, a standard for how agents exchange information) servers to use the existing internet infrastructure as a vendor-neutral directory, with domain owners creating a special DNS address at _index._agents.{domain} as a discovery point.
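
To make the discovery point concrete, here is a minimal Python sketch that derives the `_index._agents.{domain}` name for a given domain. The helper name `agent_index_name` and the normalization steps (trimming, lowercasing, dropping a trailing dot) are assumptions for illustration. The summary does not say which DNS record types live at that name, so the sketch stops at building the name and does not perform a lookup.

```python
def agent_index_name(domain: str) -> str:
    """Return the DNS-AID style discovery name for a domain (hypothetical helper)."""
    # Normalize: strip whitespace, drop a trailing root dot, lowercase.
    cleaned = domain.strip().rstrip(".").lower()
    if not cleaned:
        raise ValueError("domain must not be empty")
    return f"_index._agents.{cleaned}"


if __name__ == "__main__":
    print(agent_index_name("Example.com."))
```

A client would then query that name with an ordinary DNS resolver, using whatever record type the DNS-AID specification defines.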

SpaceX skeptics have added reason for concern after Musk comments diverge from IPO filing

info, news
industry, policy

Dan Ives: Anthropic’s growth is 'just the tip of the spear' for AI rally

info, news
industry
May 29, 2026

Anthropic, an AI company, recently achieved a $965 billion valuation after securing $65 billion in funding, and analyst Dan Ives believes investor interest in AI is far from peaked and will expand to data layer companies (companies that manage and organize data). Ives predicts a major market rally with several large public offerings planned for 2026, though some analysts warn this could signal a market peak similar to the dot-com bubble of the late 1990s.

EU seeks to 'intensify' talks with U.S. on advanced cyber AI models, official tells CNBC, amid Mythos concerns

info, news
policy, security

Boston Children’s uses AI to unlock new diagnoses

info, news
industry
May 29, 2026

Boston Children's Hospital integrated AI (artificial intelligence) across its entire organization as a core part of clinical and operational work, rather than treating it as a separate experiment. By building an enterprise AI layer (a shared, secure internal AI system used across teams) and redesigning workflows in areas like supply chain and surgical scheduling, the hospital has diagnosed over 40 previously unresolved rare conditions, saved approximately 60,000 hours of staff time, and enabled more than one-third of employees to use AI daily in their work.

How Braintrust turns customer requests into code with Codex

info, news
industry
May 29, 2026

Braintrust, an AI observability company, uses Codex (OpenAI's code-generation AI model) to quickly turn customer feature requests into working preview branches in minutes, with half the team adopting it within one month. The speed of Codex enables faster feedback loops with customers and allows engineers to test ideas in real time rather than letting requests sit in a backlog. Codex's ability to handle large amounts of text output without slowing down makes it more effective than other models for this workflow.

‘Like a billionaire on acid’: Star Wars director Gareth Edwards comes out in favour of AI

info, news
industry
May 29, 2026

Film director Gareth Edwards publicly endorsed generative AI (software that creates content like images or text from descriptions) for movie-making at an Amazon event, comparing it favorably to traditional CGI (computer-generated imagery) and calling it a tool as fundamental as a camera. Edwards argued that filmmakers have no reason to avoid adopting AI since it can help with creative work and will eventually surpass CGI in quality.

What 2,000 Exposed Vibe-Coded Apps Reveal About the Limits of Most Security Stacks

info, news
security, policy

Adobe’s conversational AI agent is a mediocre design intern

info, news
industry
May 29, 2026

Adobe's Firefly AI Assistant is a conversational AI agent designed to automate tasks within Adobe's design software while keeping users in control of the creative process, unlike traditional AI image generators that work independently. The assistant acts as a multitasking middleman that can operate design apps on behalf of users, though early testing suggests the results are not particularly impressive despite the tool's thoughtful approach to preserving creative control.

Cybersecurity trends in SEC filings

info, news
policy
May 29, 2026

In 2023, the SEC required public companies to disclose cybersecurity risk management in their annual filings, prompting an analysis of the top 200 S&P companies' cybersecurity leadership structures. The analysis found that Chief Information Security Officers (CISOs) lead cybersecurity at over 70% of companies with an average of 23 years of experience, most commonly reporting to the Chief Information Officer, while the Audit Committee oversees cybersecurity at about 60% of companies, and the NIST Cybersecurity Framework (a set of best practices for managing cyber risks) is the most referenced security standard.

GDPR set the tone for regulatory action — and the AI fine pushback to come

info, news
policy
May 29, 2026

Big tech companies are legally challenging GDPR (General Data Protection Regulation, Europe's data protection law) fines, with nearly 40% of the €7.1 billion in fines announced over eight years either annulled or under appeal. While GDPR successfully established a global 72-hour breach notification standard (the requirement that organizations tell people within three days if their data is stolen), experts note the framework has structural weaknesses that companies exploit in court, and upcoming AI regulations may face similar challenges.

Shadow AI: The Hidden Risk Expanding Across the Enterprise

info, news
security, policy

Strengthening societal resilience with Rosalind Biodefense

info, news
policy, industry

Anthropic's run-rate revenue hits $47 billion

info, news
industry
May 28, 2026

Anthropic, an AI company, announced that its run-rate revenue (an annualized projection based on current monthly earnings) has grown to $47 billion as of May 2026, up from $30 billion in April 2026. This represents extraordinarily rapid growth: the company has increased its run-rate revenue more than tenfold each year over the past three years, driven by widespread adoption among enterprise customers.

IBM and Red Hat want to become the ‘security clearinghouse’ for open source applications in the enterprise

info, news
security, industry
research
May 30, 2026

Researchers have developed a technique called Model X-Ray that can detect hidden malware embedded in AI model weights (the numerical parameters that make up a trained AI system) using few-shot learning (training a detector with only a small number of examples). This work addresses a security risk where attackers could hide malicious code inside AI models that might go undetected during normal use.

Elsevier Security Journals
May 30, 2026

Pope Leo XIV released a major teaching warning about AI's harms, including job displacement, accelerated warfare, and environmental exploitation. Anthropic co-founder Chris Olah spoke at the Vatican ceremony, which some experts criticize as potentially creating superficial 'feelgood' messaging rather than substantive critical examination of AI risks.

The Guardian Technology
May 29, 2026

Researchers developed a new AI model called a Multi-Frequency Temporal Spatio-Transformer that can detect when attackers try to break into Internet of Things devices (IoT, everyday connected devices like smart home sensors). The model is designed to remain accurate even when attackers deliberately try to fool it using adversarial attacks (techniques that manipulate input data to trick AI systems into making wrong predictions). This research addresses the challenge of keeping IoT network security systems reliable against sophisticated attacks.

Elsevier Security Journals
CSO Online
CSO Online
May 29, 2026

Elon Musk's social media post about SpaceX's deal with AI company Anthropic contradicts details in SpaceX's IPO (initial public offering, when a private company sells shares to the public) filing, creating confusion for investors. The filing says Anthropic will pay SpaceX $1.25 billion per month through May 2029, but Musk claimed the lease is only 180 days with a 90-day cancellation option, potentially worth far less. This discrepancy matters because it affects how much revenue SpaceX can expect from this new compute capacity (computing power) business.

CNBC Technology
CNBC Technology
May 29, 2026

The European Union wants to increase discussions with the U.S. about advanced AI models that have cyber capabilities, particularly after Anthropic's Mythos model (a very powerful AI system) raised concerns about AI-powered cyberattacks. The EU has not yet received access to preview Mythos, and the White House has opposed expanding access to the model beyond the U.S. due to security concerns, though Anthropic says it is developing safeguards and expects to release Mythos-class models to customers within weeks.

Fix: Anthropic stated that 'Models of Mythos' capability require strong cyber safeguards before they can be generally released' and 'We're making swift progress on developing these safeguards and expect to be able to bring Mythos-class models to all our customers in the coming weeks.' No specific technical safeguards or implementation details are described in the source text.

CNBC Technology
OpenAI Blog
OpenAI Blog
The Guardian Technology
May 29, 2026

Employees are using AI-driven development platforms (vibe coding, where non-programmers build working applications by describing what they want) to quickly build custom applications and connect them to company systems, then publish them on the public internet without involving security teams or implementing basic access controls. A study found over 2,000 such exposed applications containing sensitive data across major companies. They sit unprotected because traditional security tools like EDR (endpoint detection and response, software that monitors what happens on company devices) and DLP (data loss prevention, software that blocks sensitive information from leaving the company) were designed to catch different types of threats and don't detect these cloud-to-cloud connections or applications built in web browsers.

The Hacker News
The Verge (AI)
CSO Online
CSO Online
May 29, 2026

Organizations are rapidly adopting unauthorized AI tools without proper security oversight, creating 'shadow AI' (unsanctioned AI use that bypasses governance controls) that exposes sensitive data and creates new attack surfaces. Traditional security tools like firewalls and Zero Trust architecture (a security model that requires verification for every access request) cannot detect AI-specific threats such as prompt injection (tricking an AI by hiding malicious instructions in its input), leaving companies vulnerable to data leaks, compliance failures, and attacks that exploit AI systems.

Fix: CrowdStrike Falcon AI Detection and Response (AIDR) is designed to provide visibility, control, and protection for AI-driven environments and can identify and stop AI-specific threats such as prompt injection.

CrowdStrike Blog
May 28, 2026

OpenAI is launching Rosalind Biodefense, a program that gives vetted developers access to GPT-Rosalind (a reasoning model trained for life sciences) to build defensive tools against biological threats like pandemics. The company is also expanding trusted access to this model for select U.S. government and allied partners working on public health and biodefense, supported by safety measures like capability assessments, expert red teaming, and security controls to prevent misuse.

OpenAI Blog
Simon Willison's Weblog
May 28, 2026

IBM and Red Hat announced Project Lightwell, a $5 billion initiative to create an AI-powered 'security coordination layer' that helps enterprises discover and fix vulnerabilities (security weaknesses) in open source software faster. The clearinghouse will deliver validated patches directly into existing software supply chains without requiring upgrades, starting with Java/Maven code and eventually expanding to other programming languages.

Fix: Project Lightwell will backport fixes (apply patches to older versions) to exact dependency versions that have already been tested and deployed, operate on configuration manifests like pom.xml so code remains in controlled enterprise environments, and deliver fixes across dependency chains. Enterprises will receive validated patches spanning Red Hat platforms and independent community code, and can share fixes upstream through a 'secure map' so the wider open-source community can incorporate them.

CSO Online