All tracked items across vulnerabilities, news, research, incidents, and regulatory updates.
AI agents in enterprises now perform critical operations like provisioning infrastructure and approving transactions, but they are often not governed as distinct identities and instead inherit broad privileges from their creators. Traditional identity and access management (IAM, the systems that control who can access what) is insufficient because AI agents are dynamic and can take unpredictable paths to achieve their goals. A new approach called intent-based permissioning is therefore needed, which checks not just who the agent is but why it is requesting access and whether that purpose justifies the action at that moment.
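To make the idea concrete, here is a minimal Python sketch of an intent-aware access check. It is an illustration only, not a description of any vendor's product: the `AccessRequest` type, the `POLICY` table, the agent and intent names, and the business-hours window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class AccessRequest:
    agent_id: str         # who is asking
    action: str           # what the agent wants to do
    declared_intent: str  # why it says it needs to do it
    at: datetime          # when the request is made


# Hypothetical policy: (agent, action) -> intents that justify it, plus allowed hours.
POLICY = {
    ("billing-agent", "approve_transaction"): {
        "intents": {"pay_approved_invoice"},
        "window": (time(8, 0), time(18, 0)),
    },
}


def is_allowed(req: AccessRequest) -> bool:
    """Allow only if identity, declared purpose, and timing all line up."""
    rule = POLICY.get((req.agent_id, req.action))
    if rule is None:
        return False  # this identity has no grant for this action at all
    if req.declared_intent not in rule["intents"]:
        return False  # right identity, but the stated purpose does not justify it
    start, end = rule["window"]
    return start <= req.at.time() <= end  # purpose must also be valid at this moment
```

A classic IAM check would stop at the first lookup. The two extra conditions are what distinguish the intent-based variant in this sketch.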
Anthropic announced a new enterprise agents program that lets companies deploy pre-built AI agents (software programs that can perform tasks autonomously) to handle common business work like financial research and HR tasks. The program includes a plugin system, pre-made agents for specific departments, and integrations with tools like Gmail and DocuSign, along with controls that corporate IT departments need for managing software safely.
Anthropic has released new connectors and plugins for Claude Cowork, its AI productivity tool for office workers, allowing organizations to integrate it with existing software like Google Drive and Gmail. The update marks Claude Cowork's transition from a research project to an enterprise-grade product, with customizable plugins designed to encode institutional knowledge and workflows across different business domains.
Claude Code is a developer tool created by Anthropic that has unexpectedly become popular with non-developers across various industries who have learned to access their terminal (the text-based interface for giving computer commands) to build projects. The tool has achieved significant product-market fit (strong demand and adoption), though the article questions whether users will eventually move beyond using the terminal interface.
ProducerAI, an AI platform that helps musicians generate sounds, create lyrics, and remix songs using artificial intelligence, is being acquired by Google and will be integrated into Google Labs. The platform will now use Google's new Lyria 3 music-making AI model instead of its original AI system.
New Relic launched a no-code AI agent platform designed specifically for data observability, allowing companies to deploy and manage AI agents that monitor data systems to catch bugs before they cause problems. The platform supports the model context protocol (MCP, a system that connects AI applications to external data sources) and integrates with other New Relic tools. The company also released new tools for OpenTelemetry (OTel, an open-source observability framework that helps track how software performs), allowing enterprises to manage OTel data streams alongside other data sources in a single place to reduce fragmentation problems.
A new supply chain attack called 'Sandworm_Mode' has been discovered in NPM (Node Package Manager, a repository where developers download code libraries). The malicious code spreads automatically like a worm, corrupts AI assistants that might use the infected code, steals sensitive information, and includes a destructive mechanism that can cause damage when activated.
Nimble, a startup that raised $47 million in funding, has developed a platform using AI agents to search the web in real time, validate results, and structure them into organized tables that work like databases. The company addresses a key problem with AI agents: while they can search and analyze web data, they often return plain text results and suffer from hallucinations (when an AI confidently produces false information), making it difficult for enterprises to use web data reliably alongside their existing data systems.
Attackers can hide malicious instructions in GitHub Issues (bug reports or comments on a code repository) that GitHub Copilot (an AI coding assistant) automatically processes when a developer launches a Codespace (a cloud-based development environment) from that issue. This can lead to unauthorized takeover of the repository.
Anthropic accused three Chinese AI companies (DeepSeek, Moonshot AI, and MiniMax) of running large-scale distillation attacks, which involve flooding an AI model with specially crafted prompts to extract knowledge and train smaller competing models. The companies allegedly used commercial proxy services to bypass Anthropic's restrictions and created over 24,000 fraudulent accounts to generate roughly 16 million exchanges with Claude, with MiniMax responsible for over 13 million of those exchanges.
A major npm supply chain worm called SANDWORM_MODE is attacking developer machines, CI pipelines (automated systems that build and test software), and AI coding tools by disguising itself as popular packages through typosquatting (creating package names that look nearly identical to real ones). Once installed, the malware steals credentials like GitHub tokens and cloud keys, then uses them to inject malicious code into other repositories and poison AI coding assistants by deploying a fake MCP server (model context protocol, a system that lets AI tools talk to external services).
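One simple way to spot typosquatting candidates is to compare a package name against a list of well-known names and flag near matches. The sketch below uses the string-similarity ratio from Python's standard `difflib` module. The `POPULAR` list and the 0.85 threshold are assumptions chosen for illustration, and this is not how the SANDWORM_MODE packages were actually detected.

```python
from difflib import SequenceMatcher

# Illustrative allowlist; a real check would use a much larger set of known packages.
POPULAR = ["express", "lodash", "react", "chalk"]


def likely_typosquat(name, known=POPULAR, threshold=0.85):
    """Return the known package that `name` closely resembles, or None."""
    if name in known:
        return None  # exact match is the real package, not a lookalike
    # Pick the known name with the highest similarity ratio to the candidate.
    best = max(known, key=lambda k: SequenceMatcher(None, name, k).ratio())
    if SequenceMatcher(None, name, best).ratio() >= threshold:
        return best
    return None
```

For example, `likely_typosquat("expresss")` returns `"express"`, while an unrelated name such as `"left-pad"` returns `None`. A threshold this strict misses transposed letters, so it would need tuning before real use.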
Anthropic is negotiating with the U.S. Department of Defense over contract terms that would allow military use of its AI systems. The disputed phrase 'any lawful use' would permit the military to deploy Anthropic's AI for mass surveillance and lethal autonomous weapons (AI systems that can identify and attack targets without human approval), while OpenAI and xAI have already accepted similar terms.
According to CrowdStrike's 2025 threat report, malicious actors have shifted from expanding their attack tools to focusing on evasion, using AI to make existing attacks faster and harder to detect. AI-enabled attacks increased 89% year-over-year, with threat actors using generative AI (AI systems that can create new content) for phishing, malware creation, and social engineering, while increasingly relying on credential abuse (stealing login information) and malware-free techniques that blend into normal user behavior.
Ano2Rule is a new method that makes unsupervised anomaly detection models (AI systems that find unusual patterns without labeled examples of anomalies) more understandable to humans by converting them into simple rules. The approach breaks down how normal data is distributed into multiple parts and creates boundary rules that explain when the model flags something as anomalous (abnormal), making it easier for security experts to trust and deploy these systems in high-stakes situations like detecting network intrusions or protecting IoT devices (internet-connected devices).
This paper describes SPARTA, a protocol designed to let people create multiple separate avatars (digital representations of users in virtual spaces) in the metaverse while keeping those avatars unlinkable, meaning no one can connect different avatars to the same real person. The protocol uses mercurial signatures (a cryptographic technique that allows flexible key usage) and zero-knowledge proofs (ways to prove something is true without revealing how you know it) to enable secure authentication and prevent misuse through a reputation system based on time-based hash chains (sequences of hash values, each computed from the previous one, tied to time periods).
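For readers unfamiliar with hash chains, the following Python sketch shows the generic construction: a secret seed is hashed repeatedly, the final value is published as an anchor, and earlier values are revealed one time period at a time. This is a textbook hash chain, not SPARTA's actual reputation scheme, and the function names and the use of SHA-256 are assumptions.

```python
import hashlib
from typing import List


def hash_once(value: bytes) -> bytes:
    return hashlib.sha256(value).digest()


def build_chain(seed: bytes, length: int) -> List[bytes]:
    """chain[0] is the secret seed; chain[i] = H(chain[i-1]); chain[-1] is the public anchor."""
    chain = [seed]
    for _ in range(length):
        chain.append(hash_once(chain[-1]))
    return chain


def verify(revealed: bytes, epoch: int, anchor: bytes) -> bool:
    """Check that hashing `revealed` exactly `epoch` times reaches the anchor."""
    value = revealed
    for _ in range(epoch):
        value = hash_once(value)
    return value == anchor
```

With `chain = build_chain(b"secret", 5)`, the holder would reveal `chain[5 - t]` in time period `t`, and anyone who knows `chain[-1]` can confirm it with `verify(chain[5 - t], t, chain[-1])`. Because the hash cannot be run backwards, a value for a later period cannot be forged from earlier revealed ones.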
Deep neural networks can be attacked through backdoors, where attackers secretly poison training data to make the model misclassify certain inputs while appearing normal otherwise. This paper proposes Cert-SSBD, a defense method that uses randomized smoothing (adding random noise to samples) with sample-specific noise levels, optimized per sample using stochastic gradient ascent, combined with a new certification approach to make models more resistant to these attacks.
Fix: The proposed Cert-SSBD method addresses the issue by employing stochastic gradient ascent to optimize the noise magnitude for each sample, applying this sample-specific noise to multiple poisoned training sets to retrain smoothed models, aggregating predictions from multiple smoothed models, and introducing a storage-update-based certification method that dynamically adjusts each sample's certification region to improve certification performance.
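The core idea of randomized smoothing can be shown with a small Python sketch: classify many noisy copies of an input and return the majority label, with the noise level passed in per sample. This covers only the generic smoothing vote. It does not implement Cert-SSBD's noise optimization, retraining, or certification steps, and `base_classifier` is a toy stand-in for a trained model.

```python
import random
from collections import Counter


def base_classifier(x):
    """Toy stand-in for a trained model: label 1 if the coordinates sum to a positive number."""
    return 1 if sum(x) > 0 else 0


def smoothed_predict(x, sigma, n_samples=1000, seed=0):
    """Majority vote of the base classifier over Gaussian-perturbed copies of x.

    `sigma` is the noise level for this particular sample; in a sample-specific
    scheme each input would carry its own value.
    """
    rng = random.Random(seed)  # seeded for reproducibility in this sketch
    votes = Counter(
        base_classifier([xi + rng.gauss(0.0, sigma) for xi in x])
        for _ in range(n_samples)
    )
    return votes.most_common(1)[0][0]
```

An input far from the decision boundary, such as `[2.0, 2.0]`, keeps its label under moderate noise, which is the intuition behind certifying a region around it.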
IEEE Xplore (Security & AI Journals)
When users send prompts to LLM services like ChatGPT, sensitive personal information (such as names, addresses, or ID numbers) can leak out, even when basic privacy protections are used. This paper presents Rap-LI, a framework that identifies which parts of a user's input contain sensitive data and applies stronger privacy protection to those specific parts, rather than treating all data equally.
Gradient leakage attacks (methods that steal private data by analyzing the mathematical updates sent between computers in federated learning, where AI training happens across multiple devices) pose privacy risks in federated learning systems. Researchers discovered that different layers of neural networks (sections that process information at different stages) leak different amounts of private information, so they created Layer-Specific Gradient Protection (LSGP), which applies stronger privacy protection to layers that leak more sensitive data rather than protecting all layers equally.
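As a rough illustration of protecting layers unequally, the Python sketch below adds Gaussian noise to each layer's gradient using a per-layer scale. The summary does not say how LSGP measures leakage or what protection mechanism it uses, so the noise-based approach, the layer names, and the scales here are all assumptions.

```python
import random


def protect_gradients(grads, noise_scale, seed=0):
    """Add zero-mean Gaussian noise to each layer's gradient using that layer's own scale.

    grads:       dict mapping layer name -> list of gradient values
    noise_scale: dict mapping layer name -> noise standard deviation for that layer
    """
    rng = random.Random(seed)  # seeded for reproducibility in this sketch
    return {
        layer: [g + rng.gauss(0.0, noise_scale[layer]) for g in values]
        for layer, values in grads.items()
    }


# Hypothetical example: the first layer is assumed to leak more, so it gets more noise.
grads = {"conv1": [0.5, -0.2, 0.1], "fc": [0.3, 0.4]}
scales = {"conv1": 1.0, "fc": 0.0}
protected = protect_gradients(grads, scales)
```

A uniform scheme would use one scale for every layer. Letting the scale vary by layer is the part that corresponds to the layer-specific idea.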
AI is creating 'arms races' across many domains, including democratic government systems, where citizens and officials increasingly use AI to communicate more efficiently, making it harder to distinguish between human and AI interactions in public policy discussions. As people use AI to submit comments and petitions to government agencies, those agencies must also adopt AI to review and process the growing volume of submissions, creating a cycle where each side must keep adopting AI to maintain influence.
Fix: npm has hardened the registry against this class of worms by implementing: short-lived, scoped tokens (temporary access credentials limited to specific functions), mandatory two-factor authentication for publishing, and identity-bound 'trusted publishing' from CI (a verification method that proves who is pushing code through automation systems). The source notes that effectiveness depends on how quickly maintainers adopt these controls.
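The value of short-lived, scoped tokens is easier to see in code. The following Python sketch issues a signed token that carries a scope and an expiry time, then rejects it when the scope is wrong, the time has passed, or the contents were altered. It is a generic illustration and not npm's token format or API. The `SECRET` key, the field layout, and both function names are hypothetical.

```python
import hashlib
import hmac
import time

SECRET = b"demo-signing-key"  # hypothetical; a real registry would keep its key private


def issue_token(subject, scope, ttl_seconds, now=None):
    """Return 'subject|scope|expires|signature'. Fields must not contain '|'."""
    now = time.time() if now is None else now
    expires = int(now + ttl_seconds)
    payload = f"{subject}|{scope}|{expires}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"


def check_token(token, required_scope, now=None):
    """Accept only an untampered token with the right scope that has not expired."""
    now = time.time() if now is None else now
    try:
        subject, scope, expires, sig = token.split("|")
    except ValueError:
        return False  # wrong number of fields
    payload = f"{subject}|{scope}|{expires}"
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False  # signature mismatch: contents were altered or forged
    return scope == required_scope and now < int(expires)
```

A stolen token of this kind is useful only for one scope and only until it expires, which limits how far a worm can spread with it.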
CSO Online
Anthropic launched Claude Code Security, an AI tool that scans code for vulnerabilities and suggests patches by reasoning about code the way a human security researcher would, causing stock prices of major cybersecurity companies to drop. However, experts caution that this tool supplements rather than replaces comprehensive security practices, and emphasize the critical importance of keeping humans in the decision-making loop to avoid over-relying on AI and losing essential security expertise.
Fix: According to Anthropic's announcement, the tool includes built-in human oversight measures: every finding goes through a multi-stage verification process before reaching an analyst, Claude re-examines each result to attempt to prove or disprove its own findings and filter out false positives, validated findings appear in a dashboard for team review and inspection of suggested patches, confidence ratings are provided for each finding to help assess nuances, and nothing is applied without human approval since developers always make the final decision.
CSO Online