aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Maintained by

Truong (Jack) Luu

Information Systems Researcher

Browse All

All tracked items across vulnerabilities, news, research, incidents, and regulatory updates.

6240 items

How orphaned applications are quietly fueling your shadow IT problem

info | news
security
May 4, 2026

Orphaned applications are unused software systems that remain running in an organization's network long after their original purpose has ended, often due to workforce changes or shifting business priorities. They create significant security and compliance risks because IT teams lose track of them, meaning updates are missed, access permissions remain active, and sensitive data may continue flowing through them without proper oversight. The source explains that traditional IT asset tracking methods fail to catch these hidden systems because they only record planning decisions rather than what's actually happening on the network right now.

CSO Online

Anthropic teams with Goldman, Blackstone and others on $1.5 billion AI venture targeting PE-owned firms

info | news
industry
May 4, 2026

Anthropic has partnered with Goldman Sachs, Blackstone, and other investment firms to create a $1.5 billion venture that will deploy Claude, Anthropic's AI model, directly into businesses. The partnership aims to address a shortage of experts who can implement AI technology in real-world business operations by embedding engineers inside companies to redesign workflows and integrate AI into core processes, starting with companies owned by the investment firms.

AI platforms reference Nigel Farage more than other leaders when prompted on UK politics, study shows

info | news
research
May 4, 2026

A study found that AI platforms reference Nigel Farage and Reform UK more often than other UK political leaders when answering questions about British politics. The researchers suggest this indicates that Reform UK has achieved unusual visibility in LLMs (large language models, AI systems trained on text data to generate responses).

Week one of the Musk v. Altman trial: What it was like in the room

info | news
policy
May 4, 2026

Elon Musk is suing OpenAI and CEO Sam Altman in federal court. He claims he invested millions on the expectation that OpenAI would remain a nonprofit organization, and alleges that the company was secretly converted into a for-profit corporation, deceiving him about its original mission. The trial centers on whether Musk was actually deceived and when he discovered the alleged misconduct. Musk is seeking damages and the reversal of the OpenAI restructuring that reduced the nonprofit portion's control.

Musk texted OpenAI's Brockman about settlement two days before trial began

info | news
policy
May 4, 2026

Elon Musk, who co-founded OpenAI in 2015, is suing the company for allegedly breaking its commitment to remain a nonprofit and pursue a charitable mission, claiming it commercialized the AI technology instead. Two days before the trial started, Musk texted OpenAI's president Greg Brockman about settling the case. When Brockman suggested both sides drop their claims, Musk responded with a threat to make Brockman and CEO Sam Altman "the most hated men in America."

Frequency-Domain Signatures for Proactive Defense Against Model Poisoning Attacks in Federated Learning

info | research | Peer-Reviewed
security

Screening Robust Cover for JPEG Steganography

info | research | Peer-Reviewed
research

VLM-Guard: Defending Jailbreaks by Monitoring Only Hundreds of Safety-Critical Neurons

info | research | Peer-Reviewed
security

Breaking Beyond One: Mirage Attacks for Highly Accurate Multi-Keyword Query Recovery With Partial Similar Data Against SE

info | research | Peer-Reviewed
security

CVE-2026-7482: Ollama before 0.17.1 contains a heap out-of-bounds read vulnerability in the GGUF model loader. The /api/create endpoint

critical | vulnerability
security
May 4, 2026
CVE-2026-7482

Ollama versions before 0.17.1 have a heap out-of-bounds read vulnerability (a bug where code reads memory outside its intended boundaries) in the GGUF model loader (the component that loads GGUF files, a machine learning model format). An attacker can upload a malicious GGUF file through the /api/create endpoint (an unprotected interface) with fake tensor size information, causing the server to read beyond the file's actual data and leak sensitive information like API keys and user conversations, which can then be stolen through the /api/push endpoint.
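
The class of bug described above, a header that declares more tensor data than the file actually holds, can be illustrated with a simple bounds check. This is a minimal Python sketch and not Ollama's code: the `TensorInfo` fields and the `validate_tensor_bounds` and `read_tensor` helpers are hypothetical and do not reflect the real GGUF layout. Python slicing would not read out of bounds by itself; in a memory-unsafe loader, a check of this kind is what would stop the read.

```python
from dataclasses import dataclass


@dataclass
class TensorInfo:
    """Hypothetical header entry: where tensor data starts and how long it claims to be."""
    name: str
    offset: int   # byte offset of the tensor data within the file
    n_bytes: int  # size the header claims for the tensor data


def validate_tensor_bounds(tensors, file_size):
    """Return the names of tensors whose declared byte range falls outside the file."""
    bad = []
    for t in tensors:
        end = t.offset + t.n_bytes
        if t.offset < 0 or t.n_bytes < 0 or end > file_size:
            bad.append(t.name)
    return bad


def read_tensor(data: bytes, t: TensorInfo) -> bytes:
    """Return the tensor bytes only after the declared range has been checked."""
    if validate_tensor_bounds([t], len(data)):
        raise ValueError(f"tensor {t.name!r} declares a range outside the file")
    return data[t.offset:t.offset + t.n_bytes]
```

The design point is that sizes taken from an uploaded file are untrusted input and are compared against the real file length before any read.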

Copirate 365 at DEF CON: Plundering in the Depths of Microsoft Copilot (CVE-2026-24299)

high | news
security
May 4, 2026

This writeup describes vulnerabilities found in Microsoft Copilot products that allow attackers to steal sensitive data through multiple attack chains, including data exfiltration via HTML preview features, hijacking the AI's long-term memory through prompt injection (tricking an AI by hiding instructions in its input), and creating persistent backdoors. The vulnerabilities, assigned CVE-2026-24299, exploited what researchers call the "lethal trifecta," where an AI has access to private data, untrusted content, and external communication channels simultaneously.

Security agencies draw red lines around agentic AI deployments

info | news
security | policy

OpenAI Rolls Out Advanced Security for ChatGPT Accounts

info | news
security
May 4, 2026

OpenAI has introduced Advanced Account Security, an optional feature for ChatGPT users at high risk of targeted attacks, such as journalists and political dissidents. The feature strengthens account protection by disabling password-based login in favor of physical security keys or passkeys, replacing email and SMS account recovery with backup passkeys and recovery keys, shortening sign-in sessions, and automatically excluding user conversations from AI model training.

The fake IT worker problem CISOs can’t ignore

medium | news
security | safety

How CISOs should utilize data security posture management to inform risk

info | news
security
May 4, 2026

Data security posture management (DSPM, the practice of finding and tracking where sensitive information is stored in an organization) helps security leaders understand their data risks and make better security decisions, even without expensive dedicated tools. The core principle is to gain visibility into where sensitive data lives, understand its value, and use that information to prioritize security investments and respond to threats more effectively.

How OpenAI delivers low-latency voice AI at scale

info | news
industry
May 3, 2026

OpenAI rearchitected its WebRTC (web real-time communication, a standard protocol for sending low-latency audio and video between clients and servers) infrastructure to handle voice AI at scale while maintaining natural conversation speed. The team addressed three constraints that conflicted at scale: one-port-per-session media termination, stateful ICE (Interactive Connectivity Establishment, the process for establishing connections across firewalls) and DTLS (Datagram Transport Layer Security, encryption for real-time data) session stability, and global routing latency. OpenAI built a new split relay plus transceiver architecture that preserves standard WebRTC behavior for users while changing how data packets are routed internally.

Privacy-preserving path constrained shortest distance queries on encrypted graphs

info | research | Peer-Reviewed
security

US Military Reaches Deals With 7 Tech Companies to Use Their AI on Classified Systems

info | news
policy | safety

CVE-2026-7700: A weakness has been identified in langflow-ai langflow up to 1.8.4. This affects the function eval of the file src/lfx/s

medium | vulnerability
security
May 3, 2026
CVE-2026-7700

A code injection vulnerability (CVE-2026-7700) was found in langflow-ai langflow up to version 1.8.4, specifically in the eval function of the LambdaFilterComponent. The vulnerability allows attackers to execute arbitrary code remotely if they have login access, and a working exploit has been publicly released.
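
To illustrate why passing user-supplied expressions to `eval` is risky, and one common way to narrow that risk, here is a minimal Python sketch of an allowlist-based filter compiler. It is not langflow's code or its fix (the source does not describe one); `compile_filter` and the variable name `x` are hypothetical. Only comparisons and boolean logic over a single variable are accepted, and anything else (calls, attribute access, other names) is rejected before evaluation.

```python
import ast

# Syntax nodes permitted in a filter expression: comparisons and boolean logic only.
_ALLOWED_NODES = (
    ast.Expression, ast.BoolOp, ast.And, ast.Or,
    ast.UnaryOp, ast.Not, ast.USub,
    ast.Compare, ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
    ast.Name, ast.Load, ast.Constant,
)


def compile_filter(expr: str, var: str = "x"):
    """Parse expr, reject any syntax outside the allowlist, and return a predicate."""
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, _ALLOWED_NODES):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
        if isinstance(node, ast.Name) and node.id != var:
            raise ValueError(f"unknown name: {node.id}")
    code = compile(tree, "<filter>", "eval")
    # Evaluate with no builtins and only the single allowed variable in scope.
    return lambda value: bool(eval(code, {"__builtins__": {}}, {var: value}))
```

For example, `compile_filter("x > 3 and x != 10")` returns a predicate, while `compile_filter("__import__('os').system('id')")` raises `ValueError` because call nodes are not on the allowlist.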

Quoting Anthropic

info | news
safety
May 3, 2026

Anthropic researchers tested Claude (their AI assistant) for sycophancy (behavior of agreeing excessively or giving undeserved praise to please the user) by checking whether it would push back on ideas, maintain positions when challenged, and speak honestly. Overall, Claude rarely showed sycophantic behavior (only 9% of conversations), but it was more prone to this problem in conversations about spirituality (38%) and relationships (25%).

CNBC Technology
The Guardian Technology
MIT Technology Review
CNBC Technology
research
May 4, 2026

Federated learning (a method where multiple computers train an AI model together without sharing their raw data) is vulnerable to poisoning attacks, where malicious participants sabotage the shared model. This paper proposes SpecShield, a defense that proactively tests each participant's model using carefully crafted perturbations (small, intentional changes) and analyzes their responses using frequency-domain analysis (a mathematical technique that examines patterns at different scales) to distinguish malicious clients from honest ones.

Fix: The paper proposes SpecShield, which works by: (1) using the Fast Gradient Sign Method on the server side to actively probe client models through calibrated adversarial perturbations, (2) analyzing the resulting responses in the frequency domain using Discrete Wavelet Transform to uncover distinctive patterns between benign and malicious clients, and (3) deriving theoretical upper bounds on perturbation magnitudes to guarantee detection accuracy while preserving benign client performance.
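
The two building blocks named above, an FGSM perturbation and a wavelet transform of the resulting responses, can be sketched on a toy model. This is a minimal Python sketch under loud assumptions: the client model is a plain logistic regression, the wavelet is a one-level Haar transform, and `probe_signature` is a hypothetical summary statistic. It is not the paper's algorithm, thresholds, or theoretical bounds, none of which are given in the source.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, x):
    """Toy client model (assumption): logistic regression with weights w."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, x, y, eps):
    """FGSM step for logistic loss: x + eps * sign(dL/dx), where dL/dx = (p - y) * w."""
    p = predict(w, x)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]


def haar_dwt(signal):
    """One-level Haar DWT of an even-length signal: (approximation, detail) coefficients."""
    s = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail


def probe_signature(w, probes, eps=0.1):
    """Hypothetical signature: energy of each Haar sub-band of the model's probe responses."""
    responses = [predict(w, fgsm(w, x, y, eps)) - predict(w, x) for x, y in probes]
    approx, detail = haar_dwt(responses)
    return sum(a * a for a in approx), sum(d * d for d in detail)
```

A server could compare such signatures across clients and flag outliers; how SpecShield actually separates benign from malicious clients is specified in the paper, not here.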

IEEE Xplore (Security & AI Journals)
May 4, 2026

This research addresses a security problem where images shared on social networks undergo JPEG recompression (a lossy process that reduces file size by discarding some image data), which can destroy hidden messages sent using steganography (hiding secret information inside images). The researchers propose a new method called Robustness-Minimizing Modification (RMM) that identifies which images will survive JPEG recompression with hidden messages intact, allowing non-robust steganographic methods to work reliably on social networks.

IEEE Xplore (Security & AI Journals)
safety
May 4, 2026

Large Vision Language Models (VLMs, which are AI systems that process both images and text) are vulnerable to jailbreak attacks (attempts to trick the AI into ignoring its safety guidelines). VLM-Guard is a detection framework that identifies and monitors a small set of neurons (individual computational units, about 0.2% of the total) that are linked to unsafe behavior, allowing it to catch jailbreak attempts without requiring model fine-tuning (adjusting the AI's internal parameters through additional training). The approach is lightweight and effective at detecting attacks while maintaining normal performance on safe inputs.

Fix: VLM-Guard detects jailbreak attacks by identifying critical neurons linked to unsafe behaviors through differential analysis of activation values. The framework monitors a compact set of just a few hundred neurons (less than 0.2% of total neurons) that are strongly correlated with harmful semantics. It operates as a training-free detector, meaning no parameter updates or model fine-tuning is required, making it suitable for practical deployment in safeguarding VLMs.
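
The idea of picking a small set of neurons by differential activation analysis and then monitoring only those can be sketched as follows. This is a toy Python illustration, not VLM-Guard itself: the mean-gap ranking, the averaged score, and the fixed threshold are all assumptions, and the function names are hypothetical. Activations are plain lists of floats, one per neuron.

```python
def select_critical_neurons(benign_acts, harmful_acts, k):
    """Rank neurons by (mean harmful activation - mean benign activation); keep the top k."""
    n = len(benign_acts[0])

    def mean(rows, j):
        return sum(r[j] for r in rows) / len(rows)

    gaps = [(mean(harmful_acts, j) - mean(benign_acts, j), j) for j in range(n)]
    return [j for _, j in sorted(gaps, reverse=True)[:k]]


def jailbreak_score(acts, neurons):
    """Average activation over the monitored neurons only."""
    return sum(acts[j] for j in neurons) / len(neurons)


def is_flagged(acts, neurons, threshold):
    """Flag an input when the monitored neurons are, on average, above the threshold."""
    return jailbreak_score(acts, neurons) > threshold
```

Because only the selected indices are read at inference time, the detector needs no parameter updates, which matches the training-free property described above.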

IEEE Xplore (Security & AI Journals)
May 4, 2026

Searchable encryption (SE, a technique that lets users search encrypted databases while keeping queries private) can leak information through search and access patterns (what queries are made and which data is accessed). Researchers created Mirage, an attack that recovers both single-keyword and multi-keyword queries by exploiting these leaks while requiring only a small number of similar documents (0.5% of the database), achieving over 90% accuracy on real-world datasets.

IEEE Xplore (Security & AI Journals)

Fix: Update Ollama to version 0.17.1 or later.

NVD/CVE Database

Fix: Microsoft patched these issues. The source states: "MSRC assigned CVE-2026-24299 and the issues are now patched." No specific patch version number or detailed mitigation steps are provided in the source text.

Embrace The Red
May 4, 2026

Security agencies including CISA have issued joint guidance on safely deploying agentic AI (autonomous AI systems that can take actions independently), warning that prompt injection (tricking an AI by hiding instructions in its input) and other attacks are common threats. The advisory recommends organizations implement strict access controls using the principle of least privilege (giving systems only the minimum permissions they need), continuous monitoring with human oversight, and careful testing before deploying AI agents to production environments.

Fix: The source text outlines recommended design and development guidelines including: strong authentication using Secure by Design principles, enforcing least-privilege principles and isolating agent capabilities, maintaining a clear inventory of agent capabilities and dependencies, implementing continuous monitoring and auditing of AI agent operations, integrating human control and oversight into workflows (including live monitoring during task execution and human approval for decision-making steps), validating how agents interpret inputs to guard against prompt injection, and regular testing of incident response plans.
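
Three of the recommendations above (least privilege, human approval for decision steps, and auditing) can be combined in a small authorization gate. This is a minimal Python sketch, not taken from the advisory: `AgentPolicy`, its fields, and the tool names in the usage note are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    """Hypothetical per-agent policy: an allowlist, an approval list, and an audit trail."""
    allowed_tools: frozenset
    needs_approval: frozenset
    audit_log: list = field(default_factory=list)

    def authorize(self, tool, approved_by=None):
        """Decide whether the agent may call a tool, and record the decision."""
        if tool not in self.allowed_tools:
            decision = "deny"              # least privilege: anything not listed is denied
        elif tool in self.needs_approval and approved_by is None:
            decision = "pending_approval"  # a human must sign off first
        else:
            decision = "allow"
        self.audit_log.append((tool, decision, approved_by))
        return decision
```

With `AgentPolicy(frozenset({"search", "send_email"}), frozenset({"send_email"}))`, a call to `authorize("search")` is allowed, `authorize("send_email")` waits for approval until a reviewer is named, and `authorize("delete_files")` is denied; every decision lands in `audit_log`.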

CSO Online

Fix: OpenAI offers Advanced Account Security as a mitigation. Users can enable this opt-in feature, which includes: disabling password-based login and requiring physical security keys or passkeys (OpenAI has partnered with Yubico to offer YubiKey devices at a discount); replacing email and SMS account recovery with backup passkeys, recovery keys, and security keys; shortening sign-in sessions; and receiving alerts about logins with the ability to manage active sessions. Users can enroll through OpenAI's dedicated enrollment page for Advanced Account Security.

SecurityWeek
May 4, 2026

Fake IT workers, increasingly enabled by AI tools and deepfakes, are being hired into organizations as an insider threat (a risk posed by trusted employees or contractors with system access). State actors like North Korea and individuals use stolen or synthetic identities, AI-assisted interview responses, and social engineering to bypass recruitment screening and gain access to sensitive systems and data.

CSO Online
CSO Online
OpenAI Blog
May 3, 2026

This research paper, published in September 2026, addresses how to find the shortest path between two points on encrypted graphs (networks where connections and data are hidden using cryptography) while keeping the query private. The work focuses on path-constrained queries, meaning the shortest route must follow specific rules or limitations, all without revealing the actual graph structure or what users are searching for.

Elsevier Security Journals
May 3, 2026

The US Pentagon has signed contracts with seven tech companies (Google, Microsoft, Amazon Web Services, Nvidia, OpenAI, Reflection, and SpaceX) to use their AI systems on classified military networks to help with battlefield decisions and operations. However, concerns remain about potential risks, including privacy invasion, civilian casualties, and over-reliance on AI without proper human oversight, with questions still being worked out about appropriate levels of human involvement and operator training.

Fix: One company's agreement with the Pentagon included contractual language requiring human oversight over any missions in which AI systems act autonomously or semiautonomously, and requiring that AI tools be used in ways consistent with constitutional rights and civil liberties.

SecurityWeek
NVD/CVE Database
Simon Willison's Weblog