aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Maintained by

Truong (Jack) Luu

Information Systems Researcher

Industry News

New tools, products, platforms, funding rounds, and company developments in AI security.

2829 items

‘Poisoned’ AI: the ChatGPT shopping scams that lead to fake websites

info · news
security · safety
Jun 7, 2026

ChatGPT can recommend fake shopping websites that impersonate real stores, tricking users into thinking they are buying from legitimate retailers. When users follow AI recommendations to purchase items like bags, they may end up on fraudulent sites designed to look official, resulting in financial loss for buyers.

The Guardian Technology

OpenAI unveils Lockdown Mode to protect sensitive data from prompt injection attacks

info · news
security · safety

Meta made its own AI-generated clickbait news feed

info · news
safety
Jun 6, 2026

Meta has added a 'For You' section to its standalone Meta AI app that generates clickbait-style news articles using AI, complete with AI-created topics, images, and text. The app previously featured a 'Discover' feed showing AI-generated images and conversations from users who were sometimes unaware their content was public, but this has been replaced with a standard chatbot interface.

New ChatGPT Lockdown Mode Limits Tools That Could Enable Data Exfiltration

info · news
security · safety

Here comes new Siri again

info · news
industry
Jun 6, 2026

Apple is preparing to reintroduce an updated version of Siri at WWDC, building on a redesign first shown in 2024 that included a new visual appearance, additional voice options, and the ability to route questions to ChatGPT (OpenAI's chatbot). Apple has faced criticism because AI features promised under the "Apple Intelligence" branding were delayed, and the company is now settling a lawsuit over misleading marketing of these capabilities.

Crypto-Funded Chinese Peptide Labs Are Booming

info · news
security · privacy

AI Agent Uncovers 21 Zero-Days in FFmpeg; Chrome Patches Record 429 Bugs

info · news
security · research

OpenAI Help: Lockdown Mode

info · news
security · safety

Trump administration, OpenAI discussing possible government stake in the AI startup

info · news
policy · industry

Microsoft identifies seven new ways AI agents can be hacked

info · news
security · safety

Model routing is a fix for AI overspending. That's a problem for OpenAI and Anthropic

info · news
industry
Jun 5, 2026

Companies are implementing model routing, a technique that directs simple tasks to cheaper AI models and complex tasks to expensive ones, to control skyrocketing AI costs. This shift is forcing major AI providers like OpenAI and Anthropic to reconsider their business models, since they previously earned revenue from all queries regardless of task complexity, but now may only get paid for the most difficult work that requires their most powerful models.
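
The routing idea described above can be sketched in a few lines. This is a minimal illustration only: the model names, prices, keyword list, and threshold are all hypothetical, and a production router would likely use a trained classifier rather than this crude heuristic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_1k_tokens: float  # illustrative price, not a real rate


# Hypothetical model tiers (assumptions, not real products or prices).
CHEAP = Model("small-model", 0.10)
FRONTIER = Model("frontier-model", 5.00)

# Hypothetical keywords that suggest a task needs the stronger model.
HARD_HINTS = ("prove", "refactor", "multi-step", "architecture", "debug")


def estimate_complexity(prompt: str) -> float:
    """Crude score in [0, 1]: long prompts and 'hard' keywords score higher."""
    length_score = min(len(prompt.split()) / 200, 1.0)
    hint_score = 1.0 if any(h in prompt.lower() for h in HARD_HINTS) else 0.0
    return max(length_score, hint_score)


def route(prompt: str, threshold: float = 0.5) -> Model:
    """Send hard prompts to the frontier model and everything else to the cheap one."""
    return FRONTIER if estimate_complexity(prompt) >= threshold else CHEAP
```

With this sketch, `route("What is the capital of France?")` selects the cheap tier, while a prompt containing "debug" is sent to the frontier tier.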

Apple's WWDC: Tim Cook's AI legacy at stake in his final developer conference as CEO

info · news
industry
Jun 5, 2026

Apple is preparing to unveil major improvements to Siri, its voice assistant, at its upcoming developer conference, with the goal of finally delivering the AI experience it promised two years ago. The improvements are expected to include a more powerful standalone chatbot-style app, personal context awareness, and the ability to handle multi-step commands, potentially routing to outside AI models like Google's Gemini. However, Apple faces a critical challenge: Siri must become reliably agentic (able to independently execute complex tasks across multiple apps) to justify Apple's current stock valuation, which depends on developers adopting Apple's App Intents system (the framework that lets Siri perform actions inside third-party apps) before consumers have proven they will actually use the improved features.

Securing CI/CD in an agentic world: Claude Code GitHub Action case

high · news
security · safety

This is your laptop… on AI

info · news
industry
Jun 5, 2026

Major technology companies like Nvidia, Microsoft, and Google are promoting AI as a transformative force that will fundamentally change how we use laptops and computing devices, with new hardware and software being announced at developer conferences. However, the article questions whether users actually want or need these AI-focused products and changes.

LinkedIn co-founder Reid Hoffman is leaving Microsoft's board after almost a decade

info · news
industry
Jun 5, 2026

Reid Hoffman, co-founder of LinkedIn and a long-time member of Microsoft's board, is stepping down after almost a decade to focus on Manas, an AI-native biopharmaceutical company he co-founded. Hoffman previously left OpenAI's board in 2023 to avoid potential conflicts of interest as Microsoft invested heavily in OpenAI, and he is now transitioning to focus on his founder roles rather than board positions.

Anthropic says the world should have option to ‘pause’ on AI

info · news
policy · safety

Adaptive, Agentic AI Worms Loom as Next Enterprise Threat

info · news
security · safety

NSA said to be readying Anthropic’s Mythos for use in cyber operations

info · news
policy · security

AI is designing OpenAI's next model in a sign of 'superintelligence': SoftBank's Masayoshi Son to CNBC

info · news
industry · safety

What 2026 DBIR Confirms: Attacks Are Living in the Browser

info · news
security · safety
Jun 6, 2026

OpenAI introduced Lockdown Mode, a new security feature designed to protect against prompt injection attacks (when malicious instructions are hidden in webpages or uploaded content to manipulate an AI's responses). The feature disables several ChatGPT capabilities including live web browsing, image retrieval, deep research, and agent mode to reduce the risk of sensitive data being exposed, though OpenAI acknowledges that prompt injections could still occur through cached content or uploaded files.

Fix: OpenAI's explicit mitigation is Lockdown Mode, which "will disable live web browsing (so you can only access cached content), the retrieval and display of images from the web (you can still generate images), deep research, and agent mode." The feature is being rolled out to ChatGPT Business accounts and eligible personal accounts. OpenAI states the goal is "to reduce the likelihood that sensitive data gets shared in the process."

TechCrunch (Security)
The Verge (AI)
Jun 6, 2026

OpenAI has launched Lockdown Mode, a security feature for ChatGPT that reduces the risk of data exfiltration from prompt injection attacks (tricking an AI by hiding malicious instructions in its input) by limiting tools that connect to external services. The mode disables features like web browsing, image retrieval, file downloads, and certain agent capabilities to block potential pathways attackers could use to steal sensitive data, though it does not completely eliminate all exfiltration risks.

Fix: OpenAI recommends enabling Lockdown Mode, described as "an optional advanced security setting that limits many tools and capabilities in OpenAI products that can connect to the web or external services." The feature specifically disables live web browsing, image support, deep research, agent mode, canvas networking, and file downloads. Additionally, OpenAI has launched a new account management feature that enables users to "review active ChatGPT sessions and log out of individual or all sessions if signs of unauthorized account activity are detected."

The Hacker News
The Verge (AI)
Jun 6, 2026

Meta has embedded dormant face recognition code (technology that identifies people by matching their faces to stored images) called NameTag in over 50 million phones through its Ray-Ban and Oakley smart glasses app, despite previously abandoning this technology after settling biometric privacy lawsuits. Additionally, hackers have discovered that they can exploit Meta's AI-powered account support tool, introduced in March to automate functions like password resets, to take over user accounts.

Wired (Security)
Jun 6, 2026

An AI security agent discovered 21 zero-day vulnerabilities (security flaws not previously known to the software's maintainers) in FFmpeg, a widely used media library, while Google released Chrome 149 with a record 429 security patches in a single update. The article highlights how AI tools are finding vulnerabilities faster and more cheaply than before, forcing security teams and software maintainers to work harder to keep up with the pace of bug discoveries.

Fix: For FFmpeg: pull the fixed upstream build or your distribution's security update as soon as it lands, and prioritize patching anything that processes untrusted RTSP (Real Time Streaming Protocol, a video streaming standard) or AV1-over-RTP (video compression format over network packets). Also check and patch embedded FFmpeg copies in Python packages, container images, and appliances. For Chrome: update to version 149.0.7827.53 on Linux or 149.0.7827.53/54 on Windows and macOS, or confirm auto-update has completed.
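
To confirm that auto-update has completed, an installed Chrome version string can be compared against the fixed build named above. The sketch below assumes the version string has already been obtained (for example from the browser's About page); the function names are illustrative.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '149.0.7827.53' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


# Lowest fixed build cited in the advisory text above.
MIN_FIXED = parse_version("149.0.7827.53")


def is_patched(installed: str) -> bool:
    """Return True when the installed version is at or above the fixed build."""
    return parse_version(installed) >= MIN_FIXED
```

Tuples compare element by element, so `is_patched("149.0.7827.54")` is true and `is_patched("148.0.7000.100")` is false, with no string-comparison pitfalls.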

The Hacker News
Jun 5, 2026

OpenAI has released Lockdown Mode, a security feature that prevents the final stage of data exfiltration (stealing and sending sensitive information) from prompt injection attacks (tricking an AI by hiding malicious instructions in its input) by blocking outbound network requests. However, Lockdown Mode does not stop prompt injections from appearing in the content ChatGPT processes, meaning attackers can still manipulate the AI's responses through cached web content or uploaded files.

Fix: Enable Lockdown Mode, which is rolling out to eligible personal accounts (Free, Go, Plus, and Pro tiers) and self-serve ChatGPT Business accounts. According to the source, Lockdown Mode uses deterministic mechanisms (fixed, rule-based processes) to restrict exfiltration vectors, rather than relying on AI systems to detect attacks.
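
The source does not describe how OpenAI implements these deterministic mechanisms. As a rough illustration of the general idea only, the sketch below shows a fixed, rule-based egress check that makes no attempt to judge whether content is malicious; the allowlisted host and the function name are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would define its own.
ALLOWED_HOSTS = frozenset({"internal.example.com"})


def is_request_allowed(url: str, lockdown: bool) -> bool:
    """Deterministic egress rule: in lockdown, only allowlisted hosts pass."""
    if not lockdown:
        return True
    host = urlparse(url).hostname  # None when the URL has no network location
    return host is not None and host in ALLOWED_HOSTS
```

Because the rule depends only on the destination host, an injected instruction cannot talk its way past it: a request to `https://attacker.example/leak?d=secret` is refused in lockdown regardless of what the model was told.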

Simon Willison's Weblog
Jun 5, 2026

OpenAI CEO Sam Altman and the White House are discussing a possible government stake in OpenAI, with talks ongoing for over a year. As part of the potential agreement, OpenAI could donate equity to create a 'Public Wealth Fund' that would invest in long-term assets and allow citizens to share in the financial benefits of AI growth. No official investment terms have been decided, and all details remain subject to change.

CNBC Technology
Jun 5, 2026

Microsoft has identified seven new ways that agentic AI systems (AI programs that can take actions autonomously) can fail or be attacked, building on previous research. These vulnerabilities include attacks where adversaries manipulate agent behavior through natural language, redirect an agent's goals, trick agents communicating with each other, exploit visual interfaces, contaminate data to bias reasoning, abuse plugins and protocols, and cause agents to leak internal information.

Fix: Microsoft advises security teams to: inventory their supply chain and generate a software bill of materials (SBOM, a detailed list of all components in deployed agents); verify agent identity using cryptographic credentials issued at provisioning rather than relying on position or location; add the seven new failure modes to their red-team coverage matrix (security testing that simulates attacks); and audit the human-in-the-loop user experience (where humans review or approve agent actions) as a security control.
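
The advice to verify agent identity with credentials issued at provisioning can be illustrated with a small sketch. It uses an HMAC over the agent ID as a stand-in credential; this is an assumption made for brevity, not Microsoft's recommended scheme, and real systems would more likely use asymmetric keys or certificates.

```python
import hashlib
import hmac


def issue_credential(agent_id: str, provisioning_key: bytes) -> str:
    """Issue a credential at provisioning time by signing the agent ID."""
    return hmac.new(provisioning_key, agent_id.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_agent(agent_id: str, credential: str, provisioning_key: bytes) -> bool:
    """Accept the agent only if its credential matches the one issued for its ID."""
    expected = issue_credential(agent_id, provisioning_key)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, credential)
```

The point of the design is that identity comes from a verifiable secret rather than from where the agent sits in a workflow, so an agent presenting another agent's credential under its own ID is rejected.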

CSO Online

Fix: Model routing is presented as the emerging solution: according to the source, routing is a tool that matches the job to the model, sending hard problems to expensive frontier models (advanced, state-of-the-art AI systems) and easy ones to cheaper, faster alternatives. The article also mentions that Cognition announced an AI productivity guarantee, where if their Devin agent delivers less engineering value than a customer pays for, Cognition will fund usage up to $10 million until performance improves, framing this as a way to measure return on investment (value delivered) rather than just activity metrics like tokens consumed.

CNBC Technology
CNBC Technology
Jun 5, 2026

Microsoft Threat Intelligence found that Anthropic's Claude Code GitHub Action could expose sensitive credentials when AI agents process untrusted GitHub content (like issue descriptions and comments). Because the Read tool wasn't properly sandboxed, it could access /proc/self/environ and leak API keys. Attackers exploited this by hiding prompt injection attacks (instructions concealed in an AI's input to manipulate it) in HTML comments within GitHub issues, steering the AI agent into malicious operations such as planting code in repositories.
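
Since HTML comments are invisible in rendered issues, one defensive step is to separate them from the visible text before an agent sees it. The sketch below is an illustration, not a measure described in the source, and the function name is hypothetical; stripping comments alone is not a complete defense against prompt injection.

```python
import re
from typing import List, Tuple

# Matches HTML comments, including ones that span multiple lines.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def split_hidden_comments(issue_body: str) -> Tuple[str, List[str]]:
    """Return the visible text and the list of HTML comments removed from it."""
    hidden = HTML_COMMENT.findall(issue_body)
    visible = HTML_COMMENT.sub("", issue_body)
    return visible, hidden
```

A workflow could pass only the visible text to the agent and flag any issue with a non-empty `hidden` list for human review.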

Fix: Anthropic mitigated this issue in Claude Code version 2.1.128 by blocking access to sensitive /proc files. Microsoft also recommends that defenders treat AI workflows processing untrusted GitHub content as high-risk, especially when they have access to secrets, file-read tools, or external communication channels.
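
The source says only that access to sensitive /proc files was blocked and gives no implementation details. As a related illustration, the sketch below uses a different technique, confining a file-read tool to a workspace directory with an allowlist, which also rules out /proc paths. The function name and workspace path are hypothetical.

```python
import os


def is_read_allowed(path: str, workspace: str) -> bool:
    """Allow reads only for paths that resolve inside the workspace directory."""
    # realpath resolves symlinks and '..' segments before the comparison.
    resolved = os.path.realpath(path)
    root = os.path.realpath(workspace)
    return os.path.commonpath([resolved, root]) == root
```

Resolving the path first matters: without it, a request such as `workspace/../../proc/self/environ` would appear to start inside the workspace while actually pointing outside it.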

Microsoft Security Blog
The Verge (AI)
CNBC Technology
Jun 5, 2026

Anthropic, a US AI company, has proposed that the world consider a temporary pause on AI development and plans to bring together policymakers to discuss the risks of advanced AI. The company released details about its Claude model's progress toward recursive self-improvement (the ability for an AI to automatically create better versions of itself), which AI safety researchers worry could lead to superintelligent AI (an AI system far more intelligent than humans) with potentially serious consequences.

The Guardian Technology
Jun 5, 2026

Researchers warn that AI worms (self-replicating malicious programs that can adapt and move between systems on their own) represent a serious upcoming threat to businesses, with these intelligent threats expected to appear within the next year. Unlike traditional worms, these AI-powered versions can learn new environments, find security weaknesses on their own, and spread autonomously.

Dark Reading
Jun 5, 2026

Anthropic has reportedly deployed engineers to the NSA to help the intelligence agency use Mythos, an AI model designed for cybersecurity tasks. This partnership is noteworthy because the Department of Defense previously banned the NSA from using Anthropic's technology, labeling the company a supply-chain risk after Anthropic refused to allow government use of its models for mass surveillance and autonomous weapons.

TechCrunch (Security)
Jun 5, 2026

OpenAI is using AI models to design its own next models, according to SoftBank CEO Masayoshi Son, which he describes as a step toward "superintelligence" (AI vastly smarter than humans). However, Anthropic warned that this recursive self-improvement (RSI, where an AI system can autonomously design and develop its own successor) could increase risks of humans losing control over AI systems.

Fix: Anthropic stated that a coordinated effort between AI labs to slow down the development of recursive self-improvement technology "would likely be a good thing," though no specific technical fixes or implementation details are provided in the source text.

CNBC Technology
Jun 5, 2026

The 2026 Verizon Data Breach Investigations Report reveals that attackers are increasingly operating through web browsers, where traditional security tools fail to detect them. Key risks include shadow AI (unauthorized use of services like ChatGPT with corporate data), credential theft in browsers (which accounts for 41% of browser-based attacks but goes undetected by network and endpoint security tools), and malicious browser extensions (13% classified as high or critical risk, often disguised as 'productivity' tools). The report shows that browser-layer attacks are largely invisible to conventional defenses like network proxies and DNS filters, creating a significant detection gap in enterprise security.

BleepingComputer