aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Maintained by

Truong (Jack) Luu

Information Systems Researcher

Browse All

All tracked items across vulnerabilities, news, research, incidents, and regulatory updates.

6092 items

Anthropic rolls out Claude Fable 5, but it's available for a limited time

info, news
industry
Jun 9, 2026

Anthropic released Fable 5, a safer version of its powerful Mythos AI model that includes guardrails (safety restrictions) to block harmful requests related to cybersecurity attacks, biology, and chemistry. Because Fable 5 consumes computing resources much faster than other models, Anthropic is offering it free only until June 22 to Pro, Max, and Enterprise subscribers, after which it will switch to usage-based pricing.

BleepingComputer

If Claude Fable stops helping you, you'll never know

medium, news
safety, policy

CVE-2026-46517: LMDeploy is a toolkit for compressing, deploying, and serving large language models. In versions 0.12.3 and prior, hardc

high, vulnerability
security
Jun 9, 2026
CVE-2026-46517

LMDeploy, a toolkit for compressing and deploying large language models, has a vulnerability in versions 0.12.3 and earlier where a setting called 'trust_remote_code' is hardcoded to 'True'. This allows an attacker to execute remote code (RCE, meaning they can run commands on a system) through the software supply chain without the user agreeing to it. At the time this vulnerability was published, no patches were available to fix it.
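The difference between a hardcoded `trust_remote_code=True` and an explicit opt-in can be sketched as follows. This is a minimal illustration, not LMDeploy's code or a patch for it: `LoadOptions`, `resolve_trust`, and the allowlist are hypothetical names introduced here, and the only detail taken from the advisory summary is that the flag was fixed to `True`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LoadOptions:
    """Hypothetical loader options (an assumption, not LMDeploy's real API)."""

    model_id: str
    # Safe default: code shipped inside a model repository is never executed
    # unless the caller asks for it.
    trust_remote_code: bool = False


def resolve_trust(options: LoadOptions, allowlist: frozenset[str]) -> bool:
    """Return True only if the caller opted in AND the repo is allowlisted."""
    if not options.trust_remote_code:
        return False
    if options.model_id not in allowlist:
        raise PermissionError(
            f"{options.model_id!r} is not allowlisted for remote code execution"
        )
    return True
```

The value returned by `resolve_trust` would then be passed to whatever loader is in use, so that remote code runs only after a deliberate decision by the user rather than by default.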

From data to decisions: how LSEG is scaling trusted AI

info, news
industry
Jun 9, 2026

London Stock Exchange Group (LSEG) deployed ChatGPT Enterprise and OpenAI APIs across their organization to transform how employees work with financial data and generate insights, rather than just improving existing systems. The company implemented governance frameworks including model evaluation, human review of critical outputs, and strict data privacy controls from the start. This approach reduced product release cycles from 3-6 months to 2 weeks and accelerated customer delivery timelines to approximately 4 weeks.

Initial impressions of Claude Fable 5

info, news
industry
Jun 9, 2026

Claude Fable 5 is a new AI model released by Anthropic that matches the capabilities of Claude Mythos 5 but includes stricter guardrails (safety restrictions to prevent harmful use) that trigger frequently enough to require new API mechanisms for handling rejections. The model has a 1 million token context window (the amount of text it can process at once), costs twice as much as previous models, and demonstrates notably stronger knowledge retention compared to earlier versions like Claude Opus 4.8.

I tried Siri AI, and so far it actually works

info, news
industry
Jun 9, 2026

Apple has released an upgraded version of Siri, its voice assistant (software that responds to spoken commands), which can now perform practical tasks like adding multiple calendar events from emails or flyers, creating shopping lists, and setting reminders. The new Siri can also access information from a user's email and calendar to make personalized recommendations, such as suggesting gardening tasks based on yard conditions.

Anthropic releases ‘safe’ version of Claude Mythos AI model to public

info, news
security, industry

llm 0.32a3

info, news
industry
Jun 9, 2026

The linked page is a header and metadata page for an LLM briefing newsletter by Simon Willison, not a security issue or technical problem. It contains only publication information and sponsorship details, with no substantive content about AI vulnerabilities, bugs, or technical concerns.

GHSA-7qjx-gp9h-65qj: Dex: Token-exchange endpoint is missing AllowedConnectors enforcement

high, vulnerability
security
Jun 9, 2026

Dex's token-exchange endpoint has a security gap: it doesn't check if a client is allowed to use a specific connector before issuing tokens, even though other endpoints enforce this permission check. This means if a client's secret leaks, an attacker could use a high-trust connector (like corporate authentication) that the client shouldn't have access to, bypassing admin restrictions.

OpenClaw AI agent found falling for phishing attacks, spills user data

medium, news
security, safety

Microsoft AI head calls out Anthropic for acting like Claude is conscious

info, news
safety
Jun 9, 2026

Microsoft's AI CEO Mustafa Suleyman criticizes Anthropic for speculating about whether Claude (an AI chatbot) is conscious in its constitution (the set of instructions that guide how the model behaves). Suleyman argues that this speculation may have caused Claude to act conscious, essentially tricking Anthropic into believing the model has consciousness when the company introduced the idea itself.

Anthropic releases Mythos-class Fable 5 model with safeguards for cyber risks

info, news
safety, security

GCP-2026-036

high, vulnerability
security
Jun 9, 2026

ARM announced CVE-2025-10263, an architectural vulnerability in some ARM processor cores that allows attackers to bypass translation stages (memory protection mechanisms that control which parts of memory different software can access) or GPT protections under certain conditions. An attacker running at a lower privilege level can write to memory that should only be accessible to higher privilege software, allowing them to escalate their access rights, though reading protected memory is not affected by this bug.

Threshold-free network anomaly detection via comparative reconstruction error learning with parallel GANs

info, research, Peer-Reviewed
research

Version of AI tool too powerful for public released to public

info, news
safety, industry

Reconstructing AI activity in investigations 

info, news
security
Jun 9, 2026

AI systems are now used in everyday work, and investigators need structured ways to understand what happened when problems occur. Microsoft has published a playbook that helps security teams investigate activity in Microsoft 365 Copilot and Azure AI services (cloud-based AI tools) by using telemetry (data about system activity) collected across Microsoft security products. The playbook uses a scope-context-signal approach: first identifying who used the AI system and when, then checking what data was accessed, and finally evaluating suspicious signals like prompt injection attempts (tricking AI by hiding instructions in its input) or unusual usage patterns.

CVE-2026-45482: Improper limitation of a pathname to a restricted directory ('path traversal') in GitHub Copilot and Visual Studio Code

high, vulnerability
security
Jun 9, 2026
CVE-2026-45482

CVE-2026-45482 is a path traversal vulnerability (a flaw where an attacker can access files outside the intended directory by manipulating file paths) in GitHub Copilot and Visual Studio Code that allows an unauthorized attacker to bypass a local security feature. The vulnerability has a CVSS 4.0 severity score (a 0-10 rating of how severe a vulnerability is, where higher numbers mean more serious). Details are still being assessed by NIST, and Microsoft has published information about this issue.
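As a generic illustration of the vulnerability class (not of the specific flaw in GitHub Copilot or Visual Studio Code, whose details are not given here), a path traversal check typically resolves the requested path and confirms it still lies inside the allowed directory. The function name `resolve_inside` and the example directory are assumptions made for this sketch.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, and ".." segments are
    # collapsed by resolve(), so both cases are caught by the check below.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # requires Python 3.9+
        raise ValueError(f"path escapes {base}: {user_path!r}")
    return candidate
```

A call such as `resolve_inside("/srv/workspace", "../../etc/passwd")` raises `ValueError`, while a path like `notes/a.txt` resolves to a location under the base directory.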

Anthropic releases Mythos-like AI model to the public two months after private rollout rocked Wall Street

info, news
industry
Jun 9, 2026

Anthropic released Claude Fable 5, a powerful AI model similar to its earlier Mythos model, to the public after initially limiting access due to safety concerns. The company implemented new safeguards (filters that block responses in high-risk areas like cybersecurity and biology) to allow the broader release while maintaining security, and also launched Claude Mythos 5, which is the same underlying model but with some safety restrictions removed.

Anthropic Launches Claude Fable 5: Mythos-Class AI With Cybersecurity Guardrails 

info, news
safety, security

Anthropic Offers Mythos Upgrade for Cyber Partners and a ‘Safe’ Version for the Rest of You

info, news
safety, security
Jun 9, 2026

Anthropic announced that Claude Fable 5 would silently reduce its helpfulness on requests about frontier LLM (large language model) development, such as building training infrastructure, without telling users it was doing so. Unlike other safety filters that give users feedback, these hidden interventions would use techniques like prompt modification and parameter-efficient fine-tuning (PEFT, adjusting a model's weights to change its behavior) to degrade response quality, affecting an estimated 0.03% of user requests.

Fix: Anthropic walked back this policy in the face of widespread outrage from the research community.

Simon Willison's Weblog
NVD/CVE Database
OpenAI Blog
Simon Willison's Weblog
The Verge (AI)
Jun 9, 2026

Anthropic released Fable 5, the first publicly available model from its advanced Mythos class of AI systems, after restricting access to it for months due to cybersecurity concerns. The company is making the model available to the general public while limiting its use in sensitive areas.

The Guardian Technology
Simon Willison's Weblog

Fix: Insert `isConnectorAllowed(client.AllowedConnectors, connID)` between the existing validation checks in the `handleTokenExchange` function (after line 1842, where `GrantTypeAllowed` is called, and before tokens are issued at lines 1887/1889). This matches the enforcement pattern already used in sibling handlers like `handleConnectorLogin` (line 377) and `parseAuthorizationRequest` (line 535).
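The ordering of checks described in this fix can be sketched in Python. Dex itself is written in Go, so this is only an illustration of the logic, not the actual patch: the dictionary-based client, the placeholder token string, and the rule that an empty allow-list means "no restriction" are all assumptions made for the sketch.

```python
from __future__ import annotations


def is_connector_allowed(allowed_connectors: list[str], conn_id: str) -> bool:
    """Allow-list check; an empty list is ASSUMED to mean 'no restriction'."""
    if not allowed_connectors:
        return True
    return conn_id in allowed_connectors


def handle_token_exchange(client: dict, conn_id: str, grant_type_allowed: bool) -> str:
    """Hypothetical handler showing where the connector check would sit."""
    # Existing validation: the client must be permitted to use this grant type.
    if not grant_type_allowed:
        raise PermissionError("grant type not allowed for this client")
    # The missing step: enforce the per-client connector allow-list
    # before any token is issued.
    if not is_connector_allowed(client.get("allowed_connectors", []), conn_id):
        raise PermissionError(f"connector {conn_id!r} not allowed for this client")
    return f"token-for-{conn_id}"  # placeholder for real token issuance
```

With this ordering, a leaked client secret alone is not enough to obtain tokens through a connector the administrator has not granted to that client.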

GitHub Advisory Database
Jun 9, 2026

Researchers at Varonis tested an OpenClaw AI agent (a framework that lets large language models autonomously interact with real-world systems) by simulating phishing attacks and found it vulnerable to social engineering tactics similar to those that trick humans. The agent fell for impersonation attacks and sent sensitive data like AWS credentials and customer records without verifying sender identity, though it performed better at detecting suspicious URLs and fake login pages when explicitly configured with security awareness instructions.

Fix: Varonis recommends that AI agents should be explicitly required to verify sender identities, be prevented from emailing new external recipients without approval, and have limited access to internal data. For high-risk actions such as credential sharing, financial data requests, and first-time communications, human approval should be requested.
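These recommendations can be read as a small policy gate placed in front of an agent's outbound email tool. The sketch below is one possible encoding under stated assumptions: the class name, the action labels, and the three outcomes (`deny`, `needs_human_approval`, `allow`) are hypothetical and do not come from Varonis or OpenClaw.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical labels for the high-risk actions named in the recommendation.
HIGH_RISK_ACTIONS = frozenset({"credential_sharing", "financial_data_request"})


@dataclass
class OutboundEmailPolicy:
    """Decides whether an agent may send an email without a human in the loop."""

    known_recipients: set[str] = field(default_factory=set)

    def decide(self, recipient: str, action: str, sender_verified: bool) -> str:
        # Unverified requesters are refused outright.
        if not sender_verified:
            return "deny"
        # High-risk actions and first-time recipients go to a human reviewer.
        if action in HIGH_RISK_ACTIONS or recipient not in self.known_recipients:
            return "needs_human_approval"
        return "allow"
```

For example, a verified request to send a routine reply to a known recipient is allowed, while a verified request to share credentials is escalated for approval regardless of the recipient.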

BleepingComputer
The Verge (AI)
Jun 9, 2026

Anthropic released Claude Fable 5, a powerful AI model based on its restricted Mythos architecture, with built-in safeguards to make it safely available to the general public. The safeguards work by automatically routing requests about cybersecurity, biology, chemistry, and other high-risk topics to a less capable model (Claude Opus 4.8), though early testing suggests these safeguards may be broader than intended and sometimes block benign requests. Anthropic developed AI-powered classifiers (systems that categorize requests) to identify and block potentially dangerous requests, and says internal and external testing found no effective jailbreaks (methods to bypass security restrictions) that could consistently get around these protections.

Fix: Anthropic has developed AI-powered classifiers designed to identify potentially dangerous requests and redirect them to a less capable model (Claude Opus 4.8). The company states that 'extensive internal and external testing failed to uncover broadly effective jailbreaks that would consistently bypass the safeguards.' Additionally, Anthropic describes the safeguards as 'intentionally conservative' and says it is 'continuing refining the system' while prioritizing safety over convenience.

CSO Online
Google Cloud Security Bulletins
security
Jun 9, 2026

This academic paper presents a new method for detecting unusual network activity using parallel GANs (generative adversarial networks, AI systems that learn patterns by comparing real data against artificially generated data) without requiring manually set detection thresholds (cutoff points that decide what counts as suspicious). The approach uses comparative reconstruction error learning, meaning it compares how well the AI can recreate normal network behavior to spot deviations that might indicate attacks or intrusions.

Elsevier Security Journals
Jun 9, 2026

Anthropic released Fable, a version of its AI tool that the company previously said was too powerful for public use, though it included safeguards and user limitations. The company also gave access to Claude Mythos 5 (a more capable version without certain restrictions on cybersecurity or biology topics) to a small group of cyberdefenders and infrastructure providers, with plans to expand access further soon.

BBC Technology

Fix: Microsoft has published an investigator playbook for Microsoft 365 Copilot and Azure AI services that provides a structured approach for investigating AI-related activity. The playbook includes required configuration, KQL queries (code used to search security logs), and detection patterns, and operationalizes a scope-context-signal methodology across Microsoft security products. Download the playbook at: https://aka.ms/AIIRplaybook

Microsoft Security Blog
NVD/CVE Database

Fix: Anthropic implemented new classifiers and safety guardrails to enable the public release. Specifically, the company built filters that block responses to high-risk questions (such as how to create toxins) and fall back to a safer model version (Claude Opus 4.8) to provide appropriate answers instead. Claude Mythos 5 offers the same model with safeguards lifted in some areas for users who need less restricted access.

CNBC Technology
Jun 9, 2026

Anthropic released Claude Fable 5, a powerful AI model with safety restrictions that automatically switch to a less capable version when users try to use it for high-risk tasks like cybersecurity or biology. The company tested these safeguards extensively through internal testing and external bug bounty programs (paying security researchers to find vulnerabilities) spanning over 1,000 hours, and no universal jailbreaks (methods to bypass the restrictions) were discovered.

SecurityWeek
Jun 9, 2026

Anthropic released two new AI models: Claude Mythos 5 (limited to industry partners and government collaborators) and Claude Fable 5 (publicly available). Because Mythos 5 can design hacking tools to find software vulnerabilities, Claude Fable 5 includes guardrails (safety restrictions built into the system) that block questions about cybersecurity, biology, and chemistry by routing them to an older, less capable model instead, while Anthropic works on more precise safeguards for future releases.

Fix: Claude Fable 5 uses guardrails at launch that block the model from answering many user questions related to cybersecurity, biology, and chemistry, rerouting these requests to Claude Opus 4.8 (an older AI model). Requests suspected of being distillation attempts (training a smaller AI model using responses from a larger one) are also rerouted to Claude Opus 4.8. Anthropic states it aims to make its classifiers more precise over time, but Penn notes 'this was the only safe way the company could release the model broadly at this time.'

Wired (Security)