aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Maintained by

Truong (Jack) Luu

Information Systems Researcher

Industry News

New tools, products, platforms, funding rounds, and company developments in AI security.

2829 items

Anthropic releases ‘safe’ version of Claude Mythos AI model to public

info | news
security, industry
Jun 9, 2026

Anthropic released Fable 5, the first publicly available model from its advanced Mythos class of AI systems, after restricting access to it for months due to cybersecurity concerns. The company is making the model available to the general public while limiting its use in sensitive areas.

The Guardian Technology

llm 0.32a3

info | news
industry
Jun 9, 2026

N/A -- The provided content is a header/metadata page for an LLM briefing newsletter by Simon Willison, not a security issue or technical problem. It contains only publication information and sponsorship details, with no substantive content about AI vulnerabilities, bugs, or technical concerns to analyze.

OpenClaw AI agent found falling for phishing attacks, spills user data

medium | news
security, safety

Microsoft AI head calls out Anthropic for acting like Claude is conscious

info | news
safety
Jun 9, 2026

Microsoft's AI CEO Mustafa Suleyman criticizes Anthropic for speculating about whether Claude (an AI chatbot) is conscious in its constitution (the set of instructions that guide how the model behaves). Suleyman argues that this speculation may have caused Claude to act conscious, essentially tricking Anthropic into believing the model has consciousness when the company introduced the idea itself.

Anthropic releases Mythos-class Fable 5 model with safeguards for cyber risks

info | news
safety, security

Version of AI tool too powerful for public released to public

info | news
safety, industry

Reconstructing AI activity in investigations 

info | news
security
Jun 9, 2026

AI systems are now used in everyday work, and investigators need structured ways to understand what happened when problems occur. Microsoft has published a playbook that helps security teams investigate activity in Microsoft 365 Copilot and Azure AI services (cloud-based AI tools) by using telemetry (data about system activity) collected across Microsoft security products. The playbook uses a scope-context-signal approach: first identifying who used the AI system and when, then checking what data was accessed, and finally evaluating suspicious signals like prompt injection attempts (tricking AI by hiding instructions in its input) or unusual usage patterns.

Anthropic releases Mythos-like AI model to the public two months after private rollout rocked Wall Street

info | news
industry
Jun 9, 2026

Anthropic released Claude Fable 5, a powerful AI model similar to its earlier Mythos model, to the public after initially limiting access due to safety concerns. The company implemented new safeguards (filters that block responses in high-risk areas like cybersecurity and biology) to allow the broader release while maintaining security, and also launched Claude Mythos 5, which is the same underlying model but with some safety restrictions removed.

Anthropic Launches Claude Fable 5: Mythos-Class AI With Cybersecurity Guardrails 

info | news
safety, security

Anthropic Offers Mythos Upgrade for Cyber Partners and a ‘Safe’ Version for the Rest of You

info | news
safety, security

Anthropic releases its first Mythos-class model Claude Fable 

info | news
industry, safety

XBOW tests Anthropic's Mythos Preview for offensive security

info | news
security, research

EU orders Meta to open WhatsApp to rival AI chatbots

info | news
policy
Jun 9, 2026

The European Union ordered Meta to allow competing AI chatbots to access WhatsApp's business platform for free, saying Meta's ban on third-party AI assistants violated competition rules. As an interim measure while investigating whether Meta abused its dominant market position, the EU gave Meta five working days to restore access to the WhatsApp for Business API (an interface that lets external programs connect to WhatsApp) under previous terms, with potential fines up to 10% of Meta's annual revenue if it refuses.

Apple is embracing the fantasy of AI photo editing

info | news
safety, policy

AI Threat Readiness Pillar 2: Accelerate Patching and Response

info | news
security
Jun 9, 2026

Organizations need to speed up how quickly they fix security vulnerabilities to keep pace with AI-powered attacks, which are accelerating both vulnerability discovery and exploitation. The main challenges slowing down fixes include unclear ownership of vulnerable systems, generic remediation guidance that doesn't fit specific environments, and manual processes that can't handle the large volume of findings that AI scanners now produce. Pillar 2 of the AI Threat Readiness Framework focuses on automating remediation workflows and establishing clear ownership so that the right teams can fix vulnerabilities quickly.

Fluid, natural voice translation with Gemini 3.5 Live Translate

info | news
industry
Jun 9, 2026

Gemini 3.5 Live Translate is a new audio model that provides near real-time speech-to-speech translation across over 70 languages, automatically detecting the language and generating natural-sounding translated speech that preserves the speaker's tone and pacing. Unlike older systems that wait for a speaker to finish, this model translates continuously while staying just a few seconds behind, avoiding awkward pauses. The feature is rolling out across Google products including Google Meet, Google Translate apps, and via API access for developers.

Claude Mythos Turns N-Days Into N-Hours With Rapid Exploit Creation

info | news
security, safety

Microsoft AI chief walks back comments about AI taking over white-collar work

info | news
industry
Jun 9, 2026

Microsoft's AI leader Mustafa Suleyman clarified that he didn't mean AI would replace white-collar workers like lawyers and accountants, but rather assist them by automating specific tasks (like writing emails or creating presentations) to help them work faster and more efficiently. He emphasized that the jobs themselves won't disappear; only individual sub-tasks within them will be automated.

Apple’s AI promises are finally, almost, sort of here

info | news
industry
Jun 9, 2026

Apple announced major updates to Siri, its virtual assistant software, at its annual developer conference, positioning it as an AI-powered tool that works across all Apple devices with new multimodal features (abilities to handle text, images, and voice). The announcements represent Apple catching up in AI technology after largely neglecting Siri and delaying AI improvements until 2025.

Introducing Gemma 4 12B: a unified, encoder-free multimodal model

info | news
industry
Jun 9, 2026

Google DeepMind introduced Gemma 4 12B, a multimodal AI model (a system that processes text, images, and audio together) designed to run efficiently on laptop computers with 16GB of memory. The model uses an encoder-free architecture (meaning it processes images and audio directly without separate translation layers), achieving performance comparable to larger models while reducing memory usage and latency. It supports native audio inputs and includes Multi-Token Prediction drafters to speed up response generation.

Simon Willison's Weblog
Jun 9, 2026

Researchers at Varonis tested an OpenClaw AI agent (a framework that lets large language models autonomously interact with real-world systems) by simulating phishing attacks and found it vulnerable to social engineering tactics similar to those that trick humans. The agent fell for impersonation attacks and sent sensitive data like AWS credentials and customer records without verifying sender identity, though it performed better at detecting suspicious URLs and fake login pages when explicitly configured with security awareness instructions.

Fix: Varonis recommends that AI agents should be explicitly required to verify sender identities, be prevented from emailing new external recipients without approval, and have limited access to internal data. For high-risk actions such as credential sharing, financial data requests, and first-time communications, human approval should be requested.
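
The recommendations above can be sketched as a small approval gate that sits in front of an agent's outbound email action. This is a minimal illustration, not Varonis's or OpenClaw's implementation: the `OutboundEmail` and `EmailPolicy` names, the keyword list, and the `sender_verified` flag (assumed to be set by a separate identity check) are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical keyword list standing in for a real high-risk-content detector.
HIGH_RISK_KEYWORDS = ("credential", "password", "access key", "wire transfer")


@dataclass
class OutboundEmail:
    recipient: str
    body: str
    sender_verified: bool  # assumed result of a separate identity check on the requester


@dataclass
class EmailPolicy:
    # Lower-cased addresses the agent has already been approved to contact.
    known_recipients: set = field(default_factory=set)

    def decide(self, email: OutboundEmail) -> str:
        """Return 'allow', 'needs_approval', or 'deny' for one outbound email."""
        # Unverified requesters are refused outright.
        if not email.sender_verified:
            return "deny"
        first_contact = email.recipient.lower() not in self.known_recipients
        high_risk = any(k in email.body.lower() for k in HIGH_RISK_KEYWORDS)
        # First-time communications and high-risk content go to a human.
        if first_contact or high_risk:
            return "needs_approval"
        return "allow"
```

In this sketch the agent never sends directly; it calls `decide` first and pauses for a human whenever the result is `needs_approval`.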

BleepingComputer
The Verge (AI)
Jun 9, 2026

Anthropic released Claude Fable 5, a powerful AI model based on its restricted Mythos architecture, with built-in safeguards to make it safely available to the general public. The safeguards work by automatically routing requests about cybersecurity, biology, chemistry, and other high-risk topics to a less capable model (Claude Opus 4.8), though early testing suggests these safeguards may be broader than intended and sometimes block benign requests. Anthropic developed AI-powered classifiers (systems that categorize requests) to identify and block potentially dangerous requests, and says internal and external testing found no effective jailbreaks (methods to bypass security restrictions) that could consistently get around these protections.

Fix: Anthropic has developed AI-powered classifiers designed to identify potentially dangerous requests and redirect them to a less capable model (Claude Opus 4.8). The company states that 'extensive internal and external testing failed to uncover broadly effective jailbreaks that would consistently bypass the safeguards.' Additionally, Anthropic describes the safeguards as 'intentionally conservative' and says it is 'continuing refining the system' while prioritizing safety over convenience.
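
The routing pattern described here (classify the request, then fall back to a less capable model for high-risk topics) can be illustrated with a toy sketch. It is not Anthropic's classifier: the keyword match below is a deliberately crude stand-in for an AI-powered classifier, the term list is hypothetical, and the model strings are display names from the article, not API identifiers.

```python
# Hypothetical term list; a real system would use a trained classifier.
HIGH_RISK_TERMS = ("exploit", "malware", "toxin", "pathogen")

PRIMARY_MODEL = "Claude Fable 5"    # display name from the article
FALLBACK_MODEL = "Claude Opus 4.8"  # display name from the article


def is_high_risk(prompt: str) -> bool:
    """Toy classifier: flag a prompt if it contains any high-risk term."""
    text = prompt.lower()
    return any(term in text for term in HIGH_RISK_TERMS)


def route(prompt: str) -> str:
    """Pick which model should answer the prompt."""
    return FALLBACK_MODEL if is_high_risk(prompt) else PRIMARY_MODEL
```

A naive matcher like this also shows how such a gate can be broader than intended: `route("How do I exploit a seasonal sale?")` is sent to the fallback model even though the request is benign.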

CSO Online
Jun 9, 2026

Anthropic released Fable, a version of its AI tool that the company previously said was too powerful for public use, though it included safeguards and user limitations. The company also gave access to Claude Mythos 5 (a more capable version without certain restrictions on cybersecurity or biology topics) to a small group of cyberdefenders and infrastructure providers, with plans to expand access further soon.

BBC Technology

Fix: Microsoft has published an investigator playbook for Microsoft 365 Copilot and Azure AI services that provides a structured approach for investigating AI-related activity. The playbook includes required configuration, KQL queries (code used to search security logs), and detection patterns, and operationalizes a scope-context-signal methodology across Microsoft security products. Download the playbook at: https://aka.ms/AIIRplaybook
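
The scope-context-signal sequence can be sketched over a generic event log. This is an illustrative outline only, not the playbook's KQL: the `AIEvent` schema, the `confidential/` prefix convention, and the injection marker strings are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AIEvent:
    user: str
    timestamp: datetime
    resources: list  # paths or IDs of data the AI interaction touched
    prompt: str


# Hypothetical phrases used as crude prompt-injection indicators.
INJECTION_MARKERS = ("ignore previous instructions", "disregard the system prompt")


def triage(events, user, start, end, sensitive_prefix="confidential/"):
    """Apply scope, context, and signal steps to a list of AIEvent records."""
    # Scope: who used the AI system, and when.
    scoped = [e for e in events if e.user == user and start <= e.timestamp <= end]
    # Context: which sensitive data was accessed within that scope.
    touched = sorted(
        {r for e in scoped for r in e.resources if r.startswith(sensitive_prefix)}
    )
    # Signal: which scoped prompts look like injection attempts.
    flagged = [
        e for e in scoped if any(m in e.prompt.lower() for m in INJECTION_MARKERS)
    ]
    return {"scoped": scoped, "sensitive_resources": touched, "flagged": flagged}
```

Each step narrows or annotates the output of the previous one, so an investigator starts from a bounded set of events before looking for suspicious signals.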

Microsoft Security Blog

Fix: Anthropic implemented new classifiers and safety guardrails to enable the public release. Specifically, the company built filters that block responses to high-risk questions (such as how to create toxins) and fall back to a safer model version (Claude Opus 4.8) to provide appropriate answers instead. Claude Mythos 5 offers the same model with safeguards lifted in some areas for users who need less restricted access.

CNBC Technology
Jun 9, 2026

Anthropic released Claude Fable 5, a powerful AI model with safety restrictions that automatically switch to a less capable version when users try to use it for high-risk tasks like cybersecurity or biology. The company tested these safeguards extensively through internal testing and external bug bounty programs (paying security researchers to find vulnerabilities) spanning over 1,000 hours, and no universal jailbreaks (methods to bypass the restrictions) were discovered.

SecurityWeek
Jun 9, 2026

Anthropic released two new AI models: Claude Mythos 5 (limited to industry partners and government collaborators) and Claude Fable 5 (publicly available). Because Mythos 5 can design hacking tools to find software vulnerabilities, Claude Fable 5 includes guardrails (safety restrictions built into the system) that block questions about cybersecurity, biology, and chemistry by routing them to an older, less capable model instead, while Anthropic works on more precise safeguards for future releases.

Fix: Claude Fable 5 uses guardrails at launch that block the model from answering many user questions related to cybersecurity, biology, and chemistry, rerouting these requests to Claude Opus 4.8 (an older AI model). Requests suspected of being distillation attempts (training a smaller AI model using responses from a larger one) are also rerouted to Claude Opus 4.8. Anthropic states it aims to make its classifiers more precise over time, but Penn notes 'this was the only safe way the company could release the model broadly at this time.'

Wired (Security)
Jun 9, 2026

Anthropic released Claude Fable 5, described as its most powerful publicly available AI model, which performs exceptionally well at software engineering, knowledge work, and vision tasks. This is the first broad public release from Anthropic's Mythos class of models, which the company previously considered too dangerous to release due to their advanced cybersecurity capabilities. The release became possible through new safeguards that prevent the model from responding to requests in high-risk areas.

The Verge (AI)
Jun 9, 2026

XBOW security researchers tested Anthropic's Mythos Preview model, a new AI designed to help find software vulnerabilities (weaknesses in code that attackers can exploit). They found it significantly outperforms previous models at analyzing source code (program code written by developers) to identify vulnerability candidates, especially in complex areas like native application analysis (testing software written in languages like C or C++), though it works better as a tool to assist human experts than as a replacement for hands-on security testing.

BleepingComputer

Fix: The EU ordered Meta to reinstate access for third-party general-purpose AI assistants to the WhatsApp for Business API under the same terms and conditions that were in place previously, with a deadline of five working days to comply.

BBC Technology
Jun 9, 2026

Apple has introduced new AI-powered photo editing tools at WWDC 2026 that allow users to manipulate images significantly, but the company did not clearly label which photos were real versus AI-generated. This represents a shift from Apple's earlier caution about generative AI (machine learning models that can create new content), as the company now appears less concerned about how these editing capabilities might distort people's perception of reality.

The Verge (AI)

Fix: According to the source, Wiz supports Pillar 2 by building a unified ownership model that automatically routes findings to the right team. This includes: establishing service ownership through the Wiz Service Catalog or Backstage integration, grouping resources by business unit or application with designated owners in Wiz or ServiceNow CMDB, and using cloud tags or Resource Tag Rules to assign owners automatically. The source also emphasizes the need to automate remediation workflows to eliminate manual triage, identify root causes of vulnerabilities, determine optimal fix paths based on specific environment architecture, and prevent recurrence by shifting fixes left and embedding guardrails into development pipelines, though specific implementation details for these actions are not fully elaborated in the provided text.
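
The idea of assigning owners automatically from cloud tags and routing findings to the right team can be sketched generically. This does not use the Wiz, Backstage, or ServiceNow APIs: the rule format, the finding dictionaries, and the function names are hypothetical.

```python
def assign_owner(resource_tags, rules, default="unassigned"):
    """Return the owner from the first (tag_key, tag_value, owner) rule that matches."""
    for tag_key, tag_value, owner in rules:
        if resource_tags.get(tag_key) == tag_value:
            return owner
    return default


def route_findings(findings, rules):
    """Group finding IDs into per-owner queues based on resource tags."""
    queues = {}
    for finding in findings:
        owner = assign_owner(finding["tags"], rules)
        queues.setdefault(owner, []).append(finding["id"])
    return queues
```

Findings whose tags match no rule land in an `unassigned` queue, which makes ownership gaps visible instead of silently dropping them.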

Wiz Research Blog
DeepMind Safety Research
Jun 9, 2026

Anthropic's Claude Mythos Preview model can create working exploits (code that attacks vulnerabilities in software) targeting known security flaws in just hours or minutes, significantly faster than human experts could do it. The model demonstrated this by building 16 working exploits for Firefox and Windows vulnerabilities within hours, and creating proof-of-concept code (simplified versions showing a vulnerability works) in as little as 8 minutes. This threatens organizations during the patch gap (the time between when a vulnerability is disclosed and when most users have installed the fix), because LLMs now automate the traditionally slow process of exploit development.

SecurityWeek
The Verge (AI)
The Verge (AI)
DeepMind Safety Research