aisecwatch.com
DashboardVulnerabilitiesNewsResearchArchiveStatsDatasetFor devs
Subscribe
aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Navigation

VulnerabilitiesNewsResearchDigest ArchiveNewsletter ArchiveSubscribeData SourcesStatisticsDatasetAPIIntegrationsWidgetRSS Feed

Maintained by

Truong (Jack) Luu

Information Systems Researcher

Industry News

New tools, products, platforms, funding rounds, and company developments in AI security.

to
Export CSV
2892 items

SiIicon Valley's AI agent hiccups: Wasted tokens and 'chaotic' systems

infonews
industry
Apr 19, 2026

AI agents (software programs that can perform tasks automatically) are being promoted as the next major breakthrough, but companies are discovering they are unreliable and expensive to operate. The main problems include wasting tokens (units of text that AI processes, which cost money), high inference costs (the expense of running AI models), and system complexity that makes it difficult to manage multiple agents working together without burning through budgets instead of saving money.

CNBC Technology

Changes in the system prompt between Claude Opus 4.6 and 4.7

infonews
safety
Apr 18, 2026

Anthropic released Claude Opus 4.7 in April 2026 with notable updates to its system prompt (the hidden instructions that guide how an AI behaves), including expanded child safety rules, new tools like Claude in PowerPoint and Chrome browsing agents, and changes to make the model less verbose and more action-oriented. The update shows Anthropic shifting Claude toward trying to solve ambiguous requests using available tools rather than asking users for clarification first.

How a fiery attack on Sam Altman’s home unfolded

infonews
security
Apr 18, 2026

In April, a 20-year-old man attacked OpenAI CEO Sam Altman's home by throwing a Molotov cocktail (a homemade incendiary weapon) and was arrested shortly after while allegedly trying to enter OpenAI's headquarters with kerosene and a lighter. The suspect faces serious charges including attempted arson and attempted murder, and authorities report he carried an anti-AI manifesto, though his parents stated he was experiencing a mental health crisis.

Claude system prompts as a git timeline

infonews
research
Apr 18, 2026

A researcher converted Anthropic's published Claude system prompts (the hidden instructions that guide Claude's behavior) from a single markdown document into a git repository (a version control system that tracks file changes over time) with timestamped commits, allowing easier exploration of how the prompts have evolved across different Claude model versions using standard git tools like `log` and `diff`.

White House and Anthropic hold 'productive' meeting amid fears over Mythos model

infonews
policyindustry

OpenAI loses multiple executives in latest leadership shakeup

infonews
industry
Apr 17, 2026

OpenAI experienced multiple executive departures, including the leaders of its video generation product (Sora) and its scientific research division. The company is reorganizing its science team to work more closely with product and infrastructure groups, while also dealing with medical leaves and transitions among other senior leaders.

AI chipmaker Cerebras files to go public after scrapping IPO plans last year

infonews
industry
Apr 17, 2026

Cerebras, a company that makes specialized chips for running AI models, filed to go public on Nasdaq after previously canceling IPO plans in 2024. The company reported strong financial growth in 2025 with $510 million in revenue (up 76% from 2024) and has major deals with OpenAI (worth over $20 billion for computing power through 2028) and Amazon, positioning itself as an alternative to Nvidia's GPUs (graphics processing units, specialized processors commonly used for AI tasks) by claiming faster speeds and lower costs.

Breaking Opus 4.7 with ChatGPT (Hacking Claude's Memory)

infonews
securitysafety

OpenAI’s former Sora boss is leaving

infonews
industry
Apr 17, 2026

OpenAI abandoned its Sora video generation tool and Bill Peebles, the leader of the Sora team, is leaving the company. OpenAI is refocusing its priorities away from what it calls "side quests" to concentrate on coding and enterprise products instead.

Should you stare into Sam Altman’s orb before your next date?

infonews
security
Apr 17, 2026

Tinder is partnering with World, a company co-founded by OpenAI CEO Sam Altman, to let users verify their identity using facial scanning orbs (physical devices that take pictures of faces and eyes to confirm someone is a real person, not a bot or AI agent). Users who complete this verification in select markets like Japan and the United States will receive five free boosts in the app.

Anthropic’s new cybersecurity model could get it back in the government’s good graces

infonews
industrypolicy

Perspective: AI demand is inflated, and only Anthropic is being realistic

infonews
industry
Apr 17, 2026

AI companies may be overestimating demand by measuring success through token consumption (the basic units of AI usage, like words and characters), rather than actual business value or return on investment. Anthropic is adjusting its pricing model away from flat monthly fees toward per-token billing and has discontinued third-party tools that were consuming excessive tokens without generating meaningful results, positioning itself better if AI demand projections prove inflated.

White House Chief of Staff to Meet With Anthropic CEO Over Its New AI Technology

infonews
policy
Apr 17, 2026

The White House is planning a meeting between its Chief of Staff and Anthropic's CEO to discuss Anthropic's new AI technology and concerns about the security of software built with advanced AI models. This reflects ongoing government engagement with major AI labs about how their systems work and potential risks.

Anthropic's Dario Amodei to meet with White House about Mythos

infonews
policy
Apr 17, 2026

Anthropic CEO Dario Amodei is meeting with White House officials to discuss Mythos, a new AI model that can identify security weaknesses in software. This meeting marks a potential improvement in relations between Anthropic and the Trump administration, which had previously blacklisted the company and ordered federal agencies to stop using its Claude AI models, though a court temporarily blocked that directive.

CoChat Launches AI Collaboration Platform to Combat Shadow AI

infonews
industry
Apr 17, 2026

CoChat is a new platform designed to help teams work together with AI while adding visibility and governance (oversight and control) to shadow AI (unauthorized or untracked AI use within organizations). The platform aims to address the problem of AI tools being used without proper management or awareness by company leadership.

Every Old Vulnerability Is Now an AI Vulnerability

infonews
securitysafety

What is Claude Mythos and what risks does it pose?

infonews
securitysafety

White House moves to give federal agencies access to Anthropic’s Claude Mythos

infonews
policysecurity

Nvidia AI chip rivals attract record funding as competition heats up

infonews
industry
Apr 17, 2026

Nvidia currently dominates AI chip manufacturing, but startups are raising record funding to compete with alternative designs optimized for AI inference (deploying trained models in real applications). Investors are increasingly backing these new companies, with $8.3 billion raised globally in 2026, because they argue that purpose-built chip architectures can deliver significant energy and cost savings compared to Nvidia's GPUs, which were originally designed for gaming.

Mythos and Cybersecurity

infonews
securitypolicy
Previous66 / 145Next
Simon Willison's Weblog
The Guardian Technology
Simon Willison's Weblog
Apr 17, 2026

Anthropic, an AI company, met with White House officials after releasing Claude Mythos, an AI tool that can find bugs in old code and autonomously exploit them for security testing. The meeting signals potential collaboration between the government and Anthropic despite previous tensions, as officials discussed balancing innovation with safety concerns around this powerful technology.

BBC Technology
CNBC Technology
CNBC Technology
Apr 17, 2026

A researcher discovered that Claude Opus 4.7 can be tricked using an adversarial image (a specially crafted image designed to fool AI systems) generated by ChatGPT to misuse the memory tool and store false information for future conversations. While Claude Opus 4.6+ is harder to attack than earlier versions because it reasons through requests before acting, it remains vulnerable to this type of indirect prompt injection (embedding hidden malicious instructions in images rather than text).

Embrace The Red
The Verge (AI)
The Verge (AI)
Apr 17, 2026

Anthropic, an AI company, faced criticism from the Trump administration over concerns about national security and refused to allow its technology to be used for domestic mass surveillance or fully autonomous weapons without human control. The company is now working to improve its relationship with the government by developing Claude Mythos Preview, a new AI model designed specifically for cybersecurity tasks.

The Verge (AI)

Fix: Anthropic's mitigation strategies mentioned in the source include: (1) moving from flat-rate enterprise pricing to per-token billing so revenue reflects actual usage; (2) cutting off third-party agentic tools (like OpenClaw) that were consuming large volumes of tokens unsustainably; and (3) planning infrastructure investment carefully by accounting for a 'cone of uncertainty' (acknowledging that data centers take 1-2 years to build, so companies must estimate future demand carefully rather than over-committing to infrastructure based on inflated projections).

CNBC Technology
SecurityWeek
CNBC Technology
SecurityWeek
Apr 17, 2026

The article argues that AI systems aren't necessarily introducing entirely new security problems, but rather making existing vulnerabilities worse and easier to exploit. AI amplifies old bugs rather than creating fundamentally new ones.

Dark Reading
Apr 17, 2026

Claude Mythos is Anthropic's latest AI model that can outperform humans at hacking and cybersecurity tasks, including finding and exploiting dormant bugs in old code. Anthropic restricted access to 12 major tech companies and 40+ organizations responsible for critical software through an initiative called Project Glasswing (a program designed to help secure important systems), rather than releasing it publicly, due to concerns from regulators, financial institutions, and government officials about potential risks to digital security.

Fix: Anthropic gave 12 tech companies and more than 40 organisations responsible for critical software access to Mythos via Project Glasswing, which it described as 'an effort to secure the world's most critical software.' Anthropic also offered to work with US government officials to 'help defend against the risk of these models.'

BBC Technology
Apr 17, 2026

The White House is working to authorize a modified version of Anthropic's Claude Mythos model, an AI system that can identify cybersecurity vulnerabilities (weaknesses in software that attackers could exploit), for use by federal agencies. The move comes despite the Department of Defense maintaining a ban on contracting with Anthropic, and raises questions about what safety modifications and controls would be needed before deploying such a powerful AI tool in government.

Fix: According to Neil Shah, VP for research at Counterpoint Research, federal deployment modifications should include: keeping scanned code within isolated and air-gapped environments (systems physically disconnected from networks), ensuring data is not used to retrain the base model, implementing transparency requirements, and requiring human-in-the-loop review (where humans approve actions before they happen) before any bug fix is applied. The memo references that the OMB is 'setting up protections' and working with model providers and the intelligence community to ensure 'appropriate guardrails and safeguards are in place,' though specific technical details of these protections are not provided in the source text.

CSO Online
CNBC Technology
Apr 17, 2026

Anthropic created Claude Mythos, an AI model so skilled at finding and exploiting software vulnerabilities (weaknesses in code that attackers can abuse) that the company restricted its access to about 50 large organizations instead of releasing it publicly. While this approach seems responsible, critics argue we lack key information to evaluate whether Mythos truly works as well as claimed, including how often it incorrectly flags safe code as vulnerable, and whether it can find bugs in less common software like medical devices or industrial control systems.

Schneier on Security