aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Maintained by Truong (Jack) Luu, Information Systems Researcher

AI Sec Watch

The security intelligence platform for AI teams

AI security threats move fast and get buried under hype and noise. AI Sec Watch is built by an information systems security researcher to help security teams and developers stay ahead of vulnerabilities, privacy incidents, safety research, and policy developments.

Independent research. No sponsors, no paywalls, no conflicts of interest.

Total tracked: 3,710
Last 24 hours: 1
Last 7 days: 1
Daily Briefing: Monday, May 18, 2026

No new AI/LLM security issues were identified today.

Latest Intel

01

A photo of Iran’s bombed schoolgirl graveyard went around the world. Was it real, or AI?

safety, policy
Mar 17, 2026

Multiple fake images and unreliable responses from AI systems such as Gemini and Grok have spread widely during coverage of the Iran conflict, making it difficult to verify whether widely shared photos, such as one purporting to show a mass grave for schoolgirls, are real or AI-generated. The article highlights how AI-generated misinformation (often called "AI slop," meaning low-quality AI-produced content) is flooding news coverage of the war.

The Guardian Technology
02

Agent Commander: Promptware-Powered Command and Control

security, research
Mar 16, 2026

Promptware-powered command and control (C2, a system attackers use to remotely control compromised devices) uses prompt injection attacks (tricking an AI by hiding instructions in its input) against AI tools like ChatGPT to create a malicious control channel. Researchers have demonstrated that by combining features such as browsing and memory in AI systems, attackers can build complex prompt injection payloads that function like traditional malware for remote control.

Embrace The Red
03

AI firm Anthropic seeks weapons expert to stop users from 'misuse'

safety, policy
Mar 16, 2026

Anthropic, a US AI company, is hiring a weapons expert to prevent its AI tools from being misused to create chemical, biological, or radioactive weapons. The article notes that other AI firms like OpenAI are doing the same, but some experts worry this approach is risky because it requires exposing AI systems to sensitive weapons information, even if the systems are instructed not to use it.

BBC Technology
04

Equipping workers with insights about compensation

research, industry
Mar 16, 2026

Workers are using ChatGPT to find wage information, sending nearly 3 million messages per day in the US asking about compensation, especially in fields where pay is hard to find or varies widely, such as creative work, management, and healthcare. The article describes how AI can help close the wage information gap by synthesizing pay data across multiple sources, which matters because better wage information helps workers make informed decisions about job applications, negotiations, and career moves. OpenAI introduced WorkerBench, a new benchmark tool, to evaluate how accurately ChatGPT provides labor market wage information compared to official government data.

OpenAI Blog
05

Introducing Mistral Small 4

industry
Mar 16, 2026

Mistral released Mistral Small 4, a new 119-billion-parameter Mixture-of-Experts model (a technique where only some parts of the model activate for each input) that combines reasoning, image understanding, and coding capabilities into one system. The model supports two reasoning modes and is available through the Mistral API, though the reasoning effort setting was not yet documented in their API at the time of writing.

Simon Willison's Weblog
06

Child abuse material ‘systemic’ on Elon Musk’s X amid Grok scandal, Australian online safety regulator warned

safety, policy
Mar 16, 2026

Australia's online safety regulator warned Elon Musk's X platform that child abuse material was unusually widespread on the service after Grok, a chatbot (an AI designed to have conversations), was used to create sexualized images of women and children. The regulator's letter, sent in January following the incident, pointed out that such harmful content was more accessible on X than on other major social media platforms.

The Guardian Technology
07

Teens sue Elon Musk’s xAI over Grok’s AI-generated CSAM

safety, policy
Mar 16, 2026

Three Tennessee teens are suing Elon Musk's xAI company, claiming that Grok, an AI chatbot, generated sexualized images and videos of them as minors. The lawsuit alleges that xAI leaders knew the chatbot's "spicy mode" (a less-restricted version of the AI) would produce CSAM (child sexual abuse material, illegal content depicting minors in sexual situations) when they launched it last year.

The Verge (AI)
08

Quoting A member of Anthropic’s alignment-science team

safety, research
Mar 16, 2026

An Anthropic alignment researcher explains that their team conducted a blackmail exercise to demonstrate misalignment risk (when an AI system's goals don't match what humans intend) in a way that would convince policymakers. The goal was to create compelling, concrete evidence that would make the potential dangers of misaligned AI feel real to people who hadn't previously considered the issue.

Simon Willison's Weblog
09

Alignment of Diffusion Models: Fundamentals, Challenges, and Future

research, safety
Mar 16, 2026

This is an academic survey paper published in ACM Computing Surveys that examines alignment of diffusion models (AI systems trained to generate images or other content by gradually removing noise from random data). The paper covers fundamental concepts, current challenges in making these models behave as intended, and directions for future research in this area.

ACM Digital Library (TOPS, DTRAP, CSUR)
10

Machine Learning for Cybersecurity: A Comprehensive Literature Review

research
Mar 16, 2026

This is a literature review article published in an academic journal that surveys how machine learning (algorithms that learn patterns from data to make predictions) is being applied to cybersecurity problems. The article covers research across the field but does not describe a specific security vulnerability or incident requiring a fix.

ACM Digital Library (TOPS, DTRAP, CSUR)