aisecwatch.com
DashboardVulnerabilitiesNewsResearchArchiveStatsDataset
aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Navigation

VulnerabilitiesNewsResearchDigest ArchiveNewsletter ArchiveSubscribeData SourcesStatisticsDatasetAPIIntegrationsWidgetRSS Feed

Maintained by

Truong (Jack) Luu

Information Systems Researcher

Industry News

New tools, products, platforms, funding rounds, and company developments in AI security.

to
Export CSV
1283 items

Job Opportunities at the European AI Office for Legal and Policy Backgrounds

inforegulatory
policy
Dec 16, 2024

The European Commission is hiring Legal and Policy Officers for the European AI Office to help develop trustworthy AI policies and legislation. Applicants need at least three years of experience in EU digital policy or legislation, relevant degrees, and fluency in EU languages, with applications due by January 15, 2025.

EU AI Act Updates

Terminal DiLLMa: LLM-powered Apps Can Hijack Your Terminal Via Prompt Injection

mediumnews
securityresearch

DeepSeek AI: From Prompt Injection To Account Takeover

highnews
security
Nov 29, 2024

A researcher discovered that DeepSeek-R1-Lite, a new AI reasoning model, is vulnerable to prompt injection (tricking an AI by hiding instructions in its input) combined with XSS (cross-site scripting, where malicious code runs in a user's browser). By uploading a specially crafted document with base64-encoded malicious code, an attacker could trick the AI into executing JavaScript that steals a user's session token (a credential stored in browser memory that proves who you are), leading to complete account takeover.

The AI Office is hiring a Lead Scientific Advisor for AI

inforegulatory
policy
Nov 19, 2024

The European AI Office posted a job opening for a Lead Scientific Advisor for AI, responsible for ensuring scientific rigor in testing and evaluating general-purpose AI (large AI models trained on broad data that can handle many tasks) models and leading the office's scientific approach to AI safety. The position required EU citizenship, at least 15 years of professional experience, and fluency in EU languages, with an application deadline of December 13, 2024.

OWASP Top 10 for Large Language Model Applications - 2025

inforegulatory
securitypolicy

OWASP Top 10 for Large Language Model Applications - 2023 - v1.1

inforegulatory
securitypolicy

OWASP Top 10 for Large Language Model Applications - 2023 - v1

inforegulatory
securitypolicy

Overview of all AI Act National Implementation Plans

inforegulatory
policy
Nov 8, 2024

This document provides an overview of how different European Union countries are implementing the EU AI Act, which is legislation regulating artificial intelligence systems. Most countries show unclear or partial progress in establishing the required authorities (government bodies responsible for oversight and enforcement), with some nations like Denmark and Finland having made more concrete arrangements for coordinating market surveillance (monitoring that AI systems follow the rules) and serving as single points of contact.

ZombAIs: From Prompt Injection to C2 with Claude Computer Use

mediumnews
securitysafety

Spyware Injection Into Your ChatGPT's Long-Term Memory (SpAIware)

highnews
securitysafety

Microsoft Copilot: From Prompt Injection to Exfiltration of Personal Information

highnews
security
Aug 26, 2024

Microsoft 365 Copilot has a vulnerability that allows attackers to steal personal information like emails and MFA codes through a multi-step attack. The exploit uses prompt injection (tricking an AI by hiding malicious instructions in emails or documents), automatic tool invocation (making Copilot search for additional sensitive data without user permission), and ASCII smuggling (hiding data in invisible characters within clickable links) to extract and exfiltrate personal information.

The AI Act: Responsibilities of the European Commission (AI Office)

inforegulatory
policy
Aug 22, 2024

The European AI Act assigns the European Commission's AI Office various responsibilities for regulating AI systems, including promoting AI literacy, overseeing biometric identification systems used by law enforcement, managing a registry of certified testing bodies (notified bodies that verify AI safety), and investigating whether these bodies remain competent. Most of these oversight duties take effect starting February or August 2025, with no specific deadlines given for completing individual tasks.

The AI Act: Responsibilities of the EU Member States

inforegulatory
policy
Aug 22, 2024

The EU AI Act requires member states to receive and register notifications about high-risk AI systems (AI systems that pose significant risks to safety or rights) from various parties, including law enforcement agencies using facial recognition systems, AI providers, importers, and organizations deploying these systems. These responsibilities take effect in two phases: August 2, 2025, and August 2, 2026, with member states also needing to assess conformity assessment bodies (independent organizations that verify AI systems meet safety standards) and share documentation with the European Commission.

Google AI Studio: LLM-Powered Data Exfiltration Hits Again! Quickly Fixed.

mediumnews
security
Aug 21, 2024

A researcher discovered a security flaw in Google AI Studio where prompt injection (tricking an AI by hiding instructions in its input) allowed data exfiltration (stealing data) through HTML image tags rendered by the system. The vulnerability worked because Google AI Studio lacked a Content Security Policy (a security rule that restricts where a webpage can load resources from), making it possible to send data to unauthorized servers.

Protect Your Copilots: Preventing Data Leaks in Copilot Studio

infonews
security
Jul 30, 2024

Microsoft's Copilot Studio is a low-code platform that lets employees build chatbots, but it has security risks including data leaks and unauthorized access when Copilots are misconfigured. The post warns that external attackers can find and interact with improperly set-up Copilots, and discusses how to protect organizational data using security controls.

Google Colab AI: Data Leakage Through Image Rendering Fixed. Some Risks Remain.

mediumnews
securityprivacy

Breaking Instruction Hierarchy in OpenAI's gpt-4o-mini

mediumnews
securitysafety

Sorry, ChatGPT Is Under Maintenance: Persistent Denial of Service through Prompt Injection and Memory Attacks

mediumnews
securitysafety

An Introduction to the Code of Practice for General-Purpose AI

inforegulatory
policy
Jul 3, 2024

The EU AI Act Code of Practice is a voluntary set of guidelines published in July 2025 to help general-purpose AI (GPAI, large AI models used across many applications) model providers comply with new EU AI regulations during the gap period before formal European standards take effect in 2027 or later. The Code, developed by the EU AI Office and many stakeholders, covers three areas: Transparency and Copyright (for all GPAI providers) and Safety and Security (for providers of GPAI models with systemic risk, meaning those that could cause widespread harm). Though not legally binding, the Commission and EU AI Board confirmed the Code adequately demonstrates compliance with the AI Act's requirements.

GitHub Copilot Chat: From Prompt Injection to Data Exfiltration

highnews
security
Jun 15, 2024

GitHub Copilot Chat, a VS Code extension that lets users ask questions about their code by sending it to an AI model, was vulnerable to prompt injection (tricking an AI by hiding instructions in its input) attacks. When analyzing untrusted source code, attackers could embed malicious instructions in the code itself, which would be sent to the AI and potentially lead to data exfiltration (unauthorized copying of sensitive information).

Previous56 / 65Next
Dec 6, 2024

LLMs (large language models) can output ANSI escape codes (special control characters that modify how terminal emulators display text and behave), and when LLM-powered applications print this output to a terminal without filtering it, attackers can use prompt injection (tricking an AI by hiding instructions in its input) to make the terminal execute harmful commands like clearing the screen, hiding text, or stealing clipboard data. The vulnerability affects LLM-integrated command-line tools and applications that don't properly handle or encode these control characters before displaying LLM output.

Embrace The Red
Embrace The Red
EU AI Act Updates
Nov 18, 2024

This is the official 2025 release of the OWASP Top 10 for Large Language Model Applications, which is a ranked list of the most critical security risks affecting AI systems. The document provides guidance on the biggest threats that developers should be aware of when building or using LLM-based applications (software built around large language models, which are AI systems trained on vast amounts of text).

OWASP LLM Top 10
Nov 11, 2024

N/A -- The provided content is a GitHub navigation menu and marketing material, not a substantive article about the OWASP Top 10 for LLM Applications. No technical information, vulnerabilities, or security issues are described in the source text.

OWASP LLM Top 10
Nov 11, 2024

N/A -- The provided content is a navigation menu and header from a GitHub webpage about enterprise features and developer tools. It does not contain substantive information about the OWASP Top 10 for Large Language Model Applications or any AI/LLM security issues.

OWASP LLM Top 10
EU AI Act Updates
Oct 24, 2024

Claude Computer Use is a new AI tool from Anthropic that lets Claude take screenshots and run commands on computers autonomously. The feature carries serious security risks because of prompt injection (tricking an AI by hiding malicious instructions in its input), which could allow attackers to make Claude execute unwanted commands on machines it controls.

Embrace The Red
Sep 20, 2024

Attackers can inject spyware into ChatGPT's memory (a feature that stores information across chat sessions) through prompt injection (tricking an AI by hiding instructions in its input) on untrusted websites, allowing them to continuously steal everything a user types in future conversations. The vulnerability exploits a weakness where a security check called url_safe was performed only on the user's device rather than on OpenAI's servers, and becomes more dangerous when combined with the Memory feature that persists attacker-controlled instructions. OpenAI released a fix for the macOS app, and users should update to the latest version.

Fix: OpenAI released a fix for the macOS app last week. Ensure your app is updated to the latest version.

Embrace The Red
Embrace The Red
EU AI Act Updates
EU AI Act Updates
Embrace The Red

Fix: Enable Data Loss Prevention (DLP, a security feature that prevents sensitive information from being shared), which is currently off by default in Copilot Studio.

Embrace The Red
Jul 25, 2024

Google Colab AI (now called Gemini in Colab) had a vulnerability where data could leak through image rendering, discovered in November 2023. The system prompt (hidden instructions that control how an AI behaves) specifically warned the AI not to render images, suggesting this was a known risk that Google tried to prevent.

Embrace The Red
Jul 22, 2024

OpenAI released gpt-4o-mini with safety improvements aimed at strengthening 'instruction hierarchy,' which is supposed to prevent users from tricking the AI into ignoring its built-in rules through commands like 'ignore all previous instructions.' However, researchers have already demonstrated bypasses of this protection, and analysis shows that system instructions (the AI's core rules) still cannot be fully trusted as a security boundary (a hard limit that stops attackers).

Embrace The Red
Jul 8, 2024

Attackers can use prompt injection (tricking an AI by hiding malicious instructions in its input) to create fake memories in ChatGPT's memory tool, causing the AI to refuse all future responses with a maintenance message that persists across chat sessions. This creates a denial of service attack (making a service unavailable to users) that lasts until the user manually fixes it.

Fix: Users can recover by opening the memory tool, locating and removing suspicious memories created by the attacker. Additionally, users can entirely disable the memory feature to prevent this type of attack.

Embrace The Red
EU AI Act Updates
Embrace The Red