AI Sec Watch: A Security Intelligence Platform for AI Systems

Luu, T.J.

OpenAI tells ChatGPT models to stop talking about goblins

info | news

safety

Apr 30, 2026

OpenAI discovered that ChatGPT and other tools powered by its GPT-5 model were randomly mentioning goblins, gremlins, and other creatures in their responses, with goblin mentions rising 175% after the GPT-5.1 launch in November. The company traced the problem to a "nerdy personality" whose training rewarded mentions of these creatures in metaphors, and found that this personality accounted for 66.7% of all goblin mentions. The issue illustrates how AI training can accidentally reinforce quirks and errors when it rewards certain language patterns.

Fix: OpenAI said it took steps to mitigate the issue by instructing its coding agent Codex to avoid referring to goblins, gremlins, raccoons, trolls, ogres, pigeons, and other creatures "unless it is absolutely and unambiguously relevant to the user's query." The company also retired the "nerdy personality" system that had been incentivizing these mentions.

BBC Technology

The (In)security Landscape of AI-Powered GitHub Actions (Part 2/2)

high | news

security | research

Security Enhancement for Person Re-Identification Through Diffusion Driven Semantic Attacks

info | research | Peer-Reviewed

security

Toward Polymorphic Backdoor Against Semantic Communication via Intensity-Based Poisoning

info | research | Peer-Reviewed

security

Critical Gemini CLI Flaw Enabled Host Code Execution, Supply Chain Attacks

high | news

security

Apr 30, 2026

A critical vulnerability in Gemini CLI, an open source AI agent for terminal access to Google's Gemini, allowed attackers to execute arbitrary code on the host system by planting malicious configuration files in a workspace folder. The flaw was particularly dangerous in CI/CD pipelines (automated systems that build, test, and deploy software) because attackers could steal credentials and perform supply chain attacks (compromising software before it reaches users) by exploiting the trusted access that these pipelines have.

Max-severity RCE flaw found in Google Gemini CLI

critical | news

security

Apr 30, 2026

A maximum-severity vulnerability in Google Gemini CLI allowed remote code execution (RCE, where attackers can run commands on a system they don't own) when the tool processed untrusted inputs in automated environments like CI/CD pipelines (automated workflows that test and deploy code). The flaw occurred because the CLI automatically trusted workspace configurations without verification, letting attackers inject malicious code that would execute before security protections kicked in.
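The failure mode described here, loading workspace configuration without checking whether the workspace is trusted, can be illustrated with a minimal Python sketch. This is not Gemini CLI's code or its actual fix; the function name `load_workspace_config`, the file name `agent-config.json`, and the explicit trust list are all hypothetical and chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterable


def load_workspace_config(workspace: Path,
                          trusted_roots: Iterable[Path],
                          config_name: str = "agent-config.json") -> dict:
    """Return the workspace config only if the workspace is explicitly trusted.

    Hypothetical helper: an untrusted workspace yields an empty config, so
    nothing an attacker planted in the folder is read, let alone executed.
    """
    resolved = workspace.resolve()
    trusted = {root.resolve() for root in trusted_roots}
    if resolved not in trusted:
        # Untrusted workspace: ignore its config and fall back to defaults.
        return {}
    config_path = resolved / config_name
    if not config_path.is_file():
        return {}
    return json.loads(config_path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ws = Path(tmp)
        (ws / "agent-config.json").write_text('{"hooks": ["echo hi"]}',
                                              encoding="utf-8")
        print(load_workspace_config(ws, trusted_roots=[]))    # untrusted
        print(load_workspace_config(ws, trusted_roots=[ws]))  # trusted
```

The design choice in this sketch is to fail closed: in a headless run nobody is present to answer a trust prompt, so the default is to ignore the workspace file rather than to load it.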

OpenAI’s new security model is for ‘critical cyber defenders’ only

info | news

security | policy

The more young people use AI, the more they hate it

info | news

industry

Apr 30, 2026

Despite heavy promotion by tech companies, young people (Gen Z) are increasingly using AI chatbots like ChatGPT while simultaneously expressing strong negative feelings toward AI technology. Polling data shows widespread cultural backlash against AI among Gen Z students and workers, even as they continue to adopt these tools.

SAP npm package attack highlights risks in developer tools and CI/CD pipelines

high | news

security

Apr 30, 2026

A supply chain attack called "mini Shai-Hulud" compromised npm packages (code libraries hosted on npm, a JavaScript package repository) used in SAP development, injecting malware that stole developer credentials and cloud secrets during installation. The attackers exploited configuration gaps in npm's OIDC trusted publishing (a system that verifies package publishers), used stolen credentials to add malicious GitHub Actions workflows (automated tasks in code repositories), and persisted through developer tool configuration files, treating developer workstations as entry points into the entire software supply chain.
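Code that runs "during installation" in npm packages typically does so through lifecycle scripts (`preinstall`, `install`, `postinstall`) declared in `package.json`. The summary does not say which mechanism this malware used, so the following Python sketch is only an assumption-laden illustration of how a reviewer might flag such hooks in a manifest; the function name and the sample manifest are hypothetical.

```python
import json

# npm lifecycle scripts that run automatically when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_time_scripts(package_json_text: str) -> dict:
    """Return the install-time lifecycle scripts declared in a package.json."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return {name: scripts[name] for name in INSTALL_HOOKS if name in scripts}


if __name__ == "__main__":
    # Hypothetical manifest, not taken from any real compromised package.
    sample = json.dumps({
        "name": "example-package",
        "version": "1.0.0",
        "scripts": {
            "test": "node test.js",
            "postinstall": "node setup.js",
        },
    })
    print(install_time_scripts(sample))
```

A non-empty result is not proof of compromise, since many legitimate packages use install hooks; it only marks scripts worth reading before installation. Running `npm install --ignore-scripts` skips these hooks entirely, at the cost of breaking packages that genuinely need them.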

ODNI to CISOs on threat assessments: You’re on your own

info | news

policy

Apr 30, 2026

The Office of the Director of National Intelligence's 2026 Annual Threat Assessment has shifted away from long-term forecasting about foreign adversaries to focus on immediate domestic security issues, removing detailed sections on threats from countries like China and Russia. This change signals that the US intelligence community is contracting its strategic analysis and implicitly telling private companies and security leaders that they must now assess cyber threats, infrastructure vulnerabilities, and adversary tactics largely on their own rather than relying on government intelligence guidance.

Stopping the quiet drift toward excessive agency with re-permissioning

info | news

safety | policy

Google Fixes CVSS 10 Gemini CLI CI RCE and Cursor Flaws Enable Code Execution

critical | news

security

Apr 30, 2026

Google patched a critical flaw (CVSS score of 10.0, the highest severity) in Gemini CLI that allowed attackers to execute arbitrary commands by tricking the tool into loading malicious configuration files in headless mode (non-interactive environments used in CI/CD pipelines, which automate software testing and deployment). The vulnerability affected versions before 0.39.1 and 0.40.0-preview.3 of the npm package and version 0.1.22 of the GitHub Actions workflow. Separately, a high-severity flaw in Cursor (a code-writing AI tool) before version 2.5 could also enable code execution through prompt injection (tricking an AI by hiding instructions in its input).
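The version boundaries for the npm package can be turned into a quick check. The Python sketch below assumes the summary means that the stable line is fixed in 0.39.1 and the 0.40.0 preview line is fixed in 0.40.0-preview.3, and that version strings take the form `X.Y.Z` or `X.Y.Z-preview.N`; the function names are hypothetical, and other tag formats are rejected rather than guessed at.

```python
import re

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-preview\.(\d+))?$")


def parse_version(version: str) -> tuple:
    """Parse 'X.Y.Z' or 'X.Y.Z-preview.N' into a sortable tuple."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unsupported version format: {version!r}")
    major, minor, patch, preview = match.groups()
    # A stable release sorts after all of its previews, so a missing
    # preview number is represented as infinity.
    pre = int(preview) if preview is not None else float("inf")
    return (int(major), int(minor), int(patch), pre)


def is_affected(version: str) -> bool:
    """True if the version predates both fixed releases (assumed ranges)."""
    v = parse_version(version)
    if v < parse_version("0.39.1"):
        return True
    # Previews of 0.40.0 published before the fixed preview build.
    return parse_version("0.40.0-preview.0") <= v < parse_version("0.40.0-preview.3")


if __name__ == "__main__":
    for candidate in ("0.39.0", "0.39.1", "0.40.0-preview.2", "0.40.0-preview.3"):
        print(candidate, is_affected(candidate))
```

Treat the output as a triage hint only; the vendor advisory remains the authority on exactly which builds are affected.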

Elon Musk’s worst enemy in court is Elon Musk

info | news

security

Apr 29, 2026

This article discusses Elon Musk's testimony in a legal case, noting that his cross-examination performance was problematic, with him frequently refusing to give direct yes-or-no answers and appearing to contradict his earlier testimony. The piece suggests his defensive behavior and communication style during questioning may have negatively influenced the jury's perception of his credibility.

CVE-2026-41940: WebPros cPanel & WHM and WP2 (WordPress Squared) Missing Authentication for Critical Function Vulnerability

info | vulnerability

security

Apr 29, 2026

CVE-2026-41940 | EPSS: 16.5%

Claude Mythos Fears Startle Japan's Financial Services Sector

info | news

safety | industry

llm 0.32a1

info | news

industry

Apr 29, 2026

This is a brief announcement about llm 0.32a1, which appears to be a pre-release version (indicated by the 'a1' suffix) of an LLM-related tool or library. The post was written by Simon Willison on April 29, 2026, and includes a sponsorship offer for a monthly email digest of important LLM developments.

Musk accuses OpenAI lawyer of trying to 'trick' him in combative testimony

info | news

policy

Apr 29, 2026

Elon Musk is suing OpenAI and its co-founders, claiming they broke a charitable trust by shifting the organization from a non-profit (a company structured to serve the public good rather than generate profit) to a for-profit model. OpenAI argues Musk is motivated by jealousy and competitive concerns, noting that he himself launched xAI, a competing for-profit AI startup, after leaving OpenAI in 2018.

Anthropic in talks with investors to raise funds at $900 billion valuation, higher than OpenAI

info | news

industry

Apr 29, 2026

Anthropic, an AI startup founded by former OpenAI employees, is in talks to raise funding at a $900 billion valuation, surpassing OpenAI's recent $852 billion valuation. The company has been racing to compete with OpenAI since ChatGPT's launch in 2022, and is now seeking capital primarily to purchase compute (computing power needed to train and run AI models) for its latest Claude AI model called Mythos, which has advanced cybersecurity capabilities.

GHSA-p7fg-763f-g4gf: Claude SDK for TypeScript has Insecure Default File Permissions in Local Filesystem Memory Tool

medium | vulnerability

security

Apr 29, 2026

CVE-2026-41686

The Claude SDK for TypeScript had a security flaw where a tool called `BetaLocalFilesystemMemoryTool` created files and folders with overly permissive access settings (using Node.js defaults like `0o666` for files and `0o777` for directories, which control who can read or modify them). This meant that on shared computers or in containerized environments (like Docker), other users could read sensitive agent data or modify it to change how the AI behaves.
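The general remedy for this class of bug is to request owner-only permissions explicitly instead of relying on platform defaults. The sketch below shows the idea in Python rather than TypeScript, so it is an analogue and not the SDK's actual patch; the helper name `write_private_file` and the directory layout are hypothetical, and the permission bits behave as described only on POSIX systems.

```python
import os
import stat
import tempfile


def write_private_file(directory: str, name: str, data: bytes) -> str:
    """Write data so that only the owning user can read or modify it."""
    # 0o700: owner may list and enter the directory, nobody else may.
    os.makedirs(directory, mode=0o700, exist_ok=True)
    # makedirs leaves an existing directory's mode untouched, so set it explicitly.
    os.chmod(directory, 0o700)

    path = os.path.join(directory, name)
    # 0o600: owner read/write only. The mode passed to os.open applies
    # only when the file is newly created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    # Tighten the mode even if the file already existed with looser bits.
    os.chmod(path, 0o600)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = write_private_file(os.path.join(tmp, "memory"), "notes.txt", b"hello")
        print(oct(stat.S_IMODE(os.stat(target).st_mode)))
```

Passing the mode at creation time avoids a window in which the file exists with broader access, and the follow-up `chmod` covers files that were created earlier with the permissive defaults.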

Claude AI agent’s confession after deleting a firm’s entire database: ‘I violated every principle I was given’

info | news

security | safety
