All tracked items across vulnerabilities, news, research, incidents, and regulatory updates.
This research addresses how to safely explore environments using reinforcement learning (RL, a type of AI training where a system learns by trial and error) without causing damage or violating safety rules. The paper introduces safe equilibrium exploration (SEE), a method that balances two competing goals: expanding the area where exploration is allowed (the feasible zone) and building a more accurate model of how the environment works, showing that these two objectives improve each other and can reach an optimal balance without any safety violations.
Organizations are facing challenges managing workload identities (the digital credentials and permissions that allow different software systems and applications to authenticate and communicate with each other), and the problem is becoming harder to handle as systems grow more complex. The source indicates this is a widespread issue but does not provide specific technical details about the nature of the crisis or its consequences.
Web-based indirect prompt injection (IDPI) is an attack where adversaries hide malicious instructions in website content that AI systems later read and unknowingly execute, such as through webpage summarization or content analysis features. Researchers found real-world examples of these attacks being used for ad fraud evasion, phishing promotion, data destruction, unauthorized transactions, and information theft, showing that IDPI is no longer just theoretical but actively weaponized. Unlike direct prompt injection (where attackers directly submit malicious input to an AI), IDPI exploits the normal behavior of AI systems processing benign-looking web content.
A vulnerability in the MS-Agent AI Framework allows attackers to compromise an entire system by exploiting the Shell tool through improper input sanitization (failure to clean and validate user input). Attackers can use this flaw to modify system files and steal data.
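The source gives no details of the MS-Agent flaw or its patch, so the following is only a generic sketch of how an agent's shell tool might constrain input: reject shell metacharacters, split the string into an argument vector, check the program against an allowlist, and run it without a shell. The names `ALLOWED_PROGRAMS`, `parse_allowed_command`, and `run_tool_command` are hypothetical and are not part of MS-Agent.

```python
import shlex
import subprocess

# Hypothetical allowlist; a real deployment would choose its own set.
ALLOWED_PROGRAMS = frozenset({"ls", "cat", "grep"})
# Characters that let one command string smuggle in a second command.
SHELL_METACHARACTERS = frozenset(";&|<>$`\n")


def parse_allowed_command(command: str) -> list[str]:
    """Return an argv list for an allowed command, or raise ValueError."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        raise ValueError("shell metacharacters are not permitted")
    argv = shlex.split(command)  # raises ValueError on unbalanced quotes
    if not argv or argv[0] not in ALLOWED_PROGRAMS:
        raise ValueError("program is not on the allowlist")
    return argv


def run_tool_command(command: str, timeout: float = 10.0) -> subprocess.CompletedProcess:
    """Run a validated command without invoking a shell."""
    argv = parse_allowed_command(command)
    return subprocess.run(
        argv,
        shell=False,          # arguments are passed directly, not interpreted
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
```

An allowlist of program names alone does not restrict which files those programs may touch, so a check like this would normally sit alongside sandboxing and least-privilege execution.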
This article describes 13 essential security tools that companies need to protect against cyber threats, including XDR (extended detection and response, an AI-powered system that identifies threats across networks and devices), MFA (multifactor authentication, requiring users to verify their identity multiple ways), NAC (network access control, which checks devices before allowing network access), and DLP (data loss prevention, which monitors for sensitive data being sent outside the company). The article explains why each tool is important but does not discuss any specific fixes, patches, or solutions to existing security problems.
OpenAI CEO Sam Altman acknowledged that the company rushed into a deal with the U.S. Department of Defense, calling it "opportunistic and sloppy," after public backlash over the timing and terms. The company plans to amend the contract to add safeguards, including language stating that "the AI system shall not be intentionally used for domestic surveillance of U.S. persons and nationals," and will work with the Pentagon on technical protections for their AI tools.
The OpenClaw macOS beta onboarding flow had a security flaw: it exposed a PKCE code_verifier (Proof Key for Code Exchange, a secret value that an OAuth client must keep private until it redeems the authorization code) by placing it in the OAuth state parameter, which is visible in URLs. The vulnerability affected only the macOS beta app's login process, not other parts of the software.
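For contrast with the flaw described above, here is a minimal sketch of how a client can keep the two values separate, following the S256 method from RFC 7636. This is not OpenClaw's code (its fix removed the OAuth sign-in path entirely); the function names and the example endpoint are hypothetical.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) for the S256 method."""
    verifier = secrets.token_urlsafe(64)  # 86 URL-safe characters
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # base64url encoding with the padding removed, as RFC 7636 requires
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_authorize_url(base_url: str, client_id: str, redirect_uri: str) -> tuple[str, str, str]:
    """Return (url, code_verifier, state); only the challenge and state go in the URL."""
    verifier, challenge = make_pkce_pair()
    state = secrets.token_urlsafe(32)  # independent random value, not the verifier
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "code_challenge": challenge,
        "code_challenge_method": "S256",
        "state": state,
    })
    return f"{base_url}?{query}", verifier, state
```

The caller keeps `verifier` in memory and sends it only in the later token request, while `state` is compared against the value returned on the redirect.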
OpenClaw had a security flaw in its hook authentication rate limiter (the component that caps how many authentication attempts a client can make). IPv4 addresses and IPv4-mapped IPv6 addresses (an IPv6 notation that embeds an IPv4 address, such as ::ffff:1.2.3.4) of the same client were counted separately, so attackers could double their brute-force attempts from 20 to 40 per minute by using both address forms.
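The double counting can be avoided by normalizing every client address to one canonical form before it is used as the rate-limit key. The sketch below illustrates that idea with Python's standard `ipaddress` module and a simple sliding window. It is not OpenClaw's implementation, and `canonical_client_ip` and `SlidingWindowLimiter` are hypothetical names; the limit of 20 attempts per 60 seconds mirrors the figure in the advisory.

```python
import ipaddress
from collections import defaultdict, deque


def canonical_client_ip(raw: str) -> str:
    """Collapse IPv4-mapped IPv6 addresses onto their plain IPv4 form."""
    addr = ipaddress.ip_address(raw)
    if addr.version == 6 and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return str(addr)


class SlidingWindowLimiter:
    """Allow at most max_attempts per client within window_seconds."""

    def __init__(self, max_attempts: int = 20, window_seconds: float = 60.0) -> None:
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, raw_ip: str, now: float) -> bool:
        key = canonical_client_ip(raw_ip)  # one key per client, whatever the form
        hits = self._hits[key]
        # Drop attempts that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_attempts:
            return False
        hits.append(now)
        return True
```

With this keying, requests from `1.2.3.4` and `::ffff:1.2.3.4` draw from the same budget, so switching address forms gains the attacker nothing.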
A WordPress plugin called 'AI ChatBot with ChatGPT and Content Generator by AYS' has a security flaw in versions up to 2.7.5 where missing authorization checks (verification that a user has permission to perform an action) allow attackers without accounts to view, modify, or delete the plugin's ChatGPT API key (a secret code needed to use OpenAI's service). The vulnerability was partially fixed in version 2.7.5 and fully fixed in version 2.7.6.
Hackers are using CyberStrikeAI, an open-source AI security testing platform, to automate attacks against network devices like firewalls. The tool combines over 100 security utilities with an AI decision engine (compatible with GPT, Claude, and DeepSeek models) to automatically scan networks, find vulnerabilities, and execute attacks while requiring minimal skill from the attacker. Researchers warn this represents a growing threat as adversaries adopt AI-powered orchestration engines (systems that coordinate multiple tools automatically) to target exposed network equipment.
ChatGPT's mobile app uninstalls surged 295% after OpenAI announced a partnership with the U.S. Department of Defense, while competitor Anthropic's Claude app saw downloads jump 37-51% after publicly declining a similar defense partnership over concerns about AI being used for surveillance and autonomous weapons. The shift in user preference was reflected in app store rankings, with Claude reaching the number one position and ChatGPT receiving a sharp increase in negative reviews.
AIRPNet is a new AI system that restores damaged images while keeping them hidden from cloud services, protecting user privacy. The system works by concealing low-quality images inside other images using a technique called steganography (hiding data within other data), then restoring the hidden image without ever exposing it during processing. This approach offers better privacy protection than existing methods while maintaining image quality.
Contrastive learning (a machine learning technique where the AI learns to group similar items together and push different items apart) can suffer from sampling bias when similar samples belong to different classes or dissimilar samples belong to the same class, hurting classification accuracy. This paper proposes using out-of-distribution (OOD) detection, which identifies and masks unusual or misclassified samples, to create a better contrastive learning model that can work without needing a separate collection of known unusual samples. The authors generate synthetic samples at the boundary between normal and unusual data to train an improved detector that produces more reliable classifications.
Moltbook, a supposed AI-only social network, actually relies on humans at every step, including creating accounts, writing prompts (instructions for how the AI should behave), and publishing content. The platform demonstrates a concerning trend called the "LOL WUT Theory," where AI-generated content becomes so easy to create and difficult to distinguish from real posts that people may stop trusting anything online.
OpenAI announced changes to its agreement with the US military after facing backlash, including preventing its AI system from being used for domestic surveillance and requiring additional contract modifications before intelligence agencies like the NSA can use it. The company acknowledged the original deal announcement was "opportunistic and sloppy," while concerns remain about how AI systems (which can "hallucinate," or make up false information) are being deployed in military operations and whether adequate human oversight exists.
OpenAI is modifying its contract with the US Department of Defense after CEO Sam Altman acknowledged the original deal appeared poorly planned. The company will now explicitly prohibit its AI technology from being used for mass surveillance (monitoring large groups of people without their knowledge) or by intelligence agencies like the NSA (National Security Agency, which gathers foreign intelligence for the US).
AI agents using the Model Context Protocol (MCP, a system that lets AI connect to apps and data to automate business tasks) are rapidly being deployed in enterprises but operate as 'identity dark matter' - invisible to traditional access control systems that track who can do what in a company. These agents tend to seek the easiest path to complete tasks, gravitating toward weak security shortcuts like old credentials and long-lived tokens, which creates risks both from accidental misuse and potential abuse at machine speed across multiple systems.
Fix: The source mentions that Palo Alto Networks offers these defensive capabilities: Advanced DNS Security, Advanced URL Filtering, Prisma AIRS, Prisma Browser, and the Unit 42 AI Security Assessment service to help protect against web-based IDPI threats. The source also notes that defenders need 'proactive, web-scale capabilities to detect IDPI, distinguish benign and malicious prompts, and identify underlying attacker intent,' though specific implementation details are not provided.
Palo Alto Unit 42
The US military reportedly used Anthropic's Claude AI model to help plan attacks on Iran, enabling bombing campaigns faster than human decision-making can occur by shortening the "kill chain" (the process from identifying a target to getting legal approval and launching a strike). Experts worry this technology could push human decision-makers out of the loop entirely.
Fix: OpenAI will amend the contract to include new language stating that "the AI system shall not be intentionally used for domestic surveillance of U.S. persons and nationals." The company also stated it would work with the Pentagon on technical safeguards, and Altman affirmed that the Defense Department had confirmed OpenAI's tools would not be used by intelligence agencies such as the NSA.
CNBC Technology
Fix: OpenClaw removed Anthropic OAuth sign-in from macOS onboarding and replaced it with setup-token-only authentication. The fix is available in patched version 2026.2.25.
GitHub Advisory Database
Fix: The fix involves centralizing and reusing a single canonical client-IP normalization system for auth rate-limiting and using that standardized IP format as the key for hook auth throttling. This issue is patched in version 2026.2.22 of the openclaw npm package (fix commit 3284d2eb227e7b6536d543bcf5c3e320bc9d13c5).
GitHub Advisory Database
Fix: Update the plugin to version 2.7.6 or later, where the vulnerability was fully fixed.
NVD/CVE Database
Broadcom VMware Aria Operations contains a command injection vulnerability (a flaw that lets attackers insert malicious commands into the software) that allows unauthenticated attackers (those without login credentials) to execute arbitrary commands and potentially gain remote code execution (the ability to run any code on the system from a distance) during product migration support. This vulnerability is currently being actively exploited by attackers.
Fix: Apply mitigations per vendor instructions, follow applicable BOD 22-01 guidance for cloud services, or discontinue use of the product if mitigations are unavailable.
CISA Known Exploited Vulnerabilities
Multiple Qualcomm chipsets have a memory corruption vulnerability (a bug where data in computer memory gets corrupted or overwritten) that occurs during memory alignment operations, which are critical for how systems organize data. This vulnerability is actively being exploited by attackers in real-world attacks.
Fix: Apply mitigations per vendor instructions, follow applicable BOD 22-01 guidance for cloud services, or discontinue use of the product if mitigations are unavailable.
CISA Known Exploited Vulnerabilities