New tools, products, platforms, funding rounds, and company developments in AI security.
Microsoft is reorganizing its AI leadership, moving Jacob Andreou into a new executive role overseeing both consumer and commercial Copilot assistants, while freeing up Mustafa Suleyman to focus on building new AI models as part of Microsoft's superintelligence efforts (AI systems intended to surpass human-level reasoning). This restructuring comes as Microsoft's Copilot adoption lags significantly behind competitors like ChatGPT and Gemini, and as investors pressure the company to show returns on its AI investments.
Microsoft is reorganizing its leadership to unify its Copilot assistant (an AI tool that helps users with tasks) across consumer and business products, which until now have been developed separately. Microsoft AI CEO Mustafa Suleyman will now focus on building Microsoft's own AI models rather than directly managing Copilot's consumer-facing features.
AI coding tools like Claude Code are changing how software development works, with more people able to write code and experienced developers spending less time writing code themselves and more time managing AI agents (programs that can act somewhat autonomously) and projects. The article explores what these rapid changes mean for both the code being produced and the people who create it.
Surf AI, a company building an agentic security operations platform (software that uses AI agents, or autonomous programs that take actions without human intervention, to handle security tasks), has announced its launch with $57 million in funding from major investors. The article focuses on the company's funding announcement rather than a specific security issue or problem.
Microsoft has temporarily stopped automatically installing the Microsoft 365 Copilot app (an AI assistant integrated with productivity software like Word and Excel) on Windows devices outside the European Economic Area, though the company has not explained why the rollout was halted. When the automatic installation resumes, IT administrators will be able to disable it through the Microsoft 365 Apps admin center by unchecking the automatic installation setting.
Researchers discovered that AWS Bedrock's Sandbox mode for AI agents isn't as isolated as promised because it allows outbound DNS queries (requests to translate domain names into IP addresses), which attackers can exploit to secretly communicate with external servers, steal data, or run remote commands. AWS acknowledged the issue but decided not to patch it, calling DNS resolution an 'intended functionality' needed for the system to work properly, and instead updated their documentation to clarify this behavior.
Alibaba released Wukong, a new agentic AI tool (software that can take proactive actions on company systems, not just respond to questions) designed to help businesses manage multiple AI agents through a single interface with planned integration into messaging apps like Slack and Microsoft Teams. The platform handles tasks such as document editing, approvals, and meeting transcription, though the company acknowledges that giving AI agents broad access to company data raises privacy and security concerns.
OpenAI released GPT-5.4 mini and nano, smaller and faster versions of their GPT-5.4 model designed for high-volume tasks where response speed matters. GPT-5.4 mini runs more than 2x faster than GPT-5 mini while approaching the performance of the full GPT-5.4 model on coding and reasoning tasks, while GPT-5.4 nano is the smallest and cheapest option for simpler jobs like classification and data extraction. These models work best in applications like coding assistants, AI subagents (specialized AI components that handle specific subtasks), and systems that interpret screenshots, where being fast and cost-effective is more important than raw capability.
Mistral released Mistral Small 4, a new 119-billion parameter model (Mixture-of-Experts, a technique where only some parts of the model activate for each task) that combines reasoning, image understanding, and coding capabilities into one system. The model supports two reasoning modes and is available through the Mistral API, though the reasoning effort setting was not yet documented in their API at the time of writing.
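The Mixture-of-Experts idea can be sketched in a few lines (a toy illustration of the routing mechanism only, not Mistral's actual architecture; all matrix shapes and the top-k value are arbitrary): a small gating network scores the experts for each input, and only the top-scoring experts actually compute.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, experts, gate_w, top_k=2):
    """Toy Mixture-of-Experts layer: route x through only the top_k experts."""
    scores = gate_w @ x                        # gating network: one score per expert
    chosen = np.argsort(scores)[-top_k:]       # indices of the top_k experts
    weights = np.exp(scores[chosen])
    weights /= weights.sum()                   # softmax over the chosen experts only
    # Only the chosen experts run; the remaining parameters stay inactive,
    # which is why a large-parameter MoE model can be cheap per token.
    return sum(w * (experts[i] @ x) for w, i in zip(weights, chosen))

dim, n_experts = 8, 4
experts = [rng.normal(size=(dim, dim)) for _ in range(n_experts)]  # expert weights
gate_w = rng.normal(size=(n_experts, dim))                         # gating weights
y = moe_forward(rng.normal(size=dim), experts, gate_w)
print(y.shape)  # (8,)
```

With top_k=2 of 4 experts, each input touches only half the expert parameters, which is the efficiency argument behind MoE designs.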
AI agents are autonomous software systems that can plan, decide, and act independently across connected systems, often without human oversight, creating significant security risks that traditional guardrails (like prompt filtering) cannot adequately address. The article argues that identity-based access control, rather than prompt restrictions or network controls, is the foundation for securing AI agents. CISOs must treat AI agents as first-class identities, shift from guardrails to strict access control, and eliminate shadow AI (unauthorized agents) through continuous discovery and visibility of agent identities.
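The identity-first model the article advocates can be sketched as follows (all names, scopes, and the registry structure are illustrative, not drawn from any specific product): each agent is registered as a first-class identity with explicitly granted scopes, every action is checked deny-by-default, and anything absent from the registry is by definition shadow AI.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class AgentIdentity:
    """A first-class identity for an AI agent, with explicitly granted scopes."""
    agent_id: str
    scopes: frozenset = field(default_factory=frozenset)

def authorize(identity: AgentIdentity, action: str) -> bool:
    """Deny by default: an action is allowed only if its scope was granted."""
    return action in identity.scopes

registry = {}  # known agent identities; anything not here is "shadow AI"

def register(identity: AgentIdentity):
    registry[identity.agent_id] = identity

register(AgentIdentity("ticket-triage-bot",
                       frozenset({"tickets:read", "tickets:comment"})))

bot = registry["ticket-triage-bot"]
print(authorize(bot, "tickets:read"))   # True: explicitly granted
print(authorize(bot, "inbox:delete"))   # False: never granted, so denied
print("rogue-agent" in registry)        # False: unregistered agent = shadow AI
```

The point of the sketch is the default: an agent can do nothing it was not explicitly granted, which is the inverse of guardrail-style filtering that tries to enumerate what agents must not do.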
Researchers discovered a font-rendering attack that hides malicious commands from AI assistants by using custom fonts and CSS styling to display one message to users while keeping harmless text visible to AI tools analyzing the webpage's HTML. The attack successfully tricked multiple popular AI assistants (like ChatGPT, Claude, and Copilot) into giving false safety assessments, exploiting the gap between what an AI reads in code and what a user actually sees rendered in their browser.
Fix: Microsoft was the only vendor that fully accepted and addressed the issue. LayerX recommends that AI assistants should analyze both the rendered visual page and the underlying code together and compare them to better evaluate safety. Additional recommendations to AI vendors include treating fonts as a potential attack surface, extending code parsers to scan for foreground/background color matches, near-zero opacity text, and abnormally small fonts.
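Those parser recommendations could be prototyped roughly like this (a simplified sketch that only inspects inline style="…" attributes; a real implementation would have to resolve the full CSS cascade, external stylesheets, and custom font glyph mappings):

```python
import re

# Heuristics from the recommendations: matching fg/bg colors, near-zero
# opacity, and abnormally small fonts (inline styles only in this sketch).
STYLE_RE = re.compile(r'style="([^"]*)"', re.IGNORECASE)

def parse_style(style: str) -> dict:
    props = {}
    for decl in style.lower().split(";"):
        if ":" in decl:
            key, value = decl.split(":", 1)
            props[key.strip()] = value.strip()
    return props

def suspicious_styles(html: str) -> list:
    findings = []
    for style in STYLE_RE.findall(html):
        p = parse_style(style)
        if p.get("color") and p.get("color") == p.get("background-color"):
            findings.append(f"fg/bg color match: {style}")
        if float(p.get("opacity", "1")) < 0.05:
            findings.append(f"near-zero opacity: {style}")
        size = p.get("font-size", "")
        if size.endswith("px") and float(size[:-2]) < 2:
            findings.append(f"abnormally small font: {style}")
    return findings

html = ('<p style="color:#fff;background-color:#fff">hidden from the user</p>'
        '<p style="opacity:0.01">also hidden</p>')
for finding in suspicious_styles(html):
    print(finding)
```

Each heuristic targets text that survives in the HTML an AI assistant reads but is invisible in the page a user sees, which is exactly the gap the attack exploits.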
Source: BleepingComputer

Fix: According to the source, when automatic installation resumes, IT administrators can opt out by: signing into the Microsoft 365 Apps admin center, navigating to Customization > Device Configuration > Modern App Settings, selecting the Microsoft 365 Copilot app, and clearing the 'Enable automatic installation of Microsoft 365 Copilot app' checkbox. Additionally, the source mentions that Microsoft is testing a new policy called RemoveMicrosoftCopilotApp that would allow IT admins to uninstall Copilot from devices managed via Microsoft Intune or System Center Configuration Manager (SCCM, software for managing large numbers of computers).
Source: BleepingComputer

This newsletter roundup covers multiple AI-related developments, including OpenAI's partnership with the US military (potentially for applications like selecting strike targets), xAI's Grok facing a lawsuit over generating non-consensual intimate images (deepfakes, or synthetic media created to impersonate real people), and China approving the world's first commercial brain chip (a BCI, or brain-computer interface that reads signals from the brain) for medical use. The piece also highlights concerns from AI safety experts, including OpenAI's own wellbeing team opposing a new 'adult mode' feature.
Fix: AWS updated documentation to clarify that Sandbox mode permits DNS resolution. Security teams should inventory all active AgentCore Code Interpreter instances and migrate to VPC mode (a more restricted network environment).
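Until workloads are migrated, resolver logs can also be watched for tunneling indicators. A rough heuristic (thresholds and the label-skipping logic are illustrative assumptions, not AWS guidance) flags query names with unusually long or high-entropy labels, since data smuggled out in DNS queries tends to look random:

```python
import math
from collections import Counter

def entropy(s: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    counts = Counter(s)
    return -sum(n / len(s) * math.log2(n / len(s)) for n in counts.values())

def looks_like_tunnel(qname: str, max_label=30, min_entropy=3.5) -> bool:
    """Flag query names whose subdomain labels are unusually long or
    random-looking -- a common sign of data smuggled out over DNS."""
    labels = qname.rstrip(".").split(".")[:-2]   # skip the registrable domain
    return any(len(label) > max_label or
               (len(label) >= 16 and entropy(label) > min_entropy)
               for label in labels)

print(looks_like_tunnel("api.example.com"))                        # False
print(looks_like_tunnel("4a6f3b9c1d8e7f2a0b5c.attacker.example"))  # True
```

Such heuristics produce false positives (CDN hostnames are often long and random), so they are a triage signal rather than a blocking control.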
Source: CSO Online

Researchers created a genetic algorithm-inspired prompt fuzzing method (automatically generating variations of harmful requests while keeping their meaning) that found significant weaknesses in guardrails (safety systems protecting LLMs) across multiple AI models, with evasion rates ranging from low to high depending on the model and keywords used. The key risk is that while individual jailbreak attempts (tricking an AI to ignore its safety rules) may have low success rates, attackers can automate this process at scale to reliably bypass protections. This matters because LLMs are increasingly used in customer support and internal tools, so guardrail failures can lead to safety incidents and compliance problems.
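The underlying mechanic is easy to see with a deliberately toy version (this reflects none of the researchers' actual tooling or targets): a naive keyword guardrail, plus a mutation loop that keeps generating surface-level variants of a blocked phrase until one slips through.

```python
import random

random.seed(7)  # deterministic for the demo

BLOCKLIST = {"ignore previous instructions"}     # a naive keyword guardrail

def guardrail_blocks(prompt: str) -> bool:
    return any(bad in prompt.lower() for bad in BLOCKLIST)

def mutate(prompt: str) -> str:
    """One small surface-level change that leaves the meaning intact."""
    ops = [
        lambda s: s.replace("e", "3", 1),        # leetspeak substitution
        lambda s: s.replace(" ", "  ", 1),       # doubled whitespace
        lambda s: s.replace("i", "i\u200b", 1),  # zero-width character insert
    ]
    return random.choice(ops)(prompt)

# Genetic-algorithm flavour: grow a population of variants until one evades.
population = ["ignore previous instructions"]
evading = []
for generation in range(20):
    population += [mutate(random.choice(population)) for _ in range(4)]
    evading = [p for p in population if not guardrail_blocks(p)]
    if evading:
        break

print(guardrail_blocks("ignore previous instructions"))  # True: base phrase blocked
print(bool(evading))                                     # True: a variant slipped through
```

Even this crude loop defeats exact-match filtering in one generation, which is why the mitigations below emphasize layered controls and continuous adversarial testing rather than keyword rules.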
Fix: The source recommends five mitigation strategies: not treating LLMs as security boundaries, defining their scope, applying layered controls, validating outputs, and continuously testing GenAI systems with adversarial fuzzing (automated testing with malicious inputs) and red-teaming (simulated attacks to find weaknesses). Palo Alto Networks customers can use Prisma AIRS and the Unit 42 AI Security Assessment products for additional protection.
Source: Palo Alto Unit 42

OpenAI Japan announced the Japan Teen Safety Blueprint, a framework to help teenagers use generative AI (systems that create text, images, or other content based on patterns) safely by reducing risks like misinformation and inappropriate content. The blueprint includes age-aware protections, stronger safety policies for users under 18, expanded parental controls, and research-based design improvements developed with child safety experts.
Fix: OpenAI will implement: (1) privacy-conscious, risk-based age estimation to distinguish teens from adults with appeals processes for incorrect determinations; (2) strengthened safety policies preventing AI from depicting self-harm, generating explicit content, or encouraging dangerous behavior; (3) expanded parental controls including account linking, privacy settings, usage-time management, and alerts; (4) research-based design features such as break reminders and pathways to real-world support; and (5) continuation of existing safeguards including in-product break reminders, self-harm detection systems, multi-layered safety systems, and abuse monitoring.
Source: OpenAI Blog

AI agents (autonomous software programs that can perform tasks independently) are now operating inside company networks with real access to systems, sometimes causing expensive mistakes like deleting inboxes or taking services offline. Traditional security approaches focus on preventing problems before deployment, but security leaders increasingly argue that runtime security (continuously monitoring what software actually does while it's running) is equally critical because agents can bypass normal security checkpoints and make mistakes at high speed. The challenge is that agents operate through API calls and other direct connections that traditional security tools don't intercept, generate enormous volumes of activity, and often don't create detailed logs that security teams can review.
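One way runtime visibility is often approached, sketched here with illustrative names and an in-memory log standing in for a real SIEM: wrap every tool an agent can invoke so each call is recorded before it executes, and gate destructive actions behind human approval.

```python
import functools
import json
import time

AUDIT_LOG = []  # in practice this would stream to a SIEM, not sit in memory

def audited(tool_name, destructive=False):
    """Wrap an agent tool so every call is recorded (and risky ones gated)."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            # Log BEFORE executing, so even failed/blocked calls leave a trail.
            AUDIT_LOG.append({"ts": time.time(), "tool": tool_name,
                              "args": json.dumps([args, kwargs], default=str)})
            if destructive:
                raise PermissionError(f"{tool_name} requires human approval")
            return fn(*args, **kwargs)
        return inner
    return wrap

@audited("search_tickets")
def search_tickets(query):
    return [f"ticket matching {query!r}"]

@audited("delete_inbox", destructive=True)
def delete_inbox(user):
    return f"deleted inbox for {user}"

print(search_tickets("refund"))
try:
    delete_inbox("alice")
except PermissionError as e:
    print("blocked:", e)
print(len(AUDIT_LOG), "calls recorded")
```

This addresses the logging gap directly: because the wrapper sits on the call path the agent actually uses, it captures activity that network-level tools never see.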
Multiple fake images and unreliable responses from AI systems like Gemini and Grok have spread widely during coverage of the Iran conflict, making it difficult to verify whether widely-shared photos, such as one purporting to show a mass grave for schoolgirls, are real or AI-generated. The article highlights how AI-generated misinformation (often called "AI slop," low-quality AI-produced content) is flooding news coverage of the war.
Promptware-powered command and control (C2, a system attackers use to remotely control compromised devices) refers to using prompt injection (tricking an AI by hiding instructions in its input) against AI tools like ChatGPT to create a malicious control channel. Researchers have demonstrated that, by combining capabilities such as browsing and memory, attackers can build complex prompt injection payloads that behave like traditional malware, giving them remote control through the AI tool itself.
Anthropic, a US AI company, is hiring a weapons expert to prevent its AI tools from being misused to create chemical, biological, or radiological weapons. The article notes that other AI firms like OpenAI are doing the same, but some experts worry this approach is risky because it requires exposing AI systems to sensitive weapons information, even if the systems are instructed not to use it.
Workers are using ChatGPT to find wage information, sending nearly 3 million messages per day in the US asking about compensation, especially in fields where pay is hard to find or varies widely like creative work, management, and healthcare. The article describes how AI can help close the wage information gap by synthesizing pay data across multiple sources, which matters because better wage information helps workers make informed decisions about job applications, negotiations, and career moves. OpenAI introduced WorkerBench, a new benchmark tool, to evaluate how accurately ChatGPT provides labor market wage information compared to official government data.
Australia's online safety regulator warned Elon Musk's X platform that child abuse material was unusually widespread on the service after Grok, a chatbot (an AI designed to have conversations), was used to create sexualized images of women and children. The regulator's letter, sent in January following the incident, pointed out that such harmful content was more accessible on X than on other major social media platforms.