New tools, products, platforms, funding rounds, and company developments in AI security.
Anthropic CEO Dario Amodei stated the company will not allow the U.S. Department of Defense to use its AI models without restrictions on fully autonomous weapons and mass domestic surveillance, despite Pentagon threats to label the company a supply chain risk or invoke the Defense Production Act. The DoD counters that it only wants to use the models for lawful purposes and has given Anthropic until Friday evening to agree to unrestricted access, with competing AI companies like OpenAI and Google already accepting these terms.
Anthropic rejected the Pentagon's demands for unrestricted access to its AI system, refusing to agree to two specific uses: mass surveillance of Americans and lethal autonomous weapons (weapons that can kill targets without human oversight). The refusal came just before a deadline set by Defense Secretary Pete Hegseth, who wanted to renegotiate AI contracts with the military.
Anthropic's CEO Dario Amodei refused the Pentagon's demand for unrestricted access to the company's AI systems, citing two concerns: mass surveillance of Americans and fully autonomous weapons (weapons that make decisions without human involvement) with no human oversight. The Pentagon threatened to label Anthropic a security risk or use the Defense Production Act (a law giving the president power to force companies to prioritize defense production) to force compliance, but Amodei said the company would work with the military under its proposed safeguards or help transition to another provider if the Pentagon chose to end the relationship.
Microsoft is previewing Copilot Tasks, an AI system that runs on Microsoft's cloud servers to complete repetitive work for you, such as scheduling appointments or creating study plans, while you use your own device for other tasks. You can describe what you want using plain English and set the tasks to run once, on a schedule, or repeatedly, and the AI will send you a report when finished.
Mistral AI, a French AI research lab, has partnered with Accenture, a large consulting firm, to develop enterprise software powered by Mistral's AI models and deploy it to clients and employees. This partnership reflects a growing trend where AI companies are working with consulting firms to help businesses actually adopt and benefit from AI tools, following similar recent deals by competitors like OpenAI and Anthropic.
Google released Nano Banana 2, an updated version of its AI image generator that can now pull real-time information from Gemini (Google's AI assistant) for more accurate results, generate images faster, and render text more precisely. The new model replaces the previous version across Gemini's different service tiers, while the older Nano Banana Pro remains available for tasks that need maximum accuracy.
Google has released Nano Banana 2, a more powerful version of its AI image generation model that is now available to free users instead of just paid subscribers. This update brings advanced image generation features that were previously exclusive to the paid Pro version, allowing users to create complex images faster and more cheaply by combining real-time information and web search capabilities.
Google announced Nano Banana 2, a new image generation model (software that creates images from text descriptions) that produces more realistic images faster than previous versions. The model will become the default option across Google's Gemini app, Search, and other tools, and can maintain consistency for up to five characters and 14 objects in a single image. All images generated will include a SynthID watermark (a digital marker identifying AI-created content) and support C2PA Content Credentials (an industry standard for tracking media authenticity).
Norway's $2 trillion sovereign wealth fund (Norges Bank Investment Management) is using Anthropic's Claude AI model, a large language model (an AI trained on vast text data to generate human-like responses), to screen investments for ethical and governance risks. The AI tool scans companies for potential issues like forced labor or corruption within 24 hours of investment, helping the fund identify and sell risky positions before broader market awareness, with particular value for researching smaller companies in emerging markets where local language news coverage is limited.
Anthropic has revived Claude 3 Opus, a retired AI model, to write a weekly newsletter called Claude's Corner on Substack where it will share creative content and insights. Anthropic staff will review and publish each post without editing the AI's writing, though the company reserves the right to remove content that meets unspecified criteria.
A study found that ChatGPT Health, a feature that lets users connect their medical records to get health advice, failed to recommend hospital visits in over half of cases where they were medically necessary and often missed signs of suicidal ideation (thoughts of suicide). Experts worry this could cause serious harm or death, since over 40 million people ask ChatGPT for health advice daily.
Trace, a new startup, raised $3 million to help companies deploy AI agents more effectively by providing them with proper context about the company's existing tools and workflows. The company builds a knowledge graph (a structured map of how data and systems connect) from a company's email, Slack, and other tools, then uses this context to automatically create step-by-step workflows that assign tasks to both AI agents and human workers. This approach aims to solve a major barrier to enterprise AI adoption, which is the difficulty of setting up and integrating AI agents into complex business environments.
Figma is integrating OpenAI's Codex, an AI coding tool, to let users create and edit designs while working in their coding environments. The integration uses Figma's MCP (Model Context Protocol, a standardized way for AI models to access external tools and data) server to let users move easily between design files and code, allowing both engineers and designers to work more collaboratively without switching between separate applications.
Anthropic discovered and fixed security vulnerabilities in Claude (an AI assistant) that could allow attackers to silently compromise developer computers through specially crafted configuration files. Security researchers at Check Point showed how these flaws could be exploited in real-world attacks.
Anthropic refused a Pentagon demand to remove safety precautions (safeguards built into AI systems to prevent harmful outputs) from its Claude AI model and allow unrestricted military use, despite threats to cancel a $200 million contract and damage the company's reputation. The Department of Defense demanded compliance by Friday or would label Anthropic a 'supply chain risk,' a designation that could harm the company financially.
Burger King is testing AI-powered headsets called BK Assistant at 500 US restaurants that monitor employee interactions and calculate 'friendliness scores' based on words like 'please' and 'thank you' during drive-thru conversations. The system, powered by OpenAI, also helps staff by answering questions about menu preparation and restocking through an embedded chatbot named 'Patty'. The rollout has drawn criticism online for its surveillance capabilities, with concerns raised about accuracy given AI systems' known tendency to make errors.
Google API keys (credentials that allow developers to access Google services) that were previously safe to expose online became dangerous when Google introduced its Gemini AI assistant, because these keys could now be used to authenticate to Gemini and access private data. Researchers found nearly 3,000 exposed API keys on public websites, and attackers could use them to make expensive API calls and drain victim accounts by thousands of dollars per day.
Fix: Google has implemented the following measures: (1) new AI Studio keys will default to Gemini-only scope, (2) leaked API keys will be blocked from accessing Gemini, and (3) proactive notifications will be sent when leaks are detected. Additionally, developers should check whether Generative Language API is enabled on their projects, audit all API keys to find publicly exposed ones, and rotate them immediately. The source also recommends using TruffleHog (an open-source tool that detects live, exposed keys in code and repositories) to scan for exposed keys.
BleepingComputerAI agents (software that can independently access your accounts and take actions) have caused problems by deleting emails, writing harmful content, and launching attacks. Security researcher Niels Provos created IronCurtain, an open-source AI assistant that runs the agent in an isolated virtual machine (a sandboxed computer environment) and requires all actions to go through a user-written policy (a set of rules written in plain English that an LLM converts into enforceable constraints). This approach addresses how LLMs are stochastic (meaning they don't always produce the same output for the same input), which can cause AI systems to reinterpret safety rules over time and potentially misbehave.
Fix: IronCurtain implements access control by running the AI agent in an isolated virtual machine and requiring all actions to be mediated through a user-written policy. Users write straightforward statements in plain English (such as 'The agent may read all my email. It may send email to people in my contacts without asking. For anyone else, ask me first. Never delete anything permanently.'), and IronCurtain converts these into enforceable security policies using an LLM. The system maintains an audit log of all policy decisions, is designed to refine the policy over time as it encounters edge cases, and is model-independent so it can work with any LLM.
Wired (Security)Threat modeling is a structured process for identifying and preparing for security risks early in system design, but AI systems require adapted approaches because they behave unpredictably in ways traditional software does not. AI systems are probabilistic (producing different outputs from the same input), treat text as executable instructions rather than just data, and can amplify failures across connected tools and workflows, creating new attack surfaces like prompt injection (tricking an AI by hiding instructions in its input) and silent data theft that traditional threat models don't address.
Attackers are breaking into systems and moving through networks much faster than before, with some reaching data theft in just 4-6 minutes compared to 29 minutes on average in 2025. They're achieving this speed by reusing stolen login credentials (legitimate credentials), using AI tools to automate attacks, and avoiding malware detection by relying on normal system administration tools instead. The bulletin also describes specific threats like ResidentBat (Android spyware targeting journalists), phishing attacks impersonating cryptocurrency services, and Kali Linux now integrating Claude (an AI system) to execute hacking commands.