All tracked items across vulnerabilities, news, research, incidents, and regulatory updates.
Anthropic accused three Chinese AI companies (DeepSeek, Moonshot AI, and MiniMax) of using distillation (a technique where one AI model learns from another by analyzing its outputs) to illegally extract capabilities from Claude by creating over 24,000 fake accounts and generating millions of interactions. This theft targeted Claude's most advanced features like reasoning, tool use, and coding, and raises security concerns because stolen models may lack safeguards against misuse like bioweapon development.
Fix: Anthropic stated it will 'continue to invest in defenses that make distillation attacks harder to execute and easier to identify,' and is calling on 'a coordinated response across the AI industry, cloud providers, and policymakers.' The company also argues that export controls on advanced AI chips to China would limit both direct model training and the scale of such distillation attacks.
TechCrunchIBM's stock fell 11% after Anthropic announced that its Claude AI model can now automate COBOL (a decades-old programming language used in banking and business systems) modernization work, which is a core part of IBM's business. Claude can map dependencies, document workflows, and identify risks in old code much faster than human analysts, potentially making IBM's COBOL-related services less valuable.
A Russian-speaking hacker used generative AI (software that creates text and code) to break into over 600 FortiGate firewalls, which are security devices that protect networks. The attacker stole login credentials and backup files, likely to prepare for ransomware attacks (malware that locks up data until victims pay money).
Michael Gerstenhaber, a Google Cloud VP overseeing Vertex (a platform for deploying enterprise AI), describes how AI models are advancing along three distinct frontiers: raw intelligence (accuracy and capability), response time (latency, or how quickly the model answers), and cost-efficiency (whether a model can run reliably at massive, unpredictable scale). Different use cases prioritize these frontiers differently—for example, code generation prioritizes intelligence even if it takes time, customer support prioritizes speed within a latency budget, and large-scale content moderation prioritizes cost-effectiveness at infinite scale.
This article discusses concerns about AI posing a threat to cybersecurity companies, which has caused their stock prices to decline. However, the piece argues against abandoning investments in these companies despite these concerns.
OpenAI has announced the 'Frontier Alliance,' a partnership with four major consulting firms (Boston Consulting Group, McKinsey, Accenture, and Capgemini) to help enterprises adopt its AI technologies, particularly OpenAI Frontier, a no-code platform for building and deploying AI agents. The partnership aims to address slow enterprise adoption of AI by helping consultants redesign company strategies and workflows to integrate OpenAI's tools rather than simply adding AI to existing processes.
Cybersecurity stock prices fell sharply after Anthropic announced a new AI tool for its Claude model that can scan software code for vulnerabilities and suggest fixes, causing investors to worry that AI might replace traditional cybersecurity services. However, some analysts argue the threat is limited, noting that while AI could improve efficiency in specific tasks like code scanning, it cannot yet replace full end-to-end security platforms (complete systems that handle all stages of protecting against attacks).
Anthropic's CEO is meeting with the U.S. Defense Secretary to resolve disagreements over how the military can use the company's AI models (large language models trained to understand and generate text). Anthropic wants guarantees its technology won't be used for autonomous weapons (systems that make decisions without human control) or domestic surveillance, while the Department of Defense wants permission to use the models for any lawful purpose without restrictions.
The U.S. Defense Secretary is meeting with Anthropic's CEO to pressure the company into allowing military use of Claude (Anthropic's AI system) for mass surveillance and autonomous weapons (weapons that can fire without human approval). Anthropic has refused these uses, and the Pentagon is threatening to label it a "supply chain risk" (a designation that would ban it from government contracts), which could void their $200 million military contract and force other Pentagon partners to stop using Claude.
OpenAI announced partnerships with four major consulting firms (Accenture, Boston Consulting Group, Capgemini, and McKinsey) to help deploy its enterprise AI platform called Frontier, which acts as an intelligence layer that connects different systems and data within organizations to help companies manage and build AI agents (tools that can independently complete tasks). These consulting partnerships aim to accelerate AI adoption for enterprise customers by combining OpenAI's technology with the consulting firms' existing relationships and deep knowledge of how businesses operate.
This newsletter covers multiple business and policy topics, including the Supreme Court striking down Trump's tariffs (duties, or taxes on imported goods) in a 6-3 decision, followed by Trump announcing a new 15% global tariff the next day. A major winter blizzard caused airlines to cancel 15% of U.S. flights on Monday, and Trump called on Netflix to fire board member Susan Rice.
Attackers are using autonomous AI agents (AI systems that can independently perform tasks without constant human direction) in supply chain attacks (compromises targeting the software or services that other programs depend on) to steal cryptocurrency from wallets. While this current campaign focuses on crypto theft, security researchers warn the technique could be adapted for much broader attacks.
Guide Labs has open-sourced Steerling-8B, an 8 billion parameter LLM designed to be interpretable, meaning its decisions can be traced back to its training data and understood rather than treated as a black box. The model uses a new architecture with a concept layer that buckets data into traceable categories, allowing developers to understand why the model produces specific outputs and control its behavior for applications like blocking copyrighted content or preventing bias in loan evaluations.
A software engineer is creating a collection of documented patterns for agentic engineering, which refers to using coding agents (AI tools that can generate, execute, and iterate on code independently) to help professional developers work faster and better. The project will be published as a series of chapters on a blog, inspired by classic design pattern documentation, with the first two chapters covering how cheap code generation changes software development and how test-first development (TDD) helps agents write better code.
Instagram's leader Adam Mosseri warned that AI can now convincingly fake almost any content, making it hard for creators to stand out with authentic material. He proposed solving this by having camera manufacturers cryptographically sign images (using math-based codes that prove an image wasn't altered) at the moment they're captured, creating a verifiable record of what's real versus AI-generated.
Fix: Camera manufacturers will cryptographically sign images at capture, creating a chain of custody to establish a trustworthy system for determining what's not AI.
The Verge (AI)Citrini Research published a scenario describing how AI agents (autonomous AI systems that can make decisions and take actions independently) could trigger economic collapse by replacing white-collar workers with cheaper AI alternatives, creating a negative feedback loop where job losses reduce consumer spending, forcing companies to invest even more in AI to survive. The scenario imagines unemployment doubling and stock market value falling by a third within two years, though the researchers present it as a thought experiment rather than a prediction.
This research proposes a new privacy-preserving method for key generation in ABE (attribute-based encryption, a system that lets users control access to data based on their personal attributes). The method follows a principle called Minimal Disclosure, where users only reveal the specific attributes they need to prove, rather than exposing all their attributes. The protocol separates attribute verification from key generation into two steps, uses batch verification to improve performance, and introduces metrics to measure how well it resists attacks that try to infer hidden user attributes.
Researchers developed PPOM-Attack, a method to fool face recognition (FR) systems by generating adversarial images (slightly altered photos that trick AI into misidentifying someone). Unlike earlier attacks that used substitute models (simpler AI systems trained to mimic the target system), PPOM-Attack directly queries the real face recognition system to learn how to create effective perturbations (tiny pixel changes), achieving 21.7% higher success rates while keeping the altered images looking natural.
Prompt injection attacks (tricking an AI by hiding malicious instructions in its input) pose a serious security risk to Large Language Models, as attackers can overwrite a model's original instructions to manipulate its responses. Researchers developed PromptFuzz, a testing framework that uses fuzzing techniques (automatically generating many variations of input data to find weaknesses) to systematically evaluate how well LLMs resist these attacks. Testing showed that PromptFuzz was highly effective at finding vulnerabilities, ranking in the top 0.14% of attackers in a real competition and successfully exploiting 92% of popular LLM-integrated applications tested.
This paper presents SIMix, a training framework for systems where multiple users learn AI models together over wireless networks while protecting their private data. The system uses Over-the-Air Mixup (OAM, a technique that combines data from multiple users through wireless transmission to hide sensitive information) and groups users strategically to reduce communication needs by up to 25% while defending against model inversion attacks (attempts to reconstruct private training data from a trained model) and label inference attacks (guessing what category a user's data belongs to).
Fix: The paper proposes integrating Over-the-Air Mixup with label-aware user grouping, including a closed-form Tx-Rx scaling optimization that minimizes mean square error under channel noise, and an extended max-clique algorithm that dynamically partitions users into groups with minimal intra-label similarity to reduce model inversion attack success rates.
IEEE Xplore (Security & AI Journals)