Academic papers, new techniques, benchmarks, and theoretical findings in AI/LLM security.
AI agents (automated systems that can take actions based on AI decisions) are easy to build with modern tools, but they introduce a distinct set of security threats. The OWASP Gen AI Security Project held a hackathon in New York where participants intentionally built insecure agents to surface the most common security problems.
As AI systems start connecting to real tools and databases through the Model Context Protocol (MCP, a system that lets AI models interact with external applications and data), new security risks appear that older security methods cannot fully handle. The OWASP GenAI Security Project has released research on how to secure MCP, offering defense-in-depth strategies (a layered security approach using multiple protective measures) to help developers build safer AI applications that can act independently in real time.
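As a rough illustration of what one defense-in-depth layer can look like, the sketch below gates agent tool calls behind an allowlist, per-argument validation, and an audit log before anything reaches a real tool. The tool names, validation rules, and gate_tool_call helper are hypothetical illustrations, not part of the MCP specification, any MCP SDK, or OWASP's published guidance.

```python
import logging
import re

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("mcp-gate")

# Layer 1: an explicit allowlist of tools the agent may invoke.
ALLOWED_TOOLS = {"read_file", "search_docs"}

# Layer 2: per-tool argument validators (simple regex rules here).
ARG_RULES = {
    # Reject path traversal ("..") and shell metacharacters outright.
    "read_file": {"path": re.compile(r"^(?!.*\.\.)[\w./-]+$")},
    # Bound query length to limit injected payload size.
    "search_docs": {"query": re.compile(r"^.{1,200}$")},
}

def gate_tool_call(tool: str, args: dict) -> dict:
    """Validate a tool call before dispatch; raise on any violation."""
    if tool not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {tool!r} is not allowlisted")
    rules = ARG_RULES.get(tool, {})
    for name, value in args.items():
        rule = rules.get(name)
        if rule is None or not rule.fullmatch(str(value)):
            raise ValueError(f"argument {name}={value!r} rejected for {tool}")
    log.info("dispatching %s(%s)", tool, args)  # Layer 3: audit trail
    return {"tool": tool, "args": args}  # hand off to the real dispatcher here

# A benign call passes; a traversal attempt raises before the tool runs.
gate_tool_call("read_file", {"path": "notes/readme.md"})
# gate_tool_call("read_file", {"path": "../../etc/passwd"})  # -> ValueError
```

The point of the layering is that each check fails independently: even if the allowlist is misconfigured, malformed arguments are still rejected, and every call that does go through leaves an audit record.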
MITRE has released version 4.9.0 of its ATLAS framework, which documents attack techniques and defenses specific to AI systems. The update adds new attack methods such as reverse shells (unauthorized remote command access to a system), model corruption, and supply chain attacks targeting AI tools, while also updating existing security techniques and adding real-world case studies of AI-related security breaches.
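For teams that want to track additions like these programmatically, ATLAS publishes its knowledge base as machine-readable YAML (dist/ATLAS.yaml in the mitre-atlas/atlas-data repository on GitHub). The short sketch below lists technique IDs and names from a local copy; the field names used (matrices, techniques, id, name) match recent releases but should be treated as assumptions, since the schema can change between versions.

```python
import yaml  # pip install pyyaml

# Load a local copy of the ATLAS knowledge base.
with open("ATLAS.yaml", encoding="utf-8") as f:
    atlas = yaml.safe_load(f)

# Walk every matrix and print its technique IDs and names.
for matrix in atlas.get("matrices", []):
    for tech in matrix.get("techniques", []):
        print(f"{tech.get('id', '?'):<14} {tech.get('name', '')}")
```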
Researchers created the Virology Capabilities Test (VCT), a benchmark measuring how well AI systems can solve complex virology lab problems, and found that leading AI models like OpenAI's o3 now outperform human experts in specialized virology knowledge. This is concerning because virology knowledge has dual-use potential, meaning the same capabilities that could help prevent disease could also be misused by bad actors to develop dangerous pathogens.
Fix: The authors recommend that highly dual-use virology capabilities be excluded from publicly available AI systems; know-your-customer mechanisms (verification processes that confirm who customers are and what they will use the technology for) could keep these capabilities accessible only to researchers at institutions with appropriate safety protocols. In response to the paper, xAI has added new safeguards to its systems.
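As a loose sketch of how such a know-your-customer gate might sit in front of a dual-use capability, the snippet below checks verification status before routing a request. The customer fields, approval flags, and run_model call are all hypothetical illustrations, not the paper's proposal or any vendor's actual mechanism.

```python
from dataclasses import dataclass

@dataclass
class Customer:
    org: str
    identity_verified: bool   # confirmed by the KYC process
    biosafety_approved: bool  # institution has appropriate safety protocols

def may_access_virology_capability(c: Customer) -> bool:
    """Gate: only verified customers at approved institutions qualify."""
    return c.identity_verified and c.biosafety_approved

def handle_request(c: Customer, prompt: str) -> str:
    if not may_access_virology_capability(c):
        return "This capability is restricted pending customer verification."
    return run_model(prompt)  # stand-in for the real model endpoint

def run_model(prompt: str) -> str:
    return f"[model output for {prompt!r}]"

lab = Customer(org="Example University BSL-3 Lab",
               identity_verified=True, biosafety_approved=True)
print(handle_request(lab, "troubleshoot my plaque assay"))
```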
The OWASP Generative AI Security Project, an organization focused on application security, announced nine new corporate sponsors to support its efforts to improve security for generative AI technologies. The sponsors, including companies like ByteDance and Trend Micro, represent increased investment and momentum in making AI systems more secure.
OWASP (the Open Worldwide Application Security Project, a nonprofit that helps organizations secure their software) has renamed and promoted its OWASP Top 10 for LLM (large language model, an AI trained on massive amounts of text data) project to the OWASP Gen AI Security Project. The change expands its focus from listing AI vulnerabilities to providing broader guidance on governance, risk management, and compliance for generative AI systems. The project now includes over 600 experts from 18 countries and has published new resources such as the Agentic AI Threats and Mitigations Guide (addressing security risks in autonomous AI systems), along with translations into six additional languages.