CVE-2024-5206: A sensitive data leakage vulnerability was identified in scikit-learn's TfidfVectorizer, specifically in versions up to
Summary
A vulnerability in scikit-learn's TfidfVectorizer (a tool that converts text into numerical features for machine learning) caused a fitted vectorizer to store every token seen in the training data in its `stop_words_` attribute, rather than only the subset needed for TF-IDF to work. Sensitive tokens that appeared in the training text, such as passwords or keys, could therefore leak through the stored object. Versions up to and including 1.4.1.post1 are affected; the actual risk depends on the kind of data used for training.
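The mechanism can be illustrated with a toy model of document-frequency pruning. This is a minimal sketch and not scikit-learn's actual code: the function `fit_vocabulary`, the `min_df` threshold of 2, and the sample documents are all hypothetical, chosen only to show how a rare token ends up in the retained "discarded" set.

```python
from collections import Counter


def fit_vocabulary(documents, min_df=2):
    """Toy model of document-frequency pruning (NOT scikit-learn's code).

    Returns the kept vocabulary and the set of pruned tokens.
    """
    doc_freq = Counter()
    for doc in documents:
        # Count each token once per document (document frequency).
        doc_freq.update(set(doc.lower().split()))

    vocabulary = {tok for tok, df in doc_freq.items() if df >= min_df}
    # Assumption for illustration: every pruned token is retained,
    # which mirrors the behaviour the CVE describes for `stop_words_`.
    discarded = set(doc_freq) - vocabulary
    return vocabulary, discarded


# Hypothetical training data containing one secret-looking token.
docs = [
    "reset the password for alice",
    "reset the password for bob",
    "temporary password hunter2-XYZ issued",
]
vocabulary, discarded = fit_vocabulary(docs)

print(sorted(vocabulary))
print("hunter2-xyz" in discarded)
```

The rare token never enters the vocabulary used for vectorizing, yet it survives in the retained set, so anyone who can read the fitted object can recover it.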
Solution / Mitigation
Fixed in scikit-learn 1.5.0; upgrade to that version or later.
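Vectorizers fitted and saved with an affected version may still carry the attribute after an upgrade, so retraining them or clearing `stop_words_` before re-saving may be worth considering. The following is a minimal sketch of such a clean-up, under stated assumptions: `clear_stop_words` is a hypothetical helper, not part of scikit-learn, and it relies only on duck typing (an instance attribute named `stop_words_`, and pipeline-like objects exposing `steps` as a list of `(name, estimator)` pairs). `FittedStandIn` is a stand-in class used so that the example runs without scikit-learn installed.

```python
def clear_stop_words(estimator):
    """Delete a stored `stop_words_` attribute, descending into pipeline steps.

    Hypothetical helper. Returns the number of objects that were scrubbed.
    """
    cleared = 0
    # Pipeline-like objects expose `steps` as (name, estimator) pairs.
    for _name, step in getattr(estimator, "steps", None) or []:
        cleared += clear_stop_words(step)
    # Only touch the attribute if it is set on the instance itself.
    if "stop_words_" in getattr(estimator, "__dict__", {}):
        delattr(estimator, "stop_words_")
        cleared += 1
    return cleared


class FittedStandIn:
    """Stand-in for a vectorizer fitted with an affected version."""

    def __init__(self):
        self.stop_words_ = {"hunter2-xyz"}


model = FittedStandIn()
n_cleared = clear_stop_words(model)
print(n_cleared, hasattr(model, "stop_words_"))
```

With a real fitted vectorizer or pipeline, the same call would be made on the loaded object before it is pickled again.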
Vulnerability Details
4.7 (medium)
EPSS: 0.0%
Classification
Affected Vendors
Related Issues
CVE-2024-37052: Deserialization of untrusted data can occur in versions of the MLflow platform running version 1.1.0 or newer, enabling
CVE-2025-45150: Insecure permissions in LangChain-ChatGLM-Webui commit ef829 allows attackers to arbitrarily view and download sensitive
Original source: https://nvd.nist.gov/vuln/detail/CVE-2024-5206
First tracked: February 15, 2026 at 08:42 PM
Classified by LLM (prompt v3) · confidence: 92%