LiteLLM Compromised: A Developer's Guide to Incident Response, Alternatives, and LLM Gateway Migration
LiteLLM versions 1.82.7 and 1.82.8 were backdoored with credential-stealing malware. If you run a self-hosted Python LLM proxy, here's the incident response playbook and the migration path that reduces your attack surface.
On March 24, 2026, a threat actor named TeamPCP published two malicious versions of LiteLLM to PyPI. Versions 1.82.7 and 1.82.8 contained a three-stage payload that harvested SSH keys, cloud credentials, Kubernetes secrets, and cryptocurrency wallets from every machine where the package existed. PyPI quarantined the entire LiteLLM package within 46 minutes.
Here’s the part that should keep you up at night. LiteLLM is an API key management gateway. By design, it has access to every LLM provider key in your organization. The attacker picked the one package that, once compromised, gives up everything.
This wasn’t an isolated event. It was the third strike in a five-day supply chain campaign. Aqua Security’s Trivy scanner got hit on March 19 (GHSA-69fq-xp46-6x23). Checkmarx’s KICS GitHub Actions followed on March 23 (kics-github-action#152, Checkmarx Update). LiteLLM was the final domino on March 24 (litellm#24512, LiteLLM Update).
The chain of compromise was straightforward. As confirmed in LiteLLM’s official security update, the project’s CI/CD pipeline ran Trivy without a pinned version. The compromised Trivy action exfiltrated the PYPI_PUBLISH token from the GitHub Actions runner. TeamPCP used that token to publish malicious packages directly to PyPI.
How Versions 1.82.7 and 1.82.8 Were Weaponized
Version 1.82.7 embedded the payload in proxy/proxy_server.py. It activated on import. Version 1.82.8 was nastier: it shipped a .pth file called litellm_init.pth that executed on every Python process startup, not just when LiteLLM was imported. Python’s site module processes all .pth files in site-packages during interpreter initialization, as documented in the GitHub issue.
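To see why no import is needed, here is a harmless demonstration of the same mechanism. site.addpackage() is the routine Python runs for every .pth file during startup, and any line beginning with import is executed as code rather than treated as a path entry. This is a minimal sketch; the file and variable names are mine, not from the attack.

```python
# site.addpackage() is what the interpreter runs for each .pth file in
# site-packages at startup. Any line starting with "import " is exec'd.
import os
import site
import tempfile

site_dir = tempfile.mkdtemp()
pth_path = os.path.join(site_dir, "demo_init.pth")

# A benign payload: the whole line is executed as Python code.
with open(pth_path, "w") as f:
    f.write("import os; os.environ['PTH_DEMO_RAN'] = '1'\n")

site.addpackage(site_dir, "demo_init.pth", None)
print(os.environ.get("PTH_DEMO_RAN"))  # prints "1": the .pth line ran as code
```

Swap the benign line for a base64-wrapped loader and you have exactly the 1.82.8 behavior: code that runs in every interpreter session without a single import statement in your application.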
The payload used double base64 encoding:
```python
import os, subprocess, sys

subprocess.Popen([
    sys.executable, "-c",
    "import base64; exec(base64.b64decode('...'))"
])
```

Once executed, it ran three stages. Stage 1 harvested credentials: SSH keys, AWS/GCP/Azure tokens, environment variables, .env files, Kubernetes configs, Docker configs, database credentials, shell history, browser cookies, and cryptocurrency wallets. Stage 2 deployed privileged Alpine pods into the kube-system namespace on every reachable Kubernetes node, accessing cluster secrets and service account tokens. Stage 3 installed sysmon.py as a systemd service that polled checkmarx[.]zone/raw for additional binaries, giving the attacker persistent access even after discovery.
All harvested data was encrypted and exfiltrated via a POST request to models.litellm[.]cloud, a lookalike domain controlled by TeamPCP.
The Blast Radius Is Bigger Than You Think
The .pth mechanism means the malware fired on every Python process on any machine where LiteLLM 1.82.8 was installed. You didn’t need to run import litellm. You didn’t need to start the proxy. You didn’t need to do anything at all. Python’s interpreter loads .pth files automatically during startup.
A data scientist running Jupyter, a DevOps engineer running Ansible, a backend dev testing a Flask endpoint: all compromised if the package existed anywhere in their Python environment. The malware ran silently in the background of every Python process on the host.
You also didn’t need to install it yourself. If another package in your dependency tree pulled it in, the malware still executed. LiteLLM sits in 36% of cloud environments, often as a transitive dependency that agent frameworks pull in without developers ever auditing. As reported in GitHub issue #24512, the researcher who discovered this attack found it because their Cursor IDE pulled LiteLLM through an MCP plugin without explicit installation.
How to Check If You’re Affected
Run these commands across local machines, CI/CD runners, Docker images, staging, and production:
```bash
pip show litellm | grep Version
pip cache list litellm
find / -name "litellm_init.pth" 2>/dev/null
```

Then scan your egress logs. Any traffic to models.litellm[.]cloud or checkmarx[.]zone is a confirmed breach:
```bash
# CloudWatch Logs Insights
fields @timestamp, @message
| filter @message like /models\.litellm\.cloud|checkmarx\.zone/

# Nginx
grep -E "models\.litellm\.cloud|checkmarx\.zone" /var/log/nginx/access.log
```

Check your transitive dependencies too:
```bash
pip show litellm  # check the "Required-by" field
```

If other packages list LiteLLM as a dependency, it entered your environment without your explicit consent.
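The same "Required-by" check can be run programmatically across an environment. A minimal sketch using only the standard library (the function name is mine; a crude prefix match on requirement strings is enough for triage, though it may also flag packages whose names merely start with "litellm"):

```python
# Sketch: list installed distributions that declare litellm as a dependency.
from importlib import metadata

def find_dependents(target: str = "litellm") -> list[str]:
    dependents = []
    for dist in metadata.distributions():
        for req in dist.requires or []:
            # Requirement strings look like "litellm>=1.0" or "litellm[proxy]".
            if req.lower().startswith(target):
                dependents.append(dist.metadata["Name"])
    return sorted(set(dependents))

print(find_dependents() or "no installed package declares litellm")
```

Run this in every virtualenv, not just the system interpreter; the malicious package only needed to exist in one of them.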
Incident Response Playbook
Step 1: Isolate immediately. Stop all running LiteLLM containers and scale down Kubernetes deployments:
```bash
docker ps | grep litellm | awk '{print $1}' | xargs docker kill
kubectl scale deployment litellm-proxy --replicas=0 -n your-namespace
```

Step 2: Rotate every credential. The malware harvested everything it could find. Treat every secret that existed on an affected machine as known to the attacker. That means cloud provider tokens (AWS, GCP, Azure), all SSH keys in ~/.ssh/, database connection strings and passwords, every LLM provider API key (OpenAI, Anthropic, Gemini), Kubernetes service accounts and CI/CD tokens, and cryptocurrency wallets if wallet files were on the machine. For cryptocurrency, move funds to fresh wallets immediately.
Step 3: Audit Kubernetes and remove all artifacts. The malware deployed privileged pods and installed a persistent backdoor, so check for both:
```bash
# Check for lateral movement
kubectl get pods -n kube-system | grep -i "node-setup"
find / -name "sysmon.py" 2>/dev/null

# Full removal
pip uninstall litellm -y && pip cache purge
rm -rf ~/.cache/uv
find $(python -c "import site; print(site.getsitepackages()[0])") \
  -name "litellm_init.pth" -delete
rm -rf ~/.config/sysmon/ ~/.config/systemd/user/sysmon.service
docker build --no-cache -t your-image:clean .
```

Do not downgrade to an older version. Remove the package entirely and replace it.
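After the removal steps, it is worth verifying the result programmatically rather than trusting that each command succeeded. A minimal sketch using only the standard library (the function name is mine):

```python
# Sketch: post-cleanup verification. Confirms litellm is no longer importable
# and that no litellm_init.pth remains in any site-packages directory.
import importlib.util
import os
import site

def environment_is_clean() -> tuple[bool, list[str]]:
    problems = []
    if importlib.util.find_spec("litellm") is not None:
        problems.append("litellm is still importable")
    for sp in site.getsitepackages() + [site.getusersitepackages()]:
        candidate = os.path.join(sp, "litellm_init.pth")
        if os.path.exists(candidate):
            problems.append(f"malicious .pth still present: {candidate}")
    return (not problems, problems)

clean, problems = environment_is_clean()
print("clean" if clean else problems)
```

Run it under every interpreter on the host, since each Python installation has its own site-packages and its own .pth files.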
Why Self-Hosted Python LLM Proxies Are a Structural Risk
LiteLLM’s Python proxy inherits hundreds of transitive dependencies spanning ML frameworks, data processing libraries, and provider SDKs. Every dependency is a trust decision most teams make automatically with pip install --upgrade. When you add LiteLLM, you’re not just trusting LiteLLM. You’re trusting every package it depends on, every package those packages depend on, and every maintainer account tied to each one.
The .pth attack vector is especially dangerous because most supply chain scanning tools focus on setup.py, __init__.py, and entry points. The .pth mechanism is a legitimate Python feature for path configuration that has been largely overlooked as an injection vector. Expect this technique in future attacks. Traditional security scanning would not have caught it.
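Until scanners catch up, you can audit this vector yourself. A minimal sketch (function name mine) that lists every .pth line the interpreter would execute as code at startup; note that some legitimate tooling, such as editable installs, uses this mechanism too, so review hits rather than deleting blindly:

```python
# Sketch: flag .pth lines in site-packages that run as code at startup.
# Python's site module exec's any .pth line beginning with "import".
import os
import site

def executable_pth_lines() -> list[str]:
    findings = []
    for sp in site.getsitepackages() + [site.getusersitepackages()]:
        if not os.path.isdir(sp):
            continue
        for name in sorted(os.listdir(sp)):
            if not name.endswith(".pth"):
                continue
            path = os.path.join(sp, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                for lineno, line in enumerate(f, 1):
                    if line.startswith(("import ", "import\t")):
                        findings.append(f"{path}:{lineno}: {line.strip()}")
    return findings

for hit in executable_pth_lines():
    print(hit)
```

Dropping this into a CI step gives you a diffable baseline: any new executable .pth line appearing between builds is worth an immediate look.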
The LiteLLM maintainers did not rotate their CI/CD credentials for five days after the Trivy disclosure on March 19. If the maintainers couldn’t respond fast enough, most teams running their software had no chance. This is the inherent problem with the self-hosted model: you own the blast radius.
Migration Path: From LiteLLM to Prism
Prism is Future AGI’s managed AI Gateway. It does what LiteLLM did (route requests to 100+ LLM providers through a single API) but without requiring you to install a Python package or run anything in your infrastructure. You send requests using the standard OpenAI API format. Prism handles provider translation, failover, caching, guardrails, cost tracking, and streaming on its end.
Your attack surface shrinks from a Python environment with hundreds of transitive dependencies to an API key and a URL.
The migration is a single config change. Here’s what it looks like:
Before (LiteLLM):
```python
from litellm import completion

response = completion(
    model="gpt-5",
    messages=[{"role": "user", "content": "Hello"}],
)
```

After (Prism):
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.futureagi.com",
    api_key="sk-prism-your-key",
)
response = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "Hello"}],
)
```

Same OpenAI SDK format, same model naming, same response schema. TypeScript works the same way:
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://gateway.futureagi.com",
  apiKey: "sk-prism-your-key",
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello" }],
});
```

Provider keys are configured once in the Prism dashboard. No environment variables scattered across your codebase or sitting in .env files on developer machines.
For teams running LiteLLM as a Kubernetes deployment, update your environment variables and delete the proxy infrastructure:
```yaml
env:
  - name: LLM_BASE_URL
    value: "https://gateway.futureagi.com"  # was http://litellm-proxy:4000
  - name: LLM_API_KEY
    value: "sk-prism-your-key"
```

Delete the LiteLLM pod, its service, Postgres, and Redis. That’s infrastructure you no longer maintain, patch, or worry about.
Prism also handles semantic caching (matching queries that mean the same thing but are worded differently) and applies 18+ built-in guardrails including PII detection and prompt injection prevention at the gateway layer. Cached responses return with X-Prism-Cost: 0. You can read the full docs.
```python
from prism import Prism, GatewayConfig, CacheConfig

client = Prism(
    api_key="sk-prism-your-key",
    base_url="https://gateway.futureagi.com",
    config=GatewayConfig(
        cache=CacheConfig(enabled=True, mode="semantic", ttl="5m", namespace="prod"),
    ),
)
```

What This Changes Going Forward
The EU Cyber Resilience Act makes organizations legally responsible for the security of open-source components in their products. SOC 2 Type II audits now scrutinize dependency management. “We install the latest version from PyPI” is no longer an acceptable answer during a controls review. If your product uses LiteLLM and your customers’ credentials were exfiltrated, the liability falls on you, not the open-source maintainer.
Dependency pinning doesn’t fully solve this either. Pinning prevents pulling a new malicious version but not a compromised maintainer overwriting an existing tag. Hash verification (pip install --hash=sha256:<exact_hash>) is the real control. A managed gateway eliminates the need for pinning entirely because there’s no dependency to pin.
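For illustration, the check pip performs under --require-hashes boils down to comparing an artifact's SHA-256 digest against a pinned value before trusting it. A minimal sketch of that control (file names are hypothetical):

```python
# Sketch: verify a downloaded artifact against a pinned SHA-256 digest,
# the same comparison pip makes when installing with --require-hashes.
import hashlib

def sha256_matches(path: str, expected_hex: str) -> bool:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Stream in chunks so large wheels don't need to fit in memory.
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest() == expected_hex

# Usage against a known payload:
with open("artifact.whl", "wb") as f:
    f.write(b"example wheel contents")
pinned = hashlib.sha256(b"example wheel contents").hexdigest()
print(sha256_matches("artifact.whl", pinned))   # True
print(sha256_matches("artifact.whl", "0" * 64))  # False
```

A maintainer who overwrites a pinned version cannot preserve the digest, so the install fails loudly instead of executing the swapped-in code.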
Every team running LLM applications now faces a straightforward choice: own your proxy infrastructure and inherit every supply chain risk, or use a managed gateway and reduce your trust boundary to an API endpoint. After March 24, 2026, the risk calculus has permanently shifted.
The question isn’t whether your open-source LLM gateway will be targeted. It’s whether your architecture limits the damage when it happens. Rotating credentials fixes today’s breach. Moving to a managed gateway fixes the category of problem.
Get started with Prism | Request a demo | Explore Future AGI
Frequently Asked Questions
Q: Is it safe to install any version of LiteLLM from PyPI right now? As of March 25, 2026, the entire LiteLLM package remains quarantined on PyPI, meaning no version is available for download. Teams that need LLM gateway functionality should evaluate alternatives like Future AGI’s Prism or vendor a known-safe version internally.
Q: Can dependency pinning alone prevent a supply chain attack like this? Pinning protects against pulling a new malicious version but doesn’t protect against a compromised maintainer overwriting an existing tag. Hash verification with pip install --hash=sha256:<exact_hash> is the stronger control, though most teams skip it because the tooling is inconvenient.
Q: How does a managed LLM gateway differ from a self-hosted proxy for credential security? A managed gateway like Prism stores provider API keys in its own infrastructure rather than in your Python environment or .env files. A compromised machine in your environment can’t exfiltrate LLM provider credentials because those credentials never touch your infrastructure.
Q: Were teams using LiteLLM’s official Docker image affected? Teams running the official LiteLLM Proxy Docker image were not directly impacted because that deployment path pins dependencies in requirements.txt. However, any team that built custom Docker images with pip install litellm without a version pin during the attack window should treat their environment as compromised.
Q: What Kubernetes indicators should I check after the incident? Look for unauthorized pods in kube-system matching node-setup-*, systemd services named sysmon.service, and privileged Alpine containers you didn’t deploy. The malware used Kubernetes lateral movement to spread across nodes, so audit cluster secrets and service account tokens for unauthorized access.
Q: Does migrating to Prism require rewriting application code? No. Prism uses the standard OpenAI SDK format, so migration is a single config change: update your base URL and API key. Your existing code, model names, and response handling all stay exactly the same.