---
title: Protect Against SSRF Attacks
impact: CRITICAL
impactDescription: prevents attackers from making requests to internal infrastructure
tags: security, ssrf, networking, python, pyspark
---

## Protect Against SSRF Attacks

Server-Side Request Forgery (SSRF) lets an attacker make the server send requests on their behalf to internal services, cloud metadata endpoints (such as the AWS or GCP instance metadata service), or other targets that are not reachable from outside the network.

**Incorrect (unvalidated URL/target):**

```python
import requests
from flask import request  # assumes a Flask request context

# Unsafe: user controls the URL directly
data_url = request.args.get('url')
response = requests.get(data_url)
# Attacker could pass: http://169.254.169.254/latest/meta-data/
```

**Correct (validation and allow-listing):**

```python
import requests
from urllib.parse import urlparse

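ALLOWED_DOMAINS = {'trusted-source.com', 'api.company.com'}

def safe_fetch(url):
    parsed = urlparse(url)
    if parsed.scheme not in ('http', 'https'):
        raise ValueError("Invalid scheme")

    # Compare the hostname, not netloc: netloc also carries any port and
    # userinfo (e.g. "user@host:8080"), so an exact match on it is unreliable.
    if parsed.hostname not in ALLOWED_DOMAINS:
        raise ValueError("Domain not allowed")

    # Also check if IP is internal (optional but recommended)
    # ... IP validation logic ...

    # Do not follow redirects: a redirect could point at a disallowed host
    return requests.get(url, timeout=5, allow_redirects=False)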

# PySpark context: when reading from remote sources, take the source URL
# from trusted configuration, never from user input.
# `config` and `spark` are assumed to be provided by the application.
trusted_s3_path = config.get('SOURCE_BUCKET_URL')
df = spark.read.parquet(f"{trusted_s3_path}/data.parquet")
```
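
The IP check left as a placeholder in `safe_fetch` could look roughly like the sketch below. It uses only the standard library; the function names `is_public_ip` and `assert_host_is_public` are hypothetical and not part of any library. It assumes that every address a hostname resolves to must be publicly routable.

```python
import ipaddress
import socket


def is_public_ip(ip_text):
    """Return True only if the address is not internal, loopback, or otherwise special."""
    ip = ipaddress.ip_address(ip_text)
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:169.254.169.254) before checking.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def assert_host_is_public(hostname, port=443):
    """Resolve hostname and raise ValueError if any resolved address is internal."""
    try:
        infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        raise ValueError(f"Cannot resolve host: {hostname}") from exc
    for info in infos:
        # Drop any IPv6 scope suffix such as "%eth0" before parsing.
        address = info[4][0].split("%", 1)[0]
        if not is_public_ip(address):
            raise ValueError(f"Host resolves to a non-public address: {address}")
```

A caller would run `assert_host_is_public(parsed.hostname)` before issuing the request. Note that the HTTP client resolves the name again when it connects, so this check alone may not stop DNS rebinding; network-level egress rules are a useful second layer.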

**Prevention Strategies:**
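- Use an allow-list of domains/IPs
- Validate the URL scheme (allow only `http`/`https`; prefer `https` only)
- Block requests to internal/private IP ranges, including loopback and link-local addresses such as `169.254.169.254`
- Disable HTTP redirects, or re-validate the target of every redirect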
- Use dedicated service accounts with minimal permissions
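
When redirects cannot be disabled outright, one option is to follow them manually and re-validate every hop. The sketch below is a minimal illustration under stated assumptions: `fetch_with_checked_redirects`, `validate`, and `fetch` are hypothetical names, `validate(url)` is expected to raise `ValueError` for disallowed targets, and `fetch(url)` is expected to return `(status_code, headers, body)` without following redirects itself.

```python
from urllib.parse import urljoin

REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def fetch_with_checked_redirects(url, validate, fetch, max_hops=3):
    """Follow redirects manually, validating each URL before requesting it."""
    for _ in range(max_hops + 1):
        validate(url)  # raises ValueError if the target is not allowed
        status, headers, body = fetch(url)
        if status not in REDIRECT_STATUSES:
            return status, body
        location = headers.get("Location")
        if not location:
            raise ValueError("Redirect without Location header")
        # Location may be relative; resolve it against the current URL.
        url = urljoin(url, location)
    raise ValueError("Too many redirects")
```

With `requests`, `fetch` could wrap `requests.get(url, timeout=5, allow_redirects=False)` and return the response's `status_code`, `headers`, and `text`. Passing the validator and fetcher in as arguments keeps the redirect logic easy to test without network access.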

**Tools:** Bandit, SonarQube, Semgrep