---
title: Always Use Parameterized Queries
impact: CRITICAL
impactDescription: prevents SQL and NoSQL injection attacks
tags: injection, sql, spark-sql, database, parameterized, security, python, pyspark
---

## Always Use Parameterized Queries

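SQL injection consistently ranks among the most serious application vulnerabilities (injection is a long-standing OWASP Top 10 category). When a query is built by string concatenation or interpolation, attacker-controlled input can change the structure of the SQL itself, letting an attacker execute arbitrary database commands, steal data, or destroy tables. In PySpark, this often happens when Spark SQL queries are built with f-strings.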

**Incorrect (string concatenation or interpolation):**

```python
# SQL Injection vulnerability in standard SQL
user_id = request.args.get('id')
query = f"SELECT * FROM users WHERE id = '{user_id}'"
cursor.execute(query)

# SQL Injection vulnerability in PySpark SQL
table_name = "orders"
status = request.args.get('status')
# UNSAFE: the f-string lets user input change the structure of the SQL
spark.sql(f"SELECT * FROM {table_name} WHERE status = '{status}'")
```
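
To make the risk concrete, here is a minimal, self-contained sketch of the same pattern using Python's built-in `sqlite3` module and an in-memory database. The `users` table, its rows, and the `find_user_unsafe` helper are hypothetical and exist only for illustration; the payload is a classic tautology attack.

```python
import sqlite3

# Hypothetical in-memory table, used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("1", "alice"), ("2", "bob")],
)


def find_user_unsafe(user_id):
    # UNSAFE: user input is pasted directly into the SQL text.
    query = f"SELECT name FROM users WHERE id = '{user_id}'"
    return conn.execute(query).fetchall()


# A normal lookup matches only the requested row.
normal_rows = find_user_unsafe("1")

# The payload closes the string literal and appends an always-true condition,
# so the WHERE clause no longer restricts the result.
payload = "x' OR '1'='1"
leaked_rows = find_user_unsafe(payload)

print(normal_rows)
print(leaked_rows)
```

With the payload, the query text becomes `... WHERE id = 'x' OR '1'='1'`, so every row in the table is returned instead of one.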

**Correct (parameterized queries):**

```python
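# Parameterized query - standard DB-API
# The placeholder style depends on the driver: %s for psycopg2 and
# MySQL drivers, ? for sqlite3. The driver binds the value separately
# from the SQL text, so it is never parsed as SQL.
user_id = request.args.get('id')
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))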

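# Safe PySpark SQL (Spark 3.4+): bind values with named parameter markers
# Only values can be bound this way; keep table names fixed or allow-listed.
status = request.args.get('status')
spark.sql(
    "SELECT * FROM orders WHERE status = :status",
    args={"status": status},
)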

# Safe PySpark using DataFrame API (Recommended)
from pyspark.sql.functions import col
status = request.args.get('status')
df = spark.table("orders").filter(col("status") == status)
```
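
Placeholders bind values only; identifiers such as table or column names generally cannot be passed as parameters. One common approach, sketched below under assumptions, is to check any dynamic table name against a fixed allow-list and bind everything else. The example uses `sqlite3` so it runs as-is; `ALLOWED_TABLES`, `fetch_by_status`, and the `orders` table are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical allow-list: the only table names this code will ever query.
ALLOWED_TABLES = {"orders", "users"}


def fetch_by_status(conn, table_name, status):
    # Reject any identifier that is not one of our own known constants.
    if table_name not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table name: {table_name!r}")
    # The table name is now a vetted constant; the value is bound via "?".
    query = f"SELECT id FROM {table_name} WHERE status = ?"
    return conn.execute(query, (status,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "open"), (2, "shipped")],
)

# A normal value matches as expected.
safe_rows = fetch_by_status(conn, "orders", "open")

# The injection payload is compared as a plain string and matches nothing.
injected_rows = fetch_by_status(conn, "orders", "x' OR '1'='1")

# A tampered table name is rejected before any SQL is built.
try:
    fetch_by_status(conn, "orders; DROP TABLE orders", "open")
    rejected = False
except ValueError:
    rejected = True
```

Because the interpolated name can only be a member of the allow-list, the f-string here never carries user-controlled text. Static analyzers may still flag the pattern, in which case a targeted suppression with a comment explaining the allow-list is reasonable.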

**Benefits:**
- Prevents user input from being interpreted as SQL
- Can improve performance on databases that cache plans for prepared statements
- Handles quoting and type conversion of values automatically
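
The quoting benefit also shows up with entirely innocent input. As a small sketch (again with `sqlite3` and a hypothetical `customers` table), a name containing an apostrophe breaks the interpolated query but works unchanged with a placeholder:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, visits INTEGER)")
conn.execute("INSERT INTO customers VALUES (?, ?)", ("O'Brien", 3))

name = "O'Brien"  # legitimate input that happens to contain a quote

# Interpolation: the apostrophe ends the string literal early,
# leaving invalid SQL behind it.
try:
    conn.execute(f"SELECT visits FROM customers WHERE name = '{name}'").fetchall()
    fstring_failed = False
except sqlite3.OperationalError:
    fstring_failed = True

# Placeholder: the driver passes the value separately, so no escaping is needed.
rows = conn.execute(
    "SELECT visits FROM customers WHERE name = ?", (name,)
).fetchall()
```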

**Tools:** Bandit (B608), SonarQube, Semgrep
