# R2 SQL API Reference

SQL syntax, functions, operators, and data types for R2 SQL queries.

## SQL Syntax

```sql
SELECT column_list | aggregation_function
FROM [namespace.]table_name
WHERE conditions
[GROUP BY column_list]
[HAVING conditions]
[ORDER BY column | aggregation_function [DESC | ASC]]
[LIMIT number]
```

## Schema Discovery

```sql
SHOW DATABASES;           -- List namespaces
SHOW NAMESPACES;          -- Alias for SHOW DATABASES
SHOW SCHEMAS;             -- Alias for SHOW DATABASES
SHOW TABLES IN namespace; -- List tables in namespace
DESCRIBE namespace.table; -- Show table schema, partition keys
```

## SELECT Clause

```sql
-- All columns
SELECT * FROM logs.http_requests;

-- Specific columns
SELECT user_id, timestamp, status FROM logs.http_requests;
```

**Limitations:** No column aliases, expressions, or nested column access

## WHERE Clause

### Operators

| Operator                        | Example                                                               |
| ------------------------------- | --------------------------------------------------------------------- |
| `=`, `!=`, `<`, `<=`, `>`, `>=` | `status = 200`                                                        |
| `LIKE`                          | `user_agent LIKE '%Chrome%'`                                          |
| `BETWEEN`                       | `timestamp BETWEEN '2025-01-01T00:00:00Z' AND '2025-01-31T23:59:59Z'` |
| `IS NULL`, `IS NOT NULL`        | `email IS NOT NULL`                                                   |
| `AND`, `OR`                     | `status = 200 AND method = 'GET'`                                     |

Use parentheses for precedence: `(status = 404 OR status = 500) AND method = 'POST'`

## Aggregation Functions

| Function                     | Description           |
| ---------------------------- | --------------------- |
| `COUNT(*)`                   | Count all rows        |
| `COUNT(column)`              | Count non-null values |
| `COUNT(DISTINCT column)`     | Count unique values   |
| `SUM(column)`, `AVG(column)` | Numeric aggregations  |
| `MIN(column)`, `MAX(column)` | Min/max values        |

```sql
-- Multiple aggregations with GROUP BY
SELECT region, COUNT(*), SUM(amount), AVG(amount)
FROM sales.transactions
WHERE sale_date >= '2024-01-01'
GROUP BY region;
```

## HAVING Clause

Filter aggregated results (after GROUP BY):

```sql
SELECT category, SUM(amount)
FROM sales.transactions
GROUP BY category
HAVING SUM(amount) > 10000;
```

## ORDER BY Clause

Sort results by:

- **Partition key columns** - Always supported
- **Aggregation functions** - Supported via shuffle strategy

```sql
-- Order by partition key
SELECT * FROM logs.requests ORDER BY timestamp DESC LIMIT 100;

-- Order by aggregation (repeat function, aliases not supported)
SELECT region, SUM(amount)
FROM sales.transactions
GROUP BY region
ORDER BY SUM(amount) DESC;
```

**Limitations:** Cannot order by non-partition columns. See [gotchas.md](gotchas.md#order-by-limitations)

## LIMIT Clause

```sql
SELECT * FROM logs.requests LIMIT 100;
```

| Setting | Value  |
| ------- | ------ |
| Min     | 1      |
| Max     | 10,000 |
| Default | 500    |

**Always use LIMIT** to enable early termination optimization.

## Data Types

| Type        | SQL Literal     | Example                  |
| ----------- | --------------- | ------------------------ |
| `integer`   | Unquoted number | `42`, `-10`              |
| `float`     | Decimal number  | `3.14`, `-0.5`           |
| `string`    | Single quotes   | `'hello'`, `'GET'`       |
| `boolean`   | Keyword         | `true`, `false`          |
| `timestamp` | RFC3339 string  | `'2025-01-01T00:00:00Z'` |
| `date`      | ISO 8601 date   | `'2025-01-01'`           |

### Type Safety

- Quote strings with single quotes: `'value'`
- Timestamps must be RFC3339: `'2025-01-01T00:00:00Z'` (include timezone)
- Dates must be ISO 8601: `'2025-01-01'` (YYYY-MM-DD)
- No implicit conversions

```sql
-- ✅ Correct
WHERE status = 200 AND method = 'GET' AND timestamp > '2025-01-01T00:00:00Z'

-- ❌ Wrong
WHERE status = '200'              -- string instead of integer
WHERE timestamp > '2025-01-01'    -- missing time/timezone
WHERE method = GET                -- unquoted string
```

## Query Result Format

JSON array of objects:

```json
[
  { "user_id": "user_123", "timestamp": "2025-01-15T10:30:00Z", "status": 200 },
  { "user_id": "user_456", "timestamp": "2025-01-15T10:31:00Z", "status": 404 }
]
```

## See Also

- [patterns.md](patterns.md) - Query examples and use cases
- [gotchas.md](gotchas.md) - SQL limitations and error handling
- [configuration.md](configuration.md) - Setup and authentication
