# Image generation scripts

Two bash wrappers around the OpenAI and Google image APIs. They are designed for use by the parent `image-generation` skill but also run standalone.

## Setup

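Both scripts require `curl`, `jq`, and `base64`. These ship with recent macOS releases; on older releases `jq` may need to be installed separately (for example with `brew install jq`). Make the scripts executable once: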

```bash
chmod +x ~/.claude/skills/image-generation/scripts/*.sh
```

Export both API keys before running. Where to find each key is documented in `~/.claude/projects/-Users-shaharshavit/memory/api-keys.md`:

```bash
export OPENAI_IMAGE_API_KEY='sk-proj-...'   # from "OpenAI (image generation)" section
export GEMINI_IMAGE_API_KEY='AQ.Ab8RN...'   # from "Google AI Studio (image generation)" section
```
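
A missing tool or an unexported key otherwise only shows up as a failed API call, so a preflight check can save a round trip. The sketch below is one way to do it; `require_tools` and `require_env` are hypothetical helpers, not functions the scripts provide. It uses `printenv` so that a variable which is set but not exported still counts as missing.

```bash
# Hypothetical preflight helpers; not part of openai-image.sh or gemini-image.sh.

# require_tools TOOL...  -> returns non-zero if any tool is not on PATH.
require_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# require_env VAR...  -> returns non-zero if any variable is not exported
# or is empty. printenv only sees exported variables, which is what a
# child script would see too.
require_env() {
  local missing=0 var
  for var in "$@"; do
    if [ -z "$(printenv "$var")" ]; then
      echo "unset or unexported variable: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A typical call would be `require_tools curl jq base64 && require_env OPENAI_IMAGE_API_KEY GEMINI_IMAGE_API_KEY` before invoking either script.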

## openai-image.sh

```bash
# Basic generation
./openai-image.sh \
  --prompt "minimalist black ceramic mug on marble, soft studio light" \
  --output ./generated-images/mug-hero.png \
  --quality high \
  --size 1024x1024

# Logo with transparent background
./openai-image.sh \
  --prompt "Logo brief: ..." \
  --output ./generated-images/logo.png \
  --quality high \
  --background transparent \
  --output-format png

# 4 ideation variants in one call (cheap)
./openai-image.sh \
  --prompt "..." \
  --output ./generated-images/explore.png \
  --quality low \
  --n 4

# Edit endpoint (presence of --ref switches modes)
./openai-image.sh \
  --prompt "Replace only the clothing with the navy suit. Preserve identity." \
  --output ./generated-images/edited.png \
  --ref ./model.png \
  --ref ./suit.png \
  --quality high \
  --input-fidelity high
```
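
The `--n 4` call above writes more than one file, so a follow-up step may need the full list of paths. The sketch below derives them from the naming rule described under Behavior notes (first image at `--output`, later ones with `-2`, `-3`, and so on before the extension). `variant_paths` is a hypothetical helper, and it assumes the output path has a file extension.

```bash
# Hypothetical helper; not part of openai-image.sh.
# variant_paths OUTPUT COUNT -> prints the expected path of each image,
# one per line. Assumes OUTPUT ends in an extension such as .png.
variant_paths() {
  local output="$1" count="$2"
  local stem="${output%.*}"   # path without the extension
  local ext="${output##*.}"   # extension without the dot
  local i=2
  echo "$output"              # the first image keeps the --output path
  while [ "$i" -le "$count" ]; do
    echo "${stem}-${i}.${ext}"
    i=$((i + 1))
  done
}

variant_paths ./generated-images/explore.png 4
# ./generated-images/explore.png
# ./generated-images/explore-2.png
# ./generated-images/explore-3.png
# ./generated-images/explore-4.png
```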

## gemini-image.sh

```bash
# Default Flash 1K square
./gemini-image.sh \
  --prompt "..." \
  --output ./generated-images/test.png

# Pro 4K landscape hero
./gemini-image.sh \
  --prompt "..." \
  --output ./generated-images/hero.png \
  --model pro \
  --aspect 16:9 \
  --size 4K

# Multi-turn edit — pass previous output as reference
./gemini-image.sh \
  --prompt "Change only the sofa color to deep navy. Keep everything else exactly the same." \
  --output ./generated-images/edited.png \
  --model pro \
  --ref ./generated-images/original.png

# Brand-consistent variant with multiple references
./gemini-image.sh \
  --prompt "Image 1 is the logo. Image 2 is the brand color. Image 3 is the typography. Generate a launch hero..." \
  --output ./generated-images/branded-hero.png \
  --model pro \
  --aspect 16:9 \
  --size 4K \
  --ref ./brand/logo.png \
  --ref ./brand/colors.png \
  --ref ./brand/type.png

# Infographic with Google Search grounding (Pro only)
./gemini-image.sh \
  --prompt "Diagram of photosynthesis as a recipe..." \
  --output ./generated-images/photosynthesis.png \
  --model pro \
  --aspect 16:9 \
  --size 4K \
  --search

# Cheap exploration with Flash, thinking set to minimal
./gemini-image.sh \
  --prompt "..." \
  --output ./generated-images/explore.png \
  --model flash \
  --thinking minimal
```

## Behavior notes

- Both scripts print progress to stderr and the final saved path to stdout. Capture with `OUT=$(./openai-image.sh ...)` or pipe to `xargs open` to view immediately.
- With `--n 4`, the OpenAI script saves the first image at the `--output` path and the remaining images at the same path with `-2`, `-3`, `-4` appended before the extension (for example `explore.png`, `explore-2.png`, `explore-3.png`, `explore-4.png`).
- Gemini doesn't support `n>1` per call. For multiple variants, run several calls in parallel, or ask for "4 variations arranged in a 2×2 grid on one canvas" in the prompt and slice the result client-side.
- Errors from either API are echoed as JSON to stderr, and the script exits with code 1.
- The Gemini script only adds `thinkingConfig` when calling Flash; Pro thinks by default and doesn't accept the field.
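
The parallel-calls approach for Gemini variants can be sketched as follows. `generate_variants` and the `GEMINI_SCRIPT` variable are hypothetical names introduced for illustration, and the `variant-N.png` naming is an arbitrary choice. Each background job is waited on individually so that a non-zero exit from any call is not lost.

```bash
# Hypothetical helper; not part of gemini-image.sh.
# GEMINI_SCRIPT is an assumed variable pointing at the script to run.
GEMINI_SCRIPT="${GEMINI_SCRIPT:-./gemini-image.sh}"

# generate_variants PROMPT OUTDIR COUNT
# Runs COUNT calls in parallel, one output file per call.
# Returns non-zero if any call failed.
generate_variants() {
  local prompt="$1" outdir="$2" count="$3"
  local i=1 pid pids="" failed=0
  mkdir -p "$outdir"
  while [ "$i" -le "$count" ]; do
    # Each call runs in the background and writes its own file.
    "$GEMINI_SCRIPT" \
      --prompt "$prompt" \
      --output "$outdir/variant-$i.png" \
      --model flash &
    pids="$pids $!"
    i=$((i + 1))
  done
  # Wait per job: a bare `wait` would discard the individual exit codes.
  for pid in $pids; do
    wait "$pid" || failed=1
  done
  return "$failed"
}
```

Because each call prints its saved path to stdout, `generate_variants "..." ./generated-images/explore 4` emits the four paths in completion order, which may differ from the numeric order.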

## Quick test

```bash
mkdir -p /tmp/imagetest
./gemini-image.sh \
  --prompt "A red apple on a white plate, simple flat illustration." \
  --output /tmp/imagetest/apple.png \
  --model flash \
  --aspect 1:1 \
  --size 1K \
&& open /tmp/imagetest/apple.png
```
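
The `open` command in the quick test is macOS-specific. If the scripts are run on a Linux desktop, a small wrapper can pick whichever opener is installed. This is only a sketch: `pick_viewer` and `open_image` are hypothetical helpers, and `xdg-open` is assumed to be the Linux counterpart available on the machine.

```bash
# Hypothetical helpers; not part of either script.

# pick_viewer CANDIDATE...  -> prints the first candidate found on PATH,
# or returns non-zero if none is installed.
pick_viewer() {
  local candidate
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# open_image PATH -> opens the image with xdg-open (Linux) or open (macOS).
# xdg-open is tried first because it does not exist on macOS.
open_image() {
  local viewer
  if viewer=$(pick_viewer xdg-open open); then
    "$viewer" "$1"
  else
    echo "saved: $1 (no viewer found)" >&2
  fi
}
```

With this in place, the last line of the quick test could read `&& open_image /tmp/imagetest/apple.png`.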