# Label Studio Integration Guide

This document explains how to integrate Label Studio with Things Factory using subdomain-based cookie sharing for SSO authentication.

## Architecture Overview

The integration uses **subdomain cookie sharing** instead of proxying, which simplifies deployment and improves performance.

### Flow Diagram

```
┌─────────────────────────────────────────────────────────────────┐
│ 1. Frontend calls SSO setup endpoint                            │
│    GET /label-studio/sso/setup                                  │
│    Credentials: include                                         │
└────────────────────┬────────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────────┐
│ 2. Backend requests JWT from Label Studio                       │
│    POST http://label.example.com:8080/api/sso/token             │
│    Authorization: Token {apiToken}                              │
│    Body: { email: "user@example.com" }                          │
└────────────────────┬────────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────────┐
│ 3. Backend sets cookie with shared domain                       │
│    Set-Cookie: ls_auth_token={jwt}                              │
│    Domain: .example.com  ← Shared across all *.example.com      │
│    Path: /                                                      │
│    SameSite: Lax                                                │
└────────────────────┬────────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────────┐
│ 4. Frontend loads Label Studio in iframe                        │
│    <iframe src="http://label.example.com:8080/projects/1">      │
│    Cookie is automatically sent with request                    │
└─────────────────────────────────────────────────────────────────┘
```
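Step 3 of the flow can be sketched as building the `Set-Cookie` header value. This is a minimal illustration, not the integration's actual code: `buildAuthCookie` and `AuthCookieOptions` are hypothetical names, and a real server would normally use its framework's cookie helper rather than assembling the string by hand.

```typescript
// Hypothetical option names for illustration; they mirror step 3 of the diagram.
interface AuthCookieOptions {
  domain: string // shared parent domain, e.g. ".example.com"
  path?: string // defaults to "/"
  sameSite?: 'Lax' | 'Strict' | 'None' // defaults to "Lax"
  secure?: boolean // should be true when served over HTTPS
  httpOnly?: boolean // hides the cookie from page scripts
  maxAgeSeconds?: number // optional lifetime in seconds
}

// Builds the value of a Set-Cookie header carrying the Label Studio JWT.
// A JWT is three base64url segments joined by dots, so it needs no extra encoding.
function buildAuthCookie(jwt: string, opts: AuthCookieOptions): string {
  const parts = [
    `ls_auth_token=${jwt}`,
    `Domain=${opts.domain}`,
    `Path=${opts.path ?? '/'}`,
    `SameSite=${opts.sameSite ?? 'Lax'}`
  ]
  if (opts.maxAgeSeconds !== undefined) parts.push(`Max-Age=${opts.maxAgeSeconds}`)
  if (opts.secure) parts.push('Secure')
  if (opts.httpOnly) parts.push('HttpOnly')
  return parts.join('; ')
}
```

Because `Domain` names the shared parent domain, the browser attaches the cookie to requests for any host under it, including the Label Studio subdomain loaded in the iframe.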

## Configuration

### 1. Domain Setup

Both Things Factory and Label Studio must be reachable on subdomains of the same parent domain:

**Development (localhost):**
```
Things Factory:  http://app.dataset.localhost:3000
Label Studio:    http://label.dataset.localhost:8080
```

**Production:**
```
Things Factory:  https://app.example.com
Label Studio:    https://label.example.com
```

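**Important:** Most modern browsers (for example Chrome and Firefox) resolve `*.localhost` to `127.0.0.1` automatically, so development usually needs no `/etc/hosts` changes. If a browser or the Things Factory backend cannot resolve these names on your machine, add them to `/etc/hosts`.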

### 2. Label Studio Configuration

Create or update the `.env` file in your Label Studio installation:

```bash
# JWT SSO Settings
JWT_SSO_SECRET=your-secret-key-here
JWT_SSO_ALGORITHM=HS256
JWT_SSO_TOKEN_PARAM=token
JWT_SSO_EMAIL_CLAIM=email
JWT_SSO_USERNAME_CLAIM=username
JWT_SSO_FIRST_NAME_CLAIM=first_name
JWT_SSO_LAST_NAME_CLAIM=last_name
JWT_SSO_AUTO_CREATE_USERS=true
JWT_SSO_COOKIE_NAME=ls_auth_token

# CORS and CSRF Settings
CSRF_TRUSTED_ORIGINS=http://app.dataset.localhost:3000,http://label.dataset.localhost:8080
# Django matches subdomains with a leading dot (".dataset.localhost"), not "*."
ALLOWED_HOSTS=localhost,label.dataset.localhost,.dataset.localhost

# CSP Settings - Allow iframe embedding
ENABLE_CSP=False

# Session Settings
SESSION_COOKIE_SAMESITE=Lax
# Set SESSION_COOKIE_SECURE=1 when serving over HTTPS
SESSION_COOKIE_SECURE=0
SESSION_COOKIE_HTTPONLY=0
SESSION_COOKIE_PATH=/
```

### 3. Things Factory Configuration

Add Label Studio configuration to your Things Factory config file:

**config/default.json:**
```json
{
  "labelStudio": {
    "serverUrl": "http://label.dataset.localhost:8080",
    "apiToken": "your-label-studio-api-token",
    "cookieDomain": ".dataset.localhost"
  }
}
```

**config/production.json:**
```json
{
  "labelStudio": {
    "serverUrl": "https://label.example.com",
    "apiToken": "your-label-studio-api-token",
    "cookieDomain": ".example.com"
  }
}
```
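A mismatch between `serverUrl` and `cookieDomain` is an easy mistake to make, so a startup check can help. The sketch below is illustrative only: `hostMatchesCookieDomain` and `checkLabelStudioConfig` are hypothetical helpers, not part of the integration, and the matching rule is a simplified version of cookie domain matching.

```typescript
// Shape of the "labelStudio" config section shown above.
interface LabelStudioConfig {
  serverUrl: string
  apiToken: string
  cookieDomain: string
}

// True when `hostname` equals the cookie domain or is a subdomain of it.
function hostMatchesCookieDomain(hostname: string, cookieDomain: string): boolean {
  const domain = cookieDomain.replace(/^\./, '').toLowerCase()
  const host = hostname.toLowerCase()
  return host === domain || host.endsWith(`.${domain}`)
}

// Returns a list of problems; an empty list means both hosts share the cookie domain.
function checkLabelStudioConfig(config: LabelStudioConfig, appHostname: string): string[] {
  const problems: string[] = []
  const labelHost = new URL(config.serverUrl).hostname
  if (!hostMatchesCookieDomain(labelHost, config.cookieDomain)) {
    problems.push(`Label Studio host ${labelHost} is outside ${config.cookieDomain}`)
  }
  if (!hostMatchesCookieDomain(appHostname, config.cookieDomain)) {
    problems.push(`Things Factory host ${appHostname} is outside ${config.cookieDomain}`)
  }
  return problems
}
```

With the development values above, `checkLabelStudioConfig(config, 'app.dataset.localhost')` returns an empty list because both hosts sit under `.dataset.localhost`.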

### 4. Get Label Studio API Token

1. Start Label Studio and log in as an admin
2. Navigate to **Account & Settings** → **Access Token**
3. Click **Create new token**
4. Copy the token and set it as `apiToken` in the Things Factory configuration

## Usage

### Embedding Label Studio

Use the `<label-studio-wrapper>` component to embed Label Studio:

```typescript
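// Assumes the app uses Lit; adjust the import if your project uses another version
import { LitElement, html } from 'lit'
import '@things-factory/integration-label-studio/client' // registers <label-studio-wrapper>

// Example host component; the class and tag names are illustrative
class LabelingPage extends LitElement {
  render() {
    const projectId = 123
    const path = `/projects/${projectId}`

    // `path` is the Label Studio path to load inside the embedded iframe
    return html`
      <label-studio-wrapper .path=${path}></label-studio-wrapper>
    `
  }
}

customElements.define('labeling-page', LabelingPage)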
```

### API Endpoints

The integration provides the following endpoints:

**SSO Setup:**
```
GET /label-studio/sso/setup
```
Sets up SSO authentication by requesting a JWT from Label Studio and setting it as a cookie on the shared domain.

**Health Check:**
```
GET /label-studio/sso/health
```
Returns Label Studio integration status.

**Configuration:**
```
GET /label-studio/sso/config
```
Returns current Label Studio configuration.
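As a rough sketch of how a client might call the SSO setup endpoint before showing the iframe: `ensureLabelStudioSession` and the `FetchLike` type are hypothetical, and the response body is deliberately ignored because its shape is not specified here. The fetch function is injected so the sketch does not depend on a particular runtime.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface SetupResponse {
  ok: boolean
  status: number
}
type FetchLike = (
  url: string,
  init: { method: string; credentials: 'include' }
) => Promise<SetupResponse>

// Calls the SSO setup endpoint and rejects if the server reports a failure.
// `credentials: 'include'` makes the browser send and store cookies for the request.
async function ensureLabelStudioSession(fetchFn: FetchLike, baseUrl = ''): Promise<void> {
  const res = await fetchFn(`${baseUrl}/label-studio/sso/setup`, {
    method: 'GET',
    credentials: 'include'
  })
  if (!res.ok) {
    throw new Error(`Label Studio SSO setup failed with HTTP ${res.status}`)
  }
}
```

In a browser, passing `(url, init) => fetch(url, init)` as `fetchFn` and awaiting the call before setting the iframe `src` should ensure the cookie exists when Label Studio loads.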

## Development Setup

### 1. Start Label Studio

```bash
cd /path/to/label-studio
label-studio start --host label.dataset.localhost --port 8080
```

### 2. Start Things Factory

```bash
cd /path/to/things-factory
DEBUG=things-factory:* yarn workspace @things-factory/your-app run serve:dev
```

### 3. Access Application

Open your browser to:
```
http://app.dataset.localhost:3000
```

## Troubleshooting

### Cookie Not Set

**Symptom:** SSO setup succeeds but the cookie doesn't appear in the browser

**Solutions:**
1. Verify that `cookieDomain` in the Things Factory config matches your domain pattern
2. Check that Label Studio's `ALLOWED_HOSTS` includes your subdomain
3. Ensure both services use the same root domain (e.g., both use `*.dataset.localhost`)

### CORS Errors

**Symptom:** The browser blocks requests with CORS errors

**Solutions:**
1. Add the Things Factory origin to Label Studio's `CSRF_TRUSTED_ORIGINS`
2. Verify that `ALLOWED_HOSTS` includes all subdomain patterns
3. Confirm that `ENABLE_CSP=False` is set in the Label Studio configuration, since CSP can block iframe embedding

### Authentication Loop

**Symptom:** Label Studio keeps showing the login page

**Solutions:**
1. Verify the JWT SSO settings in the Label Studio `.env`
2. Check that the cookie is being sent with iframe requests (browser DevTools → Network)
3. Ensure `JWT_SSO_COOKIE_NAME=ls_auth_token` matches the cookie name the integration sets

### Token Expiration

**Symptom:** Users are logged out after 10 minutes

**Solution:**
Label Studio JWT tokens expire after 600 seconds (10 minutes) by default. The integration re-authenticates automatically when the page is refreshed, so reloading the page restores the session.
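For long-running pages, one possible approach is to re-run SSO setup shortly before the token would expire instead of waiting for a page refresh. The sketch below is an assumption, not something the integration is documented to do: `refreshDelayMs` and `keepSessionAlive` are hypothetical helpers, and whether a refreshed cookie helps an already open iframe depends on how Label Studio maintains its session.

```typescript
// Default lifetime of the Label Studio JWT, per the text above.
const DEFAULT_TOKEN_TTL_SECONDS = 600

// How long to wait between refreshes: the token lifetime minus a safety margin,
// never less than one second so the timer cannot spin.
function refreshDelayMs(ttlSeconds = DEFAULT_TOKEN_TTL_SECONDS, marginSeconds = 60): number {
  const delaySeconds = Math.max(ttlSeconds - marginSeconds, 1)
  return delaySeconds * 1000
}

// Calls `setup` on a fixed interval and returns a function that stops the timer.
// `setup` is expected to call GET /label-studio/sso/setup with credentials included.
function keepSessionAlive(
  setup: () => Promise<void>,
  ttlSeconds = DEFAULT_TOKEN_TTL_SECONDS
): () => void {
  const timer = setInterval(() => {
    setup().catch((err: unknown) => console.error('Label Studio SSO refresh failed', err))
  }, refreshDelayMs(ttlSeconds))
  return () => clearInterval(timer)
}
```

With the defaults, the refresh runs every 540 seconds, leaving a 60 second margin before the 600 second expiry. Call the returned function when the embedding component is removed from the page.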

## Security Considerations

### Production Checklist

- ✅ Use HTTPS for all services
- ✅ Set `SESSION_COOKIE_SECURE=1` in Label Studio
- ✅ Set `httpOnly: true` in SSO route cookie options
- ✅ Use strong `JWT_SSO_SECRET`
- ✅ Limit `ALLOWED_HOSTS` to specific domains
- ✅ Enable CSRF protection
- ✅ Use firewall rules to restrict Label Studio access

### Cookie Domain Best Practices

**Do:**
- Use `.example.com` for sharing across all subdomains
- Use a more specific subdomain pattern when the cookie should reach fewer hosts

**Don't:**
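- Use `.localhost` on its own (browsers treat it as a public suffix and reject the cookie; use a subdomain such as `.dataset.localhost` instead)
- Use bare TLDs like `.com` (browsers reject cookies scoped to a public suffix)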
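The rules above can be partly checked in code. This is a simplified sketch with a hypothetical `checkCookieDomain` helper: it only rejects single-label domains such as `.localhost` or `.com` and does not consult the Public Suffix List, so multi-label suffixes like `.co.uk` would still pass.

```typescript
// Returns a description of the problem, or null when the domain looks usable.
// Simplification: only single-label domains are rejected; the Public Suffix List is not consulted.
function checkCookieDomain(cookieDomain: string): string | null {
  const labels = cookieDomain
    .replace(/^\./, '')
    .split('.')
    .filter(label => label.length > 0)

  if (labels.length < 2) {
    return `"${cookieDomain}" is a single-label domain; browsers reject cookies scoped to it`
  }
  return null
}
```

For example, `checkCookieDomain('.dataset.localhost')` returns `null`, while `checkCookieDomain('.localhost')` returns a message explaining the rejection.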

## Migration from Proxy Approach

If you're migrating from the previous proxy-based integration:

### Changes Required

1. **Configuration:**
   - Add `cookieDomain` to the `labelStudio` section of the Things Factory config
   - Update the Label Studio `.env` with the `CSRF_TRUSTED_ORIGINS` and `ALLOWED_HOSTS` settings

2. **Frontend:**
   - No changes needed - component interface remains the same
   - Iframe URLs now point directly to Label Studio (not proxied)

3. **Nginx/Proxy:**
   - Remove proxy rules for `/label-studio/*` paths
   - Label Studio should be directly accessible on its subdomain

### Benefits of Subdomain Approach

- ✅ **Simpler deployment** - No complex proxy configuration
- ✅ **Better performance** - No proxy overhead
- ✅ **Easier debugging** - Direct access to Label Studio
- ✅ **Standard approach** - Follows common SSO patterns
- ✅ **Better caching** - Browser can cache Label Studio assets

## References

- [Label Studio SSO Documentation](https://labelstud.io/guide/auth_setup.html#JWT-SSO)
- [Cookie Domain Specification](https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies#define_where_cookies_are_sent)
- [Public Suffix List](https://publicsuffix.org/)
