# RTZ-IO

RTZ-IO is a TypeScript library for serializing and persisting the RTZ file format. It creates, packs, unpacks, and validates RTZ files, with optional encryption and support for binary assets and metadata validation.

## Features

- **Working State Management**: Initialize and validate RTZ working states
- **Serialization**: Pack working states into compressed RTZ files
- **Deserialization**: Unpack RTZ files back to working states
- **Encryption**: Optional AES-256-GCM encryption with key management
- **Assets**: Support for binary assets and file attachments
- **Validation**: Comprehensive data validation and integrity checking
- **CLI Tools**: Command-line interface for common operations
- **TypeScript**: Full TypeScript support with type definitions

## Installation

```bash
npm install rtz-io
```

## Quick Start

### Basic Usage

```typescript
import { 
  initWorkingState, 
  workingToPersistence, 
  persistenceToWorking 
} from 'rtz-io';

// Create a new working state
const workingState = initWorkingState();
workingState.header.title = 'My RTZ Document';
workingState.content.data = { message: 'Hello, RTZ!' };

// Pack to RTZ file
const buffer = await workingToPersistence(workingState);

// Unpack from RTZ file
const restored = await persistenceToWorking(buffer);
```

### With Encryption

```typescript
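import {
  initWorkingState,
  workingToPersistence,
  persistenceToWorking,
  generateKey
} from 'rtz-io';

const workingState = initWorkingState();

// Generate a 32-byte key (AES-256) and a provider that returns it for any key ID
const key = generateKey(32);
const keyProvider = async (keyId: string) => key;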

// Pack with encryption
const encrypted = await workingToPersistence(workingState, {
  encrypt: true,
  keyProvider,
  keyId: 'my-key'
});

// Unpack with decryption
const restored = await persistenceToWorking(encrypted, { keyProvider });
```
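
This README does not describe how RTZ-IO lays out encrypted data internally. As a rough illustration of what AES-256-GCM with a 32-byte key involves, the sketch below uses Node's built-in `crypto` module. The helper names `sealBytes` and `openBytes` and the `IV | tag | ciphertext` layout are assumptions made for this example, not the library's format.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Hypothetical helper: encrypts and returns IV (12 bytes) + auth tag (16 bytes) + ciphertext.
function sealBytes(key: Buffer, plaintext: Buffer): Buffer {
  const iv = randomBytes(12); // 96-bit nonce, the size recommended for GCM
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

// Hypothetical helper: reverses sealBytes and verifies the auth tag.
function openBytes(key: Buffer, sealed: Buffer): Buffer {
  const iv = sealed.subarray(0, 12);
  const tag = sealed.subarray(12, 28);
  const ciphertext = sealed.subarray(28);
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the key is wrong or the data was modified
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

const demoKey = randomBytes(32);
const sealedDemo = sealBytes(demoKey, Buffer.from('Hello, RTZ!'));
const openedDemo = openBytes(demoKey, sealedDemo).toString('utf8');
```

Because GCM is authenticated, decryption fails outright on a wrong key or modified data instead of returning garbage.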

### With Assets

```typescript
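import fs from 'fs';
import { initWorkingState, workingToPersistence, calculateHash } from 'rtz-io';

// Read the asset bytes from disk (example path)
const imageData = fs.readFileSync('./photo.jpg');

const workingState = initWorkingState();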

// Add assets
workingState.content.assets = [
  {
    id: 'image-1',
    name: 'photo.jpg',
    type: 'image',
    size: imageData.length,
    checksum: calculateHash(imageData),
    data: imageData
  }
];

const buffer = await workingToPersistence(workingState, { includeAssets: true });
```

## API Reference

### Core Functions

#### `initWorkingState(): WorkingState`
Creates a new WorkingState object with default values.

#### `workingToPersistence(state: WorkingState, options?: PersistOptions): Promise<Buffer>`
Converts a WorkingState to RTZ file format.

**Options:**
- `encrypt?: boolean` - Enable AES-256-GCM encryption
- `keyProvider?: (keyId: string) => Promise<Buffer>` - Key provider function
- `keyId?: string` - Key identifier for encryption
- `compression?: 'none' | 'gzip' | 'deflate'` - Compression method
- `includeAssets?: boolean` - Include asset data in the archive
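
The `keyProvider` option only fixes a signature, so how keys are stored is up to the caller. One possible shape, sketched here with a hypothetical `createKeyRing` helper over an in-memory `Map`, looks keys up by ID and rejects unknown IDs or keys of the wrong length:

```typescript
import { randomBytes } from 'node:crypto';

// Same signature as the keyProvider option documented above.
type KeyProvider = (keyId: string) => Promise<Buffer>;

// Hypothetical helper (not part of rtz-io): builds a provider over an in-memory key ring.
function createKeyRing(keys: Map<string, Buffer>): KeyProvider {
  return async (keyId) => {
    const key = keys.get(keyId);
    if (!key) throw new Error(`Unknown key id: ${keyId}`);
    if (key.length !== 32) throw new Error(`Key ${keyId} is not 32 bytes`);
    return key;
  };
}

const keyRing = createKeyRing(new Map([['my-key', randomBytes(32)]]));
```

A provider built this way could be passed as `keyProvider` when packing and again when unpacking, with `keyId` selecting the entry.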

#### `persistenceToWorking(buffer: Buffer, options?: LoadOptions): Promise<WorkingState>`
Converts RTZ file format back to WorkingState.

**Options:**
- `keyProvider?: (keyId: string) => Promise<Buffer>` - Key provider for decryption
- `validateSignature?: boolean` - Verify manifest signatures
- `extractAssets?: boolean` - Extract asset data

#### `validateWorkingState(state: any): asserts state is WorkingState`
Validates that an object conforms to the WorkingState interface.
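
The `asserts state is WorkingState` return type means the function throws on invalid input and narrows the type on success. The sketch below shows that pattern with trimmed-down stand-in interfaces. `assertWorkingStateLike` and the checks it performs are assumptions for illustration and may differ from what the library validates.

```typescript
// Trimmed-down stand-ins for the library's interfaces (required fields only).
interface HeaderLike { version: string; created: string; modified: string; id: string; }
interface WorkingStateLike {
  header: HeaderLike;
  content: { type: string; data: unknown };
  history: unknown[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Hypothetical validator: throws a TypeError on the first problem it finds.
function assertWorkingStateLike(state: unknown): asserts state is WorkingStateLike {
  if (!isRecord(state)) throw new TypeError('state must be an object');
  const header = state['header'];
  const content = state['content'];
  if (!isRecord(header)) throw new TypeError('header must be an object');
  for (const field of ['version', 'created', 'modified', 'id']) {
    if (typeof header[field] !== 'string') {
      throw new TypeError(`header.${field} must be a string`);
    }
  }
  if (!isRecord(content) || typeof content['type'] !== 'string') {
    throw new TypeError('content.type must be a string');
  }
  if (!Array.isArray(state['history'])) throw new TypeError('history must be an array');
}

// Illustrative input; the field values are made up for this example.
const candidate: unknown = JSON.parse(
  '{"header":{"version":"1.0","created":"2024-01-01T00:00:00Z",' +
  '"modified":"2024-01-01T00:00:00Z","id":"doc-1"},' +
  '"content":{"type":"json","data":{}},"history":[]}'
);
assertWorkingStateLike(candidate);
const documentId = candidate.header.id; // compiles because the assertion narrowed the type
```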

### Utility Functions

#### `canonicalizeJson(obj: any): string`
Creates a canonical JSON representation for consistent serialization.
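
The exact canonical form is not specified here. A common approach, sketched below as an assumption, is to sort object keys recursively and emit no whitespace, so that two objects with the same data always serialize to the same string regardless of key order.

```typescript
// Sketch of key-sorted serialization; not necessarily the algorithm rtz-io uses.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    // Array order is significant, so elements keep their positions
    return `[${value.map((item) => canonicalize(item)).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const record = value as Record<string, unknown>;
    const members = Object.keys(record)
      .filter((key) => record[key] !== undefined) // undefined members are dropped, as in JSON.stringify
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalize(record[key])}`);
    return `{${members.join(',')}}`;
  }
  // Primitives; undefined has no JSON form, so it falls back to null
  return JSON.stringify(value) ?? 'null';
}

const firstForm = canonicalize({ b: 1, a: { d: [2, 3], c: 'x' } });
const secondForm = canonicalize({ a: { c: 'x', d: [2, 3] }, b: 1 });
// Both are '{"a":{"c":"x","d":[2,3]},"b":1}'
```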

#### `calculateHash(data: Buffer | string): string`
Calculates SHA-256 hash of data.
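
For reference, an equivalent computation with Node's built-in `crypto` module might look like the following. The hex output encoding is an assumption, since the encoding of the returned string is not stated here, and `sha256Hex` is a hypothetical name.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical stand-in: SHA-256 digest as a lowercase hex string.
function sha256Hex(data: Buffer | string): string {
  return createHash('sha256').update(data).digest('hex');
}

const abcDigest = sha256Hex('abc');
// 'ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad'

// Strings are hashed as UTF-8, so the same bytes in a Buffer give the same digest
const sameDigest = sha256Hex(Buffer.from('abc', 'utf8')) === abcDigest;
```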

#### `generateKey(length?: number): Buffer`
Generates a random encryption key (default: 32 bytes for AES-256).

#### `generateId(): string`
Generates a random UUID v4.
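
Both helpers correspond to primitives in Node's `crypto` module. The sketch below shows those primitives directly. Whether RTZ-IO uses them internally is an assumption.

```typescript
import { randomBytes, randomUUID } from 'node:crypto';

// 32 cryptographically random bytes, the key size AES-256 requires
const rawKey = randomBytes(32);

// RFC 4122 version 4 UUID, e.g. for asset or document IDs
const freshId = randomUUID();
```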

### Types

```typescript
interface WorkingState {
  header: Header;
  content: Content;
  history: HistoryEntry[];
  usagePolicy: UsagePolicy;
}

interface Header {
  version: string;
  created: string;
  modified: string;
  id: string;
  title?: string;
  description?: string;
  tags?: string[];
  metadata?: Record<string, any>;
}

interface Content {
  type: string;
  data: any;
  encoding?: string;
  checksum?: string;
  assets?: Asset[];
}

interface Asset {
  id: string;
  name: string;
  type: string;
  size: number;
  checksum: string;
  path?: string;
  data?: Buffer;
}
```

## CLI Usage

The package includes a command-line tool for common RTZ operations:

```bash
# Initialize a new RTZ working state
rtz init ./my-project

# Pack a directory into an RTZ file
rtz pack ./my-project project.rtz

# Unpack an RTZ file
rtz unpack project.rtz ./extracted

# Show RTZ file information
rtz info project.rtz

# Generate an encryption key
rtz --generate-key ./my-key.bin

# Pack with encryption
rtz pack ./my-project project.rtz --encrypt --key-file ./my-key.bin

# Unpack encrypted file
rtz unpack project.rtz ./extracted --key-file ./my-key.bin
```

## JSON Splitting and Partitioning

RTZ-IO can partition any JSON object into two files:
1. A "plain" file with sensitive fields replaced by UUID references
2. An "encrypted" file containing the original sensitive values keyed by UUIDs

This is useful when sensitive data must be stored or transmitted separately from the rest of the document.
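
To make the idea concrete, here is a self-contained sketch of the split-and-merge technique. It does not use RTZ-IO. The names `splitValue` and `mergeValue` and the `ref:` marker are hypothetical, and the library's actual reference format may differ.

```typescript
import { randomUUID } from 'node:crypto';

type Predicate = (path: string, value: unknown) => boolean;
type SecretMap = Record<string, unknown>;

// Hypothetical marker that distinguishes a reference from an ordinary string.
const REF_PREFIX = 'ref:';

const isObject = (value: unknown): value is Record<string, unknown> =>
  typeof value === 'object' && value !== null && !Array.isArray(value);

const childPath = (path: string, key: string | number): string =>
  path ? `${path}.${key}` : String(key);

// Replace every matching field with a UUID reference and record the original in `secrets`.
function splitValue(value: unknown, shouldHide: Predicate, secrets: SecretMap, path = ''): unknown {
  if (path !== '' && shouldHide(path, value)) {
    const id = randomUUID();
    secrets[id] = value;
    return REF_PREFIX + id;
  }
  if (Array.isArray(value)) {
    return value.map((item, index) => splitValue(item, shouldHide, secrets, childPath(path, index)));
  }
  if (isObject(value)) {
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(value)) {
      out[key] = splitValue(value[key], shouldHide, secrets, childPath(path, key));
    }
    return out;
  }
  return value;
}

// Walk the plain value and swap each known reference back for its original.
function mergeValue(plain: unknown, secrets: SecretMap): unknown {
  if (typeof plain === 'string' && plain.startsWith(REF_PREFIX)) {
    const id = plain.slice(REF_PREFIX.length);
    return id in secrets ? secrets[id] : plain;
  }
  if (Array.isArray(plain)) return plain.map((item) => mergeValue(item, secrets));
  if (isObject(plain)) {
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(plain)) out[key] = mergeValue(plain[key], secrets);
    return out;
  }
  return plain;
}

const secretStore: SecretMap = {};
const originalDoc = {
  name: 'John Doe',
  password_enc: 'secret123',
  profile: { ssn_enc: '123-45-6789', public_info: 'This is public' }
};
const plainDoc = splitValue(originalDoc, (path) => path.endsWith('_enc'), secretStore);
const restoredDoc = mergeValue(plainDoc, secretStore);
```

After the split, `plainDoc` contains only references in place of the `_enc` fields, and `secretStore` holds the two original values keyed by UUID.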

### Basic JSON Splitting

```typescript
import { splitJsonToFiles, mergeJsonFromFiles, CommonPredicates } from 'rtz-io';

const sensitiveData = {
  name: 'John Doe',
  email: 'john@example.com',
  password_enc: 'secret123',
  profile: {
    ssn_enc: '123-45-6789',
    credit_card_enc: '4111-1111-1111-1111',
    public_info: 'This is public'
  }
};

// Split into two files
const result = await splitJsonToFiles(sensitiveData, {
  plainFile: './output/plain.json',
  encryptedFile: './output/encrypted.json',
  predicate: CommonPredicates.encryptedFields // Matches fields ending with '_enc', '_encrypted', or '_secret'
});

console.log(`Encrypted ${result.stats.totalEncryptedFields} fields`);
console.log(`Total size: ${result.sizes.totalBytes} bytes`);

// Later, reconstruct the original data
const reconstructed = await mergeJsonFromFiles({
  plainFile: './output/plain.json',
  encryptedFile: './output/encrypted.json'
});

// reconstructed is deeply equal to sensitiveData (a new object, not the same reference)
```

### Custom Predicates

You can define custom logic for determining which fields should be encrypted:

```typescript
import { partitionJson, PartitionPredicates } from 'rtz-io';

// Custom predicate: encrypt fields containing 'secret' or 'password'
const customPredicate = (path: string, value: any) => {
  const fieldName = path.split('.').pop() || '';
  return /(?:secret|password)/i.test(fieldName);
};

// Built-in predicate utilities
const predicates = {
  // Fields ending with specific suffixes
  encryptedFields: PartitionPredicates.fieldEndsWith('_enc'),
  
  // Fields matching regex patterns
  sensitiveFields: PartitionPredicates.fieldMatches(/(?:password|secret|key|token)/i),
  
  // Specific field paths
  specificPaths: PartitionPredicates.specificPaths(['user.ssn', 'payment.card_number']),
  
  // Values matching patterns (e.g., credit card numbers)
  creditCards: PartitionPredicates.valueMatches([/^\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}$/])
};
```

### Common Predicate Types

The library includes pre-built predicates for common sensitive data patterns:

```typescript
import { CommonPredicates } from 'rtz-io';

const predicates = {
  // Fields ending with '_enc', '_encrypted', '_secret'
  encryptedFields: CommonPredicates.encryptedFields,
  
  // Fields containing 'password', 'secret', 'key', 'token'
  sensitiveFields: CommonPredicates.sensitiveFields,
  
  // Email addresses
  emails: CommonPredicates.emails,
  
  // Credit card numbers
  creditCards: CommonPredicates.creditCards,
  
  // Social Security Numbers (US format)
  ssn: CommonPredicates.ssn,
  
  // Phone numbers (various formats)
  phoneNumbers: CommonPredicates.phoneNumbers,
  
  // Combine multiple predicates
  combined: CommonPredicates.anyOf(
    CommonPredicates.encryptedFields,
    CommonPredicates.creditCards,
    CommonPredicates.ssn
  )
};
```
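
Since a predicate is just a function of `(path, value)`, combinators are straightforward to write by hand when the built-in ones do not fit. The helpers below (`endsWithAny`, `stringMatches`, `either`) are hypothetical re-implementations for illustration, not the library's source.

```typescript
type PartitionPredicate = (path: string, value: unknown) => boolean;

// Last dot-separated segment of a path, e.g. 'profile.ssn_enc' -> 'ssn_enc'
const lastSegment = (path: string): string => path.split('.').pop() ?? '';

// Matches on the field name
const endsWithAny = (...suffixes: string[]): PartitionPredicate =>
  (path) => suffixes.some((suffix) => lastSegment(path).endsWith(suffix));

// Matches on the value; non-string values never match
const stringMatches = (patterns: RegExp[]): PartitionPredicate =>
  (_path, value) => typeof value === 'string' && patterns.some((p) => p.test(value));

// True when at least one of the given predicates matches
const either = (...predicates: PartitionPredicate[]): PartitionPredicate =>
  (path, value) => predicates.some((predicate) => predicate(path, value));

const isSensitive = either(
  endsWithAny('_enc', '_secret'),
  stringMatches([/^\d{3}-\d{2}-\d{4}$/]) // US SSN shape
);

const byName = isSensitive('user.password_enc', 'secret123');   // true: name suffix
const byValue = isSensitive('user.taxId', '123-45-6789');       // true: value pattern
const neither = isSensitive('user.name', 'John Doe');           // false
```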

### Advanced Features

#### File Validation

```typescript
import { validateJsonSplitFiles } from 'rtz-io';

const validation = await validateJsonSplitFiles({
  plainFile: './plain.json',
  encryptedFile: './encrypted.json'
});

if (validation.valid) {
  console.log('Files are valid and can be merged');
} else {
  console.log('Validation errors:', validation.errors);
}
```

#### Backup Creation

```typescript
import { backupJsonSplitFiles } from 'rtz-io';

const backupResult = await backupJsonSplitFiles({
  plainFile: './plain.json',
  encryptedFile: './encrypted.json'
}, './backups');

console.log('Backups created:', backupResult);
```

#### Low-level Partitioning

For more control, use the low-level partitioning API:

```typescript
import {
  partitionJson,
  mergeJson,
  validatePartition,
  getPartitionStats,
  CommonPredicates
} from 'rtz-io';

// Example input and predicate
const data = { name: 'John Doe', password_enc: 'secret123' };
const predicate = CommonPredicates.encryptedFields;

// Partition in memory (no file I/O)
const { plain, encryptedMap } = partitionJson(data, predicate);

// Get statistics
const stats = getPartitionStats({ plain, encryptedMap });
console.log(`Encrypted ${stats.totalEncryptedFields} fields`);

// Validate partition integrity against the original input
const isValid = validatePartition(data, { plain, encryptedMap });

// Merge back to original
const reconstructed = mergeJson(plain, encryptedMap);
```

### CLI Commands for JSON Splitting

The CLI includes commands for JSON splitting operations:

```bash
# Split a JSON file using encrypted fields predicate
rtz split-json data.json plain.json encrypted.json encrypted-fields

# Split using different predicates
rtz split-json data.json plain.json encrypted.json sensitive-fields
rtz split-json data.json plain.json encrypted.json emails
rtz split-json data.json plain.json encrypted.json credit-cards

# Merge split files back together
rtz merge-json plain.json encrypted.json reconstructed.json
```

Available predicate types for CLI:
- `encrypted-fields` - Fields ending with `_enc`, `_encrypted`, `_secret`
- `sensitive-fields` - Fields containing `password`, `secret`, `key`, `token`
- `emails` - Email address values
- `credit-cards` - Credit card number values
- `ssn` - Social Security Number values
- `phone-numbers` - Phone number values

### JSON Splitting API Reference

#### `splitJsonToFiles(data, options): Promise<JsonSplitResult>`

Splits a JSON object into two files.

**Parameters:**
- `data: any` - The JSON object to split
- `options: JsonSplitOptions` - Configuration options

**Options:**
- `plainFile: string` - Path for the plain JSON file (with UUID references)
- `encryptedFile: string` - Path for the encrypted JSON file (sensitive values)
- `predicate: PartitionPredicate` - Function to determine which fields to encrypt
- `createDirs?: boolean` - Create directories if they don't exist (default: true)
- `prettyPrint?: boolean` - Pretty print JSON with indentation (default: true)
- `validate?: boolean` - Validate the split operation (default: true)

**Returns:**
- `JsonSplitResult` - Object containing statistics and file information

#### `mergeJsonFromFiles(options): Promise<any>`

Merges JSON data from two files to reconstruct the original object.

**Parameters:**
- `options: JsonMergeOptions` - Configuration options

**Options:**
- `plainFile: string` - Path to the plain JSON file
- `encryptedFile: string` - Path to the encrypted JSON file
- `validate?: boolean` - Validate the merge operation (default: true)

**Returns:**
- `any` - The reconstructed original object
