# @nakamura196/ndl-koten-ocr-web

Web-based OCR library for ancient Japanese text recognition using ONNX models. This is a web port of NDL Koten OCR.

## Features

- 🎯 Automatic layout detection for historical Japanese documents
- 📝 High-accuracy text recognition for classical Japanese characters
- 🚀 Runs entirely in the browser using WebAssembly
- 📦 Includes pre-trained ONNX models
- 🔧 Simple API with TypeScript support

## Installation

```bash
npm install @nakamura196/ndl-koten-ocr-web
```

The package includes all necessary ONNX models (approximately 78 MB), so installation may take a moment.

### Model Files

Model files are included in the package and loaded automatically from `node_modules`. No manual setup required!

## Quick Start

```typescript
import { NDLKotenOCR } from '@nakamura196/ndl-koten-ocr-web';

// Initialize OCR engine
const ocr = new NDLKotenOCR();
await ocr.init(); // Simple initialization with defaults

// Process an image
const image = document.getElementById('myImage') as HTMLImageElement;
const result = await ocr.process(image);

// Access results
console.log(result.text); // Extracted text
console.log(result.json); // Structured data with bounding boxes
console.log(result.xml); // XML format output
```

## Advanced Usage

### Custom Model Path

If you're serving models from a different location:

```typescript
await ocr.init({
  modelPath: '/static/models/', // Custom model directory
  progressCallback: (progress, message) => {
    console.log(`${progress}% - ${message}`);
  }
});
```
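
If the models need to live under a static directory such as `/static/models/`, one option is to copy them out of the installed package at build time. The following is a minimal Node.js sketch and not part of this package: `copyModelFiles` is a hypothetical helper, and the source directory in the usage comment is an assumption based on the `modelPath` shown in the Next.js example, so verify it against your installed version.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Hypothetical helper: copies every .onnx and .yaml file from srcDir to destDir
// and returns the names of the copied files.
function copyModelFiles(srcDir: string, destDir: string): string[] {
  fs.mkdirSync(destDir, { recursive: true });
  const copied: string[] = [];
  for (const name of fs.readdirSync(srcDir)) {
    const ext = path.extname(name).toLowerCase();
    if (ext === '.onnx' || ext === '.yaml') {
      fs.copyFileSync(path.join(srcDir, name), path.join(destDir, name));
      copied.push(name);
    }
  }
  return copied;
}

// Example (both paths are assumptions; adjust to your project layout):
// copyModelFiles('node_modules/@nakamura196/ndl-koten-ocr-web/models', 'public/static/models');
```

Running this as a `postinstall` or pre-build step keeps the served copies in sync with the installed package version.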

### Model Size

The package currently includes only the small models:

```typescript
// Small models (default)
await ocr.init({ modelSize: 'small' });
```

Note: Large models are defined in the code but not included in the current package.

### Processing Options

```typescript
const result = await ocr.process(image, {
  imageName: 'page_001',  // Optional: name for the image
  onProgress: (progress, message) => {  // Optional: progress callback during processing
    console.log(`Processing: ${progress * 100}% - ${message}`);
  }
});
```

### Manual Initialization (Advanced)

For complete control over model loading:

```typescript
await ocr.initialize(
  '/models/rtmdet-s-1280x1280.onnx',    // Layout detection model
  {},                                     // Layout config
  '/models/ndl.yaml',                     // Layout config file
  '/models/parseq-ndl-32x384-tiny-10.onnx', // Text recognition model
  {},                                     // Recognition config
  '/models/NDLmoji.yaml',                 // Character list config
  (progress, message) => {                // Progress callback
    console.log(`${progress}% - ${message}`);
  }
);
```

## Integration Examples

### Next.js / Vercel

```typescript
// components/OCRComponent.tsx
import { NDLKotenOCR } from '@nakamura196/ndl-koten-ocr-web';

const ocr = new NDLKotenOCR();

// Initialize with default settings (models from node_modules)
await ocr.init();

// Or specify custom path if needed
await ocr.init({
  modelPath: '/node_modules/@nakamura196/ndl-koten-ocr-web/models/'
});
```

### React Component

```tsx
import { useState, useEffect } from 'react';
import { NDLKotenOCR } from '@nakamura196/ndl-koten-ocr-web';

function OCRComponent() {
  const [ocr, setOcr] = useState<NDLKotenOCR | null>(null);
  const [isLoading, setIsLoading] = useState(true);

  useEffect(() => {
    const initOCR = async () => {
      const ocrInstance = new NDLKotenOCR();
      await ocrInstance.init({
        progressCallback: (progress, message) => {
          console.log(`Loading: ${progress}% - ${message}`);
        }
      });
      setOcr(ocrInstance);
      setIsLoading(false);
    };
    
    initOCR();
  }, []);

  const processImage = async (file: File) => {
    if (!ocr) return;
    
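    const img = new Image();
    const objectUrl = URL.createObjectURL(file);
    img.src = objectUrl;
    try {
      await img.decode();

      const result = await ocr.process(img);
      console.log('OCR Result:', result.text);
    } finally {
      URL.revokeObjectURL(objectUrl); // Release the blob URL once OCR has finished
    }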
  };

  return (
    <div>
      {isLoading ? (
        <p>Loading OCR engine...</p>
      ) : (
        <input type="file" onChange={(e) => {
          if (e.target.files?.[0]) {
            processImage(e.target.files[0]);
          }
        }} />
      )}
    </div>
  );
}
```

## Output Formats

The OCR results are available in multiple formats:

### Text Format
```javascript
result.text // Plain text output
```

### JSON Format
```javascript
result.json // Structured data with coordinates
// {
//   document: {
//     image: {
//       text: [
//         { x: 100, y: 200, width: 50, height: 30, text: "文字" }
//       ]
//     }
//   }
// }
```
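
As a rough illustration of working with the structured output, the sketch below collects the recognized strings in an approximate reading order. The interfaces mirror only the fields visible in the example comment above, so the real output may carry more fields or a different shape. `linesInReadingOrder` is a hypothetical helper, and the ordering rule (right to left by the right edge of each box, then top to bottom) is a simple heuristic for vertically written pages, not the library's own logic.

```typescript
// Shapes inferred from the example comment above (assumption).
interface TextRegion {
  x: number;
  y: number;
  width: number;
  height: number;
  text: string;
}

interface OcrJson {
  document: { image: { text: TextRegion[] } };
}

// Hypothetical helper: returns the recognized strings sorted right to left,
// breaking ties from top to bottom. The input array is not modified.
function linesInReadingOrder(json: OcrJson): string[] {
  const regions = [...json.document.image.text];
  regions.sort((a, b) => (b.x + b.width) - (a.x + a.width) || a.y - b.y);
  return regions.map((region) => region.text);
}

// Usage (assuming `result` comes from ocr.process):
// const lines = linesInReadingOrder(result.json as OcrJson);
```

Pages with multiple blocks, horizontal text, or marginal notes would need a more careful layout analysis than this single sort.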

### XML Format
```javascript
result.xml // XML formatted output
```

## Advanced Features

### TEI/XML Conversion

Convert OCR results to TEI (Text Encoding Initiative) format:

```typescript
import { NDLKotenOCR, TEIConverter, TEIConversionData } from '@nakamura196/ndl-koten-ocr-web';

const ocr = new NDLKotenOCR();
await ocr.init();

// Process multiple images (`images` is assumed to be an array of loaded HTMLImageElement objects)
const results = [];
for (const image of images) {
  const result = await ocr.process(image);
  results.push({
    ...result,
    imageName: image.name,
    imageWidth: image.width,
    imageHeight: image.height
  });
}

// Convert to TEI/XML
const teiConverter = new TEIConverter();
const teiData: TEIConversionData = {
  title: 'My Document',
  sourceUrl: 'https://example.com/manifest.json',
  results: results
};

const teiXml = teiConverter.convertOCRResults(teiData);
console.log(teiXml);
```

### IIIF Manifest Processing

Process images directly from IIIF manifests:

```javascript
import { NDLKotenOCR, IIIFProcessor } from '@nakamura196/ndl-koten-ocr-web';

// Initialize OCR engine
const ocr = new NDLKotenOCR();
await ocr.init();

// Create IIIF processor
const iiifProcessor = new IIIFProcessor(ocr);

// Process a IIIF manifest
const manifestUrl = 'https://example.com/manifest.json';
const { results, teiXml, manifest } = await iiifProcessor.processManifestUrl(
  manifestUrl,
  {
    maxImages: 10, // Process only first 10 images
    onImageProgress: (index, progress, message) => {
      console.log(`Image ${index + 1}: ${progress}% - ${message}`);
    }
  }
);

// results: Array of OCR results for each image
// teiXml: Complete TEI/XML document
// manifest: The parsed IIIF manifest

console.log('Processed', results.length, 'images');
console.log('TEI/XML:', teiXml);
```

### Processing Local Files with TEI Export

```typescript
import { NDLKotenOCR, TEIConverter } from '@nakamura196/ndl-koten-ocr-web';

const ocr = new NDLKotenOCR();
await ocr.init();

// Process files and generate TEI
async function processFilesWithTEI(files: File[]) {
  const results = [];

  for (const file of files) {
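    const img = new Image();
    const objectUrl = URL.createObjectURL(file);
    img.src = objectUrl;
    await img.decode();

    const result = await ocr.process(img, {
      imageName: file.name
    });
    URL.revokeObjectURL(objectUrl); // Release the blob URL once the image has been processed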

    results.push({
      ...result,
      imageName: file.name,
      imageWidth: img.naturalWidth,
      imageHeight: img.naturalHeight
    });
  }

  // Convert to TEI/XML
  const teiConverter = new TEIConverter();
  const teiXml = teiConverter.convertOCRResults({
    title: 'Batch OCR Results',
    results: results
  });

  // Download as file
  const blob = new Blob([teiXml], { type: 'text/xml' });
  const url = URL.createObjectURL(blob);
  const a = document.createElement('a');
  a.href = url;
  a.download = 'ocr-results.xml';
  a.click();
  URL.revokeObjectURL(url); // Release the blob URL after triggering the download
}
```

## Model Information

This package includes the following pre-trained models:

- **Layout Detection**: RTMDet-S (1280x1280) - Detects text regions in document images
- **Text Recognition**: PARSEQ-NDL (32x384-tiny-10) - Recognizes classical Japanese characters
- **Character Set**: NDLmoji - Comprehensive classical Japanese character mappings

Models are based on the work by:
- National Diet Library (NDL)
- Yuta Hashimoto (@yuta1984)

## Browser Compatibility

- Chrome 90+
- Firefox 89+
- Safari 15.4+
- Edge 90+

Requires WebAssembly and Web Workers support.
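
Before initializing the engine, an application may want to confirm that both requirements are met and show a fallback message otherwise. The sketch below is one way to do that. `supportsOcrRuntime` is a hypothetical helper that is not exported by this package, and it only checks that the APIs exist, not the browser versions listed above.

```typescript
// Hypothetical helper: reports whether the given global scope exposes the
// WebAssembly and Worker APIs.
function supportsOcrRuntime(scope: Record<string, unknown>): boolean {
  const wasm = scope['WebAssembly'];
  const hasWasm = typeof wasm === 'object' && wasm !== null;
  const hasWorkers = typeof scope['Worker'] === 'function';
  return hasWasm && hasWorkers;
}

// In a browser:
// if (!supportsOcrRuntime(globalThis as unknown as Record<string, unknown>)) {
//   /* show an "unsupported browser" message instead of calling ocr.init() */
// }
```

Taking the scope as a parameter keeps the check easy to unit test outside a browser.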

## Development

### Building from Source

```bash
git clone https://github.com/yuta1984/ndlkotenocr-lite-web
cd ndlkotenocr-lite-web/packages/ndl-koten-ocr-core
npm install
npm run build
```

### Running Tests

```bash
npm test
```

## Troubleshooting

### Models Not Loading
Ensure your web server is configured to serve `.onnx` files with the correct MIME type:
```
application/octet-stream
```

### CORS Issues
If serving models from a CDN, ensure CORS headers are properly configured.
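
To narrow down whether a loading failure comes from a missing file, a MIME type problem, or CORS, it can help to probe a model URL directly. The following is a diagnostic sketch under stated assumptions: `checkModelUrl` and `FetchLike` are hypothetical names, the fetch function is passed in so the helper can be tested without a network, and the server is assumed to answer `HEAD` requests.

```typescript
// Minimal structural types so the helper does not depend on DOM typings.
interface HeadResponse {
  ok: boolean;
  status: number;
  headers: { get(name: string): string | null };
}
type FetchLike = (url: string, init: { method: string }) => Promise<HeadResponse>;

interface ModelUrlReport {
  reachable: boolean;
  status: number;
  contentType: string | null;
}

// Hypothetical helper: sends a HEAD request and reports status and Content-Type.
async function checkModelUrl(url: string, fetchFn: FetchLike): Promise<ModelUrlReport> {
  try {
    const res = await fetchFn(url, { method: 'HEAD' });
    return {
      reachable: res.ok,
      status: res.status,
      contentType: res.headers.get('content-type')
    };
  } catch {
    // fetch rejects on network errors and on blocked cross-origin requests
    return { reachable: false, status: 0, contentType: null };
  }
}

// In a browser (the URL is an example path):
// checkModelUrl('/static/models/rtmdet-s-1280x1280.onnx', fetch).then(console.log);
```

A status of `0` with no content type usually points to a network or CORS problem, while a `404` or an unexpected `contentType` points to server configuration.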

### Memory Issues
For large images, consider resizing before processing:
```javascript
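// Scale the image down so its longer side is at most 2000 px, preserving the aspect ratio
const maxSide = 2000;
const scale = Math.min(1, maxSide / Math.max(image.width, image.height));
const canvas = document.createElement('canvas');
canvas.width = Math.round(image.width * scale);
canvas.height = Math.round(image.height * scale);
const ctx = canvas.getContext('2d');
ctx.drawImage(image, 0, 0, canvas.width, canvas.height);
const result = await ocr.process(canvas);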
```

## License

MIT

## Credits

This is a web port of [NDL Koten OCR](https://github.com/ndl-lab/ndlkotenocr) developed by:
- Original implementation: National Diet Library (NDL Lab)
- Web port: Yuta Hashimoto ([@yuta1984](https://github.com/yuta1984))
- npm package: Satoru Nakamura ([@nakamura196](https://github.com/nakamura196))

## Related Projects

- [NDL Koten OCR (Original)](https://github.com/ndl-lab/ndlkotenocr)
- [NDL Koten OCR Web Demo](https://github.com/yuta1984/ndlkotenocr-lite-web)

## Support

For issues and questions:
- [GitHub Issues](https://github.com/yuta1984/ndlkotenocr-lite-web/issues)
- [npm Package](https://www.npmjs.com/package/@nakamura196/ndl-koten-ocr-web)