# vision-camera-resize-plugin A [VisionCamera](https://github.com/mrousavy/react-native-vision-camera) Frame Processor Plugin for fast and efficient Frame resizing, cropping and pixel-format conversion (YUV -> RGB) using GPU-acceleration, CPU-vector based operations and ARM NEON SIMD acceleration. ## Installation 1. Install [react-native-vision-camera](https://github.com/mrousavy/react-native-vision-camera) (>= 3.8.2) and [react-native-worklets-core](https://github.com/margelo/react-native-worklets-core) (>= 0.2.4) and make sure Frame Processors are enabled. 2. Install vision-camera-resize-plugin: ```sh yarn add vision-camera-resize-plugin cd ios && pod install ``` ## Usage Use the `resize` plugin within a Frame Processor: ```tsx const { resize } = useResizePlugin() const frameProcessor = useFrameProcessor((frame) => { 'worklet' const resized = resize(frame, { scale: { width: 192, height: 192 }, pixelFormat: 'rgb', dataType: 'uint8' }) const firstPixel = { r: resized[0], g: resized[1], b: resized[2] } }, []) ``` Or outside of a function component: ```tsx const { resize } = createResizePlugin() const frameProcessor = createFrameProcessor((frame) => { 'worklet' const resized = resize(frame, { // ... }) // ... }) ``` ## Pixel Formats The resize plugin operates in RGB colorspace.

Name	`0`	`1`	`2`	`3`
`rgb`	R	G	B	R
`rgba`	R	G	B	A
`argb`	A	R	G	B
`bgra`	B	G	R	A
`bgr`	B	G	R	B
`abgr`	A	B	G	R

## Data Types The resize plugin can either convert to uint8 or float32 values:

Name	JS Type	Value Range	Example size
`uint8`	`Uint8Array`	0...255	1920x1080 RGB Frame = ~6.2 MB
`float32`	`Float32Array`	0.0...1.0	1920x1080 RGB Frame = ~24.8 MB

## Cropping When scaling to a different size (e.g. 1920x1080 -> 100x100), the Resize Plugin performs a center-crop on the image before scaling it down so the resulting image matches the target aspect ratio instead of being stretched. You can customize this by passing a custom `crop` parameter, e.g. instead of center-crop, use the top portion of the screen: ```ts const resized = resize(frame, { scale: { width: 192, height: 192 }, crop: { y: 0, x: 0, // 1:1 aspect ratio because we scale to 192x192 width: frame.width, height: frame.width }, pixelFormat: 'rgb', dataType: 'uint8' }) ``` ### Performance If possible, use one of these two formats: - `argb` in `uint8`: Can be converted the fastest, but has an additional unused alpha channel. - `rgb` in `uint8`: Requires one more conversion step from `argb`, but uses 25% less memory due to the removed alpha channel. All other formats require additional conversion steps, and `float` models have additional memory overhead (4x as big). When using TensorFlow Lite, try to convert your model to use `argb-uint8` or `rgb-uint8` as it's input type. ## react-native-fast-tflite The vision-camera-resize-plugin can be used together with [react-native-fast-tflite](https://github.com/mrousavy/react-native-fast-tflite) to prepare the input tensor data. For example, to use the [efficientdet](https://www.kaggle.com/models/tensorflow/efficientdet/frameworks/tfLite) TFLite model to detect objects inside a Frame, simply add the model to your app's bundle, set up VisionCamera and react-native-fast-tflite, and resize your Frames accordingly. From the model's description on the website, we understand that the model expects 320 x 320 x 3 buffers as input, where the format is uint8 rgb. ```ts const objectDetection = useTensorflowModel(require('assets/efficientdet.tflite')) const model = objectDetection.state === "loaded" ? objectDetection.model : undefined const { resize } = useResizePlugin() const frameProcessor = useFrameProcessor((frame) => { 'worklet' const data = resize(frame, { scale: { width: 320, height: 320, }, pixelFormat: 'rgb', dataType: 'uint8' }) const output = model.runSync([data]) const numDetections = output[0] console.log(`Detected ${numDetections} objects!`) }, [model]) ``` ## Benchmarks I benchmarked vision-camera-resize-plugin on an iPhone 15 Pro, using the following code: ```tsx const start = performance.now() const result = resize(frame, { scale: { width: 100, height: 100, }, pixelFormat: 'rgb', dataType: 'uint8' }) const end = performance.now() const diff = (end - start).toFixed(2) console.log(`Resize and conversion took ${diff}ms!`) ``` And when running on 1080x1920 yuv Frames, I got the following results: ``` LOG Resize and conversion took 6.48ms LOG Resize and conversion took 6.06ms LOG Resize and conversion took 5.89ms LOG Resize and conversion took 5.97ms LOG Resize and conversion took 6.98ms ``` This means the Frame Processor can run at up to ~160 FPS. ## Adopting at scale

This library helped you? Consider sponsoring!

This library is provided _as is_, I work on it in my free time. If you're integrating vision-camera-resize-plugin in a production app, consider [funding this project](https://github.com/sponsors/mrousavy) and contact me to receive premium enterprise support, help with issues, prioritize bugfixes, request features, help at integrating vision-camera-resize-plugin and/or VisionCamera Frame Processors, and more. ## Contributing See the [contributing guide](CONTRIBUTING.md) to learn how to contribute to the repository and the development workflow. ## License MIT --- Made with [create-react-native-library](https://github.com/callstack/react-native-builder-bob)