A VisionCamera Frame Processor Plugin to perform text detection on images using MLKit Vision Text Recognition

Overview

vision-camera-ocr

A VisionCamera Frame Processor Plugin to perform text detection on images using MLKit Vision Text Recognition.

Installation

yarn add vision-camera-ocr
cd ios && pod install

Add the plugin to your babel.config.js:

module.exports = {
  plugins: [
    [
      'react-native-reanimated/plugin',
      {
        globals: ['__scanOCR'],
      },
    ],

    // ...
  ],
};

Note: You have to restart metro-bundler for changes in the babel.config.js file to take effect.

Usage

import { useFrameProcessor } from 'react-native-vision-camera';
import { scanOCR } from 'vision-camera-ocr';

// ...
const frameProcessor = useFrameProcessor((frame) => {
  'worklet';
  const scannedOcr = scanOCR(frame);
}, []);

Data

scanOCR(frame) returns an OCRFrame with the following data shape. See the example for how to use this in your app.

type OCRFrame = {
  result: {
    text: string, // Raw result text
    blocks: Block[], // Each recognized element broken into blocks
  };
};

The text object closely resembles the object described in the MLKit documentation: https://developers.google.com/ml-kit/vision/text-recognition#text_structure

The Text Recognizer segments text into blocks, lines, and elements. Roughly speaking:

a Block is a contiguous set of text lines, such as a paragraph or column,

a Line is a contiguous set of words on the same axis, and

an Element is a contiguous set of alphanumeric characters ("word") on the same axis in most Latin languages, or a character in others.
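
The block, line, and element hierarchy described above can be walked with three nested loops. The sketch below is a minimal illustration, not the library's own code: the local types and the `lines` and `elements` field names are assumptions modelled on the MLKit structure, and `collectWords` is a hypothetical helper. Check the package's type definitions before relying on these names.

```typescript
// Hypothetical local types mirroring the structure described above.
// The `lines` and `elements` field names are assumptions.
type TextElement = { text: string };
type TextLine = { text: string; elements: TextElement[] };
type TextBlock = { text: string; lines: TextLine[] };
type OCRFrameLike = { result: { text: string; blocks: TextBlock[] } };

// Walk blocks -> lines -> elements and collect every element's text in order.
function collectWords(frame: OCRFrameLike): string[] {
  const words: string[] = [];
  for (const block of frame.result.blocks) {
    for (const line of block.lines) {
      for (const element of line.elements) {
        words.push(element.text);
      }
    }
  }
  return words;
}

// Hand-written sample data, not real scanner output.
const sample: OCRFrameLike = {
  result: {
    text: 'Hello world\nSecond line',
    blocks: [
      {
        text: 'Hello world\nSecond line',
        lines: [
          { text: 'Hello world', elements: [{ text: 'Hello' }, { text: 'world' }] },
          { text: 'Second line', elements: [{ text: 'Second' }, { text: 'line' }] },
        ],
      },
    ],
  },
};

console.log(collectWords(sample));
```

Inside a frame processor, a helper like this would also need its own 'worklet' directive before it could be called on the frame processor thread.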

Contributing

See the contributing guide to learn how to contribute to the repository and the development workflow.

License

MIT

Comments
  •  Issue with frameProcessor - TypeError: Cannot read property 'scanOCR' of undefined

    I'm trying to use this frame processor. Everything works fine and it scans the text, but when I try to use the text, Reanimated asks me to run the code through runOnJS. As soon as I add import { runOnJS } from "react-native-reanimated"; the app crashes with this error: TypeError: Cannot read property 'scanOCR' of undefined

    This is the import order on the component:

    import 'react-native-reanimated';
    import { runOnJS } from "react-native-reanimated";
    import { scanOCR } from "vision-camera-ocr";
    

    Package Json

    "react": "17.0.2",
    "react-native": "0.65.1",
    "vision-camera-ocr": "^1.0.0",
    "react-native-vision-camera": "^2.12.0",
    "react-native-reanimated": "^2.3.0-beta.2",
    
    opened by josumonk 5
  • Property '__scanOCR' doesn't exist

    Hi, I am trying to get started with the vision-camera-ocr project, but with no luck. Whenever I call the scanOCR(frame) function, I get the error: Property '__scanOCR' doesn't exist. I am using yarn to install.

    versions:

    iOS: 16.0.3
    Xcode: 14.0.1
    

    my dependencies:

    "react-native": "0.70.3",
    "react-native-reanimated": "^2.11.0",
    "react-native-vision-camera": "^2.15.1",
    "vision-camera-ocr": "^1.0.0",
    

    my babel.config.js:

    module.exports = {
        presets: ['module:metro-react-native-babel-preset'],
        plugins: [
            [
                'react-native-reanimated/plugin',
                {
                    globals: ['__scanOCR', '__labelImage', ],
                },
            ],
        ],
    }
    

    Whole error:

    Property '__scanOCR' doesn't exist
    
    ReferenceError: Property '__scanOCR' doesn't exist
        at scanOCR (/Users/Developer/test/node_modules/vision-camera-ocr/src/index.tsx (49:7):1:33)
        at scanOCR (native)
        at _f (/Users/Developer/test/components/ScanScreen.js (36:45):1:77)
        at _f (native)
        at _f (/Users/Developer/test/node_modules/react-native-vision-camera/src/hooks/useFrameProcessor.ts (28:21):1:425)
        at _f (native)
    
    reanimated::REAIOSErrorHandler::raiseSpec()
        REAIOSErrorHandler.mm:18
    reanimated::ErrorHandler::raise()::'lambda'()::operator()()
    decltype(static_cast<reanimated::ErrorHandler::raise()::'lambda'()&>(fp)()) std::__1::__invoke<reanimated::ErrorHandler::raise()::'lambda'()&>(reanimated::ErrorHandler::raise()::'lambda'()&)
    void std::__1::__invoke_void_return_wrapper<void, true>::__call<reanimated::ErrorHandler::raise()::'lambda'()&>(reanimated::ErrorHandler::raise()::'lambda'()&)
    std::__1::__function::__alloc_func<reanimated::ErrorHandler::raise()::'lambda'(), std::__1::allocator<reanimated::ErrorHandler::raise()::'lambda'()>, void ()>::operator()()
    std::__1::__function::__func<reanimated::ErrorHandler::raise()::'lambda'(), std::__1::allocator<reanimated::ErrorHandler::raise()::'lambda'()>, void ()>::operator()()
    std::__1::__function::__value_func<void ()>::operator()() const
    std::__1::function<void ()>::operator()() const
    invocation function for block in vision::VisionCameraScheduler::scheduleOnUI(std::__1::function<void ()>)
    C663D847-B94F-3FB0-9254-32EDBC55315E
    C663D847-B94F-3FB0-9254-32EDBC55315E
    C663D847-B94F-3FB0-9254-32EDBC55315E
    C663D847-B94F-3FB0-9254-32EDBC55315E
    C663D847-B94F-3FB0-9254-32EDBC55315E
    _pthread_wqthread
    start_wqthread
    
    

    I also tried simply cloning the example from this repository and got the same problem. I tried other frame processors (for example the image labeler) and they work fine.

    Thank you.

    opened by vitzaoral 3
  • adding top and left to bounding frame return

    adding top and left to bounding frame return

    Hi! 👋

    Firstly, thanks for your work on this project! 🙂

    Today I used patch-package to patch vision-camera-ocr for the project I'm working on.

    adding top and left to bounding frame return

    Here is the diff that solved my problem:

    diff --git a/node_modules/vision-camera-ocr/android/src/main/java/com/visioncameraocr/OCRFrameProcessorPlugin.kt b/node_modules/vision-camera-ocr/android/src/main/java/com/visioncameraocr/OCRFrameProcessorPlugin.kt
    index 8ae6279..1e22c03 100644
    --- a/node_modules/vision-camera-ocr/android/src/main/java/com/visioncameraocr/OCRFrameProcessorPlugin.kt
    +++ b/node_modules/vision-camera-ocr/android/src/main/java/com/visioncameraocr/OCRFrameProcessorPlugin.kt
    @@ -88,6 +88,8 @@ class OCRFrameProcessorPlugin: FrameProcessorPlugin("scanOCR") {
             if (boundingBox != null) {
                 frame.putDouble("x", boundingBox.exactCenterX().toDouble())
                 frame.putDouble("y", boundingBox.exactCenterY().toDouble())
    +            frame.putInt("left", boundingBox.left)
    +            frame.putInt("top", boundingBox.top)
                 frame.putInt("width", boundingBox.width())
                 frame.putInt("height", boundingBox.height())
                 frame.putInt("boundingCenterX", boundingBox.centerX())
    diff --git a/node_modules/vision-camera-ocr/lib/typescript/index.d.ts b/node_modules/vision-camera-ocr/lib/typescript/index.d.ts
    index 47f1816..a3b6c3f 100644
    --- a/node_modules/vision-camera-ocr/lib/typescript/index.d.ts
    +++ b/node_modules/vision-camera-ocr/lib/typescript/index.d.ts
    @@ -2,6 +2,8 @@ import type { Frame } from 'react-native-vision-camera';
     declare type BoundingFrame = {
         x: number;
         y: number;
    +    top: number;
    +    left: number;
         width: number;
         height: number;
         boundingCenterX: number;
    @@ -41,4 +43,5 @@ export declare type OCRFrame = {
      * Scans OCR.
      */
     export declare function scanOCR(frame: Frame): OCRFrame;
    -export {};
    +export { };
    +
    diff --git a/node_modules/vision-camera-ocr/src/index.tsx b/node_modules/vision-camera-ocr/src/index.tsx
    index b4eeb76..608828a 100644
    --- a/node_modules/vision-camera-ocr/src/index.tsx
    +++ b/node_modules/vision-camera-ocr/src/index.tsx
    @@ -4,6 +4,8 @@ import type { Frame } from 'react-native-vision-camera';
     type BoundingFrame = {
       x: number;
       y: number;
    +  top: number;
    +  left: number;
       width: number;
       height: number;
       boundingCenterX: number;
    

    This issue body was partially generated by patch-package.

    opened by mat2718 0
  • Move GIF to better spot in README

    Great library!

    When going through installation, the GIF placement just made it a bit hard to read the docs, so I just moved it above text. Hope that's ok!

    I also noticed the example was importing the wrong function from the wrong library, but I think that's fixed in #7

    opened by robinheinze 0
  • Does not work on iOS 14 (but OK on 15+)

    The result is always empty on iOS 14. This is true even for the bundled example project. I also tried everything on a fresh project and mixed a few versions, but nothing changes the outcome. Everything works fine on iOS 15.7, but there are no results on 14.8.1.

    It could also be an issue with MLKit itself. I did quite some research but didn't find anything related to MLKit + iOS 14.

    opened by MSchmidt 2
  • ScanOCR is returning `undefined`

    Hi @aarongrider,

    My project suddenly started failing when attempting to run OCR on VisionCamera frames. It throws "Frame Processor threw an error: Cannot read property 'result' of undefined". Here is my code, with everything unrelated to the issue removed:

    import 'react-native-reanimated';
    import {runOnJS} from 'react-native-reanimated';
    import {scanOCR} from 'vision-camera-ocr';
    import {
      Camera,
      useCameraDevices,
      useFrameProcessor,
    } from 'react-native-vision-camera';
    
    const frameProcessor = useFrameProcessor(frame => {
      'worklet';
      const result = scanOCR(frame).result;
      if (result.text.length > 0) {
        runOnJS(processText)(result.text);
      }
    }, []);
    

    scanOCR(frame) is returning undefined instead of an OCR result. The weird thing is that I haven't changed my code at all since it was working.

    I ran the example project on the repo and it has the same issue.
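
    Until the cause is found, the thrown error can at least be avoided by checking the return value before reading `result`. The sketch below is only a defensive guard under assumed types; `safeText` and `OCRFrameLike` are hypothetical names, and it does not explain or fix why the plugin returns undefined.

```typescript
// Hypothetical, loosely typed view of what scanOCR may hand back.
type OCRFrameLike = { result?: { text?: string } };

// Return the recognized text, or null when nothing usable came back.
function safeText(ocr: OCRFrameLike | undefined | null): string | null {
  'worklet'; // Directive needed only when called from a frame processor.
  const text = ocr?.result?.text;
  return typeof text === 'string' && text.length > 0 ? text : null;
}

console.log(safeText(undefined));                   // no result object at all
console.log(safeText({ result: { text: 'abc' } })); // normal case
```

    In the frame processor above, the condition would then become a null check on safeText(scanOCR(frame)) before calling runOnJS.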

    my dependencies:

    "react-native": "0.70.0",
    "react-native-reanimated": "^2.10.0",
    "react-native-vision-camera": "^2.14.1",
    "vision-camera-ocr": "^1.0.0",
    

    my babel.config.js:

    module.exports = {
      presets: ['module:metro-react-native-babel-preset'],
      plugins: [
        [
          'react-native-reanimated/plugin',
          {
            globals: ['__scanOCR'],
          },
        ],
      ],
    };
    

    If you need the complete component code, let me know. As I'm new to React Native, I don't have the knowledge to dig into the library and find the issue.

    Thanks for the awesome library!

    opened by mihailpozarski 0
  • Text Recognition Bounds Units

    Hello. I am currently doing text recognition with the react-native-camera library. It returns bounds as x, y coordinates, which I can filter to limit the scan area, and this works perfectly fine.

    vision-camera-ocr, however, gives me different coordinates when scanning the same location in the camera. I have tried using the pixel density to convert the corner points to dp, but that also gives the wrong bounds.

    Can anybody tell me how I can filter the bounds to limit the scan area, or in which units the frame processor gives the coordinates?

    I have been stuck on this for many days; any help would be appreciated.
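
    One possible approach is sketched below. It assumes, without confirmation from the library, that the bounding values are expressed in pixels of the camera frame rather than in screen dp, and that the preview shows the whole frame without rotation or cropping. The centre-based `x` and `y` follow the Android patch quoted in an earlier issue on this page; all type and function names here are hypothetical.

```typescript
// Assumed shape: x and y are the centre of the box, width and height its size.
type BoundingFrame = { x: number; y: number; width: number; height: number };
type Rect = { left: number; top: number; width: number; height: number };
type Size = { width: number; height: number };

// Convert a centre-based box into a top-left based rectangle.
function toRect(b: BoundingFrame): Rect {
  return { left: b.x - b.width / 2, top: b.y - b.height / 2, width: b.width, height: b.height };
}

// Scale a rectangle from frame pixels to view units.
// Assumes the preview shows the whole frame without rotation or cropping.
function frameToView(r: Rect, frame: Size, view: Size): Rect {
  const sx = view.width / frame.width;
  const sy = view.height / frame.height;
  return { left: r.left * sx, top: r.top * sy, width: r.width * sx, height: r.height * sy };
}

// True when `inner` lies entirely inside `outer`.
function isInside(inner: Rect, outer: Rect): boolean {
  return (
    inner.left >= outer.left &&
    inner.top >= outer.top &&
    inner.left + inner.width <= outer.left + outer.width &&
    inner.top + inner.height <= outer.top + outer.height
  );
}

// Made-up numbers for illustration only.
const frameSize: Size = { width: 1920, height: 1080 };
const viewSize: Size = { width: 960, height: 540 };
const scanArea: Rect = { left: 100, top: 100, width: 400, height: 300 };
const box: BoundingFrame = { x: 500, y: 400, width: 200, height: 100 };

const inView = frameToView(toRect(box), frameSize, viewSize);
console.log(inView, isInside(inView, scanArea));
```

    If the preview uses an aspect-fill resize mode or the frame is rotated relative to the screen, the scaling step would need an extra offset or axis swap, which this sketch leaves out.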

    opened by codeapp17 4
Owner
Aaron Grider
Pixel perfect, cross platform tech