Using Vision's Snapshot feature with an External OCR Service
  • 14 Mar 2024

Capture and send images to an external computer vision service API

NOTE

While you can accomplish this with Vision, an alternative is to use CoPilot. Read more about CoPilot and OCR here.

Overview

Vision's Snapshot feature can be used in conjunction with Tulip Connectors and an external OCR service. This article shows how to build an OCR (Optical Character Recognition) pipeline that detects text in a snapshot taken with a Vision camera. With this functionality, you can scan documents, read text from printed labels, or even read text that is embossed or etched on items.

This article walks through how to use this feature with Google Vision OCR, which is capable of reading text under very harsh image conditions.

The steps this article will take you through:

  1. Setting up Tulip Vision and the Google Cloud Vision API
  2. Creating a Tulip Connector to the Google Cloud Vision (GCV) API
  3. Building an app to take a snapshot, and communicate with the OCR connector function

Prerequisites

Set Up Snapshot along with a Camera Configuration

Please make sure you've successfully set up a Vision camera configuration and are familiar with Vision's Snapshot feature. For more information, see: Using the Vision Snapshot Feature

Enable Google Cloud Vision API and a Google Cloud Platform Project

Create a Google Cloud Platform (GCP) project, and enable the Vision API by following the instructions in this article: https://cloud.google.com/vision/docs/ocr.

Create an API Key on Google Cloud Platform to be used for Authentication

Follow the instructions in this article to create an API key for your GCP project: https://cloud.google.com/docs/authentication/api-keys. You can restrict the usage of this API key and set appropriate permissions. Please consult your network administrator for help configuring this.

Creating a Tulip Connector Function for Google OCR

The connector and connector function you build will be configured to match the type of request expected by the Vision API as stated in the following image:

Configuring your Connector Function:

  1. Create an HTTP Connector.

  2. Configure the Connector to point to the Google Vision API endpoint.

     Host: vision.googleapis.com

     TLS: Yes

  3. Edit the Connector's Headers to include the Content-Type.

  4. Test the Connector and save the configuration.

  5. Next, create a POST request connector function, and add the following path to the endpoint: v1/images:annotate

  6. Add an image as an input to the connector function. Make sure the input type is Text.

  7. Ensure the request type is JSON, and that your Request Body matches your Google Vision API request type:

     Note: Replace PUT_YOUR_API_KEY_HERE with your own API Key created in the steps above.

  8. Next, test this connector function by converting an image of text into a base64 string (to do so, you can use this website). Use this string as the test value for your image input variable.

     You should receive a response back similar to:

  9. Set the output variable to point to .responses.0.textAnnotations.0.description

  10. Save the connector function.
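
To make the connector configuration more concrete, here is a minimal Python sketch of the same request built outside Tulip. It is not Tulip's implementation. It assumes the API key is passed as the `key` query parameter and that a single TEXT_DETECTION feature is requested; the helper names (`encode_image`, `build_url`, `build_body`, `read_path`) are hypothetical and exist only for this illustration.

```python
import base64
import json
from urllib.parse import urlencode

HOST = "vision.googleapis.com"     # Host from the connector configuration
PATH = "v1/images:annotate"        # path added to the connector function
API_KEY = "PUT_YOUR_API_KEY_HERE"  # placeholder, replace with your own key


def encode_image(image_bytes: bytes) -> str:
    """Return the image as a base64 string, the form the Text input carries."""
    return base64.b64encode(image_bytes).decode("ascii")


def build_url(api_key: str) -> str:
    """Assumption: the API key travels as the `key` query parameter."""
    return f"https://{HOST}/{PATH}?{urlencode({'key': api_key})}"


def build_body(image_b64: str) -> str:
    """Build a JSON body with one image and one TEXT_DETECTION feature."""
    payload = {
        "requests": [
            {
                "image": {"content": image_b64},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }
    return json.dumps(payload)


def read_path(data, dotted_path: str):
    """Walk a path such as '.responses.0.textAnnotations.0.description'."""
    for part in dotted_path.strip(".").split("."):
        # Numeric segments index into lists; other segments are object keys.
        data = data[int(part)] if isinstance(data, list) else data[part]
    return data
```

The `read_path` helper mirrors how the output path selects the first entry of `textAnnotations`, which holds the full detected text in its `description` field.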

Creating a Tulip App that uses Snapshots and the Google OCR Connector

  1. Go to the App Editor and use the app created while setting up the Snapshot Trigger: Using the Snapshot Feature

  2. Next, create a button with a Trigger that calls the connector function. Use the image Variable stored by the Snapshot output as the input to the connector function.

  3. Add a Variable, detected_text, to your app Step so you can view the results returned from the connector function:

  4. Test the app and observe the OCR results:

You have now created a Tulip Vision app that connects to the Google Vision API OCR service. Try it now on your shop floor!
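
When testing on real snapshots, some images will contain no readable text. The Python sketch below shows one way to interpret the service's JSON response defensively before storing a value such as detected_text. It assumes that a response without detected text simply omits `textAnnotations`, and that a per-image failure is reported in an `error` object with a `message` field. The function name `detected_text_from` is hypothetical.

```python
def detected_text_from(response: dict) -> str:
    """Return the full detected text, or an empty string if none was found."""
    responses = response.get("responses") or [{}]
    first = responses[0]

    # Assumption: a per-image failure is reported under an "error" key.
    if "error" in first:
        message = first["error"].get("message", "unknown Vision API error")
        raise ValueError(message)

    # The first annotation holds the whole text block; later ones are
    # the individual words.
    annotations = first.get("textAnnotations") or []
    if not annotations:
        return ""
    return annotations[0].get("description", "")
```

Returning an empty string rather than failing lets the app show a blank result for a snapshot with no text, while real service errors still surface.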
