Keeping an eye on Computer Vision
As Singapore’s largest network of container haulage service providers, Haulio depends on efficiency. Our development team constantly explores new ways to improve the Haulio experience for shippers, drivers and hauliers alike. As a fast-growing, technology-driven startup, we aim to use computer vision to digitize our drivers’ bookkeeping process: Optical Character Recognition (OCR) technology identifies printed characters and converts them into machine-encoded text.
Why container OCR?
Mobile scanning of shipping containers is a good fit for managing cargo, and it can greatly reduce the chances of cargo ending up at the wrong port. Mobile OCR technology helps ensure that container numbers are recorded correctly throughout the supply chain, so that cargo is transported as intended. On top of that, it saves time by reducing the amount of manual data entry that has to be done every day.
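One way to catch a misread container number early is to check it against the check digit that container numbers carry under ISO 6346. The sketch below is only an illustration of that idea, not a description of Haulio's pipeline: the function names are made up for this example, and it assumes the common layout of four letters followed by seven digits, where the last digit is the check digit.

```python
import re

def _letter_values():
    """Map A-Z to ISO 6346 values: start at 10 and skip multiples of 11."""
    values, n = {}, 10
    for ch in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
        if n % 11 == 0:  # 11, 22 and 33 are skipped
            n += 1
        values[ch] = n
        n += 1
    return values

LETTER_VALUES = _letter_values()
# Assumed layout: four letters, then six serial digits and one check digit.
CONTAINER_RE = re.compile(r"[A-Z]{4}[0-9]{7}")

def check_digit(first_ten: str) -> int:
    """Compute the check digit from the first ten characters."""
    total = 0
    for position, ch in enumerate(first_ten):
        value = LETTER_VALUES[ch] if ch in LETTER_VALUES else int(ch)
        total += value * (2 ** position)  # weights 1, 2, 4, ..., 512
    return total % 11 % 10  # a remainder of 10 maps to 0

def is_valid_container_number(text: str) -> bool:
    """Return True if OCR output looks like a container number and its check digit matches."""
    candidate = re.sub(r"[\s\-]", "", text).upper()  # OCR often inserts spaces
    if not CONTAINER_RE.fullmatch(candidate):
        return False
    return check_digit(candidate[:10]) == int(candidate[10])
```

A recognized string that fails this check could be flagged for a rescan or for manual review instead of being saved as-is.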
Let's talk API
API stands for Application Programming Interface. In a nutshell, it is a set of clearly defined communication protocols and tools for building software. SwiftOCR is an open-source library available only on iOS, while cross-platform options such as SwiftyTesseract, ABBYY, Cloud Vision and Azure Computer Vision are more flexible and allow software to run on most systems.
The APIs are first trained to identify alphanumeric characters. We then gather sample images based on what our target audience frequently encounters, and compare the accuracy of the different APIs on these sample images.
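The post does not say how accuracy was scored, so the following is only a minimal sketch of one common approach: character-level accuracy based on edit distance, averaged over the sample images. Everything here is an assumption made for illustration, including the function names, the idea of wrapping each API as a callable that takes an image path and returns text, and the stub engines in the usage section.

```python
from typing import Callable, Dict, List, Tuple

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def normalize(text: str) -> str:
    """Drop whitespace and case so spacing differences are not counted as errors."""
    return "".join(text.split()).upper()

def character_accuracy(expected: str, recognized: str) -> float:
    """Score in [0, 1]: 1 minus the edit distance relative to the expected length."""
    expected, recognized = normalize(expected), normalize(recognized)
    if not expected:
        return 1.0 if not recognized else 0.0
    return max(0.0, 1.0 - edit_distance(expected, recognized) / len(expected))

def rank_engines(engines: Dict[str, Callable[[str], str]],
                 samples: List[Tuple[str, str]]) -> List[Tuple[str, float]]:
    """Average each engine's accuracy over (image_path, expected_text) samples, best first."""
    if not samples:
        raise ValueError("at least one sample image is required")
    scores = []
    for name, recognize in engines.items():
        per_image = [character_accuracy(truth, recognize(path)) for path, truth in samples]
        scores.append((name, sum(per_image) / len(per_image)))
    return sorted(scores, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    # Hypothetical stand-ins for real OCR calls; the outputs are made up.
    stub_engines = {
        "engine_a": lambda path: "CSQU 305438 3",
        "engine_b": lambda path: "CSOU 305438 8",
    }
    stub_samples = [("sample_01.jpg", "CSQU3054383")]
    for name, score in rank_engines(stub_engines, stub_samples):
        print(f"{name}: {score:.2f}")
```

In a real comparison, each stub would be replaced by a thin wrapper around the corresponding API, and the samples would pair every image with its hand-checked text.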
Results in Pictures
Here, we investigate how each API fares in character recognition. Each image shows what the camera sees, and the text shows the alphanumeric characters the API recognizes.
Azure Computer Vision
We have to bear in mind that factors such as the perspective and quality of the image, the font of the characters and even the spacing between characters can affect the accuracy of character recognition software.
Keeping all conditions constant, we conclude that Azure Computer Vision is the most accurate at identifying the characters on the containers in the sample images, followed by Cloud Vision, ABBYY and Tesseract, with SwiftOCR the least accurate.
Improvements to consider
Future plans include working with more sample data to train the models and improve their performance. There is definitely room for improvement in getting computers to approximate human vision accurately, and Haulio will continue to work towards a seamless integration between the two, for the benefit of our users.