
Confidence levels, OCR hit rates and job rates: How to evaluate OCR system performance.
When it comes to Optical Character Recognition (OCR) and Optical Feature Recognition (OFR) systems, there is often a lot of confusion about how to assess their performance. High OCR hit rates don’t guarantee lower exception rates. Time to put everything into perspective.
Embracing digitization
Seaports and container terminals have invested heavily in infrastructure and automation during the past decade.
Innovation in communication, connectivity, and computing power has turned this traditionally labor-intensive industry upside down.
Advanced terminals are almost fully automated, and digital twins are the next big thing. The capital-intensive industry is embracing digitization to improve operational efficiency, raise customer satisfaction, and achieve a higher return. Data capturing and terminal intelligence have moved from the operations desk to the boardroom.
Automation starts with reliable data
It is crucial to keep track of the continuous flow of incoming and outgoing containers. A container that is missed may be lost for weeks. Accurate registration of the container flow is essential for organizing operations.
The intelligent camera systems at the terminal gates, in the yard, or on the ship-to-shore (STS) cranes that load and unload the vessels are one of the main data sources for the Terminal Operating System (TOS).
The intelligent cameras register trucks and containers by reading container markings (OCR) as well as features (OFR). Typical OCR readings include the truck license plate, container number, ISO code and dangerous goods labels. Typical OFR readings include seal identification and automated damage inspection.
High-end versus low-end hardware
Missing or misread information dramatically affects OCR system performance. Inaccuracy leads to more manual exceptions instead of fewer, and to a system that gate clerks perceive as unreliable.
On-camera image-analysis software performs better when picture quality is at its best. High-quality lenses and high-brightness lighting are key to providing easy-to-read images for the AI processor. If only a container number or license plate needs to be registered, a value-for-money solution might do the job.
When high flows of trucks and container movements have to be processed 24/7 and in all weather conditions, or when additional information has to be registered, the low-cost solution may eventually have to be replaced once the equipment fails. When information becomes critical, it is highly recommended to invest in a sustainable solution. Sometimes, one can’t afford to buy cheap. Cameras with high-quality lenses, global shutter technology, and remote focus capability will save you a lot of trouble.

AI and deep learning expertise
The Camco camera systems run AI engines for data capturing. Camco's expertise in deep learning and Convolutional Neural Networks (CNNs) has been integrated into OCR camera systems with the highest accuracy available today. Convolutional Neural Networks learn by experience: Camco feeds the network with large numbers of labeled images so that it learns the features that characterize objects. Defining the best-suited network architecture and tweaking the algorithms requires time and expertise.
A team of eight AI engineers develops new applications and improves system performance by tuning the algorithms to push recognition rates to a maximum. Every percentage point of OCR system improvement requires a lot of resources.

AI image-recognition software
Accuracy and speed are the key drivers for any OCR/OFR system performance.
Powerful CPUs run different algorithms on different images at the same time. The processors are embedded in the cameras: computing on-camera, instead of relying on a remote AI server, makes the information available faster.
All Camco cameras are equipped with a 12th-generation Intel processor.
Evaluating OCR system performance
To evaluate the accuracy of an AI vision-based system, KPIs have to be defined. Multiple KPIs can be used to measure AI vision technology performance. The main question is which KPI should be used for what purpose: a KPI in a Maintenance & Service agreement may well differ from a KPI used for system acceptance.
Finally, when defining an AI vision KPI, it is key to take into account only data generated by the AI vision system.
Indeed, exceptions can also be created by missing or incorrect data from the TOS. These TOS-triggered jobs are not included in the AI vision-based KPI definition. In this article, we are only focusing on the OCR/OFR exceptions created by the AI vision system.
Confidence level: accuracy is key
The confidence level gives an estimated probability regarding the correctness of the reading result. Its value is calculated based on the intermediate results of the algorithms.
The confidence is a value between 0 and 100 percent and is influenced by the quality and wear of the container markings, light conditions, and so on. In an industrial process, the goal is to reduce human interventions to the minimum and to rely on qualified data.
Note that vessel discharge or load data from the TOS can be used by the OCR engine to increase the crane OCR hit rate: a missing digit, for example, can be recovered from the TOS. Similarly, truck appointment information can be used to correct missing data at the gates.
At Camco, we are very careful and use this additional data only for the confidence level calculation. When assessing a vendor's OCR result, it is key to discard any TOS autocorrection and look at the raw OCR result.
OCR/OFR hit rate KPI: system performance
When measuring the performance of a stand-alone OCR/OFR system, the hit rate is the most relevant KPI.
Whereas the confidence level for each reading is automatically computed by the system, the hit rate is determined through human verification, i.e. by visually checking each picture against the corresponding system reading.
Stand-alone here means that the OCR/OFR system is not integrated with any industrial process such as a truck gate process or crane process.
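As an illustration of the human-verification step, the sketch below computes a hit rate as the percentage of verified readings in which the system value matches what the verifier sees in the picture. The article does not give an explicit formula, so this definition, the `VerifiedRead` record, and the container numbers are assumptions made for the example only.

```python
from dataclasses import dataclass


@dataclass
class VerifiedRead:
    """One system reading checked by a human (hypothetical record layout)."""
    system_value: str  # what the OCR/OFR system reported
    true_value: str    # what the human verifier sees in the picture


def hit_rate(reads: list[VerifiedRead]) -> float:
    """Percentage of verified readings where the system value is correct."""
    if not reads:
        raise ValueError("hit rate is undefined for an empty sample")
    hits = sum(1 for r in reads if r.system_value == r.true_value)
    return 100.0 * hits / len(reads)


# Made-up sample: two correct reads, one misread digit, one missed reading.
sample = [
    VerifiedRead("ABCU1234567", "ABCU1234567"),
    VerifiedRead("DEFU7654321", "DEFU7654321"),
    VerifiedRead("GHIU1111111", "GHIU1111117"),  # last digit misread
    VerifiedRead("", "JKLU2222222"),             # nothing read: counts as a miss
]
print(f"hit rate: {hit_rate(sample):.1f}%")
```

Because every reading has to be checked by a person, this KPI is costly to produce, which is one reason it fits a project or acceptance context better than day-to-day operations.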
OCR/OFR job rate KPI: process performance
The OCR/OFR job rate KPI can be calculated automatically by counting the number of generated operator jobs for a population of passages (after deduction of the TOS-generated jobs).
A passage can be a truck, spreader, or train. The population typically has to contain between 250 and 500 passages to be statistically significant. All passages in this population are used; none are discarded, and even passages with missed pictures are taken into account.
Based on this counting process, the job rate KPI equals the number of OCR/OFR operator jobs, expressed as a percentage of the total population size.
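The counting process can be sketched as follows. The `Passage` record and its two flags are hypothetical, since the article does not describe a data model. The margin-of-error helper uses the standard normal approximation for a proportion; it is not part of the article's method and is included only to give a feel for what a population of a few hundred passages can resolve.

```python
import math
from dataclasses import dataclass


@dataclass
class Passage:
    """One truck, spreader or train passage (hypothetical record layout)."""
    ocr_job: bool = False  # operator job triggered by the OCR/OFR result
    tos_job: bool = False  # operator job triggered by missing or incorrect TOS data


def job_rate(passages: list[Passage]) -> float:
    """OCR/OFR operator jobs as a percentage of ALL passages.

    TOS-triggered jobs are not counted, and no passage is removed
    from the population.
    """
    if not passages:
        raise ValueError("job rate is undefined for an empty population")
    jobs = sum(1 for p in passages if p.ocr_job)
    return 100.0 * jobs / len(passages)


def margin_of_error(rate_pct: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error in percentage points (normal approximation)."""
    p = rate_pct / 100.0
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n)


# Made-up population of 400 passages: 40 OCR/OFR jobs, 10 TOS jobs, 350 clean.
population = (
    [Passage(ocr_job=True)] * 40
    + [Passage(tos_job=True)] * 10
    + [Passage()] * 350
)
rate = job_rate(population)
print(f"job rate: {rate:.1f}% +/- {margin_of_error(rate, len(population)):.1f} points")
```

With these made-up numbers the job rate is 10%, with an approximate margin of about 3 percentage points.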
Combining OCR/OFR attributes leads to a higher job rate
If the OCR/OFR process output contains many (independent) attributes (LPR, container ID, IMDG, seal presence, and so on), then the job rate will be higher due to the combination effect: a problem with any single attribute is enough to generate an operator job for the passage.
Therefore, it is useful to use only the minimal set of OCR/OFR attributes needed for process execution. Although the job rate KPI is not an indicator of the correctness of the OCR/OFR result as such, it is a very useful operator intervention KPI for measuring the performance of an AI vision-based system that is integrated into a business process.
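The combination effect can be made concrete with a small calculation. The sketch assumes that a passage generates an operator job as soon as any one attribute generates one, and that the attributes fail independently. The per-attribute rates are invented for illustration and are not measured values.

```python
import math


def combined_job_rate(per_attribute_rates_pct: dict[str, float]) -> float:
    """Job rate (%) for a passage that needs a job if ANY attribute needs one.

    Assumes independent attributes: the chance of no job at all is the
    product of the per-attribute chances of no job.
    """
    p_no_job = math.prod(1.0 - r / 100.0 for r in per_attribute_rates_pct.values())
    return 100.0 * (1.0 - p_no_job)


# Hypothetical per-attribute job rates in percent.
rates = {"container ID": 2.0, "LPR": 3.0, "IMDG": 4.0, "seal presence": 5.0}
minimal = {name: rates[name] for name in ("container ID", "LPR")}

print(f"minimal attribute set: {combined_job_rate(minimal):.2f}%")
print(f"all four attributes:   {combined_job_rate(rates):.2f}%")
```

Under these assumptions, two attributes give a combined job rate of roughly 4.9%, while all four give roughly 13.3%, even though no single attribute exceeds 5%.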
The right KPI for every process
In a project context, the OCR/OFR hit rate KPI is a useful performance measure as it can be specified during project contract negotiations. It is a raw performance measurement of a standalone OCR/OFR system, independent of any customer-specific industrial process.
However, from an operational point of view, a terminal is more interested in the amount of generated operator work. This work determines the operator workload and the total process duration.
Terminals want fast processes; too many unnecessary operator jobs slow down the operations. In the context of an operation, after a project is accepted, the OCR/OFR job rate KPI is a more useful performance measure as it requires no human work to generate the KPI, and is a good indicator of operator workload.
Which AI vision performance KPI should be used? Both KPIs can co-exist, as each measures the output of a different process. The advice is to use the OCR/OFR hit rate KPI in a project context and to switch to the OCR/OFR job rate KPI during the solution’s lifetime. As mentioned before, TOS-generated jobs are to be excluded from the AI vision performance KPI.