How to use both computer vision and AI
August 22, 2023
Computer vision and artificial intelligence (AI) each have their purpose in modern vision systems. When developing an imaging platform, challenges can arise when trying to implement AI in an existing computer vision system, and deciding which tasks are best suited to AI is difficult at first. It can therefore help to draw on the experience of industry experts who have already taken many steps toward combining AI and computer vision.
Value of AI
Engineers first turned to AI to overcome imaging challenges that are difficult for traditional computer vision. Traditional vision systems have been in use for decades, and experts have built up extensive experience applying computer vision to industrial applications including, but not limited to, machine vision automation, intelligent traffic systems (ITS) and aerial imaging. However, there are several applications where computer vision may not be appropriate.
Computer vision software is a great fit for factory applications where parameters are predetermined. Cameras are set up to continuously capture images as items pass by. Because the camera in a factory setting captures the same subject, position and lighting in each image, the system can be set up to identify any anomalies.
With ITS, however, developing an automated vehicle detection solution with computer vision can be tricky, and adopting AI for the first time can itself be a significant challenge. The transition from computer vision to AI is difficult partly because of preconceptions, primarily about the way each is set up for a particular task. In the traditional method, a developer may program the software to look for a change in the image as the sign of a vehicle. This introduces problems: if the software triggers the camera whenever the image changes, on the assumption that a vehicle is passing, several other kinds of change can trigger it too. For instance, if the weather changes, the software may interpret this as a change that requires the camera to capture an image. Likewise, something else moving past the camera, such as a bird, may trigger it.
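The false triggers described above can be illustrated with a minimal frame-differencing sketch. It is not the method of any particular product: the function names, the tiny 4x4 greyscale frames and both thresholds are assumptions invented for illustration.

```python
def changed_fraction(prev, curr, pixel_threshold=25):
    """Fraction of pixels whose grey level changed by more than pixel_threshold."""
    total = 0
    changed = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += 1
            if abs(a - b) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def should_trigger(prev, curr, pixel_threshold=25, area_threshold=0.05):
    """Fire the camera when a large enough share of the frame has changed."""
    return changed_fraction(prev, curr, pixel_threshold) >= area_threshold


# Tiny 4x4 greyscale frames (0-255), invented for illustration.
empty_road = [[100] * 4 for _ in range(4)]

# A vehicle covers the lower half of the frame.
vehicle = [[100] * 4, [100] * 4, [30] * 4, [30] * 4]

# A cloud passes: every pixel darkens, but no vehicle is present.
cloud = [[60] * 4 for _ in range(4)]

# A bird covers a single pixel.
bird = [row[:] for row in empty_road]
bird[0][1] = 20

print(should_trigger(empty_road, vehicle))  # True
print(should_trigger(empty_road, cloud))    # True (false trigger)
print(should_trigger(empty_road, bird))     # True (false trigger)
```

Raising the thresholds would suppress the bird but not the cloud, which changes the whole frame. This is the kind of tuning problem that makes a change-based rule fragile outdoors.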
In a factory setting where the same type of object continuously passes the camera, computer vision software can accurately image products, components or materials for inspection. Outdoors, however, several kinds of phenomena can make the scene inconsistent and complicate programming a solution. AI can provide value in this situation.
Training an AI is quite different from programming computer vision software. The process of training an AI is less set in stone. It is similar to introducing a subject to a student and having them review the material many times until they understand it. However, an AI needs to be trained with a particular subject in mind. When it comes to imaging a subject such as a vehicle, the software must be able to recognize what a vehicle looks like to be successful.
Rather than training an AI to recognize everything the camera sees and then pick out vehicles, the training focuses only on vehicles. By learning what a car, truck or motorcycle looks like, the AI knows to look for these targets within an image. With this baseline, the AI can ignore the other changes that occur in real-world traffic imaging.
The AI will only perform as well as the data used to train it. If the quality of the image data is poor, the AI will most likely make many mistakes. Rain, sun or clouds change the lighting and overall composition of an image, and the appearance of a vehicle can change with the weather as well. An image set that includes vehicles in bad weather as well as on sunny days therefore makes the AI more reliable. Trained on a large and varied set of images, the AI can handle edge cases more often and keep the camera from missing cars as they pass by.
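One simple way to check the variety of a training set is to count how the images are spread across conditions. The sketch below assumes each image carries a weather label; the label names, the 10% minimum share and the example counts are all hypothetical and not drawn from any real dataset.

```python
from collections import Counter


def condition_shares(labels):
    """Map each weather label to its share of the training set."""
    counts = Counter(labels)
    total = len(labels)
    return {cond: n / total for cond, n in counts.items()}


def underrepresented(labels, required, min_share=0.10):
    """List required conditions that make up less than min_share of the set."""
    shares = condition_shares(labels)
    return [cond for cond in required if shares.get(cond, 0.0) < min_share]


# Hypothetical per-image weather labels for a 100-image training set.
labels = ["sunny"] * 80 + ["cloudy"] * 14 + ["rain"] * 6
required = ["sunny", "cloudy", "rain", "snow"]

print(underrepresented(labels, required))  # ['rain', 'snow']
```

A report like this only says where more images are needed. It says nothing about image quality, which still has to be reviewed separately.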
Value of computer vision
There are multiple ways to tackle imaging challenges. AI can add a lot of new capability to a vision system, but it is not the only option. Traditional computer vision software can offer efficient solutions to various imaging challenges. While AI can be beneficial for complex image analysis, it also comes with a drawback: the strain it puts on computing resources. Computer vision software, on the other hand, offers simple programs that are easy to run on some of the smallest embedded systems.
Traffic systems can benefit greatly from computer vision software for tasks such as optical character recognition (OCR). Instead of using an AI that might require more power through a graphics processing unit (GPU) or field-programmable gate array (FPGA), computer vision OCR software simply cross-checks the image for basic shapes that match one of a preset list of license plate characters. Characters from each license plate in Figure 3 are passed through an OCR system and printed out for data collection on each vehicle captured by the camera system. Scanning each character and saving that data registers each license plate effectively for toll enforcement.
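The cross-checking of shapes against a preset character list can be pictured as simple template matching. The sketch below is an illustration under stated assumptions, not how any production OCR engine necessarily works: the 3x5 templates, the four-character alphabet and the function names are all invented.

```python
# Hypothetical 3x5 character templates ('#' = ink, '.' = background).
TEMPLATES = {
    "0": ("###", "#.#", "#.#", "#.#", "###"),
    "1": (".#.", "##.", ".#.", ".#.", "###"),
    "7": ("###", "..#", ".#.", ".#.", ".#."),
    "A": (".#.", "#.#", "###", "#.#", "#.#"),
}


def match_score(glyph, template):
    """Count the cells where the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the template character that agrees with the glyph on the most cells."""
    return max(TEMPLATES, key=lambda ch: match_score(glyph, TEMPLATES[ch]))


def read_plate(glyphs):
    """Recognize each segmented glyph and join the results into a plate string."""
    return "".join(recognize(g) for g in glyphs)


# A '0' with one corner pixel missing, as dirt or glare might cause.
smudged_zero = ("###", "#.#", "#.#", "#.#", "##.")
plate = [TEMPLATES["A"], TEMPLATES["1"], TEMPLATES["7"], smudged_zero]

print(read_plate(plate))  # A170
```

The whole comparison is a fixed number of cell checks per character, which is why this style of OCR runs comfortably on small embedded hardware. It also assumes the characters arrive already cropped and aligned.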
How to use them together
A vision system that uses AI can accomplish complicated image analysis. However, some imaging tasks are straightforward and do not require much processing power, so traditional computer vision software also plays an important role in keeping the system from being overwhelmed. By matching AI and computer vision to different challenges, an imaging system can use the most efficient solution for each task.
For ITS, both OCR and vehicle detection are crucial for toll enforcement. But for the computer vision software to receive a consistent image sample, the imaging system also needs to isolate each vehicle in an image. Any one image may contain several vehicles, and each license plate must be separately located and isolated before being inspected by the computer vision software (see Figure 4). As with vehicle detection, a reliable way to locate a license plate in an image is to train an AI on examples of vehicles on a road so it can adapt more accurately to the constantly changing scenes of street traffic.
Because license plates can be mounted in a variety of positions and come in a variety of colors, it is advantageous to have an AI trained on these edge cases. Plates from a single region can come in multiple colors because newly issued plates may use styles that replace old formats. Figure 5 shows license plates from various regions, including several from the same region (Ontario) with different designs that could confuse a vision system not trained to accommodate them.
Of course, various weather conditions can also affect how each image appears. Once a license plate is detected, the rest of the image can be cropped out so that only a rectangular license plate remains from which the OCR can be used to extract the characters. The three license plates in Figure 6 show various forms of dirt or mud that can obscure a license plate. Even license plate covers can reduce the light reflecting from the characters and can provide an additional challenge.
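The cropping step can be sketched as follows, assuming the plate detector returns one bounding box per plate as (left, top, right, bottom) pixel coordinates with the right and bottom edges exclusive. That box format, the function names and the small synthetic image are assumptions made for illustration.

```python
def clamp_box(box, width, height):
    """Clip a (left, top, right, bottom) box to the image bounds."""
    left, top, right, bottom = box
    left = max(0, min(left, width))
    right = max(left, min(right, width))
    top = max(0, min(top, height))
    bottom = max(top, min(bottom, height))
    return left, top, right, bottom


def crop(image, box):
    """Return the rectangular region of a row-major image inside the box."""
    height = len(image)
    width = len(image[0]) if height else 0
    left, top, right, bottom = clamp_box(box, width, height)
    return [row[left:right] for row in image[top:bottom]]


def crop_plates(image, boxes):
    """Crop one sub-image per detected plate, ready to pass to OCR."""
    return [crop(image, box) for box in boxes]


# Synthetic 8x6 image whose pixel value encodes its position (row * 10 + col).
image = [[row * 10 + col for col in range(8)] for row in range(6)]

# Two hypothetical detections; the second extends past the image edge.
plate_boxes = [(2, 1, 6, 3), (5, 4, 12, 9)]

for plate in crop_plates(image, plate_boxes):
    print(plate)
# [[12, 13, 14, 15], [22, 23, 24, 25]]
# [[45, 46, 47], [55, 56, 57]]
```

Clamping matters because a detector can report a box that runs off the frame when a vehicle is only partly in view.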
A real-world example of using both
One of Teledyne’s projects has been to develop an ITS imaging system for OCR. In this example, the design was initially intended to rely only on computer vision: the challenge was to build a tolling system relying solely on computer vision with an accuracy of at least 95%.
An accuracy of 75% was achieved fairly quickly, which was still short of the 95% target. The engineers then decided to try AI.
After working with AI, the engineers concluded that certain tasks and applications are still better suited to traditional computer vision, while others are better suited to AI. The main reason is that AI is more demanding on the hardware and requires more processing power. Computer vision code is usually targeted at a specific, simple task so that the system can handle large numbers of images. The winning combination was to use AI for vehicle detection and license plate identification, while for OCR, computer vision still came out on top. With proven algorithms for applications like OCR, traditional computer vision can deliver reliable accuracy.
A more challenging scenario is embedding vehicle detection within a camera. By using AI, the team of imaging experts at Teledyne was able to find any vehicle that appeared in front of the camera and even extract the license plate from each vehicle. The biggest hurdle was ensuring consistent performance under all kinds of operating conditions. By feeding the image of the license plate to the OCR engine, the system was able to use computer vision to extract each character and save the license plates. Combining the two approaches into a hybrid solution proved to be the most efficient and effective option. With the AI in place, the system exceeded the 75% benchmark and reached the desired 95% accuracy.
The lesson here is that the hybrid system was not the initial choice, and the original plan did not call for one. In the end, however, the AI was better able to handle the more challenging task of identifying the vehicle, while traditional computer vision could easily manage the OCR. AI is quickly becoming more important in a growing number of imaging applications and is allowing customers to take previous performance thresholds to the next level.
To learn more about finding the best camera for your application, visit the Teledyne DALSA website or reach out to their imaging experts.