With the development of digital image processing technology and the demands of practical applications, many problems no longer require the output to be a complete image; instead, the processed image is segmented and described, effective features are extracted, and the result is then judged and classified. This technology is image pattern recognition.
Image recognition technology uses computer vision to capture objects and, on the basis of the image data, lets the machine imitate human vision and automatically carry out certain information-processing functions, so that it can recognize the images captured by vision the way humans do and take over the task of classifying and identifying them. In image recognition we deal with two-dimensional data signals or plane graphics. We must set aside their different physical content and consider the essential nature of the sample data for classification, grouping samples with the same essential nature into one class and samples with different natures into different classes. The recognition result is required to match the objective object as closely as possible under the smallest probability of error, and the system should have the human ability to analyze, describe and judge a wide variety of things and phenomena.
Image recognition is an important field of contemporary computer science research and has developed into an independent discipline. In recent years the discipline has grown very rapidly, and its applications cover almost every field, from aerospace to the biological sciences, information science, resources and environmental science, astronomy, physics, industry, agriculture, national defense, education and art. Widely used in the national economy, national defense construction, public security and social development, it has a far-reaching impact on society as a whole. At present, optical character recognition (handwritten numeral recognition, postal code recognition, vehicle license plate recognition, Chinese character recognition, bar code recognition, etc.) and biometrics (face recognition, fingerprint recognition, iris recognition, etc.) are widely used in daily life and have a great impact on the economy, the military, culture and people's everyday activities.
Optical character recognition uses OCR reading equipment and intelligent vision software to recognize text that can be read both by machines and by the naked eye. The input device of an OCR system can be any kind of image acquisition device, such as a CCD camera, a scanner or a digital camera. With such a device, the OCR system takes characters written by a person as image input, and the computer then recognizes them. Optical character recognition has long been used in all kinds of commercial activities and is now applied to automation tasks. The information handled by character recognition falls into three categories: text recognition, digit recognition and bar code recognition.
Biometrics is a technology that identifies people by certain techniques and means, so that individuals can be recognized and the purposes of supervision, management and control can be achieved. Technologies and means for identity verification and personal information management emerge one after another. Traditional methods of personal identification rely on personal credentials such as ID cards, salary cards, student cards, magnetic cards, smart cards and passwords. These authentication methods are easy to lose, crack, forge or steal, and can no longer meet people's needs in terms of security and authentication speed. Although such methods are convenient and fast, their fatal shortcomings are poor security and the ease with which they can be forged or stolen. In recent years, the wide application of computers has made it possible to identify people through biometric identification.
Biometric identification methods are increasingly used in the field of identity verification. Biometric identification technology accurately identifies a person on the basis of the inherent characteristics of the human body. These inherent features, also called biometric traits, include the face, iris, fingerprint, palm print and so on. Except in special circumstances such as trauma, these characteristics generally accompany a person for life and change little, if at all. Biometric traits are carried by everyone and are lasting; they are universal, yet unique to each individual, which makes them superior to traditional identification. Identification based on human biological characteristics therefore has the advantages of safety, reliability, uniqueness, and resistance to forgery and theft.
Combined with computer technology, many identification technologies based on human biological characteristics have been developed, such as face recognition, fingerprint recognition and iris recognition. These technologies offer convenient feature capture, rich information and a wide range of applications, and therefore have broad application prospects.
(1) Face recognition is mainly based on facial features. It is one of the earliest biometric technologies used by people, and it is a friendly, intuitive and readily accepted method of identification. In practical applications, face recognition is simple and easy to use and requires no active participation from the user, which makes it especially suitable for video surveillance and similar applications. Its disadvantage, however, is poor stability: it is easily disturbed by the surrounding environment, accessories, age, facial expression and other factors, which leads to recognition errors. In addition, it remains largely powerless to distinguish twins and other multiple births.
(2) Iris recognition is mainly based on the physiological structure of the iris, using features such as filaments, spots, protrusions, rays, wrinkles and stripes. It is said that no two irises are exactly alike. Iris authentication is highly reliable, with low false acceptance and false rejection rates.
(3) Fingerprint identification works by analyzing the global and local features of fingerprints, such as ridges, valleys, endpoints and bifurcation points. With the development of fingerprint identification technology and the falling price of fingerprint acquisition devices, fingerprint identification is not only widely used in judicial and business activities but is also increasingly built into terminal devices such as notebook computers, mobile phones and storage devices. However, the fingers must be kept clean and smooth during acquisition; dirt and scars make identification difficult, and the fingerprints of the elderly and of manual workers are hard to identify because of severe wear. In addition, because fingerprints have long been associated with criminal records, many people are reluctant to have their fingerprints recorded and are psychologically unwilling to accept this kind of identification.
At present, both the engineering of character recognition (handwritten numeral recognition, postal code recognition, vehicle license plate recognition, text recognition, etc.) and human biometrics (face recognition, fingerprint recognition, iris recognition, etc.) draw on digital image processing, pattern recognition, artificial intelligence, intelligent computing and other disciplines. With the development of high technology, the application of such projects has become an important measure of the contemporary state of the art.
Image recognition technology is a combination of digital image processing and pattern recognition. Digital image processing is the basic activity of using computers or other digital devices to process image information so as to meet the needs of target recognition. Pattern recognition studies how to use machines to realize the human ability to learn about, recognize and judge things, so as to carry out the judgment required for target recognition.
In order to simulate human image recognition activities, people have put forward different image recognition models, such as the template matching model. This model holds that, in order to recognize an object in an image, there must be a memory pattern of that object formed from past experience, also called a template. If the current stimulus matches a template in the brain, the object is recognized.
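As a concrete illustration of the template idea, the following is a minimal sketch of template matching by normalized cross-correlation using OpenCV; the file names "scene.png" and "template.png" and the score threshold are placeholders for this example, not part of the original text.

```python
# A minimal sketch of the template-matching idea, assuming OpenCV (cv2) is
# available and template.png is a small stored crop of the object to find.
import cv2

image = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)        # image to search
template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)  # stored "memory pattern"

# Slide the template over the image and score each position with
# normalized cross-correlation; the best score marks the recognized object.
scores = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)
_, best_score, _, best_loc = cv2.minMaxLoc(scores)

if best_score > 0.8:   # threshold chosen purely for illustration
    print("object recognized at", best_loc, "score", best_score)
else:
    print("no sufficiently good match")
```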
The basic process of image recognition is to extract essential expressions (such as various features) representing an unknown sample pattern, match them against a set of standard pattern expressions (called a dictionary) stored in the machine in advance, and make a judgment according to certain criteria: from the stored set of standard patterns, the expression closest to the input sample is found, and the category corresponding to that expression is the recognition result. Image recognition technology is therefore a process of automatically identifying and evaluating objects in images on the basis of a large amount of information and data, existing experience and understanding, using computers and mathematical reasoning.
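The dictionary-matching step can be pictured with the toy sketch below: an unknown feature vector is compared against a few stored standard patterns and assigned the label of the closest one. The feature values and class names are entirely made up for illustration.

```python
# A toy illustration of matching an unknown feature vector against a stored
# "dictionary" of standard patterns; the numbers and labels are invented.
import numpy as np

dictionary = {                       # class label -> standard feature vector
    "cat":  np.array([0.9, 0.1, 0.3]),
    "dog":  np.array([0.2, 0.8, 0.5]),
    "bird": np.array([0.4, 0.4, 0.9]),
}

unknown = np.array([0.85, 0.2, 0.25])   # features extracted from the input image

# Pick the stored pattern whose expression is closest (smallest Euclidean distance).
label = min(dictionary, key=lambda k: np.linalg.norm(unknown - dictionary[k]))
print("recognition result:", label)      # -> cat
```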
The process of image recognition includes four steps: image acquisition, image preprocessing, feature extraction and pattern matching.
First, the original image information is captured with a high-definition camera, scanner or other acquisition instrument. During acquisition, differences in image size, angle, format and illumination intensity caused by mechanical properties of the equipment or by human factors will strongly affect later operations, so the captured original images need to be preprocessed. The role of image preprocessing can be summarized as normalizing the image information by various means so that it is ready for subsequent processing. The feature extraction stage extracts the information that best characterizes an object and converts it into a feature vector or matrix. Pattern matching means that the system compares the features of the image under test with the information in the feature library and, by choosing a suitable classifier, achieves recognition.
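A schematic outline of these four steps is sketched below. Every function name here is a placeholder invented for the example, and the gray-level histogram is used only as a crude stand-in for a real feature extractor.

```python
# A skeleton of the acquisition -> preprocessing -> feature extraction ->
# matching pipeline, assuming OpenCV; all names are illustrative placeholders.
import cv2
import numpy as np

def acquire(path):
    """Image acquisition: read the raw image produced by a camera or scanner."""
    return cv2.imread(path, cv2.IMREAD_GRAYSCALE)

def preprocess(img):
    """Normalization: fixed size and contrast so later steps see uniform input."""
    img = cv2.resize(img, (128, 128))
    return cv2.equalizeHist(img)

def extract_features(img):
    """A crude feature vector: the normalized gray-level histogram."""
    hist = cv2.calcHist([img], [0], None, [32], [0, 256]).ravel()
    return hist / hist.sum()

def match(features, library):
    """Compare with the feature library and return the closest class label."""
    return min(library, key=lambda k: np.linalg.norm(features - library[k]))
```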
Image preprocessing comprises a series of operations carried out before an image is formally processed. Because an image is inevitably damaged to some extent and polluted by various kinds of noise during transmission and storage, it loses some of its essential content or deviates from what people need, and a series of preprocessing operations is required to remove these effects. Generally speaking, image preprocessing falls into two areas: image enhancement and image restoration. Image enhancement accounts for a large share of preprocessing and is an essential step. It differs from image restoration, whose purpose is to recover the original appearance of the image; the principle of image enhancement is to highlight the features people need and suppress those they do not. Broadly, there are two families of enhancement techniques: spatial-domain methods and frequency-domain methods. Spatial-domain methods operate on the image directly in the spatial domain and are divided into point operations and neighborhood (local) operations. Point operations include gray-level transformation, histogram equalization and local statistics; neighborhood operations include image smoothing and image sharpening. Frequency-domain methods operate only on the transform coefficients of the image in some transform domain: for example, the image is Fourier transformed, its spectrum is processed in the transform domain, and the result is transformed back into the spatial domain. Frequency-domain methods are usually divided into high-pass filtering, low-pass filtering, band-pass filtering and band-stop filtering. Image restoration uses prior knowledge of the image to undo the degradation it has suffered: an image model is established first, the degradation process is then inverted, and finally the optimal estimate of the image before degradation is obtained.
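The spatial-domain enhancement operations mentioned above (a point operation and two neighborhood operations) can be sketched briefly with OpenCV; "input.png" and the kernel sizes are placeholder choices for this example.

```python
# A brief sketch of spatial-domain enhancement, assuming OpenCV is available.
import cv2
import numpy as np

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)   # placeholder file name

equalized = cv2.equalizeHist(img)               # point operation: histogram equalization
smoothed  = cv2.GaussianBlur(img, (5, 5), 1.0)  # neighborhood operation: smoothing

# Neighborhood operation: sharpening with a simple Laplacian-based kernel.
sharpen_kernel = np.array([[0, -1, 0],
                           [-1, 5, -1],
                           [0, -1, 0]], dtype=np.float32)
sharpened = cv2.filter2D(img, -1, sharpen_kernel)
```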
Transform-domain processing describes the characteristics of an image with spatial frequency (wave number) as the independent variable; it decomposes the spatial variation of the image into a linear superposition of simple oscillating functions with different amplitudes, spatial frequencies and phases. The various spatial frequency components of an image and their distribution are called its spatial spectrum. This decomposition, processing and analysis of the spatial-frequency characteristics of an image is called spatial-frequency-domain, or wave-number-domain, processing. Among the many image transforms, the most commonly used are the discrete cosine transform, the Walsh transform, the Fourier transform, the Gabor transform and the wavelet transform.
(1) The basis vectors of the discrete cosine transform (DCT) matrix are close to those obtained for Toeplitz-structured signals, so the DCT is often regarded as a near-optimal transform for speech and image signals. Although its compression efficiency is slightly lower than that of the K-L transform, its computational efficiency is far beyond what the K-L transform can offer, and it has become a core component of international standards such as H.261, JPEG and MPEG. It is widely used in image coding.
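A small sketch of DCT-based transform coding is shown below using SciPy; discarding the weakest coefficients of a block mimics the compression role the DCT plays in standards such as JPEG. The 8x8 random block and the 75th-percentile threshold are arbitrary choices for illustration.

```python
# A sketch of 2-D DCT transform coding on one block, assuming SciPy >= 1.4.
import numpy as np
from scipy.fft import dctn, idctn

block = np.random.rand(8, 8)                  # stand-in for an 8x8 image block

coeffs = dctn(block, norm="ortho")            # forward 2-D DCT
mask = np.abs(coeffs) >= np.percentile(np.abs(coeffs), 75)
compressed = coeffs * mask                    # keep only the strongest 25% of coefficients

reconstructed = idctn(compressed, norm="ortho")
print("reconstruction error:", np.abs(block - reconstructed).mean())
```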
(2) The Walsh transform is an orthogonal transform. It removes the correlation between adjacent samples and concentrates the signal energy in the upper-left corner of the transform matrix, leaving many zero values elsewhere; small values can also be discarded within an allowable error, thus achieving data compression. The Walsh transform is widely used in image transmission, radar, communications and biomedicine.
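The following is an illustrative 2-D Walsh-Hadamard transform of an N x N block, with N a power of two, built from scipy.linalg.hadamard; it is a plain matrix-product sketch, not an optimized fast transform.

```python
# A sketch of the 2-D Walsh-Hadamard transform, assuming SciPy is available.
import numpy as np
from scipy.linalg import hadamard

N = 8
H = hadamard(N)                          # +1/-1 Hadamard matrix of order N
block = np.random.rand(N, N)

wht = H @ block @ H / N                  # forward transform: energy packs into few coefficients
restored = H @ wht @ H / N               # the transform is its own inverse up to scaling
print(np.allclose(block, restored))      # -> True
```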
(3) The Fourier transform is a commonly used orthogonal transform. Its main mathematical foundation is the Fourier series, put forward by the mathematician Fourier in 1822, whose central idea is to expand a periodic function into a series of sinusoids. The Fourier transform provides a theoretical foundation for image analysis: by moving the image back and forth between the spatial domain and the frequency domain, it extracts and analyzes the information features of the image and simplifies the computation. Known as a "second language" for describing image information, it is widely used in image transformation, image coding and compression, image segmentation and image reconstruction.
(4) The Gabor transform is a windowed Fourier transform; it is the special case of the short-time Fourier transform in which the window function is Gaussian. Because of the limitations of the Fourier transform, Gabor proposed the windowed Fourier transform in 1946; in the typical windowed Fourier transform, the window function acts as a low-pass filter. Gabor functions can extract features at different scales and in different directions in the frequency domain.
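One common use of Gabor functions is a small bank of orientation-selective filters; a sketch with OpenCV follows, where the kernel size and filter parameters are illustrative values only and "input.png" is a placeholder.

```python
# A sketch of extracting orientation-selective responses with Gabor filters,
# assuming OpenCV; all parameter values are chosen only for illustration.
import cv2
import numpy as np

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

responses = []
for theta in np.arange(0, np.pi, np.pi / 4):          # 4 orientations
    kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=theta,
                                lambd=10.0, gamma=0.5, psi=0)
    responses.append(cv2.filter2D(img, cv2.CV_32F, kernel))

# Mean response magnitude per orientation can serve as a simple texture feature.
features = [np.abs(r).mean() for r in responses]
print(features)
```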
(5) The wavelet transform was inspired by the Fourier transform; Morlet put forward the concept of wavelet analysis in 1984, and in 1986 the mathematicians Meyer and Mallat constructed a unified framework for image wavelet functions, multi-scale (multiresolution) analysis. At present, wavelet transform theory has achieved good results in image denoising.
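A minimal wavelet-denoising sketch with the PyWavelets package is given below: the detail coefficients of a one-level decomposition are soft-thresholded and the image is reconstructed. The random "image", the db2 wavelet and the threshold value are illustrative choices, not derived from a noise estimate.

```python
# A minimal wavelet shrinkage sketch, assuming the PyWavelets (pywt) package.
import numpy as np
import pywt

noisy = np.random.rand(128, 128)                     # stand-in for a noisy image

cA, (cH, cV, cD) = pywt.dwt2(noisy, "db2")           # one-level 2-D wavelet decomposition
threshold = 0.1
cH, cV, cD = (pywt.threshold(c, threshold, mode="soft") for c in (cH, cV, cD))

denoised = pywt.idwt2((cA, (cH, cV, cD)), "db2")     # reconstruct from shrunken coefficients
```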
Frequency-domain denoising is used mainly when processing some images directly in the spatial domain gives unsatisfactory results; the image is converted to the frequency domain, that is, the objective function is approximated with a set of orthogonal functions, and the coefficients of the corresponding series are then obtained. Frequency-domain processing is mainly used for operations related to the spatial frequency of the image, such as image restoration, image reconstruction, radiometric transformation, edge enhancement, image smoothing, noise suppression, spectrum analysis and texture analysis.
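A compact sketch of frequency-domain low-pass filtering with NumPy follows: transform the image, suppress the high spatial frequencies, and transform back. The cutoff radius of 30 and the random "image" are illustrative values only.

```python
# Frequency-domain low-pass filtering (an idealized noise-suppression sketch).
import numpy as np

img = np.random.rand(256, 256)                       # stand-in for a noisy image

spectrum = np.fft.fftshift(np.fft.fft2(img))         # centered spatial spectrum

rows, cols = img.shape
y, x = np.ogrid[:rows, :cols]
radius = np.hypot(y - rows / 2, x - cols / 2)
lowpass = radius <= 30                               # ideal low-pass mask (illustrative cutoff)

filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * lowpass)).real
```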
Feature extraction is a concept in computer vision and image processing. It refers to using a computer to extract image information and decide whether each image point belongs to an image feature. The result of feature extraction is to divide the points of the image into different subsets, which often form isolated points, continuous curves or continuous regions.
(1) Feature selection
When the number of original features is large, or the original samples lie in a high-dimensional space, selecting the most effective subset of features so as to reduce the dimension of the feature space is called feature selection. In other words, features that contribute nothing or very little to class separability are simply discarded. Feature selection is a key problem in image recognition.
(2) Feature transformation
Describing features originally expressed in a high-dimensional space by features in a low-dimensional space, through a mapping or transformation, is called feature transformation. The features obtained by feature transformation are combinations of the original features, and the new features retain most of the information in the original ones. Principal component analysis is the most commonly used feature transformation method.
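A bare-bones principal component analysis, written with NumPy only, is sketched below as a feature transformation; the data matrix X and the number of retained components k are made up for the example.

```python
# Principal component analysis via SVD: project samples onto the directions
# of largest variance to obtain a low-dimensional feature description.
import numpy as np

X = np.random.rand(100, 50)                   # 100 samples, 50 original features

X_centered = X - X.mean(axis=0)
# Rows of Vt are the principal directions, ordered by explained variance.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

k = 5
X_reduced = X_centered @ Vt[:k].T             # combine original features into k new ones
print(X_reduced.shape)                        # -> (100, 5)
```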
Feature selection and extraction are very important, and feature selection is a key problem in pattern recognition. In many practical problems it is difficult to find the most important features, or the conditions do not allow them to be measured, so feature selection and extraction become complicated and are among the most difficult tasks in constructing a pattern recognition system; the problem therefore attracts more and more attention. The basic task of feature selection and extraction is to find the most effective features among many candidates. The core issues are how to evaluate the existing features and how to generate better features from them.
Common image feature extraction and description methods include color features, texture features and geometric features.
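As an illustration of these three kinds of descriptors, the sketch below computes a color histogram, a simple gradient-based texture statistic and two geometric measures with OpenCV (version 4 assumed for the findContours return value); the file name and all parameter choices are placeholders.

```python
# A sketch of color, texture and geometric features for one object image.
import cv2
import numpy as np

img = cv2.imread("object.png")                        # placeholder path (BGR image)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Color feature: normalized histogram of the hue channel.
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
hue_hist = cv2.calcHist([hsv], [0], None, [16], [0, 180]).ravel()
hue_hist /= hue_hist.sum()

# Texture feature: mean and standard deviation of the gradient magnitude.
gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
magnitude = cv2.magnitude(gx, gy)
texture = np.array([magnitude.mean(), magnitude.std()])

# Geometric feature: area and perimeter of the largest contour (assumes the
# object stands out from the background after Otsu thresholding).
_, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
largest = max(contours, key=cv2.contourArea)
geometry = np.array([cv2.contourArea(largest), cv2.arcLength(largest, True)])

feature_vector = np.concatenate([hue_hist, texture, geometry])
```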
According to whether standard (labeled) samples are available, pattern recognition can be divided into supervised learning and unsupervised learning. The classification or description in pattern recognition is usually based on a set of patterns that have already been classified or described; this set is called the training set, and the resulting learning strategy is called supervised learning. Learning can also be unsupervised: in that case the system is given no prior knowledge of the pattern classes and learns to judge them from the statistical regularities of the patterns or from their mutual similarity.
(1) Data acquisition
Data acquisition means using various sensors to convert the different kinds of information about the object under study into numerical values or sets of symbols (strings) that a computer can accept. Traditionally, the space composed of such values or symbols is called the pattern space. The key issue in this step is the choice of sensors.
In general, the data obtained at this stage fall into a few broad types, such as two-dimensional images, one-dimensional waveforms, and measured physical parameters or logical values.
(2) Preprocessing
In order to extract effective information from these numbers or symbols (strings), preprocessing must be carried out to remove noise from the input data, eliminate irrelevant signals, and keep only the features that are closely related to the nature of the object under study and to the recognition method adopted (for example, shape, perimeter and area). In fingerprint identification, for instance, the fingerprint image output by the scanning device varies with the contrast, brightness and background of the image, and it may sometimes be distorted. We are interested only in the fingerprint ridges, bifurcation points and endpoints in the image, not in other parts of the fingerprint or in the background. Reasonable filtering algorithms, such as directional filtering and binarization filtering based on the block orientation map, must therefore be used to remove these unwanted parts of the fingerprint image.
(3) Feature extraction
The original data are transformed so that the most effective features can be found among many candidates; the features that best reflect the nature of the classes are obtained, and the high-dimensional measurement space (the space formed by the original data) is converted into a low-dimensional feature space (the space in which classification and recognition are carried out), reducing the difficulty of subsequent processing. Features that are easy for humans to perceive may be hard for machines to obtain; this is the problem of feature selection and extraction in pattern recognition, and it is a key one. In general, the more candidate features there are, the better the result should be; in practice, however, this may lead to the curse of dimensionality, in which the feature dimension becomes too high for the computer to handle. When designing a pattern recognition system, determining an appropriate feature space is therefore a very important problem. There are two basic ways to optimize the feature space. The first is feature selection: if the selected feature space makes the distributions of objects of the same class compact, it provides a good basis for successfully designing the classifier; if, on the other hand, samples of different classes are mixed together in this feature space, then no matter how good the design method is, the accuracy of the classifier cannot be improved. The second is combinatorial optimization of features, in which a mapping transformation is applied to the original feature space to construct a new, simplified feature space.
(4) Classification decision-making
Once the pattern feature space has been established, the last part of pattern recognition, classification decision, can be carried out. The final output of this stage may be the type to which the object belongs, or the pattern in the model database most similar to the object. The categories and characteristics of a number of samples must be known. For example, recognizing handwritten Arabic numerals is a 10-class classification problem. The machine must first know the shape characteristics of each handwritten digit. Different people write the same digit differently, and even the same person may write it in several ways, so the machine must be told which class each example belongs to. It is therefore necessary to build a sample database for the classification problem. From these samples a discriminant classification function is established; this is carried out by the machine and is called the learning process. The characteristics of an unknown new object are then analyzed and the class to which it belongs is decided. This is a supervised classification method.
The specific steps are as follows: a training set is established in the feature space, and the category of every point in the training set is known; from this information a discriminant function or criterion is sought, a decision function model is designed, and the parameters of the model are determined from the samples in the training set; the model can then be used for discrimination, and the discriminant function or criterion decides to which class each unknown point belongs. In the discipline of pattern recognition, this is generally called the process of training and learning.
The classification rules are determined from the information provided by the training samples, and the classifier is designed during the training process. Given a batch of training samples that includes samples of all the classes, the distribution of the various classes in the feature space is roughly outlined, which provides the information needed to decide which mathematical form of classification function to use and how to set its parameters. Generally speaking, it is up to the designer to decide what kind of classification function to use; the classifier parameters, or the results obtained from learning, depend on the criterion function the designer chooses. The optimal solutions of different criterion functions correspond to different learning results and therefore to classifiers with different performance. The parameters in the mathematical formulas are usually determined by learning. If, during learning, the current classification function is found to cause classification errors, information on how to correct those errors moves the classification function in the right direction, and this forms an iterative process. When the classification function and its parameters produce fewer and fewer errors, they can be said to converge gradually; the learning process then yields its result and the design can be concluded.
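A tiny perceptron-style learning loop makes this iterative error-correction idea concrete: whenever the current discriminant misclassifies a training sample, its parameters are nudged toward correcting that error, and training stops once an epoch produces no errors. The two-class data below are synthetic, generated only for the example.

```python
# A minimal sketch of iterative error-correcting training (perceptron rule).
import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])  # two classes
y = np.hstack([-np.ones(50), np.ones(50)])                             # labels -1 / +1

w = np.zeros(2)
b = 0.0
for epoch in range(20):
    errors = 0
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:   # current discriminant misclassifies xi
            w += yi * xi             # correct the parameters toward the error
            b += yi
            errors += 1
    if errors == 0:                  # no more corrections needed: training has converged
        break

print("learned discriminant: w =", w, "b =", b)
```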
From the point of view of applications, the four parts of a pattern recognition system differ considerably, especially data preprocessing and classification decision. To improve the reliability of the recognition results, it is often necessary to add a knowledge base (rules) to correct possible errors, or to introduce constraints that reduce the matching computation, thereby greatly shrinking the search space for the pattern to be recognized in the model base.