Image processing for positioning a mechanical device with the Backpropagation algorithm and separate handling of RGB components

Different approaches to the use of Artificial Neural Networks (ANNs) in the recognition of image patterns have been used, with variations ranging from the processing of the image data to the ANN architecture itself. This paper describes the development of a system that aims to recognize image patterns with three-input ANNs that receive images decomposed into their RGB components. The ANNs have an architecture with two hidden layers of six neurons each and use the Backpropagation algorithm. The model normalizes the RGB components to values between zero and one. The Backpropagation algorithm is used for the functional approximation of these components, and after training, the numerical arrays obtained in the three outputs corresponding to the inputs are denormalized to form the resulting training image. Six image patterns were trained in different ANNs, forming a system that recognizes each pattern. The feasibility of the model was verified with tests of its generalization capacity. Images used to position a mechanical device, which did not participate in the training, were inserted into the system, and from them the positioning of the device was performed with a high degree of accuracy.


Introduction
The use of Artificial Neural Networks for the classification and recognition of image patterns in applications involving process automation has become increasingly widespread across different areas. Regarding the architectures and algorithms of ANNs applied to digital image processing, recent works describe the use of feed-forward networks to extract features such as image textures (Tuncer et al., 2019), and the use of Deep Convolutional Neural Networks to classify objects in 3D images and to identify objects in real time in video images or advanced automation applications (Lakhili et al., 2018; Newby et al., 2018; Sharma et al., 2018; Wei et al., 2021). Computer vision problems are characterized by the large amount of manipulated information, requiring the treatment of high-definition images with a large amount of additional information, such as coloring and positioning in three-dimensional space (Braga et al., 2012). This work describes a simpler alternative method for image processing, whose main approach is to decompose the image into its RGB components (Gonzalez & Woods, 2010; Pedrini & Schwartz, 2008; Solomon & Breckon, 2013), which enhances the characterization of a pattern as a function of color. The prototype developed in this work uses a passive sensor, a camera (Russell & Norvig, 2004), to observe and collect images of the environment. The positioning of a mechanical device through image processing with Neural Networks can serve, for example, as the effector of a robot in an industrial automation system (Capelli, 2008).
Regarding the state of the art in computer vision, the field of this work, we can mention the mathematical model used to measure a 3D surface (Talon & Pellegrino, 2022) using angle measurements from embedded sensors; another type of computer vision problem, the synthesis of images from cross-views (Regmi & Borji, 2019), solved by mapping points from one plane to another (homography); and Deep Convolutional Neural Networks, which have also been relevant to the problem of image restoration with a high degree of accuracy (Sharma et al., 2018; Wang et al., 2020).

Artificial Neural Network for image processing
The image data processing of this work was initially elaborated from the Artificial Neural Network represented in Figure 1, which presents an architecture with three inputs identified by xi, two hidden layers with six neurons each, identified by von and won, and six outputs identified by yn. The index n identifies the position of the neuron in its layer. In the context of machine learning (Baranauskas & Monard, 2000), the Artificial Neural Network uses the error correction learning rule, the Backpropagation algorithm (Faceli et al., 2019; Haykin, 2001; Luger, 2013), and the hidden layer neurons, as well as the outputs, use the bipolar sigmoid nonlinear activation function. In the image pattern recognition application described in (Mario et al., 2018), during preprocessing the image is segmented into its RGB components (Gonzalez & Woods, 2010; Pedrini & Schwartz, 2008; Soloman, 2012; Solomon & Breckon, 2013), and these components are inserted into the network inputs, with the Red component inserted at input x1, the Green component at input x2 and the Blue component at input x3. The network is fully connected; as a consequence, each neuron in any layer is connected to all neurons in the previous layer (Haykin, 2001). In the application described in (Mario et al., 2018), outputs y1, y2 and y3 converge, respectively, after training, to inputs r, g and b. Thus, it is possible to reconstruct a pattern of the images submitted to training by recomposing the RGB components presented at the network outputs. Outputs y4, y5 and y6 are, respectively, replicas of outputs y1, y2 and y3. The xij, vij and wij synapses link, respectively, the inputs to the von neurons, the von neurons to the won neurons, and the won neurons to the outputs. The indices ij identify which neuron the synapse is linked to and the order of its position. Network training is described below in Algorithm 1.

Algorithm 1 - start:
4. Calculation of the additive junction of von neurons;
7. Calculation of the additive junction of won neurons; (11)
8. Calculation of the activation function of won neurons;
9. Calculation of the derivative of the activation function of won neurons; (13)
10. Calculation of the bias update rate of von, won and yn neurons;
Algorithm 1 - end.

Research, Society and Development, v. 11, n. 2, e21311225768, 2022 (CC BY 4.0) | ISSN 2525-3409 | DOI: http://dx.doi.org/10.33448/rsd-v11i2.25768

In the experiments described in (Mario et al., 2018), it was observed that the use of the six-output architecture improves the accuracy of image reproduction compared to the three-output architecture. Updating the network synapse weights so that the output data converge to the input data occurs most effectively with the strategy of duplicating the outputs, propagating the error minimization to the synapses that are directly or indirectly linked to the outputs.
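As an illustration, the forward phase of this 3-6-6-6 architecture (the additive junction and activation calculations that appear among the steps of Algorithm 1) can be sketched in Java as follows. This is a minimal sketch, not the authors' code: the class and method names, and the placeholder weight values, are assumptions for illustration.

```java
import java.util.Arrays;

public class ForwardPass {
    /** Bipolar sigmoid activation used by hidden and output neurons:
     *  f(x) = (1 - e^-x) / (1 + e^-x), with output in (-1, 1). */
    static double bipolarSigmoid(double x) {
        return (1.0 - Math.exp(-x)) / (1.0 + Math.exp(-x));
    }

    /** One fully connected layer: additive junction (bias plus weighted sum
     *  of all previous-layer outputs) followed by the activation function. */
    static double[] layer(double[] in, double[][] w, double[] bias) {
        double[] out = new double[w.length];
        for (int n = 0; n < w.length; n++) {
            double sum = bias[n];                     // additive junction starts at the bias
            for (int i = 0; i < in.length; i++)
                sum += w[n][i] * in[i];
            out[n] = bipolarSigmoid(sum);
        }
        return out;
    }

    /** Forward pass: inputs x1..x3 -> von layer -> won layer -> outputs y1..y6. */
    static double[] forward(double[] x, double[][] v, double[][] w, double[][] wy,
                            double[] bv, double[] bw, double[] by) {
        return layer(layer(layer(x, v, bv), w, bw), wy, by);
    }

    public static void main(String[] args) {
        double[] x = {0.8, 0.2, 0.1};                 // normalized R, G, B of one pixel
        // Placeholder weights: 6 von neurons (3 inputs each), 6 won neurons, 6 outputs
        double[][] v = new double[6][3], w = new double[6][6], wy = new double[6][6];
        for (double[] row : v)  Arrays.fill(row, 0.1);
        for (double[] row : w)  Arrays.fill(row, 0.1);
        for (double[] row : wy) Arrays.fill(row, 0.1);
        double[] y = forward(x, v, w, wy, new double[6], new double[6], new double[6]);
        System.out.println(Arrays.toString(y));       // six outputs, each in (-1, 1)
    }
}
```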
After image acquisition, the image is preprocessed to segment it into its RGB components. The image data are inserted into the application in the form of a numerical matrix whose elements are sequentially distributed as in the notation of equation (35) of Algorithm 2, where n is the dimension of the matrix, called image_array. The elements R, G and B of image_array are, respectively, the Red, Green and Blue components of the image. Algorithm 2, described below, demonstrates the creation of the RGB component arrays from image_array. In experiments with this Neural Network (Mario et al., 2018), it was trained with an image representing a numerical digit, of dimensions 20 by 25 pixels, and 54 training cycles were performed. The trained image was then inserted into the network with the values of the synaptic weights obtained at the end of the training. After recomposing the data obtained at outputs y1, y2 and y3, the resulting image could be recognized as an approximation of the trained image (Mario et al., 2018). The result of this phase of the experiment is shown in Figure 2. The generalization capacity of the network was tested by inserting an image similar to the trained one and checking the network response for similarity between the output and input images. The result obtained in this test is shown in Figure 3.
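The decomposition performed by Algorithm 2 can be sketched as below. The packed 0xRRGGBB integer pixel representation is an assumption for illustration (the paper stores the components sequentially in image_array); the normalization of each component to [0, 1] follows the model described above.

```java
public class RgbSplit {
    /** Splits an image, here assumed to be given as packed 0xRRGGBB ints
     *  (a hypothetical representation, not the paper's image_array layout),
     *  into three matrices of R, G and B components normalized to [0, 1]. */
    static double[][][] split(int[][] image) {
        int rows = image.length, cols = image[0].length;
        double[][] r = new double[rows][cols];
        double[][] g = new double[rows][cols];
        double[][] b = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                int p = image[i][j];
                r[i][j] = ((p >> 16) & 0xFF) / 255.0;  // Red   -> network input x1
                g[i][j] = ((p >> 8)  & 0xFF) / 255.0;  // Green -> network input x2
                b[i][j] = ( p        & 0xFF) / 255.0;  // Blue  -> network input x3
            }
        }
        return new double[][][]{r, g, b};
    }

    public static void main(String[] args) {
        // One orange-ish pixel: R = 255, G = 128, B = 0
        double[][][] c = split(new int[][]{{0xFF8000}});
        System.out.println(c[0][0][0] + " " + c[1][0][0] + " " + c[2][0][0]);
    }
}
```

After training, the inverse step (denormalization) multiplies each output value by 255 before recomposing the resulting image.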
The development of the system for positioning a mechanical arm in this work is based on the results of the experiments performed with the described Artificial Neural Network (Mario et al., 2018), with the difference that the system's network outputs should converge to distinct numerical value ranges for each of the six desired positions of the mechanical arm's travel. The change in network functionality is made through equations (17), (18) and (19) of Algorithm 1, where the values of x1, x2 and x3 are replaced by numerical ranges equivalent to the positions of the mechanical arm.

Methodology
The development of this work corresponds to experimental and quantitative research, based on results obtained in experiments carried out with two-layer Neural Networks for digital image recognition (Mario et al., 2018) and on the development of an electronic and mechanical prototype that is moved by a Backpropagation algorithm applied in Artificial Neural Networks.
Six image patterns were used, differing in the position of the part. The images corresponding to the positions of the piece were obtained with a digital camera and were normalized to dimensions of 1.5 cm x 1.8 cm. Figure 4 shows the sets of images used for ANN training, with the workpiece located in regions 1 through 6, respectively.

Neural Networks for image pattern recognition
To recognize the patterns of the image sets presented in Figure 4, ANNs with the architecture described in the introduction are used. The programming language used to build the Artificial Neural Networks was Java, in the NetBeans IDE. The architecture of the ANNs with common inputs is shown in Figure 6. Neural Network 1 is trained to recognize the pattern of the images of region 1, orange; Neural Network 2 is trained to recognize the pattern of the images of region 2, red; and so on up to Neural Network 6, trained to recognize the pattern of the images of region 6, green.
For Neural Network training, Algorithm 1 is changed so that outputs y1 through y6 of each Neural Network in the structure converge to a single numerical value. The three error equations (17), (18) and (19) are then replaced by a single equation (48); the notation of errors and outputs may be replaced, respectively, by e1-6 and y1-6. The variable s receives the numerical value to which the outputs converge.
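A minimal sketch of the modified error term follows, assuming equation (48) takes the standard error-correction form e = s - y applied to every output (the equation itself is not reproduced in this excerpt, so this form is an assumption consistent with the replaced per-component errors). The local gradient uses the derivative of the bipolar sigmoid, which for f(x) = (1 - e^-x)/(1 + e^-x) is f'(x) = (1 - f(x)^2)/2.

```java
public class TargetError {
    /** Single error term assumed for equation (48): e = s - y, applied to
     *  every output y1..y6 instead of the three per-component errors. */
    static double error(double s, double y) {
        return s - y;
    }

    /** Local gradient of an output neuron: the error scaled by the bipolar
     *  sigmoid derivative evaluated at the neuron's output, (1 - y^2)/2. */
    static double delta(double s, double y) {
        return (s - y) * 0.5 * (1.0 - y * y);
    }

    public static void main(String[] args) {
        // If all six outputs must converge to s = 0.5 and one output is 0.4,
        // the weight updates are driven by this local gradient:
        System.out.println(delta(0.5, 0.4));
    }
}
```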

Training of Artificial Neural Networks
The training was performed with a different value of the variable s for each of the Neural Networks that identify the positions of the piece in regions 1 to 6. The values are broken down in Table 1:

Definition of the variation of the numerical value ranges of the Neural Networks outputs
Training defines the weight values of all synapses and biases of the Neural Networks. With these values set, the following steps of Algorithm 1 are performed, which allow the recalculation of the output values considering the training weights:

Code's piece from Algorithm 1 - start:
4. Calculation of the additive junction of von neurons;

Code's piece from Algorithm 1 - end.

After defining the numerical value ranges of the Neural Networks' outputs, the images used for training were inserted into the respective Neural Networks 1 to 6, and adjustments were then made to the numerical value ranges and to the conditional order to mitigate the effect of overlapping values for the same region, as can be seen in Graph 1.
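The definition of each network's output value range, and the detection of overlaps that motivate the adjustments to the conditional order, can be sketched as follows. The method names and the use of simple [min, max] intervals are illustrative assumptions, not the paper's code.

```java
public class OutputRanges {
    /** Computes the [min, max] numerical value range of one trained network's
     *  output over the recalculated outputs for its training images. */
    static double[] range(double[] outputs) {
        double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY;
        for (double y : outputs) {
            min = Math.min(min, y);
            max = Math.max(max, y);
        }
        return new double[]{min, max};
    }

    /** True when two region ranges overlap, signalling that the range limits
     *  or the order of the conditionals must be adjusted. */
    static boolean overlaps(double[] a, double[] b) {
        return a[0] <= b[1] && b[0] <= a[1];
    }

    public static void main(String[] args) {
        double[] region1 = range(new double[]{0.21, 0.25, 0.23});
        double[] region2 = range(new double[]{0.24, 0.30, 0.28});
        System.out.println(overlaps(region1, region2));  // overlapping regions
    }
}
```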

Decision making system for defining the positioning of the mechanical device
To define the positioning of the mechanical arm in each region, the decision system uses the numerical value ranges calculated for each Neural Network from 1 to 6, as described in Algorithm 3. In this algorithm, the variable p receives the numerical value corresponding to the displacement of the mechanical device, which is passed via the serial port to the stepper motor drive application. Its value is equivalent to the time and displacement of the horizontal stepper motor.
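The decision logic of Algorithm 3 can be sketched as below. The range limits and the displacement values p are hypothetical placeholders, since the real values come from the training and the adjustments described above; only the structure (testing the regions in the adjusted conditional order and emitting p) reflects the text.

```java
public class DecisionSystem {
    // Hypothetical per-region output ranges (real values come from training)
    static final double[][] RANGES = {
        {0.10, 0.19}, {0.20, 0.29}, {0.30, 0.39},
        {0.40, 0.49}, {0.50, 0.59}, {0.60, 0.69}
    };
    // Hypothetical displacement values p passed to the stepper-motor application
    static final int[] P = {10, 20, 30, 40, 50, 60};

    /** Returns the displacement p of the first region whose range contains the
     *  network output, testing regions in the adjusted conditional order;
     *  returns -1 when no region is recognized. */
    static int decide(double y) {
        for (int region = 0; region < RANGES.length; region++)
            if (y >= RANGES[region][0] && y <= RANGES[region][1])
                return P[region];
        return -1;
    }

    public static void main(String[] args) {
        // A value inside region 2's range selects that region's displacement
        System.out.println(decide(0.25));
    }
}
```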

Mechanical device drive system
The mechanical device consists of two stepper motors, one for horizontal displacement and one for vertical displacement. A plastic rod is fixed to the shaft of the vertically moving stepper motor, and through this rod the arm approaches the position of the part. The horizontal plane where the mechanical arm moves has dimensions of 18 cm by 6 cm, and in this plane six regions of 9 cm2 were defined for workpiece displacement, four of these regions positioned side by side and two vertically displaced in relation to the others. Each region has a specific color: region 1 orange, region 2 red, region 3 yellow, region 4 blue, region 5 purple and region 6 green. Figure 7 shows the mechanical device composed of the rods (2) attached to the vertical displacement stepper motor (3), the horizontal displacement stepper motor (4), and the horizontal plane with the regions (1) where the mechanical device is positioned after the displacements. The electronic circuits for selecting, controlling and driving the stepper motors can also be seen in the figure, position (5). Source: Authors (2022).

Tests of Neural Networks
Neural Network tests were performed with the image patterns used for training and also with images similar to those of the positioning regions but not used in training. Test results are obtained from the decision system described above. For this generalization test, an extension of the application was built, which loads the images into the neural network inputs.

Results
Following are the results of the tests with the image patterns used for the training of the Neural Networks and the results related to the generalizability of the networks, performed with untrained images.

Test results with training image standards
The training of Neural Networks 1 to 6 was performed with 300 cycles each. At the end of training, the Neural Networks for recognizing the image patterns of regions 1 to 6 were built with the final values of the synapse weights. The training images were then inserted into the Neural Network system, and each Neural Network feeds the decision system: Neural Network 1 gives the decision system the value of output 1 (y1), Neural Network 2 gives the value of output 2 (y2), and this sequence is followed up to Neural Network 6. Results are shown in Table 2.

Test results with untrained images
Neural Networks 1 to 6 were then submitted to the generalization test: images that did not participate in the training, but with the image patterns of the respective mechanical arm positioning regions, were inserted into the Neural Networks. Results are presented in Table 3.

Discussion
The experiments performed with the Neural Network described by Algorithm 1 (Mario et al., 2018) provided the perspective that it would be possible, through its architecture with six outputs, to enhance convergence to a data set equivalent to a particular image pattern. The network response enables the reconstruction of an image from the sets of RGB components presented at the outputs. The expectation of this work was to transform the subjective connotation of image analysis as a response to a pattern into a concrete response that could be expressed through numerical values. Through these same experiments, it was verified that an architecture with one output did not produce the expected results using a similar training methodology, that is, there was no convergence to distinct numerical value ranges that could identify different image patterns.
Regarding the results, in tests with trained images, errors are characterized by network responses pointing to images from regions neighboring the correct ones. Since the neural networks use the RGB components of the images, it is relevant to mention that there is similarity between the data of the orange-red and salmon-blue regions, where two of the three region identification errors occurred.
Adjustments made to the conditionals, as well as to the decision system ranges, after the training step resulted in improvements in the responses of the mechanical device positioning system. The performance in the generalization tests with images that did not participate in training, with all images correctly identified, confirms the tendency of the first tests and qualifies the system composed of Artificial Neural Networks as viable for the recognition of image patterns.

Conclusions
The objective proposed at the beginning of this work was achieved by demonstrating the feasibility of image pattern recognition using a less complex Neural Network architecture than those currently used for this purpose.
The studies and experiments that inspired and supported this work also show relevant results, such as the feasibility of recognizing and reconstructing image patterns through their RGB components.
The prototype built to demonstrate the development of this work is characterized by the predominant use of open-source hardware and software, resulting in the easy availability of all components and in the low cost involved. Moreover, the methodology used in this work for the analysis of image patterns can be implemented in other programming languages, and its use can be extended to applications where a condition is determined by an image.
Considering the similarity of all images used in the experiment, where the difference lies only in the presence of the circular piece over one of the regions 1 to 6, the performance of the Neural Network tests should be noted, with an overall accuracy of 87.5% for the images that participated in the training and 100% accuracy in the generalization tests.
Final considerations: the developed model, which covers both software and hardware, can be adapted in future works to robotics applications where the positioning of mechanical artifacts is required from interaction with image recognition, with an approach that highlights the treatment of image attributes through their RGB components, differing from works where images are segmented and evaluated in parts. This approach to image recognition also allowed a lower complexity for the Artificial Neural Network used, since its architecture is simpler compared to the state of the art, in which Deep Artificial Neural Networks have frequently been used for the manipulation of digital images.