Characterization and classification of numerical data patterns using Annotated Paraconsistent Logic and the effect of contradiction

This work describes the development of a computational mathematical model that uses Annotated Paraconsistent Logic - APL and a concept derived from it, the effect of contradiction, to identify patterns in numerical data for pattern classification purposes. APL admits paraconsistent and paracomplete logical principles, which allow the manipulation of inconsistent and contradictory data, and its use allowed the identification and quantization of the attribute related to the contradiction. To validate the model, series of Raman spectroscopies, obtained after exposure of proteins, lipids and nucleic acids collected from cutaneous tissue cell samples previously examined for the detection of cancerous lesions (identified as basal carcinoma, melanoma or normal), were used. Initially, the attributes related to contradiction, derivative and median obtained from the spectroscopies were identified and quantified. A machine learning process using approximately 31.6% of the samples of each type detects a sequence of spectroscopies capable of characterizing and classifying the type of lesion through the chosen attributes; the remaining samples, approximately 68.4%, are used for classification tests. The proposed model identified a segment of spectroscopies where the classification of the test samples had a hit rate of 76.92%. As a differential and innovation of this work, the use of APL principles in a complete process of training, learning and classification of patterns for numerical data sets stands out, with flexibility to choose the attributes used for the characterization of patterns, and a quantity of samples of about one third of the total required for characterization.


The Annotated Paraconsistent Logic -APL
Classical Logic is founded in such a way that, given a proposition, it can assume only two states, being therefore considered a binary logic. This condition limits the representation of numerous real-world situations, especially those that involve decision-making in complex problems. In research related to Artificial Intelligence, system designs based on more flexible types of logic are studied, which are therefore adaptable to the complex problems of the real world. Non-Classical Logics admit in their foundations indefiniteness, ambiguities and contradictions, being more adequate to represent complex systems. Paraconsistent Logic belongs to the Non-Classical Logic family and admits contradictory states in its theoretical structure (Da Silva Filho et al., 2008).
Studies on the fundamentals of Annotated Paraconsistent Logic - APL are described by Da Costa, Subrahmanian and Vago (Da Costa et al., 1991). To represent the admitted states of the APL, a lattice with four vertices can be used, with the so-called annotation constants at its vertices: V - true, F - false, ⊤ - inconsistent, ⊥ - paracomplete or indeterminate.
Given the set of these constants τ = {⊤, V, F, ⊥}, the lattice τ = ⟨|τ|, ≤, ~⟩ can be characterized by a Hasse diagram. The negation operator over τ is ~: |τ| → |τ|, whose operations are described next to the lattice (Da Silva Filho et al., 2008):
~⊤ = ⊤ (the negation of an inconsistent proposition is inconsistent)
~V = F (the negation of a true proposition is false)
~F = V (the negation of a false proposition is true)
~⊥ = ⊥ (the negation of a paracomplete proposition is paracomplete)
Source: Authors (2021).
We observe in the lattice the true and false states, also present in Classical Logic, and the states additionally admitted by Paraconsistent Logic, inconsistent and paracomplete, whose complements or negations are the states themselves.
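The negation operator above can be captured in a small lookup table. The sketch below, in Python, is an illustrative encoding; the string names chosen for the four annotation constants are our own convention, not part of the source:

```python
# The four annotation constants of the APL lattice and the negation operator ~.
# "T" stands for the inconsistent state (top) and "B" for the paracomplete state (bottom).
NEGATION = {
    "T": "T",  # ~inconsistent = inconsistent
    "V": "F",  # ~true = false
    "F": "V",  # ~false = true
    "B": "B",  # ~paracomplete = paracomplete
}

def negate(state: str) -> str:
    """Apply the lattice negation operator to an annotation constant."""
    return NEGATION[state]
```

Note that applying the operator twice restores the original state, e.g. negate(negate("V")) returns "V".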

The Annotated Paraconsistent Logic with Annotation of Two Values - APL2v
APL2v allows the representation of two values, μ and λ, for the states of Paraconsistent Logic related to a given proposition. A four-vertex lattice representing APL2v is shown in Figure 2, where a proposition is represented by P, the degree of evidence favorable to the proposition by μ, and the degree of evidence unfavorable to the proposition by λ (Da Silva Filho et al., 2008). The annotated proposition is written P(μ, λ), with:
⊤ = Inconsistent = P(1, 1)
V = True = P(1, 0)
F = False = P(0, 1)
⊥ = Undetermined = P(0, 0)
Source: Authors (2021).
APL2v then admits two annotations, or items of evidence, for each of the APL states, each varying between the numerical values 0 and 1. When one of the states is effectively characterized, that is, at the vertices, the evidence in favor of and against the proposition, represented respectively by μ and λ, assume their maximum or minimum values.

Paraconsistent Algorithms
Algorithms that express the principles of Annotated Paraconsistent Logic can be structured as modules or blocks whose inputs are the evidence for or against a given proposition, and whose one or more outputs correspond to the paraconsistent analysis. All numerical values that express the inputs are normalized between 0 and 1. Structuring APL in blocks allows them to be interconnected, forming Paraconsistent Analysis Networks, so that the output of one block can be the input of another; that is, the resulting paraconsistent analysis becomes the input evidence of a subsequent block. A paraconsistent analysis block is shown in Figure 3. In the paraconsistent analysis block, the degrees of favorable and unfavorable evidence enter through inputs μ and λ, while the output S presents a resulting degree of evidence that depends on the paraconsistent algorithm implemented inside the block; some of these algorithms are described in this work.

The representation of certainty and contradiction in APL2v
Through the representative lattice of APL2v shown in Figure 4, it is possible to describe how values related to the evidence of certainty and contradiction of a given proposition are quantified, through the so-called Degree of Certainty - DC and Degree of Contradiction - DCt (Da Silva Filho et al., 2008). Both can vary in the interval [-1, 1] and are calculated from the favorable and unfavorable evidence of a given proposition.
The Degree of Certainty and the Degree of Contradiction can then be defined as (Da Silva Filho et al., 2008):
DC = μ - λ (1)
DCt = μ + λ - 1 (2)
From the Degree of Certainty and the Degree of Contradiction, it is possible to define the so-called Certainty Interval - φ, an indicator of how much certainty or contradiction, originated by the favorable Degree of Evidence - μ and the unfavorable Degree of Evidence - λ of a given proposition, prevails in the paraconsistent analysis. The Certainty Interval is also accompanied by a sign, (+) indicating that the DCt tends to the True logical state or (-) indicating that the DCt tends to the False logical state. The Certainty Interval is defined as:
φ = 1 - |DCt| (3)
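In code, equations (1) to (3) are direct. The sketch below assumes μ and λ have already been normalized to the interval [0, 1]:

```python
def degree_of_certainty(mu: float, lam: float) -> float:
    """DC = mu - lam, equation (1); ranges over [-1, 1]."""
    return mu - lam

def degree_of_contradiction(mu: float, lam: float) -> float:
    """DCt = mu + lam - 1, equation (2); ranges over [-1, 1]."""
    return mu + lam - 1

def certainty_interval(mu: float, lam: float) -> float:
    """phi = 1 - |DCt|, equation (3)."""
    return 1 - abs(degree_of_contradiction(mu, lam))
```

At the vertices of the lattice the values behave as expected: P(1, 0) gives DC = 1 and DCt = 0 (pure truth), while P(1, 1) gives DC = 0 and DCt = 1 (pure inconsistency), shrinking the Certainty Interval to 0.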

Paraconsistent Analysis Algorithm
In a paraconsistent analysis, the Degree of Real Certainty - DRC is the result of the analysis when, at its end, it is desired to present a pure certainty value, that is, one without the influence of uncertainties. In calculating the Degree of Real Certainty, the value attributed to the effect of inconsistencies arising from conflicting information is subtracted from the result of the paraconsistent analysis, freeing it from the effect of contradiction (Da Silva Filho et al., 2008). Figure 5 shows the APL2v lattice with the intersection point between DC and DCt. For an analysis with null DCt, the DC value would be maximum and equal to 1. The distance between the maximum DC value and the intersection point is called d, and the projection of d on the Degrees of Certainty axis is the value of the Degree of Real Certainty - DRC. The Resulting Degree of Evidence - μER is the normalized DRC value. The Resulting Evidence Interval - φE(±) represents, in the paraconsistent analysis, the same concept demonstrated by the Certainty Interval defined in equation (3), but it is calculated from the Degree of Contradiction normalized between 0 and 1, called Ctr in the algorithm. The lattice shows the Degree of Certainty and the Degree of Contradiction, the distance d between the maximum value of DC and the intersection point of the degrees of certainty and contradiction, and the projection of the distance d on the DC axis, which corresponds to the Degree of Real Certainty. Algorithm 1, the paraconsistent analysis algorithm, is presented next.
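A minimal sketch of the paraconsistent analysis of Algorithm 1, assuming the usual APL2v construction in which d is the Euclidean distance from the point of maximum certainty (DC = 1, DCt = 0), or its mirror for negative DC, to the point (DC, DCt):

```python
import math

def paraconsistent_analysis(mu: float, lam: float) -> float:
    """Return the Resulting Degree of Evidence (mu_ER), normalized to [0, 1]."""
    dc = mu - lam          # Degree of Certainty, equation (1)
    dct = mu + lam - 1     # Degree of Contradiction, equation (2)
    # distance d between the maximum certainty point and (DC, DCt)
    d = math.sqrt((1 - abs(dc)) ** 2 + dct ** 2)
    # Degree of Real Certainty: projection of d on the DC axis,
    # keeping the sign of DC
    drc = (1 - d) if dc >= 0 else (d - 1)
    return (drc + 1) / 2   # normalize DRC to obtain mu_ER
```

With no contradiction and full favorable evidence, paraconsistent_analysis(1, 0) returns 1.0; with maximally undefined evidence, paraconsistent_analysis(0.5, 0.5) returns 0.5.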

Algorithm Extractor of the Effects of Contradiction
The Contradiction Effects Extractor algorithm is used when there is a set of evidence, subject to contradictions, about a given proposition; through the algorithm, a paraconsistent analysis of this set of evidence is performed, subtracting the effects caused by the contradictions and presenting a single resulting level of evidence representative of the whole. The resulting numerical value is a Degree of Certainty from which the effects of the contradiction have been subtracted. Algorithm 2, used in the contradiction effects extraction process, is described below; its final step is: 6. Return to item 2 until the study group has only one element, then taking the resulting value of the analysis: Gμest = μER (19). Algorithm Extractor of the Effects of Contradiction - end.
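The step-by-step text of Algorithm 2 is partly garbled in this copy, so the sketch below is an interpretation based on the description attributed to Da Silva Filho et al. (2011): at each pass, the largest and smallest evidence values are analyzed as a favorable/unfavorable pair, the pair is replaced by the resulting evidence μER, and the process repeats until a single value Gμest remains.

```python
import math

def paraconsistent_analysis(mu: float, lam: float) -> float:
    """Algorithm 1: Resulting Degree of Evidence, normalized to [0, 1]."""
    dc, dct = mu - lam, mu + lam - 1
    d = math.sqrt((1 - abs(dc)) ** 2 + dct ** 2)
    drc = (1 - d) if dc >= 0 else (d - 1)
    return (drc + 1) / 2

def extract_contradiction_effects(evidences: list[float]) -> float:
    """Reduce a group of evidence values to a single contradiction-free
    Degree of Certainty (G_mu_est)."""
    group = sorted(evidences)
    while len(group) > 1:
        mu = group[-1]       # largest evidence taken as the favorable degree
        lam = 1 - group[0]   # smallest evidence, complemented, as unfavorable
        mu_er = paraconsistent_analysis(mu, lam)
        # the analyzed pair is replaced by the resulting evidence
        group = sorted(group[1:-1] + [mu_er])
    return group[0]
```

A group of identical, non-contradictory evidence values is returned essentially unchanged: extract_contradiction_effects([0.7, 0.7]) yields approximately 0.7.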

Algorithm for quantizing the attributes of the spectroscopy series
The training and learning process values attributes of the spectroscopy series; one of the attributes, the absence of contradiction, is obtained from the principles of APL2v. Sequential spectra are expressed by x(i) and x(i+1), where i is the position of each spectrum in the dataset. The following attributes are calculated: the increasing, decreasing and zero rates of change; the median, which represents an approximation to the mean value of the series; and the resulting value of the contradiction extractor, which assigns to the dataset a Degree of Certainty free from the effects of the contradiction. These attributes are calculated as described in Algorithm 3, presented below. Algorithm 3: Quantization of spectroscopy series attributes - start. Given the variables: a = increasing rate of change; b = decreasing rate of change; c = zero rate of change; d = Degree of Certainty free from the effects of contradiction; e = median; with a = 0, b = 0 and c = 0 initially: 1-3. Compare the sequential spectra x(i) and x(i+1): each increasing transition increments a by 0.1, each decreasing transition increments b by 0.1, and each null transition increments c by 0.1 (23); 4. Quantization of contradiction extraction - perform steps 1 to 6 of Algorithm 2: d = Gμest (24); 5. Calculate the median of the spectral series: e. Algorithm for quantizing the attributes of the spectroscopy series - end.
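A sketch of the attribute quantization of Algorithm 3. The step-by-step text is partly garbled in this copy, so the 0.1 increment per comparison is inferred from the surviving fragment "c = c + 0.1"; the contradiction-free certainty d, obtained by applying Algorithm 2 to the segment, is passed in here as a precomputed value:

```python
import statistics

def quantize_attributes(segment: list[float], d: float) -> dict:
    """Quantize the attributes of one spectroscopy segment.

    segment: normalized spectral values x(i);
    d: the Degree of Certainty free from the effects of contradiction
       (G_mu_est, the output of Algorithm 2 for this segment).
    """
    a = b = c = 0.0
    for x_i, x_next in zip(segment, segment[1:]):
        if x_next > x_i:
            a += 0.1    # increasing rate of change
        elif x_next < x_i:
            b += 0.1    # decreasing rate of change
        else:
            c += 0.1    # zero rate of change
    e = statistics.median(segment)   # approximation to the series mean value
    return {"a": a, "b": b, "c": c, "d": d, "e": e}
```

For example, quantize_attributes([0.1, 0.2, 0.2, 0.1, 0.3], d=0.8) counts two increasing transitions (a = 0.2), one decreasing (b = 0.1), one null (c = 0.1), and a median of 0.2; the d value shown is illustrative only.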

The use of APL2v in intelligent systems
Some works that use APL2v principles related to those proposed in this research are described below. Theoretical concepts of Paraconsistent Logic originated in Philosophy, deriving states such as inconsistency and contradiction, and have been approached in different works so that their principles could be applied in different areas of Science and Engineering (Priest, 2002; Norihiro & Heinrich, 2012; Carnielli et al., 2018). Studies involving Artificial Neural Networks based on Paraconsistent Logic suggest that they can be applied in decision systems that use pattern recognition (Martinez et al., 2021). An autonomous mobile module using ultrasound distance sensing, whose trajectory is defined from valuations of inconsistencies and contradictions, demonstrated that the APL2v principles could be used beyond theoretical limits (Da Silva Filho et al., 2001). A model to analyze cephalometric variables in orthodontics (Mario et al., 2010), developed from Paraconsistent Artificial Neural Networks (Mario, 2003), describes how the potential use of APL2v can be expanded to the medical field. An architecture that uses Paraconsistent Artificial Neural Cells for Learning - PANCL (Mario, 2003) and the algorithm that extracts the effects of contradictions, which, given a data set representative of evidence, returns a single value equivalent to the absence of contradiction in the data set (Da Silva Filho et al., 2011), are two principles used to demonstrate the concept of a filter that analyzes the signal-to-noise ratio in telecommunications (Carvalho Jr. et al., 2018). The treatment of signals equivalent to a proportional and integral controller used in an automation system was structured through the concepts of APL2v, and promising results were obtained compared to a conventional feedback control (Coelho et al., 2019).
More recent theoretical studies evaluate the use of Artificial Neural Networks in conjunction with Paraconsistent Logic for use in decision-making systems (Martinez et al., 2021).
The attributes of Paraconsistent Logic were also used in the analysis of data obtained by Raman spectroscopy to discriminate cutaneous tissue samples; the first work in this area described a paraconsistent analysis network.

Objective
The model must identify, in numerical series resulting from Raman spectroscopy, data segments whose attributes can be quantized so that patterns for these series can be characterized. The objectives are: to demonstrate the development of a computational model based on Paraconsistent Logic that uses a machine learning process; to present a characteristic of non-classical logic, the contradiction, to be quantized in the segments identified in the spectroscopy series; and to present a decision-making system that selects the segments with the best assertiveness for pattern classification, through which it is possible to correlate the patterns characterized in the machine learning process to the series of spectroscopies that did not participate in the learning process, through generalization tests.

Methodology
The development of this work is related to experimental and quantitative research, whose numerical data were made available from the methodology used in the experiment developed by Silveira et al. (2012). For the construction of the computational mathematical model, hereafter referred to as the computational model, the Java programming language was used. Spectroscopy data were stored in an Excel® spreadsheet. The methodology used in the work can be described through the following sequence of procedures: a) With the availability of spectroscopies for the NORMAL, CBC and MEL classifications, the computational model searches these data according to the classification and then separates samples for characterization and samples for tests. The sampling method used is based on holdout (Faceli et al., 2019), as the training and test data do not mix; because of the availability of samples, approximately one third of the total pre-classified samples is used for training, and the rest for testing. The consequences of this type of procedure are discussed in Baranauskas and Monard (2000); b) The computational application uses the samples separated for characterization to quantize the chosen attributes, through sequential segments of 20 wavenumbers. A specific algorithm in the computational model performs a scan so that all segments have quantized spectroscopies. The same process is performed for the test samples; c) Considering that the data analyzed in this work are multivariate, that is, have more than one attribute, covariance and correlation measures would in general be used to analyze the relationship between more than two attributes (Faceli et al., 2019). As a differential, this work uses a Paraconsistent Analysis Network to calculate the evidence of similarity between the attributes of the characterization and test data, after their quantization in the previous step.
The evidence of similarity is calculated for sequences of 20 wavenumbers until the calculation has been carried out for all of them; d) Then, all calculated values of evidence of similarity are exported to an Excel® spreadsheet, indexed by the respective segments, and a decision system implemented in the spreadsheet identifies the segments where the assertiveness of correct classification between the training and test segments is greater than 75%.
Series of spectroscopies obtained through the methodology described in the work developed by Silveira et al. (2012) were used, which describes the use of Raman spectroscopy in cutaneous tissue biopsy fragments in the infrared range. The machine learning process performed by the computational model uses reference samples to characterize the CBC, MEL and NORMAL patterns, and simultaneously uses test samples to identify the segments of the wavenumber series with greater precision in identifying the patterns. The total number of samples of each type of classification was equal to the number of MEL spectroscopies available: 19. Of this total, the averages of 6 samples, approximately 31.6%, were used to characterize the reference samples, while 13 samples, approximately 68.4% of the total, were used for testing. Figure 6 is a representation of how the available samples were used to build the computational model. Research, Society and Development, v. 10, n. 13, e283101320830, 2021 (CC BY 4.0) | ISSN 2525-3409 | DOI: http://dx.doi.org/10.33448/rsd-v10i13.20830

Paraconsistent Analysis Network
For the characterization of the standards of the reference and test samples, a Paraconsistent Analysis Network is used.
The network architecture presents two sets of symmetric blocks, each one receiving reference and test pairs of 10 wavenumbers, so that the analysis carried out by the network covers a total of 20 reference and test wavenumber pairs.
Through inputs μ1 and λ1 of the Paraconsistent Analysis Network, the increasing rate of change parameter is quantified.
Via inputs μ2 and λ2, the parameter equivalent to the decreasing rate of change is quantified. Inputs μ3 and λ3 quantify the null rate of change. Inputs μ4 and λ4 quantify the value equivalent to the extraction of the contradiction. Inputs μ8 and λ8 quantify the median (Larson & Farber, 2010) of the wavenumber sequence. Output S5 is the resulting value of the increasing and decreasing rates of change, while output S6 adds the null rate of change to this analysis. Output S7 adds to the analysis the extraction of contradiction, and output S9 adds the median to the paraconsistent analysis. Through the symmetry of the Paraconsistent Analysis Network, the analyses referring to 10 spectra are inserted into the μ and λ inputs of the last block of the network. For the two analyzed segments to have equivalent weights in the last block, the input λ is inserted with its value complemented, equivalent to true evidence, that is, λ = 1 - μ. The Paraconsistent Analysis Algorithm is then executed from step 1 to step 4. The output of the Paraconsistent Analysis Network presents a value that represents the evidence of how similar a segment of 20 test wavenumbers is to one of the reference standards: CBC, MEL or NORMAL.
The Paraconsistent Analysis Network is shown in Figure 7. It is formed by blocks of paraconsistent analysis, whose algorithms were presented previously, and which allow the interconnection of the values of the numerical data attributes related to Raman spectroscopy, presenting at the output a resulting evidence equivalent to the similarity between spectroscopy segments, which allows identifying, for the pairs of training and test segments, whether they belong to the same classification.
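The chaining of blocks can be sketched as below. The pairing of the two symmetric branches in the last block, with λ entered as the complement 1 - μ of the second branch, follows the description above; the block function reproduces Algorithm 1:

```python
import math

def block(mu: float, lam: float) -> float:
    """One paraconsistent analysis block (Algorithm 1): returns mu_ER."""
    dc, dct = mu - lam, mu + lam - 1
    d = math.sqrt((1 - abs(dc)) ** 2 + dct ** 2)
    drc = (1 - d) if dc >= 0 else (d - 1)
    return (drc + 1) / 2

def final_block(mu_branch_1: float, mu_branch_2: float) -> float:
    """Combine the two symmetric branches of the network: the second branch
    enters complemented (lam = 1 - mu) so both segments have equal weight."""
    return block(mu_branch_1, 1 - mu_branch_2)
```

Two branches that each report strong similarity reinforce each other: final_block(0.9, 0.9) yields approximately 0.9, a strong resulting evidence of similarity.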

Model functionality to quantify similarities between Raman spectroscopies
The computational model starts scanning the wavenumber segments from the first and increments to the twentieth, with n1 being the identifier of the first spectrum and n20 being the identifier of the twentieth spectrum of the first segment. The second segment starts from the second spectrum n2 and has the twenty-first spectrum as its last spectrum, identified as n21.
Each available spectroscopic series consists of 1178 wavenumbers. As the algorithm scans in steps of one wavenumber, this corresponds to 1158 segments of 20 wavenumbers. Each wavenumber segment inserted in the Paraconsistent Analysis Network presents at the output an evidence of similarity between the reference standards and the analyzed test segment. This evidence is then directed to a decision-making system, which checks whether the evidence value between the test and reference spectra segments is greater than 0.75 for the standard to which the test segment belongs, CBC, MEL or NORMAL. The decision system nominally identifies all wavenumber segments whose assertiveness is greater than 75%. An overview of the computational model is shown in Figure 8. The figure represents the entire sequence of steps initially described in the methodology, that is, from reading the data in the spreadsheets, quantizing attributes and calculating the similarity between training and test spectroscopy segments, passing through the decision system implemented in the spreadsheet environment, to the identification of the wavenumber segments whose assertiveness in the classification exceeds the 75% rate.
The decision system consists of Excel® spreadsheets that receive the resulting evidence values calculated through the Paraconsistent Analysis Network. Through formulas and conditionals applied to the cells that receive the evidence values, the wavenumber segments with greater precision are identified. Figure 9 shows the spreadsheet used to identify the NORMAL classification segments. Similar spreadsheets were used for the CBC and MEL classifications. Source: Authors (2021).
The decision system represented in the figure, implemented in the spreadsheet, identifies the segments where the assertiveness of correct classification between the training and test segments is greater than 75%. The complete functionality of the computational model for identifying the wavenumber segments is described in Algorithm 4. Algorithm 4: description of the general functionality of the model - start: 1. Read reference spectroscopies and test spectroscopies; 2. Characterize and quantify the values of reference attributes through evidence; 3. Quantify the similarity between reference and test standards for spectroscopy segments; 4. Identify spectroscopy segments with similarity values between reference and test greater than 75%. Description of the general functionality of the model - end.

Results and Discussion
An evaluation of the model's performance can be done by analyzing the Resulting Degree of Evidence values at the output of the Paraconsistent Analysis Network for a spectroscopy segment. In this section, spectroscopy segments with different performance in terms of output value were selected for each type of classification, for a more comprehensive view of the model's behavior. For each classification related to the diagnoses, 6 samples were used to characterize the reference standards and 13 samples for testing. These quantities, in proportions of approximately 31.6% and 68.4% respectively, are based on the available samples; in the case of the MEL diagnosis, there were a total of 19 samples.
The results are presented in order to demonstrate the general behavior of the model as a function of the spectroscopy segments inserted in it, that is, high and low performance spectroscopy segments, for each type of classification.
Afterwards, once the model had calculated the accuracy for each spectroscopy segment, the specific spectroscopy segment that presented the best performance for the three classifications was identified. In this case, additional samples that did not participate in the machine learning process were used: 11 for the NORMAL classification and 77 samples for the CBC classification.
The model's functionality compares the Degree of Evidence resulting from the tested classification with those of the other two classifications, and the answer is the classification with the highest resulting Degree of Evidence among the three. When the highest resulting Degree of Evidence corresponds to the type of classification tested, the model's answer is correct; otherwise, the answer is considered wrong.
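The decision rule just described reduces to taking the classification with the highest Resulting Degree of Evidence; a minimal sketch, with illustrative evidence values:

```python
def classify(evidence: dict) -> str:
    """Return the classification (NORMAL, CBC or MEL) whose Resulting
    Degree of Evidence is the highest."""
    return max(evidence, key=evidence.get)

# A test sample whose CBC evidence dominates is answered as CBC
# (the numbers below are illustrative only):
answer = classify({"NORMAL": 0.41, "CBC": 0.83, "MEL": 0.56})
```

The answer is considered correct when it matches the sample's previous classification, and wrong otherwise.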

Model results for CBC classification
The values representing the precision of the Resulting Degrees of Evidence for the CBC classification test samples, as a function of the Raman wavenumbers, are shown in Graph 2. For example, for the 20-wavenumber spectroscopy segment from 827.8 to 854.4 the precision is 100%, that is, all 13 CBC test samples were classified correctly. For the segment of 20 wavenumbers from 947.4 to 972.7 the precision is 84.62%, that is, 11 of the 13 samples tested were classified correctly. It is also possible to observe that there are segments with low precision; for example, the segment that starts at wavenumber 1680.1 has an accuracy of 7.7%, that is, only one test sample was classified correctly. In the graph, these points are indicated with arrows. We can then have a complete view of the model's performance for the CBC classification over the entire range of wavenumbers.
Graph 2: accuracy of the model for the CBC classification as a function of the Raman wavenumber segments.

Source: Authors (2021).
The resulting graph of the decision system for the CBC classification brings the assertiveness rates as a function of the Raman wavenumbers, pointing to the beginning of the segments with the best and worst performance. According to the graph, the result obtained by the Paraconsistent Analysis Network for the 13 test samples previously classified as CBC was correct, as they present a higher resulting evidence value for the classification to which they belong.
Graph 4 shows the Resulting Degrees of Evidence for the segment of 20 wavenumbers from 947.4 to 972.7, with an accuracy of 84.62%, that is, 11 of the 13 CBC samples tested were classified correctly. According to the graph, the result obtained by the Paraconsistent Analysis Network for 11 test samples previously classified as CBC is correct, as they present a greater resulting evidence value for the classification to which they belong. For sample 6 the result found was the NORMAL classification, and for sample 10 the result found was MEL.
Graph 5 shows the Resulting Degrees of Evidence for the wavenumber segment from 1680.1 to 1696.3, with only one test sample, sample 4, correctly classified, an accuracy of 7.7%.
Graph 5: Resulting Degrees of Evidence for the 13 CBC test samples, compared with the NORMAL, CBC and MEL reference standards, wavenumber segment from 1680.1 to 1696.3.

Source: Authors (2021).
Therefore, this graph represents the result of a wavenumber segment with low performance: for samples 1 to 3, 5 to 7, 9, and 11 to 13 the classification result was NORMAL. For samples 8 and 10 the classification result was MEL. Only sample 4 has the correct classification, which is CBC.

Model results for NORMAL classification
Samples of test results with the NORMAL classification are shown in Figure 6. Initially, the accuracy of the model is shown for the Resulting Degrees of Evidence as a function of the wavenumber segments. Accuracy ranged from 100% to 7.7%. Next, the Resulting Degrees of Evidence are shown for the segments of 20 wavenumbers from 580.5 to 609.8, with 100% accuracy, from 474.7 to 505, with 76.92% accuracy, and for the segment of wavenumbers from 1470.1 to 1488.9, with an accuracy of 7.7%. Many segments achieved 100% accuracy. In 6b are the results of the 13 test samples for the wavenumber range starting at 580.5, and all classifications are correct. In 6c we have the results for the wavenumber segment starting at 474.7, where 10 of the 13 samples are correctly classified. Sample 3 is classified as MEL and samples 1 and 6 are classified as CBC. In 6d are the results of the wavenumber segment that starts at 1470.1, which has a low performance. Only sample 11 is classified correctly.
Sample 6 is classified as MEL and the other 11 samples were classified as CBC.

Model results for MEL classification
Samples of test results with the MEL classification are shown in Figure 7. Figure 7a shows the performance of the test samples for the MEL classification as a function of the wavenumber segments; only one segment surpassed 90% accuracy. In 7b are the results of the 13 test samples for the wavenumber range that starts at 1337.6, precisely the one that performs best in the MEL classification. In this segment, 12 samples were correctly classified as MEL and only sample 5 was classified as CBC. In 7c we have the results for the wavenumber segment starting at 1060.9, where 10 of the 13 samples are correctly classified. Sample 8 is classified as CBC and samples 6 and 9 are classified as NORMAL. In 7d are the results of a wavenumber segment with low performance. Three samples, 2, 11 and 12, are classified correctly. Samples 5 to 10 are classified as CBC, and samples 1 to 4 as well as 13 are classified as NORMAL.

Wavenumber segments with the best and worst accuracy for the three classifications
When performing the experiments to evaluate the model's performance for the NORMAL, CBC and MEL classifications, it can be seen that there is a segment, from wavenumber 1232 to 1253.8, where the model has the best accuracy considering the three classifications: 84.61% for the NORMAL classification, 92.3% for the CBC classification and 76.92% for the MEL classification. It is also possible to identify a spectroscopy wavenumber segment, from 1748.4 to 1763.7, with the worst accuracy when the three classifications are considered: 38.46% for the NORMAL classification, 15.38% for the CBC classification and 7.69% for the MEL classification. Figure 8 highlights these spectroscopic segments for the NORMAL, CBC and MEL reference samples. In the sequence of Figure 8, the Resulting Degrees of Evidence are presented for the wavenumber segment from 1232 to 1253.8, respectively for the NORMAL, CBC and MEL test samples; in the results of the MEL classification, 10 samples were correctly classified.

Tests with samples not used in the machine learning process for the most accurate wavenumber segment
Of the total available samples, 11 samples of the NORMAL classification were not used in the learning process; applying the model to them corresponds to a generalization test.

Discussion
The computational model proposed in this work achieved the objectives of identifying spectroscopy segments that could characterize patterns from previously defined attributes, some common to numerical series, such as the derivative and the median, and another quantized through the use of APL, that is, the extraction of the effects of the contradiction. The possibility of choosing the attributes that will characterize patterns can be an advantage over machine learning processes where this type of functionality is not explicit. In the specific case of the data sets used in the experiments of this work, it can be seen that in the segment with the best classification performance, from frequency 1232 onwards, the variation rates show very evident differences, which makes this attribute better suited to characterizing a pattern. On the other hand, in the spectroscopy segment with the worst performance, from frequency 1748.4, the rates of change are similar, making it difficult to characterize patterns. Figure 9 shows these details.
The use of Annotated Paraconsistent Logic allowed aggregating and quantifying evidence related to the attributes, through the use of a Paraconsistent Analysis Network, as well as quantifying the similarity between the attributes used to characterize the patterns of reference samples and tests of the types of classifications used in the process through the Resulting Degree of Evidence.
The developed model treated the Raman data without any correlation between the spectroscopy values at a given wavenumber and the respective classification of skin lesions; that is, the spectroscopies were treated only as numerical series, from which the model quantified the similarity between the reference and test samples over segments of 20 sequential spectroscopy values.
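This segment-by-segment treatment can be sketched as below. The 20-value window length comes from the text, while the non-overlapping split and the single-attribute, median-based similarity are simplifying assumptions standing in for the attribute aggregation actually performed by the Paraconsistent Analysis Network:

```python
import statistics

def split_segments(series, size=20):
    """Split a spectroscopy series into consecutive segments of `size`
    sequential values (whether the paper's segments overlap is not stated)."""
    return [series[i:i + size] for i in range(0, len(series) - size + 1, size)]

def median_similarity(ref_seg, test_seg):
    """Toy similarity in [0, 1] between two segments based on one attribute
    (the median); a stand-in for the network's evidence fusion."""
    m_ref, m_test = statistics.median(ref_seg), statistics.median(test_seg)
    denom = max(abs(m_ref), abs(m_test), 1e-12)
    return 1.0 - min(abs(m_ref - m_test) / denom, 1.0)
```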
Within the universe of samples available for the characterization of reference and test patterns, the model was able to identify Raman spectroscopy segments with a minimum accuracy of 76.92% in the segment with the best performance for the classification of test samples.
Figure 9 (Source: Authors, 2021) presents the spectroscopies as a function of wavenumber, highlighting the segments that begin at the wavenumbers with the best performance for correct classification, 1232, and with the worst performance, 1748.4. In the segment starting at 1232, significant differences can be seen in attributes such as the derivative, favoring the classifications. In the segment starting at 1748.4, the variations are similar for the three classification types, making it difficult to quantify the differences and, consequently, to carry out the classifications.

Conclusion
Through this work, it was possible to demonstrate the development of a computational model that identifies the parts of a data set that best characterize patterns based on predefined attributes. This is a differentiated approach within existing machine learning processes. First, it is possible to define attributes based on the characteristics of the data and to identify the specific parts of the set where these attributes yield the most accurate characterization and subsequent classification of patterns, anticipating performance and allowing an assessment of whether it is satisfactory for the application. It also enables the redefinition of attributes in search of an effective classification of patterns.
The use of Annotated Paraconsistent Logic brings to the model the innovation of manipulating concepts related to Non-Classical Logics, as shown here in the handling of the contradiction present in a data set. Another innovative aspect related to Paraconsistent Logic is that the Paraconsistent Analysis Network allows variations in how evidence related to the attributes is aggregated, as well as the increment or even substitution of attributes. The network architecture can be adapted according to the characteristics of the analyzed data.
As demonstrated in this work, once the parts of a data set with the best classification performance were identified, the proportion of samples needed to characterize a pattern, approximately 33% of the total, is significantly lower than in traditional machine learning processes. This is a promising feature of the model, as the amount of data used in learning becomes a variable that can be adjusted in the pattern characterization process, as can the definition of the attributes themselves.
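The reported training proportion (about one third; 31.6% in the abstract's figures) corresponds to a simple holdout-style split. For example, with 19 samples of a class this gives 6 learning and 13 test samples; the per-class count of 19 is an illustrative assumption, not a figure stated in this section. A minimal sketch:

```python
def holdout_split(samples, train_fraction=0.316):
    """Deterministic holdout-style split: roughly `train_fraction` of the
    samples for learning, the remainder for classification tests."""
    n_train = max(1, round(len(samples) * train_fraction))
    return samples[:n_train], samples[n_train:]

# Hypothetical class with 19 pre-classified samples:
learn, test = holdout_split(list(range(19)))
```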
Another capability derived from the model's functionality is the possibility of identifying, from tests between samples of the data sets, whether there are segments where the accuracy is high enough for classification.
It is concluded, then, that the proposed model brings innovations in relation to existing machine learning processes for the characterization and classification of patterns: in the possibility of defining attributes of the data sets, in the flexibility of aggregating these attributes through a network architecture, and in the possibility of adding to the process the quantization of properties that are part of the real world but are difficult to abstract, as was the case with the contradiction, manipulated by the model described here.
Future work complementary to this study can use other types of data sets. Another aspect that could be improved for stronger conclusions regarding validation is the number of samples. The results presented here were limited by the small number of pre-classified samples, especially of the MEL classification, which made a generalization test for that classification impossible. In this work, a holdout adaptation was used to sample the classification and test data; if more samples become available, cross-validation and bootstrap techniques can be applied, allowing a more comprehensive evaluation of the model.
The results obtained, with the identification of wavenumber segments that achieve better than 75% correct classification for data related to diagnoses, particularly skin lesions examined by Raman spectroscopy, allow an optimistic outlook for the study carried out and for the breadth of its application: aiding diagnosis of this and other natures, and classifying patterns in other types of numerical data by adapting the model to the respective attributes.