Computational bases study for complexes containing Cd(II) and biological evaluation in silico

Computational chemistry gained international recognition only after contributing significantly to scientific advances that were awarded Nobel Prizes. With the technological evolution of recent decades, software has been created to study, investigate and understand, at the molecular level, the chemical processes observed in experimental studies. This has made research faster and reduced the cost of laboratory work. In this work, five basis sets were studied: STO-3G, LANL2DZ, SDD, 3-21G and DGDZVP, using the GaussView 5 and Gaussian 09W software with the DFT method and the B3LYP hybrid functional. The bond distances and angles of the complex di-μ-chloro-bis[chloro(4,7-dimethyl-1,10-phenanthroline)cadmium(II)] were obtained, and the RMSD with respect to the experimental values was calculated for each basis set. A molecular docking test was performed with the geometry from each basis set to verify which one gave the best parameters. The SDD basis set presented the best results in the tests and was classified as the most suitable for studies of structures containing cadmium.


Introduction
Computational chemistry gained international recognition in 1998, when John Pople and Walter Kohn received the Nobel Prize in Chemistry for their contributions to quantum chemistry. More recently, in 2013, another Nobel Prize was awarded to Martin Karplus, Michael Levitt and Arieh Warshel for research involving applied methods of computational chemistry and the modeling of systems with several atoms (Ferreira et al., 2016).
This research field studies the various chemical processes through computer software, seeking to understand chemical properties at the molecular level, predicting and trying to explain certain behaviors observed experimentally (Vázquez et al., 2016).
With current technological innovation, software and hardware have evolved remarkably and have been integrated into the natural sciences through quantum chemical studies, making this a promising field (Raupp et al., 2008). Computational methodologies support experimental studies, providing speed and economy in laboratory processes and avoiding many hours of work and waste of materials (Lima, 2015).
In this work, the dimeric complex di-μ-chloro-bis[chloro(4,7-dimethyl-1,10-phenanthroline)cadmium(II)], a monocyclic binuclear Cd(II) complex with a center of symmetry and a distorted square-pyramidal configuration (Warad et al., 2013), was selected for investigation. An in silico chemical study was carried out with different basis sets used to optimize complexes that contain cadmium in their structure. For this, the basis sets STO-3G, LANL2DZ, SDD, 3-21G and DGDZVP were used to obtain distances, angles and infrared spectra. Molecular docking tests were also carried out to identify the best basis set for in silico research involving biological activity.

Molecular docking
The in silico biological activity was evaluated by molecular docking, using as target the bacterial enzyme of Escherichia coli, regulatory protein rop (Rop) (Amprazi et al., 2014), code 4DO2, deposited in the Protein Data Bank (PDB) (Berman et al., 2000). Binding energy, inhibition constant, number of conformations and hydrogen bonds formed were evaluated.
AutoDockTools 1.5.6 (Morris et al., 2009) was used to prepare the ligand and the macromolecule: hydrogens were added, Gasteiger partial charges were calculated and non-polar hydrogens were merged. The grid was centered on the x, y and z coordinates of the active site taken from the literature (Amprazi et al., 2014), and a cubic 60 x 60 x 60 grid box was used. AD4 parameters for the metal were used, and the search algorithm was the Lamarckian genetic algorithm (GA). The simulation comprised 100 runs with a population size of 150 and the long setting for the number of energy evaluations; the resulting conformations and clusters were then analyzed (Bastos et al., 2020; Rocha et al., 2018).
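The search settings described above can be collected in a small, hedged sketch that writes them as AutoDock 4 parameter-file keywords. The class name `DockingSettings` is hypothetical, the mapping of the "long" preset to 25,000,000 evaluations is an assumption about the AutoDockTools defaults, and the 60 x 60 x 60 box is assumed here to be expressed in grid points (`npts`).

```python
from dataclasses import dataclass

# Assumed mapping of the AutoDockTools "short/medium/long" presets to the
# maximum number of energy evaluations; verify against your installation.
EVAL_PRESETS = {"short": 250_000, "medium": 2_500_000, "long": 25_000_000}


@dataclass(frozen=True)
class DockingSettings:
    """Hypothetical container for the Lamarckian GA settings in the text."""
    runs: int = 100                   # number of independent GA runs
    population: int = 150             # GA population size
    evals: str = "long"               # preset for energy evaluations
    grid_points: tuple = (60, 60, 60) # assumed to be grid points, not a length

    def dpf_lines(self):
        """Docking parameter file (DPF) lines for the GA search."""
        return [
            f"ga_pop_size {self.population}",
            f"ga_num_evals {EVAL_PRESETS[self.evals]}",
            f"ga_run {self.runs}",
        ]

    def gpf_lines(self):
        """Grid parameter file (GPF) line for the box size."""
        nx, ny, nz = self.grid_points
        return [f"npts {nx} {ny} {nz}"]


if __name__ == "__main__":
    settings = DockingSettings()
    print("\n".join(settings.gpf_lines() + settings.dpf_lines()))
```

These lines are only fragments; a complete GPF or DPF also needs the receptor and ligand files, the grid center and the atom-type maps, which are not specified in the text.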

Computational methods
Computational chemistry calculations are applied to the structure of matter (atoms and molecules), focusing mainly on molecular orbitals, in which each electron is treated as moving independently in the field of the others and orbitals may be occupied or empty (Alcácer, 2012). The periodic table is rationalized through the effective field, in which electrons are subject to an approximately central field, whereas Hartree's theory does not take into account the differences between electrons (Braga, 2007). The Hartree-Fock (HF) method is the most commonly used; it provides solutions for many-electron systems and improves on semi-empirical calculations, which have a lower computational cost. It also serves as a preliminary step for more advanced calculations that require more powerful computers (Morgon & Coutinho, 2007).
(1)
Where:
(2)
The Hamiltonian operator (H), when expressed in atomic units (Bohr radius for length and Hartree for energy), can be rewritten as:
(3)
The Density Functional Theory (DFT) method gives results closer to the experimental values than HF and is the most widely used today. Its distinguishing advantage is that it handles systems with more than 20 atoms (Morgon & Coutinho, 2007). The theory works with the total energy of the system, including exchange and correlation, expressed in terms of the ground-state electron density.
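Returning to the Hamiltonian discussed for the Hartree-Fock method above, a hedged illustration of its atomic-unit form is the standard textbook electronic Hamiltonian within the Born-Oppenheimer approximation. The indices used here ($i, j$ for the $N$ electrons, $A$ for the $M$ nuclei of charge $Z_A$) are assumptions of this sketch and are not necessarily the authors' notation.

```latex
\hat{H}\,\Psi = E\,\Psi,
\qquad
\hat{H}_{\mathrm{el}} =
  -\frac{1}{2}\sum_{i=1}^{N}\nabla_i^{2}
  - \sum_{i=1}^{N}\sum_{A=1}^{M}\frac{Z_A}{r_{iA}}
  + \sum_{i=1}^{N}\sum_{j>i}^{N}\frac{1}{r_{ij}}
```

The three terms are the electron kinetic energy, the electron-nucleus attraction and the electron-electron repulsion; in atomic units the factors $\hbar$, $m_e$, $e$ and $4\pi\varepsilon_0$ all equal one, which is why they do not appear.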
To solve n-electron problems, the Kohn-Sham (KS) equation is used; the KS operator plays a role similar to that of the Fock operator and defines the KS spin-orbitals.

Kohn-Sham equation:
(4)
In this way, it is possible to calculate other important quantities in addition to the total energy, for example the ionization energy and the equilibrium configuration. The total energy uses expressions similar to those of HF, but written in terms of the electron density, which is the fundamental variable of DFT.
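Hartree-Fock total energy:
$$E_{HF} = V + \langle hP \rangle + \tfrac{1}{2}\langle PJ(P) \rangle - \tfrac{1}{2}\langle PK(P) \rangle$$
Density Functional Theory total energy:
$$E_{KS} = V + \langle hP \rangle + \tfrac{1}{2}\langle PJ(P) \rangle + E_X[P] + E_C[P]$$
Here $V$ is the nuclear repulsion energy, $P$ the density matrix, $\langle hP \rangle$ the one-electron energy, $\tfrac{1}{2}\langle PJ(P) \rangle$ the classical Coulomb repulsion of the electrons and $-\tfrac{1}{2}\langle PK(P) \rangle$ the exchange energy. The terms $E_X[P]$ and $E_C[P]$ are the exchange and correlation functionals; the correlation term is omitted in HF theory.
Thus, HF is a particular case of DFT, in which $E_X[P] = -\tfrac{1}{2}\langle PK(P) \rangle$ and $E_C[P] = 0$.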
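For reference, the Kohn-Sham equation mentioned earlier in this section is commonly written in the following textbook form. This is a sketch in atomic units; the symbols $v_{\mathrm{eff}}$, $v_{\mathrm{ext}}$ and $v_{\mathrm{xc}}$ are assumptions of this illustration rather than the authors' notation.

```latex
\left[-\frac{1}{2}\nabla^{2} + v_{\mathrm{eff}}(\mathbf{r})\right]\psi_i(\mathbf{r})
  = \varepsilon_i\,\psi_i(\mathbf{r}),
\qquad
v_{\mathrm{eff}}(\mathbf{r}) = v_{\mathrm{ext}}(\mathbf{r})
  + \int \frac{\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}'
  + v_{\mathrm{xc}}(\mathbf{r}),
\qquad
\rho(\mathbf{r}) = \sum_{i}^{\mathrm{occ}} \left|\psi_i(\mathbf{r})\right|^{2}
```

Because $v_{\mathrm{eff}}$ depends on the density $\rho$ built from the orbitals $\psi_i$, the equations are solved self-consistently, in the same iterative way as the Hartree-Fock equations.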
Several recent works describing the biological activity and characterization of new chemical structures have performed computational calculations using this method (Casella et al., 2016; Horchani et al., 2020; Kukovec et al., 2011; Rocha et al., 2018).

Computational basis set
Research, Society and Development, v. 10, n. 1, e45110111966, 2021 (CC BY 4.0) | ISSN 2525-3409 | DOI: http://dx.doi.org/10.33448/rsd-v10i1.11966
In this work, five basis sets were adopted: STO-3G, 3-21G, LANL2DZ, DGDZVP and SDD. The first is applied to molecules that contain second-row atoms (Na-Ar). This basis set represents each Slater-type orbital by three Gaussian functions (minimal s-p, extended s-p and s-p for second-row atoms) and therefore does not include the d functions, which play an important role for second-row elements (Collins et al., 1976).
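The idea of representing a Slater-type orbital by three Gaussian functions can be illustrated with a minimal sketch. It uses the hydrogen 1s function rather than cadmium, with the standard published STO-3G exponents and contraction coefficients for hydrogen (Slater exponent 1.24); the function names are hypothetical and chosen only for this example.

```python
import math

# STO-3G exponents and contraction coefficients for the hydrogen 1s orbital
# (standard published values; hydrogen is used only as a simple illustration).
EXPONENTS = (3.42525091, 0.62391373, 0.16885540)
COEFFS = (0.15432897, 0.53532814, 0.44463454)


def primitive_overlap(a, b):
    """Overlap of two normalized s-type Gaussians on the same center."""
    return (2.0 * math.sqrt(a * b) / (a + b)) ** 1.5


def contracted_norm(exponents, coeffs):
    """Self-overlap of the contracted Gaussian function (1.0 if normalized)."""
    return sum(
        ci * cj * primitive_overlap(ai, aj)
        for ai, ci in zip(exponents, coeffs)
        for aj, cj in zip(exponents, coeffs)
    )


def sto3g_1s(r):
    """Value of the three-Gaussian contraction at distance r (bohr)."""
    return sum(
        c * (2.0 * a / math.pi) ** 0.75 * math.exp(-a * r * r)
        for a, c in zip(EXPONENTS, COEFFS)
    )


def slater_1s(r, zeta=1.24):
    """Normalized Slater-type 1s orbital that the contraction approximates."""
    return math.sqrt(zeta ** 3 / math.pi) * math.exp(-zeta * r)


if __name__ == "__main__":
    print(f"norm of contraction: {contracted_norm(EXPONENTS, COEFFS):.6f}")
    for r in (0.5, 1.0, 2.0):
        print(f"r = {r:.1f}  STO-3G = {sto3g_1s(r):.4f}  Slater = {slater_1s(r):.4f}")
```

The contraction follows the Slater function closely at intermediate distances but cannot reproduce its cusp at the nucleus, which is one reason minimal basis sets of this kind give the least accurate geometries.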
The 3-21G basis set was developed as a minimally extended basis for the second-row transition metals, following the earlier version that covered the first-row atoms. The orbitals assigned to it are of s and p type, adjusted to the quantum number n. Three new Gaussian expansions, for the 1s, 3d and 4d atomic orbitals, were used; their performance was described through geometry and frequency calculations, and the valence space comprises the 4d-5s functions (Dobbs & Hehre, 1986).
Los Alamos National Laboratory 2 Double Zeta (LANL2DZ) is applicable to the elements of the third row (K-Cu), fourth row (Rb-Ag) and fifth row (Cs-Au), covering the outermost ns2 np6 orbitals. This basis set was developed to provide Gaussian valence orbitals up to the 6p orbitals of these rows, with effective potentials for the electrons obtained from ab initio calculations (Hay et al., 1985).
DGDZVP is used for molecules that contain transition metals together with halogens, oxygen, carbonyl, nitrosyl and other substituents, in order to obtain geometries and vibrational frequencies. With this basis set, satisfactory agreement with experimental values was observed, and it is considered superior when compared with Hartree-Fock results (Sosa et al., 1992).
The most recent basis set among those used in chemical calculation software is SDD. It has given good results for structures containing lanthanides, with larger valence Gaussian basis sets contracted as (14s13p10d8f6g)/[10s8p5d4f3g], and is accepted for chemical investigations of structures with elements of this group (Cao & Dolg, 2002).
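A comparison of the five basis sets can be set up by generating one Gaussian route line per basis set. The sketch below is a minimal illustration, assuming a geometry optimization followed by a frequency calculation (`Opt Freq`) with the B3LYP functional; the helper names are hypothetical, and the charge, multiplicity and coordinates of the complex are not given in the text and are therefore left out.

```python
# Basis set keywords as written in Gaussian input files.
BASIS_SETS = ("STO-3G", "3-21G", "LANL2DZ", "DGDZVP", "SDD")


def route_section(basis, functional="B3LYP", jobs=("Opt", "Freq")):
    """Build a Gaussian route line such as '# B3LYP/SDD Opt Freq'.

    The job types are an assumption of this sketch: 'Opt' for the geometry
    and 'Freq' for the vibrational (infrared) frequencies.
    """
    return f"# {functional}/{basis} " + " ".join(jobs)


def all_routes(basis_sets=BASIS_SETS):
    """Map every basis set to its route line."""
    return {basis: route_section(basis) for basis in basis_sets}


if __name__ == "__main__":
    for basis, route in all_routes().items():
        print(f"{basis:8s} -> {route}")
```

Keeping the functional and job types fixed while varying only the basis set means that differences in the optimized geometries can be attributed to the basis set alone.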

Computational evaluation
The distances between the most important atoms in the cadmium complex are shown in Table 1. The main ones are those between the cadmium and chlorine atoms located at the center of symmetry of the structure. The experimental distances reported by Warad et al. (2013) are Cd21-Cl22 (2.410 Å), Cd25-Cl46 (2.410 Å) and Cd21-Cl23 (2.627 Å). The calculated values closest to these were obtained with the LANL2DZ basis set, namely 2.447 Å, 2.477 Å and 2.699 Å, respectively, better than DGDZVP by 0.011 Å on average and than SDD by 0.047 Å on average. The second closest basis set to the experimental values was DGDZVP, followed by SDD, STO-3G and 3-21G. Lighter basis sets, with lower computational cost, gave values farther from experiment. The SDD basis set gave the best results for the Cd21-N20 (2.370 Å), Cd21-N19 (2.366 Å), N44-C27 (1.372 Å) and N19-C3 (1.369 Å) bonds, better than LANL2DZ by 0.167 Å on average and than DGDZVP by 0.189 Å on average. Overall, LANL2DZ gave the best value for 7 distances, 3-21G for 7, SDD for 4 and STO-3G for 1, while DGDZVP did not give the best value for any of the distances.
The bond angles showed considerable divergence between experimental and computed values. This is due to crystal packing, which influences bond distances and angles (Hao et al., 2005; Lin et al., 2009; Steed & Steed, 2015), and it occurs, for example, at the Cl22-Cd21-Cl23 angle (120.7°). This divergence is also notable in other theoretical works on structures containing cadmium (Casella et al., 2016; Kukovec et al., 2011; Machura et al., 2012). The LANL2DZ basis set gave the closest values for the N45-Cd25-Cl46 (114.4°) and Cl46-Cd25-Cl24 (116.9°) angles, with differences of 0.3° and 8.6° relative to DGDZVP and of 0.8° and 1.3° relative to SDD, respectively. SDD was better for the N20-Cd21-N19 (73.0°) and N44-Cd25-Cl46 (108.3°) angles, with differences of 2.8° and 0.8° relative to LANL2DZ and of 3.1° and 0.3° relative to DGDZVP, respectively (Table 2). The basis set with the most values close to the experimental angles was 3-21G, with a total of 4 very close values. DGDZVP had the fewest, being acceptable only for the N20-Cd21-Cl22 angle (114.9°).
The RMSD values calculated for the deviation between the experimental results and each basis set can be seen in Figure 2; they were obtained with the formula below.
Root mean square deviation:
$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i^{\mathrm{calc}} - x_i^{\mathrm{exp}}\right)^{2}}$$
where $N$ is the number of parameters compared and $x_i^{\mathrm{calc}}$ and $x_i^{\mathrm{exp}}$ are the calculated and experimental values.
The SDD basis set showed a deviation of 0.05 Å in the distances, the value closest to zero, and is therefore the most accepted with reference to the experimental distances from the crystallographic data. The next closest to zero were 3-21G (0.06 Å) and LANL2DZ (0.06 Å), and the largest deviation occurred with STO-3G (0.11 Å). A satisfactory RMSD must be 2 Å or less (Ramírez & Caballero, 2018); in this study all the basis sets were satisfactory, and the closer the deviation is to zero, the better the result.
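The RMSD calculation can be sketched in a few lines. The example below applies it only to the three Cd-Cl distances quoted in the text for the LANL2DZ basis set, so its result is an illustration and not the value reported in Figure 2, which covers the full set of distances; the function name `rmsd` is chosen for this example.

```python
import math


def rmsd(calculated, experimental):
    """Root mean square deviation between two equally long sequences."""
    if len(calculated) != len(experimental) or not calculated:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(c - e) ** 2 for c, e in zip(calculated, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Cd-Cl distances in Å quoted in the text, in the order
# Cd21-Cl22, Cd25-Cl46, Cd21-Cl23.
EXPERIMENTAL = [2.410, 2.410, 2.627]
LANL2DZ = [2.447, 2.477, 2.699]

if __name__ == "__main__":
    print(f"RMSD (three Cd-Cl distances): {rmsd(LANL2DZ, EXPERIMENTAL):.3f} Å")
```

Because the deviations are squared before averaging, a single poorly reproduced bond weighs more heavily than several small errors, which is why the ranking by RMSD can differ from a count of best individual values.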

Biological evaluation
The biological evaluation was carried out by molecular docking, which uses the chemical structure as a ligand to interact with enzymes and yields the parameters of biological activity (Table 3). In this process, the main observations were the number of conformations in the first cluster, out of 100 independent runs in the simulation, and the hydrogen bonds, which are extremely important to keep the complex stable in the system.
In this type of in silico study, the values of binding energy and inhibition constant are fundamental, but variations were expected because of the different optimization calculations. The AutoDock software uses a Gibbs free energy equation to determine the binding energy of the complex with the biological target.
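As a hedged illustration of the free energy expression that AutoDock 4 evaluates, the semi-empirical force field described in the AutoDock literature has the following general form; this is a sketch of the published scoring function, not necessarily the exact expression used by the authors.

```latex
\begin{aligned}
\Delta G &= \left(V^{L-L}_{\mathrm{bound}} - V^{L-L}_{\mathrm{unbound}}\right)
          + \left(V^{P-P}_{\mathrm{bound}} - V^{P-P}_{\mathrm{unbound}}\right)
          + \left(V^{P-L}_{\mathrm{bound}} - V^{P-L}_{\mathrm{unbound}} + \Delta S_{\mathrm{conf}}\right) \\
V &= W_{\mathrm{vdw}} \sum_{i,j}\left(\frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}}\right)
   + W_{\mathrm{hbond}} \sum_{i,j} E(t)\left(\frac{C_{ij}}{r_{ij}^{12}} - \frac{D_{ij}}{r_{ij}^{10}}\right)
   + W_{\mathrm{elec}} \sum_{i,j} \frac{q_i q_j}{\varepsilon(r_{ij})\, r_{ij}}
   + W_{\mathrm{sol}} \sum_{i,j} \left(S_i V_j + S_j V_i\right) e^{-r_{ij}^{2}/2\sigma^{2}}
\end{aligned}
```

Here $L$ denotes the ligand and $P$ the protein, the $W$ factors are empirically calibrated weights, and the four pairwise terms account for dispersion/repulsion, hydrogen bonding, electrostatics and desolvation, respectively.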
All calculations placed 100% of the runs in the first cluster, showing how readily the complex binds in that region of the enzyme. Hydrogen bonds were formed with the amino acid residue ASN27, except in the calculations using the STO-3G basis set, which not only lacked this type of interaction but also diverged in the hydrophobic interactions, although it had the best interaction energy, -5.29 kcal mol⁻¹.
The SDD basis set showed a binding energy of -4.87 kcal mol⁻¹ and an inhibition constant of 269.94 µM; it interacted with 7 hydrophobic residues and formed the hydrogen bond with the ASN27 residue (Figure 4).
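The relation between the binding energy and the inhibition constant quoted above can be checked with a short sketch based on the thermodynamic relation Ki = exp(ΔG / RT). The temperature of 298.15 K and the function name are assumptions of this example; the text does not state the temperature used.

```python
import math

GAS_CONSTANT = 1.98719e-3  # kcal mol^-1 K^-1
TEMPERATURE = 298.15       # K, assumed; not stated in the text


def inhibition_constant_uM(binding_energy, temperature=TEMPERATURE):
    """Estimate Ki in micromolar from a binding free energy in kcal/mol."""
    ki_molar = math.exp(binding_energy / (GAS_CONSTANT * temperature))
    return ki_molar * 1e6


if __name__ == "__main__":
    for basis, energy in (("SDD", -4.87), ("STO-3G", -5.29)):
        print(f"{basis:7s} dG = {energy:.2f} kcal/mol  Ki ~ {inhibition_constant_uM(energy):.0f} uM")
```

For the SDD binding energy of -4.87 kcal mol⁻¹ this gives roughly 269 µM, consistent with the reported 269.94 µM once the rounding of the energy to two decimals is taken into account.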

Conclusion
The methods adopted to evaluate basis sets for complexes containing cadmium gave good results. The STO-3G basis set was discarded for this type of work because of the divergences between its infrared spectrum and the experimental one and because no hydrogen bonds were formed in the molecular docking evaluation, the latter being of paramount importance for the stability of the complex.
Although the 3-21G basis set gave a satisfactory RMSD value for distances, theoretical studies state that it was developed for the second-row transition metals, not including cadmium. The three basis sets in this study that support cadmium are DGDZVP, LANL2DZ and SDD. The last of these gave the best RMSD value for distances (0.05 Å). DGDZVP gave the worst individual results for angles and distances for this complex, so it is not a good basis set for molecules that contain cadmium.
It is concluded, therefore, that the best basis set for computational studies of structures containing cadmium is SDD; its spectrum satisfactorily described the important peaks of the complex, satisfying the conditions for the study. The second-best basis set that can be adopted is Los Alamos National Laboratory 2 Double Zeta (LANL2DZ).