In silico analysis of the protein-protein interaction of the SARS-CoV-2 spike protein

The pandemic caused by the SARS-CoV-2 virus has been a global challenge with a significant impact on public health since its emergence in 2019. Understanding the interactions between this virus and the human immune system is essential for developing new strategies, more effective therapies and better diagnostics. This study aimed to predict SARS-CoV-2 epitopes for T and B cells and to evaluate the interaction of the spike protein with other viral proteins using bioinformatics methods. SARS-CoV-2 protein sequences were collected from UniProt. T cell epitopes were predicted in silico using specific HLA alleles from the Bahia population. B cell epitopes were predicted using the IEDB server with multiple methods based on amino acid features. Protein-protein interactions were analyzed using the STRING database. The predictions yielded 10,671 peptides related to various SARS-CoV-2 proteins, including the spike protein, which is essential for infection and for the pathogenesis of COVID-19. In addition to spike, proteins such as ORF3a, ORF7a, and ORF8 showed significant immunogenic potential. Protein-protein interaction analysis indicated that proteases such as TMPRSS2 and TMPRSS11D are crucial for viral entry and are potential therapeutic targets. This study expands the understanding of the molecular interactions of SARS-CoV-2, highlighting new therapeutic targets and clinical complications associated with COVID-19. The results provide valuable insights for the


Introduction
The SARS-CoV-2 virus was responsible for the global COVID-19 pandemic and, to this day, represents a major challenge for public health (Bar-On et al., 2020; Díaz-Castrillón & Toro-Montoya, 2020; Bussani et al., 2023). Since the first cases were reported in 2019, the scientific community has been searching for effective strategies to identify, control and combat the spread of the virus (Srivastana & Saxena, 2020; Kampf et al., 2020; Solanki et al., 2023). In this scenario, understanding the interactions between SARS-CoV-2 and its host's immune system is essential for the development of targeted therapies and more effective diagnostic tests (Díaz-Castrillón & Toro-Montoya, 2020; Eckerle & Meyer, 2020; Majra et al., 2021).
A fundamental aspect of virus-host interaction is the identification of specific segments of viral proteins that the human immune system is capable of recognizing (Zhang et al., 2020; Sohail et al., 2021). The in silico identification of epitopes for T and B cells has become a valuable strategy in the search for possible therapeutic targets and the development of increasingly efficient vaccines (Shang et al., 2020; Pan et al., 2021; Sohail et al., 2021; Chakraborty et al., 2021; Mukherjee et al., 2020).
Understanding how these epitopes stimulate immune responses can provide fundamental insights for developing immunization strategies (Yarmarkovich et al., 2020; Grifoni et al., 2021; Shomuradova et al., 2020; Devi et al., 2021; Fathollahi et al., 2023; Weingarten-Gabbay et al., 2024). Furthermore, protein-protein interactions play a crucial role in the pathogenesis of SARS-CoV-2. The spike (S) protein is particularly relevant because it mediates the entry of the virus into host cells. Understanding how spike interacts with other proteins of the virus itself, and the consequences for host cells, is essential to identify new therapeutic targets and the mechanisms used by the virus after infection (Arabi-Jeshvaghani et al., 2023; Navish & Uthayakumar et al., 2023; Kumar et al., 2023; Ozger, 2023; Zhou et al., 2023).
Advances in understanding virus-host interactions and predicting epitopes can lead to the development of new therapies and the improvement of existing ones. Directing these strategies to specific targets can boost the effectiveness of treatments and improve the sensitivity and specificity of tests. In this context, this article aims to predict epitopes for T and B cells and to evaluate the interaction of the spike protein with other SARS-CoV-2 proteins.
To that end, we apply bioinformatics methods to genomic and proteomic data to predict specific SARS-CoV-2 epitopes recognized by T and B cells. We also investigate in detail the interaction of the spike protein with other viral proteins, with a focus on identifying potential therapeutic targets and understanding the molecular mechanisms underlying viral pathogenesis.

Collection of protein sequences
The complete SARS-CoV-2 proteome was obtained from UniProt (ID: 2697049) via the UniProt online platform (https://www.uniprot.org/). The proteome includes all proteins encoded by the viral genome, which are essential for viral replication and interaction.
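As an illustration of how a downloaded proteome file might be loaded for the later prediction steps, the sketch below parses FASTA text into a dictionary keyed by accession. It is a minimal example and not the procedure used in this study: the function name `parse_fasta` is hypothetical, the header layout assumed is UniProt's `>db|accession|entry_name description` style, and the sequences in the toy input are truncated placeholders.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict {accession: sequence}."""
    records = {}
    accession = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Assumed UniProt-style header: ">sp|P0DTC2|SPIKE_SARS2 ..."
            parts = line[1:].split("|")
            if len(parts) >= 3:
                accession = parts[1]
            else:
                # Fallback: use the first whitespace-delimited token
                accession = line[1:].split()[0]
            records[accession] = []
        elif accession is not None:
            # Sequence lines may be wrapped; collect them for joining
            records[accession].append(line)
    return {acc: "".join(chunks) for acc, chunks in records.items()}


# Toy input: the sequences are truncated placeholders, not full proteins.
example = """>sp|P0DTC2|SPIKE_SARS2 Spike glycoprotein
MFVFLVLLPL
VSSQCVNLTT
>sp|P0DTC9|NCAP_SARS2 Nucleoprotein
MSDNGPQNQR
"""
proteome = parse_fasta(example)
```

Keying the records by accession makes it straightforward to submit each protein separately to the prediction tools and to trace every predicted peptide back to its source protein.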

Allele selection
Human Leukocyte Antigen (HLA) alleles were selected using the Allele Frequencies database

In silico epitope mapping for T cells
The SARS-CoV-2 proteome was subjected to in silico epitope mapping for T cells using the IEDB platform (Immune Epitope Database; http://tools.iedb.org/main/tcell/). This tool identifies amino acid sequences with the potential to bind to class I Major Histocompatibility Complex (MHC) molecules. Each viral protein was evaluated for its affinity for HLA alleles selected from the Bahia population. Epitopes with 100% affinity were identified and manually filtered to ensure accurate results.
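The mapping step above can be pictured as scanning each protein with a fixed-length window and then keeping only the peptides that a predictor scores well for a given allele. The sketch below shows that idea under stated assumptions: the function names, the 9-residue window, the dictionary keys and the percentile-rank cutoff are all hypothetical choices for illustration (IEDB class I tools report a percentile rank in which lower values indicate stronger predicted binding), and they do not reproduce the selection criterion used in this study.

```python
def sliding_peptides(sequence, length=9):
    """Yield (start, peptide) for each window; start is 1-based."""
    for i in range(len(sequence) - length + 1):
        yield i + 1, sequence[i:i + length]


def keep_strong_binders(predictions, max_rank=1.0):
    """Keep predictions whose percentile rank is at or below max_rank.

    `predictions` is a list of dicts with the hypothetical keys
    'allele', 'peptide' and 'percentile_rank' (lower = stronger binder).
    The cutoff is an illustrative assumption, not the study's criterion.
    """
    return [p for p in predictions if p["percentile_rank"] <= max_rank]


# Toy usage: a 12-residue fragment gives four overlapping 9-mers.
windows = list(sliding_peptides("MFVFLVLLPLVS", length=9))

# Toy prediction records with made-up ranks, for demonstration only.
toy_predictions = [
    {"allele": "HLA-A*02:01", "peptide": "MFVFLVLLP", "percentile_rank": 0.4},
    {"allele": "HLA-A*02:01", "peptide": "FVFLVLLPL", "percentile_rank": 3.2},
]
strong = keep_strong_binders(toy_predictions, max_rank=1.0)
```

Keeping the 1-based start position alongside each peptide makes it possible to map a selected epitope back to its location in the source protein during manual review.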

In silico epitope mapping for B cells
The proteome was submitted to mapping on the IEDB platform (http://tools.iedb.org/main). The server offers a collection of methods to predict linear B cell epitopes based on amino acid characteristics. The methods chosen were: BepiPred-2.0, which predicts B cell epitopes from the input protein sequence, with amino acid residues scoring above the default threshold (0.5) predicted to be part of an epitope (Jespersen et al., 2017); the evaluation of β-sheets for the prediction of antibody epitopes (Pellequer et al., 1993; Chou & Fasman, 1978); the surface accessibility method, developed from a surface accessibility scale (Emini et al., 1985); the flexibility scale, based on the mobility of protein segments (Karplus & Schulz, 1985); the antigenicity scale, a semi-empirical method that uses physicochemical properties of amino acid residues and their frequency of occurrence in experimentally known segmental epitopes (Kolaskar & Tongaonkar, 1990); and hydrophilicity prediction, which builds a hydrophilicity scale from peptide retention times during high-performance liquid chromatography (HPLC) (Parker & Hodges, 1986).
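The thresholding rule described for BepiPred-2.0 (residues scoring above 0.5 are predicted to be part of an epitope) can be turned into candidate linear epitopes by grouping consecutive above-threshold residues. The sketch below illustrates this; the function name `epitope_segments`, the `min_length` parameter and the toy scores are assumptions for illustration and are not output of the IEDB server.

```python
def epitope_segments(sequence, scores, threshold=0.5, min_length=1):
    """Return (start, end, peptide) for runs of residues scoring above threshold.

    Positions are 1-based and inclusive. `scores` holds one value per residue.
    """
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")
    segments = []
    start = None  # 0-based index where the current run began
    for i, score in enumerate(scores):
        if score > threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run ended at residue i - 1; keep it if long enough
            if i - start >= min_length:
                segments.append((start + 1, i, sequence[start:i]))
            start = None
    # Close a run that extends to the end of the sequence
    if start is not None and len(scores) - start >= min_length:
        segments.append((start + 1, len(scores), sequence[start:]))
    return segments


# Toy per-residue scores (made-up values, for demonstration only).
toy_sequence = "MKTAYIAK"
toy_scores = [0.2, 0.6, 0.7, 0.4, 0.55, 0.9, 0.8, 0.1]
segments = epitope_segments(toy_sequence, toy_scores, threshold=0.5)
```

A `min_length` greater than 1 can be used to discard very short runs, since isolated residues above the threshold are unlikely to represent a usable linear epitope on their own.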

Analysis of protein-protein interactivity
To investigate interactions between SARS-CoV-2 viral proteins, the STRING tool (Search Tool for the Retrieval of Interacting Genes/Proteins; https://string-db.org/) was used. The STRING network provides a visualization of the molecular interactions between different proteins, based on experimental and predicted data from diverse sources. High-confidence parameters (minimum score of 0.700) were used to support the accuracy and relevance of the interactions identified.
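The effect of the high-confidence cutoff can be illustrated with a small filter over an edge list. The sketch below is a simplified illustration and not the STRING interface itself: the function names are hypothetical, edges are assumed to be (protein A, protein B, score) tuples with scores on a 0 to 1 scale, and the node names and scores are placeholders rather than results.

```python
def high_confidence_edges(edges, min_score=0.700):
    """Keep edges whose combined score is at least min_score."""
    return [(a, b, s) for a, b, s in edges if s >= min_score]


def neighbours(edges, protein):
    """Return the set of proteins directly connected to `protein`."""
    linked = set()
    for a, b, _ in edges:
        if a == protein:
            linked.add(b)
        elif b == protein:
            linked.add(a)
    return linked


# Placeholder network: names and scores are invented for demonstration.
toy_edges = [
    ("P1", "P2", 0.95),
    ("P1", "P3", 0.65),
    ("P2", "P4", 0.70),
]
kept = high_confidence_edges(toy_edges, min_score=0.700)
direct_partners = neighbours(kept, "P1")
```

Applying the cutoff before collecting neighbours means that only interactions at or above the chosen confidence level count as direct partners of the protein of interest.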

Results and Discussion
The T and B cell predictions resulted in 10,671 peptides (these epitopes were selected through filtering in an Excel spreadsheet) that are related to different protein fragments of SARS-CoV-2 involved in both the infection process and the clinical picture of COVID-19. In our study we observed peptides derived from across the entire SARS-CoV-2 genome. In addition to the S protein, other proteins drew attention because they have affinity for human HLA. Evaluating the function of the proteins in which these sequences are found can provide insights for new research. Table 1 shows the proteins that contain these peptides and the functions they perform in viral infection.
Table 1 - Function of proteins containing epitopes.

Protein | Function
Replicase Polyprotein 1a and Replicase Polyprotein 1ab | Essential components in the virus replication process. They are translated into several enzymes that have essential functions in the replication of viral genetic material.
Spike | Responsible for virus entry into host cells through interaction with the ACE2 receptor on human cells.
ORF3a, ORF6, ORF7a, ORF7b, ORF8, ORF9b, ORF9c, ORF10 | They refer to open reading frames, specific sequences of RNA that can be translated into proteins. Each sequence encodes a specific SARS-CoV-2 protein, with functions that may involve modulating the immune response and regulating gene expression.
Putative ORF3d protein | Function not fully described.
Nucleoprotein | It acts in the formation of the protein capsule that surrounds the viral genetic material. Its role is fundamental in maintaining the integrity of the viral genome.
Source: The authors.
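The spreadsheet filtering mentioned in the results, and the grouping of peptides by source protein summarized in Table 1, could equally be done in a script. The sketch below is one possible approach and not the authors' procedure: the function name, the (protein, peptide) row layout and the toy rows are assumptions for illustration.

```python
from collections import defaultdict


def unique_peptides_by_protein(rows):
    """Group peptides by protein, removing duplicates.

    `rows` is an iterable of (protein, peptide) pairs. Peptides are
    normalized (whitespace stripped, upper-cased) before comparison.
    Returns {protein: sorted list of unique peptides}.
    """
    seen = defaultdict(set)
    for protein, peptide in rows:
        seen[protein].add(peptide.strip().upper())
    return {protein: sorted(peptides) for protein, peptides in seen.items()}


# Toy rows with invented peptides, for demonstration only.
toy_rows = [
    ("Spike", "FVFLVLLPL"),
    ("Spike", "fvflvllpl "),   # duplicate after normalization
    ("Spike", "VLLPLVSSQ"),
    ("ORF8", "MKFLVFLGI"),
]
grouped = unique_peptides_by_protein(toy_rows)
counts = {protein: len(peptides) for protein, peptides in grouped.items()}
```

Normalizing before deduplication avoids counting the same peptide twice when it was exported with different capitalization or trailing spaces.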
This information and analysis allow us to identify potential targets for therapeutic interventions, along with information on protein-protein interactions. In our study, we searched for information on how the S protein interacts with other proteins in order to provide some possible answers on how the pathology and infectious process take place. [...] between the viral envelope and the cell membrane through ACE2. In a second pathway, virions are directed to an endosome and the S2 subunit is then cleaved by the lysosomal proteases cathepsin B and L, which are responsible for remodeling the cellular matrix and producing peptide neurotransmitters (Bugge et al., 2009; Hook et al., 2012; Turk et al., 2000).
TMPRSS2 is a well-characterized human type II transmembrane serine protease (TTSP). Membrane-anchored serine proteases are classified into three types according to their membrane-anchoring domain: type I has a carboxy-terminal transmembrane domain; type II has an amino-terminal transmembrane domain; and type III has a glycosyl-phosphatidylinositol domain, a membrane-anchoring portion (Tharapell et al., 2020; Kishimoto et al., 2021). Type II comprises 20 proteases divided among the families hepsin/transmembrane protease/serine (TMPRSS), human airway trypsin-like protease (HAT)/differentially expressed in squamous cell carcinoma (DESC), matriptase and corin (Tharapell et al., 2020; Kishimoto et al., 2021; Szabo & Bugge, 2007).
Our in silico analyses suggest that a direct interaction occurs among TMPRSS11D (which belongs to the HAT/DESC family), ACE2 and TMPRSS2, as shown in Figure 1.
TMPRSS11D is co-expressed with ACE2 in bronchial epithelial cells and pneumocytes, together with TMPRSS2, which cleaves the S protein into S1 and S2 and thereby increases the efficiency of viral entry into the host cell (Shulla et al., 2011; Bittmann et al., 2020; Kishimoto et al., 2021). TMPRSS2 and TMPRSS11D emerge as promising therapeutic targets: if their activity is inhibited, effective strategies for preventing viral infection and targeted therapies can be devised.
In contrast, our analysis suggests that proteins such as KCND2 (voltage-gated potassium channel) and TMEM64 (transmembrane protein 64) have distinct functions and are not directly involved in viral entry. KCND2 belongs to the voltage-regulated potassium channels (VGPCs). Mutations in this gene are related to arrhythmias and sudden cardiac death, which are frequently reported after SARS-CoV-2 infection (Dehghani-Samani et al., 2019; Sasha et al., 2022; Huseynov et al., 2023).
TMEM64 is a regulator of Ca2+ signaling pathways mediated by the receptor activator of NF-κB ligand (RANKL), which is involved in the regulation of osteoclast differentiation. This suggests that the virus may affect this regulation and cause arthralgia in patients with COVID-19 (Kim et al., 2013; Disser et al., 2020). Although these proteins play important roles in specific and distinct cellular processes, understanding these nuances in protein interactions is essential for the development of therapeutic strategies.
ORF3a (Ap3A) is an accessory protein located in late endosomes. It contributes to viral exit through lysosomal trafficking and inhibits autophagic activity by blocking the fusion of autophagosomes and amphisomes with lysosomes (Miao et al., 2021). In our prediction, ENPP4 and FHIT act concomitantly, promoting the hydrolysis of extracellular Ap3A to produce AMP and ADP. ENPP4 promotes platelet aggregation in human plasma, favoring thrombus formation. FHIT, when mediated by viruses, increases the release of reactive oxygen species (ROS) and induces cell apoptosis (Albright et al., 2012; Albright et al., 2014; Zavalhia et al., 2018; Trapasso et al., 2008; Borza et al., 2022).
The ORF8 protein plays a role in modulating the host's immune response, and the action of RIT1 on this protein suggests the triggering of a signaling cascade that directs broad physiological cellular responses through the Ras/mitogen-activated protein kinase (MAPK) signaling pathway. RIT1 belongs to the family of Ras guanosine triphosphate hydrolases (GTPases) and has a regulatory function in the proliferation, survival and differentiation of neuronal cells. Mutations in this signal transduction pathway result in disorders called RASopathies (Rauen, 2013; Van et al., 2020). RIT1 mRNA is widely expressed in the lungs, esophagus, blood and spleen (Lonsdale et al., 2013; Fang et al., 2016). Mutations in RIT1 have already been reported in other pathologies, including cardiac anomalies and blood diseases. This may suggest that SARS-CoV-2 can cause changes in RIT1 mRNA in host cells, since respiratory complications, cardiac manifestations and vascular manifestations have been reported (Aly et al., 2020; Han et al., 2022; Mehandru & Merad, 2022; Shukla et al., 2022; Ruggiero et al., 2022; Bowe et al., 2023; Sciaudone et al., 2023). Although the impacts of RIT1 mutations are evident in other pathologies, its mechanism at the molecular level in SARS-CoV-2 infections still needs to be understood.

Conclusion
The present study predicted T and B cell epitopes, identifying 10,671 SARS-CoV-2 peptides and highlighting the interaction of viral proteins with human HLA. We found that, in addition to the Spike (S) protein, other proteins, such as TMPRSS2 and TMPRSS11D, are crucial for virus entry into host cells and emerge as promising targets for antiviral therapies. We also identified proteins, such as KCND2 and TMEM64, which, although not directly involved in viral entry, are associated with clinical complications of COVID-19. This study provides valuable insights into virus protein-protein interactions, proposing new targets for therapeutic interventions and expanding the understanding of the molecular mechanisms of infection. These findings must be tested at the bench and in vivo for confirmation.

Figure 1 - Graphical representation of the interaction networks of the S protein with other SARS-CoV-2 proteins. Direct and indirect interactions of Spike with functional associations are observed.