Research Article | Open Access
Novel Deleterious nsSNPs within MEFV Gene that Could Be Used as Diagnostic Markers to Predict Hereditary Familial Mediterranean Fever: Using Bioinformatics Analysis
Background. Familial Mediterranean Fever (FMF) is the most common autoinflammatory disease (AID), affecting mainly ethnic groups originating from the Mediterranean basin. We aimed to identify the pathogenic SNPs in MEFV using computational analysis software. Methods. We carried out in silico prediction of the structural effect of each SNP using different bioinformatics tools to predict the influence of each substitution on protein structure and function. Results. 23 novel mutations out of 857 nsSNPs were found to have a deleterious effect on MEFV structure and function. Conclusion. This is the first in silico analysis of the MEFV gene to prioritize SNPs for further genetic mapping studies. After comparing the predictions of multiple bioinformatics tools, we found 23 novel mutations that may cause FMF and could be used as diagnostic markers for Mediterranean basin populations.
1. Introduction
Familial Mediterranean Fever is an autosomal recessive inherited inflammatory disease [1–3] (although a substantial number of patients with clinical FMF possess only one demonstrable MEFV mutation [4, 5]) that is principally seen in different countries [6–10]. However, patients of other ethnicities (such as Japanese) are being increasingly recognized [2, 11], and the carrier frequency for MEFV genetic variants in the population of the Mediterranean basin is about 8%. Most cases of FMF present with acute abdominal pain and fever [1, 3, 7], both of which are also among the main causes of referral to the emergency department. All these factors may help in medical treatment. Colchicine is the first-line therapy, but some patients (<10%) are resistant, and small intestinal bacterial overgrowth has been reported to affect the responsiveness to colchicine; other anti-inflammatory drugs can be used for additional anti-inflammatory effect. If FMF is not treated, it may be an etiologic factor for colonic lymphoid nodular hyperplasia (LNH) in children. The MEFV gene is located on the short arm of chromosome 16 at position 13.3 (16p13.3) and consists of 10 exons spanning 21600 bp [3, 19]. The disease is characterized by recurrent febrile episodes and inflammation in the form of sterile polyserositis. The amyloid protein involved in inflammatory amyloidosis was named AA (amyloid-associated) protein, and its circulating precursor was named SAA (serum amyloid-associated). Amyloidosis of the AA type is the most severe complication of the disease. The gene responsible for FMF, MEFV, encodes a protein called pyrin or marenostrin and is expressed mainly in neutrophils [3, 19].
The identification of the MEFV gene has permitted genetic diagnosis of the disease. Nevertheless, as studies have uncovered molecular data, problems have arisen with the clinical definitions of the disease. FMF is caused by missense SNPs in MEFV, the gene coding for pyrin, which is a component of the inflammasome that functions in the inflammatory response and the production of interleukin-1β (IL-1β). We focused on SNPs located in the coding region because they carry the greatest disease-causing potential: they are responsible for amino acid residue substitutions that result in functional diversity of proteins in humans. Recent studies have shown that pyrin recognizes bacterial modifications in Rho GTPases, which results in inflammasome activation and an increase in IL-1β. Pyrin does not directly recognize the Rho modification but is probably affected by Rho effector kinase, which is a downstream event in the actin cytoskeleton pathway [19, 21, 22].
The aim of this study was to identify the pathogenic SNPs in MEFV using in silico prediction software and to determine the structure, function, and regulation of their respective proteins. This is the first in silico analysis of the MEFV gene to prioritize SNPs for further genetic mapping studies. The use of an in silico approach has a strong impact on the identification of candidate SNPs, since such approaches are easy and less costly and can facilitate future genetic studies.
2. Materials and Methods
2.1. Data Mining
The data on the human MEFV gene were collected from the National Center for Biotechnology Information (NCBI) website. The SNP information (protein accession number and SNP ID) of the MEFV gene was retrieved from NCBI dbSNP (http://www.ncbi.nlm.nih.gov/snp/), and the protein sequence was collected from the Swiss-Prot database (http://expasy.org/).
2.2. SIFT
SIFT is a sequence homology-based tool that sorts intolerant from tolerant amino acid substitutions and predicts whether an amino acid substitution in a protein will have a phenotypic effect. It considers the position at which the change occurred and the type of amino acid change. Given a protein sequence, SIFT chooses related proteins and obtains an alignment of these proteins with the query. Based on the amino acids appearing at each position in the alignment, SIFT calculates the probability that an amino acid at a position is tolerated, conditional on the most frequent amino acid being tolerated. If this normalized value is less than a cutoff, the substitution is predicted to be deleterious. Substitutions with SIFT scores <0.05 are predicted by the algorithm to be intolerant or deleterious, whereas those with scores >0.05 are considered tolerant. It is available at (http://sift.bii.a-star.edu.sg/).
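As a rough illustration of the scoring idea described above, the following Python sketch computes a normalized tolerance value for one alignment column and applies the 0.05 cutoff quoted in the text. It is a simplification under stated assumptions, not SIFT's implementation: the real algorithm uses pseudocounts and sequence weighting, whereas this sketch uses raw column frequencies. The alignment column, the function names, and the treatment of a score of exactly 0.05 as tolerated are all hypothetical choices made for illustration.

```python
from collections import Counter

SIFT_CUTOFF = 0.05  # cutoff quoted in the text


def normalized_tolerance(column, residue):
    """Frequency of `residue` in one alignment column, divided by the
    frequency of the most common residue in that column.

    Simplified stand-in for the SIFT score (no pseudocounts or weighting).
    """
    counts = Counter(aa for aa in column if aa != "-")  # ignore gaps
    if not counts:
        raise ValueError("alignment column contains no residues")
    most_common = max(counts.values())
    return counts.get(residue, 0) / most_common


def classify(score, cutoff=SIFT_CUTOFF):
    """Scores below the cutoff are called deleterious, as in the text.
    A score exactly equal to the cutoff is treated as tolerated here
    (an assumption; the text does not define that boundary)."""
    return "deleterious" if score < cutoff else "tolerated"


# Hypothetical column from 100 aligned sequences: 90 Leu, 10 Ile.
column = "L" * 90 + "I" * 10

for residue in ("L", "I", "P"):
    score = normalized_tolerance(column, residue)
    print(residue, round(score, 3), classify(score))
```

In this toy column, proline never appears, so its normalized value is 0 and a substitution to proline would be called deleterious, while isoleucine is seen often enough to be called tolerated.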
2.3. PolyPhen-2
PolyPhen-2 is a software tool that predicts the possible impact of an amino acid substitution on both the structure and function of a human protein by analysis of multiple sequence alignment and protein 3D structure. In addition, it calculates position-specific independent counts (PSIC) scores for each of the two variants and then calculates the difference between them. The higher the PSIC score difference, the higher the functional impact a particular amino acid substitution is likely to have. Prediction outcomes are classified as probably damaging, possibly damaging, or benign according to the value of PSIC, which ranges from 0 to 1: values closer to 0 are considered benign, while values closer to 1 are considered probably damaging. The outcome is also indicated by a vertical black marker inside a color gradient bar, where green is benign and red is damaging. nsSNPs predicted to be intolerant by SIFT were submitted to PolyPhen-2 as protein sequences in FASTA format obtained from UniProtKB/ExPASy after submitting the relevant Ensembl protein (ENSP) there; we then entered the position of the mutation, the native amino acid, and the new substituent for both structural and functional predictions. PolyPhen version 2.2.2 is available at http://genetics.bwh.harvard.edu/pph2/index.shtml.
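The input described above (a FASTA sequence, the position of the mutation, the native amino acid, and the new substituent) can be sketched in Python as follows. This is only an illustrative helper for preparing and sanity-checking such input before submitting it to a web server; it is not part of any of the tools. The function names and the toy FASTA record are hypothetical, and the toy sequence is not the pyrin sequence.

```python
import re

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code):
    """Split a code such as 'L86P' into (native residue, position, substituent)."""
    match = _SUBSTITUTION.match(code)
    if not match:
        raise ValueError(f"not a substitution code: {code!r}")
    native, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if native not in AMINO_ACIDS or mutant not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid in {code!r}")
    return native, position, mutant


def read_fasta_sequence(text):
    """Return the sequence of a single-record FASTA string (header dropped)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return "".join(line for line in lines if not line.startswith(">"))


def apply_substitution(sequence, code):
    """Return the mutated sequence, checking that the native residue
    really occurs at the stated 1-based position."""
    native, position, mutant = parse_substitution(code)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != native:
        raise ValueError(
            f"expected {native} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + mutant + sequence[position:]


# Hypothetical toy record, used only to exercise the helpers.
toy_fasta = ">toy_protein (hypothetical, not pyrin)\nMKLA\nGHC\n"
seq = read_fasta_sequence(toy_fasta)
mutated = apply_substitution(seq, "L3P")
print(seq, "->", mutated)
```

Checking the native residue against the sequence before submission catches the common mistake of pairing a substitution with the wrong protein isoform or numbering scheme.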
2.4. Provean
Provean is a software tool that predicts whether an amino acid substitution or indel has an impact on the biological function of a protein. It is useful for filtering sequence variants to identify nonsynonymous or indel variants that are predicted to be functionally important. It is available at (http://provean.jcvi.org/index.php).
2.5. SNAP2
Functional effects of mutations are predicted with SNAP2. SNAP2 is a trained classifier based on a machine learning device called a “neural network”. It distinguishes between effect and neutral variants/nonsynonymous SNPs by taking a variety of sequence and variant features into account. The most important input signal for the prediction is the evolutionary information taken from an automatically generated multiple sequence alignment. Structural features such as predicted secondary structure and solvent accessibility are also considered. If available, annotation (i.e., known functional residues, patterns, regions) of the sequence or close homologs is also pulled in. In a cross-validation over 100,000 experimentally annotated variants, SNAP2 reached a sustained two-state accuracy (effect/neutral) of 82% (at an AUC of 0.9). This is reported as an important and significant improvement over other methods. It is available at (https://rostlab.org/services/snap2web/).
2.6. PHD-SNP
PHD-SNP is an online Support Vector Machine (SVM) based classifier optimized to predict whether a given single point protein mutation can be classified as disease related or as a neutral polymorphism. It is available at (http://snps.biofold.org/phd-snp/phd-snp.html).
2.7. SNPs&GO
SNPs&GO is an algorithm developed in the Laboratory of Biocomputing at the University of Bologna, directed by Prof. Rita Casadio. SNPs&GO is an accurate method that, starting from a protein sequence, can predict whether a variation is disease related or not by exploiting the corresponding protein functional annotation. SNPs&GO collects in a unique framework information derived from protein sequence, evolutionary information, and function as encoded in the Gene Ontology terms, and it outperforms other available predictive methods. It is available at (http://snps.biofold.org/snps-and-go/snps-and-go.html).
2.8. P-Mut
P-Mut, a web-based tool for the annotation of pathological variants on proteins, allows the fast and accurate prediction (approximately 80% success rate in humans) of the pathological character of single point amino acid mutations based on the use of neural networks. It is available at (http://mmb.irbbarcelona.org/PMut).
2.9. I-Mutant 3.0
I-Mutant 3.0 is a neural network based tool for the routine analysis of changes in protein stability caused by single-site mutations. The FASTA sequence of the protein retrieved from UniProt is used as input to predict the mutational effect on protein stability. It is available at (http://gpcr2.biocomp.unibo.it/cgi/predictors/I-Mutant3.0/I-Mutant3.0.cgi).
2.10. Modeling nsSNP Locations on Protein Structure
Project HOPE is an online web server that searches for protein 3D structures (if available) and collects structural information from a series of sources, including calculations on the 3D coordinates of the protein, sequence annotations from the UniProt database, and predictions by DAS services. Protein sequences were submitted to the Project HOPE server in order to analyze the structural and conformational variations resulting from single amino acid substitutions corresponding to single nucleotide substitutions. It is available at (http://www.cmbi.ru.nl/hope).
2.11. GeneMANIA
We submitted the gene and selected the data sets to query from a list. GeneMANIA predicts protein function by integrating multiple genomics and proteomics data sources to make inferences about the function of unknown proteins. It is available at (http://www.genemania.org/).
3. Results and Discussion
FDR: false discovery rate is greater than or equal to the probability that this is a false positive.
Using bioinformatics tools, 23 novel mutations were found that affect the stability and function of MEFV (see Table 3). The methods used were based on different aspects and parameters describing pathogenicity and provided clues at the molecular level about the effect of the mutations. It is not easy to predict the pathogenic effect of SNPs using a single method. Therefore, multiple methods were used so that their predictions could be compared. In this study we used different in silico prediction algorithms: SIFT, PolyPhen-2, Provean, SNAP2, SNPs&GO, PHD-SNP, P-Mut, and I-Mutant 3.0 (see Figure 1).
This study identified the total number of nsSNPs in Homo sapiens located in the coding region of the MEFV gene, as recorded in the dbSNP/NCBI database. Out of 2369 SNPs, 856 were nsSNPs (missense mutations); these were submitted to the SIFT, PolyPhen-2, Provean, and SNAP2 servers. SIFT predicted 392 SNPs to be deleterious. PolyPhen-2 found 453 to be damaging (147 possibly damaging and 306 probably damaging). Provean predicted 244 SNPs to be deleterious, while SNAP2 predicted 566 SNPs to have an effect. The differences between the predictions reflect the fact that every prediction algorithm uses different sets of sequences and alignments. The SNPs with positive results from all four tools (SIFT, PolyPhen-2, Provean, and SNAP2; see Table 1) were then submitted to the SNPs&GO, PHD-SNP, and P-Mut servers to identify the disease-causing ones (see Table 2).
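To make the counts above easier to compare, the following Python sketch expresses each tool's positive calls as a percentage of the 856 nsSNPs and shows how a four-tool consensus can be taken as a set intersection. The counts are those reported in the text; the SNP identifiers in `toy_calls` are hypothetical placeholders, not results from this study, and the function name is an assumption made for illustration.

```python
TOTAL_NSSNPS = 856  # nsSNPs submitted to the four servers (from the text)

# Number of nsSNPs each tool flagged as deleterious/damaging/effect (from the text).
predicted = {"SIFT": 392, "PolyPhen-2": 453, "Provean": 244, "SNAP2": 566}

# PolyPhen-2 breakdown reported in the text.
POLYPHEN_POSSIBLY, POLYPHEN_PROBABLY = 147, 306

percent = {
    tool: round(100 * count / TOTAL_NSSNPS, 1)
    for tool, count in predicted.items()
}


def consensus(calls):
    """Return the SNPs flagged by every tool in `calls`
    (a mapping of tool name to a set of SNP identifiers)."""
    sets = [set(snps) for snps in calls.values()]
    return set.intersection(*sets) if sets else set()


# Hypothetical identifiers, used only to show the intersection step.
toy_calls = {
    "SIFT": {"snp_a", "snp_b", "snp_c"},
    "PolyPhen-2": {"snp_b", "snp_c", "snp_d"},
    "Provean": {"snp_b", "snp_c"},
    "SNAP2": {"snp_a", "snp_b", "snp_c", "snp_d"},
}
shared = consensus(toy_calls)

print(percent)
print(sorted(shared))
```

With the reported counts, SNAP2 flags the largest share of nsSNPs (about 66%) and Provean the smallest (about 29%), which is one reason a consensus across tools is more conservative than any single predictor.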
SNPs&GO, PHD-SNP, and P-Mut were used to predict the association of SNPs with disease. According to SNPs&GO, PHD-SNP, and P-Mut, 70, 91, and 58 SNPs, respectively, were found to be disease related. We selected only the SNPs predicted to be disease related by all three tools for further analysis by I-Mutant 3.0 (see Table 3). I-Mutant revealed decreased protein stability, which destabilizes the amino acid interactions, for S749Y, F743S, Y741C, F731V, I720T, L709R, V691G, W689R, G668R, V659F, F636C, H407Q, H407R, H404R, C398Y, H378Q, H378Y, and L86P. C375R, C395F, C395R, C395Y, and R461W were found to increase protein stability (see Table 3).
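The substitutions listed above can be tallied with a short Python sketch. The two lists are copied from the text; the helper name and the grouping by residue position are illustrative assumptions added here, not part of the original analysis.

```python
from collections import defaultdict

# Substitutions reported to decrease protein stability (from the text).
decreased = (
    "S749Y F743S Y741C F731V I720T L709R V691G W689R G668R "
    "V659F F636C H407Q H407R H404R C398Y H378Q H378Y L86P"
).split()

# Substitutions reported to increase protein stability (from the text).
increased = "C375R C395F C395R C395Y R461W".split()


def position(code):
    """Residue position encoded in a substitution such as 'L86P'."""
    return int(code[1:-1])


all_variants = decreased + increased
by_position = sorted(all_variants, key=position)

# Group substitutions that hit the same residue.
sites = defaultdict(list)
for variant in all_variants:
    sites[position(variant)].append(variant)
recurrent = {pos: sorted(v) for pos, v in sites.items() if len(v) > 1}

print(len(decreased), len(increased), len(all_variants))
print(by_position[0], by_position[-1])
print(dict(sorted(recurrent.items())))
```

The tally gives 18 destabilizing and 5 stabilizing substitutions, which together account for the 23 variants, and shows that residues 378, 395, and 407 are each affected by more than one substitution.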
BioEdit software was used to align 10 amino acid sequences of MEFV, demonstrating that the residues predicted to be mutated (indicated by a red arrow) are evolutionarily conserved across species (see Figure 2). Project HOPE software was used to analyze the 23 most deleterious and damaging nsSNPs (see Figures 3–25). For L86P, proline (the mutant residue) is smaller than leucine (the wild-type residue), which might lead to loss of interactions. The mutation is located within a domain annotated in UniProt as Pyrin. It introduces an amino acid with different properties, which can disturb this domain and abolish its function. The wild-type residue is located in a region annotated in UniProt as forming an α-helix. Proline disrupts an α-helix when not located at one of the first 3 positions of that helix. In the case of the mutation at hand, the helix will be disturbed, and this can have severe effects on the structure of the protein.
GeneMANIA revealed that MEFV has many vital functions: chemokine production, inflammatory response, interleukin-1 beta production, interleukin-1 production, intracellular receptor signaling pathway, nucleotide-binding domain, leucine rich repeat containing receptor signaling pathway, positive regulation of cysteine-type endopeptidase activity, positive regulation of endopeptidase activity, positive regulation of peptidase activity, regulation of chemokine production, regulation of cysteine-type endopeptidase activity, regulation of endopeptidase activity, regulation of interleukin-1 beta production, regulation of interleukin-1 production, and regulation of peptidase activity. The genes that are coexpressed with MEFV, share a similar protein domain, or participate in a similar function are shown in Figure 26 and Tables 4 and 5.
In this study we also retrieved four SNPs recorded as untested (V659F, L709R, F743S, and S749Y) and found all of them to be damaging. Our study is the first in silico analysis of the MEFV gene based on functional analysis, whereas all previous studies [34, 35] were based on frequency. This study revealed that 23 novel pathological mutations have a potential functional impact and may thus be used as diagnostic markers for Mediterranean basin populations.
4. Conclusions
In this work the influence of functional SNPs in the MEFV gene was investigated through various computational methods, which determined that S749Y, F743S, Y741C, F731V, I720T, L709R, V691G, W689R, G668R, V659F, F636C, R461W, H407Q, H407R, H404R, C398Y, C395Y, C395F, C395R, H378Q, H378Y, C375R, and L86P are new SNPs that have a potential functional impact and can thus be used as diagnostic markers. They constitute possible candidates for further genetic epidemiological studies, with special consideration of the large heterogeneity of MEFV SNPs among different populations.
Data Availability
The data which support our findings in this study are available from the corresponding author upon reasonable request.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Authors' Contributions
Mujahed I. Mustafa wrote the Abstract, Methodology, and Results and Discussion. Fatima A. Abdelrhman wrote the Introduction. The Conclusion was written by Soada A. Osman. The original draft was written by Mujahed I. Mustafa.
Acknowledgments
The authors wish to acknowledge the enthusiastic cooperation of Africa City of Technology, Sudan.
References
- S. Georgin-Lavialle, V. Hentgen, K. Stankovic Stojanovic et al., “Familial Mediterranean fever,” La Revue de Médecine Interne, vol. 39, no. 4, pp. 240–255, 2018.
- E. Ben-Chetrit and I. Touitou, “Familial mediterranean fever in the world,” Arthritis & Rheumatology, vol. 61, no. 10, pp. 1447–1453, 2009.
- E. Fragouli, E. Eliopoulos, E. Petraki et al., “Familial Mediterranean Fever in Crete: A genetic and structural biological approach in a population of 'intermediate risk',” Clinical Genetics, vol. 73, no. 2, pp. 152–159, 2008.
- M. G. Booty, J. C. Jae, S. L. Masters et al., “Familial Mediterranean fever with a single MEFV mutation: Where is the second hit?” Arthritis & Rheumatology, vol. 60, no. 6, pp. 1851–1861, 2009.
- D. Marek-Yagel, Y. Berkun, S. Padeh et al., “Clinical disease among patients heterozygous for familial Mediterranean fever,” Arthritis & Rheumatology, vol. 60, no. 6, pp. 1862–1866, 2009.
- N. Ebadi, A. Shakoori, M. Razipour et al., “The spectrum of Familial Mediterranean Fever gene (MEFV) mutations and genotypes in Iran, and report of a novel missense variant (R204H),” European Journal of Medical Genetics, vol. 60, no. 12, pp. 701–705, 2017.
- N. Cekin, M. E. Akyurek, E. Pinarbasi, and F. Ozen, “MEFV mutations and their relation to major clinical symptoms of familial mediterranean fever,” Gene, vol. 626, pp. 9–13, 2017.
- J. R. Al-Alami, M. K. Tayeh, D. A. Najib et al., “Familial mediterranean fever mutation frequencies and carrier rates among a mixed Arabic population,” Saudi Medical Journal, vol. 24, no. 10, pp. 1055–1059, 2003.
- R. Gershoni-Baruch, M. Shinawi, K. Leah, K. Badarnah, and R. Brik, “Familial mediterranean fever: prevalence, penetrance and genetic drift,” European Journal of Human Genetics, vol. 9, no. 8, pp. 634–637, 2001.
- N. Jalkh, E. Génin, E. Chouery et al., “Familial mediterranean fever in Lebanon: Founder effects for different MEFV mutations,” Annals of Human Genetics, vol. 72, no. 1, pp. 41–47, 2008.
- N. Tomiyama, Y. Higashiuesato, T. Oda et al., “MEFV mutation analysis of familial Mediterranean fever in Japan,” Clinical and Experimental Rheumatology, vol. 26, no. 1, pp. 13–17, 2008.
- R. Koshy, A. Sivadas, and V. Scaria, “Genetic epidemiology of familial Mediterranean fever through integrative analysis of whole genome and exome sequences from Middle East and North Africa,” Clinical Genetics, vol. 93, no. 1, pp. 92–102, 2018.
- T. Yoldaş, Ş. Kayali, I. Ertuǧrul, V. Doǧan, U. A. Örün, and S. Karademir, “Massive pericardial effusion and tamponade can be a first sign of familial mediterranean fever,” Pediatric Emergency Care, vol. 33, no. 9, pp. e48–e51, 2017.
- A. Slobodnick, B. Shah, S. Krasnokutsky, and M. H. Pillinger, “Update on colchicine, 2017,” Rheumatology (Oxford), vol. 57, no. suppl_1, pp. i4–i11, 2018.
- A. Corsia, S. Georgin-Lavialle, V. Hentgen et al., “A survey of resistance to colchicine treatment for French patients with familial Mediterranean fever,” Orphanet Journal of Rare Diseases, vol. 12, no. 1, p. 54, 2017.
- E. Verrecchia, L. L. Sicignano, M. La Regina et al., “Small intestinal bacterial overgrowth affects the responsiveness to colchicine in familial mediterranean fever,” Mediators of Inflammation, vol. 2017, Article ID 7461426, 6 pages, 2017.
- S. Akar, P. Cetin, U. Kalyoncu et al., “Nationwide experience with off-label use of Interleukin-1 targeting treatment in familial mediterranean fever patients,” Arthritis Care & Research, vol. 70, no. 7, pp. 1090–1094, 2018.
- O. E. Gurkan, G. Yilmaz, A. U. Aksu, Z. Demirtas, G. Akyol, and B. Dalgic, “Colonic lymphoid nodular hyperplasia in childhood: Causes of familial mediterranean fever need extra attention,” Journal of Pediatric Gastroenterology and Nutrition, vol. 57, no. 6, pp. 817–821, 2013.
- R. Heilig and P. Broz, “Function and mechanism of the pyrin inflammasome,” European Journal of Immunology, vol. 48, no. 2, pp. 230–238, 2018.
- T. Kallinich, B. Orak, and H. Wittkowski, “Role of genetics in familial Mediterranean fever,” Zeitschrift für Rheumatologie, vol. 76, no. 4, pp. 303–312, 2017.
- Y. Jamilloux, L. Lefeuvre, F. Magnotti et al., “Familial Mediterranean fever mutations are hypermorphic mutations that specifically decrease the activation threshold of the Pyrin inflammasome,” Rheumatology, vol. 57, no. 1, pp. 100–111, 2018.
- S. Özen, E. D. Batu, and S. Demir, “Familial mediterranean fever: recent developments in pathogenesis and new recommendations for management,” Frontiers in Immunology, vol. 8, p. 253, 2017.
- A. Bajard, S. Chabaud, C. Cornu et al., “An in silico approach helped to identify the best experimental design, population, and outcome for future randomized clinical trials,” Journal of Clinical Epidemiology, vol. 69, pp. 125–136, 2016.
- D. A. Benson, I. Karsch-Mizrachi, K. Clark, D. J. Lipman, J. Ostell, and E. W. Sayers, “GenBank,” Nucleic Acids Research, vol. 40, no. D1, pp. D48–D53, 2012.
- P. Artimo, M. Jonnalagedda, K. Arnold, D. Baratin, G. Csardi, E. De Castro et al., “ExPASy: SIB bioinformatics resource portal,” Nucleic Acids Research, vol. 40, no. W1, pp. W597–W603, 2012.
- N.-L. Sim, P. Kumar, J. Hu, S. Henikoff, G. Schneider, and P. C. Ng, “SIFT web server: predicting effects of amino acid substitutions on proteins,” Nucleic Acids Research, vol. 40, no. W1, pp. W452–W457, 2012.
- I. A. Adzhubei, S. Schmidt, L. Peshkin et al., “A method and server for predicting damaging missense mutations,” Nature Methods, vol. 7, no. 4, pp. 248-249, 2010.
- Y. Choi, G. E. Sims, S. Murphy, J. R. Miller, and A. P. Chan, “Predicting the functional effect of amino acid substitutions and indels,” PLoS ONE, vol. 7, no. 10, Article ID e46688, 2012.
- M. Hecht, Y. Bromberg, and B. Rost, “Better prediction of functional effects for sequence variants,” BMC Genomics, vol. 16, no. Suppl 8, p. S1, 2015.
- R. Calabrese, E. Capriotti, P. Fariselli, P. L. Martelli, and R. Casadio, “Functional annotations improve the predictive score of human disease-related mutations in proteins,” Human Mutation, vol. 30, no. 8, pp. 1237–1244, 2009.
- V. López-Ferrando, A. Gazzo, X. De La Cruz, M. Orozco, and J. L. Gelpí, “PMut: A web-based tool for the annotation of pathological variants on proteins, 2017 update,” Nucleic Acids Research, vol. 45, no. W1, pp. W222–W228, 2017.
- E. Capriotti, P. Fariselli, and R. Casadio, “I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure,” Nucleic Acids Research, vol. 33, no. suppl_2, pp. W306–W310, 2005.
- D. W. Farley, S. L. Donaldson, O. Comes et al., “The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function,” Nucleic Acids Research, vol. 38, no. 2, pp. W214–W220, 2010.
- E. Gumus, “The frequency of MEFV gene mutations and genotypes in Sanliurfa province, South-Eastern region of Turkey, after the Syrian Civil War by using next generation sequencing and report of a Novel Exon 4 Mutation (I423T),” Journal of Clinical Medicine, vol. 7, no. 5, p. 105, 2018.
- M. Medlej-Hashim, J.-L. Serre, S. Corbani et al., “Familial Mediterranean fever (FMF) in Lebanon and Jordan: a population genetics study and report of three novel mutations,” European Journal of Medical Genetics, vol. 48, no. 4, pp. 412–420, 2005.
Copyright © 2019 Mujahed I. Mustafa et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.