Research Article

Local Packing Density Is the Main Structural Determinant of the Rate of Protein Sequence Evolution at Site Level

Table 1

Site-specific properties.

Symbol | Property measured | Name and description
CS | Rate of evolution | ConSurf rate of evolution: estimated rate relative to the overall average, computed with an empirical Bayesian approach that uses the phylogenetic tree topology and branch lengths and the JTT probability matrix of amino acid substitutions, as implemented in the ConSurf web server.
ET | Sequence variability | Real-valued evolutionary trace: sequence variability score computed as a weighted average of sequence entropy, with weights that account for the topology of the phylogenetic tree.
KBSP | Sequence conservation | Karlin & Brocchieri Sum-of-Pairs: sequence conservation score computed by summing amino acid similarity scores over all amino acid pairs of the site’s column in a multiple sequence alignment (MSA). Similarity scores are obtained using a normalized JTT250 matrix.
VTSP | Sequence conservation | Valdar & Thornton Sum-of-Pairs: sequence conservation score computed by summing amino acid similarity scores over all amino acid pairs of the site’s column in an MSA. Sequences are weighted, and similarity scores are obtained using a min-max normalized JTT250 matrix.
EN | Sequence variability | Entropy: Shannon information entropy computed using the amino acid frequencies observed at the site’s MSA column.
CN | Local packing | Contact number: the number of Cα atoms within a given cut-off distance of the site’s Cα. The cut-off distance ranges from 9 to 30 Å.
WCN | Local packing | Weighted contact number: measure of contact density obtained by summing the inverse square distances between the site’s Cα and the rest of the sites of the protein.
ASA | Solvent accessibility | Accessible surface area: solvent accessibility of the site computed by rolling a 1.4-Å sphere over the residue’s molecular surface.
RSA | Solvent accessibility | Relative solvent accessibility: solvent accessibility of the site computed by rolling a 1.4-Å sphere over the residue’s molecular surface, divided by the maximum value for residues of the same type. We consider three different tables of values of maximum ASA, resulting in three RSA measures: RSAR, RSAM, and RSAT.
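
To make the simpler definitions in Table 1 concrete, the following Python sketch computes EN, CN, WCN, and RSA for a single site. It is an illustration under stated assumptions, not the implementation used in this study: the function names are hypothetical, entropy is taken in natural-log units with gap characters ignored, contacts are counted between Cα coordinates with an inclusive cut-off, and the maximum ASA passed to the RSA function is a placeholder rather than a value from any of the three published tables.

```python
import math
from collections import Counter


def shannon_entropy(column, gap_chars="-."):
    """EN: Shannon entropy of the amino acid frequencies in one MSA column.

    Assumptions: natural logarithm (nats); gap characters are skipped.
    """
    residues = [aa for aa in column if aa not in gap_chars]
    if not residues:
        raise ValueError("column contains no residues")
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def contact_number(ca_coords, site, cutoff):
    """CN: number of other C-alpha atoms within `cutoff` (in angstroms) of the site.

    Assumption: the cut-off is inclusive (distance <= cutoff).
    """
    xi = ca_coords[site]
    return sum(
        1
        for j, xj in enumerate(ca_coords)
        if j != site and math.dist(xi, xj) <= cutoff
    )


def weighted_contact_number(ca_coords, site):
    """WCN: sum of inverse square distances from the site to every other site.

    Coordinates are assumed to be distinct; coincident points would divide by zero.
    """
    xi = ca_coords[site]
    return sum(
        1.0 / math.dist(xi, xj) ** 2
        for j, xj in enumerate(ca_coords)
        if j != site
    )


def relative_solvent_accessibility(asa, max_asa):
    """RSA: ASA divided by the maximum ASA for residues of the same type."""
    return asa / max_asa


# Toy example: four C-alpha positions (angstroms), not taken from a real structure.
ca = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
cn = contact_number(ca, 0, cutoff=10.0)
wcn = weighted_contact_number(ca, 0)
en = shannon_entropy("AAGG")
rsa = relative_solvent_accessibility(50.0, 200.0)  # 200.0 is a placeholder maximum
```

CN depends on the choice of cut-off, whereas WCN has no free parameter because every other site contributes with a weight that decays as the inverse square of its distance. The remaining scores in Table 1 (CS, ET, KBSP, VTSP) require a phylogenetic tree or a substitution matrix and are not sketched here.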