Research Article

Identify High-Quality Protein Structural Models by Enhanced K-Means

Table 1

Comparison between K-means, K-means++, and SPICKER on 56 protein decoys.

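| Index | PDB | Len | Size | Best | K-means++ | K-means | K-means | SPICKER | Random |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1abv | 103 | 526 | 0.507 | 0.3701 | 0.3834 | 0.4910 | 0.3813 | 0.479 |
| 2 | 1af7 | 72 | 527 | 0.623 | 0.5009 | 0.5009 | 0.4820 | 0.4874 | 0.322 |
| 3 | 1ah9 | 63 | 510 | 0.696 | 0.5040 | 0.4743 | 0.4740 | 0.4657 | 0.434 |
| 4 | 1aoy | 65 | 529 | 0.711 | 0.6482 | 0.6695 | 0.6695 | 0.6695 | 0.622 |
| 5 | 1b4bA | 71 | 460 | 0.473 | 0.3815 | 0.4279 | 0.4270 | 0.4501 | 0.379 |
| 6 | 1b72A | 49 | 534 | 0.697 | 0.5397 | 0.3917 | 0.6410 | 0.4923 | 0.562 |
| 7 | 1bm8 | 99 | 329 | 0.388 | 0.4332 | 0.3787 | 0.3320 | 0.3550 | 0.255 |
| 8 | 1bq9A | 53 | 573 | 0.465 | 0.3540 | 0.3459 | 0.3990 | 0.3873 | 0.411 |
| 9 | 1cewI | 108 | 452 | 0.748 | 0.7294 | 0.7154 | 0.7290 | 0.7187 | 0.617 |
| 10 | 1cqkA | 101 | 284 | 0.885 | 0.8439 | 0.8539 | 0.8539 | 0.8539 | 0.815 |
| 11 | 1csp | 67 | 315 | 0.753 | 0.7158 | 0.7158 | 0.7158 | 0.7158 | 0.686 |
| 12 | 1cy5A | 92 | 273 | 0.893 | 0.8685 | 0.8839 | 0.8680 | 0.8839 | 0.876 |
| 13 | 1dcjA | 73 | 525 | 0.368 | 0.3299 | 0.3645 | 0.3170 | 0.3264 | 0.334 |
| 14 | 1di2A | 69 | 374 | 0.843 | 0.7622 | 0.7663 | 0.7620 | 0.7663 | 0.374 |
| 15 | 1dtjA | 74 | 285 | 0.814 | 0.7901 | 0.7581 | 0.7370 | 0.7901 | 0.705 |
| 16 | 1egxA | 115 | 352 | 0.827 | 0.7673 | 0.7673 | 0.7673 | 0.7673 | 0.768 |
| 17 | 1fadA | 92 | 514 | 0.652 | 0.5716 | 0.5755 | 0.5755 | 0.5755 | 0.553 |
| 18 | 1fo5A | 85 | 340 | 0.568 | 0.5391 | 0.5391 | 0.5230 | 0.5296 | 0.469 |
| 19 | 1g1cA | 98 | 307 | 0.787 | 0.7473 | 0.7732 | 0.7800 | 0.7732 | 0.621 |
| 20 | 1gjxA | 77 | 525 | 0.515 | 0.2375 | 0.3807 | 0.3810 | 0.4298 | 0.191 |
| 21 | 1gnuA | 117 | 553 | 0.647 | 0.5353 | 0.5353 | 0.5350 | 0.5456 | 0.509 |
| 22 | 1gpt | 47 | 469 | 0.553 | 0.5130 | 0.5377 | 0.5060 | 0.4927 | 0.517 |
| 23 | 1gyvA | 117 | 337 | 0.776 | 0.7406 | 0.7406 | 0.7540 | 0.7406 | 0.753 |
| 24 | 1hbkA | 89 | 300 | 0.708 | 0.6633 | 0.6633 | 0.6633 | 0.6633 | 0.599 |
| 25 | 1itpA | 68 | 526 | 0.511 | 0.3069 | 0.3152 | 0.3150 | 0.3096 | 0.335 |
| 26 | 1jnuA | 104 | 269 | 0.768 | 0.7457 | 0.7237 | 0.6980 | 0.7237 | 0.711 |
| 27 | 1kjs | 74 | 548 | 0.5 | 0.3728 | 0.3728 | 0.3580 | 0.3728 | 0.313 |
| 28 | 1kviA | 68 | 550 | 0.79 | 0.7181 | 0.6774 | 0.7220 | 0.6774 | 0.642 |
| 29 | 1mkyA3 | 81 | 285 | 0.552 | 0.4155 | 0.4155 | 0.4155 | 0.4155 | 0.384 |
| 30 | 1mla_2 | 70 | 335 | 0.775 | 0.6742 | 0.6226 | 0.6226 | 0.6226 | 0.609 |
| 31 | 1mn8A | 84 | 545 | 0.457 | 0.2517 | 0.3543 | 0.3540 | 0.3285 | 0.310 |
| 32 | 1n0uA4 | 69 | 301 | 0.588 | 0.4753 | 0.4746 | 0.4524 | 0.4524 | 0.333 |
| 33 | 1ne3A | 56 | 566 | 0.453 | 0.2523 | 0.3943 | 0.3940 | 0.3724 | 0.344 |
| 34 | 1no5A | 93 | 426 | 0.419 | 0.3710 | 0.4247 | 0.4240 | 0.4054 | 0.500 |
| 35 | 1npsA | 88 | 469 | 0.800 | 0.7671 | 0.7671 | 0.2810 | 0.7671 | 0.283 |
| 36 | 1o2fB | 77 | 510 | 0.528 | 0.3380 | 0.338 | 0.3370 | 0.2690 | 0.379 |
| 37 | 1of9A | 77 | 507 | 0.585 | 0.5469 | 0.494 | 0.5460 | 0.4940 | 0.554 |
| 38 | 1ogwA | 72 | 520 | 0.890 | 0.7853 | 0.7853 | 0.7850 | 0.8622 | 0.78 |
| 39 | 1orgA | 118 | 442 | 0.816 | 0.7440 | 0.7339 | 0.7440 | 0.7440 | 0.693 |
| 40 | 1pgx | 59 | 562 | 0.551 | 0.5824 | 0.3216 | 0.5160 | 0.4446 | 0.51 |
| 41 | 1r69 | 61 | 291 | 0.824 | 0.7007 | 0.7255 | 0.7255 | 0.7255 | 0.827 |
| 42 | 1sfp | 111 | 308 | 0.758 | 0.7453 | 0.7453 | 0.7454 | 0.7454 | 0.749 |
| 43 | 1shfA | 59 | 536 | 0.836 | 0.5649 | 0.5070 | 0.5640 | 0.5070 | 0.408 |
| 44 | 1sro | 71 | 515 | 0.648 | 0.6513 | 0.6513 | 0.5820 | 0.6158 | 0.583 |
| 45 | 1ten | 87 | 294 | 0.851 | 0.8215 | 0.8215 | 0.7860 | 0.8215 | 0.781 |
| 46 | 1tfi | 47 | 339 | 0.592 | 0.5061 | 0.5576 | 0.5520 | 0.5576 | 0.550 |
| 47 | 1thx | 108 | 302 | 0.865 | 0.8000 | 0.8000 | 0.8000 | 0.8000 | 0.819 |
| 48 | 1tif | 59 | 542 | 0.340 | 0.3269 | 0.2667 | 0.2660 | 0.3199 | 0.232 |
| 49 | 1tig | 88 | 565 | 0.585 | 0.5524 | 0.4596 | 0.4740 | 0.4176 | 0.517 |
| 50 | 1vcc | 76 | 551 | 0.455 | 0.3973 | 0.4066 | 0.3970 | 0.4066 | 0.291 |
| 51 | 256bA | 106 | 506 | 0.814 | 0.7657 | 0.7578 | 0.7650 | 0.7578 | 0.723 |
| 52 | 2a0b | 118 | 282 | 0.838 | 0.8083 | 0.8083 | 0.8083 | 0.8083 | 0.768 |
| 53 | 2cr7A | 60 | 540 | 0.666 | 0.3589 | 0.5059 | 0.5820 | 0.5136 | 0.365 |
| 54 | 2f3nA | 65 | 485 | 0.758 | 0.6403 | 0.7322 | 0.6510 | 0.7132 | 0.626 |
| 55 | 2pcy | 99 | 435 | 0.637 | 0.6040 | 0.5795 | 0.6460 | 0.6233 | 0.527 |
| 56 | 2reb_2 | 60 | 550 | 0.403 | 0.3902 | 0.378 | 0.3290 | 0.3174 | 0.416 |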

Len: The length of the protein sequence.
Size: The number of models in the decoy set.
Best: The best (maximum) TM-score among the models in the decoy set.
K-means++: The TM-score of the centroid model of the largest cluster selected by K-means++ (bold indicates better than SPICKER).
K-means: The TM-score of the centroid model of the largest cluster selected by K-means (bold indicates better than SPICKER).
K-means: The TM-score of the centroid model of the largest cluster selected by K-means (bold indicates better than SPICKER).
SPICKER: The TM-score of the centroid model of the largest cluster selected by SPICKER.
Random: The TM-score of the centroid model selected at random.
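
The notes above describe each method's pick as the centroid model of the largest cluster. As a rough illustration of that selection step only, the Python sketch below seeds K-means with the K-means++ rule, runs standard Lloyd iterations, and returns the index of the model closest to the centroid of the most populated cluster. It is not the authors' implementation. It assumes each model has already been encoded as a fixed-length numeric vector compared by Euclidean distance, and the function names, the value of `k`, and the toy data are all hypothetical; this section does not state how the article measures distance between models or how it chooses K.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_seeds(points, k, rng):
    """K-means++ seeding: each new seed is drawn with probability
    proportional to its squared distance from the nearest existing seed."""
    seeds = [rng.choice(points)]
    while len(seeds) < k:
        weights = [min(sq_dist(p, s) for s in seeds) for p in points]
        if sum(weights) == 0:  # all points coincide with existing seeds
            seeds.append(rng.choice(points))
        else:
            seeds.append(rng.choices(points, weights=weights, k=1)[0])
    return seeds

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm started from K-means++ seeds."""
    rng = random.Random(seed)
    centroids = [list(c) for c in kmeanspp_seeds(points, k, rng)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster empties
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def pick_representative(points, k, seed=0):
    """Index of the model nearest the centroid of the largest cluster."""
    labels, centroids = kmeans(points, k, seed=seed)
    largest = max(range(k), key=lambda j: labels.count(j))
    members = [i for i, lab in enumerate(labels) if lab == largest]
    return min(members, key=lambda i: sq_dist(points[i], centroids[largest]))

# Toy data (hypothetical): six models packed near the origin, two far away.
models = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
          (0.05, 0.05), (10.0, 10.0), (10.1, 9.9)]
rep = pick_representative(models, k=2)
print("representative model index:", rep)
```

Returning an actual member of the cluster, rather than the averaged centroid coordinates, matters here because a TM-score can only be computed for a real model, which is what the table reports.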