Research Article

Classification of Complete Proteomes of Different Organisms and Protein Sets Based on Their Protein Distributions in Terms of Some Key Attributes of Proteins

Table 1

A summary of the proteomes and gene sets.

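| Domain | Species | Gene number | Ave | Med | Max | Min | IDpep% | IDres% |
|---|---|---|---|---|---|---|---|---|
| Eukaryota | H. sapiens | 20,193 | 561.0 | 417 | 34,350 | 16 | 45.2 | 49.3 |
| | D. melanogaster | 13,700 | 537.2 | 396 | 22,949 | 11 | 44.3 | 49.0 |
| | S. cerevisiae | 5917 | 494.1 | 405 | 4910 | 16 | 38.1 | 44.6 |
| | A. thaliana | 27,407 | 405.2 | 348 | 5393 | 7 | 36.8 | 43.6 |
| | P. trichocarpa | 41,434 | 385.0 | 317 | 5410 | 29 | 35.5 | 42.6 |
| | A. comosus | 29,772 | 372.6 | 288 | 5407 | 31 | 39.5 | 45.4 |
| | O. sativa | 48,788 | 376.1 | 290 | 4957 | 5 | 38.0 | 44.5 |
| | A. trichopoda | 26,460 | 317.0 | 218 | 4990 | 29 | 37.5 | 43.9 |
| | C. reinhardtii | 17,819 | 732.9 | 498 | 23,859 | 31 | 54.8 | 61.9 |
| | P. patens | 32,400 | 351.9 | 250 | 5199 | 13 | 40.2 | 45.5 |
| | G. intestinalis | 9667 | 353.8 | 147 | 8161 | 33 | 35.1 | 41.7 |
| | Monocercomonoides | 16,780 | 784.6 | 393 | 14,902 | 49 | 52.7 | 60.1 |
| Archaea | Lokiarchaeum | 5348 | 268.4 | 224 | 3592 | 20 | 20.0 | 33.0 |
| | I. hospitalis | 1434 | 278.3 | 240 | 1392 | 33 | 20.4 | 34.3 |
| | N. equitans | 540 | 280.2 | 228 | 2197 | 45 | 7.0 | 30.6 |
| Bacteria | E. coli | 4140 | 316.9 | 282 | 2358 | 14 | 17.5 | 32.2 |
| | S. elongatus | 2612 | 305.3 | 258 | 1807 | 29 | 20.8 | 34.3 |
| | Rickettsiales | 1780 | 365.2 | 251 | 2243 | 31 | 7.7 | 32.8 |
| Giruses | Mimivirus | 979 | 356.7 | 289 | 2959 | 25 | 25.0 | 36.6 |
| | Pandoravirus | 2541 | 259.2 | 178 | 2321 | 26 | 36.4 | 43.5 |
| Gene sets | Viruses | 237,463 | 251.8 | 154 | 8573 | 9 | 28.0 | 38.8 |
| | Plasmids | 95,214 | 258.9 | 206 | 16,990 | 9 | 27.2 | 38.1 |
| | Mitochondria | 88,405 | 286.1 | 261 | 2640 | 13 | 8.6 | 20.0 |
| | Plastids | 80,807 | 280.0 | 156 | 5242 | 12 | 20.5 | 32.0 |
| All proteins | | 811,600 | 325.7 | 225 | 34,350 | 5 | 32.2 | 39.8 |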

Domain and Species: proteomes from the three domains of life; the giant DNA viruses (giruses) and the collective protein sets are listed after the cellular species. Gene number: total number of genes. Ave, Med, Max, Min: average, median, maximum, and minimum protein lengths. IDpep%: percentage of intrinsically disordered proteins in the proteome or gene set. IDres%: average intrinsic disorder content over all residues in the proteome or gene set. All proteins: all proteins studied in the present work. The protein length statistics cover all proteins in a proteome or gene set; however, proteins containing unknown residues (X) are excluded from the intrinsic disorder calculations.
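
As an illustration of how the length columns and the X-residue exclusion described above could be computed, the following minimal Python sketch summarizes a list of protein sequences. The function names (`length_summary`, `disorder_eligible`) and the toy sequences are hypothetical and are not taken from the article; the disorder prediction itself is not shown, since the method is not specified in this table.

```python
from statistics import mean, median


def length_summary(sequences):
    """Return the Ave, Med, Max, and Min protein lengths for a set of sequences."""
    lengths = [len(seq) for seq in sequences]
    return {
        "Ave": round(mean(lengths), 1),  # one decimal place, as in the table
        "Med": median(lengths),
        "Max": max(lengths),
        "Min": min(lengths),
    }


def disorder_eligible(sequences):
    """Keep only sequences without unknown residues (X).

    Assumption: a single X anywhere in the sequence excludes the protein
    from the intrinsic disorder calculations.
    """
    return [seq for seq in sequences if "X" not in seq.upper()]


# Hypothetical toy sequences for illustration only (not data from the study).
toy_proteome = [
    "MKTAYIAKQR",
    "MSX",
    "MADEEKLPPGWEKRMSRSSGRVYYFNHITNASQ",
    "MKV",
]

summary = length_summary(toy_proteome)      # computed over all proteins
eligible = disorder_eligible(toy_proteome)  # X-containing proteins dropped

print(summary)
print(len(eligible), "of", len(toy_proteome), "sequences kept for disorder analysis")
```

Note that the length statistics are computed over every sequence, while the filtered list would feed only the disorder columns, mirroring the distinction drawn in the table note.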