(a) Advantage and mechanism of sequencers. (b) Components and cost of sequencers. (c) Application of sequencers.
(a)

| Sequencer | 454 GS FLX | HiSeq 2000 | SOLiDv4 | Sanger 3730xl |
|---|---|---|---|---|
| Sequencing mechanism | Pyrosequencing | Sequencing by synthesis | Ligation and two-base coding | Dideoxy chain termination |
| Read length | 700 bp | 50SE, 50PE, 101PE | 50 + 35 bp or 50 + 50 bp | bp |
| Accuracy | 99.9%* | 98% (100PE) | 99.94% *raw data | 99.999% |
| Reads | 1 M | 3 G | 1200~1400 M | - |
| Output data/run | 0.7 Gb | 600 Gb | 120 Gb | 1.9~84 Kb |
| Time/run | 24 hours | 3~10 days | 7 days for SE, 14 days for PE | 20 min~3 hours |
| Advantage | Read length, fast | High throughput | Accuracy | High quality, long read length |
| Disadvantage | Error rate with polybase more than 6, high cost, low throughput | Short read assembly | Short read assembly | High cost, low throughput |
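As a rough reading of the Output data/run and Time/run rows in part (a), the per-day yield of each next-generation instrument can be estimated by dividing one by the other. The sketch below assumes that the quoted output corresponds to the longest quoted run time (10 days for HiSeq 2000, 14 days for SOLiDv4 PE), which the source does not state. The names `runs`, `gb_per_day`, and `throughput` are illustrative only.

```python
# Output per run (Gb) and run time (days), taken from part (a).
# Assumption: the full output is paired with the longest quoted run time.
runs = {
    "454 GS FLX": (0.7, 1.0),     # 24 hours
    "HiSeq 2000": (600.0, 10.0),  # upper end of 3~10 days
    "SOLiDv4": (120.0, 14.0),     # 14 days for PE
}


def gb_per_day(output_gb, days):
    """Average yield in gigabases per day for one instrument run."""
    return output_gb / days


throughput = {name: gb_per_day(*values) for name, values in runs.items()}

for name, rate in throughput.items():
    print(f"{name}: {rate:.1f} Gb/day")
```

Under these assumptions the HiSeq 2000 delivers roughly two orders of magnitude more sequence per day than the 454 GS FLX, which fits the "High throughput" and "low throughput" entries in the Advantage and Disadvantage rows.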
(b)

| Sequencers | 454 GS FLX | HiSeq 2000 | SOLiDv4 | 3730xl |
|---|---|---|---|---|
| Instrument price | Instrument $500,000, $7000 per run | Instrument $690,000, $6000/(30x) human genome | Instrument $495,000, $15,000/100 Gb | Instrument $95,000, about $4 per 800 bp reaction |
| CPU | 2 × Intel Xeon X5675 | 2 × Intel Xeon X5560 | 8 × processor 2.0 GHz | Pentium IV 3.0 GHz |
| Memory | 48 GB | 48 GB | 16 GB | 1 GB |
| Hard disk | 1.1 TB | 3 TB | 10 TB | 280 GB |
| Automation in library preparation | Yes | Yes | Yes | No |
| Other required device | REM e system | cBot system | EZ beads system | No |
| Cost/million bases | $10 | $0.07 | $0.13 | $2400 |
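The Cost/million bases row can be related to the per-run prices in part (b) with simple arithmetic. The sketch below divides a price by the amount of sequence it buys, for the 454 GS FLX and HiSeq 2000 columns only. It assumes a human genome size of about 3 Gb (not given in the source) and uses the 0.7 Gb/run output from part (a). Other columns may be calculated on a different basis, for example the reagent costs described in note (2). The names `cost_per_mb` and `HUMAN_GENOME_GB` are hypothetical.

```python
HUMAN_GENOME_GB = 3.0  # assumption: approximate human genome size in Gb


def cost_per_mb(price_usd, output_gb):
    """Price per million bases, given a price and the gigabases it yields."""
    return price_usd / (output_gb * 1000.0)  # 1 Gb = 1000 Mb


# 454 GS FLX: $7000 per run, 0.7 Gb per run.
cost_454 = cost_per_mb(7000, 0.7)

# HiSeq 2000: $6000 for a human genome at 30x coverage.
cost_hiseq = cost_per_mb(6000, 30 * HUMAN_GENOME_GB)

print(f"454 GS FLX: ${cost_454:.2f} per Mb")
print(f"HiSeq 2000: ${cost_hiseq:.2f} per Mb")
```

With these inputs the arithmetic gives $10 and about $0.07 per million bases, in line with the first two entries of the Cost/million bases row.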
(c)
Sequencers
454 GS FLX
HiSeq 2000
SOLiDv4
3730xl
Resequencing
Yes
Yes
De novo
Yes
Yes
Yes
Cancer
Yes
Yes
Yes
Array
Yes
Yes
Yes
Yes
High GC sample
Yes
Yes
Yes
Bacterial
Yes
Yes
Yes
Large genome
Yes
Yes
Mutation detection
Yes
Yes
Yes
Yes
(1) All data are taken from daily average performance runs at BGI. The average daily sequence data output at BGI is about 8 Tb when about 80% of the sequencers (mainly HiSeq 2000) are running.
(2) The reagent cost of the 454 GS FLX Titanium is calculated based on sequencing 400 bp; that of the HiSeq 2000, on 200 bp; and that of the SOLiDv4, on 85 bp.
(3) HiSeq 2000 is more flexible in sequencing types, such as 50SE, 50PE, or 101PE.
(4) SOLiD has high accuracy, especially when coverage is more than 30x, so it is widely used in detecting variations in resequencing, targeted resequencing, and transcriptome sequencing. Lanes can be run independently to reduce cost.
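The 30x coverage figure in note (4) can be put in context with the usual depth estimate: total bases sequenced divided by genome size. The sketch below applies it to the HiSeq 2000 column, taking the 3 G reads from part (a) and the 200 bp from note (2). Treating 200 bp as the sequence obtained per read, and 3 Gb as the human genome size, are both assumptions made here rather than statements from the source. The function names are illustrative.

```python
HUMAN_GENOME_BP = 3_000_000_000  # assumption: approximate human genome size


def total_bases(reads, bases_per_read):
    """Total number of bases produced by a run."""
    return reads * bases_per_read


def coverage(bases, genome_size):
    """Average sequencing depth: bases sequenced per base of genome."""
    return bases / genome_size


# HiSeq 2000: 3 G reads, assumed 200 bp each (see note (2)).
hiseq_bases = total_bases(3_000_000_000, 200)
depth = coverage(hiseq_bases, HUMAN_GENOME_BP)
genomes_at_30x = depth / 30

print(f"Total output: {hiseq_bases / 1e9:.0f} Gb")
print(f"Depth on one human genome: {depth:.0f}x")
print(f"Human genomes at 30x per run: {genomes_at_30x:.1f}")
```

Under these assumptions the product of reads and read length comes to 600 Gb, the same as the Output data/run entry for the HiSeq 2000 in part (a).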