Rohu and Magur Genome Sequencing Consortium

Current Status

A. Whole genome sequencing

Whole genome sequencing of Rohu and Magur has been carried out on different sequencing platforms, viz. Ion Torrent PGM, Roche 454 GS FLX and Illumina NextSeq/HiSeq/MiSeq. The sequencing involved construction of shotgun DNA libraries, paired-end libraries (sizes: 150-250 bp, 200-400 bp, 350-450 bp, 400-500 bp, 500-600 bp and 550-650 bp) and mate-pair libraries (2-4 kb, 3 kb, 4-6 kb, 6 kb, 6-8 kb, 8 kb, 8-10 kb, 10-12 kb and 20 kb). The status of the sequencing work is given below:

Clarias batrachus

Sequencing Platform   Runs          Data (GB)   No. of reads (Millions)   Average read length (bp)
454 GS FLX            Run1          0.5         1.49                      350.2
                      Run2          0.56        1.54                      372.72
                      MP_8Kb        0.05        0.31                      184.28
                      MP_20Kb       0.31        1.5                       198.51
Ion Torrent PGM       Run1          0.89        2.76                      333.20
                      Run2          0.99        3.39                      300.60
Illumina (HiSeq)      PE_150-250    53.3        363.92                    150
                      PE_350-450    48.9        333.72                    150
                      PE_550-650    43          293.95                    150
Illumina (MiSeq)      PE_150-250    0.41        2.84                      149.4
                      PE_350-450    3.4         16.37                     208.57
                      PE_400-500    3.12        14.35                     223.14
                      PE_550-650    0.78        4.44                      180.46
                      MP_4-6Kb      0.29        1.64                      182.7

Labeo rohita

Sequencing Platform   Runs               Data (GB)   No. of reads (Millions)   Average read length (bp)
454 GS FLX            Run1               0.61        1.61                      392.22
                      Run2               0.60        1.58                      388.88
                      Run3               0.58        1.42                      423.93
                      Run4               0.52        1.38                      391.89
                      MP_3Kb             0.58        3.20                      177.67
                      MP_3Kb             0.58        3.20                      177.67
                      MP_20Kb            0.42        1.97                      215.18
Ion Torrent PGM       Run1               0.83        2.71                      260
Illumina (NextSeq)    PE_Short_Insert    57          386.63                    150
                      PE_Long_Insert     53.8        364.86                    150
                      MP_3kb             18.98       128.72                    150
                      MP_6kb             12.72       86.3                      150
Illumina (MiSeq)      PE_200-400         3.12        18.14                     176.27
                      PE_350-450         6.02        27.01                     228.46
                      PE_500-600         4.17        22.46                     190.45
                      PE_550-650         8.24        42.23                     199.89
                      MP_2-4Kb           0.29        1.63                      187.24
                      MP_4-6Kb           0.2         1.12                      190.55
                      MP_6-8Kb           0.86        4.59                      192.05
                      MP_8-10Kb          0.36        2.04                      183.26
                      MP_10-12Kb         0.29        1.69                      178.51

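
The three numeric columns in the sequencing tables are related: the data volume of a run is roughly the number of reads multiplied by the average read length. The following is a minimal sketch of that cross-check, assuming that "GB" means 10^9 bases (the tables do not state this) and using a hypothetical helper name, `estimated_yield_gb`. Small differences from the reported values are to be expected because of rounding and filtering.

```python
def estimated_yield_gb(reads_millions: float, avg_read_length: float) -> float:
    """Approximate data volume in gigabases from read count and mean read length.

    Assumption: 1 GB is taken as 10**9 bases; the source tables do not define it.
    """
    total_bases = reads_millions * 1_000_000 * avg_read_length
    return total_bases / 1_000_000_000


# Example row: L. rohita, 454 GS FLX Run1 (1.61 million reads, 392.22 average length)
estimate = estimated_yield_gb(1.61, 392.22)
print(f"Estimated yield: {estimate:.2f} GB (reported: 0.61 GB)")
```
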
B. De novo assembly

The whole genome sequencing data obtained on the different sequencing platforms (Roche 454, Ion Torrent, Illumina HiSeq and Illumina MiSeq) were assembled using different assemblers, viz. Newbler, CLC Genomics Workbench and ABySS. The ABySS assembly was performed at four different hash lengths (61, 64, 69 and 85). The assembly statistics are given below:

Statistics of 'Newbler' assembly of Roche 454 and Ion Torrent reads in L. rohita

No. of reads                        16,220,469
No. of bases                        5,262,108,568
Total no. of reads used             14,760,536
Total no. of bases used             5,017,770,751
No. of aligned reads                13,055,383 (80.49%)
No. of aligned bases                4,062,817,362 (81.01%)

Large contigs (length >500 bp)
  No. of contigs                    379,748
  No. of bases                      477,437,906
  Average contig size (bp)          1,257
  N50 contig size (bp)              1,358
  Largest contig size (bp)          16,609

All contigs
  No. of contigs                    645,485
  No. of bases                      571,382,225

Statistics of 'Newbler' assembly of Roche 454 and Ion Torrent reads in C. batrachus

No. of reads                        10,127,018
No. of bases                        3,289,947,570
Total no. of reads used             10,153,852
Total no. of bases used             3,212,313,421
No. of aligned reads                7,730,461 (76.13%)
No. of aligned bases                2,468,347,647 (76.84%)

Large contigs (length >500 bp)
  No. of contigs                    406,042
  No. of bases                      442,250,392
  Average contig size (bp)          1,089
  N50 contig size (bp)              1,178
  Largest contig size (bp)          16,510

All contigs
  No. of contigs                    536,276
  No. of bases                      481,576,303

Statistics of 'CLC' assembly of Roche 454 reads in C. batrachus

Minimum contig length (bp)          51
Maximum contig length (bp)          16479
Average contig length (bp)          551
Total number of contigs             282140
Total contig length (bp)            155462808
N75 contig size (bp)                454
N50 contig size (bp)                605
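
The N50 and N75 contig sizes reported in these tables summarize how contiguous an assembly is. As a minimal sketch, assuming the usual definition (the contig length at which contigs of that length or longer hold at least the given fraction of all assembled bases), they could be computed from a list of contig lengths as shown here. The function name `nx` and the toy lengths are illustrative assumptions and are not taken from the consortium's pipeline; individual assemblers may differ in how they treat ties or minimum contig length.

```python
from typing import Iterable


def nx(contig_lengths: Iterable[int], fraction: float) -> int:
    """Return the Nx contig size, e.g. fraction=0.5 for N50, 0.75 for N75."""
    lengths = sorted(contig_lengths, reverse=True)  # longest contigs first
    if not lengths:
        raise ValueError("at least one contig length is required")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")

    threshold = fraction * sum(lengths)
    running_total = 0
    for length in lengths:
        running_total += length
        # Stop at the first contig that brings the cumulative length to the threshold
        if running_total >= threshold:
            return length
    return lengths[-1]  # guard against floating-point rounding at fraction=1


# Toy example (not real assembly data)
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print("N50:", nx(toy_lengths, 0.5))
print("N75:", nx(toy_lengths, 0.75))
```

Because N75 requires covering a larger share of the assembly, it is never larger than N50 for the same set of contigs, which is consistent with the values in the table above.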

Statistics of 'ABySS' assembly of Illumina HiSeq reads in C. batrachus

Hash length                             61           64           69           85
Contigs generated                       270392       270018       265919       287999
Maximum contig length (bp)              59097        58997        65287        89416
Minimum contig length (bp)              200          200          200          200
Average contig length (bp)              3008.7       3045.7       3152         4191.5
Total contigs length (bp)               813531409    822382599    838189871    891793694
Total number of non-ATGC characters     6628749      6182998      5465133      3424771
Percentage of non-ATGC characters       0.815        0.752        0.652        0.384
Contigs >= 200 bp                       270392       270018       265919       287999
Contigs >= 500 bp                       215724       214280       208309       211754
Contigs >= 1 Kbp                        174852       174026       170275       173175
Contigs >= 10 Kbp                       14408        14983        16427        18674
N50 value (bp)                          5760         5893         6250         6611
Coverage                                0.787        0.797        0.813        0.867
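
Two of the rows in the table above can be derived from the others: the average contig length is the total contig length divided by the number of contigs, and the percentage of non-ATGC characters is the non-ATGC count divided by the total contig length. The sketch below recomputes both for the hash length 61 column as a simple consistency check. The function names are illustrative assumptions, not part of any assembler's interface.

```python
def average_contig_length(total_length: int, n_contigs: int) -> float:
    """Mean contig length in bp."""
    return total_length / n_contigs


def percent_non_atgc(non_atgc: int, total_length: int) -> float:
    """Share of non-ATGC characters in the assembly, as a percentage."""
    return 100.0 * non_atgc / total_length


# Values copied from the hash length 61 column of the table
total_length = 813531409
n_contigs = 270392
non_atgc = 6628749

print(round(average_contig_length(total_length, n_contigs), 1))  # 3008.7
print(round(percent_non_atgc(non_atgc, total_length), 3))        # 0.815
```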

Statistics of 'CLC' assembly of Illumina NextSeq reads in L. rohita

Contigs generated                       244599
Maximum contig length (bp)              55485
Minimum contig length (bp)              394
Average contig length (bp)              3,520.0
Total contigs length (bp)               860992811
Total number of non-ATGC characters     15007250
Percentage of non-ATGC characters       1.743
Contigs >= 100 bp                       244599
Contigs >= 200 bp                       244599
Contigs >= 500 bp                       244507
Contigs >= 1 Kbp                        200693
Contigs >= 10 Kbp                       13222
N50 value (bp)                          5394

Statistics of 'CLC' assembly of Illumina HiSeq reads in C. batrachus (first)

Contigs generated                       357242
Maximum contig length (bp)              26895
Minimum contig length (bp)              321
Average contig length (bp)              2,154.0
Total contigs length (bp)               769503263
Total number of non-ATGC characters     1882730
Percentage of non-ATGC characters       0.245
Contigs >= 100 bp                       357242
Contigs >= 200 bp                       357242
Contigs >= 500 bp                       357142
Contigs >= 1 Kbp                        267117
Contigs >= 10 Kbp                       1865
N50 value (bp)                          2844

Statistics of 'CLC' assembly of Illumina HiSeq reads in C. batrachus (second)

Contigs generated                       410486
Maximum contig length (bp)              56074
Minimum contig length (bp)              77
Average contig length (bp)              2,004.5
Total contigs length (bp)               822806256
Total number of non-ATGC characters     36237954
Percentage of non-ATGC characters       4.404
Contigs >= 100 bp                       410485
Contigs >= 200 bp                       410281
Contigs >= 500 bp                       281098
Contigs >= 1 Kbp                        198807
Contigs >= 10 Kbp                       9279
N50 value (bp)                          4184






Copyright © 2015 ICAR-National Bureau of Fish Genetic Resources, Lucknow.