
Shotgun sequencing


In genetics, shotgun sequencing is a method used for sequencing random DNA strands. It is named by analogy with the rapidly expanding, quasi-random shot grouping of a shotgun.

The chain-termination method of DNA sequencing ("Sanger sequencing") can only be used for short DNA strands of 100 to 1000 base pairs. Due to this size limit, longer sequences are subdivided into smaller fragments that can be sequenced separately, and these sequences are assembled to give the overall sequence.

In shotgun sequencing, DNA is broken up randomly into numerous small segments, which are sequenced using the chain termination method to obtain reads. Multiple overlapping reads for the target DNA are obtained by performing several rounds of this fragmentation and sequencing. Computer programs then use the overlapping ends of different reads to assemble them into a continuous sequence.

Shotgun sequencing was one of the precursor technologies that enabled whole genome sequencing.

Example

For example, consider the following two rounds of shotgun reads:

Strand                    Sequence
Original                  AGCATGCTGCAGTCATGCTTAGGCTA
First shotgun sequence    AGCATGCTGCAGTCATGCT-------
                          -------------------TAGGCTA
Second shotgun sequence   AGCATG--------------------
                          ------CTGCAGTCATGCTTAGGCTA
Reconstruction            AGCATGCTGCAGTCATGCTTAGGCTA

In this extremely simplified example, none of the reads cover the full length of the original sequence, but the four reads can be assembled into the original sequence using the overlap of their ends to align and order them. In reality, this process uses enormous amounts of information that are rife with ambiguities and sequencing errors. Assembly of complex genomes is additionally complicated by the great abundance of repetitive sequences, meaning similar short reads could come from completely different parts of the sequence.

Many overlapping reads for each segment of the original DNA are necessary to overcome these difficulties and accurately assemble the sequence. For example, to complete the Human Genome Project, most of the human genome was sequenced at 12X or greater coverage; that is, each base in the final sequence was present on average in 12 different reads. Even so, as of 2004, current methods had failed to isolate or assemble reliable sequence for approximately 1% of the (euchromatic) human genome.

Whole genome shotgun sequencing

History

Whole genome shotgun sequencing for small (4000- to 7000-base-pair) genomes was first suggested in 1979. The first genome sequenced by shotgun sequencing was that of cauliflower mosaic virus, published in 1981.

Paired-end sequencing

Broader application benefited from pairwise end sequencing, known colloquially as double-barrel shotgun sequencing. As sequencing projects began to take on longer and more complicated DNA sequences, multiple groups began to realize that useful information could be obtained by sequencing both ends of a fragment of DNA. Although sequencing both ends of the same fragment and keeping track of the paired data was more cumbersome than sequencing a single end of two distinct fragments, the knowledge that the two sequences were oriented in opposite directions and were about the length of a fragment apart from each other was valuable in reconstructing the sequence of the original target fragment.

The first published description of the use of paired ends was in 1990 as part of the sequencing of the human HGPRT locus, although the use of paired ends was limited to closing gaps after the application of a traditional shotgun sequencing approach. The first theoretical description of a pure pairwise end sequencing strategy, assuming fragments of constant length, was in 1991. At the time, there was community consensus that the optimal fragment length for pairwise end sequencing would be three times the sequence read length. In 1995 Roach et al. introduced the innovation of using fragments of varying sizes, and demonstrated that a pure pairwise end-sequencing strategy would be possible on large targets. The strategy was subsequently adopted by The Institute for Genomic Research (TIGR) to sequence the genome of the bacterium Haemophilus influenzae in 1995, and then by Celera Genomics to sequence the Drosophila melanogaster (fruit fly) genome in 2000, and subsequently the human genome.

Approach

To apply the strategy, a high-molecular-weight DNA strand is sheared into random fragments, size-selected (usually 2, 10, 50, and 150 kb), and cloned into an appropriate vector. The clones are then sequenced from both ends using the chain termination method, yielding two short sequences. Each sequence is called an end-read (or read 1 and read 2), and two reads from the same clone are referred to as mate pairs. Since the chain termination method usually can only produce reads between 500 and 1000 bases long, in all but the smallest clones, mate pairs will rarely overlap.

Assembly

The original sequence is reconstructed from the reads using sequence assembly software. First, overlapping reads are collected into longer composite sequences known as contigs. Contigs can be linked together into scaffolds by following connections between mate pairs. The distance between contigs can be inferred from the mate pair positions if the average fragment length of the library is known and has a narrow window of deviation. Depending on the size of the gap between contigs, different techniques can be used to find the sequence in the gaps. If the gap is small (5–20 kb), polymerase chain reaction (PCR) is used to amplify the region, followed by sequencing. If the gap is large (>20 kb), the large fragment is cloned in special vectors such as bacterial artificial chromosomes (BAC), followed by sequencing of the vector.

Pros and cons

Proponents of this approach argue that it is possible to sequence the whole genome at once using large arrays of sequencers, which makes the whole process much more efficient than more traditional approaches. Detractors argue that although the technique quickly sequences large regions of DNA, its ability to correctly link these regions is suspect, particularly for eukaryotic genomes with repeating regions. As sequence assembly programs become more sophisticated and computing power becomes cheaper, it may be possible to overcome this limitation.

Coverage

Coverage (read depth or depth) is the average number of reads representing a given nucleotide in the reconstructed sequence. It can be calculated from the length of the original genome ($G$), the number of reads ($N$), and the average read length ($L$) as $N \times L / G$. For example, a hypothetical genome with 2,000 base pairs reconstructed from 8 reads with an average length of 500 nucleotides will have 2x redundancy. This parameter also enables one to estimate other quantities, such as the percentage of the genome covered by reads (sometimes also called coverage). A high coverage in shotgun sequencing is desired because it can overcome errors in base calling and assembly. The subject of DNA sequencing theory addresses the relationships of such quantities.

Sometimes a distinction is made between sequence coverage and physical coverage. Sequence coverage is the average number of times a base is read (as described above). Physical coverage is the average number of times a base is read or spanned by mate paired reads.

Hierarchical shotgun sequencing

Figure: In whole genome shotgun sequencing (top), the entire genome is sheared randomly into small fragments (appropriately sized for sequencing) and then reassembled. In hierarchical shotgun sequencing (bottom), the genome is first broken into larger segments. After the order of these segments is deduced, they are further sheared into fragments appropriately sized for sequencing.

Although shotgun sequencing can in theory be applied to a genome of any size, its direct application to the sequencing of large genomes (for instance, the human genome) was limited until the late 1990s, when technological advances made practical the handling of the vast quantities of complex data involved in the process. Historically, full-genome shotgun sequencing was believed to be limited by both the sheer size of large genomes and by the complexity added by the high percentage of repetitive DNA (greater than 50% for the human genome) present in large genomes. It was not widely accepted that a full-genome shotgun sequence of a large genome would provide reliable data. For these reasons, other strategies that lowered the computational load of sequence assembly had to be utilized before shotgun sequencing was performed. In hierarchical sequencing, also known as top-down sequencing, a low-resolution physical map of the genome is made prior to actual sequencing. From this map, a minimal number of fragments that cover the entire chromosome are selected for sequencing. In this way, the minimum amount of high-throughput sequencing and assembly is required.

The amplified genome is first sheared into larger pieces (50–200 kb) and cloned into a bacterial host using BACs or P1-derived artificial chromosomes (PAC). Because multiple genome copies have been sheared at random, the fragments contained in these clones have different ends, and with enough coverage (see section above) finding the smallest possible scaffold of BAC contigs that covers the entire genome is theoretically possible. This scaffold is called the minimum tiling path.

Figure: A BAC contig that covers the entire genomic area of interest makes up the tiling path.

Once a tiling path has been found, the BACs that form this path are sheared at random into smaller fragments and can be sequenced using the shotgun method on a smaller scale.

Although the full sequences of the BAC contigs are not known, their orientations relative to one another are known. There are several methods for deducing this order and selecting the BACs that make up a tiling path. The general strategy involves identifying the positions of the clones relative to one another and then selecting the fewest clones required to form a contiguous scaffold that covers the entire area of interest. The order of the clones is deduced by determining the way in which they overlap. Overlapping clones can be identified in several ways. A small radioactively or chemically labeled probe containing a sequence-tagged site (STS) can be hybridized onto a microarray upon which the clones are printed. In this way, all the clones that contain a particular sequence in the genome are identified. The end of one of these clones can then be sequenced to yield a new probe and the process repeated in a method called chromosome walking.

Alternatively, the BAC library can be restriction-digested. Two clones that have several fragment sizes in common are inferred to overlap because they contain multiple similarly spaced restriction sites in common. This method of genomic mapping is called restriction or BAC fingerprinting because it identifies a set of restriction sites contained in each clone. Once the overlap between the clones has been found and their order relative to the genome known, a scaffold of a minimal subset of these contigs that covers the entire genome is shotgun-sequenced.

Because it involves first creating a low-resolution map of the genome, hierarchical shotgun sequencing is slower than whole-genome shotgun sequencing, but relies less heavily on computer algorithms. The process of extensive BAC library creation and tiling path selection, however, makes hierarchical shotgun sequencing slow and labor-intensive. Now that the technology is available and the reliability of the data demonstrated, the speed and cost efficiency of whole-genome shotgun sequencing have made it the primary method for genome sequencing.

Newer sequencing technologies

The classical shotgun sequencing was based on the Sanger sequencing method: this was the most advanced technique for sequencing genomes from about 1995–2005. The shotgun strategy is still applied today, but with other sequencing technologies, such as short-read sequencing and long-read sequencing.

Short-read or "next-gen" sequencing produces shorter reads (anywhere from 25–500 bp) but many hundreds of thousands or millions of reads in a relatively short time (on the order of a day). This results in high coverage, but the assembly process is much more computationally intensive. These technologies are vastly superior to Sanger sequencing due to the high volume of data and the relatively short time it takes to sequence a whole genome.

Metagenomic shotgun sequencing

Reads of 400–500 base pairs in length are sufficient to determine the species or strain of the organism the DNA comes from, provided its genome is already known, by using, for example, k-mer based taxonomic classifier software. With millions of reads from next generation sequencing of an environmental sample, it is possible to get a complete overview of any complex microbiome with thousands of species, like the gut flora. Advantages over 16S rRNA amplicon sequencing are: not being limited to bacteria; strain-level classification where amplicon sequencing only gets the genus; and the possibility to extract whole genes and specify their function as part of the metagenome. The sensitivity of metagenomic sequencing makes it an attractive choice for clinical use. However, it also heightens the problem of contamination of the sample or the sequencing pipeline.

See also

Clinical metagenomic sequencing
DNA sequencing theory
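As a worked illustration of the coverage formula $N \times L / G$, the sketch below recomputes the hypothetical 2,000-base-pair example from the text. The function names are illustrative only. The covered-fraction estimate $1 - e^{-c}$ is an added assumption: it comes from an idealized model in which reads start uniformly at random along the genome, and it is not stated in the article itself.

```python
import math


def sequence_coverage(num_reads, avg_read_len, genome_len):
    """Average read depth c = N * L / G."""
    return num_reads * avg_read_len / genome_len


def expected_fraction_covered(coverage):
    """Assumed idealized model: reads start uniformly at random,
    so a given base is missed with probability about exp(-c)."""
    return 1.0 - math.exp(-coverage)


# The article's example: G = 2,000 bp, N = 8 reads, L = 500 nt.
c = sequence_coverage(num_reads=8, avg_read_len=500, genome_len=2000)
print(f"coverage: {c:.1f}x")
print(f"expected fraction covered: {expected_fraction_covered(c):.3f}")
```

Under this idealized model, 2x coverage would still leave roughly 14% of bases unread, which is one way to see why real projects aim for much deeper coverage.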

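The overlap-based reconstruction in the article's four-read example can be sketched with a toy greedy assembler. This is a minimal illustration under stated assumptions, not how production assemblers work: the function names and the `min_len` threshold are hypothetical, reads are assumed to be error-free and from a single strand, and nothing here resolves repeats.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` equal to a prefix of `b`
    (0 if that overlap is shorter than `min_len`)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    # Drop duplicates and reads fully contained in another read.
    uniq = list(dict.fromkeys(reads))
    contigs = [r for r in uniq if not any(r != o and r in o for o in uniq)]
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave separate contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# The four reads from the two shotgun rounds in the example.
reads = [
    "AGCATGCTGCAGTCATGCT",
    "TAGGCTA",
    "AGCATG",
    "CTGCAGTCATGCTTAGGCTA",
]
print(greedy_assemble(reads))
```

On real data this greedy strategy is fragile: sequencing errors break exact matching, and repetitive sequences create false overlaps, which is the difficulty the article describes for complex genomes.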

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.