Paired-end tag

Paired-end tags (PET) (sometimes "Paired-End diTags", or simply "ditags") are the short sequences at the 5' and 3' ends of a DNA fragment which are unique enough that they (theoretically) exist together only once in a genome, therefore making the sequence of the DNA in between them available upon search (if full-genome sequence data is available) or upon further sequencing (since tag sites are unique enough to serve as primer annealing sites). Paired-end tags exist in PET libraries with the intervening DNA absent; that is, a PET "represents" a larger fragment of genomic DNA or cDNA by consisting of a short 5' linker sequence, a short 5' sequence tag, a short 3' sequence tag, and a short 3' linker sequence. It was shown conceptually that 13 base pairs are sufficient to map tags uniquely. However, longer sequences are more practical for mapping reads uniquely. The endonucleases (discussed below) used to produce PETs give longer tags (18/20 base pairs and 25/27 base pairs), but sequences of 50–100 base pairs would be optimal for both mapping and cost efficiency. After the PETs are extracted from many DNA fragments, they are linked (concatenated) together for efficient sequencing. On average, 20–30 tags could be sequenced with the Sanger method, which has a longer read length. Since the tag sequences are short, individual PETs are well suited for next-generation sequencing, which has short read lengths and higher throughput. The main advantages of PET sequencing are its reduced cost from sequencing only short fragments, detection of structural variants in the genome, and increased specificity when aligning back to the genome compared to single tags, which involve only one end of the DNA fragment.

Constructing the PET library

Figure: Workflow of cloning-based and cloning-free PET library construction.

PET libraries are typically prepared by one of two general methods: cloning based and cloning-free based.

Cloning based

Fragmented genomic DNA or complementary DNA (cDNA) of interest is cloned into plasmid vectors. The cloning sites are flanked with adaptor sequences that contain restriction sites for endonucleases (discussed below). Inserts are ligated to the plasmid vectors, and individual vectors are then transformed into E. coli, making the PET library. PET sequences are obtained by purifying the plasmid and digesting it with a specific endonuclease, leaving two short sequences on the ends of the vectors. Under intramolecular (dilute) conditions, vectors are re-circularized and ligated, leaving only the ditags in the vector. The sequences unique to the clone are now paired together. Depending on the next-generation sequencing technique, PET sequences can be left singular, dimerized, or concatenated into long chains.

Cloning-free based

Instead of cloning, adaptors containing the endonuclease sequence are ligated to the ends of fragmented genomic DNA or cDNA. The molecules are then self-circularized and digested with endonuclease, releasing the PET. Before sequencing, these PETs are ligated to adaptors to which PCR primers anneal for amplification. The advantage of cloning-based construction of the library is that it maintains the fragments or cDNA intact for future use. However, the construction process is much longer than the cloning-free method. Variations on library construction have been produced by next-generation sequencing companies to suit their respective technologies.

Endonucleases

Unlike other endonucleases, the MmeI (type IIS) and EcoP15I (type III) restriction endonucleases cut downstream of their target binding sites. MmeI cuts 18/20 base pairs downstream and EcoP15I cuts 25/27 base pairs downstream. As these restriction enzymes bind at their target sequences located in the adaptors, they cut and release vectors that contain short sequences of the fragment or cDNA ligated to them, producing PETs.

PET applications

Figure: Example of PET detection of deletions and insertions.

Figure: Example of alternative transcript structures detected by RNA-PET.

DNA-PET: Because PETs represent connectivity between the tags, the use of PETs in genome re-sequencing has advantages over the use of single reads. This application is called pairwise end sequencing, known colloquially as double-barrel shotgun sequencing. Anchoring one half of the pair uniquely to a single location in the genome allows mapping of the other half when it is ambiguous. Ambiguous reads are those that map to more than a single location. This increased efficiency reduces the cost of sequencing, as these ambiguous sequences, or reads, would normally be discarded. The connectivity of PET sequences also allows detection of structural variations: insertions, deletions, duplications, inversions, and translocations. During the construction of the PET library, the fragments can be selected to all be of a certain size. After mapping, the PET sequences are thus expected to be consistently a particular distance away from each other. A discrepancy from this distance indicates a structural variation between the PET sequences. For example (see the figure on deletions and insertions), a deletion in the sequenced genome will have reads that map further apart than expected in the reference genome, because the reference genome has a segment of DNA that is not present in the sequenced genome.

ChIP-PET: The combined use of chromatin immunoprecipitation (ChIP) and PET is used to detect regions of DNA bound by a protein of interest. ChIP-PET has an advantage over single-read sequencing in that it reduces the ambiguity of the reads generated. Its advantage over chip hybridization (ChIP-Chip) is that hybridization tiling arrays do not have the statistical sensitivity that sequence reads have. However, ChIP-PET, ChIP-Seq and ChIP-chip have all been highly successful.

ChIA-PET: The application of PET sequencing to chromatin interaction analysis. It is a genome-wide strategy for finding de novo long-range interactions between DNA elements bound by protein factors. The first ChIA-PET was developed by Fullwood et al. (2009) to generate a map of the interactions between chromatin bound by oestrogen receptor α (ER-α) in oestrogen-treated human breast adenocarcinoma cells. ChIA-PET is an unbiased way to analyze interactions and higher-order chromatin structures because it can detect interactions between unknown DNA elements. In contrast, 3C and 4C methods are used to detect interactions involving a specific target region in the genome. ChIA-PET is similar to finding fusion genes through RNA-PET in that the paired tags map to different regions in the genome. However, ChIA-PET involves artificial ligations between different DNA fragments located at different genomic regions, rather than a naturally occurring fusion between two genomic regions as in RNA-PET.

RNA-PET: This application is used for studying the transcriptome: transcripts, gene structures, and gene expression. The PET library is generated using full-length cDNAs, so the ditags represent the 5' capped and the 3' polyA tail signatures of individual transcripts. Therefore, RNA-PET is especially useful for demarcating the boundaries of transcription units. This helps identify alternative transcription start sites and polyadenylation sites of genes. RNA-PET could also be used to detect fusion genes and trans-splicing, but further experiments are needed to distinguish between them. Other methods of finding the boundaries of transcripts include the single-tag strategies CAGE, SAGE, and the most recent SuperSAGE, with CAGE and 5' SAGE defining the transcription start sites and 3' SAGE defining the polyadenylation sites. The advantages of PET sequencing over these methods are that PETs identify both ends of the transcripts and, at the same time, provide more specificity when mapping back to the genome. Sequencing the cDNAs can reveal the structures of transcripts in great detail, but this approach is much more expensive than RNA-PET sequencing, especially for characterizing the whole transcriptome. The major limitation of RNA-PET is the lack of information regarding the organization of the internal exons of transcripts. Therefore, RNA-PET is not suitable for detecting alternative splicing. In addition, if the cloning procedure is used to construct the cDNA library before generating the PETs, cDNAs that are difficult to clone (as a result of long transcripts) would have lower coverage. Similarly, transcripts (or transcript isoforms) with low expression levels would likely be under-represented as well.

References

1. Fullwood, M. J.; Wei, C. L.; Liu, E. T.; Ruan, Y. (2009). "Next-generation DNA sequencing of paired-end tags (PET) for transcriptome and genome analyses". Genome Research. 19 (4): 521–532. doi:10.1101/gr.074906.107. PMC 3807531. PMID 19339662.
2. Morgan, R. D.; Bhatia, T. K.; Lovasco, L.; Davis, T. B. (2008). "MmeI: A minimal Type II restriction-modification system that only modifies one DNA strand for host protection". Nucleic Acids Research. 36 (20): 6558–6570. doi:10.1093/nar/gkn711. PMC 2582602. PMID 18931376.
3. Matsumura, H.; Reich, S.; Ito, A.; Saitoh, H.; Kamoun, S.; Winter, P.; Kahl, G.; Reuter, M.; Kruger, D. H.; Terauchi, R. (2003). "Gene expression analysis of plant host-pathogen interactions by SuperSAGE". Proceedings of the National Academy of Sciences of the United States of America. 100 (26): 15718–15723. Bibcode:2003PNAS..10015718M. doi:10.1073/pnas.2536670100. PMC 307634. PMID 14676315.
4. McKernan, K. J.; et al. (2009). "Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding". Genome Research. 19 (9): 1527–1541. doi:10.1101/gr.091868.109. PMC 2752135. PMID 19546169.
5. Barski, A.; Cuddapah, S.; Cui, K.; Roh, T. Y.; Schones, D. E.; Wang, Z.; Wei, G.; Chepelev, I.; Zhao, K. (2007). "High-resolution profiling of histone methylations in the human genome". Cell. 129 (4): 823–837. doi:10.1016/j.cell.2007.05.009. PMID 17512414. S2CID 6326093.
6. Johnson, D. S.; Mortazavi, A.; Myers, R. M.; Wold, B. (2007). "Genome-wide mapping of in vivo protein-DNA interactions". Science. 316 (5830): 1497–1502. Bibcode:2007Sci...316.1497J. doi:10.1126/science.1141319. PMID 17540862. S2CID 519841.
7. Chen, X.; Xu, H.; Yuan, P.; Fang, F.; Huss, M.; Vega, V. B.; Wong, E.; Orlov, Y. L.; Zhang, W.; Jiang, J.; Loh, Y. H.; Yeo, H. C.; Yeo, Z. X.; Narang, V.; Govindarajan, K. R.; Leong, B.; Shahab, A.; Ruan, Y.; Bourque, G.; Sung, W. K.; Clarke, N. D.; Wei, C. L.; Ng, H. H. (2008). "Integration of external signaling pathways with the core transcriptional network in embryonic stem cells". Cell. 133 (6): 1106–1117. doi:10.1016/j.cell.2008.04.043. PMID 18555785. S2CID 1768190.
8. Wu, J.; Smith, L. T.; Plass, C.; Huang, T. H. (2006). "ChIP-chip comes of age for genome-wide functional analysis". Cancer Research. 66 (14): 6899–6902. doi:10.1158/0008-5472.CAN-06-0276. PMID 16849531.
9. Fullwood, M. J.; et al. (2009). "An oestrogen-receptor-alpha-bound human chromatin interactome". Nature. 462 (7269): 58–64. Bibcode:2009Natur.462...58F. doi:10.1038/nature08497. PMC 2774924. PMID 19890323.
10. Ng, P.; Wei, C. L.; Sung, W. K.; Chiu, K. P.; Lipovich, L.; Ang, C. C.; Gupta, S.; Shahab, A.; Ridwan, A.; Wong, C. H.; Liu, E. T.; Ruan, Y. (2005). "Gene identification signature (GIS) analysis for transcriptome characterization and genome annotation". Nature Methods. 2 (2): 105–111. doi:10.1038/nmeth733. PMID 15782207. S2CID 14288213.
11. Ruan, Y.; Ooi, H. S.; Choo, S. W.; Chiu, K. P.; Zhao, X. D.; Srinivasan, K. G.; Yao, F.; Choo, C. Y.; Liu, J.; Ariyaratne, P.; Bin, W. G.; Kuznetsov, V. A.; Shahab, A.; Sung, W. K.; Bourque, G.; Palanisamy, N.; Wei, C. L. (2007). "Fusion transcripts and transcribed retrotransposed loci discovered through comprehensive transcriptome analysis using Paired-End diTags (PETs)". Genome Research. 17 (6): 828–838. doi:10.1101/gr.6018607. PMC 1891342. PMID 17568001.

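
Supplementary sketch: distance-based structural variation calls. The DNA-PET discussion states that size-selected fragments should map a consistent distance apart, and that a deviation from that distance points to a structural variation, with a deletion making the tags map further apart than expected. The Python sketch below is one minimal way to express that rule. The `MappedPet` record, the `classify_pet` function, the `tolerance` parameter, and the output labels are all hypothetical. Treating tags on different chromosomes as a translocation candidate is an added assumption, and inversions are left out because detecting them would need strand information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappedPet:
    """Hypothetical record of where the two tags of one PET mapped."""
    chrom_5p: str
    pos_5p: int
    chrom_3p: str
    pos_3p: int


def classify_pet(pet: MappedPet, expected_span: int, tolerance: int) -> str:
    """Compare the mapped tag distance with the size-selected fragment length."""
    if pet.chrom_5p != pet.chrom_3p:
        # Assumption: tags on different chromosomes suggest a translocation.
        return "translocation_candidate"
    span = abs(pet.pos_3p - pet.pos_5p)
    if span > expected_span + tolerance:
        # The reference holds extra DNA between the tags, so the sequenced
        # genome probably has a deletion here.
        return "deletion_candidate"
    if span < expected_span - tolerance:
        # The tags sit closer together on the reference than in the
        # fragment, so the sequenced genome probably has an insertion.
        return "insertion_candidate"
    return "concordant"


if __name__ == "__main__":
    pets = [
        MappedPet("chr1", 1_000, "chr1", 4_000),
        MappedPet("chr1", 1_000, "chr1", 7_000),
        MappedPet("chr1", 1_000, "chr1", 2_500),
        MappedPet("chr1", 1_000, "chr5", 4_000),
    ]
    for pet in pets:
        print(pet, classify_pet(pet, expected_span=3_000, tolerance=300))
```

A real pipeline would estimate the expected span and its spread from the library itself and would require several supporting PETs before reporting a variant. The fixed tolerance here only keeps the example short.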
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.