ProbCons

In bioinformatics and proteomics, ProbCons is open-source software for probabilistic consistency-based multiple alignment of amino acid sequences. It is one of the most efficient protein multiple sequence alignment programs, since it has repeatedly demonstrated a statistically significant advantage in accuracy over similar tools, including Clustal and MAFFT.

Algorithm

The following describes the basic outline of the ProbCons algorithm.

Step 1: Reliability of an alignment edge

For every pair of sequences x and y, compute the probability that the letters x_i and y_j are paired in a^*, an alignment that is generated by the model.

$$
\begin{aligned}
P(x_i \sim y_j \mid x, y) \;&\overset{\mathrm{def}}{=}\; \Pr[x_i \sim y_j \text{ in some } a \mid x, y] \\
&= \sum_{\text{alignment } a \text{ with } x_i \sim y_j} \Pr[a \mid x, y] \\
&= \sum_{\text{alignment } a} \mathbf{1}\{x_i \sim y_j \in a\} \, \Pr[a \mid x, y]
\end{aligned}
$$

(Here \mathbf{1}\{x_i \sim y_j \in a\} is equal to 1 if x_i and y_j are paired in the alignment a and 0 otherwise.)

Step 2: Maximum expected accuracy

The accuracy of an alignment a^* with respect to another alignment a is defined as the number of common aligned pairs divided by the length of the shorter sequence.

Calculate the expected accuracy of each candidate alignment a^* under the posterior distribution over alignments:

$$
\begin{aligned}
E_{\Pr[a \mid x, y]}(\operatorname{acc}(a^*, a)) &= \sum_{a} \Pr[a \mid x, y] \, \operatorname{acc}(a^*, a) \\
&= \frac{1}{\min(|x|, |y|)} \cdot \sum_{a} \sum_{x_i \sim y_j \in a^*} \mathbf{1}\{x_i \sim y_j \in a\} \, \Pr[a \mid x, y] \\
&= \frac{1}{\min(|x|, |y|)} \cdot \sum_{x_i \sim y_j \in a^*} P(x_i \sim y_j \mid x, y)
\end{aligned}
$$

This yields a maximum expected accuracy (MEA) alignment:

$$
E(x, y) = \arg\max_{a^*} \; E_{\Pr[a \mid x, y]}(\operatorname{acc}(a^*, a))
$$

Step 3: Probabilistic Consistency Transformation

The posterior probabilities for all pairs of sequences x, y from the set of all sequences \mathcal{S} are now re-estimated using all intermediate sequences z:

$$
P'(x_i \sim y_j \mid x, y) = \frac{1}{|\mathcal{S}|} \sum_{z \in \mathcal{S}} \sum_{1 \leq k \leq |z|} P(x_i \sim z_k \mid x, z) \cdot P(z_k \sim y_j \mid z, y)
$$

This step can be iterated.

Step 4: Computation of guide tree

Construct a guide tree by hierarchical clustering, using the MEA score as the sequence similarity score. Cluster similarity is defined as a weighted average over pairwise sequence similarities.

Step 5: Compute MSA

Finally, compute the multiple sequence alignment (MSA) using progressive alignment or iterative alignment.
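
The maximum expected accuracy step can be illustrated with a small dynamic program. The sketch below is not taken from the ProbCons source code. It assumes the pairwise posteriors P(x_i ∼ y_j | x, y) are already available as a matrix, here called `post` (a hypothetical name), and it uses a Needleman-Wunsch-style recursion without gap penalties to find the set of aligned pairs with the largest total posterior. The function name `mea_alignment` is likewise only illustrative.

```python
def mea_alignment(post):
    """Return (pairs, expected_accuracy) for a posterior matrix.

    post[i][j] is assumed to hold P(x_i ~ y_j | x, y) with 0-based
    indices; the matrix has |x| rows and |y| columns.
    """
    n = len(post)
    m = len(post[0]) if n else 0

    # best[i][j]: largest total posterior over alignments of x[:i] and y[:j].
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j],                            # x_i left unpaired
                best[i][j - 1],                            # y_j left unpaired
                best[i - 1][j - 1] + post[i - 1][j - 1],   # pair x_i with y_j
            )

    # Trace back, preferring gaps on ties so zero-gain pairs are not reported.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j]:
            i -= 1
        elif best[i][j] == best[i][j - 1]:
            j -= 1
        else:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
    pairs.reverse()

    # Expected accuracy: summed posteriors divided by the shorter length.
    expected_accuracy = best[n][m] / min(n, m) if n and m else 0.0
    return pairs, expected_accuracy
```

For example, calling `mea_alignment([[0.9, 0.05], [0.05, 0.8], [0.0, 0.1]])` aligns a three-letter sequence against a two-letter one and returns the chosen index pairs together with their expected accuracy. The expected accuracy is a natural candidate for the pairwise similarity score used when building the guide tree.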

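The probabilistic consistency transformation amounts to averaging matrix products of pairwise posterior matrices over all intermediate sequences z. The following is a minimal sketch under stated assumptions, not the ProbCons implementation: posteriors are stored in a dictionary `post` keyed by sequence-name pairs, `lengths` maps each name to its sequence length, and for z equal to x or y the self-posterior is assumed to be the identity matrix. All function and variable names are hypothetical.

```python
from itertools import combinations


def matmul(a, b):
    """Plain-Python product of an (n x k) matrix a and a (k x m) matrix b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Swap rows and columns, turning P(x ~ y) into P(y ~ x)."""
    return [list(col) for col in zip(*m)]


def identity(n):
    """Assumed self-posterior: each letter pairs only with itself."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def consistency_transform(lengths, post):
    """Apply one round of P'(x ~ y) = (1/|S|) * sum_z P(x ~ z) P(z ~ y).

    lengths: dict mapping sequence name -> sequence length (non-empty).
    post: dict mapping (a, b) -> posterior matrix with lengths[a] rows
          and lengths[b] columns; one orientation per pair is enough.
    """
    names = sorted(lengths)

    def lookup(a, b):
        if a == b:
            return identity(lengths[a])
        if (a, b) in post:
            return post[(a, b)]
        return transpose(post[(b, a)])

    new_post = {}
    for x, y in combinations(names, 2):
        total = [[0.0] * lengths[y] for _ in range(lengths[x])]
        for z in names:
            product = matmul(lookup(x, z), lookup(z, y))
            for i in range(lengths[x]):
                for j in range(lengths[y]):
                    total[i][j] += product[i][j]
        new_post[(x, y)] = [[v / len(names) for v in row] for row in total]
    return new_post
```

Because the function returns a dictionary of the same shape as its input, it can be applied repeatedly to iterate the transformation. A production implementation would likely exploit the sparsity of the posterior matrices rather than use dense products.
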
See also

Sequence alignment software
Clustal
MUSCLE
AMAP
T-Coffee
Probalign

References

Do CB, Mahabhashyam MS, Brudno M, Batzoglou S (2005). "PROBCONS: Probabilistic Consistency-based Multiple Sequence Alignment". Genome Research. 15 (2): 330–340. doi:10.1101/gr.2821705. PMC 546535. PMID 15687296.
Roshan, Usman (2014-01-01). "Multiple Sequence Alignment Using Probcons and Probalign". In Russell, David J (ed.). Multiple Sequence Alignment Methods. Methods in Molecular Biology. Vol. 1079. Humana Press. pp. 147–153. doi:10.1007/978-1-62703-646-7_9. ISBN 9781627036450. PMID 24170400.
Lecture "Bioinformatics II" at University of Freiburg

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.