(MSAs) starting from a single query sequence or an MSA. As in PSI-BLAST, it works iteratively, repeatedly constructing new query profiles by adding the results found in the previous round. It matches against pre-built HMM databases derived from protein sequence databases, in which each HMM represents a "cluster" of related proteins. In HHblits, such matches are made by HMM-HMM comparison, which grants additional sensitivity. Its prefilter reduces the tens of millions of HMMs to match against to a few thousand, thus speeding up the slow HMM-HMM comparison process.
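The iterative scheme described above can be illustrated with a deliberately simplified sketch. It assumes ungapped sequences of equal length and scores a sequence by the average profile probability of its residues. The function names, the threshold, and the toy sequences are hypothetical and chosen for illustration only. The real HHblits uses gapped HMM-HMM alignment, E-values, and a prefilter, none of which is modelled here.

```python
from collections import Counter


def build_profile(members):
    """Per-column residue frequencies of equal-length, ungapped sequences."""
    n = len(members)
    return [{aa: count / n for aa, count in Counter(column).items()}
            for column in zip(*members)]


def profile_score(profile, seq):
    """Average probability the profile assigns to each residue of seq."""
    return sum(col.get(aa, 0.0) for col, aa in zip(profile, seq)) / len(profile)


def iterative_search(query, database, threshold=0.45, max_rounds=3):
    """Return {name: round in which the sequence was first found}."""
    members = [query]
    found = {}
    for round_no in range(1, max_rounds + 1):
        # Rebuild the query profile from everything found so far.
        profile = build_profile(members)
        new_hits = [(name, seq) for name, seq in database.items()
                    if name not in found
                    and profile_score(profile, seq) >= threshold]
        if not new_hits:
            break  # converged: no additional sequences pass the threshold
        for name, seq in new_hits:
            found[name] = round_no
            members.append(seq)
    return found


# Toy database (hypothetical sequences).
database = {"close": "MKVLSA", "remote": "MRILSA", "unrelated": "WWPPCC"}
hits = iterative_search("MKVLAG", database)
print(hits)
```

In this toy run the "remote" sequence is too dissimilar to the query alone, but it passes the threshold once the "close" hit has been merged into the profile, which is the effect that iterative profile searches rely on.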
430:
functions remain unknown. Many proteins have been investigated in model organisms such as bacteria, baker's yeast, fruit flies, zebrafish, or mice, for which experiments can often be done more easily than with human cells. To predict the function, structure, or other properties of a protein for which only the amino acid sequence is known, the protein sequence is compared to the sequences of other proteins in public databases. If a protein with a sufficiently similar sequence is found, the two proteins are likely to be evolutionarily related (
298:
(MSAs), in which related proteins are written together (aligned), such that the frequencies of amino acids in each position can be interpreted as probabilities for amino acids in new related proteins, and be used to derive the "similarity scores". Because profiles contain much more information than a single sequence (e.g. the position-specific degree of conservation), profile-profile comparison methods are much more powerful than sequence-sequence comparison methods like
). In that case, they are likely to share similar structures and functions. Therefore, if a protein with a sufficiently similar sequence and with known functions and/or structure can be found by the sequence search, the unknown protein's functions, structure, and domain composition can be predicted. Such predictions greatly facilitate the determination of the function or structure by targeted validation experiments.
514:
of sequences related to the query sequence/MSA using the HHblits program. From this alignment, a profile HMM is calculated. The databases contain HMMs that are precalculated in the same fashion using PSI-BLAST. The output of HHpred and HHsearch is a ranked list of database matches (including E-values
577:
7, 8, and 9, for blind protein structure prediction experiments. In CASP9, HHpredA, B, and C were ranked 1st, 2nd, and 3rd out of 81 participating automatic structure prediction servers in template-based modeling and 6th, 7th, 8th on all 147 targets, while being much faster than the best 20 servers.
565:
of the query with the template protein sequence. For example, a search through the PDB database of proteins with solved 3D structure takes a few minutes. If a significant match with a protein of known structure (a "template") is found in the PDB database, HHpred allows the user to build a homology
429:
Proteins are central players in all of life's processes. Understanding them is central to understanding molecular processes in cells. This is particularly important in order to understand the origin of diseases. But for a large fraction of the approximately 20 000 human proteins the structures and
437:
Sequence searches are frequently performed by biologists to infer the function of an unknown protein from its sequence. For this purpose, the protein's sequence is compared to the sequences of other proteins in public databases and its function is deduced from those of the most similar sequences.
590:
In addition to HHsearch and HHblits, the HH-suite contains programs and Perl scripts for format conversion, filtering of MSAs, generation of profile HMMs, the addition of secondary structure predictions to MSAs, the extraction of alignments from program output, and the generation of customized
393:
sequence searching. It contains programs that can search for similar protein sequences in protein sequence databases. Sequence searches are a standard tool in modern biology with which the function of unknown proteins can be inferred from the functions of proteins with similar sequences.
(HMMs), an extension of PSSM sequence profiles that also records position-specific amino acid insertion and deletion frequencies. HHsearch searches a database of HMMs with a query HMM. Before starting the search through the actual database of HMMs, HHsearch/HHpred builds a
486:
Modern sensitive methods for protein search use sequence profiles. A profile may be compared to a single sequence or, in more advanced methods such as HH-suite, to another profile. Profiles and alignments are themselves derived from matches, using for example
549:
Applications of HHpred and HHsearch include protein structure prediction, complex structure prediction, function prediction, domain prediction, domain boundary prediction, and evolutionary classification of proteins.
(HMMs). The name comes from the fact that it performs HMM-HMM alignments. The programs are among the most popular methods for protein sequence matching and have been cited more than 5000 times in total according to
450:
can be inferred. HHsearch searches databases with a query protein sequence. The HHpred server and the HH-suite software package offer many popular, regularly updated databases, such as the
561:
is searched for "template" proteins similar to the query protein. If such a template protein is found, the structure of the protein of interest can be predicted based on a pairwise
463:
, that is, to build a model of the structure of a query protein for which only the sequence is known. For that purpose, a database of proteins with known structures such as the
438:
Often, no sequences with annotated functions can be found in such a search. In this case, more sensitive methods are required to identify more remotely related proteins or
582:
8, HHpred was ranked 7th on all targets and 2nd on the subset of single domain proteins, while still being more than 50 times faster than the top-ranked servers.
The HH-suite comes with a number of pre-built profile HMMs that can be searched using HHblits and HHsearch, among them a clustered version of the
538:
1277:
(PSSM) profile contains, for each position in the query sequence, the similarity score for each of the 20 amino acids. The profiles are derived from
467:
184:
402:
are the two main programs in the package and the entry points to its search function; the latter is a faster, iterative search method.
Johannes Söding, Michael
Remmert, Andreas Biegert, Andreas Hauser, Markus Meier, Martin Steinegger
and probabilities for a true relationship) and the pairwise query-database sequence alignments.
1182:
1014:"Profile–profile comparisons by COMPASS predict intricate homologies between protein families"
The HMM-HMM alignment algorithm of HHblits and HHsearch was significantly accelerated using
894:"The HHpred interactive server for protein homology detection and structure prediction"
831:"HHblits: Lightning-fast iterative protein sequence searching by HMM-HMM alignment"
Steinegger M, Meier M, Mirdita M, Vöhringer H, Haunsberger S, Söding J (2019).
1061:
Dunbrack RL Jr (2006). "Sequence comparison and protein structure prediction".
Official CASP9 results for the template-based modeling category (121 targets)
1196:"HH-suite3 for fast remote homology detection and deep protein annotation"
692:
Generate PDB file with indices renumbered to match input sequence indices
668:
Build HHblits database with prefiltering, packed MSA/HMM, and index files
909:
732:
CASP - Critical
Assessment of Techniques for Protein Structure Prediction
684:
Split a multiple-sequence FASTA file into multiple single-sequence files
624:
Filter an MSA by maximum sequence identity, coverage, and other criteria
600:(Iteratively) search an HHblits database with a query sequence or MSA
442:. From these relationships, hypotheses about the protein's functions,
1125:"Mapping Monomeric Threading to Protein–Protein Structure Prediction"
Generate MSAs or coarse 3D models from HHsearch or HHblits results
Run a command for many files in parallel using multiple threads
632:
Calculate pairwise alignments, dot plots etc. for two HMMs/MSAs
518:
HHblits, a part of the HH-suite since 2011, builds high-quality
1272:
1193:
570:
software, starting from the pairwise query-template alignment.
327:
573:
HHpred servers have been ranked among the best servers during
506:
HHpred and HHsearch represent query and database proteins by
311:
608:
Search an HHsearch database of HMMs with a query MSA or HMM
at Max-Planck
Institute in Göttingen - HH-suite developers
828:
1269:— free server at Max-Planck Institute in Tuebingen
1263:— free server at Max-Planck Institute in Tuebingen
1122:
or profile-sequence comparison methods like PSI-BLAST.
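The idea that amino acid frequencies in the columns of an MSA can be read as position-specific probabilities, and turned into similarity scores, can be sketched as follows. This is a minimal illustration under stated assumptions: a uniform background distribution, a fixed pseudocount, and a toy alignment. All names are hypothetical, and actual tools use more elaborate sequence weighting and pseudocount schemes.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(msa, pseudocount=1.0):
    """Return one dict per alignment column mapping residue -> probability."""
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        # Count only standard residues; gap characters such as '-' are ignored.
        counts = Counter(row[col] for row in msa if row[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({aa: (counts[aa] + pseudocount) / total
                        for aa in AMINO_ACIDS})
    return profile


def log_odds_scores(profile, background=None):
    """Convert probabilities to log-odds scores against a background."""
    if background is None:
        # Assumption: uniform background; real methods use observed frequencies.
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    return [{aa: math.log2(col[aa] / background[aa]) for aa in AMINO_ACIDS}
            for col in profile]


def score_sequence(pssm, seq):
    """Sum the position-specific scores of an ungapped sequence."""
    return sum(col[aa] for col, aa in zip(pssm, seq))


# Toy alignment of four related sequences (hypothetical data).
msa = ["MKVLA", "MKILA", "MRVLG", "MKVLS"]
pssm = log_odds_scores(column_frequencies(msa))
print(score_sequence(pssm, "MKVLA"), score_sequence(pssm, "WWWWW"))
```

A fully conserved column, such as the first one here, gives a strongly positive score to the conserved residue and negative scores to all others. This is the position-specific conservation information that a single sequence cannot provide.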
1011:
652:
predicted secondary structure to an MSA or HHM file
965:"Improving the quality of twilight-zone alignments"
891:
777:"Protein homology detection by HMM-HMM comparison"
829:Remmert M, Biegert A, Hauser A, Söding J (2011).
1299:
1060:
963:Jaroszewski L, Rychlewski L, Godzik A (2000).
958:
956:
410:that uses homology information from HH-suite.
1187:
1129:Journal of Chemical Information and Modeling
1255:Precompiled HH-suite binaries and databases
1123:Guerler A, Govindarajoo B, Zhang Y (2013).
953:
541:structural protein domains, and many more.
482:Iterative sequence search scheme of HHblits
1183:Official CASP9 results for all 147 targets
770:
768:
413:The HH-suite searches for sequences using
1012:Sadreyev RI, Baker D, Grishin NV (2003).
988:
917:
857:
824:
822:
802:
792:
737:BLAST (Basic Local Alignment Search Tool)
887:
885:
774:
477:
765:
1300:
892:Söding J, Biegert A, Lupas AN (2005).
819:
533:of proteins with known structures, of
374:https://github.com/soedinglab/hh-suite
260:3.3.0 / 25 August 2020
1278:CASP9 template-based modeling results
1063:Current Opinion in Structural Biology
882:
113:
56:
15:
1293:HH-suite arch linux user repository
13:
14:
1324:
1242:
742:Context-specific BLAST (CS-BLAST)
722:Position-specific scoring matrix
493:position-specific scoring matrix
118:
61:
20:
1176:
1165:
1116:
616:Build an HMM from an input MSA
544:
1095:
1089:
1054:
1005:
934:
904:(Web Server issue): W244–248.
859:11858/00-001M-0000-0015-8D56-A
804:11858/00-001M-0000-0017-EC7A-F
754:
702:in version 3 of the HH-suite.
537:protein family alignments, of
1:
794:10.1093/bioinformatics/bti125
747:
424:
717:Protein structure prediction
520:multiple sequence alignments
508:profile hidden Markov models
497:multiple sequence alignments
473:
408:protein structure prediction
7:
727:Multiple sequence alignment
712:Sequence alignment software
705:
585:
553:HHsearch is often used for
512:multiple sequence alignment
10:
1329:
1098:"Some Notes about HHSuite"
640:Reformat one or many MSAs
1213:10.1186/s12859-019-3019-7
1075:10.1016/j.sbi.2006.05.006
369:
357:
344:
334:
317:
307:
276:
272:
253:
249:
239:
1257:download from developers
406:is an online server for
1308:Bioinformatics software
1288:HH-suite ubuntu package
1283:HH-suite debian package
898:Nucleic Acids Research
761:Debian hhsuite package
483:
389:package for sensitive
1313:Computational science
481:
415:hidden Markov models
387:open-source software
1030:10.1110/ps.03197403
981:10.1110/ps.9.8.1487
941:Citations to HHpred
700:vector instructions
236:
1200:BMC Bioinformatics
910:10.1093/nar/gki408
850:10.1038/NMETH.1818
563:sequence alignment
484:
448:domain composition
234:
1141:10.1021/ci300579r
1024:(10): 2262–2272.
775:Söding J (2005).
696:
695:
559:protein data bank
555:homology modeling
531:Protein Data Bank
529:database, of the
454:, as well as the
452:Protein Data Bank
379:
378:
330:package available
1100:. Archived from
975:(8): 1487–1496.
566:model using the
440:protein families
319:Operating system
1104:on 3 April 2019
1094:
1090:
1059:
1055:
1018:Protein Science
1010:
1006:
969:Protein Science
961:
954:
939:
935:
890:
883:
833:
827:
820:
773:
766:
759:
755:
750:
708:
689:renumberpdb.pl
673:multithread.pl
657:hhmakemodel.pl
1243:External links
1241:
1238:
1237:
1186:
1175:
1164:
1115:
1088:
1069:(3): 374–384.
1053:
1004:
952:
933:
881:
844:(2): 173–175.
818:
787:(7): 951–960.
781:Bioinformatics
681:splitfasta.pl
491:or HHblits. A
475:
472:
426:
423:
419:Google Scholar
377:
376:
371:
367:
366:
361:
355:
354:
351:Bioinformatics
255:Stable release
1135:(3): 717–25.
665:hhblitsdb.pl
1273:CASP website
1203:
1199:
1189:
1178:
1167:
1132:
1128:
1118:
1106:. Retrieved
1102:the original
1096:Li, Zhaoyu.
1091:
1066:
1062:
1056:
1021:
1017:
1007:
972:
968:
936:
901:
897:
841:
838:Nat. Methods
837:
784:
780:
756:
697:
637:reformat.pl
589:
572:
552:
548:
545:Applications
524:
517:
505:
485:
436:
432:"homologous"
428:
412:
403:
399:
395:
382:
380:
335:Available in
241:Developer(s)
1249:Soeding Lab
945:to HHsearch
591:databases.
470:databases.
291:/soedinglab
92:August 2018
1302:Categories
1206:(1): 473.
949:to HHblits
748:References
425:Background
308:Written in
278:Repository
264:2020-08-25
876:205420247
645:addss.pl
621:hhfilter
605:hhsearch
489:PSI-BLAST
474:Algorithm
444:structure
324:Unix-like
293:/hh-suite
1232:31521110
1159:23413988
1083:16713709
1048:14500884
999:10975570
928:15980461
868:22198341
813:15531603
706:See also
629:hhalign
597:hhblits
586:Contents
568:MODELLER
456:InterPro
396:HHsearch
383:HH-suite
235:HH-suite
1267:HHblits
1223:6744700
1150:4076494
1108:3 April
1039:2366929
990:2144727
919:1160169
650:Psipred
613:hhmake
527:UniProt
400:HHblits
391:protein
370:Website
359:License
339:English
262: (
178:scholar
1261:HHpred
466:, and
446:, and
404:HHpred
385:is an
364:GPL v3
328:Debian
287:github
872:S2CID
834:(PDF)
501:BLAST
185:JSTOR
171:books
1228:PMID
1155:PMID
1110:2019
1079:PMID
1044:PMID
995:PMID
924:PMID
864:PMID
809:PMID
648:Add
580:CASP
575:CASP
539:SCOP
535:Pfam
468:SCOP
460:Pfam
398:and
381:The
353:tool
346:Type
289:.com
157:news
1218:PMC
1208:doi
1145:PMC
1137:doi
1071:doi
1034:PMC
1026:doi
985:PMC
977:doi
914:PMC
906:doi
854:hdl
846:doi
799:hdl
789:doi
578:In
464:COG
312:C++
133:to
1304::
1226:.
1216:.
1204:20
1202:.
1198:.
1153:.
1143:.
1133:53
1131:.
1127:.
1077:.
1067:16
1065:.
1042:.
1032:.
1022:12
1020:.
1016:.
993:.
983:.
971:.
967:.
955:^
947:,
943:,
922:.
912:.
902:33
900:.
896:.
884:^
870:.
862:.
852:.
840:.
836:.
821:^
807:.
797:.
785:21
783:.
779:.
767:^
462:,
458:,
421:.
326:;
144:.
45:.
1234:.
1210::
1161:.
1139::
1112:.
1085:.
1073::
1050:.
1028::
1001:.
979::
973:9
930:.
908::
878:.
856::
848::
842:9
815:.
801::
791::
266:)
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.