745:
family, they were able to identify a small network of residues that were energetically coupled to a binding site residue. The network consisted of both residues spatially close to the binding site in the tertiary fold, called contact pairs, and more distant residues that participate in longer-range
781:
to natural WW domains. The fact that 12 out of the 43 designed proteins with the same SCA profile as natural WW domains properly folded provided strong evidence that little information—only coupling information—was required for specifying the protein fold. This support for the SCA hypothesis was
700:
704:
Statistical coupling energy is often calculated systematically between a fixed, perturbed position and all other positions in an MSA. Continuing with the example MSA from the beginning of the section, consider a perturbation at position
561:
59:
Statistical coupling energy measures how a perturbation of amino acid distribution at one site in an MSA affects the amino acid distribution at another site. For example, consider a multiple sequence alignment with sites (or columns)
392:
786:
to natural WW folds, and b) none of the artificial proteins designed without coupling information folded properly. An accompanying study showed that the artificial WW domains were functionally similar to natural WW domains in
220:
825:
, it has been shown that, when combined with a simple residue-residue distance metric, SCA-based scoring can fairly accurately distinguish native from non-native protein folds.
566:
442:
941:
Suel; Lockless, SW; Wall, MA; Ranganathan, R; et al. (2003). "Evolutionarily conserved networks of residues mediate allosteric communication in proteins".
246:
984:
Socolich; Lockless, SW; Russ, WP; Lee, H; Gardner, KH; Ranganathan, R; et al. (2005). "Evolutionary information for specifying a protein fold".
130:
1087:
709:
where the amino acid distribution changes from 40% I, 40% H, 20% M to 100% I. If, in a subsequent subalignment, this changes the distribution at
"Using scores derived from statistical coupling analysis to distinguish correct and incorrect folds in de-novo protein structure prediction"
769:
Statistical coupling analysis has also been used as a basis for computational protein design. In 2005, Socolich et al. used an SCA for the
741:
Ranganathan and Lockless originally developed SCA to examine thermodynamic (energetic) coupling of residue pairs in proteins. Using the
51:
indicates the degree of evolutionary dependence between the residues, with higher coupling energy corresponding to increased dependence.
795:
116:
have an amino acid distribution different from the mean distribution observed in all proteins, they are said to have some degree of
1035:
Russ; Lowery, DM; Mishra, P; Yaffe, MB; Ranganathan, R; et al. (2005). "Natural-like function in artificial WW domains".
788:
"A perturbation-based method for calculating explicit likelihood of evolutionary co-variance in multiple sequence alignments"
"Supplementary Material for 'Evolutionarily conserved networks of residues mediate allosteric communication in proteins.'"
906:
Lockless SW, Ranganathan R (1999). "Evolutionarily conserved pathways of energetic connectivity in protein families".
100:
has an average distribution (the 20 amino acids are present at roughly the same frequencies seen in all proteins), and
774:
1123:
$$\Delta\Delta G_{i,j}^{stat} = \sqrt{\sum_{x}\left(\ln P_{i|\delta j}^{x} - \ln P_{i}^{x}\right)^{2}}$$
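The coupling energy in this square-root form can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`log_binomial`, `scaled_counts`, `coupling_energy`), the toy three-column alignment, and the uniform 5% background frequencies are all hypothetical and are not taken from any published SCA implementation. The perturbation is modelled as keeping only the sequences that carry one chosen residue at site j.

```python
from collections import Counter
from math import lgamma, log, sqrt

def log_binomial(n_x, p_x, N=100):
    """ln P^x for the binomial form, via log-gamma to avoid underflow."""
    log_coeff = lgamma(N + 1) - lgamma(n_x + 1) - lgamma(N - n_x + 1)
    return log_coeff + n_x * log(p_x) + (N - n_x) * log(1.0 - p_x)

def scaled_counts(seqs, site, alphabet, N=100):
    """Residue counts at one site, rescaled so that they sum to N."""
    counts = Counter(s[site] for s in seqs)
    return {x: N * counts[x] / len(seqs) for x in alphabet}

def coupling_energy(msa, i, j, residue_at_j, background, N=100):
    """Delta Delta G_{i,j}^stat: root of summed squared log-probability shifts."""
    # Perturbation at site j: keep only sequences with the chosen residue there.
    sub = [s for s in msa if s[j] == residue_at_j]
    if not sub:
        raise ValueError("perturbation leaves an empty subalignment")
    full_n = scaled_counts(msa, i, background, N)
    sub_n = scaled_counts(sub, i, background, N)
    return sqrt(sum(
        (log_binomial(sub_n[x], p, N) - log_binomial(full_n[x], p, N)) ** 2
        for x, p in background.items()))

# Toy alignment; columns are (i, j, l). Overall: i is 60% V / 40% L,
# j is 40% I / 40% H / 20% M, l is 80% H / 20% V.
msa = (["VIH"] * 16 + ["VIV"] * 2 + ["LIV"] * 2 +
       ["VHH"] * 12 + ["LHH"] * 8 + ["LMH"] * 4 + ["LMV"] * 6)

# Assumed background frequencies (illustration only, not measured values).
background = {x: 0.05 for x in "VLIHM"}

coupled = coupling_energy(msa, 0, 1, "I", background)    # site i vs. site j
uncoupled = coupling_energy(msa, 2, 1, "I", background)  # site l vs. site j
```

In this toy alignment, restricting site j to isoleucine shifts site i from 60% V to 90% V, so `coupled` is positive, while the distribution at site l is unchanged and `uncoupled` is zero.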
36:
417:
in all positions among all sequenced proteins. The summation runs over all 20 amino acids. After ΔG
830:
238:
1143:
782:
made more compelling considering that a) the successfully folded proteins had only 36% average
(MSA). More specifically, it quantifies how much the amino acid distribution at some position
- A summary of the Ranganathan lab's SCA-based design of artificial yet functional WW domains.
820:
$$\Delta\Delta G_{i,j}^{stat} = \Delta G_{i|\delta j}^{stat} - \Delta G_{i}^{stat}$$
1044:
993:
8:
762:
families also showed energetic coupling in sparse networks of residues that cooperate in
1048:
997:
$$P_{i}^{x} = \frac{N!}{n_{x}!\,(N-n_{x})!}\, p_{x}^{n_{x}} (1-p_{x})^{N-n_{x}}$$
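The binomial expression can be evaluated directly. The following is a minimal sketch, assuming n_x is supplied as a count out of N = 100 (that is, a percentage) and that the background frequency p_x is provided by the caller; the function name `binomial_probability` and the 5% background value in the example are hypothetical.

```python
from math import comb

def binomial_probability(n_x: int, p_x: float, N: int = 100) -> float:
    """Probability of seeing amino acid x in n_x of N sequences,
    given a background frequency p_x."""
    if not 0 <= n_x <= N:
        raise ValueError("n_x must lie between 0 and N")
    return comb(N, n_x) * p_x ** n_x * (1.0 - p_x) ** (N - n_x)

# Example: a residue present in 60% of sequences, against an assumed
# (illustrative) background frequency of 5%.
p_example = binomial_probability(60, 0.05)
```

For strongly conserved sites the probability becomes very small, so working with its logarithm is usually more convenient than the raw value.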
713:
from 60% V, 40% L to 90% V, 10% L, but does not change the distribution at position
43:
changes upon a perturbation of the amino acid distribution at another position
24:
425:
in a subalignment produced after a perturbation of amino acid distribution at
$$\Delta G_{i}^{stat} = \sqrt{\sum_{x}\left(\ln P_{i}^{x}\right)^{2}}$$
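As an illustration of the conservation formula, the sketch below sums the squared log-probabilities over an alphabet and takes the square root. It rests on several assumptions: the helper names (`log_binomial`, `conservation`) are hypothetical, the alphabet is a toy four-letter one rather than all 20 amino acids, and the uniform background frequencies are placeholders rather than measured values.

```python
from math import lgamma, log, sqrt

def log_binomial(n_x, p_x, N=100):
    """ln P_i^x for the binomial form, via log-gamma to avoid underflow."""
    log_coeff = lgamma(N + 1) - lgamma(n_x + 1) - lgamma(N - n_x + 1)
    return log_coeff + n_x * log(p_x) + (N - n_x) * log(1.0 - p_x)

def conservation(counts, background, N=100):
    """Delta G_i^stat = sqrt(sum_x (ln P_i^x)^2) for one alignment column.

    counts:     amino acid -> count out of N at site i (missing means 0)
    background: amino acid -> assumed background frequency p_x
    """
    return sqrt(sum(log_binomial(counts.get(x, 0), p_x, N) ** 2
                    for x, p_x in background.items()))

# Assumed uniform background over a toy alphabet (illustration only).
background = {"V": 0.25, "L": 0.25, "I": 0.25, "H": 0.25}

near_background = conservation({"V": 25, "L": 25, "I": 25, "H": 25}, background)
fully_conserved = conservation({"V": 100}, background)
```

A column whose composition matches the background gives a small value, while a column fixed on a single residue gives a much larger one, which matches the intended reading of ΔG as a conservation measure.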
, then there would be some amount of statistical coupling energy between
123:
In statistical coupling analysis, the conservation (ΔG) at each site (
826:
Ranganathan lecture on statistical coupling analysis (audio included)
770:
89:
, where each site has some distribution of amino acids. At position
954:
77:
, is simply the difference between these two values. That is:
73:
865:
Dekker; Fodor, A; Aldrich, RW; Yellen, G; et al. (2004).
747:
54:
940:
905:
751:
746:
energetic interactions. Later applications of SCA by the
413:
corresponds to the approximate distribution of amino acid
433:) is taken. Statistical coupling energy, denoted ΔΔG
describes the probability of finding amino acid
has 80% histidine, 20% valine. Since positions
is the percentage of sequences with residue
to create artificial proteins with similar
is computed, the conservation for position
and the remaining 40% of sequences have a
882:
55:Definition of statistical coupling energy
789:ligand binding affinity and specificity
237:, and is defined by a function in
831:Protein folding — a step closer?
1086:Bartlett GJ, Taylor WR (2008).
405:(e.g. methionine) at position
72:, 60% of the sequences have a
1:
884:10.1093/bioinformatics/bth128
837:
439:
243:
17:Statistical coupling analysis
920:10.1126/science.286.5438.295
799:protein structure prediction
7:
804:
49:statistical coupling energy
37:multiple sequence alignment
10:
1160:
943:Nature Structural Biology
764:allosteric communication
84:the distribution is 40%
775:thermodynamic stability
23:is a technique used in
, or, more commonly,
821:What is a WW domain?
1057:10.1038/nature03990
1049:2005Natur.437..579R
1006:10.1038/nature03991
998:2005Natur.437..512S
1104:10.1002/prot.21779
810:Mutual information
1043:(7058): 579–583.
992:(7058): 512–518.
914:(5438): 295–299.
877:(10): 1565–1572.
784:sequence identity
748:Ranganathan group
725:but none between
690:
611:
397:where N is 100, n
318:
210:
166:
127:) is defined as:
47:. The resulting
31:between pairs of
1151:
1128:
1127:
1122:. Archived from
756:serine protease
1144:Bioinformatics
1130:
1129:
1126:on 2012-12-17.
1098:(1): 950–959.
1078:
1027:
976:
955:10.1038/nsb881
933:
898:
871:Bioinformatics
815:External links
80:, at position
56:
53:
25:bioinformatics
241:as follows:
240:
239:binomial form
35:in a protein
1124:the original
949:(1): 59–69.
737:Applications
233:at position
230:
224:
124:
122:
118:conservation
33:amino acids
29:covariation
27:to measure
838:References
760:hemoglobin
743:PDZ domain
94:methionine
86:isoleucine
779:structure
771:WW domain
90:histidine
1138:Category
1120:33836866
1112:18004776
1092:Proteins
1065:16177795
1014:16177782
971:67749580
963:12483203
928:10514373
893:14962924
805:See also
92:and 20%
64:through
1073:4424336
1045:Bibcode
1022:4363255
994:Bibcode
908:Science
797:de novo
750:on the
409:, and p
225:Here, P
78:leucine
1118:
1110:
1071:
1063:
1037:Nature
1020:
1012:
986:Nature
969:
961:
926:
891:
431:i | δj
88:, 40%
74:valine
1116:S2CID
1069:S2CID
1018:S2CID
967:S2CID
1108:PMID
1061:PMID
1010:PMID
959:PMID
924:PMID
889:PMID
777:and
758:and
752:GPCR
729:and
721:and
435:i, j
222:.
112:and
1100:doi
1053:doi
1041:437
1002:doi
990:437
951:doi
916:doi
912:286
879:doi
794:In
429:(ΔG
21:SCA
19:or
731:j
727:l
723:j
719:i
715:l
711:i
707:j
427:j
423:i
419:i
415:x
411:x
407:i
403:x
399:x
394:,
235:i
231:x
227:i
125:i
114:l
110:j
106:i
102:l
98:k
82:j
70:i
66:z
62:a
45:j
41:i
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.