33:
750:
However, this issue is not a problem with RF distances per se, it is a more general criticism of tree distances. Regardless of the behaviour of any specific tree distance a practicing evolutionary biologist might view some tree rearrangements as "important" and other rearrangements as "trivial". Tree distances are tools; they are most useful in the context of other information about the organisms in the trees.
123:
is the number of partitions of data implied by the second tree but not the first tree (although some software implementations divide the RF metric by 2 and others scale the RF distance to have a maximum value of 1). The partitions are calculated for each tree by removing each branch. Thus, the number
673:
In 1985 Day gave an algorithm based on perfect hashing that computes this distance that has only a linear complexity in the number of nodes in the trees. A randomized algorithm that uses hash tables that are not necessarily perfect has been shown to approximate the
Robinson-Foulds distance with a
721:
The RF metric remains widely used because the idea of using the number of splits that differ between a pair of trees is a relatively intuitive way to assess the differences among trees for many systematists. This is the primary strength of the RF distance and the reason for its continued use in
757:
The best-performing generalized
Robinson-Foulds distances have a basis in information theory, and measure the distance between trees in terms of the quantity of information that the trees' splits hold in common (measured in bits). The Clustering Information Distance (implemented in R package
749:
Another issue to consider when using RF distances is that differences in one clade may be trivial (perhaps if the clade resolves three species within a genus differently) or may be fundamental (if the clade is deep in the tree and defines two fundamental subgroups, such as mammals and birds).
753:
These issues can be addressed by using less conservative metrics. "Generalized RF distances" recognize similarity between similar, but non-identical, splits; the original
Robinson Foulds distance doesn't care how similar two groupings are, if they aren't identical they are discarded.
127:
RF distances have been criticized as biased, but they represent a relatively intuitive measure of the distances between phylogenetic trees and therefore remain widely used (the original 1981 paper describing
Robinson-Foulds distances was cited more than 2700 times by 2023 based on
722:
phylogenetics. Of course, the number of splits that differ between a pair of trees depends on the number of taxa in the trees so one might argue that this unit is not meaningful. However, it is straightforward to normalize RF distances so they range between zero and one.
132:). Nevertheless, the biases inherent to the RF distances suggest that researches should consider using "Generalized" Robinson–Foulds metrics that may have better theoretical and practical performance and avoid the biases and misleading attributes of the original metric.
973:*Böcker S., Canzar S., Klau G.W. 2013. The generalized Robinson-Foulds metric. In: Darling A., Stoye J., editors. Algorithms in Bioinformatics. WABI 2013. Lecture Notes in Computer Science, vol 8126. Berlin, Heidelberg: Springer. p. 156–169.
742:
Its range of values can depend on tree shape: trees that contain many uneven partitions will command relatively lower distances, on average, than trees with many even partitions.
459:
362:
144:) for each node (which could be empty, but only nodes with degree greater than or equal to three can be labeled by an empty set) the Robinson–Foulds metric finds the number of
1047:
Schuh, R. T. & Polhemus, J. T. (1980). "Analysis of taxonomic congruence among morphological, ecological and biogeographic data sets for the
Leptopodomorpha (Hemiptera)".
392:
245:
197:
The authors define two trees to be the same if they are isomorphic and the isomorphism preserves the labeling. The construction of the proof is based on a function called
192:
658:
The RF distance corresponds to an equivalent similarity metric that reflects the resolution of the strict consensus of two trees, first used to compare trees in 1980.
268:
215:
162:
194:
operations to convert one into the other. The number of operations defines their distance. Rooted trees can be examined by attaching a dummy leaf to the root node.
648:
621:
594:
567:
540:
513:
486:
419:
322:
295:
1150:
1190:
Makarenkov, V and
Leclerc, B. Comparison of additive trees using circular orders, Journal of Computational Biology,7,5,731-744, 2000,"Mary Ann Liebert, Inc."
1149:
M. Bourque, Arbres de
Steiner et reseaux dont certains sommets sont a localisation variable. PhD thesis, University de Montreal, Montreal, Quebec, 1978
982:
Nye T.M.W., Liò P., Gilks W.R. 2006. A novel algorithm and web-based tool for comparing two alternative phylogenetic trees. Bioinformatics. 22:117–119.
961:
Y. Lin, V. Rajan, B.M. Moret A metric for phylogenetic trees based on matching IEEE/ACM Trans. Comput. Biol. Bioinform., 9 (4) (2012), pp. 1014-1022
976:
Bogdanowicz D., Giaro K. 2012. Matching split distance for unrooted binary phylogenetic trees. IEEE/ACM Trans. Comput. Biol. Bioinforma. 9:150–160.
729:
Relative to other metrics, lacks sensitivity, and is thus imprecise; it can take two fewer distinct values than there are taxa in a tree.
979:
Bogdanowicz D., Giaro K. 2013. On a matching distance between rooted phylogenetic trees. Int. J. Appl. Math. Comput. Sci. 23:669–684.
857:
710:
735:
Its value can be counterintuitive. One example is that moving a tip and its neighbour to a particular point on a tree generates a
17:
1194:
Pattengale, Nicholas D.; Gottlieb, Eric J.; Moret, Bernard M.E. (2007). "Efficiently
Computing the Robinson–Foulds Metric".
1151:
http://www.worldcat.org/title/arbres-de-steiner-et-reseaux-dont-certains-sommets-sont-a-localisation-variable/oclc/053538946
1277:
1282:
75:
839:
42:
706:(`treedist()` function). For comparing groups of trees, the fastest implementations include HashRF and MrsRF.
827:
845:
1095:"Bayesian and parsimony approaches reconstruct informative trees from simulated morphological datasets"
745:
It performs more poorly than many alternative measures in practical settings, based on simulated trees.
424:
327:
803:
1208:
809:
791:
821:
367:
220:
167:
686:, the metric is often used to compute a distance between two trees. The treedist program in the
883:
1203:
691:
488:. The number of operations in each of these procedures is equivalent to the number of edges in
46:
53:
253:
217:, which contracts an edge (combining the nodes, creating a union of their sets). Conversely,
200:
147:
1007:"Information theoretic Generalized Robinson-Foulds metrics for comparing phylogenetic trees"
914:
713:
to compute distances between trees that represent how languages are related to each other.
662:
626:
599:
572:
545:
518:
491:
464:
397:
300:
273:
119:
is the number of partitions of data implied by the first tree but not the second tree and
57:
8:
732:
It is rapidly saturated; very similar trees can be allocated the maximum distance value.
725:
However, the RF metric also suffers a number of theoretical and practical shortcomings:
1124:
101:
124:
of eligible partitions for each tree is equal to the number of branches in that tree.
1256:
1221:
1169:
1129:
1064:
1029:
1025:
944:
936:
899:
1251:
1234:
1246:
1213:
1181:
1165:
1119:
1109:
1056:
1021:
926:
895:
766:
762:) is recommended as the most suitable alternative to the Robinson-Foulds distance.
1176:
William H. E. Day, "Optimal algorithms for comparing trees with labeled leaves",
739:
difference value than if just one of the two tips were moved to the same place.
129:
759:
703:
699:
698:
Python library (under the name "symmetric difference metric"), and R packages
661:
In their 1981 paper
Robinson and Foulds proved that the distance is in fact a
1271:
1068:
940:
683:
1060:
931:
247:
expands an edge (decontraction), where the set can be split in any fashion.
1260:
1225:
1156:
Robinson, D. R.; Foulds, L. R. (1981). "Comparison of phylogenetic trees".
1133:
1114:
1033:
948:
1217:
1094:
1006:
1185:
596:. The sum of the operations is equivalent to a transformation from
765:
An alternative approach to tree distance calculation is to use
687:
140:
Given two unrooted trees of nodes and a set of labels (i.e.,
1046:
141:
769:, rather than splits, as the basis for tree comparison.
695:
1235:"DendroPy: A Python library for phylogenetic computing"
1193:
668:
629:
602:
575:
548:
521:
494:
467:
427:
400:
370:
330:
303:
276:
256:
223:
203:
170:
150:
100:, is a simple way to calculate the distance between
915:"Practical Performance of Tree Comparison Metrics"
642:
615:
588:
561:
534:
507:
480:
453:
413:
386:
356:
316:
289:
262:
239:
209:
186:
156:
1269:
1232:
1155:
882:Robinson, D.F.; Foulds, L.R. (February 1981).
881:
912:
851:hardwiredClusterDistance(tree1, tree2, true)
711:used in quantitative comparative linguistics
394:is used to add the edges only discovered in
1088:
1086:
1084:
1082:
1080:
1078:
1000:
998:
996:
994:
992:
990:
913:Kuhner, Mary K.; Yamato, Jon (2015-03-01).
772:
716:
56:. Please do not remove this message until
1250:
1207:
1123:
1113:
930:
820:Faster than phangorn implementation; see
709:The Robinson–Foulds metric has also been
76:Learn how and when to remove this message
1075:
987:
690:suite offers this function, as does the
677:
52:Relevant discussion may be found on the
1233:Sukumaran, J.; Holder, Mark T. (2010).
14:
1270:
1092:
1004:
969:
967:
877:
875:
873:
26:
669:Algorithms for computing the metric
24:
1143:
884:"Comparison of phylogenetic trees"
702:(`RobinsonFoulds()` function) and
25:
1294:
964:
870:
674:bounded error in sublinear time.
454:{\displaystyle T_{1}\wedge T_{2}}
357:{\displaystyle T_{1}\wedge T_{2}}
1196:Journal of Computational Biology
270:function removes all edges from
31:
1040:
1026:10.1093/bioinformatics/btaa614
955:
906:
833:tree_1.robinson_foulds(tree_2)
135:
13:
1:
1252:10.1093/bioinformatics/btq228
864:
653:
1170:10.1016/0025-5564(81)90043-2
900:10.1016/0025-5564(81)90043-2
797:dist.dendlist(dendlist(x,y))
542:plus the number of edges in
387:{\displaystyle \alpha ^{-1}}
240:{\displaystyle \alpha ^{-1}}
187:{\displaystyle \alpha ^{-1}}
7:
1278:Computational phylogenetics
1180:, Number 1, December 1985.
96:, often abbreviated as the
94:symmetric difference metric
58:conditions to do so are met
10:
1299:
1283:Bioinformatics algorithms
1178:Journal of Classification
1093:Smith, Martin R. (2019).
1005:Smith, Martin R. (2020).
1158:Mathematical Biosciences
888:Mathematical Biosciences
773:Software implementations
717:Strengths and weaknesses
263:{\displaystyle \alpha }
210:{\displaystyle \alpha }
157:{\displaystyle \alpha }
1115:10.1098/rsbl.2018.0632
644:
617:
590:
563:
536:
509:
482:
455:
415:
388:
358:
318:
291:
264:
241:
211:
188:
158:
18:Robinson-Foulds metric
1218:10.1089/cmb.2007.R012
1061:10.1093/sysbio/29.1.1
932:10.1093/sysbio/syu085
678:Specific applications
645:
643:{\displaystyle T_{2}}
618:
616:{\displaystyle T_{1}}
591:
589:{\displaystyle T_{1}}
564:
562:{\displaystyle T_{2}}
537:
535:{\displaystyle T_{2}}
510:
508:{\displaystyle T_{1}}
483:
481:{\displaystyle T_{2}}
456:
416:
414:{\displaystyle T_{2}}
389:
359:
319:
317:{\displaystyle T_{2}}
292:
290:{\displaystyle T_{1}}
265:
242:
212:
189:
159:
815:RobinsonFoulds(x, y)
627:
600:
573:
546:
519:
492:
465:
425:
398:
368:
328:
301:
274:
254:
221:
201:
168:
148:
45:of this article is
1186:10.1007/BF01908061
1049:Systematic Biology
919:Systematic Biology
853:from PhyloNetworks
640:
613:
586:
559:
532:
505:
478:
451:
411:
384:
354:
314:
287:
260:
237:
207:
184:
154:
107:It is defined as (
102:phylogenetic trees
1245:(12): 1569–1571.
1020:(20): 5007–5013.
862:
861:
650:, or vice versa.
86:
85:
78:
16:(Redirected from
1290:
1264:
1254:
1229:
1211:
1173:
1164:(1–2): 131–147.
1138:
1137:
1127:
1117:
1099:
1090:
1073:
1072:
1044:
1038:
1037:
1011:
1002:
985:
971:
962:
959:
953:
952:
934:
910:
904:
903:
894:(1–2): 131–147.
879:
852:
834:
816:
798:
780:Language/Program
777:
776:
767:Quartet distance
649:
647:
646:
641:
639:
638:
622:
620:
619:
614:
612:
611:
595:
593:
592:
587:
585:
584:
569:that are not in
568:
566:
565:
560:
558:
557:
541:
539:
538:
533:
531:
530:
515:that are not in
514:
512:
511:
506:
504:
503:
487:
485:
484:
479:
477:
476:
460:
458:
457:
452:
450:
449:
437:
436:
420:
418:
417:
412:
410:
409:
393:
391:
390:
385:
383:
382:
363:
361:
360:
355:
353:
352:
340:
339:
323:
321:
320:
315:
313:
312:
297:that are not in
296:
294:
293:
288:
286:
285:
269:
267:
266:
261:
246:
244:
243:
238:
236:
235:
216:
214:
213:
208:
193:
191:
190:
185:
183:
182:
163:
161:
160:
155:
81:
74:
70:
67:
61:
35:
34:
27:
21:
1298:
1297:
1293:
1292:
1291:
1289:
1288:
1287:
1268:
1267:
1146:
1144:Further reading
1141:
1108:(2). 20180632.
1102:Biology Letters
1097:
1091:
1076:
1045:
1041:
1009:
1003:
988:
972:
965:
960:
956:
911:
907:
880:
871:
867:
850:
832:
814:
799:from dendextend
796:
775:
719:
680:
671:
656:
634:
630:
628:
625:
624:
607:
603:
601:
598:
597:
580:
576:
574:
571:
570:
553:
549:
547:
544:
543:
526:
522:
520:
517:
516:
499:
495:
493:
490:
489:
472:
468:
466:
463:
462:
445:
441:
432:
428:
426:
423:
422:
405:
401:
399:
396:
395:
375:
371:
369:
366:
365:
348:
344:
335:
331:
329:
326:
325:
308:
304:
302:
299:
298:
281:
277:
275:
272:
271:
255:
252:
251:
228:
224:
222:
219:
218:
202:
199:
198:
175:
171:
169:
166:
165:
149:
146:
145:
138:
122:
118:
114:
110:
90:Robinson–Foulds
82:
71:
65:
62:
51:
36:
32:
23:
22:
15:
12:
11:
5:
1296:
1286:
1285:
1280:
1266:
1265:
1239:Bioinformatics
1230:
1209:10.1.1.75.3338
1202:(6): 724–735.
1191:
1188:
1174:
1153:
1145:
1142:
1140:
1139:
1074:
1039:
1014:Bioinformatics
986:
984:
983:
980:
977:
963:
954:
925:(2): 205–214.
905:
868:
866:
863:
860:
859:
854:
848:
842:
841:
836:
830:
824:
823:
818:
812:
806:
805:
800:
794:
788:
787:
784:
781:
774:
771:
747:
746:
743:
740:
733:
730:
718:
715:
692:RAxML_standard
679:
676:
670:
667:
655:
652:
637:
633:
610:
606:
583:
579:
556:
552:
529:
525:
502:
498:
475:
471:
448:
444:
440:
435:
431:
408:
404:
381:
378:
374:
351:
347:
343:
338:
334:
311:
307:
284:
280:
259:
234:
231:
227:
206:
181:
178:
174:
153:
137:
134:
130:Google Scholar
120:
116:
112:
108:
84:
83:
66:September 2020
39:
37:
30:
9:
6:
4:
3:
2:
1295:
1284:
1281:
1279:
1276:
1275:
1273:
1262:
1258:
1253:
1248:
1244:
1240:
1236:
1231:
1227:
1223:
1219:
1215:
1210:
1205:
1201:
1197:
1192:
1189:
1187:
1183:
1179:
1175:
1171:
1167:
1163:
1159:
1154:
1152:
1148:
1147:
1135:
1131:
1126:
1121:
1116:
1111:
1107:
1103:
1096:
1089:
1087:
1085:
1083:
1081:
1079:
1070:
1066:
1062:
1058:
1054:
1050:
1043:
1035:
1031:
1027:
1023:
1019:
1015:
1008:
1001:
999:
997:
995:
993:
991:
981:
978:
975:
974:
970:
968:
958:
950:
946:
942:
938:
933:
928:
924:
920:
916:
909:
901:
897:
893:
889:
885:
878:
876:
874:
869:
858:
855:
849:
847:
844:
843:
840:
837:
831:
829:
826:
825:
822:
819:
817:from TreeDist
813:
811:
808:
807:
804:
801:
795:
793:
790:
789:
785:
782:
779:
778:
770:
768:
763:
761:
755:
751:
744:
741:
738:
734:
731:
728:
727:
726:
723:
714:
712:
707:
705:
701:
697:
694:package, the
693:
689:
685:
684:phylogenetics
675:
666:
664:
659:
651:
635:
631:
608:
604:
581:
577:
554:
550:
527:
523:
500:
496:
473:
469:
446:
442:
438:
433:
429:
406:
402:
379:
376:
372:
349:
345:
341:
336:
332:
309:
305:
282:
278:
257:
248:
232:
229:
225:
204:
195:
179:
176:
172:
151:
143:
133:
131:
125:
105:
103:
99:
95:
91:
80:
77:
69:
59:
55:
49:
48:
44:
38:
29:
28:
19:
1242:
1238:
1199:
1195:
1177:
1161:
1157:
1105:
1101:
1052:
1048:
1042:
1017:
1013:
957:
922:
918:
908:
891:
887:
764:
756:
752:
748:
736:
724:
720:
708:
681:
672:
660:
657:
421:to the tree
249:
196:
139:
126:
106:
97:
93:
89:
87:
72:
63:
41:
1055:(1): 1–26.
364:, and then
324:, creating
136:Explanation
98:RF distance
1272:Categories
865:References
654:Properties
461:to build
43:neutrality
1204:CiteSeerX
1069:1063-5157
941:1076-836X
835:from ete3
439:∧
377:−
373:α
342:∧
258:α
230:−
226:α
205:α
177:−
173:α
152:α
54:talk page
1261:20421198
1226:17691890
1134:30958126
1034:32619004
949:25378436
783:Function
760:TreeDist
704:phangorn
700:TreeDist
696:DendroPy
115:) where
47:disputed
1125:6405459
1259:
1224:
1206:
1132:
1122:
1067:
1032:
947:
939:
828:Python
786:Notes
688:PHYLIP
663:metric
1098:(PDF)
1010:(PDF)
846:Julia
737:lower
1257:PMID
1222:PMID
1130:PMID
1065:ISSN
1030:PMID
945:PMID
937:ISSN
856:See
838:See
802:See
250:The
164:and
142:taxa
88:The
40:The
1247:doi
1214:doi
1182:doi
1166:doi
1120:PMC
1110:doi
1057:doi
1022:doi
927:doi
896:doi
682:In
623:to
92:or
1274::
1255:.
1243:26
1241:.
1237:.
1220:.
1212:.
1200:14
1198:.
1162:53
1160:.
1128:.
1118:.
1106:15
1104:.
1100:.
1077:^
1063:.
1053:29
1051:.
1028:.
1018:36
1016:.
1012:.
989:^
966:^
943:.
935:.
923:64
921:.
917:.
892:53
890:.
886:.
872:^
665:.
111:+
104:.
1263:.
1249::
1228:.
1216::
1184::
1172:.
1168::
1136:.
1112::
1071:.
1059::
1036:.
1024::
951:.
929::
902:.
898::
810:R
792:R
636:2
632:T
609:1
605:T
582:1
578:T
555:2
551:T
528:2
524:T
501:1
497:T
474:2
470:T
447:2
443:T
434:1
430:T
407:2
403:T
380:1
350:2
346:T
337:1
333:T
310:2
306:T
283:1
279:T
233:1
180:1
121:B
117:A
113:B
109:A
79:)
73:(
68:)
64:(
60:.
50:.
20:)
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.