1341:
of a string in a text. This modification gives the end-position of matching substrings of the text. To determine the start-position of the matching substrings, the number of insertions and deletions can be stored separately and used to compute the start-position from the end-position.
1286:
We can give different penalty costs to insertion, deletion and substitution. We can also give penalty costs that depend on which characters are inserted, deleted or substituted.
1478:
1264:
1345:
The resulting algorithm is by no means efficient, but was at the time of its publication (1980) one of the first algorithms that performed approximate search.
1746:
1218:
We can store the number of insertions, deletions, and substitutions separately, or even the positions at which they occur, which is always
1641:
1680:
1606:
1337:
By initializing the first row of the matrix with zeros, we obtain a variant of the Wagner–Fischer algorithm that can be used for
1611:
430:
Two examples of the resulting matrix (hovering over an underlined number reveals the operation performed to get that number):
1427:
1367:
1616:
1328:
is the
Levenshtein distance), which is much faster than the regular dynamic programming algorithm if the distance is small.
1827:
1601:
47:. Navarro lists the following inventors of it, with date of publication, and acknowledges that the list is incomplete:
1695:
1636:
1508:
1543:
1858:
1723:
57:
1728:
1444:
1807:
1533:
1812:
1690:
1657:
1593:
1822:
1718:
1662:
1563:
1517:
93:
1387:
96:
of the first string and all prefixes of the second, then we can compute the values in the matrix by
1817:
1621:
1553:
88:
The Wagner–Fischer algorithm computes edit distance based on the observation that if we reserve a
1863:
1766:
990:
966:
1185:
is smaller than the minimum of the three, and use this to show one of the three is not minimal.
100:
the matrix, and thus find the distance between the two full strings as the last value computed.
1474:
1382:
1338:
1771:
1713:
1573:
1501:
1063:
Otherwise, the distance is the minimum of the three possible ways to do the transformation:
1578:
1178:
128:
89:
1240:
1215:), since it only requires that the previous row and current row be stored at any one time.
8:
1776:
28:
1705:
1672:
1420:
Algorithms on strings, trees, and sequences: computer science and computational biology
1400:
1290:
73:
44:
1837:
1460:
1423:
1404:
1477:. Student Section of the European Summer School in Logic, Language and Information (
1301:
values can be computed in parallel, and the algorithm can be adapted to perform the
131:
between them, looks as follows. The input strings are one-indexed, while the matrix
1832:
1802:
1756:
1558:
1494:
1456:
1392:
981:
operations. At the end, the bottom-right element of the array contains the answer.
20:
1685:
1626:
1538:
1309:
1294:
969:
maintained throughout the algorithm is that we can transform the initial segment
1738:
1631:
1198:
51:
1852:
1548:
1525:
63:
32:
1231:
If we are only interested in the distance if it is smaller than a threshold
1751:
1568:
1396:
1761:
104:
97:
1177:
is in fact minimal; this is more difficult to show, and involves an
842:
619:
495:
1797:
1475:
An approximate gazetteer for GATE based on levenshtein distance
176:// the first i characters of s and the first j characters of t
1781:
1486:
248:// target prefixes can be reached from empty source prefix
212:// source prefixes can be transformed into empty string by
1235:, then it suffices to compute a diagonal stripe of width
1268:
in the matrix. In this way, the algorithm can be run in
1173:
This proof fails to validate that the number placed in
953:
916:
879:
805:
758:
650:
588:
557:
526:
1332:
1308:
By examining diagonals instead of rows, and by using
1243:
1157:
is of course the number required to transform all of
1109:
and then do the same transformation, for a total of
173:// for all i and j, d will hold the distance between
1258:
1193:Possible modifications to this algorithm include:
1009:It is initially true on row and column 0 because
1850:
1056:and just leave the last character alone, giving
1368:"A guided tour to approximate string matching"
1225:We can normalize the distance to the interval
1197:We can adapt the algorithm to use less space,
43:The Wagner–Fischer algorithm has a history of
1502:
1422:. Cambridge, UK: Cambridge University Press.
1417:
1305:function in phases to eliminate dependencies.
993:is that we can transform the initial segment
1509:
1495:
1312:, we can find the Levenshtein distance in
768:
763:
1442:
1386:
1361:
1359:
1357:
1188:
1013:can be transformed into the empty string
678:
1681:Comparison of regular-expression engines
1021:characters. Similarly, we can transform
1005:operations. This invariant holds since:
1445:"Lazy Dynamic-Programming can be Eager"
1365:
1128:operations, then we can do the same to
1052:operations, then we can do the same to
984:
92:to hold the edit distances between all
83:
1851:
1354:
1642:Zhu–Takaoka string matching algorithm
1490:
1283:is the length of the shortest string.
1149:The operations required to transform
179:// note that d has (m+1)*(n+1) values
103:A straightforward implementation, as
1607:Boyer–Moore string-search algorithm
1078:operations, then we can simply add
35:between two strings of characters.
13:
1333:Seller's variant for string search
14:
1875:
1696:Nondeterministic finite automaton
1637:Two-way string-matching algorithm
1293:poorly, due to a large number of
1105:operations, then we can remove
251:// by inserting every character
1612:Boyer–Moore–Horspool algorithm
1602:Apostolico–Giancarlo algorithm
1467:
1436:
1411:
1:
1348:
1617:Knuth–Morris–Pratt algorithm
1544:Damerau–Levenshtein distance
1461:10.1016/0020-0190(92)90202-7
1443:Allison L (September 1992).
1132:, and exchange the original
31:algorithm that computes the
7:
1808:Compressed pattern matching
1534:Approximate string matching
1516:
1140:afterwards, for a total of
10:
1880:
1813:Longest common subsequence
1724:Needleman–Wunsch algorithm
1594:String-searching algorithm
1473:Bruno Woltzenlogel Paleo.
1144:operations (substitution).
989:As mentioned earlier, the
434:
215:// dropping all characters
38:
1823:Sequential pattern mining
1790:
1737:
1704:
1671:
1663:Commentz-Walter algorithm
1651:Multiple string searching
1650:
1592:
1584:Wagner–Fischer algorithm
1524:
1366:Navarro, Gonzalo (2001).
1179:argument by contradiction
79:Lowrance and Wagner, 1975
1833:String rewriting systems
1818:Longest common substring
1729:Smith–Waterman algorithm
1554:Gestalt pattern matching
140:
111:that takes two strings,
25:Wagner–Fischer algorithm
1767:Generalized suffix tree
1691:Thompson's construction
1090:operations (insertion).
1040:, and we can transform
1017:by simply dropping all
1719:Hirschberg's algorithm
1418:Gusfield, Dan (1997).
1260:
1189:Possible modifications
1113:operations (deletion).
1859:Algorithms on strings
1574:Levenshtein automaton
1564:Jaro–Winkler distance
1397:10.1145/375360.375365
1375:ACM Computing Surveys
1261:
1029:by simply adding all
135:is zero-indexed, and
16:Programming algorithm
1622:Rabin–Karp algorithm
1579:Levenshtein distance
1259:{\displaystyle 2k+1}
1241:
1116:If we can transform
1093:If we can transform
1066:If we can transform
985:Proof of correctness
129:Levenshtein distance
84:Calculating distance
58:Needleman and Wunsch
1777:Ternary search tree
1339:fuzzy string search
1297:. However, all the
1181:in which we assume
1001:using a minimum of
977:using a minimum of
138:is a closed range.
29:dynamic programming
1706:Sequence alignment
1673:Regular expression
1449:Inf. Proc. Letters
1256:
1082:afterwards to get
127:, and returns the
45:multiple invention
1846:
1845:
1838:String operations
1429:978-0-521-58519-4
1295:data dependencies
1169:holds our result.
963:
962:
959:
958:
684:
683:
1871:
1803:Pattern matching
1757:Suffix automaton
1559:Hamming distance
1511:
1504:
1497:
1488:
1487:
1482:
1471:
1465:
1464:
1440:
1434:
1433:
1415:
1409:
1408:
1390:
1372:
1363:
1304:
1300:
1282:
1278:
1267:
1265:
1263:
1262:
1257:
1234:
1227:
1221:
1184:
1176:
1168:
1164:
1160:
1156:
1152:
1143:
1139:
1135:
1131:
1127:
1123:
1119:
1112:
1108:
1104:
1100:
1096:
1089:
1085:
1081:
1077:
1073:
1069:
1059:
1055:
1051:
1047:
1043:
1039:
1032:
1028:
1024:
1020:
1016:
1012:
1004:
1000:
996:
980:
976:
972:
955:
918:
881:
844:
807:
770:
765:
760:
688:
687:
680:
652:
621:
590:
559:
528:
497:
437:
436:
433:
432:
426:
423:
420:
417:
414:
413:substitutionCost
411:
408:
405:
402:
399:
396:
393:
390:
387:
384:
381:
378:
375:
372:
369:
366:
363:
360:
357:
356:substitutionCost
354:
351:
348:
345:
342:
341:substitutionCost
339:
336:
333:
330:
327:
324:
321:
318:
315:
312:
309:
306:
303:
300:
297:
294:
291:
288:
285:
282:
279:
276:
273:
270:
267:
264:
261:
258:
255:
252:
249:
246:
243:
240:
237:
234:
231:
228:
225:
222:
219:
216:
213:
210:
207:
204:
201:
198:
195:
192:
189:
186:
183:
180:
177:
174:
171:
168:
165:
162:
159:
156:
153:
150:
147:
144:
137:
21:computer science
1879:
1878:
1874:
1873:
1872:
1870:
1869:
1868:
1849:
1848:
1847:
1842:
1786:
1733:
1700:
1686:Regular grammar
1667:
1646:
1627:Raita algorithm
1588:
1539:Bitap algorithm
1520:
1515:
1485:
1472:
1468:
1441:
1437:
1430:
1416:
1412:
1388:10.1.1.452.6317
1370:
1364:
1355:
1351:
1335:
1324:)) time (where
1310:lazy evaluation
1302:
1298:
1289:This algorithm
1280:
1269:
1242:
1239:
1238:
1236:
1232:
1226:
1219:
1191:
1182:
1174:
1166:
1162:
1158:
1154:
1150:
1141:
1137:
1133:
1129:
1125:
1121:
1117:
1110:
1106:
1102:
1098:
1094:
1087:
1083:
1079:
1075:
1071:
1067:
1057:
1053:
1049:
1045:
1041:
1037:
1030:
1026:
1022:
1018:
1014:
1010:
1002:
998:
994:
987:
978:
974:
970:
428:
427:
424:
421:
419:// substitution
418:
415:
412:
409:
406:
403:
400:
397:
394:
391:
388:
385:
382:
379:
376:
373:
370:
367:
364:
361:
358:
355:
352:
349:
346:
343:
340:
337:
334:
331:
328:
325:
322:
319:
316:
313:
310:
307:
304:
301:
298:
295:
292:
289:
286:
283:
280:
277:
274:
271:
268:
265:
262:
259:
256:
253:
250:
247:
244:
241:
238:
235:
232:
229:
226:
223:
220:
217:
214:
211:
208:
205:
202:
199:
196:
193:
190:
187:
184:
181:
178:
175:
172:
169:
166:
163:
160:
157:
154:
151:
148:
145:
142:
136:
107:for a function
86:
41:
17:
12:
11:
5:
1877:
1867:
1866:
1864:String metrics
1861:
1844:
1843:
1841:
1840:
1835:
1830:
1825:
1820:
1815:
1810:
1805:
1800:
1794:
1792:
1788:
1787:
1785:
1784:
1779:
1774:
1769:
1764:
1759:
1754:
1749:
1743:
1741:
1739:Data structure
1735:
1734:
1732:
1731:
1726:
1721:
1716:
1710:
1708:
1702:
1701:
1699:
1698:
1693:
1688:
1683:
1677:
1675:
1669:
1668:
1666:
1665:
1660:
1654:
1652:
1648:
1647:
1645:
1644:
1639:
1634:
1632:Trigram search
1629:
1624:
1619:
1614:
1609:
1604:
1598:
1596:
1590:
1589:
1587:
1586:
1581:
1576:
1571:
1566:
1561:
1556:
1551:
1546:
1541:
1536:
1530:
1528:
1522:
1521:
1514:
1513:
1506:
1499:
1491:
1484:
1483:
1466:
1435:
1428:
1410:
1352:
1350:
1347:
1334:
1331:
1330:
1329:
1306:
1287:
1284:
1255:
1252:
1249:
1246:
1229:
1223:
1216:
1190:
1187:
1171:
1170:
1147:
1146:
1145:
1114:
1091:
1061:
1034:
986:
983:
961:
960:
957:
956:
951:
948:
945:
942:
939:
936:
933:
930:
927:
923:
922:
919:
914:
911:
908:
905:
902:
899:
896:
893:
889:
888:
885:
882:
877:
874:
871:
868:
865:
862:
859:
855:
854:
851:
848:
845:
840:
837:
834:
831:
828:
825:
821:
820:
817:
814:
811:
808:
803:
800:
797:
794:
791:
787:
786:
783:
780:
777:
774:
771:
766:
761:
756:
753:
749:
748:
745:
742:
739:
736:
733:
730:
727:
724:
721:
718:
717:
714:
711:
708:
705:
702:
699:
696:
693:
691:
685:
682:
681:
676:
673:
670:
667:
664:
661:
658:
654:
653:
648:
645:
642:
639:
636:
633:
630:
626:
625:
622:
617:
614:
611:
608:
605:
602:
598:
597:
594:
591:
586:
583:
580:
577:
574:
570:
569:
566:
563:
560:
555:
552:
549:
546:
542:
541:
538:
535:
532:
529:
524:
521:
518:
514:
513:
510:
507:
504:
501:
498:
493:
490:
486:
485:
482:
479:
476:
473:
470:
467:
464:
461:
460:
457:
454:
451:
448:
445:
442:
440:
141:
85:
82:
81:
80:
77:
70:
67:
61:
55:
40:
37:
15:
9:
6:
4:
3:
2:
1876:
1865:
1862:
1860:
1857:
1856:
1854:
1839:
1836:
1834:
1831:
1829:
1826:
1824:
1821:
1819:
1816:
1814:
1811:
1809:
1806:
1804:
1801:
1799:
1796:
1795:
1793:
1789:
1783:
1780:
1778:
1775:
1773:
1770:
1768:
1765:
1763:
1760:
1758:
1755:
1753:
1750:
1748:
1745:
1744:
1742:
1740:
1736:
1730:
1727:
1725:
1722:
1720:
1717:
1715:
1712:
1711:
1709:
1707:
1703:
1697:
1694:
1692:
1689:
1687:
1684:
1682:
1679:
1678:
1676:
1674:
1670:
1664:
1661:
1659:
1656:
1655:
1653:
1649:
1643:
1640:
1638:
1635:
1633:
1630:
1628:
1625:
1623:
1620:
1618:
1615:
1613:
1610:
1608:
1605:
1603:
1600:
1599:
1597:
1595:
1591:
1585:
1582:
1580:
1577:
1575:
1572:
1570:
1567:
1565:
1562:
1560:
1557:
1555:
1552:
1550:
1549:Edit distance
1547:
1545:
1542:
1540:
1537:
1535:
1532:
1531:
1529:
1527:
1526:String metric
1523:
1519:
1512:
1507:
1505:
1500:
1498:
1493:
1492:
1489:
1480:
1476:
1470:
1462:
1458:
1455:(4): 207–12.
1454:
1450:
1446:
1439:
1431:
1425:
1421:
1414:
1406:
1402:
1398:
1394:
1389:
1384:
1380:
1376:
1369:
1362:
1360:
1358:
1353:
1346:
1343:
1340:
1327:
1323:
1319:
1315:
1311:
1307:
1296:
1292:
1288:
1285:
1276:
1272:
1253:
1250:
1247:
1244:
1230:
1224:
1217:
1214:
1210:
1207:) instead of
1206:
1202:
1201:
1196:
1195:
1194:
1186:
1180:
1148:
1115:
1092:
1065:
1064:
1062:
1035:
1008:
1007:
1006:
992:
982:
968:
952:
949:
946:
943:
940:
937:
934:
931:
928:
925:
924:
920:
915:
912:
909:
906:
903:
900:
897:
894:
891:
890:
886:
883:
878:
875:
872:
869:
866:
863:
860:
857:
856:
852:
849:
846:
841:
838:
835:
832:
829:
826:
823:
822:
818:
815:
812:
809:
804:
801:
798:
795:
792:
789:
788:
784:
781:
778:
775:
772:
767:
762:
757:
754:
751:
750:
746:
743:
740:
737:
734:
731:
728:
725:
722:
720:
719:
715:
712:
709:
706:
703:
700:
697:
694:
692:
690:
689:
686:
677:
674:
671:
668:
665:
662:
659:
656:
655:
649:
646:
643:
640:
637:
634:
631:
628:
627:
623:
618:
615:
612:
609:
606:
603:
600:
599:
595:
592:
587:
584:
581:
578:
575:
572:
571:
567:
564:
561:
556:
553:
550:
547:
544:
543:
539:
536:
533:
530:
525:
522:
519:
516:
515:
511:
508:
505:
502:
499:
494:
491:
488:
487:
483:
480:
477:
474:
471:
468:
465:
463:
462:
458:
455:
452:
449:
446:
443:
441:
439:
438:
435:
431:
139:
134:
130:
126:
122:
118:
114:
110:
106:
101:
99:
98:flood filling
95:
91:
78:
75:
71:
69:Sellers, 1974
68:
65:
62:
59:
56:
53:
50:
49:
48:
46:
36:
34:
33:edit distance
30:
26:
22:
1752:Suffix array
1658:Aho–Corasick
1583:
1569:Lee distance
1469:
1452:
1448:
1438:
1419:
1413:
1381:(1): 31–88.
1378:
1374:
1344:
1336:
1325:
1321:
1317:
1313:
1291:parallelizes
1279:time, where
1274:
1270:
1212:
1208:
1204:
1199:
1192:
1172:
1161:into all of
988:
964:
429:
404:// insertion
132:
124:
120:
116:
112:
108:
102:
87:
42:
24:
18:
1762:Suffix tree
1060:operations.
1033:characters.
389:// deletion
72:Wagner and
1853:Categories
1349:References
123:of length
115:of length
105:pseudocode
1405:207551224
1383:CiteSeerX
1165:, and so
991:invariant
967:invariant
1481:), 2007.
146:Distance
143:function
109:Distance
94:prefixes
52:Vintsyuk
1828:Sorting
1798:Parsing
1518:Strings
1303:minimum
1266:
1237:
371:minimum
197:element
182:declare
74:Fischer
64:Sankoff
39:History
1479:ESSLLI
1426:
1403:
1385:
422:return
119:, and
90:matrix
76:, 1974
66:, 1972
60:, 1970
54:, 1968
23:, the
1791:Other
1747:DAFSA
1714:BLAST
1401:S2CID
1371:(PDF)
1320:(1 +
1153:into
1038:s = t
997:into
973:into
27:is a
1782:Trie
1772:Rope
1424:ISBN
1299:cost
1136:for
965:The
350:else
311:from
290:from
260:from
224:from
209:zero
194:each
161:char
152:char
1457:doi
1393:doi
1142:k+1
1124:in
1120:to
1111:k+1
1101:in
1097:to
1088:k+1
1086:in
1074:in
1070:to
1048:in
1044:to
1036:If
1025:to
305:for
284:for
254:for
218:for
191:set
185:int
19:In
1855::
1453:43
1451:.
1447:.
1399:.
1391:.
1379:33
1377:.
1373:.
1356:^
1275:kl
1213:mn
926:y
921:4
892:a
887:5
858:d
853:6
824:n
819:6
790:u
785:7
752:S
747:8
716:y
713:a
710:d
707:r
704:u
701:t
698:a
695:S
657:g
629:n
624:3
601:i
596:3
573:t
568:4
545:t
540:5
517:i
512:6
489:s
484:6
459:n
456:e
453:t
450:t
447:i
444:k
368::=
359::=
344::=
326:if
317:to
296:to
278::=
266:to
242::=
230:to
206:to
200:in
1510:e
1503:t
1496:v
1463:.
1459::
1432:.
1407:.
1395::
1326:d
1322:d
1318:m
1316:(
1314:O
1281:l
1277:)
1273:(
1271:O
1254:1
1251:+
1248:k
1245:2
1233:k
1228:.
1222:.
1220:j
1211:(
1209:O
1205:m
1203:(
1200:O
1183:d
1175:d
1167:d
1163:t
1159:s
1155:t
1151:s
1138:t
1134:s
1130:s
1126:k
1122:t
1118:s
1107:s
1103:k
1099:t
1095:s
1084:t
1080:t
1076:k
1072:t
1068:s
1058:k
1054:s
1050:k
1046:t
1042:s
1031:j
1027:t
1023:s
1019:i
1015:t
1011:s
1003:d
999:t
995:s
979:d
975:t
971:s
954:3
950:4
947:5
944:5
941:5
938:4
935:4
932:5
929:6
917:3
913:4
910:4
907:4
904:4
901:3
898:4
895:5
884:4
880:3
876:4
873:3
870:3
867:3
864:3
861:4
850:5
847:4
843:3
839:3
836:2
833:2
830:2
827:3
816:5
813:4
810:3
806:2
802:2
799:1
796:1
793:2
782:6
779:5
776:4
773:3
769:2
764:1
759:0
755:1
744:7
741:6
738:5
735:4
732:3
729:2
726:1
723:0
679:3
675:4
672:4
669:5
666:6
663:7
660:7
651:2
647:3
644:3
641:4
638:5
635:6
632:6
620:2
616:2
613:3
610:4
607:5
604:5
593:2
589:1
585:2
582:3
579:4
576:4
565:3
562:2
558:1
554:2
551:3
548:3
537:4
534:3
531:2
527:1
523:2
520:2
509:5
506:4
503:3
500:2
496:1
492:1
481:5
478:4
475:3
472:2
469:1
466:0
425:d
416:)
410:+
407:d
401:,
398:1
395:+
392:d
386:,
383:1
380:+
377:d
374:(
365:d
362:1
353::
347:0
338::
335:t
332:=
329:s
323::
320:m
314:1
308:i
302::
299:n
293:1
287:j
281:j
275:d
272::
269:n
263:1
257:j
245:i
239:d
236::
233:m
227:1
221:i
203:d
188:d
170::
167:)
164:t
158:,
155:s
149:(
133:d
125:n
121:t
117:m
113:s
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.