773:
259:
1907:.) Thus, we need to consider only two symmetric ways of modifying a substring more than once: (1) transpose letters and insert an arbitrary number of characters between them, or (2) delete a sequence of characters and transpose letters that become adjacent after deletion. The straightforward implementation of this idea gives an algorithm of cubic complexity:
768:{\displaystyle d_{a,b}(i,j)=\min {\begin{cases}0&{\text{if }}i=j=0,\\d_{a,b}(i-1,j)+1&{\text{if }}i>0,\\d_{a,b}(i,j-1)+1&{\text{if }}j>0,\\d_{a,b}(i-1,j-1)+1_{(a_{i}\neq b_{j})}&{\text{if }}i,j>0,\\d_{a,b}(i-2,j-2)+1_{(a_{i}\neq b_{j})}&{\text{if }}i,j>1{\text{ and }}a_{i}=b_{j-1}{\text{ and }}a_{i-1}=b_{j},\\\end{cases}}}
62:
In his seminal paper, Damerau stated that in an investigation of spelling errors for an information-retrieval system, more than 80% were a result of a single error of one of the four types. Damerau's paper considered only misspellings that could be corrected with at most one edit operation. While the
2074:
The algorithm can be used with any set of words, like vendor names. Since entry is manual by nature, there is a risk of entering a false vendor. A fraudster employee may enter one real vendor such as "Rich Heir Estate
Services" versus a false vendor "Rich Hier State Services". The fraudster would
2057:
frequently undergoes insertions, deletions, substitutions, and transpositions, and each of these operations occurs on approximately the same timescale, the
Damerau–Levenshtein distance is an appropriate metric of the variation between two strands of DNA. More common in DNA, protein, and other
2033:. In natural languages, strings are short and the number of errors (misspellings) rarely exceeds 2. In such circumstances, restricted and real edit distance differ very rarely. Oommen and Loke even mitigated the limitation of the restricted edit distance by introducing
1823:
To devise a proper algorithm to calculate unrestricted
Damerau–Levenshtein distance, note that there always exists an optimal sequence of edit operations, where once-transposed letters are never modified afterwards. (This holds as long as the cost of a transposition,
2075:
then create a false bank account and have the company route checks to the real vendor and false vendor. The
Damerau–Levenshtein algorithm will detect the transposed and dropped letter and bring attention of the items to a fraud examiner.
1406:, while the second one computes the Damerau–Levenshtein distance with adjacent transpositions. Adding transpositions adds significant complexity. The difference between the two algorithms consists in that the
47:
between two sequences. Informally, the
Damerau–Levenshtein distance between two words is the minimum number of operations (consisting of insertions, deletions or substitutions of a single character, or
1969:
1216:
1115:
1308:
1905:
2290:
Majorek, Karolina A.; Dunin-Horkawicz, Stanisław; et al. (2013), "The RNase H-like superfamily: new members, comparative structural analysis and evolutionary classification",
826:
2194:. Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics. pp. 1025–1032.
1680:
The following algorithm computes the true
Damerau–Levenshtein distance with adjacent transpositions; this algorithm requires as an additional parameter the size of the alphabet
1006:
936:
161:
2012:
870:
59:
by including transpositions among its allowable operations in addition to the three classical single-character edit operations (insertions, deletions and substitutions).
1849:
1384:
1344:
247:
225:
204:
182:
113:
93:
2164:, Conferences in Research and Practice in Information Technology, vol. 68, Darlinghurst, Australia: Australian Computer Society, Inc., pp. 117–124,
1469:
because that would require the substring to be edited more than once, which is not allowed in OSA, and therefore the shortest sequence of operations is
2377:
Oommen, B. J.; Loke, R. K. S. (1997). "Pattern recognition of strings with substitutions, insertions, deletions and generalized transpositions".
2158:
Bard, Gregory V. (2007), "Spelling-error tolerant, order-independent pass-phrases via the
Damerau–Levenshtein string-edit distance metric",
2775:
2463:
2670:
2709:
2635:
2161:
Proceedings of the Fifth
Australasian Symposium on ACSW Frontiers : 2007, Ballarat, Australia, January 30 - February 2, 2007
2640:
1910:
2169:
1122:
1021:
2645:
2186:
2119:
2021:
can be modified to process transposition. See the information retrieval section of for an example of such an adaptation.
2856:
2630:
2724:
2665:
2537:
2222:
Levenshtein, Vladimir I. (February 1966), "Binary codes capable of correcting deletions, insertions, and reversals",
1237:
67:, Damerau–Levenshtein distance has also seen uses in biology to measure the variation between protein sequences.
2752:
2059:
1854:
2897:
2757:
2612:
2063:
1522:
778:
2427:
Lowrance, Roy; Wagner, Robert A. (April 1975), "An
Extension of the String-to-String Correction Problem",
2892:
2836:
2562:
2127:. Proceedings of the 38th Annual Meeting on Association for Computational Linguistics. pp. 286–293.
2030:
1219:
49:
2841:
2719:
2686:
2622:
1979:
are string lengths. Using the ideas of
Lowrance and Wagner, this naive algorithm can be improved to be
63:
original motivation was to measure distance between human misspellings to improve applications such as
951:
881:
2851:
2747:
2691:
2592:
2546:
2083:
The U.S. Government uses the Damerau–Levenshtein distance with its Consolidated Screening List API.
305:
2846:
2650:
2582:
2399:
118:
2887:
2795:
2037:. Nevertheless, one must remember that the restricted edit distance usually does not satisfy the
1982:
1410:
computes the number of edit operations needed to make the strings equal under the condition that
835:
2471:
2394:
1659:
The difference from the algorithm for Levenshtein distance is the addition of one recurrence:
2800:
2742:
2602:
2530:
1117:
corresponds to a match or mismatch, depending on whether the respective symbols are the same,
2607:
2386:
2248:
2231:
1827:
1529:
1521:
Optimal string alignment distance can be computed using a straightforward extension of the
1353:
1313:
56:
36:
32:
875:
Each recursive call matches one of the cases covered by the Damerau–Levenshtein distance:
8:
2805:
2038:
1525:
1486:
1398:
Presented here are two algorithms: the first, simpler one, computes what is known as the
2390:
2235:
2101: – Form of cybersquatting which relies on mistakes when inputting a website address
2058:
bioinformatics related alignment tasks is the use of closely related algorithms such as
2734:
2701:
2505:
2446:
2356:
2312:
2270:
829:
232:
210:
189:
167:
98:
78:
20:
2408:
2251:(March 1964), "A technique for computer detection and correction of spelling errors",
2866:
2509:
2339:
Boytsov, Leonid (May 2011). "Indexing methods for approximate dictionary searching".
2317:
2165:
2450:
2360:
2159:
2861:
2831:
2785:
2587:
2523:
2497:
2436:
2404:
2348:
2307:
2299:
2274:
2260:
2195:
2128:
24:
2714:
2655:
2567:
2470:. International Trade Administration, U.S. Department of Commerce. Archived from
2018:
2767:
2660:
2488:
Navarro, Gonzalo (March 2001), "A guided tour to approximate string matching",
2205:
2188:
Exploring distributional similarity based models for query spelling correction
2138:
2881:
2577:
2554:
2098:
1726:// note that d has indices starting at −1, while a, b and da are one-indexed.
64:
44:
40:
2352:
2200:
2133:
2095:– Suggests corrections that are based on a Damerau–Levenshtein distance of 1
2780:
2597:
2321:
2289:
2501:
2441:
2303:
2265:
1851:, is at least the average of the cost of an insertion and deletion, i.e.,
2790:
2042:
52:
of two adjacent characters) required to change one word into the other.
1533:
1724:
d be a 2-d array of integers, dimensions length(a)+2, length(b)+2
1557:
d be a 2-d array of integers, dimensions length(a)+1, length(b)+1
2826:
1705:: distance, integer da := new array of |Σ| integers
1728:
maxdist := length(a) + length(b) d := maxdist
1649:
d := minimum(d, d + 1)
75:
To express the Damerau–Levenshtein distance between two strings
2092:
1559:// note that d is zero-indexed, while a and b are one-indexed.
2121:
An Improved Error Model for Noisy Channel Spelling Correction
2014:
in the worst case, which is what the above pseudocode does.
55:
The Damerau–Levenshtein distance differs from the classical
2810:
1485:. Note that for the optimal string alignment distance, the
761:
2515:
2054:
2029:
Damerau–Levenshtein distance plays an important role in
1964:{\displaystyle O{\big (}M\cdot N\cdot \max(M,N){\big )}}
1800:
cost := 1 d := minimum(d + cost,
1796:cost := 0 db := j
1675:
1630:
cost := 1 d := minimum(d + 1,
1414:, whereas the second one presents no such restriction.
1234:
is then given by the function value for full strings:
1211:{\displaystyle d_{a,b}(i-2,j-2)+1_{(a_{i}\neq b_{j})}}
1110:{\displaystyle d_{a,b}(i-1,j-1)+1_{(a_{i}\neq b_{j})}}
1985:
1913:
1857:
1830:
1356:
1316:
1240:
1125:
1024:
954:
884:
838:
781:
262:
235:
213:
192:
170:
121:
101:
81:
1669:d := minimum(d, d + 1)
1516:
1788:k := da] ℓ := db
2006:
1963:
1899:
1843:
1378:
1338:
1302:
1210:
1109:
1000:
930:
864:
820:
767:
241:
219:
198:
176:
155:
107:
87:
163:is defined, whose value is a distance between an
2879:
1936:
1445:, but the optimal string alignment distance OSA(
297:
2426:
1303:{\displaystyle d_{a,b}{\big (}|a|,|b|{\big )}}
2531:
1956:
1919:
1295:
1259:
2221:
1684:, so that all entries of the arrays are in
1417:Take for example the edit distance between
2538:
2524:
2376:
2117:
1758:d := maxdist d := j
1743:d := maxdist d := i
1665:i > 1 and j > 1 and a = b and a = b
1645:i > 1 and j > 1 and a = b and a = b
2440:
2398:
2334:
2332:
2330:
2311:
2285:
2283:
2264:
2199:
2132:
1226:The Damerau–Levenshtein distance between
2710:Comparison of regular-expression engines
2372:
2370:
2487:
2338:
2247:
2880:
2422:
2420:
2418:
2327:
2280:
2153:
2151:
2118:Brill, Eric; Moore, Robert C. (2000).
1900:{\displaystyle 2W_{T}\geq W_{I}+W_{D}}
1425:. The Damerau–Levenshtein distance LD(
2671:Zhu–Takaoka string matching algorithm
2519:
2367:
1676:Distance with adjacent transpositions
1412:no substring is edited more than once
821:{\displaystyle 1_{(a_{i}\neq b_{j})}}
186:prefix (initial substring) of string
2341:Journal of Experimental Algorithmics
2157:
256:function is defined recursively as:
2636:Boyer–Moore string-search algorithm
2415:
2148:
1513:), and so it is not a true metric.
1461:is used, it is not possible to use
13:
2481:
2184:
2069:
1408:optimal string alignment algorithm
1008:corresponds to an insertion (from
14:
2909:
2725:Nondeterministic finite automaton
2666:Two-way string-matching algorithm
2464:"Consolidated Screening List API"
2078:
1517:Optimal string alignment distance
1400:optimal string alignment distance
1001:{\displaystyle d_{a,b}(i,j-1)+1}
938:corresponds to a deletion (from
931:{\displaystyle d_{a,b}(i-1,j)+1}
2456:
2041:, and thus cannot be used with
2024:
1453:) = 3 because if the operation
1222:between two successive symbols.
2641:Boyer–Moore–Horspool algorithm
2631:Apostolico–Giancarlo algorithm
2241:
2215:
2178:
2111:
2001:
1989:
1951:
1939:
1372:
1364:
1332:
1324:
1289:
1281:
1273:
1265:
1203:
1177:
1166:
1142:
1102:
1076:
1065:
1041:
989:
971:
919:
901:
813:
787:
660:
634:
623:
599:
549:
523:
512:
488:
440:
422:
374:
356:
291:
279:
150:
138:
1:
2409:10.1016/S0031-3203(96)00101-X
2105:
1346:denotes the length of string
70:
2646:Knuth–Morris–Pratt algorithm
2573:Damerau–Levenshtein distance
1393:
156:{\displaystyle d_{a,b}(i,j)}
29:Damerau–Levenshtein distance
7:
2837:Compressed pattern matching
2563:Approximate string matching
2545:
2086:
2031:natural language processing
2017:It is interesting that the
2007:{\displaystyle O(M\cdot N)}
1812:d + (i−k−1) + 1 + (j-ℓ−1))
1626:cost := 0
865:{\displaystyle a_{i}=b_{j}}
10:
2914:
2842:Longest common subsequence
2753:Needleman–Wunsch algorithm
2623:String-searching algorithm
2468:Trade.gov Developer Portal
2060:Needleman–Wunsch algorithm
2035:generalized transpositions
872:and equal to 1 otherwise.
16:Metric in computer science
2852:Sequential pattern mining
2819:
2766:
2733:
2700:
2692:Commentz-Walter algorithm
2680:Multiple string searching
2679:
2621:
2613:Wagner–Fischer algorithm
2553:
2253:Communications of the ACM
2862:String rewriting systems
2847:Longest common substring
2758:Smith–Waterman algorithm
2583:Gestalt pattern matching
2185:Li; et al. (2006).
2064:Smith–Waterman algorithm
1553:: distance, integer
1528:algorithm that computes
1404:restricted edit distance
2796:Generalized suffix tree
2720:Thompson's construction
2353:10.1145/1963190.1963191
2201:10.3115/1220175.1220304
2134:10.3115/1075218.1075255
37:Vladimir I. Levenshtein
2748:Hirschberg's algorithm
2292:Nucleic Acids Research
2224:Soviet Physics Doklady
2048:
2008:
1965:
1901:
1845:
1380:
1340:
1304:
1212:
1111:
1002:
932:
866:
822:
769:
243:
221:
200:
178:
157:
109:
89:
2603:Levenshtein automaton
2593:Jaro–Winkler distance
2502:10.1145/375360.375365
2490:ACM Computing Surveys
2442:10.1145/321879.321880
2266:10.1145/363958.363994
2009:
1966:
1902:
1846:
1844:{\displaystyle W_{T}}
1773:db := 0
1381:
1379:{\displaystyle j=|b|}
1341:
1339:{\displaystyle i=|a|}
1305:
1213:
1112:
1003:
933:
867:
823:
770:
244:
222:
201:
179:
158:
110:
90:
2651:Rabin–Karp algorithm
2608:Levenshtein distance
1983:
1911:
1855:
1828:
1530:Levenshtein distance
1354:
1314:
1238:
1123:
1022:
952:
882:
836:
779:
260:
233:
211:
190:
168:
119:
99:
79:
57:Levenshtein distance
33:Frederick J. Damerau
2898:Dynamic programming
2806:Ternary search tree
2391:1997PatRe..30..789O
2379:Pattern Recognition
2304:10.1093/nar/gkt1414
2236:1966SPhD...10..707L
2039:triangle inequality
1701:: strings a, b
1549:: strings a, b
1526:dynamic programming
1489:does not hold: OSA(
1487:triangle inequality
254:restricted distance
2893:Information theory
2735:Sequence alignment
2702:Regular expression
2004:
1961:
1897:
1841:
1816:da] := i
1376:
1336:
1300:
1208:
1107:
998:
928:
862:
830:indicator function
818:
765:
760:
239:
217:
196:
174:
153:
105:
85:
43:for measuring the
21:information theory
2875:
2874:
2867:String operations
2171:978-1-920682-49-1
1720:da := 0
1386:is the length of
1218:corresponds to a
724:
690:
670:
559:
454:
388:
316:
242:{\displaystyle b}
220:{\displaystyle j}
199:{\displaystyle a}
177:{\displaystyle i}
108:{\displaystyle b}
88:{\displaystyle a}
2905:
2832:Pattern matching
2786:Suffix automaton
2588:Hamming distance
2540:
2533:
2526:
2517:
2516:
2512:
2476:
2475:
2460:
2454:
2453:
2444:
2424:
2413:
2412:
2402:
2374:
2365:
2364:
2336:
2325:
2324:
2315:
2298:(7): 4160–4179,
2287:
2278:
2277:
2268:
2249:Damerau, Fred J.
2245:
2239:
2238:
2219:
2213:
2212:
2210:
2204:. Archived from
2203:
2193:
2182:
2176:
2174:
2155:
2146:
2145:
2143:
2137:. Archived from
2136:
2126:
2115:
2013:
2011:
2010:
2005:
1970:
1968:
1967:
1962:
1960:
1959:
1923:
1922:
1906:
1904:
1903:
1898:
1896:
1895:
1883:
1882:
1870:
1869:
1850:
1848:
1847:
1842:
1840:
1839:
1687:
1683:
1671:// transposition
1651:// transposition
1590:d := j
1575:d := i
1389:
1385:
1383:
1382:
1377:
1375:
1367:
1349:
1345:
1343:
1342:
1337:
1335:
1327:
1309:
1307:
1306:
1301:
1299:
1298:
1292:
1284:
1276:
1268:
1263:
1262:
1256:
1255:
1233:
1229:
1217:
1215:
1214:
1209:
1207:
1206:
1202:
1201:
1189:
1188:
1141:
1140:
1116:
1114:
1113:
1108:
1106:
1105:
1101:
1100:
1088:
1087:
1040:
1039:
1007:
1005:
1004:
999:
970:
969:
937:
935:
934:
929:
900:
899:
871:
869:
868:
863:
861:
860:
848:
847:
832:equal to 0 when
827:
825:
824:
819:
817:
816:
812:
811:
799:
798:
774:
772:
771:
766:
764:
763:
754:
753:
741:
740:
725:
722:
720:
719:
701:
700:
691:
688:
671:
668:
664:
663:
659:
658:
646:
645:
598:
597:
560:
557:
553:
552:
548:
547:
535:
534:
487:
486:
455:
452:
421:
420:
389:
386:
355:
354:
317:
314:
278:
277:
248:
246:
245:
240:
228:
226:
224:
223:
218:
205:
203:
202:
197:
185:
183:
181:
180:
175:
162:
160:
159:
154:
137:
136:
114:
112:
111:
106:
94:
92:
91:
86:
25:computer science
2913:
2912:
2908:
2907:
2906:
2904:
2903:
2902:
2878:
2877:
2876:
2871:
2815:
2762:
2729:
2715:Regular grammar
2696:
2675:
2656:Raita algorithm
2617:
2568:Bitap algorithm
2549:
2544:
2484:
2482:Further reading
2479:
2462:
2461:
2457:
2425:
2416:
2375:
2368:
2337:
2328:
2288:
2281:
2246:
2242:
2220:
2216:
2208:
2191:
2183:
2179:
2172:
2156:
2149:
2141:
2124:
2116:
2112:
2108:
2089:
2081:
2072:
2070:Fraud detection
2051:
2027:
2019:bitap algorithm
1984:
1981:
1980:
1955:
1954:
1918:
1917:
1912:
1909:
1908:
1891:
1887:
1878:
1874:
1865:
1861:
1856:
1853:
1852:
1835:
1831:
1829:
1826:
1825:
1821:
1814://transposition
1685:
1681:
1678:
1673:
1657:
1640:// substitution
1519:
1396:
1387:
1371:
1363:
1355:
1352:
1351:
1347:
1331:
1323:
1315:
1312:
1311:
1294:
1293:
1288:
1280:
1272:
1264:
1258:
1257:
1245:
1241:
1239:
1236:
1235:
1231:
1227:
1197:
1193:
1184:
1180:
1176:
1172:
1130:
1126:
1124:
1121:
1120:
1096:
1092:
1083:
1079:
1075:
1071:
1029:
1025:
1023:
1020:
1019:
959:
955:
953:
950:
949:
889:
885:
883:
880:
879:
856:
852:
843:
839:
837:
834:
833:
807:
803:
794:
790:
786:
782:
780:
777:
776:
759:
758:
749:
745:
730:
726:
723: and
721:
709:
705:
696:
692:
689: and
687:
667:
665:
654:
650:
641:
637:
633:
629:
587:
583:
580:
579:
556:
554:
543:
539:
530:
526:
522:
518:
476:
472:
469:
468:
451:
449:
410:
406:
403:
402:
385:
383:
344:
340:
337:
336:
313:
311:
301:
300:
267:
263:
261:
258:
257:
234:
231:
230:
212:
209:
208:
207:
191:
188:
187:
169:
166:
165:
164:
126:
122:
120:
117:
116:
100:
97:
96:
80:
77:
76:
73:
17:
12:
11:
5:
2911:
2901:
2900:
2895:
2890:
2888:String metrics
2873:
2872:
2870:
2869:
2864:
2859:
2854:
2849:
2844:
2839:
2834:
2829:
2823:
2821:
2817:
2816:
2814:
2813:
2808:
2803:
2798:
2793:
2788:
2783:
2778:
2772:
2770:
2768:Data structure
2764:
2763:
2761:
2760:
2755:
2750:
2745:
2739:
2737:
2731:
2730:
2728:
2727:
2722:
2717:
2712:
2706:
2704:
2698:
2697:
2695:
2694:
2689:
2683:
2681:
2677:
2676:
2674:
2673:
2668:
2663:
2661:Trigram search
2658:
2653:
2648:
2643:
2638:
2633:
2627:
2625:
2619:
2618:
2616:
2615:
2610:
2605:
2600:
2595:
2590:
2585:
2580:
2575:
2570:
2565:
2559:
2557:
2551:
2550:
2543:
2542:
2535:
2528:
2520:
2514:
2513:
2483:
2480:
2478:
2477:
2474:on 2019-10-30.
2455:
2435:(2): 177–183,
2414:
2400:10.1.1.50.1459
2385:(5): 789–800.
2366:
2326:
2279:
2259:(3): 171–176,
2240:
2230:(8): 707–710,
2214:
2211:on 2010-04-01.
2177:
2170:
2147:
2144:on 2012-12-21.
2109:
2107:
2104:
2103:
2102:
2096:
2088:
2085:
2080:
2079:Export control
2077:
2071:
2068:
2050:
2047:
2026:
2023:
2003:
2000:
1997:
1994:
1991:
1988:
1958:
1953:
1950:
1947:
1944:
1941:
1938:
1935:
1932:
1929:
1926:
1921:
1916:
1894:
1890:
1886:
1881:
1877:
1873:
1868:
1864:
1860:
1838:
1834:
1802://substitution
1690:
1677:
1674:
1661:
1538:
1523:Wagner–Fischer
1518:
1515:
1433:) = 2 because
1395:
1392:
1374:
1370:
1366:
1362:
1359:
1334:
1330:
1326:
1322:
1319:
1297:
1291:
1287:
1283:
1279:
1275:
1271:
1267:
1261:
1254:
1251:
1248:
1244:
1224:
1223:
1205:
1200:
1196:
1192:
1187:
1183:
1179:
1175:
1171:
1168:
1165:
1162:
1159:
1156:
1153:
1150:
1147:
1144:
1139:
1136:
1133:
1129:
1118:
1104:
1099:
1095:
1091:
1086:
1082:
1078:
1074:
1070:
1067:
1064:
1061:
1058:
1055:
1052:
1049:
1046:
1043:
1038:
1035:
1032:
1028:
1017:
997:
994:
991:
988:
985:
982:
979:
976:
973:
968:
965:
962:
958:
947:
927:
924:
921:
918:
915:
912:
909:
906:
903:
898:
895:
892:
888:
859:
855:
851:
846:
842:
815:
810:
806:
802:
797:
793:
789:
785:
762:
757:
752:
748:
744:
739:
736:
733:
729:
718:
715:
712:
708:
704:
699:
695:
686:
683:
680:
677:
674:
666:
662:
657:
653:
649:
644:
640:
636:
632:
628:
625:
622:
619:
616:
613:
610:
607:
604:
601:
596:
593:
590:
586:
582:
581:
578:
575:
572:
569:
566:
563:
555:
551:
546:
542:
538:
533:
529:
525:
521:
517:
514:
511:
508:
505:
502:
499:
496:
493:
490:
485:
482:
479:
475:
471:
470:
467:
464:
461:
458:
450:
448:
445:
442:
439:
436:
433:
430:
427:
424:
419:
416:
413:
409:
405:
404:
401:
398:
395:
392:
384:
382:
379:
376:
373:
370:
367:
364:
361:
358:
353:
350:
347:
343:
339:
338:
335:
332:
329:
326:
323:
320:
312:
310:
307:
306:
304:
299:
296:
293:
290:
287:
284:
281:
276:
273:
270:
266:
238:
216:
195:
173:
152:
149:
146:
143:
140:
135:
132:
129:
125:
104:
84:
72:
69:
65:spell checkers
15:
9:
6:
4:
3:
2:
2910:
2899:
2896:
2894:
2891:
2889:
2886:
2885:
2883:
2868:
2865:
2863:
2860:
2858:
2855:
2853:
2850:
2848:
2845:
2843:
2840:
2838:
2835:
2833:
2830:
2828:
2825:
2824:
2822:
2818:
2812:
2809:
2807:
2804:
2802:
2799:
2797:
2794:
2792:
2789:
2787:
2784:
2782:
2779:
2777:
2774:
2773:
2771:
2769:
2765:
2759:
2756:
2754:
2751:
2749:
2746:
2744:
2741:
2740:
2738:
2736:
2732:
2726:
2723:
2721:
2718:
2716:
2713:
2711:
2708:
2707:
2705:
2703:
2699:
2693:
2690:
2688:
2685:
2684:
2682:
2678:
2672:
2669:
2667:
2664:
2662:
2659:
2657:
2654:
2652:
2649:
2647:
2644:
2642:
2639:
2637:
2634:
2632:
2629:
2628:
2626:
2624:
2620:
2614:
2611:
2609:
2606:
2604:
2601:
2599:
2596:
2594:
2591:
2589:
2586:
2584:
2581:
2579:
2578:Edit distance
2576:
2574:
2571:
2569:
2566:
2564:
2561:
2560:
2558:
2556:
2555:String metric
2552:
2548:
2541:
2536:
2534:
2529:
2527:
2522:
2521:
2518:
2511:
2507:
2503:
2499:
2495:
2491:
2486:
2485:
2473:
2469:
2465:
2459:
2452:
2448:
2443:
2438:
2434:
2430:
2423:
2421:
2419:
2410:
2406:
2401:
2396:
2392:
2388:
2384:
2380:
2373:
2371:
2362:
2358:
2354:
2350:
2346:
2342:
2335:
2333:
2331:
2323:
2319:
2314:
2309:
2305:
2301:
2297:
2293:
2286:
2284:
2276:
2272:
2267:
2262:
2258:
2254:
2250:
2244:
2237:
2233:
2229:
2225:
2218:
2207:
2202:
2197:
2190:
2189:
2181:
2173:
2167:
2163:
2162:
2154:
2152:
2140:
2135:
2130:
2123:
2122:
2114:
2110:
2100:
2099:Typosquatting
2097:
2094:
2091:
2090:
2084:
2076:
2067:
2065:
2061:
2056:
2046:
2044:
2040:
2036:
2032:
2022:
2020:
2015:
1998:
1995:
1992:
1986:
1978:
1974:
1948:
1945:
1942:
1933:
1930:
1927:
1924:
1914:
1892:
1888:
1884:
1879:
1875:
1871:
1866:
1862:
1858:
1836:
1832:
1819:
1815:
1811:
1807:
1803:
1799:
1795:
1791:
1787:
1784:
1780:
1776:
1772:
1769:
1765:
1761:
1757:
1754:
1750:
1746:
1742:
1739:
1735:
1731:
1727:
1723:
1719:
1716:
1712:
1708:
1704:
1700:
1697:
1693:
1689:
1672:
1668:
1664:
1660:
1655:
1652:
1648:
1644:
1641:
1637:
1633:
1629:
1625:
1621:
1618:
1615:
1611:
1607:
1604:
1601:
1597:
1593:
1589:
1586:
1582:
1578:
1574:
1571:
1567:
1563:
1560:
1556:
1552:
1548:
1545:
1542:OSA-distance
1541:
1537:
1535:
1531:
1527:
1524:
1514:
1512:
1508:
1504:
1500:
1496:
1492:
1488:
1484:
1480:
1476:
1472:
1468:
1464:
1460:
1456:
1452:
1448:
1444:
1440:
1436:
1432:
1428:
1424:
1420:
1415:
1413:
1409:
1405:
1401:
1391:
1368:
1360:
1357:
1328:
1320:
1317:
1285:
1277:
1269:
1252:
1249:
1246:
1242:
1221:
1220:transposition
1198:
1194:
1190:
1185:
1181:
1173:
1169:
1163:
1160:
1157:
1154:
1151:
1148:
1145:
1137:
1134:
1131:
1127:
1119:
1097:
1093:
1089:
1084:
1080:
1072:
1068:
1062:
1059:
1056:
1053:
1050:
1047:
1044:
1036:
1033:
1030:
1026:
1018:
1015:
1011:
995:
992:
986:
983:
980:
977:
974:
966:
963:
960:
956:
948:
945:
941:
925:
922:
916:
913:
910:
907:
904:
896:
893:
890:
886:
878:
877:
876:
873:
857:
853:
849:
844:
840:
831:
808:
804:
800:
795:
791:
783:
755:
750:
746:
742:
737:
734:
731:
727:
716:
713:
710:
706:
702:
697:
693:
684:
681:
678:
675:
672:
655:
651:
647:
642:
638:
630:
626:
620:
617:
614:
611:
608:
605:
602:
594:
591:
588:
584:
576:
573:
570:
567:
564:
561:
544:
540:
536:
531:
527:
519:
515:
509:
506:
503:
500:
497:
494:
491:
483:
480:
477:
473:
465:
462:
459:
456:
446:
443:
437:
434:
431:
428:
425:
417:
414:
411:
407:
399:
396:
393:
390:
380:
377:
371:
368:
365:
362:
359:
351:
348:
345:
341:
333:
330:
327:
324:
321:
318:
308:
302:
294:
288:
285:
282:
274:
271:
268:
264:
255:
250:
236:
214:
193:
171:
147:
144:
141:
133:
130:
127:
123:
115:, a function
102:
82:
68:
66:
60:
58:
53:
51:
50:transposition
46:
45:edit distance
42:
41:string metric
38:
34:
31:(named after
30:
26:
22:
2781:Suffix array
2687:Aho–Corasick
2598:Lee distance
2572:
2496:(1): 31–88,
2493:
2489:
2472:the original
2467:
2458:
2432:
2428:
2382:
2378:
2344:
2340:
2295:
2291:
2256:
2252:
2243:
2227:
2223:
2217:
2206:the original
2187:
2180:
2160:
2139:the original
2120:
2113:
2082:
2073:
2052:
2043:metric trees
2034:
2028:
2025:Applications
2016:
1976:
1972:
1822:
1817:
1813:
1809:
1805:
1801:
1797:
1793:
1789:
1785:
1782:
1778:
1777:j := 1
1774:
1770:
1767:
1763:
1762:i := 1
1759:
1755:
1752:
1748:
1747:j := 0
1744:
1740:
1737:
1733:
1732:i := 0
1729:
1725:
1721:
1717:
1714:
1710:
1709:i := 1
1706:
1702:
1698:
1695:
1694:DL-distance
1691:
1686:[0, |Σ|)
1679:
1670:
1666:
1662:
1658:
1653:
1650:
1646:
1642:
1639:
1636:// insertion
1635:
1631:
1627:
1623:
1619:
1616:
1613:
1609:
1608:j := 1
1605:
1602:
1599:
1595:
1594:i := 1
1591:
1587:
1584:
1580:
1579:j := 0
1576:
1572:
1569:
1565:
1564:i := 0
1561:
1558:
1554:
1550:
1546:
1543:
1539:
1520:
1510:
1506:
1502:
1498:
1494:
1490:
1482:
1478:
1474:
1470:
1466:
1462:
1458:
1454:
1450:
1446:
1442:
1438:
1434:
1430:
1426:
1422:
1418:
1416:
1411:
1407:
1403:
1399:
1397:
1225:
1013:
1009:
943:
939:
874:
253:
251:
74:
61:
54:
28:
18:
2791:Suffix tree
1808:d + 1,
1806://insertion
1804:d + 1,
1638:d + cost)
1634:d + 1,
1632:// deletion
1505:) < OSA(
2882:Categories
2106:References
1810://deletion
1781:length(b)
1766:length(a)
1751:length(b)
1736:length(a)
1612:length(b)
1598:length(a)
1583:length(b)
1568:length(a)
1534:pseudocode
229:prefix of
71:Definition
2510:207551224
2395:CiteSeerX
1996:⋅
1934:⋅
1928:⋅
1872:≥
1783:inclusive
1768:inclusive
1753:inclusive
1738:inclusive
1715:inclusive
1692:algorithm
1614:inclusive
1600:inclusive
1585:inclusive
1570:inclusive
1540:algorithm
1394:Algorithm
1191:≠
1161:−
1149:−
1090:≠
1060:−
1048:−
984:−
908:−
801:≠
735:−
714:−
648:≠
618:−
606:−
537:≠
507:−
495:−
435:−
363:−
2451:18892193
2361:15635688
2322:24464998
2087:See also
1971:, where
1497:) + OSA(
1310:, where
669:if
558:if
453:if
387:if
315:if
2857:Sorting
2827:Parsing
2547:Strings
2387:Bibcode
2313:3985635
2275:7713345
2232:Bibcode
1509:,
1501:,
1493:,
1449:,
1429:,
828:is the
227:-symbol
184:-symbol
39:) is a
2508:
2449:
2397:
2359:
2320:
2310:
2273:
2168:
2093:Ispell
2053:Since
1818:return
1792:a = b
1703:output
1654:return
1622:a = b
1551:output
1350:, and
775:where
206:and a
27:, the
2820:Other
2776:DAFSA
2743:BLAST
2506:S2CID
2447:S2CID
2429:J ACM
2357:S2CID
2347:: 1.
2271:S2CID
2209:(PDF)
2192:(PDF)
2142:(PDF)
2125:(PDF)
1699:input
1547:input
1532:. In
2811:Trie
2801:Rope
2318:PMID
2166:ISBN
1975:and
1798:else
1794:then
1713:|Σ|
1667:then
1647:then
1628:else
1624:then
1421:and
1230:and
682:>
571:>
460:>
394:>
252:The
95:and
35:and
23:and
2498:doi
2437:doi
2405:doi
2349:doi
2308:PMC
2300:doi
2261:doi
2196:doi
2129:doi
2062:or
2055:DNA
2049:DNA
1937:max
1775:for
1760:for
1745:for
1730:for
1722:let
1707:for
1606:for
1592:for
1577:for
1562:for
1555:let
1511:ABC
1503:ABC
1483:ABC
1467:ABC
1451:ABC
1443:ABC
1431:ABC
1423:ABC
1402:or
1012:to
942:to
298:min
19:In
2884::
2504:,
2494:33
2492:,
2466:.
2445:,
2433:22
2431:,
2417:^
2403:.
2393:.
2383:30
2381:.
2369:^
2355:.
2345:16
2343:.
2329:^
2316:,
2306:,
2296:42
2294:,
2282:^
2269:,
2255:,
2228:10
2226:,
2150:^
2066:.
2045:.
1820:d
1790:if
1786:do
1779:to
1771:do
1764:to
1756:do
1749:to
1741:do
1734:to
1718:do
1711:to
1696:is
1688::
1663:if
1656:d
1643:if
1620:if
1617:do
1610:to
1603:do
1596:to
1588:do
1581:to
1573:do
1566:to
1544:is
1536::
1507:CA
1499:AC
1495:AC
1491:CA
1481:→
1479:AB
1477:→
1473:→
1471:CA
1465:→
1463:AC
1459:AC
1457:→
1455:CA
1447:CA
1441:→
1439:AC
1437:→
1435:CA
1427:CA
1419:CA
1390:.
1016:),
946:),
249:.
2539:e
2532:t
2525:v
2500::
2439::
2411:.
2407::
2389::
2363:.
2351::
2302::
2263::
2257:7
2234::
2198::
2175:.
2131::
2002:)
1999:N
1993:M
1990:(
1987:O
1977:N
1973:M
1957:)
1952:)
1949:N
1946:,
1943:M
1940:(
1931:N
1925:M
1920:(
1915:O
1893:D
1889:W
1885:+
1880:I
1876:W
1867:T
1863:W
1859:2
1837:T
1833:W
1682:Σ
1475:A
1388:b
1373:|
1369:b
1365:|
1361:=
1358:j
1348:a
1333:|
1329:a
1325:|
1321:=
1318:i
1296:)
1290:|
1286:b
1282:|
1278:,
1274:|
1270:a
1266:|
1260:(
1253:b
1250:,
1247:a
1243:d
1232:b
1228:a
1204:)
1199:j
1195:b
1186:i
1182:a
1178:(
1174:1
1170:+
1167:)
1164:2
1158:j
1155:,
1152:2
1146:i
1143:(
1138:b
1135:,
1132:a
1128:d
1103:)
1098:j
1094:b
1085:i
1081:a
1077:(
1073:1
1069:+
1066:)
1063:1
1057:j
1054:,
1051:1
1045:i
1042:(
1037:b
1034:,
1031:a
1027:d
1014:b
1010:a
996:1
993:+
990:)
987:1
981:j
978:,
975:i
972:(
967:b
964:,
961:a
957:d
944:b
940:a
926:1
923:+
920:)
917:j
914:,
911:1
905:i
902:(
897:b
894:,
891:a
887:d
858:j
854:b
850:=
845:i
841:a
814:)
809:j
805:b
796:i
792:a
788:(
784:1
756:,
751:j
747:b
743:=
738:1
732:i
728:a
717:1
711:j
707:b
703:=
698:i
694:a
685:1
679:j
676:,
673:i
661:)
656:j
652:b
643:i
639:a
635:(
631:1
627:+
624:)
621:2
615:j
612:,
609:2
603:i
600:(
595:b
592:,
589:a
585:d
577:,
574:0
568:j
565:,
562:i
550:)
545:j
541:b
532:i
528:a
524:(
520:1
516:+
513:)
510:1
504:j
501:,
498:1
492:i
489:(
484:b
481:,
478:a
474:d
466:,
463:0
457:j
447:1
444:+
441:)
438:1
432:j
429:,
426:i
423:(
418:b
415:,
412:a
408:d
400:,
397:0
391:i
381:1
378:+
375:)
372:j
369:,
366:1
360:i
357:(
352:b
349:,
346:a
342:d
334:,
331:0
328:=
325:j
322:=
319:i
309:0
303:{
295:=
292:)
289:j
286:,
283:i
280:(
275:b
272:,
269:a
265:d
237:b
215:j
194:a
172:i
151:)
148:j
145:,
142:i
139:(
134:b
131:,
128:a
124:d
103:b
83:a
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.