Faster Levenshtein

Aug 20, 2009 at 1:42 AM

Here is a faster implementation of Levenshtein algorithm. It doesn't use matrix, only temp one-dimensional array.

Recoded to C# from Perl implementation found here: http://www.mgilleland.com/ld/ldperl2.htm

from my own tests it performs around twice faster than old method.

 

public static int LevensteinDistance(string s, string t)
        {
            string a, b;

            // remixing to put longest string as the first parameter to make a temporary array at minimum space
            if (s.Length > t.Length)
            { a = s; b = t; }
            else
            { a = t; b = s; }

            int n = a.Length;
            int m = b.Length;

            int cur = 0;
            int next = 0;

            int[] w = new int[m + 1];

            for (int i = 0; i < m; i++)
            {
                w[i] = i;
            }

            for (int i = 0; i < n; i++)
            {
                cur = i + 1;
                for (int j = 0; j < m; j++)
                {
                    next = Math.Min(Math.Min(w[j + 1] + 1, cur + 1), w[j] + (a[i].Equals(b[j]) ? 0 : 1));
                    w[j] = cur;
                    cur = next;
                }
                w[m] = next;
            }

            return next;
        }

Nov 28, 2011 at 10:51 PM

I concur with @normalex.  My testing shows his method to be around 2X faster than sift.  Very nice!