Skip to content
XeveAbout code and problems

Calculate Levenshtein distance in MATLAB (words or characters) with Wagner–Fischer algorithm

July 1, 2018 0 comments Article Uncategorized nspo

This is a MATLAB implementation of the Wagner–Fischer algorithm.

function y = lvdistance(sentence1, sentence2, penalties, split_characters)
%LVDISTANCE Calculate Levenshtein distance between two strings
% Uses Wagner–Fischer algorithm
% See https://en.wikipedia.org/wiki/Wagner%E2%80%93Fischer_algorithm
%
% Arguments:
% sentence1, sentence2: strings, e.g. 'hello world'
% penalties (optional): vector of penalties for deletion, insertion and
% substitution (default: [1 1 1])
% split_characters (optional): split at each character, not only at spaces
% (default: false)
%
% GPL v3, Nicolai Spohrer <nicolai@xeve.de>

switch nargin
    case 2
        penalty_deletion = 1;
        penalty_insertion = 1;
        penalty_substitution = 1;
        split_characters = false;
    case 3
        penalty_deletion = penalties(1);
        penalty_insertion = penalties(2);
        penalty_substitution = penalties(3);
        split_characters = false;
    case 4
        penalty_deletion = penalties(1);
        penalty_insertion = penalties(2);
        penalty_substitution = penalties(3);
    otherwise
        error 'Invalid number of arguments';
end

if split_characters
    % split at each character
    sentence1 = num2cell(sentence1);
    sentence2 = num2cell(sentence2);
else
    % split at spaces
    sentence1 = strsplit(sentence1, ' ');
    sentence2 = strsplit(sentence2, ' ');
end



m = length(sentence1);
n = length(sentence2);

% distance 2D array
d = zeros(m+1, n+1);

for i=1:m+1
    d(i, 1) = i-1;
end

for j=1:n+1
    d(1, j) = j-1;
end

for j=1:n
    for i=1:m
        if strcmp(sentence1{i}, sentence2{j})
            % words are equal, therefore no operation required
            d(i+1, j+1) = d(i, j);
        else
            % words are not equal
            d(i+1, j+1) = min(min(...
                d(i, j+1) + penalty_deletion,...  % deletion
                d(i+1, j) + penalty_insertion),...  % insertion
                d(i, j) + penalty_substitution); % substitution
        end
    end
end

y = d(m+1, n+1);
end
Tags: Levenshtein, matlab, Wagner–Fischer, word edit distance

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Recent Posts

  • Spanish Juggling problem
  • Changing the maximum upload file size with Nextcloud in a docker-compose setup
  • Shrinking a QNAP Virtualization Station disk image
  • Indexed Priority Queue in C++
  • Efficient Union-Find in C++ – or: Disjoint-set forests with Path Compression and Ranks

Recent Comments

  • Arjun on Fix for Latex4CorelDraw with CorelDraw 2017
  • Nigel De Gillern on No wired (LAN) connection in Linux, although everything looks right
  • Harald H on Automatically reboot TP-Link router WDR4300 / N750 with bash script
  • Gene on Running multiple Django projects on one Apache instance with mod_wsgi
  • nspo on NMPC with CasADi and Python – Part 1: ODE and steady state

Archives

  • January 2023
  • December 2022
  • January 2021
  • May 2020
  • April 2020
  • July 2018
  • May 2018
  • April 2018
  • March 2018
  • September 2015
  • August 2015
  • June 2015
  • March 2015
  • February 2015
  • September 2014
  • March 2013

Categories

  • Uncategorized

Meta

  • Log in
  • Entries feed
  • Comments feed
  • WordPress.org

Copyright Xeve 2025 | Theme by ThemeinProgress | Proudly powered by WordPress