Longest common subsequence
You are encouraged to solve this task according to the task description, using any language you may know.
The longest common subsequence (or LCS) of groups A and B is the longest group of elements from A and B that are common between the two groups and in the same order in each group. For example, the sequences "1234" and "1224533324" have an LCS of "1234":
1234 1224533324
For a string example, consider the sequences "thisisatest" and "testing123testing". An LCS would be "tsitest":
thisisatest testing123testing
In this puzzle, your code only needs to deal with strings. Write a function which returns an LCS of two strings (case-sensitive). You don't need to show multiple LCS's.
For more information on this problem please see Wikipedia.
Ada
Using recursion: <lang ada>with Ada.Text_IO; use Ada.Text_IO;
procedure Test_LCS is
function LCS (A, B : String) return String is begin if A'Length = 0 or else B'Length = 0 then return ""; elsif A (A'Last) = B (B'Last) then return LCS (A (A'First..A'Last - 1), B (B'First..B'Last - 1)) & A (A'Last); else declare X : String renames LCS (A, B (B'First..B'Last - 1)); Y : String renames LCS (A (A'First..A'Last - 1), B); begin if X'Length > Y'Length then return X; else return Y; end if; end; end if; end LCS;
begin
Put_Line (LCS ("thisisatest", "testing123testing"));
end Test_LCS;</lang>
- Output:
tsitest
Non-recursive solution: <lang ada>with Ada.Text_IO; use Ada.Text_IO;
procedure Test_LCS is
function LCS (A, B : String) return String is L : array (A'First..A'Last + 1, B'First..B'Last + 1) of Natural; begin for I in L'Range (1) loop L (I, B'First) := 0; end loop; for J in L'Range (2) loop L (A'First, J) := 0; end loop; for I in A'Range loop for J in B'Range loop if A (I) = B (J) then L (I + 1, J + 1) := L (I, J) + 1; else L (I + 1, J + 1) := Natural'Max (L (I + 1, J), L (I, J + 1)); end if; end loop; end loop; declare I : Integer := L'Last (1); J : Integer := L'Last (2); R : String (1..Integer'Max (A'Length, B'Length)); K : Integer := R'Last; begin while I > L'First (1) and then J > L'First (2) loop if L (I, J) = L (I - 1, J) then I := I - 1; elsif L (I, J) = L (I, J - 1) then J := J - 1; else I := I - 1; J := J - 1; R (K) := A (I); K := K - 1; end if; end loop; return R (K + 1..R'Last); end; end LCS;
begin
Put_Line (LCS ("thisisatest", "testing123testing"));
end Test_LCS;</lang>
- Output:
tsitest
ALGOL 68
<lang algol68>main:(
PROC lcs = (STRING a, b)STRING: BEGIN IF UPB a = 0 OR UPB b = 0 THEN "" ELIF a [UPB a] = b [UPB b] THEN lcs (a [:UPB a - 1], b [:UPB b - 1]) + a [UPB a] ELSE STRING x = lcs (a, b [:UPB b - 1]); STRING y = lcs (a [:UPB a - 1], b); IF UPB x > UPB y THEN x ELSE y FI FI END # lcs #; print((lcs ("thisisatest", "testing123testing"), new line))
)</lang>
- Output:
tsitest
APL
<lang APL>lcs←{
⎕IO←0 betterof←{⊃(</+/¨⍺ ⍵)⌽⍺ ⍵} ⍝ better of 2 selections cmbn←{↑,⊃∘.,/(⊂⊂⍬),⍵} ⍝ combine lists rr←{∧/↑>/1 ¯1↓[1]¨⊂⍵} ⍝ rising rows hmrr←{∨/(rr ⍵)∧∧/⍵=⌈\⍵} ⍝ has monotonically rising rows rnbc←{{⍵/⍳⍴⍵}¨↓[0]×⍵} ⍝ row numbers by column valid←hmrr∘cmbn∘rnbc ⍝ any valid solutions? a w←(</⊃∘⍴¨⍺ ⍵)⌽⍺ ⍵ ⍝ longest first matches←a∘.=w aps←{⍵[;⍒+⌿⍵]}∘{(⍵/2)⊤⍳2*⍵} ⍝ all possible subsequences swps←{⍵/⍨∧⌿~(~∨⌿⍺)⌿⍵} ⍝ subsequences with possible solns sstt←matches swps aps⊃⍴w ⍝ subsequences to try w/⍨{ ⍺←0⍴⍨⊃⍴⍵ ⍝ initial selection (+/⍺)≥+/⍵[;0]:⍺ ⍝ no scope to improve this←⍺ betterof{⍵×valid ⍵/matches}⍵[;0] ⍝ try to improve 1=1⊃⍴⍵:this ⍝ nothing left to try this ∇ 1↓[1]⍵ ⍝ keep looking }sstt }</lang>
AutoHotkey
using dynamic programming
ahk forum: discussion <lang AutoHotkey>lcs(a,b) { ; Longest Common Subsequence of strings, using Dynamic Programming
Loop % StrLen(a)+2 { ; Initialize i := A_Index-1 Loop % StrLen(b)+2 j := A_Index-1, len%i%_%j% := 0 } Loop Parse, a ; scan a { i := A_Index, i1 := i+1, x := A_LoopField Loop Parse, b ; scan b { j := A_Index, j1 := j+1, y := A_LoopField len%i1%_%j1% := x=y ? len%i%_%j% + 1 : (u:=len%i1%_%j%) > (v:=len%i%_%j1%) ? u : v } } x := StrLen(a)+1, y := StrLen(b)+1 While x*y { ; construct solution from lengths x1 := x-1, y1 := y-1 If (len%x%_%y% = len%x1%_%y%) x := x1 Else If (len%x%_%y% = len%x%_%y1%) y := y1 Else x := x1, y := y1, t := SubStr(a,x,1) t } Return t
}</lang>
BASIC
<lang qbasic>FUNCTION lcs$ (a$, b$)
IF LEN(a$) = 0 OR LEN(b$) = 0 THEN
lcs$ = ""
ELSEIF RIGHT$(a$, 1) = RIGHT$(b$, 1) THEN
lcs$ = lcs$(LEFT$(a$, LEN(a$) - 1), LEFT$(b$, LEN(b$) - 1)) + RIGHT$(a$, 1)
ELSE
x$ = lcs$(a$, LEFT$(b$, LEN(b$) - 1)) y$ = lcs$(LEFT$(a$, LEN(a$) - 1), b$) IF LEN(x$) > LEN(y$) THEN lcs$ = x$ ELSE lcs$ = y$ END IF
END IF
END FUNCTION</lang>
BBC BASIC
This makes heavy use of BBC BASIC's shortcut LEFT$(a$) and RIGHT$(a$) functions. <lang bbcbasic> PRINT FNlcs("1234", "1224533324")
PRINT FNlcs("thisisatest", "testing123testing") END DEF FNlcs(a$, b$) IF a$="" OR b$="" THEN = "" IF RIGHT$(a$) = RIGHT$(b$) THEN = FNlcs(LEFT$(a$), LEFT$(b$)) + RIGHT$(a$) LOCAL x$, y$ x$ = FNlcs(a$, LEFT$(b$)) y$ = FNlcs(LEFT$(a$), b$) IF LEN(y$) > LEN(x$) SWAP x$,y$ = x$</lang>
Output:
1234 tsitest
Bracmat
<lang bracmat> ( LCS
= A a ta B b tb prefix . !arg:(?prefix.@(?A:%?a ?ta).@(?B:%?b ?tb)) & ( !a:!b&LCS$(!prefix !a.!ta.!tb) | LCS$(!prefix.!A.!tb)&LCS$(!prefix.!ta.!B) ) | !prefix:? ([>!max:[?max):?lcs | )
& 0:?max & :?lcs & LCS$(.thisisatest.testing123testing) & out$(max !max lcs !lcs);</lang>
- Output:
max 7 lcs t s i t e s t
C
<lang c>#include <string.h>
- include <stdlib.h>
- include <stdio.h>
- define MAX(A,B) (((A)>(B))? (A) : (B))
char * lcs(const char *a,const char * b) {
int lena = strlen(a)+1; int lenb = strlen(b)+1;
int bufrlen = 40; char bufr[40], *result;
int i,j; const char *x, *y; int *la = calloc(lena*lenb, sizeof( int)); int **lengths = malloc( lena*sizeof( int*)); for (i=0; i<lena; i++) lengths[i] = la + i*lenb;
for (i=0,x=a; *x; i++, x++) { for (j=0,y=b; *y; j++,y++ ) { if (*x == *y) { lengths[i+1][j+1] = lengths[i][j] +1; } else { int ml = MAX(lengths[i+1][j], lengths[i][j+1]); lengths[i+1][j+1] = ml; } } }
result = bufr+bufrlen; *--result = '\0'; i = lena-1; j = lenb-1; while ( (i>0) && (j>0) ) { if (lengths[i][j] == lengths[i-1][j]) i -= 1; else if (lengths[i][j] == lengths[i][j-1]) j-= 1; else {
// assert( a[i-1] == b[j-1]);
*--result = a[i-1]; i-=1; j-=1; } } free(la); free(lengths); return strdup(result);
}</lang> Testing <lang c>int main() {
printf("%s\n", lcs("thisisatest", "testing123testing")); // tsitest return 0;
}</lang>
With recursion
<lang c>#include <stdio.h>
- include <stdlib.h>
- include <string.h>
char* lcs(const char *a, const char *b, char *out) { int longest = 0; int match(const char *a, const char *b, int dep) { if (!a || !b) return 0; if (!*a || !*b) { if (dep <= longest) return 0; out[ longest = dep ] = 0; return 1; }
if (*a == *b) return match(a + 1, b + 1, dep + 1) && (out[dep] = *a);
return match(a + 1, b + 1, dep) + match(strchr(a, *b), b, dep) + match(a, strchr(b, *a), dep); }
return match(a, b, 0) ? out : 0; }
int main() { char buf[128]; printf("%s\n", lcs("thisisatest", "testing123testing", buf)); printf("%p\n", lcs("no", "match", buf)); return 0; }</lang>
C++
Background
A small symbol set can generate a large O(m*n) number of pairs where two input sequences match, given sequence lengths m and n. This is the case in the Bioinformatics applications of nucleotide and protein sequencing.
Here a "divide and conquer" approach devised by Hirschberg will limit the space required to O(m+n). However, this approach still requires O(m*n) time even in the best case.
This quadratic time dependency may be prohibitive, given the great length of the input sequences. So, heuristics may be favored over optimal Dynamic Programming solutions.
In the application of comparing source file revisions, records form a sparse symbol space and generate a linear O(m+n) number of pairs where the input sequences match.
A binary search optimization due to Hunt and Szymanski can be applied in this case, which results in expected performance of O(s log s) where s = m+n. In the worst case, however, performance degrades to O(m*n log s) time as the number of matches [and the space required to represent them] grows to O(m*n).
References
"A linear space algorithm for computing maximal common subsequences"
by Daniel S. Hirschberg, published June 1975
Communications of the ACM [Volume 18, Number 6, pp. 341–343]
"An Algorithm for Differential File Comparison"
by James W. Hunt and M. Douglas McIlroy, June 1976
Computing Science Technical Report, Bell Laboratories 41
"A Fast Algorithm for Computing Longest Common Subsequences"
by James W. Hunt and Thomas G. Szymanski, published May 1977
Communications of the ACM [Volume 20, Number 5, pp. 350-353]
Hunt and Szymanski algorithm <lang cpp>#include <stdint.h>
- include <string>
- include <memory> // for shared_ptr<>
- include <iostream>
- include <deque>
- include <map>
- include <algorithm> // for lower_bound()
using namespace std;
class LCS { protected:
// This linked list class is used to trace the LCS candidates class Pair { public: uint32_t index1; uint32_t index2; shared_ptr<Pair> next;
Pair(uint32_t index1, uint32_t index2, shared_ptr<Pair> next = nullptr) : index1(index1), index2(index2), next(next) { }
static shared_ptr<Pair> Reverse(const shared_ptr<Pair> pairs) { shared_ptr<Pair> head = nullptr; for (auto next = pairs; next != nullptr; next = next->next) head = make_shared<Pair>(next->index1, next->index2, head); return head; } };
typedef deque<shared_ptr<Pair>> PAIRS; typedef deque<uint32_t> THRESHOLD; typedef deque<uint32_t> INDEXES; typedef map<char, INDEXES> CHAR2INDEXES; typedef deque<INDEXES*> MATCHES;
// return the LCS as a linked list of matched index pairs uint64_t LCS::Pairs(MATCHES& matches, shared_ptr<Pair> *pairs) { auto trace = pairs != nullptr; PAIRS traces; THRESHOLD threshold;
// //[Assert]After each index1 iteration threshold[index3] is the least index2 // such that the LCS of s1[0:index1] and s2[0:index2] has length index3 + 1 // uint32_t index1 = 0; for (const auto& it1 : matches) { if (!it1->empty()) { auto dq2 = *it1; auto limit = threshold.end(); for (auto it2 = dq2.begin(); it2 != dq2.end(); it2++) { // Each of the index1, index2 pairs considered here correspond to a match auto index2 = *it2; // Note that index2 values are monotonically decreasing // std::lower_bound() performs a binary search for each match limit = lower_bound(threshold.begin(), limit, index2); auto index3 = distance(threshold.begin(), limit);
// // Look ahead to the next index2 value to optimize space used in the Hunt // and Szymanski algorithm. If the next index2 is also an improvement on // the value currently held in threshold[index3], a new Pair will only be // superseded on the next index2 iteration. // // Depending on match redundancy, the number of Pair constructions may be // divided by factors ranging from 2 up to 10 or more. // if (it2 + 1 != dq2.end()) { auto next_index2 = *(it2 + 1); if (limit == threshold.begin() || *(limit - 1) < next_index2) continue; }
if (limit == threshold.end()) { // insert case threshold.push_back(index2); if (trace) { auto prefix = index3 > 0 ? traces[index3 - 1] : nullptr; auto last = make_shared<Pair>(index1, index2, prefix); traces.push_back(last); } } else if (index2 < *limit) { // replacement case *limit = index2; if (trace) { auto prefix = index3 > 0 ? traces[index3 - 1] : nullptr; auto last = make_shared<Pair>(index1, index2, prefix); traces[index3] = last; } } } }
index1++; }
if (trace) { auto last = traces.size() > 0 ? traces.back() : nullptr; // Reverse longest back-trace *pairs = Pair::Reverse(last); }
auto length = threshold.size(); return length; }
// // Match() avoids incurring m*n comparisons by using the associative memory // implemented by CHAR2INDEXES to achieve O(m+n) performance, where m and n // are the input lengths. // // The symbol space is sparse in the case of records; so, the lookup time is // at most O(log(m+n)). The lookup time can be assumed constant in the case // of characters. // void Match(CHAR2INDEXES& indexes, MATCHES& matches, const string& s1, const string& s2) { uint32_t index = 0; for (const auto& it : s2) indexes[it].push_front(index++);
for (const auto& it : s1) { auto& dq2 = indexes[it]; matches.push_back(&dq2); } }
string Select(shared_ptr<Pair> pairs, uint64_t length, bool right, const string& s1, const string& s2) { string buffer; buffer.reserve(length); for (auto next = pairs; next != nullptr; next = next->next) { auto c = right ? s2[next->index2] : s1[next->index1]; buffer.push_back(c); } return buffer; }
public:
string Correspondence(const string& s1, const string& s2) { CHAR2INDEXES indexes; MATCHES matches; // holds references into indexes Match(indexes, matches, s1, s2); shared_ptr<Pair> pairs; // obtain the LCS as index pairs auto length = Pairs(matches, &pairs); return Select(pairs, length, false, s1, s2); }
};</lang> Example: <lang cpp> LCS lcs;
auto s = lcs.Correspondence(s1, s2); cout << s << endl;</lang>
C#
<lang csharp>using System;
namespace LCS {
class Program { static void Main(string[] args) { string word1 = "thisisatest"; string word2 = "testing123testing"; Console.WriteLine(lcsBack(word1, word2)); Console.ReadKey(); }
public static string lcsBack(string a, string b) { string aSub = a.Substring(0, (a.Length - 1 < 0) ? 0 : a.Length - 1); string bSub = b.Substring(0, (b.Length - 1 < 0) ? 0 : b.Length - 1); if (a.Length == 0 || b.Length == 0) return ""; else if (a[a.Length - 1] == b[b.Length - 1]) return lcsBack(aSub, bSub) + a[a.Length - 1]; else { string x = lcsBack(a, bSub); string y = lcsBack(aSub, b); return (x.Length > y.Length) ? x : y; } } }
}</lang>
Clojure
Based on algorithm from Wikipedia. <lang Clojure>(defn longest [xs ys] (if (> (count xs) (count ys)) xs ys))
(def lcs
(memoize (fn [[x & xs] [y & ys]] (cond (or (= x nil) (= y nil) ) nil (= x y) (cons x (lcs xs ys)) :else (longest (lcs (cons x xs) ys) (lcs xs (cons y ys)))))))</lang>
CoffeeScript
<lang coffeescript> lcs = (s1, s2) ->
len1 = s1.length len2 = s2.length # Create a virtual matrix that is (len1 + 1) by (len2 + 1), # where m[i][j] is the longest common string using only # the first i chars of s1 and first j chars of s2. The # matrix is virtual, because we only keep the last two rows # in memory. prior_row = ( for i in [0..len2])
for i in [0...len1] row = [] for j in [0...len2] if s1[i] == s2[j] row.push prior_row[j] + s1[i] else subs1 = row[j] subs2 = prior_row[j+1] if subs1.length > subs2.length row.push subs1 else row.push subs2 prior_row = row row[len2]
s1 = "thisisatest" s2 = "testing123testing" console.log lcs(s1, s2)</lang>
Common Lisp
Here's a memoizing/dynamic-programming solution that uses an n × m array where n and m are the lengths of the input arrays. The first return value is a sequence (of the same type as array1) which is the longest common subsequence. The second return value is the length of the longest common subsequence. <lang lisp>(defun longest-common-subsequence (array1 array2)
(let* ((l1 (length array1)) (l2 (length array2)) (results (make-array (list l1 l2) :initial-element nil))) (declare (dynamic-extent results)) (labels ((lcs (start1 start2) ;; if either sequence is empty, return (() 0) (if (or (eql start1 l1) (eql start2 l2)) (list '() 0) ;; otherwise, return any memoized value (let ((result (aref results start1 start2))) (if (not (null result)) result ;; otherwise, compute and store a value (setf (aref results start1 start2) (if (eql (aref array1 start1) (aref array2 start2)) ;; if they start with the same element, ;; move forward in both sequences (destructuring-bind (seq len) (lcs (1+ start1) (1+ start2)) (list (cons (aref array1 start1) seq) (1+ len))) ;; otherwise, move ahead in each separately, ;; and return the better result. (let ((a (lcs (1+ start1) start2)) (b (lcs start1 (1+ start2)))) (if (> (second a) (second b)) a b))))))))) (destructuring-bind (seq len) (lcs 0 0) (values (coerce seq (type-of array1)) len)))))</lang>
For example,
<lang lisp>(longest-common-subsequence "123456" "1a2b3c")</lang>
produces the two values
<lang lisp>"123" 3</lang>
An alternative adopted from Clojure
Here is another version with its own memoization macro:
<lang lisp>(defmacro mem-defun (name args body)
(let ((hash-name (gensym))) `(let ((,hash-name (make-hash-table :test 'equal))) (defun ,name ,args (or (gethash (list ,@args) ,hash-name) (setf (gethash (list ,@args) ,hash-name) ,body))))))
(mem-defun lcs (xs ys)
(labels ((longer (a b) (if (> (length a) (length b)) a b))) (cond ((or (null xs) (null ys)) nil) ((equal (car xs) (car ys)) (cons (car xs) (lcs (cdr xs) (cdr ys))))
(t (longer (lcs (cdr xs) ys) (lcs xs (cdr ys)))))))</lang>
When we test it, we get:
<lang lisp>(coerce (lcs (coerce "thisisatest" 'list) (coerce "testing123testing" 'list)) 'string))))
"tsitest"</lang>
D
Both versions don't work correctly with Unicode text.
Recursive version
<lang d>import std.stdio, std.array;
T[] lcs(T)(in T[] a, in T[] b) pure nothrow @safe {
if (a.empty || b.empty) return null; if (a[0] == b[0]) return a[0] ~ lcs(a[1 .. $], b[1 .. $]); const longest = (T[] x, T[] y) => x.length > y.length ? x : y; return longest(lcs(a, b[1 .. $]), lcs(a[1 .. $], b));
}
void main() {
lcs("thisisatest", "testing123testing").writeln;
}</lang>
- Output:
tsitest
Faster dynamic programming version
The output is the same. <lang d>import std.stdio, std.algorithm, std.traits;
T[] lcs(T)(in T[] a, in T[] b) pure /*nothrow*/ {
auto L = new uint[][](a.length + 1, b.length + 1);
foreach (immutable i; 0 .. a.length) foreach (immutable j; 0 .. b.length) L[i + 1][j + 1] = (a[i] == b[j]) ? (1 + L[i][j]) : max(L[i + 1][j], L[i][j + 1]);
Unqual!T[] result; for (auto i = a.length, j = b.length; i > 0 && j > 0; ) { if (a[i - 1] == b[j - 1]) { result ~= a[i - 1]; i--; j--; } else if (L[i][j - 1] < L[i - 1][j]) i--; else j--; }
result.reverse(); // Not nothrow. return result;
}
void main() {
lcs("thisisatest", "testing123testing").writeln;
}</lang>
Hirschberg algorithm version
See: http://en.wikipedia.org/wiki/Hirschberg_algorithm
This is currently a little slower than the classic dynamic programming version, but it uses a linear amount of memory, so it's usable for much larger inputs. To speed up this code on dmd remove the memory allocations from lensLCS, and do not use the retro range (replace it with foreach_reverse). The output is the same.
Adapted from Python code: http://wordaligned.org/articles/longest-common-subsequence
<lang d>import std.stdio, std.algorithm, std.range, std.array, std.string, std.typecons;
uint[] lensLCS(R)(R xs, R ys) pure nothrow @safe {
auto prev = new typeof(return)(1 + ys.length); auto curr = new typeof(return)(1 + ys.length);
foreach (immutable x; xs) { swap(curr, prev); size_t i = 0; foreach (immutable y; ys) { curr[i + 1] = (x == y) ? prev[i] + 1 : max(curr[i], prev[i + 1]); i++; } }
return curr;
}
void calculateLCS(T)(in T[] xs, in T[] ys, bool[] xs_in_lcs,
in size_t idx=0) pure nothrow @safe { immutable nx = xs.length; immutable ny = ys.length;
if (nx == 0) return;
if (nx == 1) { if (ys.canFind(xs[0])) xs_in_lcs[idx] = true; } else { immutable mid = nx / 2; const xb = xs[0.. mid]; const xe = xs[mid .. $]; immutable ll_b = lensLCS(xb, ys);
const ll_e = lensLCS(xe.retro, ys.retro); // retro is slow with dmd.
//immutable k = iota(ny + 1) // .reduce!(max!(j => ll_b[j] + ll_e[ny - j])); immutable k = iota(ny + 1) .minPos!((i, j) => tuple(ll_b[i] + ll_e[ny - i]) > tuple(ll_b[j] + ll_e[ny - j]))[0];
calculateLCS(xb, ys[0 .. k], xs_in_lcs, idx); calculateLCS(xe, ys[k .. $], xs_in_lcs, idx + mid); }
}
const(T)[] lcs(T)(in T[] xs, in T[] ys) pure /*nothrow*/ @safe {
auto xs_in_lcs = new bool[xs.length]; calculateLCS(xs, ys, xs_in_lcs); return zip(xs, xs_in_lcs).filter!q{ a[1] }.map!q{ a[0] }.array; // Not nothrow.
}
string lcsString(in string s1, in string s2) pure /*nothrow*/ @safe {
return lcs(s1.representation, s2.representation).assumeUTF;
}
void main() {
lcsString("thisisatest", "testing123testing").writeln;
}</lang>
Dart
<lang dart>import 'dart:math';
String lcsRecursion(String a, String b) {
int aLen = a.length; int bLen = b.length;
if (aLen == 0 || bLen == 0) { return ""; } else if (a[aLen-1] == b[bLen-1]) { return lcsRecursion(a.substring(0,aLen-1),b.substring(0,bLen-1)) + a[aLen-1]; } else { var x = lcsRecursion(a, b.substring(0,bLen-1)); var y = lcsRecursion(a.substring(0,aLen-1), b); return (x.length > y.length) ? x : y; }
}
String lcsDynamic(String a, String b) {
var lengths = new List<List<int>>.generate(a.length + 1, (_) => new List.filled(b.length+1, 0), growable: false);
// row 0 and column 0 are initialized to 0 already for (int i = 0; i < a.length; i++) { for (int j = 0; j < b.length; j++) { if (a[i] == b[j]) { lengths[i+1][j+1] = lengths[i][j] + 1; } else { lengths[i+1][j+1] = max(lengths[i+1][j], lengths[i][j+1]); } } }
// read the substring out from the matrix StringBuffer reversedLcsBuffer = new StringBuffer(); for (int x = a.length, y = b.length; x != 0 && y != 0;) { if (lengths[x][y] == lengths[x-1][y]) { x--; } else if (lengths[x][y] == lengths[x][y-1]) { y--; } else { assert(a[x-1] == b[y-1]); reversedLcsBuffer.write(a[x-1]); x--; y--; } }
// reverse String var reversedLCS = reversedLcsBuffer.toString(); var lcsBuffer = new StringBuffer(); for(var i = reversedLCS.length - 1; i>=0; i--) { lcsBuffer.write(reversedLCS[i]); } return lcsBuffer.toString();
}
void main() {
print("lcsDynamic('1234', '1224533324') = ${lcsDynamic('1234', '1224533324')}"); print("lcsDynamic('thisisatest', 'testing123testing') = ${lcsDynamic('thisisatest', 'testing123testing')}"); print("lcsDynamic(, 'x') = ${lcsDynamic(, 'x')}"); print("lcsDynamic('x', 'x') = ${lcsDynamic('x', 'x')}"); print(); print("lcsRecursion('1234', '1224533324') = ${lcsRecursion('1234', '1224533324')}"); print("lcsRecursion('thisisatest', 'testing123testing') = ${lcsRecursion('thisisatest', 'testing123testing')}"); print("lcsRecursion(, 'x') = ${lcsRecursion(, 'x')}"); print("lcsRecursion('x', 'x') = ${lcsRecursion('x', 'x')}");
} </lang>
- Output:
lcsDynamic('1234', '1224533324') = 1234 lcsDynamic('thisisatest', 'testing123testing') = tsitest lcsDynamic('', 'x') = lcsDynamic('x', 'x') = x lcsRecursion('1234', '1224533324') = 1234 lcsRecursion('thisisatest', 'testing123testing') = tsitest lcsRecursion('', 'x') = lcsRecursion('x', 'x') = x
Egison
<lang egison> (define $common-seqs
(lambda [$xs $ys] (match-all [xs ys] [(list char) (list char)] [[(loop $i [1 $n] <join _ <cons $c_i ...>> _) (loop $i [1 ,n] <join _ <cons ,c_i ...>> _)] (map (lambda [$i] c_i) (between 1 n))])))
(define $lcs (compose common-seqs rac)) </lang> Output: <lang egison> > (lcs "thisisatest" "testing123testing")) "tsitest" </lang>
Erlang
This implementation also includes the ability to calculate the length of the longest common subsequence. In calculating that length, we generate a cache which can be traversed to generate the longest common subsequence. <lang erlang> module(lcs). -compile(export_all).
lcs_length(S,T) ->
{L,_C} = lcs_length(S,T,dict:new()), L.
lcs_length([]=S,T,Cache) ->
{0,dict:store({S,T},0,Cache)};
lcs_length(S,[]=T,Cache) ->
{0,dict:store({S,T},0,Cache)};
lcs_length([H|ST]=S,[H|TT]=T,Cache) ->
{L,C} = lcs_length(ST,TT,Cache), {L+1,dict:store({S,T},L+1,C)};
lcs_length([_SH|ST]=S,[_TH|TT]=T,Cache) ->
case dict:is_key({S,T},Cache) of true -> {dict:fetch({S,T},Cache),Cache}; false -> {L1,C1} = lcs_length(S,TT,Cache), {L2,C2} = lcs_length(ST,T,C1), L = lists:max([L1,L2]), {L,dict:store({S,T},L,C2)} end.
lcs(S,T) ->
{_,C} = lcs_length(S,T,dict:new()), lcs(S,T,C,[]).
lcs([],_,_,Acc) ->
lists:reverse(Acc);
lcs(_,[],_,Acc) ->
lists:reverse(Acc);
lcs([H|ST],[H|TT],Cache,Acc) ->
lcs(ST,TT,Cache,[H|Acc]);
lcs([_SH|ST]=S,[_TH|TT]=T,Cache,Acc) ->
case dict:fetch({S,TT},Cache) > dict:fetch({ST,T},Cache) of true -> lcs(S,TT,Cache, Acc); false -> lcs(ST,T,Cache,Acc) end.
</lang> Output: <lang erlang> 77> lcs:lcs("thisisatest","testing123testing"). "tsitest" 78> lcs:lcs("1234","1224533324"). "1234" </lang>
We can also use the process dictionary to memoize the recursive implementation:
<lang erlang> lcs(Xs0, Ys0) ->
CacheKey = {lcs_cache, Xs0, Ys0}, case get(CacheKey) of undefined -> Result = case {Xs0, Ys0} of {[], _} -> [] ; {_, []} -> [] ; {[Same | Xs], [Same | Ys]} -> [Same | lcs(Xs, Ys)] ; {[_ | XsRest]=XsAll, [_ | YsRest]=YsAll} -> A = lcs(XsRest, YsAll), B = lcs(XsAll , YsRest), case length(A) > length(B) of true -> A ; false -> B end end, undefined = put(CacheKey, Result), Result ; Result -> Result end.
</lang>
Fortran
Using the iso_varying_string module which can be found here (or equivalent module conforming to the ISO/IEC 1539-2:2000 API or to a subset according to the need of this code: char
, len
, //
, extract
, ==
, =
)
<lang fortran>program lcstest
use iso_varying_string implicit none
type(varying_string) :: s1, s2
s1 = "thisisatest" s2 = "testing123testing" print *, char(lcs(s1, s2))
s1 = "1234" s2 = "1224533324" print *, char(lcs(s1, s2))
contains
recursive function lcs(a, b) result(l) type(varying_string) :: l type(varying_string), intent(in) :: a, b
type(varying_string) :: x, y
l = "" if ( (len(a) == 0) .or. (len(b) == 0) ) return if ( extract(a, len(a), len(a)) == extract(b, len(b), len(b)) ) then l = lcs(extract(a, 1, len(a)-1), extract(b, 1, len(b)-1)) // extract(a, len(a), len(a)) else x = lcs(a, extract(b, 1, len(b)-1)) y = lcs(extract(a, 1, len(a)-1), b) if ( len(x) > len(y) ) then l = x else l = y end if end if end function lcs
end program lcstest</lang>
F#
Copied and slightly adapted from OCaml (direct recursion) <lang fsharp>open System
let longest xs ys = if List.length xs > List.length ys then xs else ys
let rec lcs a b =
match a, b with | [], _ | _, [] -> [] | x::xs, y::ys -> if x = y then x :: lcs xs ys else longest (lcs a ys) (lcs xs b)
[<EntryPoint>] let main argv =
let split (str:string) = List.init str.Length (fun i -> str.[i]) printfn "%A" (String.Join("", (lcs (split "thisisatest") (split "testing123testing")))) 0
</lang>
Go
Recursion
Brute force <lang go>func lcs(a, b string) string {
aLen := len(a) bLen := len(b) if aLen == 0 || bLen == 0 { return "" } else if a[aLen-1] == b[bLen-1] { return lcs(a[:aLen-1], b[:bLen-1]) + string(a[aLen-1]) } x := lcs(a, b[:bLen-1]) y := lcs(a[:aLen-1], b) if len(x) > len(y) { return x } return y
}</lang>
Dynamic Programming
<lang go>func lcs(a, b string) string { arunes := []rune(a) brunes := []rune(b) aLen := len(arunes) bLen := len(brunes) lengths := make([][]int, aLen+1) for i := 0; i <= aLen; i++ { lengths[i] = make([]int, bLen+1) } // row 0 and column 0 are initialized to 0 already
for i := 0; i < aLen; i++ { for j := 0; j < bLen; j++ { if arunes[i] == brunes[j] { lengths[i+1][j+1] = lengths[i][j] + 1 } else if lengths[i+1][j] > lengths[i][j+1] { lengths[i+1][j+1] = lengths[i+1][j] } else { lengths[i+1][j+1] = lengths[i][j+1] } } }
// read the substring out from the matrix s := make([]rune, 0, lengths[aLen][bLen]) for x, y := aLen, bLen; x != 0 && y != 0; { if lengths[x][y] == lengths[x-1][y] { x-- } else if lengths[x][y] == lengths[x][y-1] { y-- } else { s = append(s, arunes[x-1]) x-- y-- } } // reverse string for i, j := 0, len(s)-1; i < j; i, j = i+1, j-1 { s[i], s[j] = s[j], s[i] } return string(s) }</lang>
Haskell
The Wikipedia solution translates directly into Haskell, with the only difference that equal characters are added in front:
<lang haskell>longest xs ys = if length xs > length ys then xs else ys
lcs [] _ = [] lcs _ [] = [] lcs (x:xs) (y:ys)
| x == y = x : lcs xs ys | otherwise = longest (lcs (x:xs) ys) (lcs xs (y:ys))</lang>
A Memoized version of the naive algorithm.
<lang haskell>import qualified Data.MemoCombinators as M
lcs = memoize lcsm
where lcsm [] _ = [] lcsm _ [] = [] lcsm (x:xs) (y:ys) | x == y = x : lcs xs ys | otherwise = maxl (lcs (x:xs) ys) (lcs xs (y:ys))
maxl x y = if length x > length y then x else y memoize = M.memo2 mString mString mString = M.list M.char -- Chars, but you can specify any type you need for the memo</lang>
Memoization (aka dynamic programming) of that uses zip to make both the index and the character available:
<lang haskell>import Data.Array
lcs xs ys = a!(0,0) where
n = length xs m = length ys a = array ((0,0),(n,m)) $ l1 ++ l2 ++ l3 l1 = [((i,m),[]) | i <- [0..n]] l2 = [((n,j),[]) | j <- [0..m]] l3 = [((i,j), f x y i j) | (x,i) <- zip xs [0..], (y,j) <- zip ys [0..]] f x y i j | x == y = x : a!(i+1,j+1) | otherwise = longest (a!(i,j+1)) (a!(i+1,j))</lang>
All 3 solutions work of course not only with strings, but also with any other list. Example: <lang haskell>*Main> lcs "thisisatest" "testing123testing" "tsitest"</lang> The dynamic programming version without using arrays: <lang haskell>import Data.List
longest xs ys = if length xs > length ys then xs else ys
lcs xs ys = head $ foldr(\xs -> map head. scanr1 f. zipWith (\x y -> [x,y]) xs) e m where
m = map (\x -> flip (++) [[]] $ map (\y -> [x | x==y]) ys) xs e = replicate (length ys) [] f [a,b] [c,d] | null a = longest b c: [b] | otherwise = (a++d):[b]</lang>
Simple and slow solution:
<lang haskell>import Data.Ord import Data.List
-- longest common lcs xs ys = maximumBy (comparing length) $ intersect (subsequences xs) (subsequences ys)
main = print $ lcs "thisisatest" "testing123testing"</lang>
- Output:
"tsitest"
Icon and Unicon
This solution is a modified variant of the recursive solution. The modifications include (a) deleting all characters not common to both strings and (b) stripping off common prefixes and suffixes in a single step.
<lang Icon>procedure main() LCSTEST("thisisatest","testing123testing") LCSTEST("","x") LCSTEST("x","x") LCSTEST("beginning-middle-ending","beginning-diddle-dum-ending") end
link strings
procedure LCSTEST(a,b) #: helper to show inputs and results write("lcs( ",image(a),", ",image(b)," ) = ",image(res := lcs(a,b))) return res end
procedure lcs(a,b) #: return longest common sub-sequence of characters (modified recursive method) local i,x,y local c,nc
if *(a|b) = 0 then return "" # done if either string is empty if a == b then return a # done if equal
if *(a ++ b -- (c := a ** b)) > 0 then { # find all characters not in common a := deletec(a,nc := ~c) # .. remove b := deletec(b,nc) # .. remove } # only unequal strings and shared characters beyond
i := 0 ; while a[i+1] == b[i+1] do i +:=1 # find common prefix ... if *(x := a[1+:i]) > 0 then # if any return x || lcs(a[i+1:0],b[i+1:0]) # ... remove and process remainder
i := 0 ; while a[-(i+1)] == b[-(i+1)] do i +:=1 # find common suffix ... if *(y := a[0-:i]) > 0 then # if any return lcs(a[1:-i],b[1:-i]) || y # ... remove and process remainder
return if *(x := lcs(a,b[1:-1])) > *(y := lcs(a[1:-1],b)) then x else y # divide, discard, and keep longest
end</lang>
- Output:
lcs( "thisisatest", "testing123testing" ) = "tsitest" lcs( "", "x" ) = "" lcs( "x", "x" ) = "x" lcs( "beginning-middle-ending", "beginning-diddle-dum-ending" ) = "beginning-iddle-ending"
J
<lang j>lcs=: dyad define
|.x{~ 0{"1 cullOne^:_ (\: +/"1)(\:{."1) 4$.$. x =/ y
)
cullOne=: ({~[: <@<@< [: (i. 0:)1,[: *./[: |: 2>/\]) :: ]</lang>
Here's another approach:
<lang J>mergeSq=: ;@}: ~.@, {.@;@{. ,&.> 3 {:: 4&{. common=: 2 2 <@mergeSq@,;.3^:_ [: (<@#&.> i.@$) =/ lcs=: [ {~ 0 {"1 ,&$ #: 0 ({:: (#~ [: (= >./) #@>)) 0 ({:: ,) common</lang>
Example use (works with either definition of lcs):
<lang J> 'thisisatest' lcs 'testing123testing' tsitest</lang>
Dynamic programming version <lang j>longest=: ]`[@.(>&#) upd=:{:@[,~ ({.@[ ,&.> {:@])`({:@[ longest&.> {.@])@.(0 = #&>@{.@[) lcs=: 0{:: [: ([: {.&> [: upd&.>/\.<"1@:,.)/ a:,.~a:,~=/{"1 a:,.<"0@[</lang> Output: <lang j> '1234' lcs '1224533324' 1234
'thisisatest' lcs 'testing123testing'
tsitest</lang>
Recursion <lang j>lcs=:;(($:}.) longest }.@[ $: ])`({.@[,$:&}.)@.(=&{.)`((i.0)"_)@.(+.&(0=#))&((e.#[)&>/) ;~</lang>
Java
Recursion
This is not a particularly fast algorithm, but it gets the job done eventually. The speed is a result of many recursive function calls. <lang java>public static String lcs(String a, String b){
int aLen = a.length(); int bLen = b.length(); if(aLen == 0 || bLen == 0){ return ""; }else if(a.charAt(aLen-1) == b.charAt(bLen-1)){ return lcs(a.substring(0,aLen-1),b.substring(0,bLen-1)) + a.charAt(aLen-1); }else{ String x = lcs(a, b.substring(0,bLen-1)); String y = lcs(a.substring(0,aLen-1), b); return (x.length() > y.length()) ? x : y; }
}</lang>
Dynamic Programming
<lang java>public static String lcs(String a, String b) {
int[][] lengths = new int[a.length()+1][b.length()+1];
// row 0 and column 0 are initialized to 0 already
for (int i = 0; i < a.length(); i++) for (int j = 0; j < b.length(); j++) if (a.charAt(i) == b.charAt(j)) lengths[i+1][j+1] = lengths[i][j] + 1; else lengths[i+1][j+1] = Math.max(lengths[i+1][j], lengths[i][j+1]);
// read the substring out from the matrix StringBuffer sb = new StringBuffer(); for (int x = a.length(), y = b.length(); x != 0 && y != 0; ) { if (lengths[x][y] == lengths[x-1][y]) x--; else if (lengths[x][y] == lengths[x][y-1]) y--; else { assert a.charAt(x-1) == b.charAt(y-1); sb.append(a.charAt(x-1)); x--; y--; } }
return sb.reverse().toString();
}</lang>
JavaScript
Recursion
This is more or less a translation of the recursive Java version above. <lang javascript>function lcs(a, b) {
var aSub = a.substr(0, a.length-1); var bSub = b.substr(0, b.length-1); if (a.length == 0 || b.length == 0) { return ""; } else if (a.charAt(a.length-1) == b.charAt(b.length-1)) { return lcs(aSub, bSub) + a.charAt(a.length-1); } else { var x = lcs(a, bSub); var y = lcs(aSub, b); return (x.length > y.length) ? x : y; }
}</lang>
Dynamic Programming
This version runs in O(mn) time and consumes O(mn) space. Factoring out loop edge cases could get a small constant time improvement, and it's fairly trivial to edit the final loop to produce a full diff in addition to the lcs. <lang javascript>function lcs(x,y){ var s,i,j,m,n, lcs=[],row=[],c=[], left,diag,latch; //make sure shorter string is the column string if(m<n){s=x;x=y;y=s;} m = x.length; n = y.length; //build the c-table for(j=0;j<n;row[j++]=0); for(i=0;i<m;i++){ c[i] = row = row.slice(); for(diag=0,j=0;j<n;j++,diag=latch){ latch=row[j]; if(x[i] == y[j]){row[j] = diag+1;} else{ left = row[j-1]||0; if(left>row[j]){row[j] = left;} } } } i--,j--; //row[j] now contains the length of the lcs //recover the lcs from the table while(i>-1&&j>-1){ switch(c[i][j]){ default: j--; lcs.unshift(x[i]); case (i&&c[i-1][j]): i--; continue; case (j&&c[i][j-1]): j--; } } return lcs.join(); }</lang>
BUG note: In line 6, m and n are not yet initialized, and so x and y are never swapped. Swapping is useless here, and becomes wrong when extending the algorithm to produce a diff.
The final loop can be modified to concatenate maximal common substrings rather than individual characters: <lang javascript> var t=i; while(i>-1&&j>-1){ switch(c[i][j]){ default:i--,j--; continue; case (i&&c[i-1][j]): if(t!==i){lcs.unshift(x.substring(i+1,t+1));} t=--i; continue; case (j&&c[i][j-1]): j--; if(t!==i){lcs.unshift(x.substring(i+1,t+1));} t=i; } } if(t!==i){lcs.unshift(x.substring(i+1,t+1));}</lang>
Greedy Algorithm
This is a bit harder to understand, but is significantly faster and less memory intensive than the dynamic programming version, in exchange for giving up the ability to re-use the table to find alternate solutions and greater complexity in generating diffs. Note that this implementation uses a binary buffer for additional efficiency gains, but it's simple to transform to use string or array concatenation; <lang javascript>function lcs_greedy(x,y){ var symbols = {}, r=0,p=0,p1,L=0,idx, m=x.length,n=y.length, S = new Buffer(m<n?n:m); p1 = popsym(0); for(i=0;i < m;i++){ p = (r===p)?p1:popsym(i); p1 = popsym(i+1); idx=(p > p1)?(i++,p1):p; if(idx===n){p=popsym(i);} else{ r=idx; S[L++]=x.charCodeAt(i); } } return S.toString('utf8',0,L);
function popsym(index){ var s = x[index], pos = symbols[s]+1; pos = y.indexOf(s,pos>r?pos:r); if(pos===-1){pos=n;} symbols[s]=pos; return pos; } }</lang>
jq
We first give a recursive solution, which works for strings or for arrays, and then use it to write an enhanced solution that first removes extraneous characters and recognizes a common initial substring.<lang jq>
- Generic version for strings or for arrays:
def recursive_lcs(a; b):
if (a|length) == 0 or (b|length) == 0 then a[0:0] else a[0:-1] as $aSub | b[0:-1] as $bSub | a[-1:] as $last | if $last == b[-1:] then recursive_lcs($aSub; $bSub) + $last else recursive_lcs(a; $bSub) as $x | recursive_lcs($aSub; b) as $y | if ($x|length) > ($y|length) then $x else $y end end end ;</lang>
Enhanced version:<lang jq>
- return the length of the common initial subsequence;
- x and y are arrays
- The inner helper function has no arguments
- and so has no recursion overhead
def common_heads(x;y):
def common: if x[.] != null and x[.] == y[.] then (.+1)|common else . end; 0 | common;
- x and y are arrays
def intersection(x;y):
( (x|unique) + (y|unique) | sort) as $sorted | reduce range(1; $sorted|length) as $i ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
- x and y are strings; emit [winnowedx, winnowedy]
def winnow(x; y):
(x|explode) as $x | (y|explode) as $y | intersection($x; $y) as $intersection | [ ($x | map( select( . as $i | $intersection | index($i) ))) , ($y | map( select( . as $i | $intersection | index($i) ))) ] | map(implode) ;
- First remove extraneous characters and recognize common heads
def lcs(a; b):
if (a|length) == 0 or (b|length) == 0 then "" else winnow(a;b) | .[0] as $a | .[1] as $b | common_heads($a | explode; $b | explode) as $heads | if $heads > 0 then $a[0:$heads] + recursive_lcs( $a[$heads:]; b[$heads:]) else recursive_lcs($a; $b) end end ;</lang>
Example:<lang jq> def test:
lcs( "thisisatest"; "testing123testing"), lcs("beginning-middle-ending" ; "beginning-diddle-dum-ending" )
test</lang><lang sh>$ time jq -n -f LCS.jq time jq -n -f LCS.jq "tsitest" "beginning-iddle-ending"
real 0m0.456s user 0m0.427s sys 0m0.005s</lang>
Liberty BASIC
<lang lb> 'variation of BASIC example w$="aebdef" z$="cacbc" print lcs$(w$,z$)
'output: 'ab
wait
FUNCTION lcs$(a$, b$)
IF LEN(a$) = 0 OR LEN(b$) = 0 THEN lcs$ = "" exit function end if
IF RIGHT$(a$, 1) = RIGHT$(b$, 1) THEN lcs$ = lcs$(LEFT$(a$, LEN(a$) - 1), LEFT$(b$, LEN(b$) - 1)) + RIGHT$(a$, 1) exit function ELSE x$ = lcs$(a$, LEFT$(b$, LEN(b$) - 1)) y$ = lcs$(LEFT$(a$, LEN(a$) - 1), b$) IF LEN(x$) > LEN(y$) THEN lcs$ = x$ exit function ELSE lcs$ = y$ exit function END IF END IF
END FUNCTION </lang>
Logo
This implementation works on both words and lists. <lang logo>to longest :s :t
output ifelse greater? count :s count :t [:s] [:t]
end to lcs :s :t
if empty? :s [output :s] if empty? :t [output :t] if equal? first :s first :t [output combine first :s lcs bf :s bf :t] output longest lcs :s bf :t lcs bf :s :t
end</lang>
Lua
<lang lua>function LCS( a, b )
if #a == 0 or #b == 0 then return "" elseif string.sub( a, -1, -1 ) == string.sub( b, -1, -1 ) then return LCS( string.sub( a, 1, -2 ), string.sub( b, 1, -2 ) ) .. string.sub( a, -1, -1 ) else local a_sub = LCS( a, string.sub( b, 1, -2 ) ) local b_sub = LCS( string.sub( a, 1, -2 ), b ) if #a_sub > #b_sub then return a_sub else return b_sub end end
end
print( LCS( "thisisatest", "testing123testing" ) )</lang>
M4
<lang M4>define(`set2d',`define(`$1[$2][$3]',`$4')') define(`get2d',`defn($1[$2][$3])') define(`tryboth',
`pushdef(`x',lcs(`$1',substr(`$2',1),`$1 $2'))`'pushdef(`y', lcs(substr(`$1',1),`$2',`$1 $2'))`'ifelse(eval(len(x)>len(y)),1, `x',`y')`'popdef(`x')`'popdef(`y')')
define(`checkfirst',
`ifelse(substr(`$1',0,1),substr(`$2',0,1), `substr(`$1',0,1)`'lcs(substr(`$1',1),substr(`$2',1))', `tryboth(`$1',`$2')')')
define(`lcs',
`ifelse(get2d(`c',`$1',`$2'),`', `pushdef(`a',ifelse( `$1',`',`', `$2',`',`', `checkfirst(`$1',`$2')'))`'a`'set2d(`c',`$1',`$2',a)`'popdef(`a')', `get2d(`c',`$1',`$2')')')
lcs(`1234',`1224533324')
lcs(`thisisatest',`testing123testing')</lang> Note: the caching (set2d/get2d) obscures the code even more than usual, but is necessary in order to get the second test to run in a reasonable amount of time.
Maple
<lang Maple> > StringTools:-LongestCommonSubSequence( "thisisatest", "testing123testing" );
"tsitest"
</lang>
Mathematica
A built-in function can do this for us: <lang Mathematica>a = "thisisatest"; b = "testing123testing"; LongestCommonSequence[a, b]</lang> gives: <lang Mathematica>tsitest</lang> Note that Mathematica also has a built-in function called LongestCommonSubsequence[a,b]:
finds the longest contiguous subsequence of elements common to the strings or lists a and b.
which would give "test" as the result for LongestCommonSubsequence[a, b].
The description for LongestCommonSequence[a,b] is:
finds the longest sequence of contiguous or disjoint elements common to the strings or lists a and b.
I added this note because the name of this article suggests LongestCommonSubsequence does the job, however LongestCommonSubsequence performs the puzzle-description.
Nim
Recursion
<lang nim>proc lcs(x, y): string =
if x == "" or y == "": return ""
if x[0] == y[0]: return x[0] & lcs(x[1..x.high], y[1..y.high])
let a = lcs(x, y[1..y.high]) let b = lcs(x[1..x.high], y) result = if a.len > b.len: a else: b
echo lcs("1234", "1224533324") echo lcs("thisisatest", "testing123testing")</lang>
Dynamic Programming
<lang nim>proc lcs(a, b): string =
var ls = newSeq[seq[int]] a.len+1 for i in 0 .. a.len: ls[i].newSeq b.len+1
for i, x in a: for j, y in b: if x == y: ls[i+1][j+1] = ls[i][j] + 1 else: ls[i+1][j+1] = max(ls[i+1][j], ls[i][j+1])
result = "" var x = a.len var y = b.len while x > 0 and y > 0: if ls[x][y] == ls[x-1][y]: dec x elif ls[x][y] == ls[x][y-1]: dec y else: assert a[x-1] == b[y-1] result = a[x-1] & result dec x dec y
echo lcs("1234", "1224533324") echo lcs("thisisatest", "testing123testing")</lang>
OCaml
Recursion
from Haskell <lang ocaml>let longest xs ys = if List.length xs > List.length ys then xs else ys
let rec lcs a b = match a, b with
[], _ | _, [] -> [] | x::xs, y::ys -> if x = y then x :: lcs xs ys else longest (lcs a ys) (lcs xs b)</lang>
Memoized recursion
<lang ocaml> let lcs xs ys =
let cache = Hashtbl.create 16 in let rec lcs xs ys = try Hashtbl.find cache (xs, ys) with | Not_found -> let result = match xs, ys with | [], _ -> [] | _, [] -> [] | x :: xs, y :: ys when x = y -> x :: lcs xs ys | _ :: xs_rest, _ :: ys_rest -> let a = lcs xs_rest ys in let b = lcs xs ys_rest in if (List.length a) > (List.length b) then a else b in Hashtbl.add cache (xs, ys) result; result in lcs xs ys</lang>
Dynamic programming
<lang ocaml>let lcs xs' ys' =
let xs = Array.of_list xs' and ys = Array.of_list ys' in let n = Array.length xs and m = Array.length ys in let a = Array.make_matrix (n+1) (m+1) [] in for i = n-1 downto 0 do for j = m-1 downto 0 do a.(i).(j) <- if xs.(i) = ys.(j) then xs.(i) :: a.(i+1).(j+1) else longest a.(i).(j+1) a.(i+1).(j) done done; a.(0).(0)</lang>
Because both solutions only work with lists, here are some functions to convert to and from strings: <lang ocaml>let list_of_string str =
let result = ref [] in String.iter (fun x -> result := x :: !result) str; List.rev !result
let string_of_list lst =
let result = String.create (List.length lst) in ignore (List.fold_left (fun i x -> result.[i] <- x; i+1) 0 lst); result</lang>
Both solutions work. Example:
# string_of_list (lcs (list_of_string "thisisatest") (list_of_string "testing123testing"));; - : string = "tsitest"
Oz
Recursive solution: <lang oz>declare
fun {LCS Xs Ys} case [Xs Ys] of [nil _] then nil [] [_ nil] then nil [] [X|Xr Y|Yr] andthen X==Y then X|{LCS Xr Yr} [] [_|Xr _|Yr] then {Longest {LCS Xs Yr} {LCS Xr Ys}} end end
fun {Longest Xs Ys} if {Length Xs} > {Length Ys} then Xs else Ys end end
in
{System.showInfo {LCS "thisisatest" "testing123testing"}}</lang>
Pascal
<lang pascal>Program LongestCommonSubsequence(output);
function lcs(a, b: string): string;
var x, y: string; lenga, lengb: integer; begin lenga := length(a); lengb := length(b); lcs := ; if (lenga > 0) and (lengb > 0) then if a[lenga] = b[lengb] then lcs := lcs(copy(a, 1, lenga-1), copy(b, 1, lengb-1)) + a[lenga] else begin x := lcs(a, copy(b, 1, lengb-1)); y := lcs(copy(a, 1, lenga-1), b); if length(x) > length(y) then lcs := x else lcs := y; end; end;
var
s1, s2: string;
begin
s1 := 'thisisatest'; s2 := 'testing123testing'; writeln (lcs(s1, s2)); s1 := '1234'; s2 := '1224533324'; writeln (lcs(s1, s2));
end.</lang>
- Output:
:> ./LongestCommonSequence tsitest 1234
Perl
<lang perl>use Algorithm::Diff qw/ LCS /;
my @a = split //, 'thisisatest'; my @b = split //, 'testing123testing';
print LCS( \@a, \@b );</lang>
Perl 6
Recursion
This solution is similar to the Haskell one. It is slow. <lang perl6>sub lcs(Str $xstr, Str $ystr) {
return "" unless $xstr & $ystr;
my ($x, $xs, $y, $ys) = $xstr.substr(0, 1), $xstr.substr(1), $ystr.substr(0, 1), $ystr.substr(1); return $x eq $y ?? $x ~ lcs($xs, $ys) !! max({ $^a.chars }, lcs($xstr, $ys), lcs($xs, $ystr) );
}
say lcs("thisisatest", "testing123testing");</lang>
Dynamic Programming
<lang perl6> sub lcs(Str $xstr, Str $ystr) {
my ($xlen, $ylen) = ($xstr, $ystr)>>.chars; my @lengths = map {[(0) xx ($ylen+1)]}, 0..$xlen;
for $xstr.comb.kv -> $i, $x { for $ystr.comb.kv -> $j, $y { @lengths[$i+1][$j+1] = $x eq $y ?? @lengths[$i][$j]+1 !! (@lengths[$i+1][$j], @lengths[$i][$j+1]).max; } }
my @x = $xstr.comb; my ($x, $y) = ($xlen, $ylen); my $result = ""; while $x != 0 && $y != 0 { if @lengths[$x][$y] == @lengths[$x-1][$y] { $x--; } elsif @lengths[$x][$y] == @lengths[$x][$y-1] { $y--; } else { $result = @x[$x-1] ~ $result; $x--; $y--; } }
return $result;
}
say lcs("thisisatest", "testing123testing");</lang>
PicoLisp
<lang PicoLisp>(de commonSequences (A B)
(when A (conc (when (member (car A) B) (mapcar '((L) (cons (car A) L)) (cons NIL (commonSequences (cdr A) (cdr @))) ) ) (commonSequences (cdr A) B) ) ) )
(maxi length
(commonSequences (chop "thisisatest") (chop "testing123testing") ) )</lang>
- Output:
-> ("t" "s" "i" "t" "e" "s" "t")
Prolog
Recursive Version
First version: <lang Prolog>test :-
time(lcs("thisisatest", "testing123testing", Lcs)), writef('%s',[Lcs]).
lcs([ H|L1],[ H|L2],[H|Lcs]) :- !,
lcs(L1,L2,Lcs).
lcs([H1|L1],[H2|L2],Lcs):-
lcs( L1 ,[H2|L2],Lcs1), lcs([H1|L1], L2 ,Lcs2), longest(Lcs1,Lcs2,Lcs),!.
lcs(_,_,[]).
longest(L1,L2,Longest) :-
length(L1,Length1), length(L2,Length2), ((Length1 > Length2) -> Longest = L1; Longest = L2).</lang>
Second version, with memoization: <lang Prolog>%declare that we will add lcs_db facts during runtime
- - dynamic lcs_db/3.
test :-
retractall(lcs_db(_,_,_)), %clear the database of known results time(lcs("thisisatest", "testing123testing", Lcs)), writef('%s',[Lcs]).
% check if the result is known
lcs(L1,L2,Lcs) :-
lcs_db(L1,L2,Lcs),!.
lcs([ H|L1],[ H|L2],[H|Lcs]) :- !,
lcs(L1,L2,Lcs).
lcs([H1|L1],[H2|L2],Lcs) :-
lcs( L1 ,[H2|L2],Lcs1), lcs([H1|L1], L2 ,Lcs2), longest(Lcs1,Lcs2,Lcs),!, assert(lcs_db([H1|L1],[H2|L2],Lcs)).
lcs(_,_,[]).
longest(L1,L2,Longest) :-
length(L1,Length1), length(L2,Length2), ((Length1 > Length2) -> Longest = L1; Longest = L2).</lang>
- Demonstrating:
Example for "beginning-middle-ending" and "beginning-diddle-dum-ending"
First version :
<lang Prolog>?- time(lcs("beginning-middle-ending","beginning-diddle-dum-ending", Lcs)),writef('%s', [Lcs]).
% 10,875,184 inferences, 1.840 CPU in 1.996 seconds (92% CPU, 5910426 Lips)
beginning-iddle-ending</lang>
Second version which is much faster :
<lang Prolog>?- time(lcs("beginning-middle-ending","beginning-diddle-dum-ending", Lcs)),writef('%s', [Lcs]).
% 2,376 inferences, 0.010 CPU in 0.003 seconds (300% CPU, 237600 Lips)
beginning-iddle-ending</lang>
PureBasic
<lang PureBasic>Procedure.s lcs(a$, b$)
Protected x$ , lcs$ If Len(a$) = 0 Or Len(b$) = 0 lcs$ = "" ElseIf Right(a$, 1) = Right(b$, 1) lcs$ = lcs(Left(a$, Len(a$) - 1), Left(b$, Len(b$) - 1)) + Right(a$, 1) Else x$ = lcs(a$, Left(b$, Len(b$) - 1)) y$ = lcs(Left(a$, Len(a$) - 1), b$) If Len(x$) > Len(y$) lcs$ = x$ Else lcs$ = y$ EndIf EndIf ProcedureReturn lcs$
EndProcedure OpenConsole() PrintN( lcs("thisisatest", "testing123testing")) PrintN("Press any key to exit"): Repeat: Until Inkey() <> ""</lang>
Python
The simplest way is to use LCS within mlpy package
Recursion
This solution is similar to the Haskell one. It is slow. <lang python>def lcs(xstr, ystr):
""" >>> lcs('thisisatest', 'testing123testing') 'tsitest' """ if not xstr or not ystr: return "" x, xs, y, ys = xstr[0], xstr[1:], ystr[0], ystr[1:] if x == y: return x + lcs(xs, ys) else: return max(lcs(xstr, ys), lcs(xs, ystr), key=len)</lang>
Test it: <lang python>if __name__=="__main__":
import doctest; doctest.testmod()</lang>
Dynamic Programming
<lang python>def lcs(a, b):
lengths = [[0 for j in range(len(b)+1)] for i in range(len(a)+1)] # row 0 and column 0 are initialized to 0 already for i, x in enumerate(a): for j, y in enumerate(b): if x == y: lengths[i+1][j+1] = lengths[i][j] + 1 else: lengths[i+1][j+1] = \ max(lengths[i+1][j], lengths[i][j+1]) # read the substring out from the matrix result = "" x, y = len(a), len(b) while x != 0 and y != 0: if lengths[x][y] == lengths[x-1][y]: x -= 1 elif lengths[x][y] == lengths[x][y-1]: y -= 1 else: assert a[x-1] == b[y-1] result = a[x-1] + result x -= 1 y -= 1 return result</lang>
Racket
<lang racket>#lang racket (define (longest xs ys)
(if (> (length xs) (length ys)) xs ys))
(define memo (make-hash)) (define (lookup xs ys)
(hash-ref memo (cons xs ys) #f))
(define (store xs ys r)
(hash-set! memo (cons xs ys) r) r)
(define (lcs/list sx sy)
(or (lookup sx sy) (store sx sy (match* (sx sy) [((cons x xs) (cons y ys)) (if (equal? x y) (cons x (lcs/list xs ys)) (longest (lcs/list sx ys) (lcs/list xs sy)))] [(_ _) '()]))))
(define (lcs sx sy)
(list->string (lcs/list (string->list sx) (string->list sy))))
(lcs "thisisatest" "testing123testing")</lang>
- Output:
"tsitest">
REXX
<lang rexx>/*REXX program to test the LCS (Longest Common Subsequence) subroutine.*/ parse arg aaa bbb . /*get two arguments (strings). */ say 'string A = 'aaa /*echo string A to screen. */ say 'string B = 'bbb /*echo string B to screen. */ say ' LCS = 'lcs(aaa,bbb) /*tell Longest Common Sequence. */ exit /*stick a fork in it, we're done.*/ /*──────────────────────────────────LCS subroutine──────────────────────*/ lcs: procedure; parse arg a,b,z /*Longest Common Subsequence. */
/*reduce recursions, removes the */ /*chars in A ¬ in B, & vice-versa*/
if z== then return lcs( lcs(a,b,0), lcs(b,a,0), 9) j=length(a) if z==0 then do /*special invocation: shrink Z. */
do j=1 for j; _=substr(a,j,1) if pos(_,b)\==0 then z=z||_ end /*j*/ return substr(z,2) end
k=length(b) if j==0 | k==0 then return /*Either string null? Bupkis. */ _=substr(a,j,1) if _==substr(b,k,1) then return lcs(substr(a,1,j-1),substr(b,1,k-1),9)_ x=lcs(a,substr(b,1,k-1),9) y=lcs(substr(a,1,j-1),b,9) if length(x)>length(y) then return x
return y</lang>
- Output with input “ 1234 1224533324 ”:
string A=1234 string B=1224533324 LCS=1234
- Output with input “ thisisatest testing123testing ”:
string A=thisisatest string B=testing123testing LCS=tsitest
Ruby
Recursion
This solution is similar to the Haskell one. It is slow (The time complexity is exponential.)
<lang ruby>=begin irb(main):001:0> lcs('thisisatest', 'testing123testing') => "tsitest" =end def lcs(xstr, ystr)
return "" if xstr.empty? || ystr.empty? x, xs, y, ys = xstr[0..0], xstr[1..-1], ystr[0..0], ystr[1..-1] if x == y x + lcs(xs, ys) else [lcs(xstr, ys), lcs(xs, ystr)].max_by {|x| x.size} end
end</lang>
Dynamic programming
Walker class for the LCS matrix:
<lang ruby>class LCS
SELF, LEFT, UP, DIAG = [0,0], [0,-1], [-1,0], [-1,-1] def initialize(a, b) @m = Array.new(a.length) { Array.new(b.length) } a.each_char.with_index do |x, i| b.each_char.with_index do |y, j| match(x, y, i, j) end end end def match(c, d, i, j) @i, @j = i, j @m[i][j] = compute_entry(c, d) end def lookup(x, y) [@i+x, @j+y] end def valid?(i=@i, j=@j) i >= 0 && j >= 0 end def peek(x, y) i, j = lookup(x, y) valid?(i, j) ? @m[i][j] : 0 end def compute_entry(c, d) c == d ? peek(*DIAG) + 1 : [peek(*LEFT), peek(*UP)].max end def backtrack @i, @j = @m.length-1, @m[0].length-1 y = [] y << @i+1 if backstep? while valid? y.reverse end def backtrack2 @i, @j = @m.length-1, @m[0].length-1 y = [] y << @j+1 if backstep? while valid? [backtrack, y.reverse] end def backstep? backstep = compute_backstep @i, @j = lookup(*backstep) backstep == DIAG end def compute_backstep case peek(*SELF) when peek(*LEFT) then LEFT when peek(*UP) then UP else DIAG end end
end</lang>
lcs function:
<lang ruby>def lcs(a, b)
walker = LCS.new(a, b) walker.backtrack.inject("") { |s, i| s << a[i] }
end
puts lcs('thisisatest', 'testing123testing') puts lcs("rosettacode", "raisethysword")</lang>
- Output:
tsitest rsetod
Referring to LCS here.
Run BASIC
<lang runbasic>a$ = "aebdaef" b$ = "cacbac" print lcs$(a$,b$) end
FUNCTION lcs$(a$, b$) IF a$ = "" OR b$ = "" THEN
lcs$ = "" goto [ext]
end if
IF RIGHT$(a$, 1) = RIGHT$(b$, 1) THEN
lcs$ = lcs$(LEFT$(a$, LEN(a$) - 1), LEFT$(b$, LEN(b$) - 1)) + RIGHT$(a$, 1) goto [ext] ELSE x1$ = lcs$(a$, LEFT$(b$, LEN(b$) - 1)) x2$ = lcs$(LEFT$(a$, LEN(a$) - 1), b$) IF LEN(x1$) > LEN(x2$) THEN lcs$ = x1$ goto [ext] ELSE lcs$ = x2$ goto [ext] END IF
END IF [ext]
END FUNCTION</lang>
aba
Scala
<lang scala>object LCS extends App {
// recursive version: def lcsr(a: String, b: String): String = { if (a.size==0 || b.size==0) "" else if (a==b) a else if(a(a.size-1)==b(b.size-1)) lcsr(a.substring(0,a.size-1),b.substring(0,b.size-1))+a(a.size-1) else { val x = lcsr(a,b.substring(0,b.size-1)) val y = lcsr(a.substring(0,a.size-1),b) if (x.size > y.size) x else y } } // dynamic programming version: def lcsd(a: String, b: String): String = { if (a.size==0 || b.size==0) "" else if (a==b) a else { val lengths = Array.ofDim[Int](a.size+1,b.size+1) for (i <- 0 until a.size) for (j <- 0 until b.size) if (a(i) == b(j)) lengths(i+1)(j+1) = lengths(i)(j) + 1 else lengths(i+1)(j+1) = scala.math.max(lengths(i+1)(j),lengths(i)(j+1)) // read the substring out from the matrix val sb = new StringBuilder() var x = a.size var y = b.size do { if (lengths(x)(y) == lengths(x-1)(y)) x -= 1 else if (lengths(x)(y) == lengths(x)(y-1)) y -= 1 else { assert(a(x-1) == b(y-1)) sb += a(x-1) x -= 1 y -= 1 } } while (x!=0 && y!=0) sb.toString.reverse } } val elapsed: (=> Unit) => Long = f => {val s = System.currentTimeMillis; f; (System.currentTimeMillis - s)/1000} val pairs = List(("thisiaatest","testing123testing") ,("","x") ,("x","x") ,("beginning-middle-ending", "beginning-diddle-dum-ending"))
var s = "" println("recursive version:") pairs foreach {p => println{val t = elapsed(s = lcsr(p._1,p._2)) "lcsr(\""+p._1+"\",\""+p._2+"\") = \""+s+"\" ("+t+" sec)"} }
println("\n"+"dynamic programming version:") pairs foreach {p => println{val t = elapsed(s = lcsd(p._1,p._2)) "lcsd(\""+p._1+"\",\""+p._2+"\") = \""+s+"\" ("+t+" sec)"} }
}</lang>
- Output:
recursive version: lcsr("thisiaatest","testing123testing") = "tsitest" (0 sec) lcsr("","x") = "" (0 sec) lcsr("x","x") = "x" (0 sec) lcsr("beginning-middle-ending","beginning-diddle-dum-ending") = "beginning-iddle-ending" (29 sec) dynamic programming version: lcsd("thisiaatest","testing123testing") = "tsitest" (0 sec) lcsd("","x") = "" (0 sec) lcsd("x","x") = "x" (0 sec) lcsd("beginning-middle-ending","beginning-diddle-dum-ending") = "beginning-iddle-ending" (0 sec)
Scheme
Port from Clojure.
<lang scheme>
- using srfi-69
(define (memoize proc)
(let ((results (make-hash-table))) (lambda args (or (hash-table-ref results args (lambda () #f)) (let ((r (apply proc args))) (hash-table-set! results args r) r)))))
(define (longest xs ys)
(if (> (length xs) (length ys)) xs ys))
(define lcs
(memoize (lambda (seqx seqy) (if (pair? seqx) (let ((x (car seqx)) (xs (cdr seqx))) (if (pair? seqy) (let ((y (car seqy)) (ys (cdr seqy))) (if (equal? x y) (cons x (lcs xs ys)) (longest (lcs seqx ys) (lcs xs seqy)))) '())) '()))))
</lang>
Testing: <lang scheme>
(test-group
"lcs" (test '() (lcs '(a b c) '(A B C))) (test '(a) (lcs '(a a a) '(A A a))) (test '() (lcs '() '(a b c))) (test '() (lcs '(a b c) '())) (test '(a c) (lcs '(a b c) '(a B c))) (test '(b) (lcs '(a b c) '(A b C))) (test '( b d e f g h j) (lcs '(a b d e f g h i j) '(A b c d e f F a g h j))))
</lang>
Seed7
<lang seed7>$ include "seed7_05.s7i";
const func string: lcs (in string: a, in string: b) is func
result var string: lcs is ""; local var string: x is ""; var string: y is ""; begin if a <> "" and b <> "" then if a[length(a)] = b[length(b)] then lcs := lcs(a[.. pred(length(a))], b[.. pred(length(b))]) & str(a[length(a)]); else x := lcs(a, b[.. pred(length(b))]); y := lcs(a[.. pred(length(a))], b); if length(x) > length(y) then lcs := x; else lcs := y; end if; end if; end if; end func;
const proc: main is func
begin writeln(lcs("thisisatest", "testing123testing")); writeln(lcs("1234", "1224533324")); end func;</lang>
Output:
tsitest 1234
SETL
Recursive; Also works on tuples (vectors) <lang setl> op .longest(a, b);
return if #a > #b then a else b end; end .longest; procedure lcs(a, b); if exists empty in {a, b} | #empty = 0 then return empty; elseif a(1) = b(1) then return a(1) + lcs(a(2..), b(2..)); else return lcs(a(2..), b) .longest lcs(a, b(2..)); end; end lcs;</lang>
Sidef
<lang ruby>func lcs(xstr is String, ystr is String) -> String {
(xstr.is_empty || ystr.is_empty) && return ;
var(x, xs, y, ys) = (xstr.ft(0, 1), xstr.ft(1), ystr.ft(0, 1), ystr.ft(1));
if (x == y) { x + lcs($xs, $ys) } else { [lcs(xstr, ys), lcs(xs, ystr)].max_by {|x| x.len}; }
}
say lcs("thisisatest", "testing123testing");</lang>
- Output:
% time sidef -Mblock lcs.sf tsitest sidef -Mblock lcs.sf 0.23s user 0.01s system 97% cpu 0.240 total
Slate
We define this on the Sequence type since there is nothing string-specific about the concept.
Recursion
<lang slate>s1@(Sequence traits) longestCommonSubsequenceWith: s2@(Sequence traits) [
s1 isEmpty \/ s2 isEmpty ifTrue: [^ {}]. s1 last = s2 last ifTrue: [(s1 allButLast longestCommonSubsequenceWith: s2 allButLast) copyWith: s1 last] ifFalse: [| x y | x: (s1 longestCommonSubsequenceWith: s2 allButLast). y: (s1 allButLast longestCommonSubsequenceWith: s2). x length > y length ifTrue: [x] ifFalse: [y]]
].</lang>
Dynamic Programming
<lang slate>s1@(Sequence traits) longestCommonSubsequenceWith: s2@(Sequence traits) [| lengths |
lengths: (ArrayMD newWithDimensions: {s1 length `cache. s2 length `cache} defaultElement: 0). s1 doWithIndex: [| :elem1 :index1 | s2 doWithIndex: [| :elem2 :index2 | elem1 = elem2 ifTrue: [lengths at: {index1 + 1. index2 + 1} put: (lengths at: {index1. index2}) + 1] ifFalse: [lengths at: {index1 + 1. index2 + 1} put: ((lengths at: {index1 + 1. index2}) max: (lengths at: {index1. index2 + 1}))]]]. ([| :result index1 index2 | index1: s1 length. index2: s2 length. [index1 isPositive /\ index2 isPositive] whileTrue: [(lengths at: {index1. index2}) = (lengths at: {index1 - 1. index2}) ifTrue: [index1: index1 - 1] ifFalse: [(lengths at: {index1. index2}) = (lengths at: {index1. index2 - 1})] ifTrue: [index2: index2 - 1] ifFalse: ["assert: (s1 at: index1 - 1) = (s2 at: index2 - 1)." result nextPut: (s1 at: index1 - 1). index1: index1 - 1. index2: index2 - 1]] ] writingAs: s1) reverse
].</lang>
Tcl
Recursive
<lang tcl>proc r_lcs {a b} {
if {$a eq "" || $b eq ""} {return ""} set a_ [string range $a 1 end] set b_ [string range $b 1 end] if {[set c [string index $a 0]] eq [string index $b 0]} { return "$c[r_lcs $a_ $b_]" } else { set x [r_lcs $a $b_] set y [r_lcs $a_ $b] return [expr {[string length $x] > [string length $y] ? $x :$y}] }
}</lang>
Dynamic
<lang tcl>package require Tcl 8.5 namespace import ::tcl::mathop::+ namespace import ::tcl::mathop::- namespace import ::tcl::mathfunc::max
proc d_lcs {a b} {
set la [string length $a] set lb [string length $b] set lengths [lrepeat [+ $la 1] [lrepeat [+ $lb 1] 0]]
for {set i 0} {$i < $la} {incr i} { for {set j 0} {$j < $lb} {incr j} { if {[string index $a $i] eq [string index $b $j]} { lset lengths [+ $i 1] [+ $j 1] [+ [lindex $lengths $i $j] 1] } else { lset lengths [+ $i 1] [+ $j 1] [max [lindex $lengths [+ $i 1] $j] [lindex $lengths $i [+ $j 1]]] } } }
set result "" set x $la set y $lb while {$x >0 && $x > 0} { if {[lindex $lengths $x $y] == [lindex $lengths [- $x 1] $y]} { incr x -1 } elseif {[lindex $lengths $x $y] == [lindex $lengths $x [- $y 1]]} { incr y -1 } else { if {[set c [string index $a [- $x 1]]] ne [string index $b [- $y 1]]} { error "assertion failed: a.charAt(x-1) == b.charAt(y-1)" } append result $c incr x -1 incr y -1 } } return [string reverse $result]
}</lang>
Performance Comparison
<lang tcl>% time {d_lcs thisisatest testing123testing} 10 637.5 microseconds per iteration % time {r_lcs thisisatest testing123testing} 10 1275566.8 microseconds per iteration</lang>
Ursala
This uses the same recursive algorithm as in the Haskell example, and works on lists of any type. <lang Ursala>#import std
lcs = ~&alrB^& ~&E?abh/~&alh2fabt2RC @faltPrXlrtPXXPW leql?/~&r ~&l</lang> test program: <lang Ursala>#cast %s
example = lcs('thisisatest','testing123testing')</lang>
- Output:
'tsitest'
zkl
This is quite vile in terms of [time] efficiency, another algorithm should be used for real work.
<lang zkl>fcn lcs(a,b){
if(not a or not b) return(""); if (a[0]==b[0]) return(a[0] + self.fcn(a[1,*],b[1,*])); return(fcn(x,y){if(x.len()>y.len())x else y}(lcs(a,b[1,*]),lcs(a[1,*],b)))
}</lang> The last line looks strange but it is just return(lambda longest(lcs.lcs))
- Output:
zkl: lcs("thisisatest", "testing123testing") tsitest
- Programming Tasks
- Solutions by Programming Task
- Recursion
- Memoization
- Ada
- ALGOL 68
- APL
- AutoHotkey
- BASIC
- BBC BASIC
- Bracmat
- C
- C++
- C sharp
- Clojure
- CoffeeScript
- Common Lisp
- D
- Dart
- Egison
- Erlang
- Fortran
- F Sharp
- Go
- Haskell
- Icon
- Unicon
- Icon Programming Library
- J
- Java
- JavaScript
- Jq
- Liberty BASIC
- Logo
- Lua
- M4
- Maple
- Mathematica
- Nim
- OCaml
- Oz
- Pascal
- Perl
- Perl 6
- PicoLisp
- Prolog
- PureBasic
- Python
- Racket
- REXX
- Ruby
- Run BASIC
- Scala
- Scala examples needing attention
- Examples needing attention
- Scheme
- Seed7
- SETL
- Sidef
- Slate
- Tcl
- Ursala
- Zkl