Longest common subsequence: Difference between revisions
PascalABC.NET |
|||
(186 intermediate revisions by 51 users not shown) | |||
Line 1: | Line 1: | ||
{{task}}[[Category:Recursion]][[Category:Memoization]] |
{{task}}[[Category:Recursion]][[Category:Memoization]] |
||
'''Introduction''' |
|||
The '''longest common subsequence''' (or [http://en.wikipedia.org/wiki/Longest_common_subsequence_problem '''LCS''']) of groups A and B is the longest group of elements from A and B that are common between the two groups and in the same order in each group. For example, the sequences "1234" and "1224533324" have an LCS of "1234": |
|||
Define a ''subsequence'' to be any output string obtained by deleting zero or more symbols from an input string. |
|||
The [http://en.wikipedia.org/wiki/Longest_common_subsequence_problem '''Longest Common Subsequence'''] ('''LCS''') is a subsequence of maximum length common to two or more strings. |
|||
Let ''A'' ≡ ''A''[0]… ''A''[m - 1] and ''B'' ≡ ''B''[0]… ''B''[n - 1], m < n be strings drawn from an alphabet Σ of size s, containing every distinct symbol in A + B. |
|||
An ordered pair (i, j) will be referred to as a match if ''A''[i] = ''B''[j], where 0 ≤ i < m and 0 ≤ j < n. |
|||
The set of matches '''M''' defines a relation over matches: '''M'''[i, j] ⇔ (i, j) ∈ '''M'''. |
|||
Define a ''non-strict'' [https://en.wikipedia.org/wiki/Product_order product-order] (≤) over ordered pairs, such that (i1, j1) ≤ (i2, j2) ⇔ i1 ≤ i2 and j1 ≤ j2. We define (≥) similarly. |
|||
We say ordered pairs p1 and p2 are ''comparable'' if either p1 ≤ p2 or p1 ≥ p2 holds. If i1 < i2 and j2 < j1 (or i2 < i1 and j1 < j2) then neither p1 ≤ p2 nor p1 ≥ p2 are possible, and we say p1 and p2 are ''incomparable''. |
|||
Define the ''strict'' product-order (<) over ordered pairs, such that (i1, j1) < (i2, j2) ⇔ i1 < i2 and j1 < j2. We define (>) similarly. |
|||
A chain '''C''' is a subset of '''M''' consisting of at least one element m; and where either m1 < m2 or m1 > m2 for every pair of distinct elements m1 and m2. An antichain '''D''' is any subset of '''M''' in which every pair of distinct elements m1 and m2 are incomparable. |
|||
A chain can be visualized as a strictly increasing curve that passes through matches (i, j) in the m*n coordinate space of '''M'''[i, j]. |
|||
Every Common Sequence of length ''q'' corresponds to a chain of cardinality ''q'', over the set of matches '''M'''. Thus, finding an LCS can be restated as the problem of finding a chain of maximum cardinality ''p''. |
|||
According to [Dilworth 1950], this cardinality ''p'' equals the minimum number of disjoint antichains into which '''M''' can be decomposed. Note that such a decomposition into the minimal number p of disjoint antichains may not be unique. |
|||
'''Background''' |
|||
Where the number of symbols appearing in matches is small relative to the length of the input strings, reuse of the symbols increases; and the number of matches will tend towards O(''m*n'') quadratic growth. This occurs, for example, in the Bioinformatics application of nucleotide and protein sequencing. |
|||
The divide-and-conquer approach of [Hirschberg 1975] limits the space required to O(''n''). However, this approach requires O(''m*n'') time even in the best case. |
|||
This quadratic time dependency may become prohibitive, given very long input strings. Thus, heuristics are often favored over optimal Dynamic Programming solutions. |
|||
In the application of comparing file revisions, records from the input files form a large symbol space; and the number of symbols approaches the length of the LCS. In this case the number of matches reduces to linear, O(''n'') growth. |
|||
A binary search optimization due to [Hunt and Szymanski 1977] can be applied to the basic Dynamic Programming approach, resulting in an expected performance of O(''n log m''). Performance can degrade to O(''m*n log m'') time in the worst case, as the number of matches grows to O(''m*n''). |
|||
'''Note''' |
|||
[Rick 2000] describes a linear-space algorithm with a time bound of O(''n*s + p*min(m, n - p)''). |
|||
'''Legend''' |
|||
A, B are input strings of lengths m, n respectively |
|||
p is the length of the LCS |
|||
M is the set of matches (i, j) such that A[i] = B[j] |
|||
r is the magnitude of M |
|||
s is the magnitude of the alphabet Σ of distinct symbols in A + B |
|||
'''References''' |
|||
[Dilworth 1950] "A decomposition theorem for partially ordered sets" |
|||
by Robert P. Dilworth, published January 1950, |
|||
Annals of Mathematics [Volume 51, Number 1, ''pp.'' 161-166] |
|||
[Goeman and Clausen 2002] "A New Practical Linear Space Algorithm for the Longest Common |
|||
Subsequence Problem" by Heiko Goeman and Michael Clausen, |
|||
published 2002, Kybernetika [Volume 38, Issue 1, ''pp.'' 45-66] |
|||
[Hirschberg 1975] "A linear space algorithm for computing maximal common subsequences" |
|||
by Daniel S. Hirschberg, published June 1975 |
|||
Communications of the ACM [Volume 18, Number 6, ''pp.'' 341-343] |
|||
[Hunt and McIlroy 1976] "An Algorithm for Differential File Comparison" |
|||
by James W. Hunt and M. Douglas McIlroy, June 1976 |
|||
Computing Science Technical Report, Bell Laboratories 41 |
|||
[Hunt and Szymanski 1977] "A Fast Algorithm for Computing Longest Common Subsequences" |
|||
by James W. Hunt and Thomas G. Szymanski, published May 1977 |
|||
Communications of the ACM [Volume 20, Number 5, ''pp.'' 350-353] |
|||
[Rick 2000] "Simple and fast linear space computation of longest common subsequences" |
|||
by Claus Rick, received 17 March 2000, Information Processing Letters, |
|||
Elsevier Science [Volume 75, ''pp.'' 275–281] |
|||
<br /> |
|||
'''Examples''' |
|||
The sequences "1234" and "1224533324" have an LCS of "1234": |
|||
'''<u>1234</u>''' |
'''<u>1234</u>''' |
||
'''<u>12</u>'''245'''<u>3</u>'''332'''<u>4</u>''' |
'''<u>12</u>'''245'''<u>3</u>'''332'''<u>4</u>''' |
||
For a string example, consider the sequences "thisisatest" and "testing123testing". An LCS would be "tsitest": |
For a string example, consider the sequences "thisisatest" and "testing123testing". An LCS would be "tsitest": |
||
'''<u>t</u>'''hi'''<u>si</u>'''sa'''<u>test</u>''' |
'''<u>t</u>'''hi'''<u>si</u>'''sa'''<u>test</u>''' |
||
Line 10: | Line 90: | ||
For more information on this problem please see [[wp:Longest_common_subsequence_problem|Wikipedia]]. |
For more information on this problem please see [[wp:Longest_common_subsequence_problem|Wikipedia]]. |
||
{{Template:Strings}} |
|||
<br><br> |
|||
=={{header|11l}}== |
|||
{{trans|Python}} |
|||
<syntaxhighlight lang="11l">F lcs(a, b) |
|||
V lengths = [[0] * (b.len+1)] * (a.len+1) |
|||
L(x) a |
|||
V i = L.index |
|||
L(y) b |
|||
V j = L.index |
|||
I x == y |
|||
lengths[i + 1][j + 1] = lengths[i][j] + 1 |
|||
E |
|||
lengths[i + 1][j + 1] = max(lengths[i + 1][j], lengths[i][j + 1]) |
|||
V result = ‘’ |
|||
V j = b.len |
|||
L(i) 1..a.len |
|||
I lengths[i][j] != lengths[i - 1][j] |
|||
result ‘’= a[i - 1] |
|||
R result |
|||
print(lcs(‘1234’, ‘1224533324’)) |
|||
print(lcs(‘thisisatest’, ‘testing123testing’))</syntaxhighlight> |
|||
{{out}} |
|||
<pre> |
|||
1234 |
|||
tisitst |
|||
</pre> |
|||
=={{header|Ada}}== |
=={{header|Ada}}== |
||
Using recursion: |
Using recursion: |
||
< |
<syntaxhighlight lang="ada">with Ada.Text_IO; use Ada.Text_IO; |
||
procedure Test_LCS is |
procedure Test_LCS is |
||
Line 37: | Line 150: | ||
begin |
begin |
||
Put_Line (LCS ("thisisatest", "testing123testing")); |
Put_Line (LCS ("thisisatest", "testing123testing")); |
||
end Test_LCS;</ |
end Test_LCS;</syntaxhighlight> |
||
{{out}} |
{{out}} |
||
<pre> |
<pre> |
||
Line 43: | Line 156: | ||
</pre> |
</pre> |
||
Non-recursive solution: |
Non-recursive solution: |
||
< |
<syntaxhighlight lang="ada">with Ada.Text_IO; use Ada.Text_IO; |
||
procedure Test_LCS is |
procedure Test_LCS is |
||
Line 87: | Line 200: | ||
begin |
begin |
||
Put_Line (LCS ("thisisatest", "testing123testing")); |
Put_Line (LCS ("thisisatest", "testing123testing")); |
||
end Test_LCS;</ |
end Test_LCS;</syntaxhighlight> |
||
{{out}} |
{{out}} |
||
<pre> |
<pre> |
||
Line 98: | Line 211: | ||
{{works with|ALGOL 68G|Any - tested with release mk15-0.8b.fc9.i386}} |
{{works with|ALGOL 68G|Any - tested with release mk15-0.8b.fc9.i386}} |
||
{{works with|ELLA ALGOL 68|Any (with appropriate job cards) - tested with release 1.8.8d.fc9.i386}} |
{{works with|ELLA ALGOL 68|Any (with appropriate job cards) - tested with release 1.8.8d.fc9.i386}} |
||
< |
<syntaxhighlight lang="algol68">main:( |
||
PROC lcs = (STRING a, b)STRING: |
PROC lcs = (STRING a, b)STRING: |
||
BEGIN |
BEGIN |
||
Line 112: | Line 225: | ||
END # lcs #; |
END # lcs #; |
||
print((lcs ("thisisatest", "testing123testing"), new line)) |
print((lcs ("thisisatest", "testing123testing"), new line)) |
||
)</ |
)</syntaxhighlight> |
||
{{out}} |
{{out}} |
||
<pre> |
<pre> |
||
tsitest |
tsitest |
||
</pre> |
</pre> |
||
=={{header|APL}}== |
=={{header|APL}}== |
||
{{works with|Dyalog APL}} |
{{works with|Dyalog APL}} |
||
< |
<syntaxhighlight lang="apl">lcs←{ |
||
⎕IO←0 |
⎕IO←0 |
||
betterof←{⊃(</+/¨⍺ ⍵)⌽⍺ ⍵} ⍝ better of 2 selections |
betterof←{⊃(</+/¨⍺ ⍵)⌽⍺ ⍵} ⍝ better of 2 selections |
||
Line 139: | Line 253: | ||
this ∇ 1↓[1]⍵ ⍝ keep looking |
this ∇ 1↓[1]⍵ ⍝ keep looking |
||
}sstt |
}sstt |
||
}</ |
}</syntaxhighlight> |
||
=={{header|Arturo}}== |
|||
{{trans|Python}} |
|||
<syntaxhighlight lang="rebol">lcs: function [a,b][ |
|||
ls: new array.of: @[inc size a, inc size b] 0 |
|||
loop.with:'i a 'x [ |
|||
loop.with:'j b 'y [ |
|||
ls\[i+1]\[j+1]: (x=y)? -> ls\[i]\[j] + 1 |
|||
-> max @[ls\[i+1]\[j], ls\[i]\[j+1]] |
|||
] |
|||
] |
|||
[result, x, y]: @[new "", size a, size b] |
|||
while [and? [x > 0][y > 0]][ |
|||
if? ls\[x]\[y] = ls\[x-1]\[y] -> x: x-1 |
|||
else [ |
|||
if? ls\[x]\[y] = ls\[x]\[y-1] -> y: y-1 |
|||
else [ |
|||
result: a\[x-1] ++ result |
|||
x: x-1 |
|||
y: y-1 |
|||
] |
|||
] |
|||
] |
|||
return result |
|||
] |
|||
print lcs "1234" "1224533324" |
|||
print lcs "thisisatest" "testing123testing"</syntaxhighlight> |
|||
{{out}} |
|||
<pre>1234 |
|||
tsitest</pre> |
|||
=={{header|AutoHotkey}}== |
=={{header|AutoHotkey}}== |
||
{{trans|Java}} using dynamic programming<br> |
{{trans|Java}} using dynamic programming<br> |
||
ahk forum: [http://www.autohotkey.com/forum/viewtopic.php?t=44657&start=135 discussion] |
ahk forum: [http://www.autohotkey.com/forum/viewtopic.php?t=44657&start=135 discussion] |
||
< |
<syntaxhighlight lang="autohotkey">lcs(a,b) { ; Longest Common Subsequence of strings, using Dynamic Programming |
||
Loop % StrLen(a)+2 { ; Initialize |
Loop % StrLen(a)+2 { ; Initialize |
||
i := A_Index-1 |
i := A_Index-1 |
||
Line 171: | Line 320: | ||
} |
} |
||
Return t |
Return t |
||
}</ |
}</syntaxhighlight> |
||
=={{header|BASIC}}== |
=={{header|BASIC}}== |
||
==={{header|QuickBASIC}}=== |
|||
{{works with|QuickBasic|4.5}} |
{{works with|QuickBasic|4.5}} |
||
{{trans|Java}} |
{{trans|Java}} |
||
< |
<syntaxhighlight lang="qbasic">FUNCTION lcs$ (a$, b$) |
||
IF LEN(a$) = 0 OR LEN(b$) = 0 THEN |
IF LEN(a$) = 0 OR LEN(b$) = 0 THEN |
||
lcs$ = "" |
lcs$ = "" |
||
Line 192: | Line 340: | ||
END IF |
END IF |
||
END IF |
END IF |
||
END FUNCTION</ |
END FUNCTION</syntaxhighlight> |
||
=={{header|BASIC256}}== |
|||
{{trans|FreeBASIC}} |
|||
<syntaxhighlight lang="basic256">function LCS(a, b) |
|||
if length(a) = 0 or length(b) = 0 then return "" |
|||
if right(a, 1) = right(b, 1) then |
|||
LCS = LCS(left(a, length(a) - 1), left(b, length(b) - 1)) + right(a, 1) |
|||
else |
|||
x = LCS(a, left(b, length(b) - 1)) |
|||
y = LCS(left(a, length(a) - 1), b) |
|||
if length(x) > length(y) then return x else return y |
|||
end if |
|||
end function |
|||
print LCS("1234", "1224533324") |
|||
print LCS("thisisatest", "testing123testing") |
|||
end</syntaxhighlight> |
|||
=={{header|BBC BASIC}}== |
=={{header|BBC BASIC}}== |
||
This makes heavy use of BBC BASIC's shortcut '''LEFT$(a$)''' and '''RIGHT$(a$)''' functions. |
This makes heavy use of BBC BASIC's shortcut '''LEFT$(a$)''' and '''RIGHT$(a$)''' functions. |
||
< |
<syntaxhighlight lang="bbcbasic"> PRINT FNlcs("1234", "1224533324") |
||
PRINT FNlcs("thisisatest", "testing123testing") |
PRINT FNlcs("thisisatest", "testing123testing") |
||
END |
END |
||
Line 215: | Line 373: | ||
y$ = FNlcs(LEFT$(a$), b$) |
y$ = FNlcs(LEFT$(a$), b$) |
||
IF LEN(y$) > LEN(x$) SWAP x$,y$ |
IF LEN(y$) > LEN(x$) SWAP x$,y$ |
||
= x$</ |
= x$</syntaxhighlight> |
||
'''Output:''' |
'''Output:''' |
||
<pre> |
<pre> |
||
Line 221: | Line 379: | ||
tsitest |
tsitest |
||
</pre> |
</pre> |
||
=={{header|BQN}}== |
|||
It's easier and faster to get only the length of the longest common subsequence, using <code>LcsLen ← ¯1 ⊑ 0¨∘⊢ {𝕩⌈⌈`𝕨+»𝕩}˝ =⌜⟜⌽</code>. This function can be expanded by changing <code>⌈</code> to <code>⊣⍟(>○≠)</code> (choosing from two arguments one that has the greatest length), and replacing the empty length 0 with the empty string <code>""</code> in the right places. |
|||
<syntaxhighlight lang="bqn">LCS ← ¯1 ⊑ "" <⊸∾ ""¨∘⊢ ⊣⍟(>○≠){𝕩𝔽¨𝔽`𝕨∾¨""<⊸»𝕩}˝ (=⌜⥊¨⊣)⟜⌽</syntaxhighlight> |
|||
Output: |
|||
<syntaxhighlight lang="bqn"> "1234" LCS "1224533324" |
|||
"1234" |
|||
"thisisatest" LCS "testing123testing" |
|||
"tsitest"</syntaxhighlight> |
|||
=={{header|Bracmat}}== |
=={{header|Bracmat}}== |
||
< |
<syntaxhighlight lang="bracmat"> ( LCS |
||
= A a ta B b tb prefix |
= A a ta B b tb prefix |
||
. !arg:(?prefix.@(?A:%?a ?ta).@(?B:%?b ?tb)) |
. !arg:(?prefix.@(?A:%?a ?ta).@(?B:%?b ?tb)) |
||
Line 235: | Line 402: | ||
& :?lcs |
& :?lcs |
||
& LCS$(.thisisatest.testing123testing) |
& LCS$(.thisisatest.testing123testing) |
||
& out$(max !max lcs !lcs);</ |
& out$(max !max lcs !lcs);</syntaxhighlight> |
||
{{out}} |
{{out}} |
||
<pre>max 7 lcs t s i t e s t</pre> |
<pre>max 7 lcs t s i t e s t</pre> |
||
=={{header|C}}== |
=={{header|C}}== |
||
< |
<syntaxhighlight lang="c">#include <stdio.h> |
||
#include <stdlib.h> |
#include <stdlib.h> |
||
#include <stdio.h> |
|||
#define MAX( |
#define MAX(a, b) (a > b ? a : b) |
||
int lcs (char *a, int n, char *b, int m, char **s) { |
|||
int |
int i, j, k, t; |
||
int |
int *z = calloc((n + 1) * (m + 1), sizeof (int)); |
||
int **c = calloc((n + 1), sizeof (int *)); |
|||
for (i = 0; i <= n; i++) { |
|||
c[i] = &z[i * (m + 1)]; |
|||
} |
|||
for (i = 1; i <= n; i++) { |
|||
for (j = 1; j <= m; j++) { |
|||
const char *x, *y; |
|||
if (a[i - 1] == b[j - 1]) { |
|||
int *la = calloc(lena*lenb, sizeof( int)); |
|||
c[i][j] = c[i - 1][j - 1] + 1; |
|||
int **lengths = malloc( lena*sizeof( int*)); |
|||
for (i=0; i<lena; i++) lengths[i] = la + i*lenb; |
|||
for (i=0,x=a; *x; i++, x++) { |
|||
for (j=0,y=b; *y; j++,y++ ) { |
|||
if (*x == *y) { |
|||
lengths[i+1][j+1] = lengths[i][j] +1; |
|||
} |
} |
||
else { |
else { |
||
c[i][j] = MAX(c[i - 1][j], c[i][j - 1]); |
|||
lengths[i+1][j+1] = ml; |
|||
} |
} |
||
} |
} |
||
} |
} |
||
t = c[n][m]; |
|||
*s = malloc(t); |
|||
for (i = n, j = m, k = t - 1; k >= 0;) { |
|||
*--result = '\0'; |
|||
if (a[i - 1] == b[j - 1]) |
|||
(*s)[k] = a[i - 1], i--, j--, k--; |
|||
if ( |
else if (c[i][j - 1] > c[i - 1][j]) |
||
j--; |
|||
else |
else |
||
i--; |
|||
// assert( a[i-1] == b[j-1]); |
|||
*--result = a[i-1]; |
|||
i-=1; j-=1; |
|||
} |
|||
} |
} |
||
free( |
free(c); |
||
free(z); |
|||
return t; |
|||
}</lang> |
|||
} |
|||
</syntaxhighlight> |
|||
Testing |
Testing |
||
< |
<syntaxhighlight lang="c">int main () { |
||
char a[] = "thisisatest"; |
|||
{ |
|||
char b[] = "testing123testing"; |
|||
int n = sizeof a - 1; |
|||
int m = sizeof b - 1; |
|||
char *s = NULL; |
|||
int t = lcs(a, n, b, m, &s); |
|||
printf("%.*s\n", t, s); // tsitest |
|||
return 0; |
return 0; |
||
}</ |
}</syntaxhighlight> |
||
=={{header|C sharp|C#}}== |
|||
===With recursion=== |
===With recursion=== |
||
<syntaxhighlight lang="csharp">using System; |
|||
<lang c>#include <stdio.h> |
|||
#include <stdlib.h> |
|||
#include <string.h> |
|||
namespace LCS |
|||
char* lcs(const char *a, const char *b, char *out) |
|||
{ |
{ |
||
class Program |
|||
int longest = 0; |
|||
{ |
|||
int match(const char *a, const char *b, int dep) { |
|||
static void Main(string[] args) |
|||
if (!a || !b) return 0; |
|||
{ |
|||
string word1 = "thisisatest"; |
|||
if (dep <= longest) return 0; |
|||
string word2 = "testing123testing"; |
|||
out[ longest = dep ] = 0; |
|||
return 1; |
|||
Console.WriteLine(lcsBack(word1, word2)); |
|||
} |
|||
Console.ReadKey(); |
|||
} |
|||
public static string lcsBack(string a, string b) |
|||
if (*a == *b) |
|||
{ |
|||
return match(a + 1, b + 1, dep + 1) && (out[dep] = *a); |
|||
string aSub = a.Substring(0, (a.Length - 1 < 0) ? 0 : a.Length - 1); |
|||
string bSub = b.Substring(0, (b.Length - 1 < 0) ? 0 : b.Length - 1); |
|||
return match(a + 1, b + 1, dep) + |
|||
match(strchr(a, *b), b, dep) + |
|||
if (a.Length == 0 || b.Length == 0) |
|||
match(a, strchr(b, *a), dep); |
|||
return ""; |
|||
} |
|||
else if (a[a.Length - 1] == b[b.Length - 1]) |
|||
return lcsBack(aSub, bSub) + a[a.Length - 1]; |
|||
else |
|||
} |
|||
{ |
|||
string x = lcsBack(a, bSub); |
|||
int main() |
|||
string y = lcsBack(aSub, b); |
|||
{ |
|||
return (x.Length > y.Length) ? x : y; |
|||
char buf[128]; |
|||
} |
|||
printf("%s\n", lcs("thisisatest", "testing123testing", buf)); |
|||
} |
|||
printf("%p\n", lcs("no", "match", buf)); |
|||
} |
|||
return 0; |
|||
}</ |
}</syntaxhighlight> |
||
=={{header|C++}}== |
=={{header|C++}}== |
||
'''Hunt and Szymanski algorithm''' |
|||
<lang cpp> |
|||
<syntaxhighlight lang="cpp"> |
|||
#include <stdint.h> |
#include <stdint.h> |
||
#include <string> |
#include <string> |
||
#include <memory> // for shared_ptr<> |
|||
#include <iostream> |
#include <iostream> |
||
#include <deque> |
#include <deque> |
||
#include < |
#include <unordered_map> //[C++11] |
||
#include <map> |
|||
#include <algorithm> // for lower_bound() |
#include <algorithm> // for lower_bound() |
||
#include <iterator> // for next() and prev() |
|||
using namespace std; |
using namespace std; |
||
class |
class LCS { |
||
protected: |
protected: |
||
// |
// Instances of the Pair linked list class are used to recover the LCS: |
||
class Pair { |
class Pair { |
||
public: |
public: |
||
uint32_t index1; |
uint32_t index1; |
||
uint32_t index2; |
uint32_t index2; |
||
Pair |
shared_ptr<Pair> next; |
||
Pair(uint32_t index1, uint32_t index2, Pair |
Pair(uint32_t index1, uint32_t index2, shared_ptr<Pair> next = nullptr) |
||
: index1(index1), index2(index2), next(next) { |
: index1(index1), index2(index2), next(next) { |
||
} |
} |
||
static |
static shared_ptr<Pair> Reverse(const shared_ptr<Pair> pairs) { |
||
shared_ptr<Pair> head = nullptr; |
|||
for (auto next = pairs; next != nullptr; next = next->next) |
for (auto next = pairs; next != nullptr; next = next->next) |
||
head = make_shared<Pair>(next->index1, next->index2, head); |
|||
length++; |
|||
return length; |
|||
} |
|||
static Pair* Copy(const Pair* pairs) { |
|||
Pair* head = nullptr; |
|||
Pair* rest = nullptr; |
|||
for (auto next = pairs; next != nullptr; next = next->next) { |
|||
auto pair = new Pair(next->index1, next->index2); |
|||
rest = rest == nullptr ? head = pair : rest->next = pair; |
|||
} |
|||
return head; |
return head; |
||
} |
|||
// Non-mutating Reverse |
|||
static Pair* Reverse(const Pair* pairs) { |
|||
return MuReverse(Copy(pairs)); |
|||
} |
|||
// Mutating Reverse |
|||
static Pair* MuReverse(Pair* pairs) { |
|||
Pair* last = nullptr; // Return Value |
|||
while (pairs != nullptr) { |
|||
auto next = pairs->next; // save next |
|||
pairs->next = last; // update next |
|||
last = pairs; // update last |
|||
pairs = next; // advance |
|||
} |
|||
return last; |
|||
} |
} |
||
}; |
}; |
||
typedef deque<shared_ptr<Pair>> PAIRS; |
|||
// |
|||
// A small symbol set can generate a large O(m*n) number |
|||
// of pairs where two input sequences match, given sequence |
|||
// lengths m and n. This is the case in the Bioinformatics |
|||
// applications of nucleotide and protein sequencing. |
|||
// |
|||
// Here a "divide and conquer" approach devised by Hirschberg |
|||
// can limits the space required to O(m+n). However, the |
|||
// approach still requires O(m*n) time even in the best case. |
|||
// |
|||
// This quadratic time dependency may be prohibitive, given |
|||
// the great length of the input sequences. So, heuristics |
|||
// may be favored over optimal Dynamic Programming solutions. |
|||
// |
|||
// In the application of comparing source file revisions, |
|||
// records form a sparse symbol space and generate a linear |
|||
// O(m+n) number of pairs where the input sequences match. |
|||
// |
|||
// A binary search optimization due to Hunt and Szymanski |
|||
// can be applied in this case, which results in expected |
|||
// performance of O(s log s) where s = m+n. In the worst |
|||
// case, performance degrades to O(m*n log s) time as the |
|||
// number of matches [and the space required to represent |
|||
// them] grows to O(m*n). |
|||
// |
|||
// References: |
|||
// |
|||
//"A linear space algorithm for computing maximal common subsequences" |
|||
// by Daniel S. Hirschberg, published June 1975 |
|||
// Communications of the ACM [Volume 18, Number 6, pp. 341–343] |
|||
// |
|||
//"A Fast Algorithm for Computing Longest Common Subsequences" |
|||
// by James W. Hunt and Thomas G. Szymanski, published May 1977 |
|||
// Communications of the ACM [Volume 20, Number 5, pp. 350-353] |
|||
// |
|||
typedef vector<Pair*> PAIRS; |
|||
typedef deque<uint32_t> INDEXES; |
typedef deque<uint32_t> INDEXES; |
||
typedef |
typedef unordered_map<char, INDEXES> CHAR_TO_INDEXES_MAP; |
||
typedef |
typedef deque<INDEXES*> MATCHES; |
||
typedef vector<uint32_t> THRESHOLD; |
|||
static uint32_t FindLCS( |
|||
Pair* LCS(MATCHES& matches) { |
|||
MATCHES& indexesOf2MatchedByIndex1, shared_ptr<Pair>* pairs) { |
|||
PAIRS traces; |
|||
auto traceLCS = pairs != nullptr; |
|||
THRESHOLD threshold; |
|||
PAIRS chains; |
|||
INDEXES prefixEnd; |
|||
// |
// |
||
// |
//[Assert]After each index1 iteration prefixEnd[index3] is the least index2 |
||
// such that the LCS of s1[0:index1] and s2[0:index2] has length index3 + 1 |
// such that the LCS of s1[0:index1] and s2[0:index2] has length index3 + 1 |
||
// |
// |
||
uint32_t index1 = 0; |
uint32_t index1 = 0; |
||
for (const auto& it1 : |
for (const auto& it1 : indexesOf2MatchedByIndex1) { |
||
auto dq2 = *it1; |
|||
auto limit = prefixEnd.end(); |
|||
for (auto it2 = dq2.rbegin(); it2 != dq2.rend(); it2++) { |
|||
// Each index1, index2 pair corresponds to a match |
|||
auto index2 = *it2; |
|||
// |
|||
// Note: The reverse iterator it2 visits index2 values in descending order, |
|||
{ |
|||
// allowing in-place update of prefixEnd[]. std::lower_bound() is used to |
|||
auto index2 = it2; |
|||
// perform a binary search. |
|||
// |
|||
auto it = lower_bound(threshold.begin(), threshold.end(), index2); |
|||
limit = lower_bound(prefixEnd.begin(), limit, index2); |
|||
// |
|||
// Look ahead to the next index2 value to optimize Pairs used by the Hunt |
|||
// and Szymanski algorithm. If the next index2 is also an improvement on |
|||
// the value currently held in prefixEnd[index3], a new Pair will only be |
|||
// superseded on the next index2 iteration. |
|||
// |
|||
// Verify that a next index2 value exists; and that this value is greater |
|||
// than the final index2 value of the LCS prefix at prev(limit): |
|||
// |
|||
auto preferNextIndex2 = next(it2) != dq2.rend() && |
|||
(limit == prefixEnd.begin() || *prev(limit) < *next(it2)); |
|||
// |
|||
// Depending on match redundancy, this optimization may reduce the number |
|||
// of Pair allocations by factors ranging from 2 up to 10 or more. |
|||
// |
|||
if (preferNextIndex2) continue; |
|||
auto index3 = distance(prefixEnd.begin(), limit); |
|||
// insert case |
|||
if (limit == prefixEnd.end()) { |
|||
// Insert Case |
|||
auto prefix = index3 > 0 ? traces[index3 - 1] : nullptr; |
|||
prefixEnd.push_back(index2); |
|||
// Refresh limit iterator: |
|||
limit = prev(prefixEnd.end()); |
|||
if (traceLCS) { |
|||
chains.push_back(pushPair(chains, index3, index1, index2)); |
|||
} |
} |
||
} |
|||
else if (index2 < *limit) { |
|||
// Update Case |
|||
// Update limit value: |
|||
auto prefix = index3 > 0 ? traces[index3 - 1] : nullptr; |
|||
*limit = index2; |
|||
if (traceLCS) { |
|||
chains[index3] = pushPair(chains, index3, index1, index2); |
|||
} |
} |
||
} |
} |
||
} // next index2 |
|||
} |
|||
index1++; |
index1++; |
||
} // next index1 |
|||
if (traceLCS) { |
|||
// Return the LCS as a linked list of matched index pairs: |
|||
auto last = chains.empty() ? nullptr : chains.back(); |
|||
// Reverse longest chain |
|||
*pairs = Pair::Reverse(last); |
|||
} |
} |
||
auto length = prefixEnd.size(); |
|||
return length; |
|||
} |
} |
||
private: |
|||
static shared_ptr<Pair> pushPair( |
|||
PAIRS& chains, const ptrdiff_t& index3, uint32_t& index1, uint32_t& index2) { |
|||
auto prefix = index3 > 0 ? chains[index3 - 1] : nullptr; |
|||
return make_shared<Pair>(index1, index2, prefix); |
|||
} |
|||
protected: |
|||
// |
// |
||
// Match() avoids |
// Match() avoids m*n comparisons by using CHAR_TO_INDEXES_MAP to |
||
// |
// achieve O(m+n) performance, where m and n are the input lengths. |
||
// |
// |
||
// The |
// The lookup time can be assumed constant in the case of characters. |
||
// |
// The symbol space is larger in the case of records; but the lookup |
||
// time will be O(log(m+n)), at most. |
|||
// of characters. |
|||
// |
// |
||
static void Match( |
|||
uint32_t Match(CHAR2INDEXES& indexes, MATCHES& matches, |
|||
CHAR_TO_INDEXES_MAP& indexesOf2MatchedByChar, MATCHES& indexesOf2MatchedByIndex1, |
|||
const string& s1, const string& s2) { |
const string& s1, const string& s2) { |
||
uint32_t count = 0; |
|||
uint32_t index = 0; |
uint32_t index = 0; |
||
for (const auto& it : s2) |
for (const auto& it : s2) |
||
indexesOf2MatchedByChar[it].push_back(index++); |
|||
for (const auto& it : s1) { |
for (const auto& it : s1) { |
||
auto& |
auto& dq2 = indexesOf2MatchedByChar[it]; |
||
indexesOf2MatchedByIndex1.push_back(&dq2); |
|||
count += deque.size(); |
|||
} |
} |
||
} |
|||
static string Select(shared_ptr<Pair> pairs, uint32_t length, |
|||
return count; |
|||
bool right, const string& s1, const string& s2) { |
|||
string buffer; |
|||
buffer.reserve(length); |
|||
for (auto next = pairs; next != nullptr; next = next->next) { |
|||
auto c = right ? s2[next->index2] : s1[next->index1]; |
|||
buffer.push_back(c); |
|||
} |
|||
return buffer; |
|||
} |
} |
||
public: |
public: |
||
static string Correspondence(const string& s1, const string& s2) { |
|||
CHAR_TO_INDEXES_MAP indexesOf2MatchedByChar; |
|||
CHAR2INDEXES indexes; |
|||
MATCHES |
MATCHES indexesOf2MatchedByIndex1; // holds references into indexesOf2MatchedByChar |
||
Match(indexesOf2MatchedByChar, indexesOf2MatchedByIndex1, s1, s2); |
|||
shared_ptr<Pair> pairs; // obtain the LCS as index pairs |
|||
return LCS(matches); |
|||
auto length = FindLCS(indexesOf2MatchedByIndex1, &pairs); |
|||
return Select(pairs, length, false, s1, s2); |
|||
} |
} |
||
};</syntaxhighlight> |
|||
string Correspondence(const string& s1, const string& s2) { |
|||
auto pairs = Compare(s1, s2); |
|||
string buffer; |
|||
buffer.reserve(Pair::Length(pairs)); |
|||
for (auto next = pairs; next != nullptr; next = next->next) |
|||
buffer.push_back(s1[next->index1]); |
|||
return buffer; |
|||
} |
|||
}; |
|||
</lang> |
|||
'''Example''': |
'''Example''': |
||
< |
<syntaxhighlight lang="cpp"> |
||
auto s = LCS::Correspondence(s1, s2); |
|||
Diff diff; |
|||
cout << s << endl;</syntaxhighlight> |
|||
auto s = diff.Correspondence("thisisatest", "testing123testing"); |
|||
cout << s << endl; |
|||
</lang> |
|||
More fully featured examples are available at [https://github.com/CNHume/Samples/tree/master/C%2B%2B/LCS Samples/C++/LCS]. |
|||
=={{header|C sharp|C#}}== |
|||
<lang csharp>using System; |
|||
namespace LCS |
|||
{ |
|||
class Program |
|||
{ |
|||
static void Main(string[] args) |
|||
{ |
|||
string word1 = "thisisatest"; |
|||
string word2 = "testing123testing"; |
|||
Console.WriteLine(lcsBack(word1, word2)); |
|||
Console.ReadKey(); |
|||
} |
|||
public static string lcsBack(string a, string b) |
|||
{ |
|||
string aSub = a.Substring(0, (a.Length - 1 < 0) ? 0 : a.Length - 1); |
|||
string bSub = b.Substring(0, (b.Length - 1 < 0) ? 0 : b.Length - 1); |
|||
if (a.Length == 0 || b.Length == 0) |
|||
return ""; |
|||
else if (a[a.Length - 1] == b[b.Length - 1]) |
|||
return lcsBack(aSub, bSub) + a[a.Length - 1]; |
|||
else |
|||
{ |
|||
string x = lcsBack(a, bSub); |
|||
string y = lcsBack(aSub, b); |
|||
return (x.Length > y.Length) ? x : y; |
|||
} |
|||
} |
|||
} |
|||
}</lang> |
|||
=={{header|Clojure}}== |
=={{header|Clojure}}== |
||
Based on algorithm from Wikipedia. |
Based on algorithm from Wikipedia. |
||
< |
<syntaxhighlight lang="clojure">(defn longest [xs ys] (if (> (count xs) (count ys)) xs ys)) |
||
(def lcs |
(def lcs |
||
(memoize |
(memoize |
||
(fn [[x & xs] [y & ys]] |
(fn [[x & xs] [y & ys]] |
||
(cond |
(cond |
||
(or (= x nil) (= y nil) |
(or (= x nil) (= y nil)) nil |
||
(= x y) (cons x (lcs xs ys)) |
(= x y) (cons x (lcs xs ys)) |
||
:else (longest (lcs (cons x xs) ys) |
:else (longest (lcs (cons x xs) ys) |
||
(lcs xs (cons y ys)))))))</syntaxhighlight> |
|||
=={{header|CoffeeScript}}== |
=={{header|CoffeeScript}}== |
||
< |
<syntaxhighlight lang="coffeescript"> |
||
lcs = (s1, s2) -> |
lcs = (s1, s2) -> |
||
len1 = s1.length |
len1 = s1.length |
||
Line 601: | Line 712: | ||
s1 = "thisisatest" |
s1 = "thisisatest" |
||
s2 = "testing123testing" |
s2 = "testing123testing" |
||
console.log lcs(s1, s2)</ |
console.log lcs(s1, s2)</syntaxhighlight> |
||
=={{header|Common Lisp}}== |
=={{header|Common Lisp}}== |
||
Here's a memoizing/dynamic-programming solution that uses an <var>n × m</var> array where <var>n</var> and <var>m</var> are the lengths of the input arrays. The first return value is a sequence (of the same type as array1) which is the longest common subsequence. The second return value is the length of the longest common subsequence. |
Here's a memoizing/dynamic-programming solution that uses an <var>n × m</var> array where <var>n</var> and <var>m</var> are the lengths of the input arrays. The first return value is a sequence (of the same type as array1) which is the longest common subsequence. The second return value is the length of the longest common subsequence. |
||
< |
<syntaxhighlight lang="lisp">(defun longest-common-subsequence (array1 array2) |
||
(let* ((l1 (length array1)) |
(let* ((l1 (length array1)) |
||
(l2 (length array2)) |
(l2 (length array2)) |
||
Line 632: | Line 743: | ||
b))))))))) |
b))))))))) |
||
(destructuring-bind (seq len) (lcs 0 0) |
(destructuring-bind (seq len) (lcs 0 0) |
||
(values (coerce seq (type-of array1)) len)))))</ |
(values (coerce seq (type-of array1)) len)))))</syntaxhighlight> |
||
For example, |
For example, |
||
< |
<syntaxhighlight lang="lisp">(longest-common-subsequence "123456" "1a2b3c")</syntaxhighlight> |
||
produces the two values |
produces the two values |
||
< |
<syntaxhighlight lang="lisp">"123" |
||
3</ |
3</syntaxhighlight> |
||
===An alternative adopted from Clojure=== |
===An alternative adopted from Clojure=== |
||
Line 647: | Line 758: | ||
Here is another version with its own memoization macro: |
Here is another version with its own memoization macro: |
||
< |
<syntaxhighlight lang="lisp">(defmacro mem-defun (name args body) |
||
(let ((hash-name (gensym))) |
(let ((hash-name (gensym))) |
||
`(let ((,hash-name (make-hash-table :test 'equal))) |
`(let ((,hash-name (make-hash-table :test 'equal))) |
||
Line 660: | Line 771: | ||
((equal (car xs) (car ys)) (cons (car xs) (lcs (cdr xs) (cdr ys)))) |
((equal (car xs) (car ys)) (cons (car xs) (lcs (cdr xs) (cdr ys)))) |
||
(t (longer (lcs (cdr xs) ys) |
(t (longer (lcs (cdr xs) ys) |
||
(lcs xs (cdr ys)))))))</ |
(lcs xs (cdr ys)))))))</syntaxhighlight> |
||
When we test it, we get: |
When we test it, we get: |
||
< |
<syntaxhighlight lang="lisp">(coerce (lcs (coerce "thisisatest" 'list) (coerce "testing123testing" 'list)) 'string)))) |
||
"tsitest"</ |
"tsitest"</syntaxhighlight> |
||
=={{header|D}}== |
=={{header|D}}== |
||
Line 672: | Line 783: | ||
===Recursive version=== |
===Recursive version=== |
||
< |
<syntaxhighlight lang="d">import std.stdio, std.array; |
||
T[] lcs(T)(in T[] a, in T[] b) pure nothrow @safe { |
T[] lcs(T)(in T[] a, in T[] b) pure nothrow @safe { |
||
Line 684: | Line 795: | ||
void main() { |
void main() { |
||
lcs("thisisatest", "testing123testing").writeln; |
lcs("thisisatest", "testing123testing").writeln; |
||
}</ |
}</syntaxhighlight> |
||
{{out}} |
{{out}} |
||
<pre>tsitest</pre> |
<pre>tsitest</pre> |
||
Line 690: | Line 801: | ||
===Faster dynamic programming version=== |
===Faster dynamic programming version=== |
||
The output is the same. |
The output is the same. |
||
< |
<syntaxhighlight lang="d">import std.stdio, std.algorithm, std.traits; |
||
T[] lcs(T)(in T[] a, in T[] b) pure /*nothrow*/ { |
T[] lcs(T)(in T[] a, in T[] b) pure /*nothrow*/ { |
||
Line 719: | Line 830: | ||
void main() { |
void main() { |
||
lcs("thisisatest", "testing123testing").writeln; |
lcs("thisisatest", "testing123testing").writeln; |
||
}</ |
}</syntaxhighlight> |
||
===Hirschberg algorithm version=== |
===Hirschberg algorithm version=== |
||
Line 728: | Line 839: | ||
Adapted from Python code: http://wordaligned.org/articles/longest-common-subsequence |
Adapted from Python code: http://wordaligned.org/articles/longest-common-subsequence |
||
< |
<syntaxhighlight lang="d">import std.stdio, std.algorithm, std.range, std.array, std.string, std.typecons; |
||
uint[] lensLCS(R)(R xs, R ys) pure nothrow @safe { |
uint[] lensLCS(R)(R xs, R ys) pure nothrow @safe { |
||
Line 788: | Line 899: | ||
void main() { |
void main() { |
||
lcsString("thisisatest", "testing123testing").writeln; |
lcsString("thisisatest", "testing123testing").writeln; |
||
}</ |
}</syntaxhighlight> |
||
=={{header|Dart}}== |
=={{header|Dart}}== |
||
< |
<syntaxhighlight lang="dart">import 'dart:math'; |
||
String lcsRecursion(String a, String b) { |
String lcsRecursion(String a, String b) { |
||
Line 858: | Line 969: | ||
print("lcsRecursion('x', 'x') = ${lcsRecursion('x', 'x')}"); |
print("lcsRecursion('x', 'x') = ${lcsRecursion('x', 'x')}"); |
||
} |
} |
||
</syntaxhighlight> |
|||
</lang> |
|||
{{out}} |
{{out}} |
||
<pre>lcsDynamic('1234', '1224533324') = 1234 |
<pre>lcsDynamic('1234', '1224533324') = 1234 |
||
Line 869: | Line 980: | ||
lcsRecursion('', 'x') = |
lcsRecursion('', 'x') = |
||
lcsRecursion('x', 'x') = x</pre> |
lcsRecursion('x', 'x') = x</pre> |
||
=={{header|EasyLang}}== |
|||
{{trans|BASIC256}} |
|||
<syntaxhighlight> |
|||
func$ right a$ n . |
|||
return substr a$ (len a$ - n + 1) n |
|||
. |
|||
func$ left a$ n . |
|||
if n < 0 |
|||
n = len a$ + n |
|||
. |
|||
return substr a$ 1 n |
|||
. |
|||
func$ lcs a$ b$ . |
|||
if len a$ = 0 or len b$ = 0 |
|||
return "" |
|||
. |
|||
if right a$ 1 = right b$ 1 |
|||
return lcs left a$ -1 left b$ -1 & right a$ 1 |
|||
. |
|||
x$ = lcs a$ left b$ -1 |
|||
y$ = lcs left a$ -1 b$ |
|||
if len x$ > len y$ |
|||
return x$ |
|||
else |
|||
return y$ |
|||
. |
|||
. |
|||
print lcs "1234" "1224533324" |
|||
print lcs "thisisatest" "testing123testing" |
|||
</syntaxhighlight> |
|||
{{out}} |
|||
<pre> |
|||
1234 |
|||
tsitest |
|||
</pre> |
|||
=={{header|Egison}}== |
=={{header|Egison}}== |
||
< |
<syntaxhighlight lang="egison"> |
||
(define $common-seqs |
(define $common-seqs |
||
(lambda [$xs $ys] |
(lambda [$xs $ys] |
||
Line 881: | Line 1,029: | ||
(define $lcs (compose common-seqs rac)) |
(define $lcs (compose common-seqs rac)) |
||
</syntaxhighlight> |
|||
</lang> |
|||
'''Output:''' |
'''Output:''' |
||
< |
<syntaxhighlight lang="egison"> |
||
> (lcs "thisisatest" "testing123testing")) |
> (lcs "thisisatest" "testing123testing")) |
||
"tsitest" |
"tsitest" |
||
</syntaxhighlight> |
|||
</lang> |
|||
=={{header|Elixir}}== |
|||
{{works with|Elixir|1.3}} |
|||
===Simple recursion=== |
|||
This solution is Brute force. It is slow |
|||
{{trans|Ruby}} |
|||
<syntaxhighlight lang="elixir">defmodule LCS do |
|||
def lcs(a, b) do |
|||
lcs(to_charlist(a), to_charlist(b), []) |> to_string |
|||
end |
|||
defp lcs([h|at], [h|bt], res), do: lcs(at, bt, [h|res]) |
|||
defp lcs([_|at]=a, [_|bt]=b, res) do |
|||
Enum.max_by([lcs(a, bt, res), lcs(at, b, res)], &length/1) |
|||
end |
|||
defp lcs(_, _, res), do: res |> Enum.reverse |
|||
end |
|||
IO.puts LCS.lcs("thisisatest", "testing123testing") |
|||
IO.puts LCS.lcs('1234','1224533324')</syntaxhighlight> |
|||
===Dynamic Programming=== |
|||
{{trans|Erlang}} |
|||
<syntaxhighlight lang="elixir">defmodule LCS do |
|||
def lcs_length(s,t), do: lcs_length(s,t,Map.new) |> elem(0) |
|||
defp lcs_length([],t,cache), do: {0,Map.put(cache,{[],t},0)} |
|||
defp lcs_length(s,[],cache), do: {0,Map.put(cache,{s,[]},0)} |
|||
defp lcs_length([h|st]=s,[h|tt]=t,cache) do |
|||
{l,c} = lcs_length(st,tt,cache) |
|||
{l+1,Map.put(c,{s,t},l+1)} |
|||
end |
|||
defp lcs_length([_sh|st]=s,[_th|tt]=t,cache) do |
|||
if Map.has_key?(cache,{s,t}) do |
|||
{Map.get(cache,{s,t}),cache} |
|||
else |
|||
{l1,c1} = lcs_length(s,tt,cache) |
|||
{l2,c2} = lcs_length(st,t,c1) |
|||
l = max(l1,l2) |
|||
{l,Map.put(c2,{s,t},l)} |
|||
end |
|||
end |
|||
def lcs(s,t) do |
|||
{s,t} = {to_charlist(s),to_charlist(t)} |
|||
{_,c} = lcs_length(s,t,Map.new) |
|||
lcs(s,t,c,[]) |> to_string |
|||
end |
|||
defp lcs([],_,_,acc), do: Enum.reverse(acc) |
|||
defp lcs(_,[],_,acc), do: Enum.reverse(acc) |
|||
defp lcs([h|st],[h|tt],cache,acc), do: lcs(st,tt,cache,[h|acc]) |
|||
defp lcs([_sh|st]=s,[_th|tt]=t,cache,acc) do |
|||
if Map.get(cache,{s,tt}) > Map.get(cache,{st,t}) do |
|||
lcs(s,tt,cache,acc) |
|||
else |
|||
lcs(st,t,cache,acc) |
|||
end |
|||
end |
|||
end |
|||
IO.puts LCS.lcs("thisisatest","testing123testing") |
|||
IO.puts LCS.lcs("1234","1224533324")</syntaxhighlight> |
|||
{{out}} |
|||
<pre> |
|||
tsitest |
|||
1234 |
|||
</pre> |
|||
Referring to LCS [[Shortest common supersequence#Elixir|here]]. |
|||
=={{header|Erlang}}== |
=={{header|Erlang}}== |
||
This implementation also includes the ability to calculate the length of the longest common subsequence. In calculating that length, we generate a cache which can be traversed to generate the longest common subsequence. |
This implementation also includes the ability to calculate the length of the longest common subsequence. In calculating that length, we generate a cache which can be traversed to generate the longest common subsequence. |
||
< |
<syntaxhighlight lang="erlang"> |
||
module(lcs). |
module(lcs). |
||
-compile(export_all). |
-compile(export_all). |
||
Line 932: | Line 1,150: | ||
lcs(ST,T,Cache,Acc) |
lcs(ST,T,Cache,Acc) |
||
end. |
end. |
||
</syntaxhighlight> |
|||
</lang> |
|||
'''Output:''' |
'''Output:''' |
||
< |
<syntaxhighlight lang="erlang"> |
||
77> lcs:lcs("thisisatest","testing123testing"). |
77> lcs:lcs("thisisatest","testing123testing"). |
||
"tsitest" |
"tsitest" |
||
78> lcs:lcs("1234","1224533324"). |
78> lcs:lcs("1234","1224533324"). |
||
"1234" |
"1234" |
||
</syntaxhighlight> |
|||
</lang> |
|||
We can also use the process dictionary to memoize the recursive implementation: |
We can also use the process dictionary to memoize the recursive implementation: |
||
< |
<syntaxhighlight lang="erlang"> |
||
lcs(Xs0, Ys0) -> |
lcs(Xs0, Ys0) -> |
||
CacheKey = {lcs_cache, Xs0, Ys0}, |
CacheKey = {lcs_cache, Xs0, Ys0}, |
||
Line 967: | Line 1,185: | ||
Result |
Result |
||
end. |
end. |
||
</syntaxhighlight> |
|||
</lang> |
|||
Similar to the above, but without using the process dictionary: |
|||
<syntaxhighlight lang="erlang"> |
|||
-module(lcs). |
|||
%% API exports |
|||
-export([ |
|||
lcs/2 |
|||
]). |
|||
%%==================================================================== |
|||
%% API functions |
|||
%%==================================================================== |
|||
lcs(A, B) -> |
|||
{LCS, _Cache} = get_lcs(A, B, [], #{}), |
|||
lists:reverse(LCS). |
|||
%%==================================================================== |
|||
%% Internal functions |
|||
%%===================================================== |
|||
get_lcs(A, B, Acc, Cache) -> |
|||
case maps:find({A, B, Acc}, Cache) of |
|||
{ok, LCS} -> {LCS, Cache}; |
|||
error -> |
|||
{NewLCS, NewCache} = compute_lcs(A, B, Acc, Cache), |
|||
{NewLCS, NewCache#{ {A, B, Acc} => NewLCS }} |
|||
end. |
|||
compute_lcs(A, B, Acc, Cache) when length(A) == 0 orelse length(B) == 0 -> |
|||
{Acc, Cache}; |
|||
compute_lcs([Token |ATail], [Token |BTail], Acc, Cache) -> |
|||
get_lcs(ATail, BTail, [Token |Acc], Cache); |
|||
compute_lcs([_AToken |ATail]=A, [_BToken |BTail]=B, Acc, Cache) -> |
|||
{LCSA, CacheA} = get_lcs(A, BTail, Acc, Cache), |
|||
{LCSB, CacheB} = get_lcs(ATail, B, Acc, CacheA), |
|||
LCS = case length(LCSA) > length(LCSB) of |
|||
true -> LCSA; |
|||
false -> LCSB |
|||
end, |
|||
{LCS, CacheB}. |
|||
</syntaxhighlight> |
|||
'''Output:''' |
|||
<syntaxhighlight lang="erlang"> |
|||
48> lcs:lcs("thisisatest", "testing123testing"). |
|||
"tsitest" |
|||
</syntaxhighlight> |
|||
=={{header|F Sharp|F#}}== |
|||
Copied and slightly adapted from OCaml (direct recursion) |
|||
<syntaxhighlight lang="fsharp">open System |
|||
let longest xs ys = if List.length xs > List.length ys then xs else ys |
|||
let rec lcs a b = |
|||
match a, b with |
|||
| [], _ |
|||
| _, [] -> [] |
|||
| x::xs, y::ys -> |
|||
if x = y then |
|||
x :: lcs xs ys |
|||
else |
|||
longest (lcs a ys) (lcs xs b) |
|||
[<EntryPoint>] |
|||
let main argv = |
|||
let split (str:string) = List.init str.Length (fun i -> str.[i]) |
|||
printfn "%A" (String.Join("", |
|||
(lcs (split "thisisatest") (split "testing123testing")))) |
|||
0 |
|||
</syntaxhighlight> |
|||
=={{header|Factor}}== |
|||
<syntaxhighlight lang="factor">USE: lcs |
|||
"thisisatest" "testing123testing" lcs print</syntaxhighlight> |
|||
{{out}} |
|||
<pre> |
|||
tsitest |
|||
</pre> |
|||
=={{header|Fortran}}== |
=={{header|Fortran}}== |
||
Line 973: | Line 1,272: | ||
Using the <tt>iso_varying_string</tt> module which can be found [http://www.fortran.com/iso_varying_string.f95 here] (or equivalent module conforming to the ISO/IEC 1539-2:2000 API or to a subset according to the need of this code: <code>char</code>, <code>len</code>, <code>//</code>, <code>extract</code>, <code>==</code>, <code>=</code>) |
Using the <tt>iso_varying_string</tt> module which can be found [http://www.fortran.com/iso_varying_string.f95 here] (or equivalent module conforming to the ISO/IEC 1539-2:2000 API or to a subset according to the need of this code: <code>char</code>, <code>len</code>, <code>//</code>, <code>extract</code>, <code>==</code>, <code>=</code>) |
||
< |
<syntaxhighlight lang="fortran">program lcstest |
||
use iso_varying_string |
use iso_varying_string |
||
implicit none |
implicit none |
||
Line 1,010: | Line 1,309: | ||
end function lcs |
end function lcs |
||
end program lcstest</ |
end program lcstest</syntaxhighlight> |
||
=={{header|F Sharp|F#}}== |
|||
Copied and slightly adapted from OCaml (direct recursion) |
|||
<lang fsharp>open System |
|||
let longest xs ys = if List.length xs > List.length ys then xs else ys |
|||
let rec lcs a b = |
|||
match a, b with |
|||
| [], _ |
|||
| _, [] -> [] |
|||
| x::xs, y::ys -> |
|||
if x = y then |
|||
x :: lcs xs ys |
|||
else |
|||
longest (lcs a ys) (lcs xs b) |
|||
=={{header|FreeBASIC}}== |
|||
[<EntryPoint>] |
|||
<syntaxhighlight lang="freebasic">Function LCS(a As String, b As String) As String |
|||
let main argv = |
|||
Dim As String x, y |
|||
let split (str:string) = List.init str.Length (fun i -> str.[i]) |
|||
If Len(a) = 0 Or Len(b) = 0 Then |
|||
printfn "%A" (String.Join("", |
|||
Return "" |
|||
(lcs (split "thisisatest") (split "testing123testing")))) |
|||
Elseif Right(a, 1) = Right(b, 1) Then |
|||
0 |
|||
LCS = LCS(Left(a, Len(a) - 1), Left(b, Len(b) - 1)) + Right(a, 1) |
|||
</lang> |
|||
Else |
|||
x = LCS(a, Left(b, Len(b) - 1)) |
|||
y = LCS(Left(a, Len(a) - 1), b) |
|||
If Len(x) > Len(y) Then Return x Else Return y |
|||
End If |
|||
End Function |
|||
Print LCS("1234", "1224533324") |
|||
Print LCS("thisisatest", "testing123testing") |
|||
Sleep</syntaxhighlight> |
|||
=={{header|Go}}== |
=={{header|Go}}== |
||
Line 1,039: | Line 1,335: | ||
===Recursion=== |
===Recursion=== |
||
Brute force |
Brute force |
||
< |
<syntaxhighlight lang="go">func lcs(a, b string) string { |
||
aLen := len(a) |
aLen := len(a) |
||
bLen := len(b) |
bLen := len(b) |
||
Line 1,053: | Line 1,349: | ||
} |
} |
||
return y |
return y |
||
}</ |
}</syntaxhighlight> |
||
===Dynamic Programming=== |
===Dynamic Programming=== |
||
< |
<syntaxhighlight lang="go">func lcs(a, b string) string { |
||
arunes := []rune(a) |
arunes := []rune(a) |
||
brunes := []rune(b) |
brunes := []rune(b) |
||
Line 1,097: | Line 1,393: | ||
} |
} |
||
return string(s) |
return string(s) |
||
}</ |
}</syntaxhighlight> |
||
=={{header|Groovy}}== |
|||
Recursive solution: |
|||
<syntaxhighlight lang="groovy">def lcs(xstr, ystr) { |
|||
if (xstr == "" || ystr == "") { |
|||
return ""; |
|||
} |
|||
def x = xstr[0]; |
|||
def y = ystr[0]; |
|||
def xs = xstr.size() > 1 ? xstr[1..-1] : ""; |
|||
def ys = ystr.size() > 1 ? ystr[1..-1] : ""; |
|||
if (x == y) { |
|||
return (x + lcs(xs, ys)); |
|||
} |
|||
def lcs1 = lcs(xstr, ys); |
|||
def lcs2 = lcs(xs, ystr); |
|||
lcs1.size() > lcs2.size() ? lcs1 : lcs2; |
|||
} |
|||
println(lcs("1234", "1224533324")); |
|||
println(lcs("thisisatest", "testing123testing"));</syntaxhighlight> |
|||
{{out}} |
|||
<pre>1234 |
|||
tsitest</pre> |
|||
=={{header|Haskell}}== |
=={{header|Haskell}}== |
||
Line 1,103: | Line 1,428: | ||
The [[wp:Longest_common_subsequence#Solution_for_two_sequences|Wikipedia solution]] translates directly into Haskell, with the only difference that equal characters are added in front: |
The [[wp:Longest_common_subsequence#Solution_for_two_sequences|Wikipedia solution]] translates directly into Haskell, with the only difference that equal characters are added in front: |
||
< |
<syntaxhighlight lang="haskell">longest xs ys = if length xs > length ys then xs else ys |
||
lcs [] _ = [] |
lcs [] _ = [] |
||
Line 1,109: | Line 1,434: | ||
lcs (x:xs) (y:ys) |
lcs (x:xs) (y:ys) |
||
| x == y = x : lcs xs ys |
| x == y = x : lcs xs ys |
||
| otherwise = longest (lcs (x:xs) ys) (lcs xs (y:ys))</ |
| otherwise = longest (lcs (x:xs) ys) (lcs xs (y:ys))</syntaxhighlight> |
||
A Memoized version of the naive algorithm. |
A Memoized version of the naive algorithm. |
||
< |
<syntaxhighlight lang="haskell">import qualified Data.MemoCombinators as M |
||
lcs = memoize lcsm |
lcs = memoize lcsm |
||
Line 1,125: | Line 1,450: | ||
maxl x y = if length x > length y then x else y |
maxl x y = if length x > length y then x else y |
||
memoize = M.memo2 mString mString |
memoize = M.memo2 mString mString |
||
mString = M.list M.char -- Chars, but you can specify any type you need for the memo</ |
mString = M.list M.char -- Chars, but you can specify any type you need for the memo</syntaxhighlight> |
||
Memoization (aka dynamic programming) of that uses ''zip'' to make both the index and the character available: |
Memoization (aka dynamic programming) of that uses ''zip'' to make both the index and the character available: |
||
< |
<syntaxhighlight lang="haskell">import Data.Array |
||
lcs xs ys = a!(0,0) where |
lcs xs ys = a!(0,0) where |
||
Line 1,140: | Line 1,465: | ||
f x y i j |
f x y i j |
||
| x == y = x : a!(i+1,j+1) |
| x == y = x : a!(i+1,j+1) |
||
| otherwise = longest (a!(i,j+1)) (a!(i+1,j))</ |
| otherwise = longest (a!(i,j+1)) (a!(i+1,j))</syntaxhighlight> |
||
All 3 solutions work of course not only with strings, but also with any other list. Example: |
All 3 solutions work of course not only with strings, but also with any other list. Example: |
||
< |
<syntaxhighlight lang="haskell">*Main> lcs "thisisatest" "testing123testing" |
||
"tsitest"</ |
"tsitest"</syntaxhighlight> |
||
The dynamic programming version without using arrays: |
The dynamic programming version without using arrays: |
||
< |
<syntaxhighlight lang="haskell">import Data.List |
||
longest xs ys = if length xs > length ys then xs else ys |
longest xs ys = if length xs > length ys then xs else ys |
||
Line 1,154: | Line 1,479: | ||
f [a,b] [c,d] |
f [a,b] [c,d] |
||
| null a = longest b c: [b] |
| null a = longest b c: [b] |
||
| otherwise = (a++d):[b]</ |
| otherwise = (a++d):[b]</syntaxhighlight> |
||
Simple and slow solution: |
Simple and slow solution: |
||
< |
<syntaxhighlight lang="haskell">import Data.Ord |
||
import Data.List |
import Data.List |
||
Line 1,165: | Line 1,490: | ||
lcs xs ys = maximumBy (comparing length) $ intersect (subsequences xs) (subsequences ys) |
lcs xs ys = maximumBy (comparing length) $ intersect (subsequences xs) (subsequences ys) |
||
main = print $ lcs "thisisatest" "testing123testing"</ |
main = print $ lcs "thisisatest" "testing123testing"</syntaxhighlight> |
||
{{out}} |
{{out}} |
||
<pre>"tsitest"</pre> |
<pre>"tsitest"</pre> |
||
Line 1,174: | Line 1,499: | ||
{{libheader|Icon Programming Library}} [http://www.cs.arizona.edu/icon/library/src/procs/strings.icn Uses deletec from strings] |
{{libheader|Icon Programming Library}} [http://www.cs.arizona.edu/icon/library/src/procs/strings.icn Uses deletec from strings] |
||
< |
<syntaxhighlight lang="icon">procedure main() |
||
LCSTEST("thisisatest","testing123testing") |
LCSTEST("thisisatest","testing123testing") |
||
LCSTEST("","x") |
LCSTEST("","x") |
||
Line 1,209: | Line 1,534: | ||
return if *(x := lcs(a,b[1:-1])) > *(y := lcs(a[1:-1],b)) then x else y # divide, discard, and keep longest |
return if *(x := lcs(a,b[1:-1])) > *(y := lcs(a[1:-1],b)) then x else y # divide, discard, and keep longest |
||
end</ |
end</syntaxhighlight> |
||
{{out}} |
{{out}} |
||
<pre>lcs( "thisisatest", "testing123testing" ) = "tsitest" |
<pre>lcs( "thisisatest", "testing123testing" ) = "tsitest" |
||
Line 1,217: | Line 1,542: | ||
=={{header|J}}== |
=={{header|J}}== |
||
< |
<syntaxhighlight lang="j">lcs=: dyad define |
||
|.x{~ 0{"1 cullOne^:_ (\: +/"1)(\:{."1) 4$.$. x =/ y |
|.x{~ 0{"1 cullOne^:_ (\: +/"1)(\:{."1) 4$.$. x =/ y |
||
) |
) |
||
cullOne=: ({~[: <@<@< [: (i. 0:)1,[: *./[: |: 2>/\]) :: ]</ |
cullOne=: ({~[: <@<@< [: (i. 0:)1,[: *./[: |: 2>/\]) :: ]</syntaxhighlight> |
||
Here's [[Longest_common_subsequence/J|another approach]]: |
Here's [[Longest_common_subsequence/J|another approach]]: |
||
< |
<syntaxhighlight lang="j">mergeSq=: ;@}: ~.@, {.@;@{. ,&.> 3 {:: 4&{. |
||
common=: 2 2 <@mergeSq@,;.3^:_ [: (<@#&.> i.@$) =/ |
common=: 2 2 <@mergeSq@,;.3^:_ [: (<@#&.> i.@$) =/ |
||
lcs=: [ {~ 0 {"1 ,&$ #: 0 ({:: (#~ [: (= >./) #@>)) 0 ({:: ,) common</ |
lcs=: [ {~ 0 {"1 ,&$ #: 0 ({:: (#~ [: (= >./) #@>)) 0 ({:: ,) common</syntaxhighlight> |
||
Example use (works with either definition of lcs): |
Example use (works with either definition of lcs): |
||
< |
<syntaxhighlight lang="j"> 'thisisatest' lcs 'testing123testing' |
||
tsitest</ |
tsitest</syntaxhighlight> |
||
'''Dynamic programming version''' |
'''Dynamic programming version''' |
||
< |
<syntaxhighlight lang="j">longest=: ]`[@.(>&#) |
||
upd=:{:@[,~ ({.@[ ,&.> {:@])`({:@[ longest&.> {.@])@.(0 = #&>@{.@[) |
upd=:{:@[,~ ({.@[ ,&.> {:@])`({:@[ longest&.> {.@])@.(0 = #&>@{.@[) |
||
lcs=: 0{:: [: ([: {.&> [: upd&.>/\.<"1@:,.)/ a:,.~a:,~=/{"1 a:,.<"0@[</ |
lcs=: 0{:: [: ([: {.&> [: upd&.>/\.<"1@:,.)/ a:,.~a:,~=/{"1 a:,.<"0@[</syntaxhighlight> |
||
'''Output:''' |
'''Output:''' |
||
< |
<syntaxhighlight lang="j"> '1234' lcs '1224533324' |
||
1234 |
1234 |
||
'thisisatest' lcs 'testing123testing' |
'thisisatest' lcs 'testing123testing' |
||
tsitest</ |
tsitest</syntaxhighlight> |
||
'''Recursion''' |
'''Recursion''' |
||
< |
<syntaxhighlight lang="j">lcs=:;(($:}.) longest }.@[ $: ])`({.@[,$:&}.)@.(=&{.)`((i.0)"_)@.(+.&(0=#))&((e.#[)&>/) ;~</syntaxhighlight> |
||
=={{header|Java}}== |
=={{header|Java}}== |
||
===Recursion=== |
===Recursion=== |
||
This is not a particularly fast algorithm, but it gets the job done eventually. The speed is a result of many recursive function calls. |
This is not a particularly fast algorithm, but it gets the job done eventually. The speed is a result of many recursive function calls. |
||
< |
<syntaxhighlight lang="java">public static String lcs(String a, String b){ |
||
int aLen = a.length(); |
int aLen = a.length(); |
||
int bLen = b.length(); |
int bLen = b.length(); |
||
Line 1,264: | Line 1,589: | ||
return (x.length() > y.length()) ? x : y; |
return (x.length() > y.length()) ? x : y; |
||
} |
} |
||
}</ |
}</syntaxhighlight> |
||
===Dynamic Programming=== |
===Dynamic Programming=== |
||
< |
<syntaxhighlight lang="java">public static String lcs(String a, String b) { |
||
int[][] lengths = new int[a.length()+1][b.length()+1]; |
int[][] lengths = new int[a.length()+1][b.length()+1]; |
||
Line 1,297: | Line 1,622: | ||
return sb.reverse().toString(); |
return sb.reverse().toString(); |
||
}</ |
}</syntaxhighlight> |
||
=={{header|JavaScript}}== |
=={{header|JavaScript}}== |
||
Line 1,303: | Line 1,628: | ||
{{trans|Java}} |
{{trans|Java}} |
||
This is more or less a translation of the recursive Java version above. |
This is more or less a translation of the recursive Java version above. |
||
< |
<syntaxhighlight lang="javascript">function lcs(a, b) { |
||
var aSub = a.substr(0, a.length-1); |
var aSub = a.substr(0, a.length - 1); |
||
var bSub = b.substr(0, b.length-1); |
var bSub = b.substr(0, b.length - 1); |
||
if (a.length == 0 || b.length == 0) { |
if (a.length === 0 || b.length === 0) { |
||
return |
return ''; |
||
} else if (a.charAt(a.length-1) == b.charAt(b.length-1)) { |
} else if (a.charAt(a.length - 1) === b.charAt(b.length - 1)) { |
||
return lcs(aSub, bSub) + a.charAt(a.length-1); |
return lcs(aSub, bSub) + a.charAt(a.length - 1); |
||
} else { |
} else { |
||
var x = lcs(a, bSub); |
var x = lcs(a, bSub); |
||
Line 1,316: | Line 1,641: | ||
return (x.length > y.length) ? x : y; |
return (x.length > y.length) ? x : y; |
||
} |
} |
||
}</ |
}</syntaxhighlight> |
||
ES6 recursive implementation |
|||
<syntaxhighlight lang="javascript"> |
|||
const longest = (xs, ys) => (xs.length > ys.length) ? xs : ys; |
|||
const lcs = (xx, yy) => { |
|||
if (!xx.length || !yy.length) { return ''; } |
|||
const [x, ...xs] = xx; |
|||
const [y, ...ys] = yy; |
|||
return (x === y) ? (x + lcs(xs, ys)) : longest(lcs(xx, ys), lcs(xs, yy)); |
|||
};</syntaxhighlight> |
|||
===Dynamic Programming=== |
===Dynamic Programming=== |
||
This version runs in O(mn) time and consumes O(mn) space. |
This version runs in O(mn) time and consumes O(mn) space. |
||
Factoring out loop edge cases could get a small constant time improvement, and it's fairly trivial to edit the final loop to produce a full diff in addition to the lcs. |
Factoring out loop edge cases could get a small constant time improvement, and it's fairly trivial to edit the final loop to produce a full diff in addition to the lcs. |
||
< |
<syntaxhighlight lang="javascript">function lcs(x,y){ |
||
var s,i,j,m,n, |
var s,i,j,m,n, |
||
lcs=[],row=[],c=[], |
lcs=[],row=[],c=[], |
||
Line 1,354: | Line 1,694: | ||
} |
} |
||
return lcs.join(''); |
return lcs.join(''); |
||
}</ |
}</syntaxhighlight> |
||
'''BUG note: In line 6, m and n are not yet initialized, and so x and y are never swapped.''' |
'''BUG note: In line 6, m and n are not yet initialized, and so x and y are never swapped.''' |
||
Line 1,360: | Line 1,700: | ||
The final loop can be modified to concatenate maximal common substrings rather than individual characters: |
The final loop can be modified to concatenate maximal common substrings rather than individual characters: |
||
< |
<syntaxhighlight lang="javascript"> var t=i; |
||
while(i>-1&&j>-1){ |
while(i>-1&&j>-1){ |
||
switch(c[i][j]){ |
switch(c[i][j]){ |
||
Line 1,374: | Line 1,714: | ||
} |
} |
||
} |
} |
||
if(t!==i){lcs.unshift(x.substring(i+1,t+1));}</ |
if(t!==i){lcs.unshift(x.substring(i+1,t+1));}</syntaxhighlight> |
||
===Greedy Algorithm=== |
===Greedy Algorithm=== |
||
This is |
This is an heuristic algorithm that won't always return the correct answer, but is significantly faster and less memory intensive than the dynamic programming version, in exchange for giving up the ability to re-use the table to find alternate solutions and greater complexity in generating diffs. Note that this implementation uses a binary buffer for additional efficiency gains, but it's simple to transform to use string or array concatenation; |
||
< |
<syntaxhighlight lang="javascript">function lcs_greedy(x,y){ |
||
var p1, i, idx, |
|||
symbols = {}, |
|||
r=0,p=0,p1,L=0,idx, |
|||
r = 0, |
|||
m=x.length,n=y.length, |
|||
p = 0, |
|||
S = new Buffer(m<n?n:m); |
|||
l = 0, |
|||
p1 = popsym(0); |
|||
m = x.length, |
|||
for(i=0;i < m;i++){ |
|||
n = y.length, |
|||
p = (r===p)?p1:popsym(i); |
|||
s = new Buffer((m < n) ? n : m); |
|||
p1 = popsym(i+1); |
|||
idx=(p > p1)?(i++,p1):p; |
|||
p1 = popsym(0); |
|||
else{ |
|||
for (i = 0; i < m; i++) { |
|||
r=idx; |
|||
p = (r === p) ? p1 : popsym(i); |
|||
S[L++]=x.charCodeAt(i); |
|||
p1 = popsym(i + 1); |
|||
} |
|||
if (p > p1) { |
|||
} |
|||
i += 1; |
|||
return S.toString('utf8',0,L); |
|||
idx = p1; |
|||
} else { |
|||
idx = p; |
|||
} |
|||
if (idx === n) { |
|||
p = popsym(i); |
|||
} else { |
|||
r = idx; |
|||
s[l] = x.charCodeAt(i); |
|||
l += 1; |
|||
} |
|||
} |
|||
return s.toString('utf8', 0, l); |
|||
function popsym(index) { |
|||
var s = x[index], |
|||
pos = symbols[s] + 1; |
|||
pos = y.indexOf(s,pos>r?pos:r); |
|||
if(pos===-1){pos=n;} |
|||
symbols[s]=pos; |
|||
return pos; |
|||
} |
|||
}</lang> |
|||
pos = y.indexOf(s, ((pos > r) ? pos : r)); |
|||
=={{header|jq}}== |
|||
if (pos === -1) { pos = n; } |
|||
We first give a recursive solution, which works for strings or for arrays, and then use it to write an enhanced solution that first removes extraneous characters and recognizes a common initial substring.<lang jq> |
|||
symbols[s] = pos; |
|||
# Generic version for strings or for arrays: |
|||
return pos; |
|||
def recursive_lcs(a; b): |
|||
} |
|||
if (a|length) == 0 or (b|length) == 0 then a[0:0] |
|||
}</syntaxhighlight> |
|||
else a[0:-1] as $aSub |
|||
| b[0:-1] as $bSub |
|||
| a[-1:] as $last |
|||
| if $last == b[-1:] then recursive_lcs($aSub; $bSub) + $last |
|||
else recursive_lcs(a; $bSub) as $x |
|||
| recursive_lcs($aSub; b) as $y |
|||
| if ($x|length) > ($y|length) then $x else $y end |
|||
end |
|||
end ;</lang> |
|||
Enhanced version:<lang jq> |
|||
# return the length of the common initial subsequence; |
|||
# x and y are arrays |
|||
# The inner helper function has no arguments |
|||
# and so has no recursion overhead |
|||
def common_heads(x;y): |
|||
def common: |
|||
if x[.] != null and x[.] == y[.] then (.+1)|common else . end; |
|||
0 | common; |
|||
Note that it won't return the correct answer for all inputs. For example: <syntaxhighlight lang="javascript">lcs_greedy('bcaaaade', 'deaaaabc'); // 'bc' instead of 'aaaa'</syntaxhighlight> |
|||
# x and y are arrays |
|||
def intersection(x;y): |
|||
( (x|unique) + (y|unique) | sort) as $sorted |
|||
| reduce range(1; $sorted|length) as $i |
|||
([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ; |
|||
=={{header|jq}}== |
|||
# x and y are strings; emit [winnowedx, winnowedy] |
|||
Naive recursive version: |
|||
def winnow(x; y): |
|||
<syntaxhighlight lang="jq">def lcs(xstr; ystr): |
|||
(x|explode) as $x |
|||
if (xstr == "" or ystr == "") then "" |
|||
| (y|explode) as $y |
|||
else |
|||
| intersection($x; $y) as $intersection |
|||
xstr[0:1] as $x |
|||
| [ ($x | map( select( . as $i | $intersection | index($i) ))) , |
|||
| xstr[1:] as $xs |
|||
($y | map( select( . as $i | $intersection | index($i) ))) ] |
|||
| ystr[1:] as $ys |
|||
| map(implode) ; |
|||
| if ($x == ystr[0:1]) then ($x + lcs($xs; $ys)) |
|||
else |
|||
lcs(xstr; $ys) as $one |
|||
| lcs($xs; ystr) as $two |
|||
| if ($one|length) > ($two|length) then $one else $two end |
|||
end |
|||
end;</syntaxhighlight> |
|||
Example: |
|||
<syntaxhighlight lang="jq">lcs("1234"; "1224533324"), |
|||
lcs("thisisatest"; "testing123testing")</syntaxhighlight> |
|||
Output:<syntaxhighlight lang="sh"> |
|||
# jq -n -f lcs-recursive.jq |
|||
"1234" |
|||
"tsitest"</syntaxhighlight> |
|||
=={{header|Julia}}== |
|||
# First remove extraneous characters and recognize common heads |
|||
{{works with|Julia|0.6}} |
|||
def lcs(a; b): |
|||
<syntaxhighlight lang="julia">longest(a::String, b::String) = length(a) ≥ length(b) ? a : b |
|||
if (a|length) == 0 or (b|length) == 0 then "" |
|||
else winnow(a;b) |
|||
| .[0] as $a | .[1] as $b |
|||
| common_heads($a | explode; $b | explode) as $heads |
|||
| if $heads > 0 then $a[0:$heads] + recursive_lcs( $a[$heads:]; b[$heads:]) |
|||
else recursive_lcs($a; $b) |
|||
end |
|||
end ;</lang> |
|||
Example:<lang jq> |
|||
def test: |
|||
lcs( "thisisatest"; "testing123testing"), |
|||
lcs("beginning-middle-ending" ; "beginning-diddle-dum-ending" ) |
|||
; |
|||
""" |
|||
test</lang><lang sh>$ time jq -n -f LCS.jq |
|||
julia> lcsrecursive("thisisatest", "testing123testing") |
|||
time jq -n -f LCS.jq |
|||
"tsitest" |
"tsitest" |
||
""" |
|||
"beginning-iddle-ending" |
|||
# Recursive |
|||
function lcsrecursive(xstr::String, ystr::String) |
|||
if length(xstr) == 0 || length(ystr) == 0 |
|||
return "" |
|||
end |
|||
x, xs, y, ys = xstr[1], xstr[2:end], ystr[1], ystr[2:end] |
|||
real 0m0.456s |
|||
if x == y |
|||
user 0m0.427s |
|||
return string(x, lcsrecursive(xs, ys)) |
|||
sys 0m0.005s</lang> |
|||
else |
|||
return longest(lcsrecursive(xstr, ys), lcsrecursive(xs, ystr)) |
|||
end |
|||
end |
|||
# Dynamic |
|||
function lcsdynamic(a::String, b::String) |
|||
lengths = zeros(Int, length(a) + 1, length(b) + 1) |
|||
# row 0 and column 0 are initialized to 0 already |
|||
for (i, x) in enumerate(a), (j, y) in enumerate(b) |
|||
if x == y |
|||
lengths[i+1, j+1] = lengths[i, j] + 1 |
|||
else |
|||
lengths[i+1, j+1] = max(lengths[i+1, j], lengths[i, j+1]) |
|||
end |
|||
end |
|||
# read the substring out from the matrix |
|||
result = "" |
|||
x, y = length(a) + 1, length(b) + 1 |
|||
while x > 1 && y > 1 |
|||
if lengths[x, y] == lengths[x-1, y] |
|||
x -= 1 |
|||
elseif lengths[x, y] == lengths[x, y-1] |
|||
y -= 1 |
|||
else |
|||
@assert a[x-1] == b[y-1] |
|||
result = string(a[x-1], result) |
|||
x -= 1 |
|||
y -= 1 |
|||
end |
|||
end |
|||
return result |
|||
end |
|||
@show lcsrecursive("thisisatest", "testing123testing") |
|||
@time lcsrecursive("thisisatest", "testing123testing") |
|||
@show lcsdynamic("thisisatest", "testing123testing") |
|||
@time lcsdynamic("thisisatest", "testing123testing")</syntaxhighlight> |
|||
{{out}} |
|||
<pre>lcsrecursive("thisisatest", "testing123testing") = "tsitest" |
|||
0.038153 seconds (537.77 k allocations: 16.415 MiB) |
|||
lcsdynamic("thisisatest", "testing123testing") = "tsitest" |
|||
0.000004 seconds (12 allocations: 2.141 KiB)</pre> |
|||
=={{header|Kotlin}}== |
|||
<syntaxhighlight lang="scala">// version 1.1.2 |
|||
fun lcs(x: String, y: String): String { |
|||
if (x.length == 0 || y.length == 0) return "" |
|||
val x1 = x.dropLast(1) |
|||
val y1 = y.dropLast(1) |
|||
if (x.last() == y.last()) return lcs(x1, y1) + x.last() |
|||
val x2 = lcs(x, y1) |
|||
val y2 = lcs(x1, y) |
|||
return if (x2.length > y2.length) x2 else y2 |
|||
} |
|||
fun main(args: Array<String>) { |
|||
val x = "thisisatest" |
|||
val y = "testing123testing" |
|||
println(lcs(x, y)) |
|||
}</syntaxhighlight> |
|||
{{out}} |
|||
<pre> |
|||
tsitest |
|||
</pre> |
|||
=={{header|Liberty BASIC}}== |
=={{header|Liberty BASIC}}== |
||
<syntaxhighlight lang="lb"> |
|||
<lang lb> |
|||
'variation of BASIC example |
'variation of BASIC example |
||
w$="aebdef" |
w$="aebdef" |
||
Line 1,504: | Line 1,910: | ||
END IF |
END IF |
||
END FUNCTION |
END FUNCTION |
||
</syntaxhighlight> |
|||
</lang> |
|||
=={{header|Logo}}== |
=={{header|Logo}}== |
||
This implementation works on both words and lists. |
This implementation works on both words and lists. |
||
< |
<syntaxhighlight lang="logo">to longest :s :t |
||
output ifelse greater? count :s count :t [:s] [:t] |
output ifelse greater? count :s count :t [:s] [:t] |
||
end |
end |
||
Line 1,516: | Line 1,922: | ||
if equal? first :s first :t [output combine first :s lcs bf :s bf :t] |
if equal? first :s first :t [output combine first :s lcs bf :s bf :t] |
||
output longest lcs :s bf :t lcs bf :s :t |
output longest lcs :s bf :t lcs bf :s :t |
||
end</ |
end</syntaxhighlight> |
||
=={{header|Lua}}== |
=={{header|Lua}}== |
||
< |
<syntaxhighlight lang="lua">function LCS( a, b ) |
||
if #a == 0 or #b == 0 then |
if #a == 0 or #b == 0 then |
||
return "" |
return "" |
||
Line 1,536: | Line 1,942: | ||
end |
end |
||
print( LCS( "thisisatest", "testing123testing" ) )</ |
print( LCS( "thisisatest", "testing123testing" ) )</syntaxhighlight> |
||
=={{header|M4}}== |
=={{header|M4}}== |
||
< |
<syntaxhighlight lang="m4">define(`set2d',`define(`$1[$2][$3]',`$4')') |
||
define(`get2d',`defn($1[$2][$3])') |
define(`get2d',`defn($1[$2][$3])') |
||
define(`tryboth', |
define(`tryboth', |
||
Line 1,559: | Line 1,965: | ||
lcs(`1234',`1224533324') |
lcs(`1234',`1224533324') |
||
lcs(`thisisatest',`testing123testing')</ |
lcs(`thisisatest',`testing123testing')</syntaxhighlight> |
||
Note: the caching (set2d/get2d) obscures the code even more than usual, but is necessary in order to get the second test to run in a reasonable amount of time. |
Note: the caching (set2d/get2d) obscures the code even more than usual, but is necessary in order to get the second test to run in a reasonable amount of time. |
||
=={{header|Maple}}== |
=={{header|Maple}}== |
||
<syntaxhighlight lang="maple"> |
|||
<lang Maple> |
|||
> StringTools:-LongestCommonSubSequence( "thisisatest", "testing123testing" ); |
> StringTools:-LongestCommonSubSequence( "thisisatest", "testing123testing" ); |
||
"tsitest" |
"tsitest" |
||
</syntaxhighlight> |
|||
</lang> |
|||
=={{header|Mathematica}}== |
=={{header|Mathematica}}/{{header|Wolfram Language}}== |
||
A built-in function can do this for us: |
A built-in function can do this for us: |
||
< |
<syntaxhighlight lang="mathematica">a = "thisisatest"; |
||
b = "testing123testing"; |
b = "testing123testing"; |
||
LongestCommonSequence[a, b]</ |
LongestCommonSequence[a, b]</syntaxhighlight> |
||
gives: |
gives: |
||
<lang |
<syntaxhighlight lang="mathematica">tsitest</syntaxhighlight> |
||
Note that Mathematica also has a built-in function called LongestCommonSubsequence[a,b]: |
Note that Mathematica also has a built-in function called LongestCommonSubsequence[a,b]: |
||
Line 1,585: | Line 1,991: | ||
''finds the longest sequence of contiguous or disjoint elements common to the strings or lists a and b.'' |
''finds the longest sequence of contiguous or disjoint elements common to the strings or lists a and b.'' |
||
I added this note because the name of this article suggests LongestCommonSubsequence does the job, however |
I added this note because the name of this article suggests LongestCommonSubsequence does the job, however LongestCommonSequence performs the puzzle-description. |
||
=={{header|Nim}}== |
=={{header|Nim}}== |
||
===Recursion=== |
===Recursion=== |
||
{{trans|Python}} |
{{trans|Python}} |
||
< |
<syntaxhighlight lang="nim">proc lcs(x, y: string): string = |
||
if x == "" or y == "": |
if x == "" or y == "": |
||
return "" |
return "" |
||
Line 1,602: | Line 2,008: | ||
echo lcs("1234", "1224533324") |
echo lcs("1234", "1224533324") |
||
echo lcs("thisisatest", "testing123testing")</ |
echo lcs("thisisatest", "testing123testing")</syntaxhighlight> |
||
This recursive version is not efficient but the execution time can be greatly improved by using memoization. |
|||
===Dynamic Programming=== |
===Dynamic Programming=== |
||
{{trans|Python}} |
{{trans|Python}} |
||
< |
<syntaxhighlight lang="nim">proc lcs(a, b: string): string = |
||
var ls = newSeq[seq[int]] |
var ls = newSeq[seq[int]](a.len+1) |
||
for i in 0 .. a.len: |
for i in 0 .. a.len: |
||
ls[i].newSeq |
ls[i].newSeq(b.len+1) |
||
for i, x in a: |
for i, x in a: |
||
Line 1,633: | Line 2,041: | ||
echo lcs("1234", "1224533324") |
echo lcs("1234", "1224533324") |
||
echo lcs("thisisatest", "testing123testing")</ |
echo lcs("thisisatest", "testing123testing")</syntaxhighlight> |
||
=={{header|OCaml}}== |
=={{header|OCaml}}== |
||
===Recursion=== |
===Recursion=== |
||
from Haskell |
from Haskell |
||
< |
<syntaxhighlight lang="ocaml">let longest xs ys = if List.length xs > List.length ys then xs else ys |
||
let rec lcs a b = match a, b with |
let rec lcs a b = match a, b with |
||
Line 1,647: | Line 2,055: | ||
x :: lcs xs ys |
x :: lcs xs ys |
||
else |
else |
||
longest (lcs a ys) (lcs xs b)</ |
longest (lcs a ys) (lcs xs b)</syntaxhighlight> |
||
===Memoized recursion=== |
===Memoized recursion=== |
||
< |
<syntaxhighlight lang="ocaml"> |
||
let lcs xs ys = |
let lcs xs ys = |
||
let cache = Hashtbl.create 16 in |
let cache = Hashtbl.create 16 in |
||
Line 1,670: | Line 2,078: | ||
result |
result |
||
in |
in |
||
lcs xs ys</ |
lcs xs ys</syntaxhighlight> |
||
===Dynamic programming=== |
===Dynamic programming=== |
||
< |
<syntaxhighlight lang="ocaml">let lcs xs' ys' = |
||
let xs = Array.of_list xs' |
let xs = Array.of_list xs' |
||
and ys = Array.of_list ys' in |
and ys = Array.of_list ys' in |
||
Line 1,687: | Line 2,095: | ||
done |
done |
||
done; |
done; |
||
a.(0).(0)</ |
a.(0).(0)</syntaxhighlight> |
||
Because both solutions only work with lists, here are some functions to convert to and from strings: |
Because both solutions only work with lists, here are some functions to convert to and from strings: |
||
< |
<syntaxhighlight lang="ocaml">let list_of_string str = |
||
let result = ref [] in |
let result = ref [] in |
||
String.iter (fun x -> result := x :: !result) |
String.iter (fun x -> result := x :: !result) |
||
Line 1,699: | Line 2,107: | ||
let result = String.create (List.length lst) in |
let result = String.create (List.length lst) in |
||
ignore (List.fold_left (fun i x -> result.[i] <- x; i+1) 0 lst); |
ignore (List.fold_left (fun i x -> result.[i] <- x; i+1) 0 lst); |
||
result</ |
result</syntaxhighlight> |
||
Both solutions work. Example: |
Both solutions work. Example: |
||
Line 1,712: | Line 2,120: | ||
Recursive solution: |
Recursive solution: |
||
< |
<syntaxhighlight lang="oz">declare |
||
fun {LCS Xs Ys} |
fun {LCS Xs Ys} |
||
case [Xs Ys] |
case [Xs Ys] |
||
Line 1,726: | Line 2,134: | ||
end |
end |
||
in |
in |
||
{System.showInfo {LCS "thisisatest" "testing123testing"}}</ |
{System.showInfo {LCS "thisisatest" "testing123testing"}}</syntaxhighlight> |
||
=={{header|Pascal}}== |
=={{header|Pascal}}== |
||
{{trans|Fortran}} |
{{trans|Fortran}} |
||
< |
<syntaxhighlight lang="pascal">Program LongestCommonSubsequence(output); |
||
function lcs(a, b: string): string; |
function lcs(a, b: string): string; |
||
Line 1,763: | Line 2,171: | ||
s2 := '1224533324'; |
s2 := '1224533324'; |
||
writeln (lcs(s1, s2)); |
writeln (lcs(s1, s2)); |
||
end.</ |
end.</syntaxhighlight> |
||
{{out}} |
{{out}} |
||
<pre>:> ./LongestCommonSequence |
<pre>:> ./LongestCommonSequence |
||
Line 1,770: | Line 2,178: | ||
</pre> |
</pre> |
||
=={{header| |
=={{header|PascalABC.NET}}== |
||
<syntaxhighlight lang="delphi"> |
|||
<lang perl>use Algorithm::Diff qw/ LCS /; |
|||
function Lengths(x,y: string): array[,] of integer; |
|||
begin |
|||
var (m,n) := (x.Length,y.Length); |
|||
var C := new integer[m+1, n+1]; // filled with zeroes |
|||
for var i:=1 to m do |
|||
for var j:=1 to n do |
|||
if x[i] = y[j] then |
|||
C[i,j] := C[i-1,j-1] + 1 |
|||
else C[i,j] := max(C[i,j-1], C[i-1,j]); |
|||
Result := C; |
|||
end; |
|||
function lcshelper(x,y: string; i,j: integer): string; |
|||
my @a = split //, 'thisisatest'; |
|||
begin |
|||
my @b = split //, 'testing123testing'; |
|||
var C := Lengths(x,y); |
|||
if (i = 0) or (j = 0) then |
|||
Result := '' |
|||
else if X[i] = Y[j] then |
|||
Result := lcshelper(X, Y, i-1, j-1) + X[i] |
|||
else if C[i,j-1] > C[i-1,j] then |
|||
Result := lcshelper(X, Y, i, j-1) |
|||
else Result := lcshelper(X, Y, i-1, j) |
|||
end; |
|||
function lcs(x,y: string) := lcshelper(x,y,x.Length,y.Length); |
|||
print LCS( \@a, \@b );</lang> |
|||
begin |
|||
=={{header|Perl 6}}== |
|||
Println(lcs('1234','1224533324')); |
|||
===Recursion=== |
|||
Println(lcs('thisisatest','testing123testing')); |
|||
This solution is similar to the Haskell one. It is slow. |
|||
end. |
|||
<lang perl6>sub lcs(Str $xstr, Str $ystr) { |
|||
</syntaxhighlight> |
|||
return "" unless $xstr & $ystr; |
|||
{{out}} |
|||
<pre> |
|||
1234 |
|||
tsitest |
|||
</pre> |
|||
=={{header|Perl}}== |
|||
my ($x, $xs, $y, $ys) = $xstr.substr(0, 1), $xstr.substr(1), $ystr.substr(0, 1), $ystr.substr(1); |
|||
<syntaxhighlight lang="perl">sub lcs { |
|||
return $x eq $y |
|||
my ($a, $b) = @_; |
|||
if (!length($a) || !length($b)) { |
|||
return ""; |
|||
} |
|||
if (substr($a, 0, 1) eq substr($b, 0, 1)) { |
|||
return substr($a, 0, 1) . lcs(substr($a, 1), substr($b, 1)); |
|||
} |
|||
my $c = lcs(substr($a, 1), $b) || ""; |
|||
my $d = lcs($a, substr($b, 1)) || ""; |
|||
return length($c) > length($d) ? $c : $d; |
|||
} |
} |
||
print lcs("thisisatest", "testing123testing") . "\n";</syntaxhighlight> |
|||
===Dynamic Programming=== |
|||
{{trans|Java}} |
|||
<lang perl6> |
|||
sub lcs(Str $xstr, Str $ystr) { |
|||
my ($xlen, $ylen) = ($xstr, $ystr)>>.chars; |
|||
my @lengths = map {[(0) xx ($ylen+1)]}, 0..$xlen; |
|||
===Alternate letting regex do all the work=== |
|||
for $xstr.comb.kv -> $i, $x { |
|||
<syntaxhighlight lang="perl">use strict; |
|||
for $ystr.comb.kv -> $j, $y { |
|||
use warnings; |
|||
@lengths[$i+1][$j+1] = $x eq $y ?? @lengths[$i][$j]+1 !! (@lengths[$i+1][$j], @lengths[$i][$j+1]).max; |
|||
use feature 'bitwise'; |
|||
} |
|||
} |
|||
print "lcs is ", lcs('thisisatest', 'testing123testing'), "\n"; |
|||
my @x = $xstr.comb; |
|||
my ($x, $y) = ($xlen, $ylen); |
|||
sub lcs |
|||
my $result = ""; |
|||
{ |
|||
while $x != 0 && $y != 0 { |
|||
my ($c, $d) = @_; |
|||
if @lengths[$x][$y] == @lengths[$x-1][$y] { |
|||
for my $len ( reverse 1 .. length($c &. $d) ) |
|||
{ |
|||
"$c\n$d" =~ join '.*', ('(.)') x $len, "\n", map "\\$_", 1 .. $len and |
|||
elsif @lengths[$x][$y] == @lengths[$x][$y-1] { |
|||
return join '', @{^CAPTURE}; |
|||
} |
|||
else { |
|||
$result = @x[$x-1] ~ $result; |
|||
$x--; |
|||
$y--; |
|||
} |
|||
} |
} |
||
return ''; |
|||
}</syntaxhighlight> |
|||
{{out}} |
|||
<pre>lcs is tastiest</pre> |
|||
=={{header|Phix}}== |
|||
return $result; |
|||
If you want this to work with (utf8) Unicode text, just chuck the inputs through utf8_to_utf32() first (and the output through utf32_to_utf8()). |
|||
} |
|||
<!--<syntaxhighlight lang="phix">(phixonline)--> |
|||
<span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span> |
|||
<span style="color: #008080;">function</span> <span style="color: #000000;">lcs</span><span style="color: #0000FF;">(</span><span style="color: #004080;">sequence</span> <span style="color: #000000;">a</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">b</span><span style="color: #0000FF;">)</span> |
|||
<span style="color: #004080;">sequence</span> <span style="color: #000000;">res</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">""</span> |
|||
<span style="color: #008080;">if</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">a</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">and</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">b</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">then</span> |
|||
<span style="color: #008080;">if</span> <span style="color: #000000;">a</span><span style="color: #0000FF;">[$]=</span><span style="color: #000000;">b</span><span style="color: #0000FF;">[$]</span> <span style="color: #008080;">then</span> |
|||
<span style="color: #000000;">res</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">lcs</span><span style="color: #0000FF;">(</span><span style="color: #000000;">a</span><span style="color: #0000FF;">[</span><span style="color: #000000;">1</span><span style="color: #0000FF;">..-</span><span style="color: #000000;">2</span><span style="color: #0000FF;">],</span><span style="color: #000000;">b</span><span style="color: #0000FF;">[</span><span style="color: #000000;">1</span><span style="color: #0000FF;">..-</span><span style="color: #000000;">2</span><span style="color: #0000FF;">])&</span><span style="color: #000000;">a</span><span style="color: #0000FF;">[$]</span> |
|||
<span style="color: #008080;">else</span> |
|||
<span style="color: #004080;">sequence</span> <span style="color: #000000;">l</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">lcs</span><span style="color: #0000FF;">(</span><span style="color: #000000;">a</span><span style="color: #0000FF;">,</span><span style="color: #000000;">b</span><span style="color: #0000FF;">[</span><span style="color: #000000;">1</span><span style="color: #0000FF;">..-</span><span style="color: #000000;">2</span><span style="color: #0000FF;">]),</span> |
|||
<span style="color: #000000;">r</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">lcs</span><span style="color: #0000FF;">(</span><span style="color: #000000;">a</span><span style="color: #0000FF;">[</span><span style="color: #000000;">1</span><span style="color: #0000FF;">..-</span><span style="color: #000000;">2</span><span style="color: #0000FF;">],</span><span style="color: #000000;">b</span><span style="color: #0000FF;">)</span> |
|||
<span style="color: #000000;">res</span> <span style="color: #0000FF;">=</span> <span style="color: #008080;">iff</span><span style="color: #0000FF;">(</span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">l</span><span style="color: #0000FF;">)></span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">r</span><span style="color: #0000FF;">)?</span><span style="color: #000000;">l</span><span style="color: #0000FF;">:</span><span style="color: #000000;">r</span><span style="color: #0000FF;">)</span> |
|||
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span> |
|||
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span> |
|||
<span style="color: #008080;">return</span> <span style="color: #000000;">res</span> |
|||
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span> |
|||
<span style="color: #008080;">constant</span> <span style="color: #000000;">tests</span> <span style="color: #0000FF;">=</span> <span style="color: #0000FF;">{{</span><span style="color: #008000;">"1234"</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"1224533324"</span><span style="color: #0000FF;">},</span> |
|||
<span style="color: #0000FF;">{</span><span style="color: #008000;">"thisisatest"</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"testing123testing"</span><span style="color: #0000FF;">}}</span> |
|||
<span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">tests</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span> |
|||
<span style="color: #004080;">string</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">a</span><span style="color: #0000FF;">,</span><span style="color: #000000;">b</span><span style="color: #0000FF;">}</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">tests</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]</span> |
|||
<span style="color: #0000FF;">?</span><span style="color: #000000;">lcs</span><span style="color: #0000FF;">(</span><span style="color: #000000;">a</span><span style="color: #0000FF;">,</span><span style="color: #000000;">b</span><span style="color: #0000FF;">)</span> |
|||
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span> |
|||
<!--</syntaxhighlight>--> |
|||
{{out}} |
|||
<pre> |
|||
"1234" |
|||
"tsitest" |
|||
</pre> |
|||
===Alternate version=== |
|||
same output |
|||
<!--<syntaxhighlight lang="phix">(phixonline)--> |
|||
<span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span> |
|||
<span style="color: #008080;">function</span> <span style="color: #000000;">LCSLength</span><span style="color: #0000FF;">(</span><span style="color: #004080;">sequence</span> <span style="color: #000000;">X</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">sequence</span> <span style="color: #000000;">Y</span><span style="color: #0000FF;">)</span> |
|||
<span style="color: #004080;">sequence</span> <span style="color: #000000;">C</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">repeat</span><span style="color: #0000FF;">(</span><span style="color: #7060A8;">repeat</span><span style="color: #0000FF;">(</span><span style="color: #000000;">0</span><span style="color: #0000FF;">,</span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">Y</span><span style="color: #0000FF;">)+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">),</span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">X</span><span style="color: #0000FF;">)+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">)</span> |
|||
<span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">X</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span> |
|||
<span style="color: #008080;">for</span> <span style="color: #000000;">j</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">Y</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span> |
|||
<span style="color: #008080;">if</span> <span style="color: #000000;">X</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]=</span><span style="color: #000000;">Y</span><span style="color: #0000FF;">[</span><span style="color: #000000;">j</span><span style="color: #0000FF;">]</span> <span style="color: #008080;">then</span> |
|||
<span style="color: #000000;">C</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">][</span><span style="color: #000000;">j</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">]</span> <span style="color: #0000FF;">:=</span> <span style="color: #000000;">C</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">][</span><span style="color: #000000;">j</span><span style="color: #0000FF;">]+</span><span style="color: #000000;">1</span> |
|||
<span style="color: #008080;">else</span> |
|||
<span style="color: #000000;">C</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">][</span><span style="color: #000000;">j</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">]</span> <span style="color: #0000FF;">:=</span> <span style="color: #7060A8;">max</span><span style="color: #0000FF;">(</span><span style="color: #000000;">C</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">][</span><span style="color: #000000;">j</span><span style="color: #0000FF;">],</span> <span style="color: #000000;">C</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">][</span><span style="color: #000000;">j</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">])</span> |
|||
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span> |
|||
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span> |
|||
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span> |
|||
<span style="color: #008080;">return</span> <span style="color: #000000;">C</span> |
|||
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span> |
|||
<span style="color: #008080;">function</span> <span style="color: #000000;">backtrack</span><span style="color: #0000FF;">(</span><span style="color: #004080;">sequence</span> <span style="color: #000000;">C</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">sequence</span> <span style="color: #000000;">X</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">sequence</span> <span style="color: #000000;">Y</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">integer</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">integer</span> <span style="color: #000000;">j</span><span style="color: #0000FF;">)</span> |
|||
<span style="color: #008080;">if</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">0</span> <span style="color: #008080;">or</span> <span style="color: #000000;">j</span><span style="color: #0000FF;">=</span><span style="color: #000000;">0</span> <span style="color: #008080;">then</span> |
|||
<span style="color: #008080;">return</span> <span style="color: #008000;">""</span> |
|||
<span style="color: #008080;">elsif</span> <span style="color: #000000;">X</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]=</span><span style="color: #000000;">Y</span><span style="color: #0000FF;">[</span><span style="color: #000000;">j</span><span style="color: #0000FF;">]</span> <span style="color: #008080;">then</span> |
|||
<span style="color: #008080;">return</span> <span style="color: #000000;">backtrack</span><span style="color: #0000FF;">(</span><span style="color: #000000;">C</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">X</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">Y</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">-</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">j</span><span style="color: #0000FF;">-</span><span style="color: #000000;">1</span><span style="color: #0000FF;">)</span> <span style="color: #0000FF;">&</span> <span style="color: #000000;">X</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]</span> |
|||
<span style="color: #008080;">else</span> |
|||
<span style="color: #008080;">if</span> <span style="color: #000000;">C</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">][</span><span style="color: #000000;">j</span><span style="color: #0000FF;">]></span><span style="color: #000000;">C</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">][</span><span style="color: #000000;">j</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">]</span> <span style="color: #008080;">then</span> |
|||
<span style="color: #008080;">return</span> <span style="color: #000000;">backtrack</span><span style="color: #0000FF;">(</span><span style="color: #000000;">C</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">X</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">Y</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">j</span><span style="color: #0000FF;">-</span><span style="color: #000000;">1</span><span style="color: #0000FF;">)</span> |
|||
<span style="color: #008080;">else</span> |
|||
<span style="color: #008080;">return</span> <span style="color: #000000;">backtrack</span><span style="color: #0000FF;">(</span><span style="color: #000000;">C</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">X</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">Y</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">-</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">j</span><span style="color: #0000FF;">)</span> |
|||
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span> |
|||
<span style="color: #008080;">end</span> <span style="color: #008080;">if</span> |
|||
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span> |
|||
<span style="color: #008080;">function</span> <span style="color: #000000;">lcs</span><span style="color: #0000FF;">(</span><span style="color: #004080;">sequence</span> <span style="color: #000000;">a</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">sequence</span> <span style="color: #000000;">b</span><span style="color: #0000FF;">)</span> |
|||
<span style="color: #008080;">return</span> <span style="color: #000000;">backtrack</span><span style="color: #0000FF;">(</span><span style="color: #000000;">LCSLength</span><span style="color: #0000FF;">(</span><span style="color: #000000;">a</span><span style="color: #0000FF;">,</span><span style="color: #000000;">b</span><span style="color: #0000FF;">),</span><span style="color: #000000;">a</span><span style="color: #0000FF;">,</span><span style="color: #000000;">b</span><span style="color: #0000FF;">,</span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">a</span><span style="color: #0000FF;">),</span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">b</span><span style="color: #0000FF;">))</span> |
|||
<span style="color: #008080;">end</span> <span style="color: #008080;">function</span> |
|||
<span style="color: #008080;">constant</span> <span style="color: #000000;">tests</span> <span style="color: #0000FF;">=</span> <span style="color: #0000FF;">{{</span><span style="color: #008000;">"1234"</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"1224533324"</span><span style="color: #0000FF;">},</span> |
|||
<span style="color: #0000FF;">{</span><span style="color: #008000;">"thisisatest"</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"testing123testing"</span><span style="color: #0000FF;">}}</span> |
|||
<span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">tests</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span> |
|||
<span style="color: #004080;">string</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">a</span><span style="color: #0000FF;">,</span><span style="color: #000000;">b</span><span style="color: #0000FF;">}</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">tests</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]</span> |
|||
<span style="color: #0000FF;">?</span><span style="color: #000000;">lcs</span><span style="color: #0000FF;">(</span><span style="color: #000000;">a</span><span style="color: #0000FF;">,</span><span style="color: #000000;">b</span><span style="color: #0000FF;">)</span> |
|||
<span style="color: #008080;">end</span> <span style="color: #008080;">for</span> |
|||
<!--</syntaxhighlight>--> |
|||
=={{header|Picat}}== |
|||
say lcs("thisisatest", "testing123testing");</lang> |
|||
===Wikipedia algorithm=== |
|||
With some added trickery for a 1-based language. |
|||
<syntaxhighlight lang="picat">lcs_wiki(X,Y) = V => |
|||
[C, _Len] = lcs_length(X,Y), |
|||
V = backTrace(C,X,Y,X.length+1,Y.length+1). |
|||
lcs_length(X, Y) = V=> |
|||
M = X.length, |
|||
N = Y.length, |
|||
C = [[0 : J in 1..N+1] : I in 1..N+1], |
|||
foreach(I in 2..M+1,J in 2..N+1) |
|||
if X[I-1] == Y[J-1] then |
|||
C[I,J] := C[I-1,J-1] + 1 |
|||
else |
|||
C[I,J] := max([C[I,J-1], C[I-1,J]]) |
|||
end |
|||
end, |
|||
V = [C, C[M+1,N+1]]. |
|||
backTrace(C, X, Y, I, J) = V => |
|||
if I == 1; J == 1 then |
|||
V = "" |
|||
elseif X[I-1] == Y[J-1] then |
|||
V = backTrace(C, X, Y, I-1, J-1) ++ [X[I-1]] |
|||
else |
|||
if C[I,J-1] > C[I-1,J] then |
|||
V = backTrace(C, X, Y, I, J-1) |
|||
else |
|||
V = backTrace(C, X, Y, I-1, J) |
|||
end |
|||
end.</syntaxhighlight> |
|||
===Rule-based=== |
|||
{{trans|SETL}} |
|||
<syntaxhighlight lang="picat">table |
|||
lcs_rule(A, B) = "", (A == ""; B == "") => true. |
|||
lcs_rule(A, B) = [A[1]] ++ lcs_rule(butfirst(A), butfirst(B)), A[1] == B[1] => true. |
|||
lcs_rule(A, B) = longest(lcs_rule(butfirst(A), B), lcs_rule(A, butfirst(B))) => true. |
|||
% Return the longest string of A and B |
|||
longest(A, B) = cond(A.length > B.length, A, B). |
|||
% butfirst (everything except first element) |
|||
butfirst(A) = [A[I] : I in 2..A.length].</syntaxhighlight> |
|||
===Test=== |
|||
<syntaxhighlight lang="picat">go => |
|||
Tests = [["thisisatest","testing123testing"], |
|||
["XMJYAUZ", "MZJAWXU"], |
|||
["1234", "1224533324"], |
|||
["beginning-middle-ending","beginning-diddle-dum-ending"] |
|||
], |
|||
Funs = [lcs_wiki,lcs_rule], |
|||
foreach(Fun in Funs) |
|||
println(fun=Fun), |
|||
foreach(Test in Tests) |
|||
printf("%w : %w\n", Test, apply(Fun,Test[1],Test[2])) |
|||
end, |
|||
nl |
|||
end, |
|||
nl.</syntaxhighlight> |
|||
{{out}} |
|||
<pre>fun = lcs_wiki |
|||
[thisisatest,testing123testing] : tsitest |
|||
[XMJYAUZ,MZJAWXU] : MJAU |
|||
[1234,1224533324] : 1234 |
|||
[beginning-middle-ending,beginning-diddle-dum-ending] : beginning-iddle-ending |
|||
fun = lcs_rule |
|||
[thisisatest,testing123testing] : tsitest |
|||
[XMJYAUZ,MZJAWXU] : MJAU |
|||
[1234,1224533324] : 1234 |
|||
[beginning-middle-ending,beginning-diddle-dum-ending] : beginning-iddle-ending</pre> |
|||
=={{header|PicoLisp}}== |
=={{header|PicoLisp}}== |
||
< |
<syntaxhighlight lang="picolisp">(de commonSequences (A B) |
||
(when A |
(when A |
||
(conc |
(conc |
||
Line 1,838: | Line 2,417: | ||
(commonSequences |
(commonSequences |
||
(chop "thisisatest") |
(chop "thisisatest") |
||
(chop "testing123testing") ) )</ |
(chop "testing123testing") ) )</syntaxhighlight> |
||
{{out}} |
{{out}} |
||
<pre>-> ("t" "s" "i" "t" "e" "s" "t")</pre> |
<pre>-> ("t" "s" "i" "t" "e" "s" "t")</pre> |
||
=={{header|PowerShell}}== |
|||
Returns a sequence (array) of a type: |
|||
<syntaxhighlight lang="powershell"> |
|||
function Get-Lcs ($ReferenceObject, $DifferenceObject) |
|||
{ |
|||
$longestCommonSubsequence = @() |
|||
$x = $ReferenceObject.Length |
|||
$y = $DifferenceObject.Length |
|||
$lengths = New-Object -TypeName 'System.Object[,]' -ArgumentList ($x + 1), ($y + 1) |
|||
for($i = 0; $i -lt $x; $i++) |
|||
{ |
|||
for ($j = 0; $j -lt $y; $j++) |
|||
{ |
|||
if ($ReferenceObject[$i] -ceq $DifferenceObject[$j]) |
|||
{ |
|||
$lengths[($i+1),($j+1)] = $lengths[$i,$j] + 1 |
|||
} |
|||
else |
|||
{ |
|||
$lengths[($i+1),($j+1)] = [Math]::Max(($lengths[($i+1),$j]),($lengths[$i,($j+1)])) |
|||
} |
|||
} |
|||
} |
|||
while (($x -ne 0) -and ($y -ne 0)) |
|||
{ |
|||
if ( $lengths[$x,$y] -eq $lengths[($x-1),$y]) |
|||
{ |
|||
--$x |
|||
} |
|||
elseif ($lengths[$x,$y] -eq $lengths[$x,($y-1)]) |
|||
{ |
|||
--$y |
|||
} |
|||
else |
|||
{ |
|||
if ($ReferenceObject[($x-1)] -ceq $DifferenceObject[($y-1)]) |
|||
{ |
|||
$longestCommonSubsequence = ,($ReferenceObject[($x-1)]) + $longestCommonSubsequence |
|||
} |
|||
--$x |
|||
--$y |
|||
} |
|||
} |
|||
$longestCommonSubsequence |
|||
} |
|||
</syntaxhighlight> |
|||
Returns the character array as a string: |
|||
<syntaxhighlight lang="powershell"> |
|||
(Get-Lcs -ReferenceObject "thisisatest" -DifferenceObject "testing123testing") -join "" |
|||
</syntaxhighlight> |
|||
{{Out}} |
|||
<pre> |
|||
tsitest |
|||
</pre> |
|||
Returns an array of integers: |
|||
<syntaxhighlight lang="powershell"> |
|||
Get-Lcs -ReferenceObject @(1,2,3,4) -DifferenceObject @(1,2,2,4,5,3,3,3,2,4) |
|||
</syntaxhighlight> |
|||
{{Out}} |
|||
<pre> |
|||
1 |
|||
2 |
|||
3 |
|||
4 |
|||
</pre> |
|||
Given two lists of objects, return the LCS of the ID property: |
|||
<syntaxhighlight lang="powershell"> |
|||
$list1 |
|||
ID X Y |
|||
-- - - |
|||
1 101 201 |
|||
2 102 202 |
|||
3 103 203 |
|||
4 104 204 |
|||
5 105 205 |
|||
6 106 206 |
|||
7 107 207 |
|||
8 108 208 |
|||
9 109 209 |
|||
$list2 |
|||
ID X Y |
|||
-- - - |
|||
1 101 201 |
|||
3 103 203 |
|||
5 105 205 |
|||
7 107 207 |
|||
9 109 209 |
|||
Get-Lcs -ReferenceObject $list1.ID -DifferenceObject $list2.ID |
|||
</syntaxhighlight> |
|||
{{Out}} |
|||
<pre> |
|||
1 |
|||
3 |
|||
5 |
|||
7 |
|||
9 |
|||
</pre> |
|||
=={{header|Prolog}}== |
=={{header|Prolog}}== |
||
===Recursive Version=== |
===Recursive Version=== |
||
First version: |
First version: |
||
< |
<syntaxhighlight lang="prolog">test :- |
||
time(lcs("thisisatest", "testing123testing", Lcs)), |
time(lcs("thisisatest", "testing123testing", Lcs)), |
||
writef('%s',[Lcs]). |
writef('%s',[Lcs]). |
||
Line 1,864: | Line 2,550: | ||
length(L1,Length1), |
length(L1,Length1), |
||
length(L2,Length2), |
length(L2,Length2), |
||
((Length1 > Length2) -> Longest = L1; Longest = L2).</ |
((Length1 > Length2) -> Longest = L1; Longest = L2).</syntaxhighlight> |
||
Second version, with memoization: |
Second version, with memoization: |
||
< |
<syntaxhighlight lang="prolog">%declare that we will add lcs_db facts during runtime |
||
:- dynamic lcs_db/3. |
:- dynamic lcs_db/3. |
||
Line 1,894: | Line 2,580: | ||
length(L1,Length1), |
length(L1,Length1), |
||
length(L2,Length2), |
length(L2,Length2), |
||
((Length1 > Length2) -> Longest = L1; Longest = L2).</ |
((Length1 > Length2) -> Longest = L1; Longest = L2).</syntaxhighlight> |
||
{{out|Demonstrating}} |
{{out|Demonstrating}} |
||
Example for "beginning-middle-ending" and "beginning-diddle-dum-ending" <BR> |
Example for "beginning-middle-ending" and "beginning-diddle-dum-ending" <BR> |
||
First version : |
First version : |
||
< |
<syntaxhighlight lang="prolog">?- time(lcs("beginning-middle-ending","beginning-diddle-dum-ending", Lcs)),writef('%s', [Lcs]). |
||
% 10,875,184 inferences, 1.840 CPU in 1.996 seconds (92% CPU, 5910426 Lips) |
% 10,875,184 inferences, 1.840 CPU in 1.996 seconds (92% CPU, 5910426 Lips) |
||
beginning-iddle-ending</ |
beginning-iddle-ending</syntaxhighlight> |
||
Second version which is much faster : |
Second version which is much faster : |
||
< |
<syntaxhighlight lang="prolog">?- time(lcs("beginning-middle-ending","beginning-diddle-dum-ending", Lcs)),writef('%s', [Lcs]). |
||
% 2,376 inferences, 0.010 CPU in 0.003 seconds (300% CPU, 237600 Lips) |
% 2,376 inferences, 0.010 CPU in 0.003 seconds (300% CPU, 237600 Lips) |
||
beginning-iddle-ending</ |
beginning-iddle-ending</syntaxhighlight> |
||
=={{header|PureBasic}}== |
=={{header|PureBasic}}== |
||
{{trans|Basic}} |
{{trans|Basic}} |
||
< |
<syntaxhighlight lang="purebasic">Procedure.s lcs(a$, b$) |
||
Protected x$ , lcs$ |
Protected x$ , lcs$ |
||
If Len(a$) = 0 Or Len(b$) = 0 |
If Len(a$) = 0 Or Len(b$) = 0 |
||
Line 1,927: | Line 2,613: | ||
OpenConsole() |
OpenConsole() |
||
PrintN( lcs("thisisatest", "testing123testing")) |
PrintN( lcs("thisisatest", "testing123testing")) |
||
PrintN("Press any key to exit"): Repeat: Until Inkey() <> ""</ |
PrintN("Press any key to exit"): Repeat: Until Inkey() <> ""</syntaxhighlight> |
||
=={{header|Python}}== |
=={{header|Python}}== |
||
Line 1,934: | Line 2,620: | ||
===Recursion=== |
===Recursion=== |
||
This solution is similar to the Haskell one. It is slow. |
This solution is similar to the Haskell one. It is slow. |
||
< |
<syntaxhighlight lang="python">def lcs(xstr, ystr): |
||
""" |
""" |
||
>>> lcs('thisisatest', 'testing123testing') |
>>> lcs('thisisatest', 'testing123testing') |
||
Line 1,943: | Line 2,629: | ||
x, xs, y, ys = xstr[0], xstr[1:], ystr[0], ystr[1:] |
x, xs, y, ys = xstr[0], xstr[1:], ystr[0], ystr[1:] |
||
if x == y: |
if x == y: |
||
return |
return str(lcs(xs, ys)) + x |
||
else: |
else: |
||
return max(lcs(xstr, ys), lcs(xs, ystr), key=len)</ |
return max(lcs(xstr, ys), lcs(xs, ystr), key=len)</syntaxhighlight> |
||
Test it: |
Test it: |
||
< |
<syntaxhighlight lang="python">if __name__=="__main__": |
||
import doctest; doctest.testmod()</ |
import doctest; doctest.testmod()</syntaxhighlight> |
||
===Dynamic Programming=== |
===Dynamic Programming=== |
||
<syntaxhighlight lang="python">def lcs(a, b): |
|||
{{trans|Java}} |
|||
# generate matrix of length of longest common subsequence for substrings of both words |
|||
<lang python>def lcs(a, b): |
|||
lengths = [[0 |
lengths = [[0] * (len(b)+1) for _ in range(len(a)+1)] |
||
# row 0 and column 0 are initialized to 0 already |
|||
for i, x in enumerate(a): |
for i, x in enumerate(a): |
||
for j, y in enumerate(b): |
for j, y in enumerate(b): |
||
Line 1,960: | Line 2,645: | ||
lengths[i+1][j+1] = lengths[i][j] + 1 |
lengths[i+1][j+1] = lengths[i][j] + 1 |
||
else: |
else: |
||
lengths[i+1][j+1] = |
lengths[i+1][j+1] = max(lengths[i+1][j], lengths[i][j+1]) |
||
max(lengths[i+1][j], lengths[i][j+1]) |
|||
# read |
# read a substring from the matrix |
||
result = |
result = '' |
||
j = len(b) |
|||
for i in range(1, len(a)+1): |
|||
if lengths[ |
if lengths[i][j] != lengths[i-1][j]: |
||
result += a[i-1] |
|||
elif lengths[x][y] == lengths[x][y-1]: |
|||
return result</syntaxhighlight> |
|||
y -= 1 |
|||
else: |
|||
assert a[x-1] == b[y-1] |
|||
result = a[x-1] + result |
|||
x -= 1 |
|||
y -= 1 |
|||
return result</lang> |
|||
=={{header|Racket}}== |
=={{header|Racket}}== |
||
< |
<syntaxhighlight lang="racket">#lang racket |
||
(define (longest xs ys) |
(define (longest xs ys) |
||
(if (> (length xs) (length ys)) |
(if (> (length xs) (length ys)) |
||
Line 2,003: | Line 2,682: | ||
(list->string (lcs/list (string->list sx) (string->list sy)))) |
(list->string (lcs/list (string->list sx) (string->list sy)))) |
||
(lcs "thisisatest" "testing123testing")</ |
(lcs "thisisatest" "testing123testing")</syntaxhighlight> |
||
{{out}} |
{{out}} |
||
<pre>"tsitest"></pre> |
<pre>"tsitest"></pre> |
||
=={{header|Raku}}== |
|||
(formerly Perl 6) |
|||
===Recursion=== |
|||
{{works with|rakudo|2015-09-16}} |
|||
This solution is similar to the Haskell one. It is slow. |
|||
<syntaxhighlight lang="raku" line>say lcs("thisisatest", "testing123testing");sub lcs(Str $xstr, Str $ystr) { |
|||
return "" unless $xstr && $ystr; |
|||
my ($x, $xs, $y, $ys) = $xstr.substr(0, 1), $xstr.substr(1), $ystr.substr(0, 1), $ystr.substr(1); |
|||
return $x eq $y |
|||
?? $x ~ lcs($xs, $ys) |
|||
!! max(:by{ $^a.chars }, lcs($xstr, $ys), lcs($xs, $ystr) ); |
|||
} |
|||
say lcs("thisisatest", "testing123testing");</syntaxhighlight> |
|||
===Dynamic Programming=== |
|||
{{trans|Java}} |
|||
<syntaxhighlight lang="raku" line> |
|||
sub lcs(Str $xstr, Str $ystr) { |
|||
my ($xlen, $ylen) = ($xstr, $ystr)>>.chars; |
|||
my @lengths = map {[(0) xx ($ylen+1)]}, 0..$xlen; |
|||
for $xstr.comb.kv -> $i, $x { |
|||
for $ystr.comb.kv -> $j, $y { |
|||
@lengths[$i+1][$j+1] = $x eq $y ?? @lengths[$i][$j]+1 !! (@lengths[$i+1][$j], @lengths[$i][$j+1]).max; |
|||
} |
|||
} |
|||
my @x = $xstr.comb; |
|||
my ($x, $y) = ($xlen, $ylen); |
|||
my $result = ""; |
|||
while $x != 0 && $y != 0 { |
|||
if @lengths[$x][$y] == @lengths[$x-1][$y] { |
|||
$x--; |
|||
} |
|||
elsif @lengths[$x][$y] == @lengths[$x][$y-1] { |
|||
$y--; |
|||
} |
|||
else { |
|||
$result = @x[$x-1] ~ $result; |
|||
$x--; |
|||
$y--; |
|||
} |
|||
} |
|||
return $result; |
|||
} |
|||
say lcs("thisisatest", "testing123testing");</syntaxhighlight> |
|||
===Bit Vector=== |
|||
Bit parallel dynamic programming with nearly linear complexity O(n). It is fast. |
|||
<syntaxhighlight lang="raku" line>sub lcs(Str $xstr, Str $ystr) { |
|||
my (@a, @b) := ($xstr, $ystr)».comb; |
|||
my (%positions, @Vs, $lcs); |
|||
for @a.kv -> $i, $x { %positions{$x} +|= 1 +< ($i % @a) } |
|||
my $S = +^ 0; |
|||
for (0 ..^ @b) -> $j { |
|||
my $u = $S +& (%positions{@b[$j]} // 0); |
|||
@Vs[$j] = $S = ($S + $u) +| ($S - $u) |
|||
} |
|||
my ($i, $j) = @a-1, @b-1; |
|||
while ($i ≥ 0 and $j ≥ 0) { |
|||
unless (@Vs[$j] +& (1 +< $i)) { |
|||
$lcs [R~]= @a[$i] unless $j and ^@Vs[$j-1] +& (1 +< $i); |
|||
$j-- |
|||
} |
|||
$i-- |
|||
} |
|||
$lcs |
|||
} |
|||
say lcs("thisisatest", "testing123testing");</syntaxhighlight> |
|||
=={{header|ReasonML}}== |
|||
<syntaxhighlight lang="ocaml"> |
|||
let longest = (xs, ys) => |
|||
if (List.length(xs) > List.length(ys)) { |
|||
xs; |
|||
} else { |
|||
ys; |
|||
}; |
|||
let rec lcs = (a, b) => |
|||
switch (a, b) { |
|||
| ([], _) |
|||
| (_, []) => [] |
|||
| ([x, ...xs], [y, ...ys]) => |
|||
if (x == y) { |
|||
[x, ...lcs(xs, ys)]; |
|||
} else { |
|||
longest(lcs(a, ys), lcs(xs, b)); |
|||
} |
|||
}; |
|||
</syntaxhighlight> |
|||
=={{header|REXX}}== |
=={{header|REXX}}== |
||
< |
<syntaxhighlight lang="rexx">/*REXX program tests the LCS (Longest Common Subsequence) subroutine. */ |
||
parse arg aaa bbb . /* |
parse arg aaa bbb . /*obtain optional arguments from the CL*/ |
||
say 'string A = |
say 'string A =' aaa /*echo the string A to the screen. */ |
||
say 'string B = |
say 'string B =' bbb /* " " " B " " " */ |
||
say ' LCS = |
say ' LCS =' LCS(aaa, bbb) /*tell the Longest Common Sequence. */ |
||
exit /*stick a fork in it, we're done.*/ |
exit /*stick a fork in it, we're all done. */ |
||
/*──────────────────────────────────────────────────────────────────────────────────────*/ |
|||
/*──────────────────────────────────LCS subroutine──────────────────────*/ |
|||
LCS: procedure; parse arg a,b,z /*Longest Common Subsequence. */ |
|||
/*reduce recursions, removes the */ |
/*reduce recursions, removes the */ |
||
/*chars in A ¬ in B, |
/*chars in A ¬ in B, and vice─versa.*/ |
||
if z=='' then return |
if z=='' then return LCS( LCS(a,b,0), LCS(b,a,0), 9) /*Is Z null? Do recurse. */ |
||
j=length(a) |
j= length(a) |
||
if z==0 then do /*special invocation: shrink Z. */ |
if z==0 then do /*a special invocation: shrink Z. */ |
||
do j=1 for j; _=substr(a,j,1) |
do j=1 for j; _= substr(a, j, 1) |
||
if pos(_,b)\==0 then z=z||_ |
if pos(_, b)\==0 then z= z || _ |
||
end /*j*/ |
end /*j*/ |
||
return substr(z,2) |
return substr(z, 2) |
||
end |
end |
||
k=length(b) |
k= length(b) |
||
if j==0 | k==0 then return '' /* |
if j==0 | k==0 then return '' /*Is either string null? Bupkis. */ |
||
_=substr(a,j,1) |
_= substr(a, j, 1) |
||
if _==substr(b,k,1) then return |
if _==substr(b, k, 1) then return LCS( substr(a, 1, j - 1), substr(b, 1, k - 1), 9)_ |
||
x= |
x= LCS(a, substr(b, 1, k - 1), 9) |
||
y= |
y= LCS( substr(a, 1, j - 1), b, 9) |
||
if length(x)>length(y) then return x |
if length(x)>length(y) then return x |
||
return y</ |
return y</syntaxhighlight> |
||
{{out| |
{{out|output|text= when using the input of: <tt> 1234 1224533324 </tt>}} |
||
<pre> |
<pre> |
||
string A=1234 |
string A = 1234 |
||
string B=1224533324 |
string B = 1224533324 |
||
LCS=1234 |
LCS = 1234 |
||
</pre> |
</pre> |
||
{{out| |
{{out|output|text= when using the input of: <tt> thisisatest testing123testing </tt>}} |
||
<pre> |
<pre> |
||
string A=thisisatest |
string A = thisisatest |
||
string B=testing123testing |
string B = testing123testing |
||
LCS=tsitest |
LCS = tsitest |
||
</pre> |
|||
=={{header|Ring}}== |
|||
<syntaxhighlight lang="ring"> |
|||
see longest("1267834", "1224533324") + nl |
|||
func longest a, b |
|||
if a = "" or b = "" return "" ok |
|||
if right(a, 1) = right(b, 1) |
|||
lcs = longest(left(a, len(a) - 1), left(b, len(b) - 1)) + right(a, 1) |
|||
return lcs |
|||
else |
|||
x1 = longest(a, left(b, len(b) - 1)) |
|||
x2 = longest(left(a, len(a) - 1), b) |
|||
if len(x1) > len(x2) |
|||
lcs = x1 |
|||
return lcs |
|||
else |
|||
lcs = x2 |
|||
return lcs ok ok |
|||
</syntaxhighlight> |
|||
Output: |
|||
<pre> |
|||
1234 |
|||
</pre> |
</pre> |
||
Line 2,051: | Line 2,855: | ||
This solution is similar to the Haskell one. It is slow (The time complexity is exponential.) |
This solution is similar to the Haskell one. It is slow (The time complexity is exponential.) |
||
{{works with|Ruby|1.9}} |
{{works with|Ruby|1.9}} |
||
< |
<syntaxhighlight lang="ruby">=begin |
||
irb(main):001:0> lcs('thisisatest', 'testing123testing') |
irb(main):001:0> lcs('thisisatest', 'testing123testing') |
||
=> "tsitest" |
=> "tsitest" |
||
Line 2,064: | Line 2,868: | ||
[lcs(xstr, ys), lcs(xs, ystr)].max_by {|x| x.size} |
[lcs(xstr, ys), lcs(xs, ystr)].max_by {|x| x.size} |
||
end |
end |
||
end</ |
end</syntaxhighlight> |
||
===Dynamic programming=== |
===Dynamic programming=== |
||
Line 2,071: | Line 2,875: | ||
Walker class for the LCS matrix: |
Walker class for the LCS matrix: |
||
< |
<syntaxhighlight lang="ruby">class LCS |
||
SELF, LEFT, UP, DIAG = [0,0], [0,-1], [-1,0], [-1,-1] |
SELF, LEFT, UP, DIAG = [0,0], [0,-1], [-1,0], [-1,-1] |
||
Line 2,127: | Line 2,931: | ||
end |
end |
||
end |
end |
||
end |
end |
||
lcs function: |
|||
def lcs(a, b) |
|||
walker = LCS.new(a, b) |
walker = LCS.new(a, b) |
||
walker.backtrack. |
walker.backtrack.map{|i| a[i]}.join |
||
end |
end |
||
if $0 == __FILE__ |
|||
puts lcs('thisisatest', 'testing123testing') |
|||
puts lcs( |
puts lcs('thisisatest', 'testing123testing') |
||
puts lcs("rosettacode", "raisethysword") |
|||
end</syntaxhighlight> |
|||
{{out}} |
{{out}} |
||
Line 2,144: | Line 2,948: | ||
rsetod |
rsetod |
||
</pre> |
</pre> |
||
Referring to LCS [[Levenshtein distance/Alignment#Ruby|here]]. |
Referring to LCS [[Levenshtein distance/Alignment#Ruby|here]] and [[Shortest common supersequence#Ruby|here]]. |
||
=={{header|Run BASIC}}== |
=={{header|Run BASIC}}== |
||
< |
<syntaxhighlight lang="runbasic">a$ = "aebdaef" |
||
b$ = "cacbac" |
b$ = "cacbac" |
||
print lcs$(a$,b$) |
print lcs$(a$,b$) |
||
Line 2,173: | Line 2,977: | ||
END IF |
END IF |
||
[ext] |
[ext] |
||
END FUNCTION</ |
END FUNCTION</syntaxhighlight><pre>aba</pre> |
||
=={{header| |
=={{header|Rust}}== |
||
Dynamic programming version: |
|||
{{improve|Scala}}<!-- Why? This template added in place of direct categorization that failed to give any reason. --> |
|||
<syntaxhighlight lang="rust"> |
|||
{{trans|Java}}{{works with|Scala 2.9.1}} |
|||
use std::cmp; |
|||
<lang scala>object LCS extends App { |
|||
fn lcs(string1: String, string2: String) -> (usize, String){ |
|||
// recursive version: |
|||
let total_rows = string1.len() + 1; |
|||
def lcsr(a: String, b: String): String = { |
|||
let total_columns = string2.len() + 1; |
|||
if (a.size==0 || b.size==0) "" |
|||
// rust doesn't allow accessing string by index |
|||
else if (a==b) a |
|||
let string1_chars = string1.as_bytes(); |
|||
else |
|||
let string2_chars = string2.as_bytes(); |
|||
if(a(a.size-1)==b(b.size-1)) lcsr(a.substring(0,a.size-1),b.substring(0,b.size-1))+a(a.size-1) |
|||
let mut table = vec![vec![0; total_columns]; total_rows]; |
|||
for row in 1..total_rows{ |
|||
for col in 1..total_columns { |
|||
if string1_chars[row - 1] == string2_chars[col - 1]{ |
|||
table[row][col] = table[row - 1][col - 1] + 1; |
|||
} else { |
|||
table[row][col] = cmp::max(table[row][col-1], table[row-1][col]); |
|||
} |
|||
} |
|||
} |
|||
let mut common_seq = Vec::new(); |
|||
let mut x = total_rows - 1; |
|||
let mut y = total_columns - 1; |
|||
while x != 0 && y != 0 { |
|||
// Check element above is equal |
|||
if table[x][y] == table[x - 1][y] { |
|||
x = x - 1; |
|||
} |
|||
// check element to the left is equal |
|||
else if table[x][y] == table[x][y - 1] { |
|||
y = y - 1; |
|||
} |
|||
else { |
else { |
||
// check the two element at the respective x,y position is same |
|||
val x = lcsr(a,b.substring(0,b.size-1)) |
|||
assert_eq!(string1_chars[x-1], string2_chars[y-1]); |
|||
let char = string1_chars[x - 1]; |
|||
common_seq.push(char); |
|||
x = x - 1; |
|||
y = y - 1; |
|||
} |
} |
||
} |
|||
common_seq.reverse(); |
|||
(table[total_rows - 1][total_columns - 1], String::from_utf8(common_seq).unwrap()) |
|||
} |
|||
fn main() { |
|||
let res = lcs("abcdaf".to_string(), "acbcf".to_string()); |
|||
assert_eq!((4 as usize, "abcf".to_string()), res); |
|||
let res = lcs("thisisatest".to_string(), "testing123testing".to_string()); |
|||
assert_eq!((7 as usize, "tsitest".to_string()), res); |
|||
// LCS for input Sequences “AGGTAB” and “GXTXAYB” is “GTAB” of length 4. |
|||
let res = lcs("AGGTAB".to_string(), "GXTXAYB".to_string()); |
|||
assert_eq!((4 as usize, "GTAB".to_string()), res); |
|||
}</syntaxhighlight> |
|||
=={{header|Scala}}== |
|||
{{works with|Scala 2.13}} |
|||
Using lazily evaluated lists: |
|||
<syntaxhighlight lang="scala"> def lcsLazy[T](u: IndexedSeq[T], v: IndexedSeq[T]): IndexedSeq[T] = { |
|||
def su = subsets(u).to(LazyList) |
|||
def sv = subsets(v).to(LazyList) |
|||
su.intersect(sv).headOption match{ |
|||
case Some(sub) => sub |
|||
case None => IndexedSeq[T]() |
|||
} |
|||
} |
} |
||
def subsets[T](u: IndexedSeq[T]): Iterator[IndexedSeq[T]] = { |
|||
// dynamic programming version: |
|||
u.indices.reverseIterator.flatMap{n => u.indices.combinations(n + 1).map(_.map(u))} |
|||
def lcsd(a: String, b: String): String = { |
|||
}</syntaxhighlight> |
|||
if (a.size==0 || b.size==0) "" |
|||
else if (a==b) a |
|||
Using recursion: |
|||
else { |
|||
<syntaxhighlight lang="scala"> def lcsRec[T]: (IndexedSeq[T], IndexedSeq[T]) => IndexedSeq[T] = { |
|||
val lengths = Array.ofDim[Int](a.size+1,b.size+1) |
|||
case (a +: as, b +: bs) if a == b => a +: lcsRec(as, bs) |
|||
case (as, bs) if as.isEmpty || bs.isEmpty => IndexedSeq[T]() |
|||
for (j <- 0 until b.size) |
|||
case (a +: as, b +: bs) => |
|||
val (s1, s2) = (lcsRec(a +: as, bs), lcsRec(as, b +: bs)) |
|||
if(s1.length > s2.length) s1 else s2 |
|||
}</syntaxhighlight> |
|||
lengths(i+1)(j+1) = scala.math.max(lengths(i+1)(j),lengths(i)(j+1)) |
|||
{{out}} |
|||
// read the substring out from the matrix |
|||
<pre>scala> lcsLazy("thisisatest", "testing123testing").mkString |
|||
val sb = new StringBuilder() |
|||
res0: String = tsitest |
|||
var x = a.size |
|||
var y = b.size |
|||
scala> lcsRec("thisisatest", "testing123testing").mkString |
|||
do { |
|||
res1: String = tsitest</pre> |
|||
if (lengths(x)(y) == lengths(x-1)(y)) |
|||
{{works with|Scala 2.9.3}} |
|||
x -= 1 |
|||
Recursive version: |
|||
else if (lengths(x)(y) == lengths(x)(y-1)) |
|||
<syntaxhighlight lang="scala"> def lcs[T]: (List[T], List[T]) => List[T] = { |
|||
y -= 1 |
|||
case (_, Nil) => Nil |
|||
case (Nil, _) => Nil |
|||
case (x :: xs, y :: ys) if x == y => x :: lcs(xs, ys) |
|||
case (x :: xs, y :: ys) => { |
|||
(lcs(x :: xs, ys), lcs(xs, y :: ys)) match { |
|||
case (xs, ys) if xs.length > ys.length => xs |
|||
} |
|||
case (xs, ys) => ys |
|||
sb.toString.reverse |
|||
} |
} |
||
} |
} |
||
}</syntaxhighlight> |
|||
val elapsed: (=> Unit) => Long = f => {val s = System.currentTimeMillis; f; (System.currentTimeMillis - s)/1000} |
|||
val pairs = List(("thisiaatest","testing123testing") |
|||
,("","x") |
|||
,("x","x") |
|||
,("beginning-middle-ending", "beginning-diddle-dum-ending")) |
|||
The dynamic programming version: |
|||
var s = "" |
|||
println("recursive version:") |
|||
pairs foreach {p => |
|||
println{val t = elapsed(s = lcsr(p._1,p._2)) |
|||
"lcsr(\""+p._1+"\",\""+p._2+"\") = \""+s+"\" ("+t+" sec)"} |
|||
} |
|||
<syntaxhighlight lang="scala"> case class Memoized[A1, A2, B](f: (A1, A2) => B) extends ((A1, A2) => B) { |
|||
println("\n"+"dynamic programming version:") |
|||
val cache = scala.collection.mutable.Map.empty[(A1, A2), B] |
|||
pairs foreach {p => |
|||
def apply(x: A1, y: A2) = cache.getOrElseUpdate((x, y), f(x, y)) |
|||
println{val t = elapsed(s = lcsd(p._1,p._2)) |
|||
"lcsd(\""+p._1+"\",\""+p._2+"\") = \""+s+"\" ("+t+" sec)"} |
|||
} |
} |
||
}</lang>{{out}} |
|||
lazy val lcsM: Memoized[List[Char], List[Char], List[Char]] = Memoized { |
|||
recursive version: |
|||
case (_, Nil) => Nil |
|||
lcsr("thisiaatest","testing123testing") = "tsitest" (0 sec) |
|||
case (Nil, _) => Nil |
|||
lcsr("","x") = "" (0 sec) |
|||
case (x :: xs, y :: ys) if x == y => x :: lcsM(xs, ys) |
|||
case (x :: xs, y :: ys) => { |
|||
lcsr("beginning-middle-ending","beginning-diddle-dum-ending") = "beginning-iddle-ending" (29 sec) |
|||
(lcsM(x :: xs, ys), lcsM(xs, y :: ys)) match { |
|||
case (xs, ys) if xs.length > ys.length => xs |
|||
dynamic programming version: |
|||
case (xs, ys) => ys |
|||
lcsd("thisiaatest","testing123testing") = "tsitest" (0 sec) |
|||
} |
|||
} |
|||
lcsd("x","x") = "x" (0 sec) |
|||
}</syntaxhighlight> |
|||
lcsd("beginning-middle-ending","beginning-diddle-dum-ending") = "beginning-iddle-ending" (0 sec) |
|||
{{out}} |
|||
scala> lcsM("thisiaatest".toList, "testing123testing".toList).mkString |
|||
res0: String = tsitest |
|||
=={{header|Scheme}}== |
=={{header|Scheme}}== |
||
Line 2,262: | Line 3,111: | ||
Port from Clojure. |
Port from Clojure. |
||
< |
<syntaxhighlight lang="scheme"> |
||
;; using srfi-69 |
;; using srfi-69 |
||
(define (memoize proc) |
(define (memoize proc) |
||
Line 2,292: | Line 3,141: | ||
'())) |
'())) |
||
'())))) |
'())))) |
||
</syntaxhighlight> |
|||
</lang> |
|||
Testing: |
Testing: |
||
< |
<syntaxhighlight lang="scheme"> |
||
(test-group |
(test-group |
||
Line 2,309: | Line 3,158: | ||
(lcs '(a b d e f g h i j) |
(lcs '(a b d e f g h i j) |
||
'(A b c d e f F a g h j)))) |
'(A b c d e f F a g h j)))) |
||
</syntaxhighlight> |
|||
</lang> |
|||
=={{header|Seed7}}== |
=={{header|Seed7}}== |
||
< |
<syntaxhighlight lang="seed7">$ include "seed7_05.s7i"; |
||
const func string: lcs (in string: a, in string: b) is func |
const func string: lcs (in string: a, in string: b) is func |
||
Line 2,340: | Line 3,189: | ||
writeln(lcs("thisisatest", "testing123testing")); |
writeln(lcs("thisisatest", "testing123testing")); |
||
writeln(lcs("1234", "1224533324")); |
writeln(lcs("1234", "1224533324")); |
||
end func;</ |
end func;</syntaxhighlight> |
||
Output: |
Output: |
||
Line 2,346: | Line 3,195: | ||
tsitest |
tsitest |
||
1234 |
1234 |
||
</pre> |
|||
=={{header|SequenceL}}== |
|||
{{trans|C#}} |
|||
It is interesting to note that x and y are computed in parallel, dividing work across threads repeatedly down through the recursion. |
|||
<syntaxhighlight lang="sequencel">import <Utilities/Sequence.sl>; |
|||
lcsBack(a(1), b(1)) := |
|||
let |
|||
aSub := allButLast(a); |
|||
bSub := allButLast(b); |
|||
x := lcsBack(a, bSub); |
|||
y := lcsBack(aSub, b); |
|||
in |
|||
[] when size(a) = 0 or size(b) = 0 |
|||
else |
|||
lcsBack(aSub, bSub) ++ [last(a)] when last(a) = last(b) |
|||
else |
|||
x when size(x) > size(y) |
|||
else |
|||
y; |
|||
main(args(2)) := |
|||
lcsBack(args[1], args[2]) when size(args) >=2 |
|||
else |
|||
lcsBack("thisisatest", "testing123testing");</syntaxhighlight> |
|||
{{out}} |
|||
<pre> |
|||
"tsitest" |
|||
</pre> |
</pre> |
||
=={{header|SETL}}== |
=={{header|SETL}}== |
||
Recursive; Also works on tuples (vectors) |
Recursive; Also works on tuples (vectors) |
||
< |
<syntaxhighlight lang="setl"> op .longest(a, b); |
||
return if #a > #b then a else b end; |
return if #a > #b then a else b end; |
||
end .longest; |
end .longest; |
||
Line 2,362: | Line 3,244: | ||
return lcs(a(2..), b) .longest lcs(a, b(2..)); |
return lcs(a(2..), b) .longest lcs(a, b(2..)); |
||
end; |
end; |
||
end lcs;</ |
end lcs;</syntaxhighlight> |
||
=={{header|Sidef}}== |
=={{header|Sidef}}== |
||
<lang |
<syntaxhighlight lang="ruby">func lcs(xstr, ystr) is cached { |
||
(xstr.is_empty || ystr.is_empty) && return ''; |
|||
xstr.is_empty && return xstr |
|||
ystr.is_empty && return ystr |
|||
var(x, xs, y, ys) = (xstr.first(1), xstr.slice(1), |
|||
ystr.first(1), ystr.slice(1)) |
|||
if (x == y) { |
if (x == y) { |
||
x + lcs( |
x + lcs(xs, ys) |
||
} else { |
} else { |
||
[lcs(xstr, ys), lcs(xs, ystr)].max_by { |
[lcs(xstr, ys), lcs(xs, ystr)].max_by { .len } |
||
} |
} |
||
} |
} |
||
say lcs("thisisatest", "testing123testing") |
say lcs("thisisatest", "testing123testing")</syntaxhighlight> |
||
{{out}} |
{{out}} |
||
<pre> |
|||
<pre>% time sidef -Mblock lcs.sf |
|||
tsitest |
tsitest |
||
</pre> |
|||
sidef -Mblock lcs.sf 0.23s user 0.01s system 97% cpu 0.240 total</pre> |
|||
=={{header|Slate}}== |
=={{header|Slate}}== |
||
Line 2,388: | Line 3,272: | ||
===Recursion=== |
===Recursion=== |
||
{{trans|Java}} |
{{trans|Java}} |
||
< |
<syntaxhighlight lang="slate">s1@(Sequence traits) longestCommonSubsequenceWith: s2@(Sequence traits) |
||
[ |
[ |
||
s1 isEmpty \/ s2 isEmpty ifTrue: [^ {}]. |
s1 isEmpty \/ s2 isEmpty ifTrue: [^ {}]. |
||
Line 2,397: | Line 3,281: | ||
y: (s1 allButLast longestCommonSubsequenceWith: s2). |
y: (s1 allButLast longestCommonSubsequenceWith: s2). |
||
x length > y length ifTrue: [x] ifFalse: [y]] |
x length > y length ifTrue: [x] ifFalse: [y]] |
||
].</ |
].</syntaxhighlight> |
||
===Dynamic Programming=== |
===Dynamic Programming=== |
||
{{trans|Ruby}} |
{{trans|Ruby}} |
||
< |
<syntaxhighlight lang="slate">s1@(Sequence traits) longestCommonSubsequenceWith: s2@(Sequence traits) |
||
[| lengths | |
[| lengths | |
||
lengths: (ArrayMD newWithDimensions: {s1 length `cache. s2 length `cache} defaultElement: 0). |
lengths: (ArrayMD newWithDimensions: {s1 length `cache. s2 length `cache} defaultElement: 0). |
||
Line 2,423: | Line 3,307: | ||
index2: index2 - 1]] |
index2: index2 - 1]] |
||
] writingAs: s1) reverse |
] writingAs: s1) reverse |
||
].</ |
].</syntaxhighlight> |
||
=={{header|Swift}}== |
|||
Swift 5.1 |
|||
===Recursion=== |
|||
<syntaxhighlight lang="swift">rlcs(_ s1: String, _ s2: String) -> String { |
|||
if s1.count == 0 || s2.count == 0 { |
|||
return "" |
|||
} else if s1[s1.index(s1.endIndex, offsetBy: -1)] == s2[s2.index(s2.endIndex, offsetBy: -1)] { |
|||
return rlcs(String(s1[s1.startIndex..<s1.index(s1.endIndex, offsetBy: -1)]), |
|||
String(s2[s2.startIndex..<s2.index(s2.endIndex, offsetBy: -1)])) + String(s1[s1.index(s1.endIndex, offsetBy: -1)]) |
|||
} else { |
|||
let str1 = rlcs(s1, String(s2[s2.startIndex..<s2.index(s2.endIndex, offsetBy: -1)])) |
|||
let str2 = rlcs(String(s1[s1.startIndex..<s1.index(s1.endIndex, offsetBy: -1)]), s2) |
|||
return str1.count > str2.count ? str1 : str2 |
|||
} |
|||
}</syntaxhighlight> |
|||
===Dynamic Programming=== |
|||
<syntaxhighlight lang="swift">func lcs(_ s1: String, _ s2: String) -> String { |
|||
var lens = Array( |
|||
repeating:Array(repeating: 0, count: s2.count + 1), |
|||
count: s1.count + 1 |
|||
) |
|||
for i in 0..<s1.count { |
|||
for j in 0..<s2.count { |
|||
if s1[s1.index(s1.startIndex, offsetBy: i)] == s2[s2.index(s2.startIndex, offsetBy: j)] { |
|||
lens[i + 1][j + 1] = lens[i][j] + 1 |
|||
} else { |
|||
lens[i + 1][j + 1] = max(lens[i + 1][j], lens[i][j + 1]) |
|||
} |
|||
} |
|||
} |
|||
var returnStr = "" |
|||
var x = s1.count |
|||
var y = s2.count |
|||
while x != 0 && y != 0 { |
|||
if lens[x][y] == lens[x - 1][y] { |
|||
x -= 1 |
|||
} else if lens[x][y] == lens[x][y - 1] { |
|||
y -= 1 |
|||
} else { |
|||
returnStr += String(s1[s1.index(s1.startIndex, offsetBy: x - 1)]) |
|||
x -= 1 |
|||
y -= 1 |
|||
} |
|||
} |
|||
return String(returnStr.reversed()) |
|||
}</syntaxhighlight> |
|||
=={{header|Tcl}}== |
=={{header|Tcl}}== |
||
===Recursive=== |
===Recursive=== |
||
{{trans|Java}} |
{{trans|Java}} |
||
< |
<syntaxhighlight lang="tcl">proc r_lcs {a b} { |
||
if {$a eq "" || $b eq ""} {return ""} |
if {$a eq "" || $b eq ""} {return ""} |
||
set a_ [string range $a 1 end] |
set a_ [string range $a 1 end] |
||
Line 2,439: | Line 3,375: | ||
return [expr {[string length $x] > [string length $y] ? $x :$y}] |
return [expr {[string length $x] > [string length $y] ? $x :$y}] |
||
} |
} |
||
}</ |
}</syntaxhighlight> |
||
===Dynamic=== |
===Dynamic=== |
||
{{trans|Java}} |
{{trans|Java}} |
||
{{works with|Tcl|8.5}} |
{{works with|Tcl|8.5}} |
||
< |
<syntaxhighlight lang="tcl">package require Tcl 8.5 |
||
namespace import ::tcl::mathop::+ |
namespace import ::tcl::mathop::+ |
||
namespace import ::tcl::mathop::- |
namespace import ::tcl::mathop::- |
||
Line 2,466: | Line 3,402: | ||
set x $la |
set x $la |
||
set y $lb |
set y $lb |
||
while {$x >0 && $ |
while {$x > 0 && $y > 0} { |
||
if {[lindex $lengths $x $y] == [lindex $lengths [- $x 1] $y]} { |
if {[lindex $lengths $x $y] == [lindex $lengths [- $x 1] $y]} { |
||
incr x -1 |
incr x -1 |
||
Line 2,481: | Line 3,417: | ||
} |
} |
||
return [string reverse $result] |
return [string reverse $result] |
||
}</ |
}</syntaxhighlight> |
||
===Performance Comparison=== |
===Performance Comparison=== |
||
< |
<syntaxhighlight lang="tcl">% time {d_lcs thisisatest testing123testing} 10 |
||
637.5 microseconds per iteration |
637.5 microseconds per iteration |
||
% time {r_lcs thisisatest testing123testing} 10 |
% time {r_lcs thisisatest testing123testing} 10 |
||
1275566.8 microseconds per iteration</ |
1275566.8 microseconds per iteration</syntaxhighlight> |
||
=={{header|Ursala}}== |
=={{header|Ursala}}== |
||
This uses the same recursive algorithm as in the Haskell example, |
This uses the same recursive algorithm as in the Haskell example, |
||
and works on lists of any type. |
and works on lists of any type. |
||
< |
<syntaxhighlight lang="ursala">#import std |
||
lcs = ~&alrB^& ~&E?abh/~&alh2fabt2RC @faltPrXlrtPXXPW leql?/~&r ~&l</ |
lcs = ~&alrB^& ~&E?abh/~&alh2fabt2RC @faltPrXlrtPXXPW leql?/~&r ~&l</syntaxhighlight> |
||
test program: |
test program: |
||
< |
<syntaxhighlight lang="ursala">#cast %s |
||
example = lcs('thisisatest','testing123testing')</ |
example = lcs('thisisatest','testing123testing')</syntaxhighlight> |
||
{{out}} |
{{out}} |
||
<pre>'tsitest'</pre> |
<pre>'tsitest'</pre> |
||
=={{header|Wren}}== |
|||
{{trans|Kotlin}} |
|||
<syntaxhighlight lang="wren">var lcs // recursive |
|||
lcs = Fn.new { |x, y| |
|||
if (x.count == 0 || y.count == 0) return "" |
|||
var x1 = x[0...-1] |
|||
var y1 = y[0...-1] |
|||
if (x[-1] == y[-1]) return lcs.call(x1, y1) + x[-1] |
|||
var x2 = lcs.call(x, y1) |
|||
var y2 = lcs.call(x1, y) |
|||
return (x2.count > y2.count) ? x2 : y2 |
|||
} |
|||
var x = "thisisatest" |
|||
var y = "testing123testing" |
|||
System.print(lcs.call(x, y))</syntaxhighlight> |
|||
{{out}} |
|||
<pre> |
|||
tsitest |
|||
</pre> |
|||
=={{header|zkl}}== |
=={{header|zkl}}== |
||
This is quite vile in terms of [time] efficiency, another algorithm should be used for real work. |
This is quite vile in terms of [time] efficiency, another algorithm should be used for real work. |
||
{{trans|D}} |
{{trans|D}} |
||
< |
<syntaxhighlight lang="zkl">fcn lcs(a,b){ |
||
if(not a or not b) return(""); |
if(not a or not b) return(""); |
||
if (a[0]==b[0]) return(a[0] + self.fcn(a[1,*],b[1,*])); |
if (a[0]==b[0]) return(a[0] + self.fcn(a[1,*],b[1,*])); |
||
return(fcn(x,y){if(x.len()>y.len())x else y}(lcs(a,b[1,*]),lcs(a[1,*],b))) |
return(fcn(x,y){if(x.len()>y.len())x else y}(lcs(a,b[1,*]),lcs(a[1,*],b))) |
||
}</ |
}</syntaxhighlight> |
||
The last line looks strange but it is just return(lambda longest(lcs.lcs)) |
The last line looks strange but it is just return(lambda longest(lcs.lcs)) |
||
{{out}} |
{{out}} |
Latest revision as of 17:55, 4 July 2024
You are encouraged to solve this task according to the task description, using any language you may know.
Introduction
Define a subsequence to be any output string obtained by deleting zero or more symbols from an input string.
The Longest Common Subsequence (LCS) is a subsequence of maximum length common to two or more strings.
Let A ≡ A[0]… A[m - 1] and B ≡ B[0]… B[n - 1], m < n be strings drawn from an alphabet Σ of size s, containing every distinct symbol in A + B.
An ordered pair (i, j) will be referred to as a match if A[i] = B[j], where 0 ≤ i < m and 0 ≤ j < n.
The set of matches M defines a relation over matches: M[i, j] ⇔ (i, j) ∈ M.
Define a non-strict product-order (≤) over ordered pairs, such that (i1, j1) ≤ (i2, j2) ⇔ i1 ≤ i2 and j1 ≤ j2. We define (≥) similarly.
We say ordered pairs p1 and p2 are comparable if either p1 ≤ p2 or p1 ≥ p2 holds. If i1 < i2 and j2 < j1 (or i2 < i1 and j1 < j2) then neither p1 ≤ p2 nor p1 ≥ p2 are possible, and we say p1 and p2 are incomparable.
Define the strict product-order (<) over ordered pairs, such that (i1, j1) < (i2, j2) ⇔ i1 < i2 and j1 < j2. We define (>) similarly.
A chain C is a subset of M consisting of at least one element m; and where either m1 < m2 or m1 > m2 for every pair of distinct elements m1 and m2. An antichain D is any subset of M in which every pair of distinct elements m1 and m2 are incomparable.
A chain can be visualized as a strictly increasing curve that passes through matches (i, j) in the m*n coordinate space of M[i, j].
Every Common Sequence of length q corresponds to a chain of cardinality q, over the set of matches M. Thus, finding an LCS can be restated as the problem of finding a chain of maximum cardinality p.
According to [Dilworth 1950], this cardinality p equals the minimum number of disjoint antichains into which M can be decomposed. Note that such a decomposition into the minimal number p of disjoint antichains may not be unique.
Background
Where the number of symbols appearing in matches is small relative to the length of the input strings, reuse of the symbols increases; and the number of matches will tend towards O(m*n) quadratic growth. This occurs, for example, in the Bioinformatics application of nucleotide and protein sequencing.
The divide-and-conquer approach of [Hirschberg 1975] limits the space required to O(n). However, this approach requires O(m*n) time even in the best case.
This quadratic time dependency may become prohibitive, given very long input strings. Thus, heuristics are often favored over optimal Dynamic Programming solutions.
In the application of comparing file revisions, records from the input files form a large symbol space; and the number of symbols approaches the length of the LCS. In this case the number of matches reduces to linear, O(n) growth.
A binary search optimization due to [Hunt and Szymanski 1977] can be applied to the basic Dynamic Programming approach, resulting in an expected performance of O(n log m). Performance can degrade to O(m*n log m) time in the worst case, as the number of matches grows to O(m*n).
Note
[Rick 2000] describes a linear-space algorithm with a time bound of O(n*s + p*min(m, n - p)).
Legend
A, B are input strings of lengths m, n respectively p is the length of the LCS M is the set of matches (i, j) such that A[i] = B[j] r is the magnitude of M s is the magnitude of the alphabet Σ of distinct symbols in A + B
References
[Dilworth 1950] "A decomposition theorem for partially ordered sets" by Robert P. Dilworth, published January 1950, Annals of Mathematics [Volume 51, Number 1, pp. 161-166]
[Goeman and Clausen 2002] "A New Practical Linear Space Algorithm for the Longest Common Subsequence Problem" by Heiko Goeman and Michael Clausen, published 2002, Kybernetika [Volume 38, Issue 1, pp. 45-66]
[Hirschberg 1975] "A linear space algorithm for computing maximal common subsequences" by Daniel S. Hirschberg, published June 1975 Communications of the ACM [Volume 18, Number 6, pp. 341-343]
[Hunt and McIlroy 1976] "An Algorithm for Differential File Comparison" by James W. Hunt and M. Douglas McIlroy, June 1976 Computing Science Technical Report, Bell Laboratories 41
[Hunt and Szymanski 1977] "A Fast Algorithm for Computing Longest Common Subsequences" by James W. Hunt and Thomas G. Szymanski, published May 1977 Communications of the ACM [Volume 20, Number 5, pp. 350-353]
[Rick 2000] "Simple and fast linear space computation of longest common subsequences"
by Claus Rick, received 17 March 2000, Information Processing Letters,
Elsevier Science [Volume 75, pp. 275–281]
Examples
The sequences "1234" and "1224533324" have an LCS of "1234":
1234 1224533324
For a string example, consider the sequences "thisisatest" and "testing123testing". An LCS would be "tsitest":
thisisatest testing123testing
In this puzzle, your code only needs to deal with strings. Write a function which returns an LCS of two strings (case-sensitive). You don't need to show multiple LCS's.
For more information on this problem please see Wikipedia.
- Metrics
- Counting
- Word frequency
- Letter frequency
- Jewels and stones
- I before E except after C
- Bioinformatics/base count
- Count occurrences of a substring
- Count how many vowels and consonants occur in a string
- Remove/replace
- XXXX redacted
- Conjugate a Latin verb
- Remove vowels from a string
- String interpolation (included)
- Strip block comments
- Strip comments from a string
- Strip a set of characters from a string
- Strip whitespace from a string -- top and tail
- Strip control codes and extended characters from a string
- Anagrams/Derangements/shuffling
- Word wheel
- ABC problem
- Sattolo cycle
- Knuth shuffle
- Ordered words
- Superpermutation minimisation
- Textonyms (using a phone text pad)
- Anagrams
- Anagrams/Deranged anagrams
- Permutations/Derangements
- Find/Search/Determine
- ABC words
- Odd words
- Word ladder
- Semordnilap
- Word search
- Wordiff (game)
- String matching
- Tea cup rim text
- Alternade words
- Changeable words
- State name puzzle
- String comparison
- Unique characters
- Unique characters in each string
- Extract file extension
- Levenshtein distance
- Palindrome detection
- Common list elements
- Longest common suffix
- Longest common prefix
- Compare a list of strings
- Longest common substring
- Find common directory path
- Words from neighbour ones
- Change e letters to i in words
- Non-continuous subsequences
- Longest common subsequence
- Longest palindromic substrings
- Longest increasing subsequence
- Words containing "the" substring
- Sum of the digits of n is substring of n
- Determine if a string is numeric
- Determine if a string is collapsible
- Determine if a string is squeezable
- Determine if a string has all unique characters
- Determine if a string has all the same characters
- Longest substrings without repeating characters
- Find words which contains all the vowels
- Find words which contains most consonants
- Find words which contains more than 3 vowels
- Find words which first and last three letters are equals
- Find words which odd letters are consonants and even letters are vowels or vice_versa
- Formatting
- Substring
- Rep-string
- Word wrap
- String case
- Align columns
- Literals/String
- Repeat a string
- Brace expansion
- Brace expansion using ranges
- Reverse a string
- Phrase reversals
- Comma quibbling
- Special characters
- String concatenation
- Substring/Top and tail
- Commatizing numbers
- Reverse words in a string
- Suffixation of decimal numbers
- Long literals, with continuations
- Numerical and alphabetical suffixes
- Abbreviations, easy
- Abbreviations, simple
- Abbreviations, automatic
- Song lyrics/poems/Mad Libs/phrases
- Mad Libs
- Magic 8-ball
- 99 Bottles of Beer
- The Name Game (a song)
- The Old lady swallowed a fly
- The Twelve Days of Christmas
- Tokenize
- Text between
- Tokenize a string
- Word break problem
- Tokenize a string with escaping
- Split a character string based on change of character
- Sequences
11l
F lcs(a, b)
V lengths = [[0] * (b.len+1)] * (a.len+1)
L(x) a
V i = L.index
L(y) b
V j = L.index
I x == y
lengths[i + 1][j + 1] = lengths[i][j] + 1
E
lengths[i + 1][j + 1] = max(lengths[i + 1][j], lengths[i][j + 1])
V result = ‘’
V j = b.len
L(i) 1..a.len
I lengths[i][j] != lengths[i - 1][j]
result ‘’= a[i - 1]
R result
print(lcs(‘1234’, ‘1224533324’))
print(lcs(‘thisisatest’, ‘testing123testing’))
- Output:
1234 tisitst
Ada
Using recursion:
with Ada.Text_IO; use Ada.Text_IO;
procedure Test_LCS is
function LCS (A, B : String) return String is
begin
if A'Length = 0 or else B'Length = 0 then
return "";
elsif A (A'Last) = B (B'Last) then
return LCS (A (A'First..A'Last - 1), B (B'First..B'Last - 1)) & A (A'Last);
else
declare
X : String renames LCS (A, B (B'First..B'Last - 1));
Y : String renames LCS (A (A'First..A'Last - 1), B);
begin
if X'Length > Y'Length then
return X;
else
return Y;
end if;
end;
end if;
end LCS;
begin
Put_Line (LCS ("thisisatest", "testing123testing"));
end Test_LCS;
- Output:
tsitest
Non-recursive solution:
with Ada.Text_IO; use Ada.Text_IO;
procedure Test_LCS is
function LCS (A, B : String) return String is
L : array (A'First..A'Last + 1, B'First..B'Last + 1) of Natural;
begin
for I in L'Range (1) loop
L (I, B'First) := 0;
end loop;
for J in L'Range (2) loop
L (A'First, J) := 0;
end loop;
for I in A'Range loop
for J in B'Range loop
if A (I) = B (J) then
L (I + 1, J + 1) := L (I, J) + 1;
else
L (I + 1, J + 1) := Natural'Max (L (I + 1, J), L (I, J + 1));
end if;
end loop;
end loop;
declare
I : Integer := L'Last (1);
J : Integer := L'Last (2);
R : String (1..Integer'Max (A'Length, B'Length));
K : Integer := R'Last;
begin
while I > L'First (1) and then J > L'First (2) loop
if L (I, J) = L (I - 1, J) then
I := I - 1;
elsif L (I, J) = L (I, J - 1) then
J := J - 1;
else
I := I - 1;
J := J - 1;
R (K) := A (I);
K := K - 1;
end if;
end loop;
return R (K + 1..R'Last);
end;
end LCS;
begin
Put_Line (LCS ("thisisatest", "testing123testing"));
end Test_LCS;
- Output:
tsitest
ALGOL 68
main:(
PROC lcs = (STRING a, b)STRING:
BEGIN
IF UPB a = 0 OR UPB b = 0 THEN
""
ELIF a [UPB a] = b [UPB b] THEN
lcs (a [:UPB a - 1], b [:UPB b - 1]) + a [UPB a]
ELSE
STRING x = lcs (a, b [:UPB b - 1]);
STRING y = lcs (a [:UPB a - 1], b);
IF UPB x > UPB y THEN x ELSE y FI
FI
END # lcs #;
print((lcs ("thisisatest", "testing123testing"), new line))
)
- Output:
tsitest
APL
lcs←{
⎕IO←0
betterof←{⊃(</+/¨⍺ ⍵)⌽⍺ ⍵} ⍝ better of 2 selections
cmbn←{↑,⊃∘.,/(⊂⊂⍬),⍵} ⍝ combine lists
rr←{∧/↑>/1 ¯1↓[1]¨⊂⍵} ⍝ rising rows
hmrr←{∨/(rr ⍵)∧∧/⍵=⌈\⍵} ⍝ has monotonically rising rows
rnbc←{{⍵/⍳⍴⍵}¨↓[0]×⍵} ⍝ row numbers by column
valid←hmrr∘cmbn∘rnbc ⍝ any valid solutions?
a w←(</⊃∘⍴¨⍺ ⍵)⌽⍺ ⍵ ⍝ longest first
matches←a∘.=w
aps←{⍵[;⍒+⌿⍵]}∘{(⍵/2)⊤⍳2*⍵} ⍝ all possible subsequences
swps←{⍵/⍨∧⌿~(~∨⌿⍺)⌿⍵} ⍝ subsequences with possible solns
sstt←matches swps aps⊃⍴w ⍝ subsequences to try
w/⍨{
⍺←0⍴⍨⊃⍴⍵ ⍝ initial selection
(+/⍺)≥+/⍵[;0]:⍺ ⍝ no scope to improve
this←⍺ betterof{⍵×valid ⍵/matches}⍵[;0] ⍝ try to improve
1=1⊃⍴⍵:this ⍝ nothing left to try
this ∇ 1↓[1]⍵ ⍝ keep looking
}sstt
}
Arturo
lcs: function [a,b][
ls: new array.of: @[inc size a, inc size b] 0
loop.with:'i a 'x [
loop.with:'j b 'y [
ls\[i+1]\[j+1]: (x=y)? -> ls\[i]\[j] + 1
-> max @[ls\[i+1]\[j], ls\[i]\[j+1]]
]
]
[result, x, y]: @[new "", size a, size b]
while [and? [x > 0][y > 0]][
if? ls\[x]\[y] = ls\[x-1]\[y] -> x: x-1
else [
if? ls\[x]\[y] = ls\[x]\[y-1] -> y: y-1
else [
result: a\[x-1] ++ result
x: x-1
y: y-1
]
]
]
return result
]
print lcs "1234" "1224533324"
print lcs "thisisatest" "testing123testing"
- Output:
1234 tsitest
AutoHotkey
using dynamic programming
ahk forum: discussion
lcs(a,b) { ; Longest Common Subsequence of strings, using Dynamic Programming
Loop % StrLen(a)+2 { ; Initialize
i := A_Index-1
Loop % StrLen(b)+2
j := A_Index-1, len%i%_%j% := 0
}
Loop Parse, a ; scan a
{
i := A_Index, i1 := i+1, x := A_LoopField
Loop Parse, b ; scan b
{
j := A_Index, j1 := j+1, y := A_LoopField
len%i1%_%j1% := x=y ? len%i%_%j% + 1
: (u:=len%i1%_%j%) > (v:=len%i%_%j1%) ? u : v
}
}
x := StrLen(a)+1, y := StrLen(b)+1
While x*y { ; construct solution from lengths
x1 := x-1, y1 := y-1
If (len%x%_%y% = len%x1%_%y%)
x := x1
Else If (len%x%_%y% = len%x%_%y1%)
y := y1
Else
x := x1, y := y1, t := SubStr(a,x,1) t
}
Return t
}
BASIC
QuickBASIC
FUNCTION lcs$ (a$, b$)
IF LEN(a$) = 0 OR LEN(b$) = 0 THEN
lcs$ = ""
ELSEIF RIGHT$(a$, 1) = RIGHT$(b$, 1) THEN
lcs$ = lcs$(LEFT$(a$, LEN(a$) - 1), LEFT$(b$, LEN(b$) - 1)) + RIGHT$(a$, 1)
ELSE
x$ = lcs$(a$, LEFT$(b$, LEN(b$) - 1))
y$ = lcs$(LEFT$(a$, LEN(a$) - 1), b$)
IF LEN(x$) > LEN(y$) THEN
lcs$ = x$
ELSE
lcs$ = y$
END IF
END IF
END FUNCTION
BASIC256
function LCS(a, b)
if length(a) = 0 or length(b) = 0 then return ""
if right(a, 1) = right(b, 1) then
LCS = LCS(left(a, length(a) - 1), left(b, length(b) - 1)) + right(a, 1)
else
x = LCS(a, left(b, length(b) - 1))
y = LCS(left(a, length(a) - 1), b)
if length(x) > length(y) then return x else return y
end if
end function
print LCS("1234", "1224533324")
print LCS("thisisatest", "testing123testing")
end
BBC BASIC
This makes heavy use of BBC BASIC's shortcut LEFT$(a$) and RIGHT$(a$) functions.
PRINT FNlcs("1234", "1224533324")
PRINT FNlcs("thisisatest", "testing123testing")
END
DEF FNlcs(a$, b$)
IF a$="" OR b$="" THEN = ""
IF RIGHT$(a$) = RIGHT$(b$) THEN = FNlcs(LEFT$(a$), LEFT$(b$)) + RIGHT$(a$)
LOCAL x$, y$
x$ = FNlcs(a$, LEFT$(b$))
y$ = FNlcs(LEFT$(a$), b$)
IF LEN(y$) > LEN(x$) SWAP x$,y$
= x$
Output:
1234 tsitest
BQN
It's easier and faster to get only the length of the longest common subsequence, using LcsLen ← ¯1 ⊑ 0¨∘⊢ {𝕩⌈⌈`𝕨+»𝕩}˝ =⌜⟜⌽
. This function can be expanded by changing ⌈
to ⊣⍟(>○≠)
(choosing from two arguments one that has the greatest length), and replacing the empty length 0 with the empty string ""
in the right places.
LCS ← ¯1 ⊑ "" <⊸∾ ""¨∘⊢ ⊣⍟(>○≠){𝕩𝔽¨𝔽`𝕨∾¨""<⊸»𝕩}˝ (=⌜⥊¨⊣)⟜⌽
Output:
"1234" LCS "1224533324"
"1234"
"thisisatest" LCS "testing123testing"
"tsitest"
Bracmat
( LCS
= A a ta B b tb prefix
. !arg:(?prefix.@(?A:%?a ?ta).@(?B:%?b ?tb))
& ( !a:!b&LCS$(!prefix !a.!ta.!tb)
| LCS$(!prefix.!A.!tb)&LCS$(!prefix.!ta.!B)
)
| !prefix:? ([>!max:[?max):?lcs
|
)
& 0:?max
& :?lcs
& LCS$(.thisisatest.testing123testing)
& out$(max !max lcs !lcs);
- Output:
max 7 lcs t s i t e s t
C
#include <stdio.h>
#include <stdlib.h>
#define MAX(a, b) (a > b ? a : b)
int lcs (char *a, int n, char *b, int m, char **s) {
int i, j, k, t;
int *z = calloc((n + 1) * (m + 1), sizeof (int));
int **c = calloc((n + 1), sizeof (int *));
for (i = 0; i <= n; i++) {
c[i] = &z[i * (m + 1)];
}
for (i = 1; i <= n; i++) {
for (j = 1; j <= m; j++) {
if (a[i - 1] == b[j - 1]) {
c[i][j] = c[i - 1][j - 1] + 1;
}
else {
c[i][j] = MAX(c[i - 1][j], c[i][j - 1]);
}
}
}
t = c[n][m];
*s = malloc(t);
for (i = n, j = m, k = t - 1; k >= 0;) {
if (a[i - 1] == b[j - 1])
(*s)[k] = a[i - 1], i--, j--, k--;
else if (c[i][j - 1] > c[i - 1][j])
j--;
else
i--;
}
free(c);
free(z);
return t;
}
Testing
int main () {
char a[] = "thisisatest";
char b[] = "testing123testing";
int n = sizeof a - 1;
int m = sizeof b - 1;
char *s = NULL;
int t = lcs(a, n, b, m, &s);
printf("%.*s\n", t, s); // tsitest
return 0;
}
C#
With recursion
using System;
namespace LCS
{
class Program
{
static void Main(string[] args)
{
string word1 = "thisisatest";
string word2 = "testing123testing";
Console.WriteLine(lcsBack(word1, word2));
Console.ReadKey();
}
public static string lcsBack(string a, string b)
{
string aSub = a.Substring(0, (a.Length - 1 < 0) ? 0 : a.Length - 1);
string bSub = b.Substring(0, (b.Length - 1 < 0) ? 0 : b.Length - 1);
if (a.Length == 0 || b.Length == 0)
return "";
else if (a[a.Length - 1] == b[b.Length - 1])
return lcsBack(aSub, bSub) + a[a.Length - 1];
else
{
string x = lcsBack(a, bSub);
string y = lcsBack(aSub, b);
return (x.Length > y.Length) ? x : y;
}
}
}
}
C++
Hunt and Szymanski algorithm
#include <stdint.h>
#include <string>
#include <memory> // for shared_ptr<>
#include <iostream>
#include <deque>
#include <unordered_map> //[C++11]
#include <algorithm> // for lower_bound()
#include <iterator> // for next() and prev()
using namespace std;
class LCS {
protected:
// Instances of the Pair linked list class are used to recover the LCS:
class Pair {
public:
uint32_t index1;
uint32_t index2;
shared_ptr<Pair> next;
Pair(uint32_t index1, uint32_t index2, shared_ptr<Pair> next = nullptr)
: index1(index1), index2(index2), next(next) {
}
static shared_ptr<Pair> Reverse(const shared_ptr<Pair> pairs) {
shared_ptr<Pair> head = nullptr;
for (auto next = pairs; next != nullptr; next = next->next)
head = make_shared<Pair>(next->index1, next->index2, head);
return head;
}
};
typedef deque<shared_ptr<Pair>> PAIRS;
typedef deque<uint32_t> INDEXES;
typedef unordered_map<char, INDEXES> CHAR_TO_INDEXES_MAP;
typedef deque<INDEXES*> MATCHES;
static uint32_t FindLCS(
MATCHES& indexesOf2MatchedByIndex1, shared_ptr<Pair>* pairs) {
auto traceLCS = pairs != nullptr;
PAIRS chains;
INDEXES prefixEnd;
//
//[Assert]After each index1 iteration prefixEnd[index3] is the least index2
// such that the LCS of s1[0:index1] and s2[0:index2] has length index3 + 1
//
uint32_t index1 = 0;
for (const auto& it1 : indexesOf2MatchedByIndex1) {
auto dq2 = *it1;
auto limit = prefixEnd.end();
for (auto it2 = dq2.rbegin(); it2 != dq2.rend(); it2++) {
// Each index1, index2 pair corresponds to a match
auto index2 = *it2;
//
// Note: The reverse iterator it2 visits index2 values in descending order,
// allowing in-place update of prefixEnd[]. std::lower_bound() is used to
// perform a binary search.
//
limit = lower_bound(prefixEnd.begin(), limit, index2);
//
// Look ahead to the next index2 value to optimize Pairs used by the Hunt
// and Szymanski algorithm. If the next index2 is also an improvement on
// the value currently held in prefixEnd[index3], a new Pair will only be
// superseded on the next index2 iteration.
//
// Verify that a next index2 value exists; and that this value is greater
// than the final index2 value of the LCS prefix at prev(limit):
//
auto preferNextIndex2 = next(it2) != dq2.rend() &&
(limit == prefixEnd.begin() || *prev(limit) < *next(it2));
//
// Depending on match redundancy, this optimization may reduce the number
// of Pair allocations by factors ranging from 2 up to 10 or more.
//
if (preferNextIndex2) continue;
auto index3 = distance(prefixEnd.begin(), limit);
if (limit == prefixEnd.end()) {
// Insert Case
prefixEnd.push_back(index2);
// Refresh limit iterator:
limit = prev(prefixEnd.end());
if (traceLCS) {
chains.push_back(pushPair(chains, index3, index1, index2));
}
}
else if (index2 < *limit) {
// Update Case
// Update limit value:
*limit = index2;
if (traceLCS) {
chains[index3] = pushPair(chains, index3, index1, index2);
}
}
} // next index2
index1++;
} // next index1
if (traceLCS) {
// Return the LCS as a linked list of matched index pairs:
auto last = chains.empty() ? nullptr : chains.back();
// Reverse longest chain
*pairs = Pair::Reverse(last);
}
auto length = prefixEnd.size();
return length;
}
private:
static shared_ptr<Pair> pushPair(
PAIRS& chains, const ptrdiff_t& index3, uint32_t& index1, uint32_t& index2) {
auto prefix = index3 > 0 ? chains[index3 - 1] : nullptr;
return make_shared<Pair>(index1, index2, prefix);
}
protected:
//
// Match() avoids m*n comparisons by using CHAR_TO_INDEXES_MAP to
// achieve O(m+n) performance, where m and n are the input lengths.
//
// The lookup time can be assumed constant in the case of characters.
// The symbol space is larger in the case of records; but the lookup
// time will be O(log(m+n)), at most.
//
static void Match(
CHAR_TO_INDEXES_MAP& indexesOf2MatchedByChar, MATCHES& indexesOf2MatchedByIndex1,
const string& s1, const string& s2) {
uint32_t index = 0;
for (const auto& it : s2)
indexesOf2MatchedByChar[it].push_back(index++);
for (const auto& it : s1) {
auto& dq2 = indexesOf2MatchedByChar[it];
indexesOf2MatchedByIndex1.push_back(&dq2);
}
}
static string Select(shared_ptr<Pair> pairs, uint32_t length,
bool right, const string& s1, const string& s2) {
string buffer;
buffer.reserve(length);
for (auto next = pairs; next != nullptr; next = next->next) {
auto c = right ? s2[next->index2] : s1[next->index1];
buffer.push_back(c);
}
return buffer;
}
public:
static string Correspondence(const string& s1, const string& s2) {
CHAR_TO_INDEXES_MAP indexesOf2MatchedByChar;
MATCHES indexesOf2MatchedByIndex1; // holds references into indexesOf2MatchedByChar
Match(indexesOf2MatchedByChar, indexesOf2MatchedByIndex1, s1, s2);
shared_ptr<Pair> pairs; // obtain the LCS as index pairs
auto length = FindLCS(indexesOf2MatchedByIndex1, &pairs);
return Select(pairs, length, false, s1, s2);
}
};
Example:
auto s = LCS::Correspondence(s1, s2);
cout << s << endl;
More fully featured examples are available at Samples/C++/LCS.
Clojure
Based on algorithm from Wikipedia.
(defn longest [xs ys] (if (> (count xs) (count ys)) xs ys))
(def lcs
(memoize
(fn [[x & xs] [y & ys]]
(cond
(or (= x nil) (= y nil)) nil
(= x y) (cons x (lcs xs ys))
:else (longest (lcs (cons x xs) ys)
(lcs xs (cons y ys)))))))
CoffeeScript
lcs = (s1, s2) ->
len1 = s1.length
len2 = s2.length
# Create a virtual matrix that is (len1 + 1) by (len2 + 1),
# where m[i][j] is the longest common string using only
# the first i chars of s1 and first j chars of s2. The
# matrix is virtual, because we only keep the last two rows
# in memory.
prior_row = ('' for i in [0..len2])
for i in [0...len1]
row = ['']
for j in [0...len2]
if s1[i] == s2[j]
row.push prior_row[j] + s1[i]
else
subs1 = row[j]
subs2 = prior_row[j+1]
if subs1.length > subs2.length
row.push subs1
else
row.push subs2
prior_row = row
row[len2]
s1 = "thisisatest"
s2 = "testing123testing"
console.log lcs(s1, s2)
Common Lisp
Here's a memoizing/dynamic-programming solution that uses an n × m array where n and m are the lengths of the input arrays. The first return value is a sequence (of the same type as array1) which is the longest common subsequence. The second return value is the length of the longest common subsequence.
(defun longest-common-subsequence (array1 array2)
(let* ((l1 (length array1))
(l2 (length array2))
(results (make-array (list l1 l2) :initial-element nil)))
(declare (dynamic-extent results))
(labels ((lcs (start1 start2)
;; if either sequence is empty, return (() 0)
(if (or (eql start1 l1) (eql start2 l2)) (list '() 0)
;; otherwise, return any memoized value
(let ((result (aref results start1 start2)))
(if (not (null result)) result
;; otherwise, compute and store a value
(setf (aref results start1 start2)
(if (eql (aref array1 start1) (aref array2 start2))
;; if they start with the same element,
;; move forward in both sequences
(destructuring-bind (seq len)
(lcs (1+ start1) (1+ start2))
(list (cons (aref array1 start1) seq) (1+ len)))
;; otherwise, move ahead in each separately,
;; and return the better result.
(let ((a (lcs (1+ start1) start2))
(b (lcs start1 (1+ start2))))
(if (> (second a) (second b))
a
b)))))))))
(destructuring-bind (seq len) (lcs 0 0)
(values (coerce seq (type-of array1)) len)))))
For example,
(longest-common-subsequence "123456" "1a2b3c")
produces the two values
"123"
3
An alternative adopted from Clojure
Here is another version with its own memoization macro:
(defmacro mem-defun (name args body)
(let ((hash-name (gensym)))
`(let ((,hash-name (make-hash-table :test 'equal)))
(defun ,name ,args
(or (gethash (list ,@args) ,hash-name)
(setf (gethash (list ,@args) ,hash-name)
,body))))))
(mem-defun lcs (xs ys)
(labels ((longer (a b) (if (> (length a) (length b)) a b)))
(cond ((or (null xs) (null ys)) nil)
((equal (car xs) (car ys)) (cons (car xs) (lcs (cdr xs) (cdr ys))))
(t (longer (lcs (cdr xs) ys)
(lcs xs (cdr ys)))))))
When we test it, we get:
(coerce (lcs (coerce "thisisatest" 'list) (coerce "testing123testing" 'list)) 'string))))
"tsitest"
D
Both versions don't work correctly with Unicode text.
Recursive version
import std.stdio, std.array;
T[] lcs(T)(in T[] a, in T[] b) pure nothrow @safe {
if (a.empty || b.empty) return null;
if (a[0] == b[0])
return a[0] ~ lcs(a[1 .. $], b[1 .. $]);
const longest = (T[] x, T[] y) => x.length > y.length ? x : y;
return longest(lcs(a, b[1 .. $]), lcs(a[1 .. $], b));
}
void main() {
lcs("thisisatest", "testing123testing").writeln;
}
- Output:
tsitest
Faster dynamic programming version
The output is the same.
import std.stdio, std.algorithm, std.traits;
T[] lcs(T)(in T[] a, in T[] b) pure /*nothrow*/ {
auto L = new uint[][](a.length + 1, b.length + 1);
foreach (immutable i; 0 .. a.length)
foreach (immutable j; 0 .. b.length)
L[i + 1][j + 1] = (a[i] == b[j]) ? (1 + L[i][j]) :
max(L[i + 1][j], L[i][j + 1]);
Unqual!T[] result;
for (auto i = a.length, j = b.length; i > 0 && j > 0; ) {
if (a[i - 1] == b[j - 1]) {
result ~= a[i - 1];
i--;
j--;
} else
if (L[i][j - 1] < L[i - 1][j])
i--;
else
j--;
}
result.reverse(); // Not nothrow.
return result;
}
void main() {
lcs("thisisatest", "testing123testing").writeln;
}
Hirschberg algorithm version
See: http://en.wikipedia.org/wiki/Hirschberg_algorithm
This is currently a little slower than the classic dynamic programming version, but it uses a linear amount of memory, so it's usable for much larger inputs. To speed up this code on dmd remove the memory allocations from lensLCS, and do not use the retro range (replace it with foreach_reverse). The output is the same.
Adapted from Python code: http://wordaligned.org/articles/longest-common-subsequence
import std.stdio, std.algorithm, std.range, std.array, std.string, std.typecons;
uint[] lensLCS(R)(R xs, R ys) pure nothrow @safe {
auto prev = new typeof(return)(1 + ys.length);
auto curr = new typeof(return)(1 + ys.length);
foreach (immutable x; xs) {
swap(curr, prev);
size_t i = 0;
foreach (immutable y; ys) {
curr[i + 1] = (x == y) ? prev[i] + 1 : max(curr[i], prev[i + 1]);
i++;
}
}
return curr;
}
void calculateLCS(T)(in T[] xs, in T[] ys, bool[] xs_in_lcs,
in size_t idx=0) pure nothrow @safe {
immutable nx = xs.length;
immutable ny = ys.length;
if (nx == 0)
return;
if (nx == 1) {
if (ys.canFind(xs[0]))
xs_in_lcs[idx] = true;
} else {
immutable mid = nx / 2;
const xb = xs[0.. mid];
const xe = xs[mid .. $];
immutable ll_b = lensLCS(xb, ys);
const ll_e = lensLCS(xe.retro, ys.retro); // retro is slow with dmd.
//immutable k = iota(ny + 1)
// .reduce!(max!(j => ll_b[j] + ll_e[ny - j]));
immutable k = iota(ny + 1)
.minPos!((i, j) => tuple(ll_b[i] + ll_e[ny - i]) >
tuple(ll_b[j] + ll_e[ny - j]))[0];
calculateLCS(xb, ys[0 .. k], xs_in_lcs, idx);
calculateLCS(xe, ys[k .. $], xs_in_lcs, idx + mid);
}
}
const(T)[] lcs(T)(in T[] xs, in T[] ys) pure /*nothrow*/ @safe {
auto xs_in_lcs = new bool[xs.length];
calculateLCS(xs, ys, xs_in_lcs);
return zip(xs, xs_in_lcs).filter!q{ a[1] }.map!q{ a[0] }.array; // Not nothrow.
}
string lcsString(in string s1, in string s2) pure /*nothrow*/ @safe {
return lcs(s1.representation, s2.representation).assumeUTF;
}
void main() {
lcsString("thisisatest", "testing123testing").writeln;
}
Dart
import 'dart:math';
String lcsRecursion(String a, String b) {
int aLen = a.length;
int bLen = b.length;
if (aLen == 0 || bLen == 0) {
return "";
} else if (a[aLen-1] == b[bLen-1]) {
return lcsRecursion(a.substring(0,aLen-1),b.substring(0,bLen-1)) + a[aLen-1];
} else {
var x = lcsRecursion(a, b.substring(0,bLen-1));
var y = lcsRecursion(a.substring(0,aLen-1), b);
return (x.length > y.length) ? x : y;
}
}
String lcsDynamic(String a, String b) {
var lengths = new List<List<int>>.generate(a.length + 1,
(_) => new List.filled(b.length+1, 0), growable: false);
// row 0 and column 0 are initialized to 0 already
for (int i = 0; i < a.length; i++) {
for (int j = 0; j < b.length; j++) {
if (a[i] == b[j]) {
lengths[i+1][j+1] = lengths[i][j] + 1;
} else {
lengths[i+1][j+1] = max(lengths[i+1][j], lengths[i][j+1]);
}
}
}
// read the substring out from the matrix
StringBuffer reversedLcsBuffer = new StringBuffer();
for (int x = a.length, y = b.length; x != 0 && y != 0;) {
if (lengths[x][y] == lengths[x-1][y]) {
x--;
} else if (lengths[x][y] == lengths[x][y-1]) {
y--;
} else {
assert(a[x-1] == b[y-1]);
reversedLcsBuffer.write(a[x-1]);
x--;
y--;
}
}
// reverse String
var reversedLCS = reversedLcsBuffer.toString();
var lcsBuffer = new StringBuffer();
for(var i = reversedLCS.length - 1; i>=0; i--) {
lcsBuffer.write(reversedLCS[i]);
}
return lcsBuffer.toString();
}
void main() {
print("lcsDynamic('1234', '1224533324') = ${lcsDynamic('1234', '1224533324')}");
print("lcsDynamic('thisisatest', 'testing123testing') = ${lcsDynamic('thisisatest', 'testing123testing')}");
print("lcsDynamic('', 'x') = ${lcsDynamic('', 'x')}");
print("lcsDynamic('x', 'x') = ${lcsDynamic('x', 'x')}");
print('');
print("lcsRecursion('1234', '1224533324') = ${lcsRecursion('1234', '1224533324')}");
print("lcsRecursion('thisisatest', 'testing123testing') = ${lcsRecursion('thisisatest', 'testing123testing')}");
print("lcsRecursion('', 'x') = ${lcsRecursion('', 'x')}");
print("lcsRecursion('x', 'x') = ${lcsRecursion('x', 'x')}");
}
- Output:
lcsDynamic('1234', '1224533324') = 1234 lcsDynamic('thisisatest', 'testing123testing') = tsitest lcsDynamic('', 'x') = lcsDynamic('x', 'x') = x lcsRecursion('1234', '1224533324') = 1234 lcsRecursion('thisisatest', 'testing123testing') = tsitest lcsRecursion('', 'x') = lcsRecursion('x', 'x') = x
EasyLang
func$ right a$ n .
return substr a$ (len a$ - n + 1) n
.
func$ left a$ n .
if n < 0
n = len a$ + n
.
return substr a$ 1 n
.
func$ lcs a$ b$ .
if len a$ = 0 or len b$ = 0
return ""
.
if right a$ 1 = right b$ 1
return lcs left a$ -1 left b$ -1 & right a$ 1
.
x$ = lcs a$ left b$ -1
y$ = lcs left a$ -1 b$
if len x$ > len y$
return x$
else
return y$
.
.
print lcs "1234" "1224533324"
print lcs "thisisatest" "testing123testing"
- Output:
1234 tsitest
Egison
(define $common-seqs
(lambda [$xs $ys]
(match-all [xs ys] [(list char) (list char)]
[[(loop $i [1 $n] <join _ <cons $c_i ...>> _)
(loop $i [1 ,n] <join _ <cons ,c_i ...>> _)]
(map (lambda [$i] c_i) (between 1 n))])))
(define $lcs (compose common-seqs rac))
Output:
> (lcs "thisisatest" "testing123testing"))
"tsitest"
Elixir
Simple recursion
This solution is Brute force. It is slow
defmodule LCS do
def lcs(a, b) do
lcs(to_charlist(a), to_charlist(b), []) |> to_string
end
defp lcs([h|at], [h|bt], res), do: lcs(at, bt, [h|res])
defp lcs([_|at]=a, [_|bt]=b, res) do
Enum.max_by([lcs(a, bt, res), lcs(at, b, res)], &length/1)
end
defp lcs(_, _, res), do: res |> Enum.reverse
end
IO.puts LCS.lcs("thisisatest", "testing123testing")
IO.puts LCS.lcs('1234','1224533324')
Dynamic Programming
defmodule LCS do
def lcs_length(s,t), do: lcs_length(s,t,Map.new) |> elem(0)
defp lcs_length([],t,cache), do: {0,Map.put(cache,{[],t},0)}
defp lcs_length(s,[],cache), do: {0,Map.put(cache,{s,[]},0)}
defp lcs_length([h|st]=s,[h|tt]=t,cache) do
{l,c} = lcs_length(st,tt,cache)
{l+1,Map.put(c,{s,t},l+1)}
end
defp lcs_length([_sh|st]=s,[_th|tt]=t,cache) do
if Map.has_key?(cache,{s,t}) do
{Map.get(cache,{s,t}),cache}
else
{l1,c1} = lcs_length(s,tt,cache)
{l2,c2} = lcs_length(st,t,c1)
l = max(l1,l2)
{l,Map.put(c2,{s,t},l)}
end
end
def lcs(s,t) do
{s,t} = {to_charlist(s),to_charlist(t)}
{_,c} = lcs_length(s,t,Map.new)
lcs(s,t,c,[]) |> to_string
end
defp lcs([],_,_,acc), do: Enum.reverse(acc)
defp lcs(_,[],_,acc), do: Enum.reverse(acc)
defp lcs([h|st],[h|tt],cache,acc), do: lcs(st,tt,cache,[h|acc])
defp lcs([_sh|st]=s,[_th|tt]=t,cache,acc) do
if Map.get(cache,{s,tt}) > Map.get(cache,{st,t}) do
lcs(s,tt,cache,acc)
else
lcs(st,t,cache,acc)
end
end
end
IO.puts LCS.lcs("thisisatest","testing123testing")
IO.puts LCS.lcs("1234","1224533324")
- Output:
tsitest 1234
Referring to LCS here.
Erlang
This implementation also includes the ability to calculate the length of the longest common subsequence. In calculating that length, we generate a cache which can be traversed to generate the longest common subsequence.
module(lcs).
-compile(export_all).
lcs_length(S,T) ->
{L,_C} = lcs_length(S,T,dict:new()),
L.
lcs_length([]=S,T,Cache) ->
{0,dict:store({S,T},0,Cache)};
lcs_length(S,[]=T,Cache) ->
{0,dict:store({S,T},0,Cache)};
lcs_length([H|ST]=S,[H|TT]=T,Cache) ->
{L,C} = lcs_length(ST,TT,Cache),
{L+1,dict:store({S,T},L+1,C)};
lcs_length([_SH|ST]=S,[_TH|TT]=T,Cache) ->
case dict:is_key({S,T},Cache) of
true -> {dict:fetch({S,T},Cache),Cache};
false ->
{L1,C1} = lcs_length(S,TT,Cache),
{L2,C2} = lcs_length(ST,T,C1),
L = lists:max([L1,L2]),
{L,dict:store({S,T},L,C2)}
end.
lcs(S,T) ->
{_,C} = lcs_length(S,T,dict:new()),
lcs(S,T,C,[]).
lcs([],_,_,Acc) ->
lists:reverse(Acc);
lcs(_,[],_,Acc) ->
lists:reverse(Acc);
lcs([H|ST],[H|TT],Cache,Acc) ->
lcs(ST,TT,Cache,[H|Acc]);
lcs([_SH|ST]=S,[_TH|TT]=T,Cache,Acc) ->
case dict:fetch({S,TT},Cache) > dict:fetch({ST,T},Cache) of
true ->
lcs(S,TT,Cache, Acc);
false ->
lcs(ST,T,Cache,Acc)
end.
Output:
77> lcs:lcs("thisisatest","testing123testing").
"tsitest"
78> lcs:lcs("1234","1224533324").
"1234"
We can also use the process dictionary to memoize the recursive implementation:
lcs(Xs0, Ys0) ->
CacheKey = {lcs_cache, Xs0, Ys0},
case get(CacheKey)
of undefined ->
Result =
case {Xs0, Ys0}
of {[], _} -> []
; {_, []} -> []
; {[Same | Xs], [Same | Ys]} ->
[Same | lcs(Xs, Ys)]
; {[_ | XsRest]=XsAll, [_ | YsRest]=YsAll} ->
A = lcs(XsRest, YsAll),
B = lcs(XsAll , YsRest),
case length(A) > length(B)
of true -> A
; false -> B
end
end,
undefined = put(CacheKey, Result),
Result
; Result ->
Result
end.
Similar to the above, but without using the process dictionary:
-module(lcs).
%% API exports
-export([
lcs/2
]).
%%====================================================================
%% API functions
%%====================================================================
lcs(A, B) ->
{LCS, _Cache} = get_lcs(A, B, [], #{}),
lists:reverse(LCS).
%%====================================================================
%% Internal functions
%%=====================================================
get_lcs(A, B, Acc, Cache) ->
case maps:find({A, B, Acc}, Cache) of
{ok, LCS} -> {LCS, Cache};
error ->
{NewLCS, NewCache} = compute_lcs(A, B, Acc, Cache),
{NewLCS, NewCache#{ {A, B, Acc} => NewLCS }}
end.
compute_lcs(A, B, Acc, Cache) when length(A) == 0 orelse length(B) == 0 ->
{Acc, Cache};
compute_lcs([Token |ATail], [Token |BTail], Acc, Cache) ->
get_lcs(ATail, BTail, [Token |Acc], Cache);
compute_lcs([_AToken |ATail]=A, [_BToken |BTail]=B, Acc, Cache) ->
{LCSA, CacheA} = get_lcs(A, BTail, Acc, Cache),
{LCSB, CacheB} = get_lcs(ATail, B, Acc, CacheA),
LCS = case length(LCSA) > length(LCSB) of
true -> LCSA;
false -> LCSB
end,
{LCS, CacheB}.
Output:
48> lcs:lcs("thisisatest", "testing123testing").
"tsitest"
F#
Copied and slightly adapted from OCaml (direct recursion)
open System
let longest xs ys = if List.length xs > List.length ys then xs else ys
let rec lcs a b =
match a, b with
| [], _
| _, [] -> []
| x::xs, y::ys ->
if x = y then
x :: lcs xs ys
else
longest (lcs a ys) (lcs xs b)
[<EntryPoint>]
let main argv =
let split (str:string) = List.init str.Length (fun i -> str.[i])
printfn "%A" (String.Join("",
(lcs (split "thisisatest") (split "testing123testing"))))
0
Factor
USE: lcs
"thisisatest" "testing123testing" lcs print
- Output:
tsitest
Fortran
Using the iso_varying_string module which can be found here (or equivalent module conforming to the ISO/IEC 1539-2:2000 API or to a subset according to the need of this code: char
, len
, //
, extract
, ==
, =
)
program lcstest
use iso_varying_string
implicit none
type(varying_string) :: s1, s2
s1 = "thisisatest"
s2 = "testing123testing"
print *, char(lcs(s1, s2))
s1 = "1234"
s2 = "1224533324"
print *, char(lcs(s1, s2))
contains
recursive function lcs(a, b) result(l)
type(varying_string) :: l
type(varying_string), intent(in) :: a, b
type(varying_string) :: x, y
l = ""
if ( (len(a) == 0) .or. (len(b) == 0) ) return
if ( extract(a, len(a), len(a)) == extract(b, len(b), len(b)) ) then
l = lcs(extract(a, 1, len(a)-1), extract(b, 1, len(b)-1)) // extract(a, len(a), len(a))
else
x = lcs(a, extract(b, 1, len(b)-1))
y = lcs(extract(a, 1, len(a)-1), b)
if ( len(x) > len(y) ) then
l = x
else
l = y
end if
end if
end function lcs
end program lcstest
FreeBASIC
Function LCS(a As String, b As String) As String
Dim As String x, y
If Len(a) = 0 Or Len(b) = 0 Then
Return ""
Elseif Right(a, 1) = Right(b, 1) Then
LCS = LCS(Left(a, Len(a) - 1), Left(b, Len(b) - 1)) + Right(a, 1)
Else
x = LCS(a, Left(b, Len(b) - 1))
y = LCS(Left(a, Len(a) - 1), b)
If Len(x) > Len(y) Then Return x Else Return y
End If
End Function
Print LCS("1234", "1224533324")
Print LCS("thisisatest", "testing123testing")
Sleep
Go
Recursion
Brute force
func lcs(a, b string) string {
aLen := len(a)
bLen := len(b)
if aLen == 0 || bLen == 0 {
return ""
} else if a[aLen-1] == b[bLen-1] {
return lcs(a[:aLen-1], b[:bLen-1]) + string(a[aLen-1])
}
x := lcs(a, b[:bLen-1])
y := lcs(a[:aLen-1], b)
if len(x) > len(y) {
return x
}
return y
}
Dynamic Programming
func lcs(a, b string) string {
arunes := []rune(a)
brunes := []rune(b)
aLen := len(arunes)
bLen := len(brunes)
lengths := make([][]int, aLen+1)
for i := 0; i <= aLen; i++ {
lengths[i] = make([]int, bLen+1)
}
// row 0 and column 0 are initialized to 0 already
for i := 0; i < aLen; i++ {
for j := 0; j < bLen; j++ {
if arunes[i] == brunes[j] {
lengths[i+1][j+1] = lengths[i][j] + 1
} else if lengths[i+1][j] > lengths[i][j+1] {
lengths[i+1][j+1] = lengths[i+1][j]
} else {
lengths[i+1][j+1] = lengths[i][j+1]
}
}
}
// read the substring out from the matrix
s := make([]rune, 0, lengths[aLen][bLen])
for x, y := aLen, bLen; x != 0 && y != 0; {
if lengths[x][y] == lengths[x-1][y] {
x--
} else if lengths[x][y] == lengths[x][y-1] {
y--
} else {
s = append(s, arunes[x-1])
x--
y--
}
}
// reverse string
for i, j := 0, len(s)-1; i < j; i, j = i+1, j-1 {
s[i], s[j] = s[j], s[i]
}
return string(s)
}
Groovy
Recursive solution:
def lcs(xstr, ystr) {
if (xstr == "" || ystr == "") {
return "";
}
def x = xstr[0];
def y = ystr[0];
def xs = xstr.size() > 1 ? xstr[1..-1] : "";
def ys = ystr.size() > 1 ? ystr[1..-1] : "";
if (x == y) {
return (x + lcs(xs, ys));
}
def lcs1 = lcs(xstr, ys);
def lcs2 = lcs(xs, ystr);
lcs1.size() > lcs2.size() ? lcs1 : lcs2;
}
println(lcs("1234", "1224533324"));
println(lcs("thisisatest", "testing123testing"));
- Output:
1234 tsitest
Haskell
The Wikipedia solution translates directly into Haskell, with the only difference that equal characters are added in front:
longest xs ys = if length xs > length ys then xs else ys
lcs [] _ = []
lcs _ [] = []
lcs (x:xs) (y:ys)
| x == y = x : lcs xs ys
| otherwise = longest (lcs (x:xs) ys) (lcs xs (y:ys))
A Memoized version of the naive algorithm.
import qualified Data.MemoCombinators as M
lcs = memoize lcsm
where
lcsm [] _ = []
lcsm _ [] = []
lcsm (x:xs) (y:ys)
| x == y = x : lcs xs ys
| otherwise = maxl (lcs (x:xs) ys) (lcs xs (y:ys))
maxl x y = if length x > length y then x else y
memoize = M.memo2 mString mString
mString = M.list M.char -- Chars, but you can specify any type you need for the memo
Memoization (aka dynamic programming) of that uses zip to make both the index and the character available:
import Data.Array
lcs xs ys = a!(0,0) where
n = length xs
m = length ys
a = array ((0,0),(n,m)) $ l1 ++ l2 ++ l3
l1 = [((i,m),[]) | i <- [0..n]]
l2 = [((n,j),[]) | j <- [0..m]]
l3 = [((i,j), f x y i j) | (x,i) <- zip xs [0..], (y,j) <- zip ys [0..]]
f x y i j
| x == y = x : a!(i+1,j+1)
| otherwise = longest (a!(i,j+1)) (a!(i+1,j))
All 3 solutions work of course not only with strings, but also with any other list. Example:
*Main> lcs "thisisatest" "testing123testing"
"tsitest"
The dynamic programming version without using arrays:
import Data.List
longest xs ys = if length xs > length ys then xs else ys
lcs xs ys = head $ foldr(\xs -> map head. scanr1 f. zipWith (\x y -> [x,y]) xs) e m where
m = map (\x -> flip (++) [[]] $ map (\y -> [x | x==y]) ys) xs
e = replicate (length ys) []
f [a,b] [c,d]
| null a = longest b c: [b]
| otherwise = (a++d):[b]
Simple and slow solution:
import Data.Ord
import Data.List
-- longest common
lcs xs ys = maximumBy (comparing length) $ intersect (subsequences xs) (subsequences ys)
main = print $ lcs "thisisatest" "testing123testing"
- Output:
"tsitest"
Icon and Unicon
This solution is a modified variant of the recursive solution. The modifications include (a) deleting all characters not common to both strings and (b) stripping off common prefixes and suffixes in a single step.
- Output:
lcs( "thisisatest", "testing123testing" ) = "tsitest" lcs( "", "x" ) = "" lcs( "x", "x" ) = "x" lcs( "beginning-middle-ending", "beginning-diddle-dum-ending" ) = "beginning-iddle-ending"
J
lcs=: dyad define
|.x{~ 0{"1 cullOne^:_ (\: +/"1)(\:{."1) 4$.$. x =/ y
)
cullOne=: ({~[: <@<@< [: (i. 0:)1,[: *./[: |: 2>/\]) :: ]
Here's another approach:
mergeSq=: ;@}: ~.@, {.@;@{. ,&.> 3 {:: 4&{.
common=: 2 2 <@mergeSq@,;.3^:_ [: (<@#&.> i.@$) =/
lcs=: [ {~ 0 {"1 ,&$ #: 0 ({:: (#~ [: (= >./) #@>)) 0 ({:: ,) common
Example use (works with either definition of lcs):
'thisisatest' lcs 'testing123testing'
tsitest
Dynamic programming version
longest=: ]`[@.(>&#)
upd=:{:@[,~ ({.@[ ,&.> {:@])`({:@[ longest&.> {.@])@.(0 = #&>@{.@[)
lcs=: 0{:: [: ([: {.&> [: upd&.>/\.<"1@:,.)/ a:,.~a:,~=/{"1 a:,.<"0@[
Output:
'1234' lcs '1224533324'
1234
'thisisatest' lcs 'testing123testing'
tsitest
Recursion
lcs=:;(($:}.) longest }.@[ $: ])`({.@[,$:&}.)@.(=&{.)`((i.0)"_)@.(+.&(0=#))&((e.#[)&>/) ;~
Java
Recursion
This is not a particularly fast algorithm, but it gets the job done eventually. The speed is a result of many recursive function calls.
public static String lcs(String a, String b){
int aLen = a.length();
int bLen = b.length();
if(aLen == 0 || bLen == 0){
return "";
}else if(a.charAt(aLen-1) == b.charAt(bLen-1)){
return lcs(a.substring(0,aLen-1),b.substring(0,bLen-1))
+ a.charAt(aLen-1);
}else{
String x = lcs(a, b.substring(0,bLen-1));
String y = lcs(a.substring(0,aLen-1), b);
return (x.length() > y.length()) ? x : y;
}
}
Dynamic Programming
public static String lcs(String a, String b) {
int[][] lengths = new int[a.length()+1][b.length()+1];
// row 0 and column 0 are initialized to 0 already
for (int i = 0; i < a.length(); i++)
for (int j = 0; j < b.length(); j++)
if (a.charAt(i) == b.charAt(j))
lengths[i+1][j+1] = lengths[i][j] + 1;
else
lengths[i+1][j+1] =
Math.max(lengths[i+1][j], lengths[i][j+1]);
// read the substring out from the matrix
StringBuffer sb = new StringBuffer();
for (int x = a.length(), y = b.length();
x != 0 && y != 0; ) {
if (lengths[x][y] == lengths[x-1][y])
x--;
else if (lengths[x][y] == lengths[x][y-1])
y--;
else {
assert a.charAt(x-1) == b.charAt(y-1);
sb.append(a.charAt(x-1));
x--;
y--;
}
}
return sb.reverse().toString();
}
JavaScript
Recursion
This is more or less a translation of the recursive Java version above.
function lcs(a, b) {
var aSub = a.substr(0, a.length - 1);
var bSub = b.substr(0, b.length - 1);
if (a.length === 0 || b.length === 0) {
return '';
} else if (a.charAt(a.length - 1) === b.charAt(b.length - 1)) {
return lcs(aSub, bSub) + a.charAt(a.length - 1);
} else {
var x = lcs(a, bSub);
var y = lcs(aSub, b);
return (x.length > y.length) ? x : y;
}
}
ES6 recursive implementation
const longest = (xs, ys) => (xs.length > ys.length) ? xs : ys;
const lcs = (xx, yy) => {
if (!xx.length || !yy.length) { return ''; }
const [x, ...xs] = xx;
const [y, ...ys] = yy;
return (x === y) ? (x + lcs(xs, ys)) : longest(lcs(xx, ys), lcs(xs, yy));
};
Dynamic Programming
This version runs in O(mn) time and consumes O(mn) space. Factoring out loop edge cases could get a small constant time improvement, and it's fairly trivial to edit the final loop to produce a full diff in addition to the lcs.
function lcs(x,y){
var s,i,j,m,n,
lcs=[],row=[],c=[],
left,diag,latch;
//make sure shorter string is the column string
if(m<n){s=x;x=y;y=s;}
m = x.length;
n = y.length;
//build the c-table
for(j=0;j<n;row[j++]=0);
for(i=0;i<m;i++){
c[i] = row = row.slice();
for(diag=0,j=0;j<n;j++,diag=latch){
latch=row[j];
if(x[i] == y[j]){row[j] = diag+1;}
else{
left = row[j-1]||0;
if(left>row[j]){row[j] = left;}
}
}
}
i--,j--;
//row[j] now contains the length of the lcs
//recover the lcs from the table
while(i>-1&&j>-1){
switch(c[i][j]){
default: j--;
lcs.unshift(x[i]);
case (i&&c[i-1][j]): i--;
continue;
case (j&&c[i][j-1]): j--;
}
}
return lcs.join('');
}
BUG note: In line 6, m and n are not yet initialized, and so x and y are never swapped. Swapping is useless here, and becomes wrong when extending the algorithm to produce a diff.
The final loop can be modified to concatenate maximal common substrings rather than individual characters:
var t=i;
while(i>-1&&j>-1){
switch(c[i][j]){
default:i--,j--;
continue;
case (i&&c[i-1][j]):
if(t!==i){lcs.unshift(x.substring(i+1,t+1));}
t=--i;
continue;
case (j&&c[i][j-1]): j--;
if(t!==i){lcs.unshift(x.substring(i+1,t+1));}
t=i;
}
}
if(t!==i){lcs.unshift(x.substring(i+1,t+1));}
Greedy Algorithm
This is an heuristic algorithm that won't always return the correct answer, but is significantly faster and less memory intensive than the dynamic programming version, in exchange for giving up the ability to re-use the table to find alternate solutions and greater complexity in generating diffs. Note that this implementation uses a binary buffer for additional efficiency gains, but it's simple to transform to use string or array concatenation;
function lcs_greedy(x,y){
var p1, i, idx,
symbols = {},
r = 0,
p = 0,
l = 0,
m = x.length,
n = y.length,
s = new Buffer((m < n) ? n : m);
p1 = popsym(0);
for (i = 0; i < m; i++) {
p = (r === p) ? p1 : popsym(i);
p1 = popsym(i + 1);
if (p > p1) {
i += 1;
idx = p1;
} else {
idx = p;
}
if (idx === n) {
p = popsym(i);
} else {
r = idx;
s[l] = x.charCodeAt(i);
l += 1;
}
}
return s.toString('utf8', 0, l);
function popsym(index) {
var s = x[index],
pos = symbols[s] + 1;
pos = y.indexOf(s, ((pos > r) ? pos : r));
if (pos === -1) { pos = n; }
symbols[s] = pos;
return pos;
}
}
Note that it won't return the correct answer for all inputs. For example:
lcs_greedy('bcaaaade', 'deaaaabc'); // 'bc' instead of 'aaaa'
jq
Naive recursive version:
def lcs(xstr; ystr):
if (xstr == "" or ystr == "") then ""
else
xstr[0:1] as $x
| xstr[1:] as $xs
| ystr[1:] as $ys
| if ($x == ystr[0:1]) then ($x + lcs($xs; $ys))
else
lcs(xstr; $ys) as $one
| lcs($xs; ystr) as $two
| if ($one|length) > ($two|length) then $one else $two end
end
end;
Example:
lcs("1234"; "1224533324"),
lcs("thisisatest"; "testing123testing")
Output:
# jq -n -f lcs-recursive.jq
"1234"
"tsitest"
Julia
longest(a::String, b::String) = length(a) ≥ length(b) ? a : b
"""
julia> lcsrecursive("thisisatest", "testing123testing")
"tsitest"
"""
# Recursive
function lcsrecursive(xstr::String, ystr::String)
if length(xstr) == 0 || length(ystr) == 0
return ""
end
x, xs, y, ys = xstr[1], xstr[2:end], ystr[1], ystr[2:end]
if x == y
return string(x, lcsrecursive(xs, ys))
else
return longest(lcsrecursive(xstr, ys), lcsrecursive(xs, ystr))
end
end
# Dynamic
function lcsdynamic(a::String, b::String)
lengths = zeros(Int, length(a) + 1, length(b) + 1)
# row 0 and column 0 are initialized to 0 already
for (i, x) in enumerate(a), (j, y) in enumerate(b)
if x == y
lengths[i+1, j+1] = lengths[i, j] + 1
else
lengths[i+1, j+1] = max(lengths[i+1, j], lengths[i, j+1])
end
end
# read the substring out from the matrix
result = ""
x, y = length(a) + 1, length(b) + 1
while x > 1 && y > 1
if lengths[x, y] == lengths[x-1, y]
x -= 1
elseif lengths[x, y] == lengths[x, y-1]
y -= 1
else
@assert a[x-1] == b[y-1]
result = string(a[x-1], result)
x -= 1
y -= 1
end
end
return result
end
@show lcsrecursive("thisisatest", "testing123testing")
@time lcsrecursive("thisisatest", "testing123testing")
@show lcsdynamic("thisisatest", "testing123testing")
@time lcsdynamic("thisisatest", "testing123testing")
- Output:
lcsrecursive("thisisatest", "testing123testing") = "tsitest" 0.038153 seconds (537.77 k allocations: 16.415 MiB) lcsdynamic("thisisatest", "testing123testing") = "tsitest" 0.000004 seconds (12 allocations: 2.141 KiB)
Kotlin
// version 1.1.2
fun lcs(x: String, y: String): String {
if (x.length == 0 || y.length == 0) return ""
val x1 = x.dropLast(1)
val y1 = y.dropLast(1)
if (x.last() == y.last()) return lcs(x1, y1) + x.last()
val x2 = lcs(x, y1)
val y2 = lcs(x1, y)
return if (x2.length > y2.length) x2 else y2
}
fun main(args: Array<String>) {
val x = "thisisatest"
val y = "testing123testing"
println(lcs(x, y))
}
- Output:
tsitest
Liberty BASIC
'variation of BASIC example
w$="aebdef"
z$="cacbc"
print lcs$(w$,z$)
'output:
'ab
wait
FUNCTION lcs$(a$, b$)
IF LEN(a$) = 0 OR LEN(b$) = 0 THEN
lcs$ = ""
exit function
end if
IF RIGHT$(a$, 1) = RIGHT$(b$, 1) THEN
lcs$ = lcs$(LEFT$(a$, LEN(a$) - 1), LEFT$(b$, LEN(b$) - 1)) + RIGHT$(a$, 1)
exit function
ELSE
x$ = lcs$(a$, LEFT$(b$, LEN(b$) - 1))
y$ = lcs$(LEFT$(a$, LEN(a$) - 1), b$)
IF LEN(x$) > LEN(y$) THEN
lcs$ = x$
exit function
ELSE
lcs$ = y$
exit function
END IF
END IF
END FUNCTION
Logo
This implementation works on both words and lists.
to longest :s :t
output ifelse greater? count :s count :t [:s] [:t]
end
to lcs :s :t
if empty? :s [output :s]
if empty? :t [output :t]
if equal? first :s first :t [output combine first :s lcs bf :s bf :t]
output longest lcs :s bf :t lcs bf :s :t
end
Lua
function LCS( a, b )
if #a == 0 or #b == 0 then
return ""
elseif string.sub( a, -1, -1 ) == string.sub( b, -1, -1 ) then
return LCS( string.sub( a, 1, -2 ), string.sub( b, 1, -2 ) ) .. string.sub( a, -1, -1 )
else
local a_sub = LCS( a, string.sub( b, 1, -2 ) )
local b_sub = LCS( string.sub( a, 1, -2 ), b )
if #a_sub > #b_sub then
return a_sub
else
return b_sub
end
end
end
print( LCS( "thisisatest", "testing123testing" ) )
M4
define(`set2d',`define(`$1[$2][$3]',`$4')')
define(`get2d',`defn($1[$2][$3])')
define(`tryboth',
`pushdef(`x',lcs(`$1',substr(`$2',1),`$1 $2'))`'pushdef(`y',
lcs(substr(`$1',1),`$2',`$1 $2'))`'ifelse(eval(len(x)>len(y)),1,
`x',`y')`'popdef(`x')`'popdef(`y')')
define(`checkfirst',
`ifelse(substr(`$1',0,1),substr(`$2',0,1),
`substr(`$1',0,1)`'lcs(substr(`$1',1),substr(`$2',1))',
`tryboth(`$1',`$2')')')
define(`lcs',
`ifelse(get2d(`c',`$1',`$2'),`',
`pushdef(`a',ifelse(
`$1',`',`',
`$2',`',`',
`checkfirst(`$1',`$2')'))`'a`'set2d(`c',`$1',`$2',a)`'popdef(`a')',
`get2d(`c',`$1',`$2')')')
lcs(`1234',`1224533324')
lcs(`thisisatest',`testing123testing')
Note: the caching (set2d/get2d) obscures the code even more than usual, but is necessary in order to get the second test to run in a reasonable amount of time.
Maple
> StringTools:-LongestCommonSubSequence( "thisisatest", "testing123testing" );
"tsitest"
Mathematica /Wolfram Language
A built-in function can do this for us:
a = "thisisatest";
b = "testing123testing";
LongestCommonSequence[a, b]
gives:
tsitest
Note that Mathematica also has a built-in function called LongestCommonSubsequence[a,b]:
finds the longest contiguous subsequence of elements common to the strings or lists a and b.
which would give "test" as the result for LongestCommonSubsequence[a, b].
The description for LongestCommonSequence[a,b] is:
finds the longest sequence of contiguous or disjoint elements common to the strings or lists a and b.
I added this note because the name of this article suggests LongestCommonSubsequence does the job, however LongestCommonSequence performs the puzzle-description.
Nim
Recursion
proc lcs(x, y: string): string =
if x == "" or y == "":
return ""
if x[0] == y[0]:
return x[0] & lcs(x[1..x.high], y[1..y.high])
let a = lcs(x, y[1..y.high])
let b = lcs(x[1..x.high], y)
result = if a.len > b.len: a else: b
echo lcs("1234", "1224533324")
echo lcs("thisisatest", "testing123testing")
This recursive version is not efficient but the execution time can be greatly improved by using memoization.
Dynamic Programming
proc lcs(a, b: string): string =
var ls = newSeq[seq[int]](a.len+1)
for i in 0 .. a.len:
ls[i].newSeq(b.len+1)
for i, x in a:
for j, y in b:
if x == y:
ls[i+1][j+1] = ls[i][j] + 1
else:
ls[i+1][j+1] = max(ls[i+1][j], ls[i][j+1])
result = ""
var x = a.len
var y = b.len
while x > 0 and y > 0:
if ls[x][y] == ls[x-1][y]:
dec x
elif ls[x][y] == ls[x][y-1]:
dec y
else:
assert a[x-1] == b[y-1]
result = a[x-1] & result
dec x
dec y
echo lcs("1234", "1224533324")
echo lcs("thisisatest", "testing123testing")
OCaml
Recursion
from Haskell
let longest xs ys = if List.length xs > List.length ys then xs else ys
let rec lcs a b = match a, b with
[], _
| _, [] -> []
| x::xs, y::ys ->
if x = y then
x :: lcs xs ys
else
longest (lcs a ys) (lcs xs b)
Memoized recursion
let lcs xs ys =
let cache = Hashtbl.create 16 in
let rec lcs xs ys =
try Hashtbl.find cache (xs, ys) with
| Not_found ->
let result =
match xs, ys with
| [], _ -> []
| _, [] -> []
| x :: xs, y :: ys when x = y ->
x :: lcs xs ys
| _ :: xs_rest, _ :: ys_rest ->
let a = lcs xs_rest ys in
let b = lcs xs ys_rest in
if (List.length a) > (List.length b) then a else b
in
Hashtbl.add cache (xs, ys) result;
result
in
lcs xs ys
Dynamic programming
let lcs xs' ys' =
let xs = Array.of_list xs'
and ys = Array.of_list ys' in
let n = Array.length xs
and m = Array.length ys in
let a = Array.make_matrix (n+1) (m+1) [] in
for i = n-1 downto 0 do
for j = m-1 downto 0 do
a.(i).(j) <- if xs.(i) = ys.(j) then
xs.(i) :: a.(i+1).(j+1)
else
longest a.(i).(j+1) a.(i+1).(j)
done
done;
a.(0).(0)
Because both solutions only work with lists, here are some functions to convert to and from strings:
let list_of_string str =
let result = ref [] in
String.iter (fun x -> result := x :: !result)
str;
List.rev !result
let string_of_list lst =
let result = String.create (List.length lst) in
ignore (List.fold_left (fun i x -> result.[i] <- x; i+1) 0 lst);
result
Both solutions work. Example:
# string_of_list (lcs (list_of_string "thisisatest") (list_of_string "testing123testing"));; - : string = "tsitest"
Oz
Recursive solution:
declare
fun {LCS Xs Ys}
case [Xs Ys]
of [nil _] then nil
[] [_ nil] then nil
[] [X|Xr Y|Yr] andthen X==Y then X|{LCS Xr Yr}
[] [_|Xr _|Yr] then {Longest {LCS Xs Yr} {LCS Xr Ys}}
end
end
fun {Longest Xs Ys}
if {Length Xs} > {Length Ys} then Xs else Ys end
end
in
{System.showInfo {LCS "thisisatest" "testing123testing"}}
Pascal
Program LongestCommonSubsequence(output);
function lcs(a, b: string): string;
var
x, y: string;
lenga, lengb: integer;
begin
lenga := length(a);
lengb := length(b);
lcs := '';
if (lenga > 0) and (lengb > 0) then
if a[lenga] = b[lengb] then
lcs := lcs(copy(a, 1, lenga-1), copy(b, 1, lengb-1)) + a[lenga]
else
begin
x := lcs(a, copy(b, 1, lengb-1));
y := lcs(copy(a, 1, lenga-1), b);
if length(x) > length(y) then
lcs := x
else
lcs := y;
end;
end;
var
s1, s2: string;
begin
s1 := 'thisisatest';
s2 := 'testing123testing';
writeln (lcs(s1, s2));
s1 := '1234';
s2 := '1224533324';
writeln (lcs(s1, s2));
end.
- Output:
:> ./LongestCommonSequence tsitest 1234
PascalABC.NET
function Lengths(x,y: string): array[,] of integer;
begin
var (m,n) := (x.Length,y.Length);
var C := new integer[m+1, n+1]; // filled with zeroes
for var i:=1 to m do
for var j:=1 to n do
if x[i] = y[j] then
C[i,j] := C[i-1,j-1] + 1
else C[i,j] := max(C[i,j-1], C[i-1,j]);
Result := C;
end;
function lcshelper(x,y: string; i,j: integer): string;
begin
var C := Lengths(x,y);
if (i = 0) or (j = 0) then
Result := ''
else if X[i] = Y[j] then
Result := lcshelper(X, Y, i-1, j-1) + X[i]
else if C[i,j-1] > C[i-1,j] then
Result := lcshelper(X, Y, i, j-1)
else Result := lcshelper(X, Y, i-1, j)
end;
function lcs(x,y: string) := lcshelper(x,y,x.Length,y.Length);
begin
Println(lcs('1234','1224533324'));
Println(lcs('thisisatest','testing123testing'));
end.
- Output:
1234 tsitest
Perl
sub lcs {
my ($a, $b) = @_;
if (!length($a) || !length($b)) {
return "";
}
if (substr($a, 0, 1) eq substr($b, 0, 1)) {
return substr($a, 0, 1) . lcs(substr($a, 1), substr($b, 1));
}
my $c = lcs(substr($a, 1), $b) || "";
my $d = lcs($a, substr($b, 1)) || "";
return length($c) > length($d) ? $c : $d;
}
print lcs("thisisatest", "testing123testing") . "\n";
Alternate letting regex do all the work
use strict;
use warnings;
use feature 'bitwise';
print "lcs is ", lcs('thisisatest', 'testing123testing'), "\n";
sub lcs
{
my ($c, $d) = @_;
for my $len ( reverse 1 .. length($c &. $d) )
{
"$c\n$d" =~ join '.*', ('(.)') x $len, "\n", map "\\$_", 1 .. $len and
return join '', @{^CAPTURE};
}
return '';
}
- Output:
lcs is tastiest
Phix
If you want this to work with (utf8) Unicode text, just chuck the inputs through utf8_to_utf32() first (and the output through utf32_to_utf8()).
with javascript_semantics function lcs(sequence a, b) sequence res = "" if length(a) and length(b) then if a[$]=b[$] then res = lcs(a[1..-2],b[1..-2])&a[$] else sequence l = lcs(a,b[1..-2]), r = lcs(a[1..-2],b) res = iff(length(l)>length(r)?l:r) end if end if return res end function constant tests = {{"1234","1224533324"}, {"thisisatest","testing123testing"}} for i=1 to length(tests) do string {a,b} = tests[i] ?lcs(a,b) end for
- Output:
"1234" "tsitest"
Alternate version
same output
with javascript_semantics function LCSLength(sequence X, sequence Y) sequence C = repeat(repeat(0,length(Y)+1),length(X)+1) for i=1 to length(X) do for j=1 to length(Y) do if X[i]=Y[j] then C[i+1][j+1] := C[i][j]+1 else C[i+1][j+1] := max(C[i+1][j], C[i][j+1]) end if end for end for return C end function function backtrack(sequence C, sequence X, sequence Y, integer i, integer j) if i=0 or j=0 then return "" elsif X[i]=Y[j] then return backtrack(C, X, Y, i-1, j-1) & X[i] else if C[i+1][j]>C[i][j+1] then return backtrack(C, X, Y, i, j-1) else return backtrack(C, X, Y, i-1, j) end if end if end function function lcs(sequence a, sequence b) return backtrack(LCSLength(a,b),a,b,length(a),length(b)) end function constant tests = {{"1234","1224533324"}, {"thisisatest","testing123testing"}} for i=1 to length(tests) do string {a,b} = tests[i] ?lcs(a,b) end for
Picat
Wikipedia algorithm
With some added trickery for a 1-based language.
lcs_wiki(X,Y) = V =>
[C, _Len] = lcs_length(X,Y),
V = backTrace(C,X,Y,X.length+1,Y.length+1).
lcs_length(X, Y) = V=>
M = X.length,
N = Y.length,
C = [[0 : J in 1..N+1] : I in 1..N+1],
foreach(I in 2..M+1,J in 2..N+1)
if X[I-1] == Y[J-1] then
C[I,J] := C[I-1,J-1] + 1
else
C[I,J] := max([C[I,J-1], C[I-1,J]])
end
end,
V = [C, C[M+1,N+1]].
backTrace(C, X, Y, I, J) = V =>
if I == 1; J == 1 then
V = ""
elseif X[I-1] == Y[J-1] then
V = backTrace(C, X, Y, I-1, J-1) ++ [X[I-1]]
else
if C[I,J-1] > C[I-1,J] then
V = backTrace(C, X, Y, I, J-1)
else
V = backTrace(C, X, Y, I-1, J)
end
end.
Rule-based
table
lcs_rule(A, B) = "", (A == ""; B == "") => true.
lcs_rule(A, B) = [A[1]] ++ lcs_rule(butfirst(A), butfirst(B)), A[1] == B[1] => true.
lcs_rule(A, B) = longest(lcs_rule(butfirst(A), B), lcs_rule(A, butfirst(B))) => true.
% Return the longest string of A and B
longest(A, B) = cond(A.length > B.length, A, B).
% butfirst (everything except first element)
butfirst(A) = [A[I] : I in 2..A.length].
Test
go =>
Tests = [["thisisatest","testing123testing"],
["XMJYAUZ", "MZJAWXU"],
["1234", "1224533324"],
["beginning-middle-ending","beginning-diddle-dum-ending"]
],
Funs = [lcs_wiki,lcs_rule],
foreach(Fun in Funs)
println(fun=Fun),
foreach(Test in Tests)
printf("%w : %w\n", Test, apply(Fun,Test[1],Test[2]))
end,
nl
end,
nl.
- Output:
fun = lcs_wiki [thisisatest,testing123testing] : tsitest [XMJYAUZ,MZJAWXU] : MJAU [1234,1224533324] : 1234 [beginning-middle-ending,beginning-diddle-dum-ending] : beginning-iddle-ending fun = lcs_rule [thisisatest,testing123testing] : tsitest [XMJYAUZ,MZJAWXU] : MJAU [1234,1224533324] : 1234 [beginning-middle-ending,beginning-diddle-dum-ending] : beginning-iddle-ending
PicoLisp
(de commonSequences (A B)
(when A
(conc
(when (member (car A) B)
(mapcar '((L) (cons (car A) L))
(cons NIL (commonSequences (cdr A) (cdr @))) ) )
(commonSequences (cdr A) B) ) ) )
(maxi length
(commonSequences
(chop "thisisatest")
(chop "testing123testing") ) )
- Output:
-> ("t" "s" "i" "t" "e" "s" "t")
PowerShell
Returns a sequence (array) of a type:
function Get-Lcs ($ReferenceObject, $DifferenceObject)
{
$longestCommonSubsequence = @()
$x = $ReferenceObject.Length
$y = $DifferenceObject.Length
$lengths = New-Object -TypeName 'System.Object[,]' -ArgumentList ($x + 1), ($y + 1)
for($i = 0; $i -lt $x; $i++)
{
for ($j = 0; $j -lt $y; $j++)
{
if ($ReferenceObject[$i] -ceq $DifferenceObject[$j])
{
$lengths[($i+1),($j+1)] = $lengths[$i,$j] + 1
}
else
{
$lengths[($i+1),($j+1)] = [Math]::Max(($lengths[($i+1),$j]),($lengths[$i,($j+1)]))
}
}
}
while (($x -ne 0) -and ($y -ne 0))
{
if ( $lengths[$x,$y] -eq $lengths[($x-1),$y])
{
--$x
}
elseif ($lengths[$x,$y] -eq $lengths[$x,($y-1)])
{
--$y
}
else
{
if ($ReferenceObject[($x-1)] -ceq $DifferenceObject[($y-1)])
{
$longestCommonSubsequence = ,($ReferenceObject[($x-1)]) + $longestCommonSubsequence
}
--$x
--$y
}
}
$longestCommonSubsequence
}
Returns the character array as a string:
(Get-Lcs -ReferenceObject "thisisatest" -DifferenceObject "testing123testing") -join ""
- Output:
tsitest
Returns an array of integers:
Get-Lcs -ReferenceObject @(1,2,3,4) -DifferenceObject @(1,2,2,4,5,3,3,3,2,4)
- Output:
1 2 3 4
Given two lists of objects, return the LCS of the ID property:
$list1
ID X Y
-- - -
1 101 201
2 102 202
3 103 203
4 104 204
5 105 205
6 106 206
7 107 207
8 108 208
9 109 209
$list2
ID X Y
-- - -
1 101 201
3 103 203
5 105 205
7 107 207
9 109 209
Get-Lcs -ReferenceObject $list1.ID -DifferenceObject $list2.ID
- Output:
1 3 5 7 9
Prolog
Recursive Version
First version:
test :-
time(lcs("thisisatest", "testing123testing", Lcs)),
writef('%s',[Lcs]).
lcs([ H|L1],[ H|L2],[H|Lcs]) :- !,
lcs(L1,L2,Lcs).
lcs([H1|L1],[H2|L2],Lcs):-
lcs( L1 ,[H2|L2],Lcs1),
lcs([H1|L1], L2 ,Lcs2),
longest(Lcs1,Lcs2,Lcs),!.
lcs(_,_,[]).
longest(L1,L2,Longest) :-
length(L1,Length1),
length(L2,Length2),
((Length1 > Length2) -> Longest = L1; Longest = L2).
Second version, with memoization:
%declare that we will add lcs_db facts during runtime
:- dynamic lcs_db/3.
test :-
retractall(lcs_db(_,_,_)), %clear the database of known results
time(lcs("thisisatest", "testing123testing", Lcs)),
writef('%s',[Lcs]).
% check if the result is known
lcs(L1,L2,Lcs) :-
lcs_db(L1,L2,Lcs),!.
lcs([ H|L1],[ H|L2],[H|Lcs]) :- !,
lcs(L1,L2,Lcs).
lcs([H1|L1],[H2|L2],Lcs) :-
lcs( L1 ,[H2|L2],Lcs1),
lcs([H1|L1], L2 ,Lcs2),
longest(Lcs1,Lcs2,Lcs),!,
assert(lcs_db([H1|L1],[H2|L2],Lcs)).
lcs(_,_,[]).
longest(L1,L2,Longest) :-
length(L1,Length1),
length(L2,Length2),
((Length1 > Length2) -> Longest = L1; Longest = L2).
- Demonstrating:
Example for "beginning-middle-ending" and "beginning-diddle-dum-ending"
First version :
?- time(lcs("beginning-middle-ending","beginning-diddle-dum-ending", Lcs)),writef('%s', [Lcs]).
% 10,875,184 inferences, 1.840 CPU in 1.996 seconds (92% CPU, 5910426 Lips)
beginning-iddle-ending
Second version which is much faster :
?- time(lcs("beginning-middle-ending","beginning-diddle-dum-ending", Lcs)),writef('%s', [Lcs]).
% 2,376 inferences, 0.010 CPU in 0.003 seconds (300% CPU, 237600 Lips)
beginning-iddle-ending
PureBasic
Procedure.s lcs(a$, b$)
Protected x$ , lcs$
If Len(a$) = 0 Or Len(b$) = 0
lcs$ = ""
ElseIf Right(a$, 1) = Right(b$, 1)
lcs$ = lcs(Left(a$, Len(a$) - 1), Left(b$, Len(b$) - 1)) + Right(a$, 1)
Else
x$ = lcs(a$, Left(b$, Len(b$) - 1))
y$ = lcs(Left(a$, Len(a$) - 1), b$)
If Len(x$) > Len(y$)
lcs$ = x$
Else
lcs$ = y$
EndIf
EndIf
ProcedureReturn lcs$
EndProcedure
OpenConsole()
PrintN( lcs("thisisatest", "testing123testing"))
PrintN("Press any key to exit"): Repeat: Until Inkey() <> ""
Python
The simplest way is to use LCS within mlpy package
Recursion
This solution is similar to the Haskell one. It is slow.
def lcs(xstr, ystr):
"""
>>> lcs('thisisatest', 'testing123testing')
'tsitest'
"""
if not xstr or not ystr:
return ""
x, xs, y, ys = xstr[0], xstr[1:], ystr[0], ystr[1:]
if x == y:
return str(lcs(xs, ys)) + x
else:
return max(lcs(xstr, ys), lcs(xs, ystr), key=len)
Test it:
if __name__=="__main__":
import doctest; doctest.testmod()
Dynamic Programming
def lcs(a, b):
# generate matrix of length of longest common subsequence for substrings of both words
lengths = [[0] * (len(b)+1) for _ in range(len(a)+1)]
for i, x in enumerate(a):
for j, y in enumerate(b):
if x == y:
lengths[i+1][j+1] = lengths[i][j] + 1
else:
lengths[i+1][j+1] = max(lengths[i+1][j], lengths[i][j+1])
# read a substring from the matrix
result = ''
j = len(b)
for i in range(1, len(a)+1):
if lengths[i][j] != lengths[i-1][j]:
result += a[i-1]
return result
Racket
#lang racket
(define (longest xs ys)
(if (> (length xs) (length ys))
xs ys))
(define memo (make-hash))
(define (lookup xs ys)
(hash-ref memo (cons xs ys) #f))
(define (store xs ys r)
(hash-set! memo (cons xs ys) r)
r)
(define (lcs/list sx sy)
(or (lookup sx sy)
(store sx sy
(match* (sx sy)
[((cons x xs) (cons y ys))
(if (equal? x y)
(cons x (lcs/list xs ys))
(longest (lcs/list sx ys) (lcs/list xs sy)))]
[(_ _) '()]))))
(define (lcs sx sy)
(list->string (lcs/list (string->list sx) (string->list sy))))
(lcs "thisisatest" "testing123testing")
- Output:
"tsitest">
Raku
(formerly Perl 6)
Recursion
This solution is similar to the Haskell one. It is slow.
say lcs("thisisatest", "testing123testing");sub lcs(Str $xstr, Str $ystr) {
return "" unless $xstr && $ystr;
my ($x, $xs, $y, $ys) = $xstr.substr(0, 1), $xstr.substr(1), $ystr.substr(0, 1), $ystr.substr(1);
return $x eq $y
?? $x ~ lcs($xs, $ys)
!! max(:by{ $^a.chars }, lcs($xstr, $ys), lcs($xs, $ystr) );
}
say lcs("thisisatest", "testing123testing");
Dynamic Programming
sub lcs(Str $xstr, Str $ystr) {
my ($xlen, $ylen) = ($xstr, $ystr)>>.chars;
my @lengths = map {[(0) xx ($ylen+1)]}, 0..$xlen;
for $xstr.comb.kv -> $i, $x {
for $ystr.comb.kv -> $j, $y {
@lengths[$i+1][$j+1] = $x eq $y ?? @lengths[$i][$j]+1 !! (@lengths[$i+1][$j], @lengths[$i][$j+1]).max;
}
}
my @x = $xstr.comb;
my ($x, $y) = ($xlen, $ylen);
my $result = "";
while $x != 0 && $y != 0 {
if @lengths[$x][$y] == @lengths[$x-1][$y] {
$x--;
}
elsif @lengths[$x][$y] == @lengths[$x][$y-1] {
$y--;
}
else {
$result = @x[$x-1] ~ $result;
$x--;
$y--;
}
}
return $result;
}
say lcs("thisisatest", "testing123testing");
Bit Vector
Bit parallel dynamic programming with nearly linear complexity O(n). It is fast.
sub lcs(Str $xstr, Str $ystr) {
my (@a, @b) := ($xstr, $ystr)».comb;
my (%positions, @Vs, $lcs);
for @a.kv -> $i, $x { %positions{$x} +|= 1 +< ($i % @a) }
my $S = +^ 0;
for (0 ..^ @b) -> $j {
my $u = $S +& (%positions{@b[$j]} // 0);
@Vs[$j] = $S = ($S + $u) +| ($S - $u)
}
my ($i, $j) = @a-1, @b-1;
while ($i ≥ 0 and $j ≥ 0) {
unless (@Vs[$j] +& (1 +< $i)) {
$lcs [R~]= @a[$i] unless $j and ^@Vs[$j-1] +& (1 +< $i);
$j--
}
$i--
}
$lcs
}
say lcs("thisisatest", "testing123testing");
ReasonML
let longest = (xs, ys) =>
if (List.length(xs) > List.length(ys)) {
xs;
} else {
ys;
};
let rec lcs = (a, b) =>
switch (a, b) {
| ([], _)
| (_, []) => []
| ([x, ...xs], [y, ...ys]) =>
if (x == y) {
[x, ...lcs(xs, ys)];
} else {
longest(lcs(a, ys), lcs(xs, b));
}
};
REXX
/*REXX program tests the LCS (Longest Common Subsequence) subroutine. */
parse arg aaa bbb . /*obtain optional arguments from the CL*/
say 'string A =' aaa /*echo the string A to the screen. */
say 'string B =' bbb /* " " " B " " " */
say ' LCS =' LCS(aaa, bbb) /*tell the Longest Common Sequence. */
exit /*stick a fork in it, we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
LCS: procedure; parse arg a,b,z /*Longest Common Subsequence. */
/*reduce recursions, removes the */
/*chars in A ¬ in B, and vice─versa.*/
if z=='' then return LCS( LCS(a,b,0), LCS(b,a,0), 9) /*Is Z null? Do recurse. */
j= length(a)
if z==0 then do /*a special invocation: shrink Z. */
do j=1 for j; _= substr(a, j, 1)
if pos(_, b)\==0 then z= z || _
end /*j*/
return substr(z, 2)
end
k= length(b)
if j==0 | k==0 then return '' /*Is either string null? Bupkis. */
_= substr(a, j, 1)
if _==substr(b, k, 1) then return LCS( substr(a, 1, j - 1), substr(b, 1, k - 1), 9)_
x= LCS(a, substr(b, 1, k - 1), 9)
y= LCS( substr(a, 1, j - 1), b, 9)
if length(x)>length(y) then return x
return y
- output when using the input of: 1234 1224533324
string A = 1234 string B = 1224533324 LCS = 1234
- output when using the input of: thisisatest testing123testing
string A = thisisatest string B = testing123testing LCS = tsitest
Ring
see longest("1267834", "1224533324") + nl
func longest a, b
if a = "" or b = "" return "" ok
if right(a, 1) = right(b, 1)
lcs = longest(left(a, len(a) - 1), left(b, len(b) - 1)) + right(a, 1)
return lcs
else
x1 = longest(a, left(b, len(b) - 1))
x2 = longest(left(a, len(a) - 1), b)
if len(x1) > len(x2)
lcs = x1
return lcs
else
lcs = x2
return lcs ok ok
Output:
1234
Ruby
Recursion
This solution is similar to the Haskell one. It is slow (The time complexity is exponential.)
=begin
irb(main):001:0> lcs('thisisatest', 'testing123testing')
=> "tsitest"
=end
def lcs(xstr, ystr)
return "" if xstr.empty? || ystr.empty?
x, xs, y, ys = xstr[0..0], xstr[1..-1], ystr[0..0], ystr[1..-1]
if x == y
x + lcs(xs, ys)
else
[lcs(xstr, ys), lcs(xs, ystr)].max_by {|x| x.size}
end
end
Dynamic programming
Walker class for the LCS matrix:
class LCS
SELF, LEFT, UP, DIAG = [0,0], [0,-1], [-1,0], [-1,-1]
def initialize(a, b)
@m = Array.new(a.length) { Array.new(b.length) }
a.each_char.with_index do |x, i|
b.each_char.with_index do |y, j|
match(x, y, i, j)
end
end
end
def match(c, d, i, j)
@i, @j = i, j
@m[i][j] = compute_entry(c, d)
end
def lookup(x, y) [@i+x, @j+y] end
def valid?(i=@i, j=@j) i >= 0 && j >= 0 end
def peek(x, y)
i, j = lookup(x, y)
valid?(i, j) ? @m[i][j] : 0
end
def compute_entry(c, d)
c == d ? peek(*DIAG) + 1 : [peek(*LEFT), peek(*UP)].max
end
def backtrack
@i, @j = @m.length-1, @m[0].length-1
y = []
y << @i+1 if backstep? while valid?
y.reverse
end
def backtrack2
@i, @j = @m.length-1, @m[0].length-1
y = []
y << @j+1 if backstep? while valid?
[backtrack, y.reverse]
end
def backstep?
backstep = compute_backstep
@i, @j = lookup(*backstep)
backstep == DIAG
end
def compute_backstep
case peek(*SELF)
when peek(*LEFT) then LEFT
when peek(*UP) then UP
else DIAG
end
end
end
def lcs(a, b)
walker = LCS.new(a, b)
walker.backtrack.map{|i| a[i]}.join
end
if $0 == __FILE__
puts lcs('thisisatest', 'testing123testing')
puts lcs("rosettacode", "raisethysword")
end
- Output:
tsitest rsetod
Referring to LCS here and here.
Run BASIC
a$ = "aebdaef"
b$ = "cacbac"
print lcs$(a$,b$)
end
FUNCTION lcs$(a$, b$)
IF a$ = "" OR b$ = "" THEN
lcs$ = ""
goto [ext]
end if
IF RIGHT$(a$, 1) = RIGHT$(b$, 1) THEN
lcs$ = lcs$(LEFT$(a$, LEN(a$) - 1), LEFT$(b$, LEN(b$) - 1)) + RIGHT$(a$, 1)
goto [ext]
ELSE
x1$ = lcs$(a$, LEFT$(b$, LEN(b$) - 1))
x2$ = lcs$(LEFT$(a$, LEN(a$) - 1), b$)
IF LEN(x1$) > LEN(x2$) THEN
lcs$ = x1$
goto [ext]
ELSE
lcs$ = x2$
goto [ext]
END IF
END IF
[ext]
END FUNCTION
aba
Rust
Dynamic programming version:
use std::cmp;
fn lcs(string1: String, string2: String) -> (usize, String){
let total_rows = string1.len() + 1;
let total_columns = string2.len() + 1;
// rust doesn't allow accessing string by index
let string1_chars = string1.as_bytes();
let string2_chars = string2.as_bytes();
let mut table = vec![vec![0; total_columns]; total_rows];
for row in 1..total_rows{
for col in 1..total_columns {
if string1_chars[row - 1] == string2_chars[col - 1]{
table[row][col] = table[row - 1][col - 1] + 1;
} else {
table[row][col] = cmp::max(table[row][col-1], table[row-1][col]);
}
}
}
let mut common_seq = Vec::new();
let mut x = total_rows - 1;
let mut y = total_columns - 1;
while x != 0 && y != 0 {
// Check element above is equal
if table[x][y] == table[x - 1][y] {
x = x - 1;
}
// check element to the left is equal
else if table[x][y] == table[x][y - 1] {
y = y - 1;
}
else {
// check the two element at the respective x,y position is same
assert_eq!(string1_chars[x-1], string2_chars[y-1]);
let char = string1_chars[x - 1];
common_seq.push(char);
x = x - 1;
y = y - 1;
}
}
common_seq.reverse();
(table[total_rows - 1][total_columns - 1], String::from_utf8(common_seq).unwrap())
}
fn main() {
let res = lcs("abcdaf".to_string(), "acbcf".to_string());
assert_eq!((4 as usize, "abcf".to_string()), res);
let res = lcs("thisisatest".to_string(), "testing123testing".to_string());
assert_eq!((7 as usize, "tsitest".to_string()), res);
// LCS for input Sequences “AGGTAB” and “GXTXAYB” is “GTAB” of length 4.
let res = lcs("AGGTAB".to_string(), "GXTXAYB".to_string());
assert_eq!((4 as usize, "GTAB".to_string()), res);
}
Scala
Using lazily evaluated lists:
def lcsLazy[T](u: IndexedSeq[T], v: IndexedSeq[T]): IndexedSeq[T] = {
def su = subsets(u).to(LazyList)
def sv = subsets(v).to(LazyList)
su.intersect(sv).headOption match{
case Some(sub) => sub
case None => IndexedSeq[T]()
}
}
def subsets[T](u: IndexedSeq[T]): Iterator[IndexedSeq[T]] = {
u.indices.reverseIterator.flatMap{n => u.indices.combinations(n + 1).map(_.map(u))}
}
Using recursion:
def lcsRec[T]: (IndexedSeq[T], IndexedSeq[T]) => IndexedSeq[T] = {
case (a +: as, b +: bs) if a == b => a +: lcsRec(as, bs)
case (as, bs) if as.isEmpty || bs.isEmpty => IndexedSeq[T]()
case (a +: as, b +: bs) =>
val (s1, s2) = (lcsRec(a +: as, bs), lcsRec(as, b +: bs))
if(s1.length > s2.length) s1 else s2
}
- Output:
scala> lcsLazy("thisisatest", "testing123testing").mkString res0: String = tsitest scala> lcsRec("thisisatest", "testing123testing").mkString res1: String = tsitest
Recursive version:
def lcs[T]: (List[T], List[T]) => List[T] = {
case (_, Nil) => Nil
case (Nil, _) => Nil
case (x :: xs, y :: ys) if x == y => x :: lcs(xs, ys)
case (x :: xs, y :: ys) => {
(lcs(x :: xs, ys), lcs(xs, y :: ys)) match {
case (xs, ys) if xs.length > ys.length => xs
case (xs, ys) => ys
}
}
}
The dynamic programming version:
case class Memoized[A1, A2, B](f: (A1, A2) => B) extends ((A1, A2) => B) {
val cache = scala.collection.mutable.Map.empty[(A1, A2), B]
def apply(x: A1, y: A2) = cache.getOrElseUpdate((x, y), f(x, y))
}
lazy val lcsM: Memoized[List[Char], List[Char], List[Char]] = Memoized {
case (_, Nil) => Nil
case (Nil, _) => Nil
case (x :: xs, y :: ys) if x == y => x :: lcsM(xs, ys)
case (x :: xs, y :: ys) => {
(lcsM(x :: xs, ys), lcsM(xs, y :: ys)) match {
case (xs, ys) if xs.length > ys.length => xs
case (xs, ys) => ys
}
}
}
- Output:
scala> lcsM("thisiaatest".toList, "testing123testing".toList).mkString res0: String = tsitest
Scheme
Port from Clojure.
;; using srfi-69
(define (memoize proc)
(let ((results (make-hash-table)))
(lambda args
(or (hash-table-ref results args (lambda () #f))
(let ((r (apply proc args)))
(hash-table-set! results args r)
r)))))
(define (longest xs ys)
(if (> (length xs)
(length ys))
xs ys))
(define lcs
(memoize
(lambda (seqx seqy)
(if (pair? seqx)
(let ((x (car seqx))
(xs (cdr seqx)))
(if (pair? seqy)
(let ((y (car seqy))
(ys (cdr seqy)))
(if (equal? x y)
(cons x (lcs xs ys))
(longest (lcs seqx ys)
(lcs xs seqy))))
'()))
'()))))
Testing:
(test-group
"lcs"
(test '() (lcs '(a b c) '(A B C)))
(test '(a) (lcs '(a a a) '(A A a)))
(test '() (lcs '() '(a b c)))
(test '() (lcs '(a b c) '()))
(test '(a c) (lcs '(a b c) '(a B c)))
(test '(b) (lcs '(a b c) '(A b C)))
(test '( b d e f g h j)
(lcs '(a b d e f g h i j)
'(A b c d e f F a g h j))))
Seed7
$ include "seed7_05.s7i";
const func string: lcs (in string: a, in string: b) is func
result
var string: lcs is "";
local
var string: x is "";
var string: y is "";
begin
if a <> "" and b <> "" then
if a[length(a)] = b[length(b)] then
lcs := lcs(a[.. pred(length(a))], b[.. pred(length(b))]) & str(a[length(a)]);
else
x := lcs(a, b[.. pred(length(b))]);
y := lcs(a[.. pred(length(a))], b);
if length(x) > length(y) then
lcs := x;
else
lcs := y;
end if;
end if;
end if;
end func;
const proc: main is func
begin
writeln(lcs("thisisatest", "testing123testing"));
writeln(lcs("1234", "1224533324"));
end func;
Output:
tsitest 1234
SequenceL
It is interesting to note that x and y are computed in parallel, dividing work across threads repeatedly down through the recursion.
import <Utilities/Sequence.sl>;
lcsBack(a(1), b(1)) :=
let
aSub := allButLast(a);
bSub := allButLast(b);
x := lcsBack(a, bSub);
y := lcsBack(aSub, b);
in
[] when size(a) = 0 or size(b) = 0
else
lcsBack(aSub, bSub) ++ [last(a)] when last(a) = last(b)
else
x when size(x) > size(y)
else
y;
main(args(2)) :=
lcsBack(args[1], args[2]) when size(args) >=2
else
lcsBack("thisisatest", "testing123testing");
- Output:
"tsitest"
SETL
Recursive; Also works on tuples (vectors)
op .longest(a, b);
return if #a > #b then a else b end;
end .longest;
procedure lcs(a, b);
if exists empty in {a, b} | #empty = 0 then
return empty;
elseif a(1) = b(1) then
return a(1) + lcs(a(2..), b(2..));
else
return lcs(a(2..), b) .longest lcs(a, b(2..));
end;
end lcs;
Sidef
func lcs(xstr, ystr) is cached {
xstr.is_empty && return xstr
ystr.is_empty && return ystr
var(x, xs, y, ys) = (xstr.first(1), xstr.slice(1),
ystr.first(1), ystr.slice(1))
if (x == y) {
x + lcs(xs, ys)
} else {
[lcs(xstr, ys), lcs(xs, ystr)].max_by { .len }
}
}
say lcs("thisisatest", "testing123testing")
- Output:
tsitest
Slate
We define this on the Sequence type since there is nothing string-specific about the concept.
Recursion
s1@(Sequence traits) longestCommonSubsequenceWith: s2@(Sequence traits)
[
s1 isEmpty \/ s2 isEmpty ifTrue: [^ {}].
s1 last = s2 last
ifTrue: [(s1 allButLast longestCommonSubsequenceWith: s2 allButLast) copyWith: s1 last]
ifFalse: [| x y |
x: (s1 longestCommonSubsequenceWith: s2 allButLast).
y: (s1 allButLast longestCommonSubsequenceWith: s2).
x length > y length ifTrue: [x] ifFalse: [y]]
].
Dynamic Programming
s1@(Sequence traits) longestCommonSubsequenceWith: s2@(Sequence traits)
[| lengths |
lengths: (ArrayMD newWithDimensions: {s1 length `cache. s2 length `cache} defaultElement: 0).
s1 doWithIndex: [| :elem1 :index1 |
s2 doWithIndex: [| :elem2 :index2 |
elem1 = elem2
ifTrue: [lengths at: {index1 + 1. index2 + 1} put: (lengths at: {index1. index2}) + 1]
ifFalse: [lengths at: {index1 + 1. index2 + 1} put:
((lengths at: {index1 + 1. index2}) max: (lengths at: {index1. index2 + 1}))]]].
([| :result index1 index2 |
index1: s1 length.
index2: s2 length.
[index1 isPositive /\ index2 isPositive]
whileTrue:
[(lengths at: {index1. index2}) = (lengths at: {index1 - 1. index2})
ifTrue: [index1: index1 - 1]
ifFalse: [(lengths at: {index1. index2}) = (lengths at: {index1. index2 - 1})]
ifTrue: [index2: index2 - 1]
ifFalse: ["assert: (s1 at: index1 - 1) = (s2 at: index2 - 1)."
result nextPut: (s1 at: index1 - 1).
index1: index1 - 1.
index2: index2 - 1]]
] writingAs: s1) reverse
].
Swift
Swift 5.1
Recursion
rlcs(_ s1: String, _ s2: String) -> String {
if s1.count == 0 || s2.count == 0 {
return ""
} else if s1[s1.index(s1.endIndex, offsetBy: -1)] == s2[s2.index(s2.endIndex, offsetBy: -1)] {
return rlcs(String(s1[s1.startIndex..<s1.index(s1.endIndex, offsetBy: -1)]),
String(s2[s2.startIndex..<s2.index(s2.endIndex, offsetBy: -1)])) + String(s1[s1.index(s1.endIndex, offsetBy: -1)])
} else {
let str1 = rlcs(s1, String(s2[s2.startIndex..<s2.index(s2.endIndex, offsetBy: -1)]))
let str2 = rlcs(String(s1[s1.startIndex..<s1.index(s1.endIndex, offsetBy: -1)]), s2)
return str1.count > str2.count ? str1 : str2
}
}
Dynamic Programming
func lcs(_ s1: String, _ s2: String) -> String {
var lens = Array(
repeating:Array(repeating: 0, count: s2.count + 1),
count: s1.count + 1
)
for i in 0..<s1.count {
for j in 0..<s2.count {
if s1[s1.index(s1.startIndex, offsetBy: i)] == s2[s2.index(s2.startIndex, offsetBy: j)] {
lens[i + 1][j + 1] = lens[i][j] + 1
} else {
lens[i + 1][j + 1] = max(lens[i + 1][j], lens[i][j + 1])
}
}
}
var returnStr = ""
var x = s1.count
var y = s2.count
while x != 0 && y != 0 {
if lens[x][y] == lens[x - 1][y] {
x -= 1
} else if lens[x][y] == lens[x][y - 1] {
y -= 1
} else {
returnStr += String(s1[s1.index(s1.startIndex, offsetBy: x - 1)])
x -= 1
y -= 1
}
}
return String(returnStr.reversed())
}
Tcl
Recursive
proc r_lcs {a b} {
if {$a eq "" || $b eq ""} {return ""}
set a_ [string range $a 1 end]
set b_ [string range $b 1 end]
if {[set c [string index $a 0]] eq [string index $b 0]} {
return "$c[r_lcs $a_ $b_]"
} else {
set x [r_lcs $a $b_]
set y [r_lcs $a_ $b]
return [expr {[string length $x] > [string length $y] ? $x :$y}]
}
}
Dynamic
package require Tcl 8.5
namespace import ::tcl::mathop::+
namespace import ::tcl::mathop::-
namespace import ::tcl::mathfunc::max
proc d_lcs {a b} {
set la [string length $a]
set lb [string length $b]
set lengths [lrepeat [+ $la 1] [lrepeat [+ $lb 1] 0]]
for {set i 0} {$i < $la} {incr i} {
for {set j 0} {$j < $lb} {incr j} {
if {[string index $a $i] eq [string index $b $j]} {
lset lengths [+ $i 1] [+ $j 1] [+ [lindex $lengths $i $j] 1]
} else {
lset lengths [+ $i 1] [+ $j 1] [max [lindex $lengths [+ $i 1] $j] [lindex $lengths $i [+ $j 1]]]
}
}
}
set result ""
set x $la
set y $lb
while {$x > 0 && $y > 0} {
if {[lindex $lengths $x $y] == [lindex $lengths [- $x 1] $y]} {
incr x -1
} elseif {[lindex $lengths $x $y] == [lindex $lengths $x [- $y 1]]} {
incr y -1
} else {
if {[set c [string index $a [- $x 1]]] ne [string index $b [- $y 1]]} {
error "assertion failed: a.charAt(x-1) == b.charAt(y-1)"
}
append result $c
incr x -1
incr y -1
}
}
return [string reverse $result]
}
Performance Comparison
% time {d_lcs thisisatest testing123testing} 10
637.5 microseconds per iteration
% time {r_lcs thisisatest testing123testing} 10
1275566.8 microseconds per iteration
Ursala
This uses the same recursive algorithm as in the Haskell example, and works on lists of any type.
#import std
lcs = ~&alrB^& ~&E?abh/~&alh2fabt2RC @faltPrXlrtPXXPW leql?/~&r ~&l
test program:
#cast %s
example = lcs('thisisatest','testing123testing')
- Output:
'tsitest'
Wren
var lcs // recursive
lcs = Fn.new { |x, y|
if (x.count == 0 || y.count == 0) return ""
var x1 = x[0...-1]
var y1 = y[0...-1]
if (x[-1] == y[-1]) return lcs.call(x1, y1) + x[-1]
var x2 = lcs.call(x, y1)
var y2 = lcs.call(x1, y)
return (x2.count > y2.count) ? x2 : y2
}
var x = "thisisatest"
var y = "testing123testing"
System.print(lcs.call(x, y))
- Output:
tsitest
zkl
This is quite vile in terms of [time] efficiency, another algorithm should be used for real work.
fcn lcs(a,b){
if(not a or not b) return("");
if (a[0]==b[0]) return(a[0] + self.fcn(a[1,*],b[1,*]));
return(fcn(x,y){if(x.len()>y.len())x else y}(lcs(a,b[1,*]),lcs(a[1,*],b)))
}
The last line looks strange but it is just return(lambda longest(lcs.lcs))
- Output:
zkl: lcs("thisisatest", "testing123testing") tsitest
- Programming Tasks
- Solutions by Programming Task
- Recursion
- Memoization
- 11l
- Ada
- ALGOL 68
- APL
- Arturo
- AutoHotkey
- BASIC
- QuickBASIC
- BASIC256
- BBC BASIC
- BQN
- Bracmat
- C
- C sharp
- C++
- Clojure
- CoffeeScript
- Common Lisp
- D
- Dart
- EasyLang
- Egison
- Elixir
- Erlang
- F Sharp
- Factor
- Fortran
- FreeBASIC
- Go
- Groovy
- Haskell
- Icon
- Unicon
- Icon Programming Library
- J
- Java
- JavaScript
- Jq
- Julia
- Kotlin
- Liberty BASIC
- Logo
- Lua
- M4
- Maple
- Mathematica
- Wolfram Language
- Nim
- OCaml
- Oz
- Pascal
- PascalABC.NET
- Perl
- Phix
- Picat
- PicoLisp
- PowerShell
- Prolog
- PureBasic
- Python
- Racket
- Raku
- ReasonML
- REXX
- Ring
- Ruby
- Run BASIC
- Rust
- Scala
- Scheme
- Seed7
- SequenceL
- SETL
- Sidef
- Slate
- Swift
- Tcl
- Ursala
- Wren
- Zkl
- Pages with too many expensive parser function calls