Anagrams: Difference between revisions
→{{header|Ruby}}: Reworked last example |
m →{{header|Ruby}}: added remark. |
||
Line 3,999: | Line 3,999: | ||
Short version |
Short version (with lexical ordered result). |
||
<lang ruby>require 'open-uri' |
<lang ruby>require 'open-uri' |
||
Revision as of 22:32, 10 November 2013
You are encouraged to solve this task according to the task description, using any language you may know.
Two or more words can be composed of the same characters, but in a different order. Using the word list at http://www.puzzlers.org/pub/wordlists/unixdict.txt, find the sets of words that share the same characters that contain the most words in them.
ABAP
<lang ABAP>report zz_anagrams no standard page heading. define update_progress.
call function 'SAPGUI_PROGRESS_INDICATOR' exporting text = &1.
end-of-definition.
" Selection screen segment allowing the person to choose which file will act as input. selection-screen begin of block file_choice.
parameters p_file type string lower case.
selection-screen end of block file_choice.
" When the user requests help with input, run the routine to allow them to navigate the presentation server. at selection-screen on value-request for p_file.
perform getfile using p_file.
at selection-screen output.
%_p_file_%_app_%-text = 'Input File: '.
start-of-selection.
data: gt_data type table of string.
" Read the specified file from the presentation server into memory. perform readfile using p_file changing gt_data. " After the file has been read into memory, loop through it line-by-line and make anagrams. perform anagrams using gt_data.
" Subroutine for generating a list of anagrams. " The supplied input is a table, with each entry corresponding to a word. form anagrams using it_data like gt_data.
types begin of ty_map. types key type string. types value type string. types end of ty_map.
data: lv_char type c, lv_len type i, lv_string type string, ls_entry type ty_map, lt_anagrams type standard table of ty_map, lt_c_tab type table of string.
field-symbols: <fs_raw> type string. " Loop through each word in the table, and make an associative array. loop at gt_data assigning <fs_raw>. " First, we need to re-order the word alphabetically. This generated a key. All anagrams will use this same key. " Add each character to a table, which we will then sort alphabetically. lv_len = strlen( <fs_raw> ). refresh lt_c_tab. do lv_len times. lv_len = sy-index - 1. append <fs_raw>+lv_len(1) to lt_c_tab. enddo. sort lt_c_tab as text. " Now append the characters to a string and add it as a key into the map. clear lv_string. loop at lt_c_tab into lv_char. concatenate lv_char lv_string into lv_string respecting blanks. endloop. ls_entry-key = lv_string. ls_entry-value = <fs_raw>. append ls_entry to lt_anagrams. endloop. " After we're done processing, output a list of the anagrams. clear lv_string. loop at lt_anagrams into ls_entry. " Is it part of the same key --> Output in the same line, else a new entry. if lv_string = ls_entry-key. write: ', ', ls_entry-value. else. if sy-tabix <> 1. write: ']'. endif. write: / '[', ls_entry-value. endif. lv_string = ls_entry-key. endloop. " Close last entry. write ']'.
endform.
" Read a specified file from the presentation server. form readfile using i_file type string changing it_raw like gt_data.
data: l_datat type string, l_msg(2048), l_lines(10).
" Read the file into memory. update_progress 'Reading file...'. call method cl_gui_frontend_services=>gui_upload exporting filename = i_file changing data_tab = it_raw exceptions others = 1. " Output error if the file could not be uploaded. if sy-subrc <> 0. write : / 'Error reading the supplied file!'. return. endif.
endform.</lang> Output:
[ angel , angle , galen , glean , lange ] [ elan , lane , lean , lena , neal ] [ alger , glare , lager , large , regal ] [ abel , able , bale , bela , elba ] [ evil , levi , live , veil , vile ] [ caret , carte , cater , crate , trace ]
Ada
<lang ada>with Ada.Text_IO; use Ada.Text_IO;
with Ada.Containers.Indefinite_Ordered_Maps; with Ada.Containers.Indefinite_Ordered_Sets;
procedure Words_Of_Equal_Characters is
package Set_Of_Words is new Ada.Containers.Indefinite_Ordered_Sets (String); use Ada.Containers, Set_Of_Words; package Anagrams is new Ada.Containers.Indefinite_Ordered_Maps (String, Set); use Anagrams;
File : File_Type; Result : Map; Max : Count_Type := 1;
procedure Put (Position : Anagrams.Cursor) is First : Boolean := True; List : Set renames Element (Position); procedure Put (Position : Set_Of_Words.Cursor) is begin if First then First := False; else Put (','); end if; Put (Element (Position)); end Put; begin if List.Length = Max then Iterate (List, Put'Access); New_Line; end if; end Put;
begin
Open (File, In_File, "unixdict.txt"); loop declare Word : constant String := Get_Line (File); Key : String (Word'Range) := (others => Character'Last); List : Set; Position : Anagrams.Cursor; begin for I in Word'Range loop for J in Word'Range loop if Key (J) > Word (I) then Key (J + 1..I) := Key (J..I - 1); Key (J) := Word (I); exit; end if; end loop; end loop; Position := Find (Result, Key); if Has_Element (Position) then List := Element (Position); Insert (List, Word); Replace_Element (Result, Position, List); else Insert (List, Word); Include (Result, Key, List); end if; Max := Count_Type'Max (Max, Length (List)); end; end loop;
exception
when End_Error => Iterate (Result, Put'Access); Close (File);
end Words_Of_Equal_Characters;</lang> Sample output:
abel,able,bale,bela,elba caret,carte,cater,crate,trace angel,angle,galen,glean,lange alger,glare,lager,large,regal elan,lane,lean,lena,neal evil,levi,live,veil,vile
AutoHotkey
contributed by Laszlo on the ahk forum <lang AutoHotkey>MsgBox % anagrams("able")
anagrams(word) {
Static dict IfEqual dict,, FileRead dict, unixdict.txt ; file in the script directory w := sort(word) Loop Parse, dict, `n, `r If (w = sort(A_LoopField)) t .= A_LoopField "`n" Return t
}
sort(word) {
a := RegExReplace(word,".","$0`n") Sort a Return a
}</lang>
AWK
<lang AWK># JUMBLEA.AWK - words with the most duplicate spellings
- syntax: GAWK -f JUMBLEA.AWK UNIXDICT.TXT
{ for (i=1; i<=NF; i++) {
w = sortstr(toupper($i)) arr[w] = arr[w] $i " " n = gsub(/ /,"&",arr[w]) if (max_n < n) { max_n = n } }
} END {
for (w in arr) { if (gsub(/ /,"&",arr[w]) == max_n) { printf("%s\t%s\n",w,arr[w]) } } exit(0)
} function sortstr(str, i,j,leng) {
leng = length(str) for (i=2; i<=leng; i++) { for (j=i; j>1 && substr(str,j-1,1) > substr(str,j,1); j--) { str = substr(str,1,j-2) substr(str,j,1) substr(str,j-1,1) substr(str,j+1) } } return(str)
}</lang> Produces this output:
ABEL abel able bale bela elba ACERT caret carte cater crate trace AEGLN angel angle galen glean lange AEGLR alger glare lager large regal AELN elan lane lean lena neal EILV evil levi live veil vile
Alternatively, non-POSIX version:
<lang awk>#!/bin/gawk -f
{ patsplit($0, chars, ".")
asort(chars) sorted = "" for (i = 1; i <= length(chars); i++)
sorted = sorted chars[i]
if (++count[sorted] > countMax) countMax++ accum[sorted] = accum[sorted] " " $0
}
END {
for (i in accum)
if (count[i] == countMax) print substr(accum[i], 2) }</lang>
BBC BASIC
<lang bbcbasic> INSTALL @lib$+"SORTLIB"
sort% = FN_sortinit(0,0) REM Count number of words in dictionary: nwords% = 0 dict% = OPENIN("unixdict.txt") WHILE NOT EOF#dict% word$ = GET$#dict% nwords% += 1 ENDWHILE CLOSE #dict% REM Create arrays big enough to contain the dictionary: DIM dict$(nwords%), sort$(nwords%) REM Load the dictionary and sort the characters in the words: dict% = OPENIN("unixdict.txt") FOR word% = 1 TO nwords% word$ = GET$#dict% dict$(word%) = word$ sort$(word%) = FNsortchars(word$) NEXT word% CLOSE #dict% REM Sort arrays using the 'sorted character' words as a key: C% = nwords% CALL sort%, sort$(1), dict$(1) REM Count the longest sets of anagrams: max% = 0 set% = 1 FOR word% = 1 TO nwords%-1 IF sort$(word%) = sort$(word%+1) THEN set% += 1 ELSE IF set% > max% THEN max% = set% set% = 1 ENDIF NEXT word% REM Output the results: set% = 1 FOR word% = 1 TO nwords%-1 IF sort$(word%) = sort$(word%+1) THEN set% += 1 ELSE IF set% = max% THEN FOR anagram% = word%-max%+1 TO word% PRINT dict$(anagram%),; NEXT PRINT ENDIF set% = 1 ENDIF NEXT word% END DEF FNsortchars(word$) LOCAL C%, char&() DIM char&(LEN(word$)) $$^char&(0) = word$ C% = LEN(word$) CALL sort%, char&(0) = $$^char&(0)</lang>
Produces this output:
abel able bale bela elba caret carte cater crate trace angel angle galen glean lange alger glare lager large regal elan lane lean lena neal evil levi live veil vile
Bracmat
This solution makes extensive use of Bracmat's computer algebra mechanisms. A trick is needed to handle words that are merely repetitions of a single letter, such as iii
. That's why the variabe sum
isn't initialised with 0
, but with a non-number, in this case the empty string. Also te correct handling of characters 0-9 needs a trick so that they are not numerically added: they are prepended with a non-digit, an N
in this case. After completely traversing the word list, the program writes a file product.txt
that can be visually inspected.
The program is not fast. (Minutes rather than seconds.)
<lang bracmat>( get$("unixdict.txt",STR):?list
& 1:?product
& whl
' ( @(!list:(%?word:?w) \n ?list) & :?sum & whl ' ( @(!w:%?let ?w) & (!let:~#|str$(N !let))+!sum:?sum ) & !sum^!word*!product:?product )
& lst$(product,"product.txt",NEW) & 0:?max & :?group & ( !product
: ? * ?^(%+%:?exp) * ( ? & !exp : ? + ( [>!max:[?max&!exp:?group | [~<!max&!group !exp:?group ) & ~ ) | out$!group )
);</lang> Output:
abel+able+bale+bela+elba caret+carte+cater+crate+trace angel+angle+galen+glean+lange alger+glare+lager+large+regal elan+lane+lean+lena+neal evil+levi+live+veil+vile
C
<lang c>#include <stdio.h>
- include <stdlib.h>
- include <string.h>
- include <ctype.h>
- include <time.h>
char *sortedWord(const char *word, char *wbuf) {
char *p1, *p2, *endwrd; char t; int swaps;
strcpy(wbuf, word); endwrd = wbuf+strlen(wbuf); do { swaps = 0; p1 = wbuf; p2 = endwrd-1; while (p1<p2) { if (*p2 > *p1) { t = *p2; *p2 = *p1; *p1 = t; swaps = 1; } p1++; p2--; } p1 = wbuf; p2 = p1+1; while(p2 < endwrd) { if (*p2 > *p1) { t = *p2; *p2 = *p1; *p1 = t; swaps = 1; } p1++; p2++; } } while (swaps); return wbuf;
}
static short cxmap[] = {
0x06, 0x1f, 0x4d, 0x0c, 0x5c, 0x28, 0x5d, 0x0e, 0x09, 0x33, 0x31, 0x56, 0x52, 0x19, 0x29, 0x53, 0x32, 0x48, 0x35, 0x55, 0x5e, 0x14, 0x27, 0x24, 0x02, 0x3e, 0x18, 0x4a, 0x3f, 0x4c, 0x45, 0x30, 0x08, 0x2c, 0x1a, 0x03, 0x0b, 0x0d, 0x4f, 0x07, 0x20, 0x1d, 0x51, 0x3b, 0x11, 0x58, 0x00, 0x49, 0x15, 0x2d, 0x41, 0x17, 0x5f, 0x39, 0x16, 0x42, 0x37, 0x22, 0x1c, 0x0f, 0x43, 0x5b, 0x46, 0x4b, 0x0a, 0x26, 0x2e, 0x40, 0x12, 0x21, 0x3c, 0x36, 0x38, 0x1e, 0x01, 0x1b, 0x05, 0x4e, 0x44, 0x3d, 0x04, 0x10, 0x5a, 0x2a, 0x23, 0x34, 0x25, 0x2f, 0x2b, 0x50, 0x3a, 0x54, 0x47, 0x59, 0x13, 0x57, };
- define CXMAP_SIZE (sizeof(cxmap)/sizeof(short))
int Str_Hash( const char *key, int ix_max )
{
const char *cp; short mash; int hash = 33501551; for (cp = key; *cp; cp++) { mash = cxmap[*cp % CXMAP_SIZE]; hash = (hash >>4) ^ 0x5C5CF5C ^ ((hash<<1) + (mash<<5)); hash &= 0x3FFFFFFF; } return hash % ix_max;
}
typedef struct sDictWord *DictWord; struct sDictWord {
const char *word; DictWord next;
};
typedef struct sHashEntry *HashEntry; struct sHashEntry {
const char *key; HashEntry next; DictWord words; HashEntry link; short wordCount;
};
- define HT_SIZE 8192
HashEntry hashTable[HT_SIZE];
HashEntry mostPerms = NULL;
int buildAnagrams( FILE *fin ) {
char buffer[40]; char bufr2[40]; char *hkey; int hix; HashEntry he, *hep; DictWord we; int maxPC = 2; int numWords = 0; while ( fgets(buffer, 40, fin)) { for(hkey = buffer; *hkey && (*hkey!='\n'); hkey++); *hkey = 0; hkey = sortedWord(buffer, bufr2); hix = Str_Hash(hkey, HT_SIZE); he = hashTable[hix]; hep = &hashTable[hix]; while( he && strcmp(he->key , hkey) ) { hep = &he->next; he = he->next; } if ( ! he ) { he = malloc(sizeof(struct sHashEntry)); he->next = NULL; he->key = strdup(hkey); he->wordCount = 0; he->words = NULL; he->link = NULL; *hep = he; } we = malloc(sizeof(struct sDictWord)); we->word = strdup(buffer); we->next = he->words; he->words = we; he->wordCount++; if ( maxPC < he->wordCount) { maxPC = he->wordCount; mostPerms = he; he->link = NULL; } else if (maxPC == he->wordCount) { he->link = mostPerms; mostPerms = he; } numWords++; } printf("%d words in dictionary max ana=%d\n", numWords, maxPC); return maxPC;
}
int main( )
{
HashEntry he; DictWord we; FILE *f1; f1 = fopen("unixdict.txt","r"); buildAnagrams(f1); fclose(f1); f1 = fopen("anaout.txt","w");
// f1 = stdout;
for (he = mostPerms; he; he = he->link) { fprintf(f1,"%d:", he->wordCount); for(we = he->words; we; we = we->next) { fprintf(f1,"%s, ", we->word); } fprintf(f1, "\n"); }
fclose(f1); return 0;
}</lang> Output: (less than 1 second on old P500)
5:vile, veil, live, levi, evil, 5:trace, crate, cater, carte, caret, 5:regal, large, lager, glare, alger, 5:neal, lena, lean, lane, elan, 5:lange, glean, galen, angle, angel, 5:elba, bela, bale, able, abel,
A much shorter version with no fancy data structures: <lang c>#include <stdio.h>
- include <stdlib.h>
- include <string.h>
- include <fcntl.h>
- include <unistd.h>
- include <sys/stat.h>
- include <string.h>
typedef struct { const char *key, *word; int cnt; } kw_t;
int lst_cmp(const void *a, const void *b) { return strcmp(((const kw_t*)a)->key, ((const kw_t*)b)->key); }
/* Bubble sort. Faster than stock qsort(), believe it or not */ void sort_letters(char *s) { int i, j; char t; for (i = 0; s[i] != '\0'; i++) { for (j = i + 1; s[j] != '\0'; j++) if (s[j] < s[i]) { t = s[j]; s[j] = s[i]; s[i] = t; } } }
int main() { struct stat s; char *words, *keys; size_t i, j, k, longest, offset; int n_word = 0; kw_t *list;
int fd = open("unixdict.txt", O_RDONLY); if (fd == -1) return 1; fstat(fd, &s); words = malloc(s.st_size * 2); keys = words + s.st_size;
read(fd, words, s.st_size); memcpy(keys, words, s.st_size);
/* change newline to null for easy use; sort letters in keys */ for (i = j = 0; i < s.st_size; i++) { if (words[i] == '\n') { words[i] = keys[i] = '\0'; sort_letters(keys + j); j = i + 1; n_word ++; } }
list = calloc(n_word, sizeof(kw_t));
/* make key/word pointer pairs for sorting */ for (i = j = k = 0; i < s.st_size; i++) { if (words[i] == '\0') { list[j].key = keys + k; list[j].word = words + k; k = i + 1; j++; } }
qsort(list, n_word, sizeof(kw_t), lst_cmp);
/* count each key's repetition */ for (i = j = k = offset = longest = 0; i < n_word; i++) { if (!strcmp(list[i].key, list[j].key)) { ++k; continue; }
/* move current longest to begining of array */ if (k < longest) { k = 0; j = i; continue; }
if (k > longest) offset = 0;
while (j < i) list[offset++] = list[j++]; longest = k; k = 0; }
/* show the longest */ for (i = 0; i < offset; i++) { printf("%s ", list[i].word); if (i < n_word - 1 && strcmp(list[i].key, list[i+1].key)) printf("\n"); }
/* free(list); free(words); */ close(fd); return 0; }</lang> output
abel able bale bela elba caret carte cater crate trace angel angle galen glean lange alger glare lager large regal elan lane lean lena neal evil levi live veil vile
C++
<lang cpp>#include <iostream>
- include <fstream>
- include <string>
- include <map>
- include <vector>
- include <algorithm>
- include <iterator>
int main() {
std::ifstream in("unixdict.txt"); typedef std::map<std::string, std::vector<std::string> > AnagramMap; AnagramMap anagrams; std::string word; size_t count = 0; while (std::getline(in, word)) { std::string key = word; std::sort(key.begin(), key.end()); // note: the [] op. automatically inserts a new value if key does not exist AnagramMap::mapped_type & v = anagrams[key]; v.push_back(word); count = std::max(count, v.size()); } in.close(); for (AnagramMap::const_iterator it = anagrams.begin(), e = anagrams.end(); it != e; it++) if (it->second.size() >= count) { std::copy(it->second.begin(), it->second.end(), std::ostream_iterator<std::string>(std::cout, ", ")); std::cout << std::endl; } return 0;
}</lang> Output:
abel, able, bale, bela, elba, caret, carte, cater, crate, trace, angel, angle, galen, glean, lange, alger, glare, lager, large, regal, elan, lane, lean, lena, neal, evil, levi, live, veil, vile,
C#
<lang csharp>using System; using System.IO; using System.Linq; using System.Net; using System.Text.RegularExpressions;
namespace Anagram {
class Program { const string DICO_URL = "http://www.puzzlers.org/pub/wordlists/unixdict.txt";
static void Main( string[] args ) { WebRequest request = WebRequest.Create(DICO_URL); string[] words; using (StreamReader sr = new StreamReader(request.GetResponse().GetResponseStream(), true)) { words = Regex.Split(sr.ReadToEnd(), @"\r?\n"); } var groups = from string w in words group w by string.Concat(w.OrderBy(x => x)) into c group c by c.Count() into d orderby d.Key descending select d; foreach (var c in groups.First()) { Console.WriteLine(string.Join(" ", c)); } } }
}</lang> output:
abel able bale bela elba alger glare lager large regal angel angle galen glean lange caret carte cater crate trace elan lane lean lena neal evil levi live veil vile
Clojure
Assume wordfile is the path of the local file containing the words. This code makes a map (groups) whose keys are sorted letters and values are lists of the key's anagrams. It then determines the length of the longest list, and prints out all the lists of that length. <lang clojure>(require '[clojure.java.io :as io])
(def groups
(with-open [r (io/reader wordfile)] (group-by sort (line-seq r))))
(let [wordlists (sort-by (comp - count) (vals groups))
maxlength (count (first wordlists))] (doseq [wordlist (take-while #(= (count %) maxlength) wordlists)] (println wordlist))</lang>
CoffeeScript
<lang coffeescript>http = require 'http'
show_large_anagram_sets = (word_lst) ->
anagrams = {} max_size = 0 for word in word_lst key = word.split().sort().join() anagrams[key] ?= [] anagrams[key].push word size = anagrams[key].length max_size = size if size > max_size for key, variations of anagrams if variations.length == max_size console.log variations.join ' '
get_word_list = (process) ->
options = host: "www.puzzlers.org" path: "/pub/wordlists/unixdict.txt" req = http.request options, (res) -> s = res.on 'data', (chunk) -> s += chunk res.on 'end', -> process s.split '\n' req.end()
get_word_list show_large_anagram_sets</lang> output <lang coffeescript>> coffee anagrams.coffee [ 'abel', 'able', 'bale', 'bela', 'elba' ] [ 'alger', 'glare', 'lager', 'large', 'regal' ] [ 'angel', 'angle', 'galen', 'glean', 'lange' ] [ 'caret', 'carte', 'cater', 'crate', 'trace' ] [ 'elan', 'lane', 'lean', 'lena', 'neal' ] [ 'evil', 'levi', 'live', 'veil', 'vile' ]</lang>
Common Lisp
to retrieve the wordlist.
<lang lisp>(defun anagrams (&optional (url "http://www.puzzlers.org/pub/wordlists/unixdict.txt"))
(let ((words (drakma:http-request url :want-stream t)) (wordsets (make-hash-table :test 'equalp))) ;; populate the wordsets and close stream (do ((word (read-line words nil nil) (read-line words nil nil))) ((null word) (close words)) (let ((letters (sort (copy-seq word) 'char<))) (multiple-value-bind (pair presentp) (gethash letters wordsets) (if presentp (setf (car pair) (1+ (car pair)) (cdr pair) (cons word (cdr pair))) (setf (gethash letters wordsets) (cons 1 (list word))))))) ;; find and return the biggest wordsets (loop with maxcount = 0 with maxwordsets = '() for pair being each hash-value of wordsets if (> (car pair) maxcount) do (setf maxcount (car pair) maxwordsets (list (cdr pair))) else if (eql (car pair) maxcount) do (push (cdr pair) maxwordsets) finally (return (values maxwordsets maxcount)))))</lang>
Evalutating <lang lisp>(multiple-value-bind (wordsets count) (anagrams)
(pprint wordsets) (print count))</lang>
produces the following output.
(("vile" "veil" "live" "levi" "evil") ("regal" "large" "lager" "glare" "alger") ("lange" "glean" "galen" "angle" "angel") ("neal" "lena" "lean" "lane" "elan") ("trace" "crate" "cater" "carte" "caret") ("elba" "bela" "bale" "able" "abel")) 5
Another method, assuming file is local: <lang lisp>(defun read-words (file)
(with-open-file (stream file) (loop with w = "" while w collect (setf w (read-line stream nil)))))
(defun anagram (file)
(let ((wordlist (read-words file))
(h (make-hash-table :test #'equal)) longest)
(loop for w in wordlist with ws do
(setf ws (sort (copy-seq w) #'char<)) (setf (gethash ws h) (cons w (gethash ws h))))
(loop for w being the hash-keys in h using (hash-value wl)
with max-len = 0 do (let ((l (length wl))) (if (> l max-len) (setf longest nil max-len l)) (if (= l max-len) (push wl longest))))
longest))
(format t "~{~{~a ~}~^~%~}" (anagram "unixdict.txt"))</lang> output
elba bela bale able abel regal large lager glare alger lange glean galen angle angel trace crate cater carte caret neal lena lean lane elan vile veil live levi evil
Component Pascal
BlackBox Component Builder <lang oberon2> MODULE BbtAnagrams; IMPORT StdLog,Files,Strings,Args; CONST MAXPOOLSZ = 1024;
TYPE Node = POINTER TO LIMITED RECORD; count: INTEGER; word: Args.String; desc: Node; next: Node; END;
Pool = POINTER TO LIMITED RECORD capacity,max: INTEGER; words: POINTER TO ARRAY OF Node; END;
PROCEDURE NewNode(word: ARRAY OF CHAR): Node; VAR n: Node; BEGIN NEW(n);n.count := 0;n.word := word$; n.desc := NIL;n.next := NIL; RETURN n END NewNode;
PROCEDURE Index(s: ARRAY OF CHAR;cap: INTEGER): INTEGER; VAR i,sum: INTEGER; BEGIN sum := 0; FOR i := 0 TO LEN(s$) DO INC(sum,ORD(s[i])) END; RETURN sum MOD cap END Index;
PROCEDURE ISort(VAR s: ARRAY OF CHAR); VAR
i, j: INTEGER; t: CHAR;
BEGIN
FOR i := 0 TO LEN(s$) - 1 DO
j := i; t := s[j]; WHILE (j > 0) & (s[j -1] > t) DO s[j] := s[j - 1]; DEC(j) END; s[j] := t
END
END ISort;
PROCEDURE SameLetters(x,y: ARRAY OF CHAR): BOOLEAN; BEGIN
ISort(x);ISort(y); RETURN x = y
END SameLetters;
PROCEDURE NewPoolWith(cap: INTEGER): Pool; VAR i: INTEGER; p: Pool; BEGIN NEW(p); p.capacity := cap; p.max := 0; NEW(p.words,cap); i := 0; WHILE i < p.capacity DO p.words[i] := NIL; INC(i); END; RETURN p END NewPoolWith;
PROCEDURE NewPool(): Pool; BEGIN RETURN NewPoolWith(MAXPOOLSZ); END NewPool;
PROCEDURE (p: Pool) Add(w: ARRAY OF CHAR), NEW; VAR idx: INTEGER; iter,n: Node; BEGIN idx := Index(w,p.capacity); iter := p.words[idx]; n := NewNode(w); WHILE(iter # NIL) DO IF SameLetters(w,iter.word) THEN INC(iter.count); IF iter.count > p.max THEN p.max := iter.count END; n.desc := iter.desc; iter.desc := n; RETURN END; iter := iter.next END; ASSERT(iter = NIL); n.next := p.words[idx];p.words[idx] := n END Add;
PROCEDURE ShowAnagrams(l: Node); VAR iter: Node; BEGIN iter := l; WHILE iter # NIL DO StdLog.String(iter.word);StdLog.String(" "); iter := iter.desc END; StdLog.Ln END ShowAnagrams;
PROCEDURE (p: Pool) ShowMax(),NEW; VAR i: INTEGER; iter: Node; BEGIN FOR i := 0 TO LEN(p.words) - 1 DO IF p.words[i] # NIL THEN iter := p.words^[i]; WHILE iter # NIL DO IF iter.count = p.max THEN ShowAnagrams(iter); END; iter := iter.next END END END END ShowMax;
PROCEDURE GetLine(rd: Files.Reader; OUT str: ARRAY OF CHAR); VAR i: INTEGER; b: BYTE; BEGIN rd.ReadByte(b);i := 0; WHILE (~rd.eof) & (i < LEN(str)) DO IF (b = ORD(0DX)) OR (b = ORD(0AX)) THEN str[i] := 0X; RETURN END; str[i] := CHR(b); rd.ReadByte(b);INC(i) END; str[LEN(str) - 1] := 0X END GetLine;
PROCEDURE DoProcess*; VAR params : Args.Params; loc: Files.Locator; fd: Files.File; rd: Files.Reader; line: ARRAY 81 OF CHAR; p: Pool; BEGIN Args.Get(params); IF params.argc = 1 THEN loc := Files.dir.This("Bbt"); fd := Files.dir.Old(loc,params.args[0]$,FALSE); StdLog.String("Processing: " + params.args[0]);StdLog.Ln;StdLog.Ln; rd := fd.NewReader(NIL); p := NewPool(); REPEAT GetLine(rd,line); p.Add(line); UNTIL rd.eof; p.ShowMax() ELSE StdLog.String("Error: Missing file to process");StdLog.Ln END; END DoProcess;
END BbtAnagrams.
</lang>
Execute:^Q BbtAnagrams.DoProcess unixdict.txt~
Output:
Processing: unixdict.txt abel elba bela bale able elan neal lena lean lane evil vile veil live levi angel lange glean galen angle alger regal large lager glare caret trace crate cater carte
D
Short Functional Version
<lang d>import std.stdio, std.algorithm, std.range, std.string, std.exception;
void main() {
string[][const ubyte[]] an; foreach (w; "unixdict.txt".File.byLine(KeepTerminator.no)) an[w.dup.representation.sort().release.assumeUnique] ~= w.idup; immutable m = an.byValue.map!q{ a.length }.reduce!max; writefln("%(%s\n%)", an.byValue.filter!(ws => ws.length == m));
}</lang>
- Output:
["caret", "carte", "cater", "crate", "trace"] ["evil", "levi", "live", "veil", "vile"] ["abel", "able", "bale", "bela", "elba"] ["elan", "lane", "lean", "lena", "neal"] ["alger", "glare", "lager", "large", "regal"] ["angel", "angle", "galen", "glean", "lange"]
Runtime: about 0.08 seconds.
Faster Version
Less safe, same output. <lang d>import std.stdio, std.algorithm, std.file, std.string;
void main() {
auto keys = cast(char[])"unixdict.txt".read; immutable vals = keys.idup; string[][string] anags; foreach (w; keys.splitter) { immutable k = cast(string)w.representation.sort().release; anags[k] ~= vals[k.ptr-keys.ptr .. k.ptr-keys.ptr + k.length]; } //immutable m = anags.byValue.max!q{ a.length }; immutable m = anags.byValue.map!q{ a.length }.reduce!max; writefln("%(%s\n%)", anags.byValue.filter!(ws => ws.length == m));
}</lang> Runtime: about 0.06 seconds.
E
<lang e>println("Downloading...") when (def wordText := <http://www.puzzlers.org/pub/wordlists/unixdict.txt> <- getText()) -> {
def words := wordText.split("\n")
def storage := [].asMap().diverge() def anagramTable extends storage { to get(key) { return storage.fetch(key, fn { storage[key] := [].diverge() }) } }
println("Grouping...") var largestGroupSeen := 0 for word in words { def anagramGroup := anagramTable[word.sort()] anagramGroup.push(word) largestGroupSeen max= anagramGroup.size() }
println("Selecting...") for _ => anagramGroup ? (anagramGroup.size() == mostSeen) in anagramTable { println(anagramGroup.snapshot()) }
}</lang>
Elena
<lang elena>#define system.
- define system'collections.
- define extensions'text.
- define extensions'io.
// --- Normalized ---
- symbol Normalized = &&:aLiteral
[
^ Summing new:(String new) foreach:(arrayControl sort:(stringControl toArray:aLiteral)) Literal.
].
// --- Program ---
- symbol program =
[
#var aDictionary := Dictionary new.
textFileControl forEachLine:"unixdict.txt" &do: &&:aWord [ #var aKey := Normalized eval:aWord. #var anItem := aDictionary @ aKey. nil == anItem ? [ anItem := List new. aDictionary setAt:aKey:anItem. ]. anItem += aWord. ].
listControl sort:aDictionary &with: &&:aFormer:aLater [ aFormer Value Count > aLater Value Count ].
listControl foreach:aDictionary &top:20 &do: &&:aPair [ consoleEx writeLine:(aPair Value) ].
].</lang>
Erlang
The function fetch/2 is used to solve Anagrams/Deranged_anagrams. Please keep backwards compatibility when editing. Or update the other module, too. <lang erlang>-module(anagrams). -compile(export_all).
play() ->
{ok, P} = file:read_file('unixdict.txt'), D = dict:new(), E=fetch(string:tokens(binary_to_list(P), "\n"), D), get_value(dict:fetch_keys(E), E).
fetch([H|T], D) ->
fetch(T, dict:append(lists:sort(H), H, D));
fetch([], D) ->
D.
get_value(L, D) -> get_value(L,D,1,[]). get_value([H|T], D, N, L) ->
Var = dict:fetch(H,D), Len = length(Var), if Len > N -> get_value(T, D, Len, [Var]); Len == N -> get_value(T, D, Len, [Var | L]); Len < N -> get_value(T, D, N, L) end;
get_value([], _, _, L) ->
L.
</lang> Output:
1> anagrams:play(). [["caret","carte","cater","crate","trace"], ["elan","lane","lean","lena","neal"], ["alger","glare","lager","large","regal"], ["angel","angle","galen","glean","lange"], ["evil","levi","live","veil","vile"], ["abel","able","bale","bela","elba"]] 2>
Euphoria
<lang euphoria>include sort.e
function compare_keys(sequence a, sequence b)
return compare(a[1],b[1])
end function
constant fn = open("unixdict.txt","r") sequence words, anagrams object word words = {} while 1 do
word = gets(fn) if atom(word) then exit end if word = word[1..$-1] -- truncate new-line character words = append(words, {sort(word), word})
end while close(fn)
integer maxlen maxlen = 0 words = custom_sort(routine_id("compare_keys"), words) anagrams = {words[1]} for i = 2 to length(words) do
if equal(anagrams[$][1],words[i][1]) then anagrams[$] = append(anagrams[$], words[i][2]) elsif length(anagrams[$]) = 2 then anagrams[$] = words[i] else if length(anagrams[$]) > maxlen then maxlen = length(anagrams[$]) end if anagrams = append(anagrams, words[i]) end if
end for if length(anagrams[$]) = 2 then
anagrams = anagrams[1..$-1]
end if
for i = 1 to length(anagrams) do
if length(anagrams[i]) = maxlen then for j = 2 to length(anagrams[i]) do puts(1,anagrams[i][j]) puts(1,' ') end for puts(1,"\n") end if
end for</lang> Output:
abel bela bale elba able crate cater carte caret trace angle galen glean lange angel regal lager large alger glare elan lean neal lane lena live veil vile levi evil
F#
Read the lines in the dictionary, group by the sorted letters in each word, find the length of the longest sets of anagrams, extract the longest sequences of words sharing the same letters (i.e. anagrams): <lang fsharp>let xss = Seq.groupBy (Array.ofSeq >> Array.sort) (System.IO.File.ReadAllLines "unixdict.txt") Seq.map snd xss |> Seq.filter (Seq.length >> ( = ) (Seq.map (snd >> Seq.length) xss |> Seq.max))</lang> Note that it is necessary to convert the sorted letters in each word from sequences to arrays because the groupBy function uses the default comparison and sequences do not compare structurally (but arrays do in F#).
Takes 0.8s to return: <lang fsharp>val it : string seq seq =
seq [seq ["abel"; "able"; "bale"; "bela"; "elba"]; seq ["alger"; "glare"; "lager"; "large"; "regal"]; seq ["angel"; "angle"; "galen"; "glean"; "lange"]; seq ["caret"; "carte"; "cater"; "crate"; "trace"]; seq ["elan"; "lane"; "lean"; "lena"; "neal"]; seq ["evil"; "levi"; "live"; "veil"; "vile"]]</lang>
FBSL
A little bit of cheating: literatim re-implementation of C solution in FBSL's Dynamic C layer. <lang C>#APPTYPE CONSOLE
DIM gtc = GetTickCount() Anagram() PRINT "Done in ", (GetTickCount() - gtc) / 1000, " seconds"
PAUSE
DYNC Anagram() #include <windows.h> #include <stdio.h>
char* sortedWord(const char* word, char* wbuf) { char* p1, *p2, *endwrd; char t; int swaps;
strcpy(wbuf, word); endwrd = wbuf + strlen(wbuf); do { swaps = 0; p1 = wbuf; p2 = endwrd - 1; while (p1 < p2) { if (*p2 >* p1) { t = *p2; *p2 = *p1; *p1 = t; swaps = 1; } p1++; p2--; } p1 = wbuf; p2 = p1 + 1; while (p2 < endwrd) { if (*p2 >* p1) { t = *p2; *p2 = *p1; *p1 = t; swaps = 1; } p1++; p2++; } } while (swaps); return wbuf; }
static short cxmap[] = { 0x06, 0x1f, 0x4d, 0x0c, 0x5c, 0x28, 0x5d, 0x0e, 0x09, 0x33, 0x31, 0x56, 0x52, 0x19, 0x29, 0x53, 0x32, 0x48, 0x35, 0x55, 0x5e, 0x14, 0x27, 0x24, 0x02, 0x3e, 0x18, 0x4a, 0x3f, 0x4c, 0x45, 0x30, 0x08, 0x2c, 0x1a, 0x03, 0x0b, 0x0d, 0x4f, 0x07, 0x20, 0x1d, 0x51, 0x3b, 0x11, 0x58, 0x00, 0x49, 0x15, 0x2d, 0x41, 0x17, 0x5f, 0x39, 0x16, 0x42, 0x37, 0x22, 0x1c, 0x0f, 0x43, 0x5b, 0x46, 0x4b, 0x0a, 0x26, 0x2e, 0x40, 0x12, 0x21, 0x3c, 0x36, 0x38, 0x1e, 0x01, 0x1b, 0x05, 0x4e, 0x44, 0x3d, 0x04, 0x10, 0x5a, 0x2a, 0x23, 0x34, 0x25, 0x2f, 0x2b, 0x50, 0x3a, 0x54, 0x47, 0x59, 0x13, 0x57, }; #define CXMAP_SIZE (sizeof(cxmap) / sizeof(short))
int Str_Hash(const char* key, int ix_max) { const char* cp; short mash; int hash = 33501551; for (cp = key; *cp; cp++) { mash = cxmap[*cp % CXMAP_SIZE]; hash = (hash >>4) ^ 0x5C5CF5C ^ ((hash << 1) + (mash << 5)); hash &= 0x3FFFFFFF; } return hash % ix_max; }
typedef struct sDictWord* DictWord; struct sDictWord { const char* word; DictWord next; };
typedef struct sHashEntry* HashEntry; struct sHashEntry { const char* key; HashEntry next; DictWord words; HashEntry link; short wordCount; };
#define HT_SIZE 8192
HashEntry hashTable[HT_SIZE];
HashEntry mostPerms = NULL;
int buildAnagrams(FILE* fin) { char buffer[40]; char bufr2[40]; char* hkey; int hix; HashEntry he, *hep; DictWord we; int maxPC = 2; int numWords = 0;
while (fgets(buffer, 40, fin)) { for (hkey = buffer; *hkey && (*hkey != '\n'); hkey++); *hkey = 0; hkey = sortedWord(buffer, bufr2); hix = Str_Hash(hkey, HT_SIZE); he = hashTable[hix]; hep = &hashTable[hix]; while (he && strcmp(he->key, hkey)) { hep = &he->next; he = he->next; } if (! he) { he = (HashEntry)malloc(sizeof(struct sHashEntry)); he->next = NULL; he->key = strdup(hkey); he->wordCount = 0; he->words = NULL; he->link = NULL; *hep = he; } we = (DictWord)malloc(sizeof(struct sDictWord)); we->word = strdup(buffer); we->next = he->words; he->words = we; he->wordCount++; if (maxPC < he->wordCount) { maxPC = he->wordCount; mostPerms = he; he->link = NULL; } else if (maxPC == he->wordCount) { he->link = mostPerms; mostPerms = he; } numWords++; } printf("%d words in dictionary max ana=%d\n", numWords, maxPC); return maxPC; }
void main() { HashEntry he; DictWord we; FILE* f1;
f1 = fopen("unixdict.txt", "r"); buildAnagrams(f1); fclose(f1);
f1 = fopen("anaout.txt", "w");
for (he = mostPerms; he; he = he->link) { fprintf(f1, "%d: ", he->wordCount); for (we = he->words; we; we = we->next) { fprintf(f1, "%s, ", we->word); } fprintf(f1, "\n"); } fclose(f1); } END DYNC</lang> Console output (2.2GHz Intel Core2 Duo):
25104 words in dictionary max ana=5 Done in 0.031 seconds Press any key to continue...
"anaout.txt" listing:
5: vile, veil, live, levi, evil, 5: trace, crate, cater, carte, caret, 5: regal, large, lager, glare, alger, 5: neal, lena, lean, lane, elan, 5: lange, glean, galen, angle, angel, 5: elba, bela, bale, able, abel,
Factor
<lang factor> "resource:unixdict.txt" utf8 file-lines
[ [ natural-sort >string ] keep ] { } map>assoc sort-keys [ [ first ] compare +eq+ = ] monotonic-split dup 0 [ length max ] reduce '[ length _ = ] filter [ values ] map .</lang>
<lang factor>{
{ "abel" "able" "bale" "bela" "elba" } { "caret" "carte" "cater" "crate" "trace" } { "angel" "angle" "galen" "glean" "lange" } { "alger" "glare" "lager" "large" "regal" } { "elan" "lane" "lean" "lena" "neal" } { "evil" "levi" "live" "veil" "vile" }
}</lang>
Fantom
<lang fantom>class Main {
// take given word and return a string rearranging characters in order static Str toOrderedChars (Str word) { Str[] chars := [,] word.each |Int c| { chars.add (c.toChar) } return chars.sort.join("") }
// add given word to anagrams map static Void addWord (Str:Str[] anagrams, Str word) { Str orderedWord := toOrderedChars (word) if (anagrams.containsKey (orderedWord)) anagrams[orderedWord].add (word) else anagrams[orderedWord] = [word] }
public static Void main () { Str:Str[] anagrams := [:] // map Str -> Str[] // loop through input file, adding each word to map of anagrams File (`unixdict.txt`).eachLine |Str word| { addWord (anagrams, word) } // loop through anagrams, keeping the keys with values of largest size Str[] largestKeys := [,] anagrams.keys.each |Str k| { if ((largestKeys.size < 1) || (anagrams[k].size == anagrams[largestKeys[0]].size)) largestKeys.add (k) else if (anagrams[k].size > anagrams[largestKeys[0]].size) largestKeys = [k] } largestKeys.each |Str k| { echo ("Key: $k -> " + anagrams[k].join(", ")) } }
}</lang>
Output:
Key: abel -> abel, able, bale, bela, elba Key: aeln -> elan, lane, lean, lena, neal Key: eilv -> evil, levi, live, veil, vile Key: aegln -> angel, angle, galen, glean, lange Key: aeglr -> alger, glare, lager, large, regal Key: acert -> caret, carte, cater, crate, trace
Fortran
This program: <lang fortran>!*************************************************************************************** module anagram_routines !*************************************************************************************** implicit none
!the dictionary file: integer,parameter :: file_unit = 1000 character(len=*),parameter :: filename = 'unixdict.txt'
!maximum number of characters in a word: integer,parameter :: max_chars = 50
!maximum number of characters in the string displaying the anagram lists: integer,parameter :: str_len = 256
type word character(len=max_chars) :: str = repeat(' ',max_chars) !the word from the dictionary integer :: n = 0 !length of this word integer :: n_anagrams = 0 !number of anagrams found logical :: checked = .false. !if this one has already been checked character(len=str_len) :: anagrams = repeat(' ',str_len) !the anagram list for this word end type word
!the dictionary structure: type(word),dimension(:),allocatable,target :: dict
contains !***************************************************************************************
!****************************************************************************** function count_lines_in_file(fid) result(n_lines) !****************************************************************************** implicit none
integer :: n_lines integer,intent(in) :: fid character(len=1) :: tmp integer :: i integer :: ios
!the file is assumed to be open already.
rewind(fid) !rewind to beginning of the file
n_lines = 0 do !read each line until the end of the file. read(fid,'(A1)',iostat=ios) tmp if (ios < 0) exit !End of file n_lines = n_lines + 1 !row counter end do
rewind(fid) !rewind to beginning of the file
!****************************************************************************** end function count_lines_in_file !******************************************************************************
!****************************************************************************** pure elemental function is_anagram(x,y) !****************************************************************************** implicit none character(len=*),intent(in) :: x character(len=*),intent(in) :: y logical :: is_anagram
character(len=len(x)) :: x_tmp !a copy of x integer :: i,j
!a character not found in any word: character(len=1),parameter :: null = achar(0)
!x and y are assumed to be the same size.
x_tmp = x do i=1,len_trim(x) j = index(x_tmp, y(i:i)) !look for this character in x_tmp if (j/=0) then x_tmp(j:j) = null !clear it so it won't be checked again else is_anagram = .false. !character not found: x,y are not anagrams return end if end do
!if we got to this point, all the characters ! were the same, so x,y are anagrams: is_anagram = .true.
!****************************************************************************** end function is_anagram !******************************************************************************
!*************************************************************************************** end module anagram_routines !***************************************************************************************
!*************************************************************************************** program main !*************************************************************************************** use anagram_routines implicit none
integer :: n,i,j,n_max type(word),pointer :: x,y logical :: first_word real :: start, finish
call cpu_time(start) !..start timer
!open the dictionary and read in all the words: open(unit=file_unit,file=filename) !open the file n = count_lines_in_file(file_unit) !count lines in the file allocate(dict(n)) !allocate dictionary structure do i=1,n ! read(file_unit,'(A)') dict(i)%str !each line is a word in the dictionary dict(i)%n = len_trim(dict(i)%str) !saving length here to avoid trim's below end do close(file_unit) !close the file
!search dictionary for anagrams: do i=1,n
x => dict(i) !pointer to simplify code first_word = .true. !initialize
do j=i,n
y => dict(j) !pointer to simplify code
!checks to avoid checking words unnecessarily: if (x%checked .or. y%checked) cycle !both must not have been checked already if (x%n/=y%n) cycle !must be the same size if (x%str(1:x%n)==y%str(1:y%n)) cycle !can't be the same word
! check to see if x,y are anagrams: if (is_anagram(x%str(1:x%n), y%str(1:y%n))) then !they are anagrams. y%checked = .true. !don't check this one again. x%n_anagrams = x%n_anagrams + 1 if (first_word) then !this is the first anagram found for this word. first_word = .false. x%n_anagrams = x%n_anagrams + 1 x%anagrams = trim(x%anagrams)//x%str(1:x%n) !add first word to list end if x%anagrams = trim(x%anagrams)//','//y%str(1:y%n) !add next word to list end if
end do x%checked = .true. !don't check this one again
end do
!anagram groups with the most words: write(*,*) n_max = maxval(dict%n_anagrams) do i=1,n if (dict(i)%n_anagrams==n_max) write(*,'(A)') trim(dict(i)%anagrams) end do
!anagram group containing longest words: write(*,*) n_max = maxval(dict%n, mask=dict%n_anagrams>0) do i=1,n if (dict(i)%n_anagrams>0 .and. dict(i)%n==n_max) write(*,'(A)') trim(dict(i)%anagrams) end do write(*,*)
call cpu_time(finish) !...stop timer write(*,'(A,F6.3,A)') '[Runtime = ',finish-start,' sec]' write(*,*)
!*************************************************************************************** end program main !***************************************************************************************</lang>
produces this output:
abel,able,bale,bela,elba alger,glare,lager,large,regal angel,angle,galen,glean,lange caret,carte,cater,crate,trace elan,lane,lean,lena,neal evil,levi,live,veil,vile conservation,conversation [Runtime = 6.897 sec]
GAP
<lang gap>Anagrams := function(name)
local f, p, L, line, word, words, swords, res, cur, r; words := [ ]; swords := [ ]; f := InputTextFile(name); while true do line := ReadLine(f); if line = fail then break; else word := Chomp(line); Add(words, word); Add(swords, SortedList(word)); fi; od; CloseStream(f); p := SortingPerm(swords); L := Permuted(words, p); r := ""; cur := [ ]; res := [ ]; for word in L do if SortedList(word) = r then Add(cur, word); else if Length(cur) > 0 then Add(res, cur); fi; r := SortedList(word); cur := [ word ]; fi; od; if Length(cur) > 0 then Add(res, cur); fi; return Filtered(res, v -> Length(v) > 1);
end;
ana := Anagrams("my/gap/unixdict.txt");;
- What is the longest anagram sequence ?
Maximum(List(ana, Length));
- 5
- Which are they ?
Filtered(ana, v -> Length(v) = 5);
- [ [ "abel", "able", "bale", "bela", "elba" ],
- [ "caret", "carte", "cater", "crate", "trace" ],
- [ "angel", "angle", "galen", "glean", "lange" ],
- [ "alger", "glare", "lager", "large", "regal" ],
- [ "elan", "lane", "lean", "lena", "neal" ],
- [ "evil", "levi", "live", "veil", "vile" ] ]</lang>
Go
<lang go>package main
import (
"fmt" "io/ioutil" "sort" "strings"
)
func main() {
b, err := ioutil.ReadFile("unixdict.txt") if err != nil { fmt.Println(err) return } var ma int m := make(map[string][]string) for _, word := range strings.Split(string(b), "\n") { bs := byteSlice(word) sort.Sort(bs) k := string(bs) a := append(m[k], word) if len(a) > ma { ma = len(a) } m[k] = a } for _, a := range m { if len(a) == ma { fmt.Println(a) } }
}
type byteSlice []byte
func (b byteSlice) Len() int { return len(b) } func (b byteSlice) Swap(i, j int) { b[i], b[j] = b[j], b[i] } func (b byteSlice) Less(i, j int) bool { return b[i] < b[j] }</lang> Output:
[angel angle galen glean lange] [elan lane lean lena neal] [evil levi live veil vile] [abel able bale bela elba] [caret carte cater crate trace] [alger glare lager large regal]
Groovy
This program: <lang groovy>def words = new URL('http://www.puzzlers.org/pub/wordlists/unixdict.txt').text.readLines() def groups = words.groupBy{ it.toList().sort() } def bigGroupSize = groups.collect{ it.value.size() }.max() def isBigAnagram = { it.value.size() == bigGroupSize } println groups.findAll(isBigAnagram).collect{ it.value }.collect{ it.join(' ') }.join('\n')</lang> produces this output:
abel able bale bela elba alger glare lager large regal angel angle galen glean lange caret carte cater crate trace elan lane lean lena neal evil levi live veil vile
Haskell
<lang haskell>import Data.List
groupon f x y = f x == f y
main = do
f <- readFile "./../Puzzels/Rosetta/unixdict.txt" let words = lines f wix = groupBy (groupon fst) . sort $ zip (map sort words) words mxl = maximum $ map length wix mapM_ (print . map snd) . filter ((==mxl).length) $ wix</lang>
Sample output: <lang haskell>*Main> main ["abel","able","bale","bela","elba"] ["caret","carte","cater","crate","trace"] ["angel","angle","galen","glean","lange"] ["alger","glare","lager","large","regal"] ["elan","lane","lean","lena","neal"] ["evil","levi","live","veil","vile"]</lang>
Icon and Unicon
<lang icon>procedure main(args)
every writeSet(!getLongestAnagramSets())
end
procedure getLongestAnagramSets()
wordSets := table() longestWSet := 0 longSets := set()
every word := !&input do { wChars := csort(word) /wordSets[wChars] := set() insert(wordSets[wChars], word)
if 1 < *wordSets[wChars} == longestWSet then insert(longSets, wordSets[wChars]) if 1 < *wordSets[wChars} > longestWSet then { longestWSet := *wordSets[wChars} longSets := set([wordSets[wChars]]) } }
return longSets
end
procedure writeSet(words)
every writes("\t"|!words," ") write()
end
procedure csort(w)
every (s := "") ||:= (find(c := !cset(w),w),c) return s
end</lang> Sample run:
->an <unixdict.txt abel bale bela able elba lean neal elan lane lena angle galen lange angel glean alger glare lager large regal veil evil levi live vile caret cater crate carte trace ->
J
If the unixdict file has been retrieved and saved in the current directory (for example, using wget):
<lang j> (#~ a: ~: {:"1) (]/.~ /:~&>) <;._2 ] 1!:1 <'unixdict.txt'
+-----+-----+-----+-----+-----+
|abel |able |bale |bela |elba |
+-----+-----+-----+-----+-----+
|alger|glare|lager|large|regal|
+-----+-----+-----+-----+-----+
|angel|angle|galen|glean|lange|
+-----+-----+-----+-----+-----+
|caret|carte|cater|crate|trace|
+-----+-----+-----+-----+-----+
|elan |lane |lean |lena |neal |
+-----+-----+-----+-----+-----+
|evil |levi |live |veil |vile |
+-----+-----+-----+-----+-----+</lang>
Explanation:
<lang J> <;._2 ] 1!:1 <'unixdict.txt'</lang>
This reads in the dictionary and produces a list of boxes. Each box contains one line (one word) from the dictionary.
<lang J> (]/.~ /:~&>)</lang>
This groups the words into rows where anagram equivalents appear in the same row. In other words, creates a copy of the original list where the characters contained in each box have been sorted. Then it organizes the contents of the original list in rows, with each new row keyed by the values in the new list.
<lang J> (#~ a: ~: {:"1)</lang>
This selects rows whose last element is not an empty box.
(In the previous step we created an array of rows of boxes. The short rows were automatically padded with empty boxes so that all rows would be the same length.)
Java
The key to this algorithm is the sorting of the characters in each word from the dictionary. The line Arrays.sort(chars); sorts all of the letters in the word in ascending order using a built-in quicksort, so all of the words in the first group in the result end up under the key "aegln" in the anagrams map.
<lang java5>import java.net.*; import java.io.*; import java.util.*;
public class WordsOfEqChars {
public static void main(String[] args) throws IOException { URL url = new URL("http://www.puzzlers.org/pub/wordlists/unixdict.txt"); InputStreamReader isr = new InputStreamReader(url.openStream()); BufferedReader reader = new BufferedReader(isr);
Map<String, Collection<String>> anagrams = new HashMap<String, Collection<String>>(); String word; int count = 0; while ((word = reader.readLine()) != null) { char[] chars = word.toCharArray(); Arrays.sort(chars); String key = new String(chars); if (!anagrams.containsKey(key)) anagrams.put(key, new ArrayList<String>()); anagrams.get(key).add(word); count = Math.max(count, anagrams.get(key).size()); }
reader.close();
for (Collection<String> ana : anagrams.values()) if (ana.size() >= count) System.out.println(ana); }
}</lang>
<lang java5>import java.net.*; import java.io.*; import java.util.*; import java.util.concurrent.*; import java.util.function.*;
public interface Anagram {
public static <AUTOCLOSEABLE extends AutoCloseable, OUTPUT> Supplier<OUTPUT> tryWithResources(Callable<AUTOCLOSEABLE> callable, Function<AUTOCLOSEABLE, Supplier<OUTPUT>> function, Supplier<OUTPUT> defaultSupplier) { return () -> { try (AUTOCLOSEABLE autoCloseable = callable.call()) { return function.apply(autoCloseable).get(); } catch (Throwable throwable) { return defaultSupplier.get(); } }; }
public static <INPUT, OUTPUT> Function<INPUT, OUTPUT> function(Supplier<OUTPUT> supplier) { return i -> supplier.get(); }
public static void main(String... args) { Map<String, Collection<String>> anagrams = new ConcurrentSkipListMap<>(); int count = tryWithResources( () -> new BufferedReader( new InputStreamReader( new URL( "http://www.puzzlers.org/pub/wordlists/unixdict.txt" ).openStream() ) ), reader -> () -> reader.lines() .parallel() .mapToInt(word -> { char[] chars = word.toCharArray(); Arrays.parallelSort(chars); String key = Arrays.toString(chars); Collection<String> collection = anagrams.computeIfAbsent( key, function(ArrayList::new) ); collection.add(word); return collection.size(); }) .max() .orElse(0), () -> 0 ).get(); anagrams.values().stream() .filter(ana -> ana.size() >= count) .forEach(System.out::println) ; }
}</lang> Output:
[angel, angle, galen, glean, lange] [elan, lane, lean, lena, neal] [alger, glare, lager, large, regal] [abel, able, bale, bela, elba] [evil, levi, live, veil, vile] [caret, carte, cater, crate, trace]
JavaScript
Spidermonkey
<lang JavaScript>#!/usr/bin/env js
var anas = {}; var words = read('unixdict.txt').split(/\n/g);
for (var w in words) {
var key = words[w].split("").sort().join(); if (!(key in anas)) { anas[key] = []; } anas[key].push(words[w]);
}
for (var a in anas) {
if (anas[a].length >= 2) { print(anas[a]); }
}
quit();</lang>
Sample output:
abbe,babe abc,cab abed,bade,bead abel,able,bale,bela,elba abet,bate,beat,beta abo,boa aboard,abroad abode,adobe
Julia
<lang Julia>url = "http://www.puzzlers.org/pub/wordlists/unixdict.txt"
wordlist = map!(chomp,(open(readlines, download(url)))) ;
function anagram(wordlist)
hash = Dict() ; ananum = 0 for word in wordlist sorted = CharString(sort(collect(word.data))) hash[sorted] = [ get(hash, sorted, {}), word ] ananum = max(length(hash[sorted]), ananum) end collect(values(filter((x,y)-> length(y) == ananum, hash)))
end</lang>
- Output:
julia> anagram(wordlist) 6-element Array{Any,1}: {"elan","lane","lean","lena","neal"} {"evil","levi","live","veil","vile"} {"angel","angle","galen","glean","lange"} {"alger","glare","lager","large","regal"} {"abel","able","bale","bela","elba"} {"caret","carte","cater","crate","trace"}
K
<lang k>{x@&a=|/a:#:'x}{x g@&1<#:'g:={x@<x}'x}0::`unixdict.txt</lang>
Lasso
<lang lasso>local( anagrams = map, words = include_url('http://www.puzzlers.org/pub/wordlists/unixdict.txt') -> split('\n'), key, max = 0, findings = array )
with word in #words do { #key = #word -> split() -> sort& -> join() if(not(#anagrams >> #key)) => { #anagrams -> insert(#key = array) } #anagrams -> find(#key) -> insert(#word) } with ana in #anagrams let ana_size = #ana -> size do { if(#ana_size > #max) => { #findings = array(#ana -> join(', ')) #max = #ana_size else(#ana_size == #max) #findings -> insert(#ana -> join(', ')) } }
- findings -> join('
\n')
</lang> Result -> abel, able, bale, bela, elba caret, carte, cater, crate, trace angel, angle, galen, glean, lange alger, glare, lager, large, regal elan, lane, lean, lena, neal evil, levi, live, veil, vile
Liberty BASIC
<lang lb>' count the word list open "unixdict.txt" for input as #1 while not(eof(#1))
line input #1,null$ numWords=numWords+1
wend close #1
'import to an array appending sorted letter set open "unixdict.txt" for input as #1 dim wordList$(numWords,3) dim chrSort$(45) wordNum=1 while wordNum<numWords
line input #1,actualWord$ wordList$(wordNum,1)=actualWord$ wordList$(wordNum,2)=sorted$(actualWord$) wordNum=wordNum+1
wend
'sort on letter set sort wordList$(),1,numWords,2
'count and store number of anagrams found wordNum=1 startPosition=wordNum numAnagrams=0 currentChrSet$=wordList$(wordNum,2) while wordNum < numWords
while currentChrSet$=wordList$(wordNum,2) numAnagrams=numAnagrams+1 wordNum=wordNum+1 wend for n= startPosition to startPosition+numAnagrams wordList$(n,3)=right$("0000"+str$(numAnagrams),4)+wordList$(n,2) next startPosition=wordNum numAnagrams=0 currentChrSet$=wordList$(wordNum,2)
wend
'sort on number of anagrams+letter set sort wordList$(),numWords,1,3
'display the top anagram sets found wordNum=1 while wordNum<150
currentChrSet$=wordList$(wordNum,2) print "Anagram set"; while currentChrSet$=wordList$(wordNum,2) print " : ";wordList$(wordNum,1); wordNum=wordNum+1 wend print currentChrSet$=wordList$(wordNum,2)
wend
close #1 end
function sorted$(w$)
nchr=len(w$) for chr = 1 to nchr chrSort$(chr)=mid$(w$,chr,1) next sort chrSort$(),1,nchr sorted$="" for chr = 1 to nchr sorted$=sorted$+chrSort$(chr) next
end function</lang>
LiveCode
LiveCode could definitely use a sort characters command. As it is this code converts the letters into items and then sorts that. I wrote a merge sort for characters, but the conversion to items, built-in-sort, conversion back to string is about 10% faster, and certainly easier to write.
<lang LiveCode>on mouseUp
put mostCommonAnagrams(url "http://www.puzzlers.org/pub/wordlists/unixdict.txt")
end mouseUp
function mostCommonAnagrams X
put 0 into maxCount repeat for each word W in X get sortChars(W) put W & comma after A[it] add 1 to C[it] if C[it] >= maxCount then if C[it] > maxCount then put C[it] into maxCount put char 1 to -2 of A[it] into winnerList else put cr & char 1 to -2 of A[it] after winnerList end if end if end repeat return winnerList
end mostCommonAnagrams
function sortChars X
get charsToItems(X) sort items of it return itemsToChars(it)
end sortChars
function charsToItems X
repeat for each char C in X put C & comma after R end repeat return char 1 to -2 of R
end charsToItems
function itemsToChars X
replace comma with empty in X return X
end itemsToChars</lang> Output:
abel,able,bale,bela,elba angel,angle,galen,glean,lange elan,lane,lean,lena,neal alger,glare,lager,large,regal caret,carte,cater,crate,trace evil,levi,live,veil,vile
Lua
Lua's core library is very small and does not include built-in network functionality. If a networking library were imported, the local file in the following script could be replaced with the remote dictionary file. This may or may not be a good implementation, but I thought the method was interesting. <lang lua>-- Build the word set local set = {} local file = io.open("unixdict.txt") local str = file:read() while str do
table.insert(set,str) str = file:read()
end
-- Build the anagram tree local tree = {} for i,word in next,set do
-- Sort a string from lowest char to highest local function sortString(str) if #str <= 1 then return str end local less = local greater = local pivot = str:byte(1) for i = 2, #str do if str:byte(i) <= pivot then less = less..(str:sub(i,i)) else greater = greater..(str:sub(i,i)) end end return sortString(less)..str:sub(1,1)..sortString(greater) end local sortchar = sortString(word) if not tree[#word] then tree[#word] = {} end local node = tree[#word] for i = 1,#word do if not node[sortchar:byte(i)] then node[sortchar:byte(i)] = {} end node = node[sortchar:byte(i)] end table.insert(node,word)
end
-- Gather largest groups by gathering all groups of current max size and droping gathered groups and increasing max when a new largest group is found local max = 0 local set = {} local function recurse (tree)
local num = 0 for i,node in next,tree do if type(node) == 'string' then num = num + 1 end end if num > max then set = {} max = num end if num == max then local newset = {} for i,node in next,tree do if type(node) == 'string' then table.insert(newset,node) end end table.insert(set,newset) end for i,node in next,tree do if type(node) == 'table' then recurse(node) end end
end
recurse (tree) for i,v in next,set do io.write (i..':\t')for j,u in next,v do io.write (u..' ') end print() end</lang>
M4
<lang M4>divert(-1) changequote(`[',`]') define([for],
[ifelse($#,0,$0, [ifelse(eval($2<=$3),1, [pushdef([$1],$2)$4[]popdef([$1])$0([$1],incr($2),$3,[$4])])])])
define([_bar],include(t.txt)) define([eachlineA],
[ifelse(eval($2>0),1, [$3(substr([$1],0,$2))[]eachline(substr([$1],incr($2)),[$3])])])
define([eachline],[eachlineA([$1],index($1,[ ]),[$2])]) define([removefirst],
[substr([$1],0,$2)[]substr([$1],incr($2))])
define([checkfirst],
[ifelse(eval(index([$2],substr([$1],0,1))<0),1, 0, [ispermutation(substr([$1],1), removefirst([$2],index([$2],substr([$1],0,1))))])])
define([ispermutation],
[ifelse([$1],[$2],1, eval(len([$1])!=len([$2])),1,0, len([$1]),0,0, [checkfirst([$1],[$2])])])
define([_set],[define($1<$2>,$3)]) define([_get],[defn([$1<$2>])]) define([_max],1) define([_n],0) define([matchj],
[_set([count],$2,incr(_get([count],$2)))[]ifelse(eval(_get([count],$2)>_max), 1,[define([_max],incr(_max))])[]_set([list],$2,[_get([list],$2) $1])])
define([checkwordj],
[ifelse(ispermutation([$1],_get([word],$2)),1,[matchj([$1],$2)], [addwordj([$1],incr($2))])])
define([_append],
[_set([word],_n,[$1])[]_set([count],_n,1)[]_set([list],_n, [$1 ])[]define([_n],incr(_n))])
define([addwordj],
[ifelse($2,_n,[_append([$1])],[checkwordj([$1],$2)])])
define([addword],
[addwordj([$1],0)])
divert eachline(_bar,[addword]) _max for([x],1,_n,[ifelse(_get([count],x),_max,[_get([list],x) ])])</lang>
Memory limitations keep this program from working on the full-sized dictionary. Run against the first 100 words, here is the output:
2 abel able aboard abroad
Maple
The first line downloads the specified dictionary. (You could, instead, read it from a file, or use one of Maple's built-in word lists.) Next, turn it into a list of words. The assignment to T is where the real work is done (via Classify, in the ListTools package). This creates sets of words all of which have the same "hash", which is, in this case, the sorted word. The convert call discards the hashes, which have done their job, and leaves us with a list L of anagram sets. Finally, we just note the size of the largest sets of anagrams, and pick those off. <lang Maple> words := HTTP:-Get( "http://www.puzzlers.org/pub/wordlists/unixdict.txt" )[2]: # ignore errors use StringTools, ListTools in
T := Classify( Sort, map( Trim, Split( words ) ) )
end use: L := convert( T, 'list' ): m := max( map( nops, L ) ); # what is the largest set? A := select( s -> evalb( nops( s ) = m ), L ); # get the maximal sets of anagrams </lang> The result of running this code is <lang Maple> A := [{"abel", "able", "bale", "bela", "elba"}, {"angel", "angle", "galen", "glean", "lange"}, {"alger", "glare", "lager", "large", "regal"}, {"evil", "levi", "live", "veil", "vile"}, {"caret", "carte", "cater", "crate", "trace"} , {"elan", "lane", "lean", "lena", "neal"}]; </lang>
Mathematica
Download the dictionary, split the lines, split the word in characters and sort them. Now sort by those words, and find sequences of equal 'letter-hashes'. Return the longest sequences:
<lang Mathematica>list=Import["http://www.puzzlers.org/pub/wordlists/unixdict.txt","Lines"];
text={#,StringJoin@@Sort[Characters[#]]}&/@list;
text=SortBy[text,#2&];
splits=Split[text,#12==#22&]All,All,1;
maxlen=Max[Length/@splits];
Select[splits,Length[#]==maxlen&]</lang>
gives back:
<lang Mathematica>{{abel,able,bale,bela,elba},{caret,carte,cater,crate,trace},{angel,angle,galen,glean,lange},{alger,glare,lager,large,regal},{elan,lane,lean,lena,neal},{evil,levi,live,veil,vile}}</lang>
An alternative is faster, but requires version 7 (for Gather
):
<lang Mathematica>splits = Gather[list, Sort[Characters[#]] == Sort[Characters[#2]] &];
maxlen = Max[Length /@ splits];
Select[splits, Length[#] == maxlen &]</lang>
Or using build-in functions for sorting and gathering elements in lists it can be implimented as:
<lang Mathematica>anagramGroups = GatherBy[SortBy[GatherBy[list,Sort[Characters[#]] &],Length],Length];
anagramGroups-1</lang>
Also, Mathematica's own word list is available; replacing the list definition with list = WordData[];
and forcing maxlen
to 5 yields instead this result:
{{angered,derange,enraged,grandee,grenade}, {anisometric,creationism,miscreation,reactionism,romanticise}, {aper,pare,pear,rape,reap}, {ardeb,barde,bared,beard,bread,debar}, {aril,lair,lari,liar,lira,rail,rial}, {aster,rates,stare,tears,teras}, {caret,carte,cater,crate,react,trace}, {east,eats,sate,seat,seta}, {ester,reset,steer,teres,terse}, {inert,inter,niter,nitre,trine}, {latrine,ratline,reliant,retinal,trenail}, {least,slate,stale,steal,stela,tesla}, {luster,lustre,result,rustle,sutler,ulster}, {merit,miter,mitre,remit,timer}, {part,prat,rapt,tarp,trap}, {resin,rinse,risen,serin,siren}, {respect,scepter,sceptre,specter,spectre}}
Maxima
<lang maxima>read_file(name) := block([file, s, L], file: openr(name), L: [], while stringp(s: readline(file)) do L: cons(s, L), close(file), L)$
u: read_file("C:/my/mxm/unixdict.txt")$
v: map(lambda([s], [ssort(s), s]), u)$
w: sort(v, lambda([x, y], orderlessp(x[1], y[1])))$
ana(L) := block([m, n, p, r, u, v, w], L: endcons(["", ""], L), n: length(L), r: "", m: 0, v: [ ], w: [ ], for i from 1 thru n do (
u: L[i], if r = u[1] then ( w: cons(u[2], w) ) else ( p: length(w), if p >= m then ( if p > m then (m: p, v: []), v: cons(w, v) ), w: [u[2]], r: u[1] )
), v)$
ana(w); /* [["evil", "levi", "live", "veil", "vile"],
["elan", "lane", "lean", "lena", "neal"], ["alger", "glare", "lager", "large", "regal"], ["angel", "angle", "galen", "glean", "lange"], ["caret", "carte", "cater", "crate", "trace"], ["abel", "able", "bale", "bela", "elba"]] */</lang>
MUMPS
<lang MUMPS>Anagrams New ii,file,longest,most,sorted,word Set file="unixdict.txt" Open file:"r" Use file For Quit:$ZEOF DO . New char,sort . Read word Quit:word="" . For ii=1:1:$Length(word) Do . . Set char=$ASCII(word,ii) . . If char>64,char<91 Set char=char+32 . . Set sort(char)=$Get(sort(char))+1 . . Quit . Set (sorted,char)="" For Set char=$Order(sort(char)) Quit:char="" Do . . For ii=1:1:sort(char) Set sorted=sorted_$Char(char) . . Quit . Set table(sorted,word)=1 . Quit Close file Set sorted="" For Set sorted=$Order(table(sorted)) Quit:sorted="" Do . Set ii=0,word="" For Set word=$Order(table(sorted,word)) Quit:word="" Set ii=ii+1 . Quit:ii<2 . Set most(ii,sorted)=1 . Quit Write !,"The anagrams with the most variations:" Set ii=$Order(most(""),-1) Set sorted="" For Set sorted=$Order(most(ii,sorted)) Quit:sorted="" Do . Write ! Set word="" For Set word=$Order(table(sorted,word)) Quit:word="" Write " ",word . Quit Write !,"The longest anagrams:" Set ii=$Order(longest(""),-1) Set sorted="" For Set sorted=$Order(longest(ii,sorted)) Quit:sorted="" Do . Write ! Set word="" For Set word=$Order(table(sorted,word)) Quit:word="" Write " ",word . Quit Quit
Do Anagrams</lang>
The anagrams with the most variations: abel able bale bela elba caret carte cater crate trace angel angle galen glean lange alger glare lager large regal elan lane lean lena neal evil levi live veil vile The longest anagrams: conservation conversation
NetRexx
Java–Like
<lang NetRexx>/* NetRexx */ options replace format comments java crossref symbols nobinary
class RAnagramsV01 public
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ method runSample(arg) public signals MalformedURLException, IOException parse arg localFile . isr = Reader if localFile = then do durl = URL("http://www.puzzlers.org/pub/wordlists/unixdict.txt") dictFrom = durl.toString() isr = InputStreamReader(durl.openStream()) end else do dictFrom = localFile isr = FileReader(localFile) end say 'Searching' dictFrom 'for anagrams' dictionaryReader = BufferedReader(isr)
anagrams = Map HashMap() aWord = String count = 0 loop label w_ forever aWord = dictionaryReader.readLine() if aWord = null then leave w_ chars = aWord.toCharArray() Arrays.sort(chars) key = String(chars) if (\anagrams.containsKey(key)) then do anagrams.put(key, ArrayList()) end (ArrayList anagrams.get(key)).add(Object aWord) count = Math.max(count, (ArrayList anagrams.get(key)).size()) end w_ dictionaryReader.close
ani = anagrams.values().iterator() loop label a_ while ani.hasNext() ana = ani.next() if (ArrayList ana).size() >= count then do say ana end end a_
return
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ method main(args = String[]) public static
arg = Rexx(args) Do ra = RAnagramsV01() ra.runSample(arg) Catch ex = Exception ex.printStackTrace() End
return
</lang>
- Output:
Searching http://www.puzzlers.org/pub/wordlists/unixdict.txt for anagrams [abel, able, bale, bela, elba] [elan, lane, lean, lena, neal] [evil, levi, live, veil, vile] [angel, angle, galen, glean, lange] [alger, glare, lager, large, regal] [caret, carte, cater, crate, trace]
Rexx–Like
Implemented with more NetRexx idioms such as indexed strings, PARSE and the NetRexx "built–in functions". <lang NetRexx>/* NetRexx */ options replace format comments java crossref symbols nobinary
runSample(arg) return
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ method findMostAnagrams(arg) public static signals MalformedURLException, IOException
parse arg localFile . isr = Reader if localFile = then do durl = URL("http://www.puzzlers.org/pub/wordlists/unixdict.txt") dictFrom = durl.toString() isr = InputStreamReader(durl.openStream()) end else do dictFrom = localFile isr = FileReader(localFile) end say 'Searching' dictFrom 'for anagrams' dictionaryReader = BufferedReader(isr)
anagrams = 0 maxWords = 0 loop label w_ forever aWord = dictionaryReader.readLine() if aWord = null then leave w_ chars = aWord.toCharArray() Arrays.sort(chars) key = Rexx(chars) parse anagrams[key] count aWords aWords = (aWords aWord).space() maxWords = maxWords.max(aWords.words()) anagrams[key] = aWords.words() aWords end w_ dictionaryReader.close
loop key over anagrams parse anagrams[key] count aWords if count >= maxWords then say aWords else anagrams[key] = null -- remove unwanted elements from the indexed string end key
return
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ method runSample(arg) public static
Do findMostAnagrams(arg) Catch ex = Exception ex.printStackTrace() End
Return
</lang>
- Output:
Searching http://www.puzzlers.org/pub/wordlists/unixdict.txt for anagrams abel able bale bela elba elan lane lean lena neal evil levi live veil vile angel angle galen glean lange alger glare lager large regal caret carte cater crate trace
Nimrod
<lang nimrod> import tables
proc sort(s: string): string =
var i,j: int t: char
result = s for i in 0 .. result.len - 1: j = i t = result[j] while(j > 0) and (result[j - 1] > t): result[j] = result[j - 1] dec(j) result[j] = t
- end sort
proc maxCount(an: TTable[string,seq[string]]): int =
result = 0 for v in an.values: if v.len > result: result = v.len
- end maxCount
proc showAnagrams(s: seq[string]) =
for v in s: write(stdout,v) write(stdout," ") writeln(stdout,"")
- end showAnagrams
proc processFile: TTable[string,seq[string]] =
var fd: TFile sline,line: string
result = initTable[string,seq[string]]() if Open(fd,"unixdict.txt"): while not EndOfFile(fd): line = fd.readLine() sline = sort(line) if result.hasKey(sline): result[sline] = result[sline] & line else: result[sline] = @[line]
var
anagrams:TTable[string,seq[string]] = processFile() max = anagrams.maxCount
for v in anagrams.values:
if v.len == max: showAnagrams(v)
</lang>
Output:
evil levi live veil vile caret carte cater crate trace elan lane lean lena neal alger glare lager large regal abel able bale bela elba angel angle galen glean lange
Oberon-2
Oxford Oberon-2 <lang oberon2> MODULE Anagrams; IMPORT Files,Out,In,Strings; CONST MAXPOOLSZ = 1024;
TYPE String = ARRAY 80 OF CHAR;
Node = POINTER TO NodeDesc; NodeDesc = RECORD; count: INTEGER; word: String; desc: Node; next: Node; END;
Pool = POINTER TO PoolDesc; PoolDesc = RECORD capacity,max: INTEGER; words: POINTER TO ARRAY OF Node; END;
PROCEDURE InitNode(n: Node); BEGIN n^.count := 0; n^.word := ""; n^.desc := NIL; n^.next := NIL; END InitNode;
PROCEDURE Index(s: ARRAY OF CHAR;cap: INTEGER): INTEGER; VAR i,sum: INTEGER; BEGIN sum := 0; FOR i := 0 TO Strings.Length(s) DO INC(sum,ORD(s[i])) END; RETURN sum MOD cap END Index;
PROCEDURE ISort(VAR s: ARRAY OF CHAR); VAR
i, j: INTEGER; t: CHAR;
BEGIN
FOR i := 0 TO Strings.Length(s) - 1 DO
j := i; t := s[j]; WHILE (j > 0) & (s[j -1] > t) DO s[j] := s[j - 1]; DEC(j) END; s[j] := t
END
END ISort;
PROCEDURE SameLetters(x,y: ARRAY OF CHAR): BOOLEAN; BEGIN
ISort(x);ISort(y); RETURN (Strings.Compare(x,y) = 0)
END SameLetters;
PROCEDURE InitPool(p:Pool); BEGIN InitPoolWith(p,MAXPOOLSZ); END InitPool;
PROCEDURE InitPoolWith(p:Pool;cap: INTEGER); VAR i: INTEGER; BEGIN p^.capacity := cap; p^.max := 0; NEW(p^.words,cap); i := 0; WHILE i < p^.capacity DO p^.words^[i] := NIL; INC(i); END; END InitPoolWith;
PROCEDURE (p: Pool) Add(w: ARRAY OF CHAR); VAR idx: INTEGER; iter,n: Node; BEGIN idx := Index(w,p^.capacity); iter := p^.words^[idx]; NEW(n);InitNode(n);COPY(w,n^.word); WHILE(iter # NIL) DO IF SameLetters(w,iter^.word) THEN INC(iter^.count); IF iter^.count > p^.max THEN p^.max := iter^.count END; n^.desc := iter^.desc; iter^.desc := n; RETURN END; iter := iter^.next END; ASSERT(iter = NIL); n^.next := p^.words^[idx];p^.words^[idx] := n END Add;
PROCEDURE ShowAnagrams(l: Node); VAR iter: Node; BEGIN iter := l; WHILE iter # NIL DO Out.String(iter^.word);Out.String(" "); iter := iter^.desc END; Out.Ln END ShowAnagrams;
PROCEDURE (p: Pool) ShowMax(); VAR i: INTEGER; iter: Node; BEGIN FOR i := 0 TO LEN(p^.words^) - 1 DO IF p^.words^[i] # NIL THEN iter := p^.words^[i]; WHILE iter # NIL DO IF iter^.count = p^.max THEN ShowAnagrams(iter); END; iter := iter^.next END END END END ShowMax;
PROCEDURE DoProcess(fnm: ARRAY OF CHAR); VAR stdinBck,istream: Files.File; line: String; p: Pool; BEGIN istream := Files.Open(fnm,"r"); stdinBck := Files.stdin; Files.stdin := istream; NEW(p);InitPool(p); WHILE In.Done DO In.Line(line); p.Add(line); END; Files.stdin := stdinBck; Files.Close(istream); p^.ShowMax(); END DoProcess;
BEGIN DoProcess("unixdict.txt"); END Anagrams. </lang> Output:
abel elba bela bale able elan neal lena lean lane evil vile veil live levi angel lange glean galen angle alger regal large lager glare caret trace crate cater carte
Objeck
<lang objeck>use HTTP; use Collection;
class Anagrams {
function : Main(args : String[]) ~ Nil { lines := HttpClient->New()->Get("http://www.puzzlers.org/pub/wordlists/unixdict.txt"); if(lines->Size() = 1) { line := lines->Get(0)->As(String); words := line->Split("\n"); anagrams := StringMap->New(); words->Size()->PrintLine(); each(i : words) { word := words[i]->Trim(); key := String->New(word->ToCharArray()->Sort()); list := anagrams->Find(key)->As(Vector); if(list = Nil) { list := Vector->New(); anagrams->Insert(key, list); }; list->AddBack(word); }; lists := anagrams->GetValues(); each(i : lists) { list := lists->Get(i)->As(Vector); if(list->Size() > 1) { '['->Print(); each(j : list) { list->Get(j)->As(String)->Print(); if(j + 1 < list->Size()) { ','->Print(); }; }; ']'->PrintLine(); }; }; }; }
} </lang>
Output:
[lascar,rascal,sacral,scalar] [calvary,cavalry] [casual,causal] [alexander,alexandre] [drainage,gardenia] [andean,deanna] [diana,nadia,naiad]
OCaml
<lang ocaml>let explode str =
let l = ref [] in let n = String.length str in for i = n - 1 downto 0 do l := str.[i] :: !l done; (!l)
let implode li =
let n = List.length li in let s = String.create n in let i = ref 0 in List.iter (fun c -> s.[!i] <- c; incr i) li; (s)
let () =
let h = Hashtbl.create 3571 in let ic = open_in "unixdict.txt" in try while true do let w = input_line ic in let k = implode (List.sort compare (explode w)) in let l = try Hashtbl.find h k with Not_found -> [] in Hashtbl.add h k (w::l); done with End_of_file -> (); let n = Hashtbl.fold (fun _ lw n -> max n (List.length lw)) h 0 in Hashtbl.iter (fun _ lw -> if List.length lw >= n then ( List.iter (Printf.printf " %s") lw; print_newline () ) ) h</lang>
ooRexx
Two versions of this, using different collection classes.
Version 1: Directory of arrays
<lang ooRexx> -- This assumes you've already downloaded the following file and placed it -- in the current directory: http://www.puzzlers.org/pub/wordlists/unixdict.txt
-- There are several different ways of reading the file. I chose the -- supplier method just because I haven't used it yet in any other examples. source = .stream~new('unixdict.txt')~supplier -- this holds our mappings of the anagrams anagrams = .directory~new count = 0 -- this is used to keep track of the maximums
loop while source~available
word = source~item -- this produces a string consisting of the characters in sorted order -- Note: the ~~ used to invoke sort makes that message return value be -- the target array. The sort method does not normally have a return value. key = word~makearray()~~sort~tostring("l", "")
-- make sure we have an accumulator collection for this key list = anagrams[key] if list == .nil then do list = .array~new anagrams[key] = list end -- this word is now associate with this key list~append(word) -- and see if this is a new highest count count = max(count, list~items) source~next
end
loop letters over anagrams
list = anagrams[letters] if list~items >= count then say letters":" list~makestring("l", ", ")
end </lang>
Version 2: Using the relation class
This version appears to be the fastest. <lang ooRexx> -- This assumes you've already downloaded the following file and placed it -- in the current directory: http://www.puzzlers.org/pub/wordlists/unixdict.txt
-- There are several different ways of reading the file. I chose the -- supplier method just because I haven't used it yet in any other examples. source = .stream~new('unixdict.txt')~supplier -- this holds our mappings of the anagrams. This is good use for the -- relation class anagrams = .relation~new count = 0 -- this is used to keep track of the maximums
loop while source~available
word = source~item -- this produces a string consisting of the characters in sorted order -- Note: the ~~ used to invoke sort makes that message return value be -- the target array. The sort method does not normally have a return value. key = word~makearray()~~sort~tostring("l", "") -- add this to our mapping. This creates multiple entries for each -- word that uses the same key anagrams[key] = word source~next
end
-- now get the set of unique keys keys = .set~new~~putall(anagrams~allIndexes) count = 0 -- this is used to keep track of the maximums most = .directory~new
loop key over keys
words = anagrams~allAt(key) newCount = words~items if newCount > count then do -- throw away our old set most~empty count = newCount most[key] = words end -- matches our highest count, add it to the list else if newCount == count then most[key] = words
end
loop letters over most
words = most[letters] say letters":" words~makestring("l", ", ")
end </lang> Timings taken on my laptop:
Version 1 1.2 seconds Version 2 0.4 seconds Rexx 51.1 seconds (!) as of 04.08.2013 (using ooRexx after adapting the code for incompatibilities: @->y, a=, Upper) REXX v1 1.7 seconds as of 05.08.2013 -"- (improved version of REXX code) REXX v1 1.2 seconds 09.08.2013 -"- REXX v2 1.2 seconds 09.08.2013 PL/I 4.3 seconds NetRexx v1 .2 seconds (using local file, 4 seconds with remote) NetRexx v2 .09 seconds (using local file) It probably should be noted that the REXX timings are actually for ooRexx executing a modified version of the REXX code. Statistics: sets number of words 22022 1 1089 2 155 3 31 4 6 5
Oz
<lang oz>declare
%% Helper function fun {ReadLines Filename} File = {New class $ from Open.file Open.text end init(name:Filename)} in for collect:C break:B do
case {File getS($)} of false then {File close} {B} [] Line then {C Line}
end end end
%% Groups anagrams by using a mutable dictionary %% with sorted words as keys WordDict = {Dictionary.new} for Word in {ReadLines "unixdict.txt"} do Keyword = {String.toAtom {Sort Word Value.'<'}} in WordDict.Keyword := Word|{CondSelect WordDict Keyword nil} end Sets = {Dictionary.items WordDict}
%% Filter such that only the largest sets remain MaxSetSize = {FoldL {Map Sets Length} Max 0} LargestSets = {Filter Sets fun {$ S} {Length S} == MaxSetSize end}
in
%% Display result (make sure strings are shown as string, not as number lists) {Inspector.object configureEntry(widgetShowStrings true)} {Inspect LargestSets}</lang>
Pascal
<lang pascal>Program Anagrams;
// assumes a local file
uses
classes, math;
var
i, j, k, maxCount: integer; sortedString: string; WordList: TStringList; SortedWordList: TStringList; AnagramList: array of TStringlist;
begin
WordList := TStringList.Create; WordList.LoadFromFile('unixdict.txt'); for i := 0 to WordList.Count - 1 do begin setLength(sortedString,Length(WordList.Strings[i])); sortedString[1] := WordList.Strings[i][1];
// sorted assign j := 2; while j <= Length(WordList.Strings[i]) do begin k := j - 1; while (WordList.Strings[i][j] < sortedString[k]) and (k > 0) do begin sortedString[k+1] := sortedString[k]; k := k - 1; end; sortedString[k+1] := WordList.Strings[i][j]; j := j + 1; end;
// create the stringlists of the sorted letters and // the list of the original words if not assigned(SortedWordList) then begin SortedWordList := TStringList.Create; SortedWordList.append(sortedString); setlength(AnagramList,1); AnagramList[0] := TStringList.Create; AnagramList[0].append(WordList.Strings[i]); end else begin j := 0; while sortedString <> SortedWordList.Strings[j] do begin inc(j); if j = (SortedWordList.Count) then begin SortedWordList.append(sortedString); setlength(AnagramList,length(AnagramList) + 1); AnagramList[j] := TStringList.Create; break; end; end; AnagramList[j].append(WordList.Strings[i]); end; end;
maxCount := 1; for i := 0 to length(AnagramList) - 1 do maxCount := max(maxCount, AnagramList[i].Count); // create output writeln('The largest sets of words have ', maxCount, ' members:'); for i := 0 to length(AnagramList) - 1 do begin if AnagramList[i].Count = maxCount then begin write('"', SortedWordList.strings[i], '": '); for j := 0 to AnagramList[i].Count - 2 do write(AnagramList[i].strings[j], ', '); writeln(AnagramList[i].strings[AnagramList[i].Count - 1]); end; end;
// Cleanup WordList.Destroy; SortedWordList.Destroy; for i := 0 to length(AnagramList) - 1 do AnagramList[i].Destroy;
end.</lang> Output:
The largest sets of words have 5 members: "abel": abel, able, bale, bela, elba "aeglr": alger, glare, lager, large, regal "aegln": angel, angle, galen, glean, lange "acert": caret, carte, cater, crate, trace "aeln": elan, lane, lean, lena, neal "eilv": evil, levi, live, veil, vile
Perl
<lang perl>use LWP::Simple; use List::Util qw(max);
my @words = split(' ', get('http://www.puzzlers.org/pub/wordlists/unixdict.txt')); my %anagram; foreach my $word (@words) {
push @{ $anagram{join(, sort(split(//, $word)))} }, $word;
}
my $count = max(map {scalar @$_} values %anagram); foreach my $ana (values %anagram) {
if (@$ana >= $count) { print "@$ana\n"; }
}</lang> refactor of above: <lang perl>use LWP::Simple;
for (split ' ' => get 'http://www.puzzlers.org/pub/wordlists/unixdict.txt')
{push @{$anagram{ join => sort split // }}, $_}
$max > @$_ or $max = @$_ for values %anagram; @$_ >= $max and print "@$_\n" for values %anagram;</lang> Output:
alger glare lager large regal abel able bale bela elba evil levi live veil vile angel angle galen glean lange elan lane lean lena neal caret carte cater crate trace
Perl 6
<lang perl6>my %anagram = slurp('unixdict.txt').words.classify( { .comb.sort.join } );
my $max = [max] map { +@($_) }, %anagram.values;
%anagram.values.grep( { +@($_) >= $max } )».join(' ')».say;</lang> Output:
caret carte cater crate trace angel angle galen glean lange alger glare lager large regal elan lane lean lena neal evil levi live veil vile abel able bale bela elba
Just for the fun of it, here's one-liner that uses no temporaries. Since it would be rather long, we've oriented it vertically: <lang perl6> .say for # print each element of the array made this way: slurp('unixdict.txt')\ # load file in memory .words\ # extract words .classify( *.comb.sort.join )\ # group by common anagram .classify( *.value.elems )\ # group by number of anagrams in a group .max( :by(*.key) ).value\ # get the group with highest number of anagrams ».value # get all groups of anagrams in the group just selected</lang>
PHP
<lang php><?php $words = explode("\n", file_get_contents('http://www.puzzlers.org/pub/wordlists/unixdict.txt')); foreach ($words as $word) {
$chars = str_split($word); sort($chars); $anagram[implode($chars)][] = $word;
}
$best = max(array_map('count', $anagram)); foreach ($anagram as $ana)
if (count($ana) == $best) print_r($ana);
?></lang>
PicoLisp
A straight-forward implementation using 'group' takes 48 seconds on a 1.7 GHz Pentium: <lang PicoLisp>(flip
(by length sort (by '((L) (sort (copy L))) group (in "unixdict.txt" (make (while (line) (link @)))) ) ) )</lang>
Using a binary tree with the 'idx' function, it takes only 0.42 seconds on the same machine, a factor of 100 faster: <lang PicoLisp>(let Words NIL
(in "unixdict.txt" (while (line) (let (Word (pack @) Key (pack (sort @))) (if (idx 'Words Key T) (push (car @) Word) (set Key (list Word)) ) ) ) ) (flip (by length sort (mapcar val (idx 'Words)))) )</lang>
Output:
-> (("vile" "veil" "live" "levi" "evil") ("trace" "crate" "cater" "carte" "caret ") ("regal" "large" "lager" "glare" "alger") ("neal" "lena" "lean" "lane" "elan" ) ("lange" "glean" "galen" "angle" "angel") ("elba" "bela" "bale" "able" "abel") ("tulsa" "talus" "sault" "latus") ...
PL/I
<lang PL/I>/* Search a list of words, finding those having the same letters. */
word_test: proc options (main);
declare words (50000) character (20) varying, frequency (50000) fixed binary; declare word character (20) varying; declare (i, k, wp, most) fixed binary (31);
on endfile (sysin) go to done;
words = ; frequency = 0; wp = 0; do forever; get edit (word) (L); call search_word_list (word); end;
done:
put skip list ('There are ' || wp || ' words'); most = 0; /* Determine the word(s) having the greatest number of anagrams. */ do i = 1 to wp; if most < frequency(i) then most = frequency(i); end; put skip edit ('The following word(s) have ', trim(most), ' anagrams:') (a); put skip; do i = 1 to wp; if most = frequency(i) then put edit (words(i)) (x(1), a); end;
search_word_list: procedure (word) options (reorder);
declare word character (*) varying; declare i fixed binary (31);
do i = 1 to wp; if length(words(i)) = length(word) then if is_anagram(word, words(i)) then do; frequency(i) = frequency(i) + 1; return; end; end; /* The word does not exist in the list, so add it. */ if wp >= hbound(words,1) then return; wp = wp + 1; words(wp) = word; frequency(wp) = 1; return;
end search_word_list;
/* Returns true if the words are anagrams, otherwise returns false. */ is_anagram: procedure (word1, word2) returns (bit(1)) options (reorder);
declare (word1, word2) character (*) varying; declare tword character (20) varying, c character (1); declare (i, j) fixed binary;
tword = word2; do i = 1 to length(word1); c = substr(word1, i, 1); j = index(tword, c); if j = 0 then return ('0'b); substr(tword, j, 1) = ' '; end; return ('1'b);
end is_anagram;
end word_test;</lang> Output:
There are 23565 words The following word(s) have 5 anagrams: abel alger angel caret elan evil
PowerShell
<lang powershell>$c = New-Object Net.WebClient $words = -split ($c.DownloadString('http://www.puzzlers.org/pub/wordlists/unixdict.txt')) $top_anagrams = $words `
| ForEach-Object { $_ | Add-Member -PassThru NoteProperty Characters ` (-join (([char[]] $_) | Sort-Object)) } ` | Group-Object Characters ` | Group-Object Count ` | Sort-Object Count ` | Select-Object -First 1
$top_anagrams.Group | ForEach-Object { $_.Group -join ', ' }</lang> Output:
abel, able, bale, bela, elba alger, glare, lager, large, regal angel, angle, galen, glean, lange caret, carte, cater, crate, trace elan, lane, lean, lena, neal evil, levi, live, veil, vile
Prolog
<lang Prolog>:- use_module(library( http/http_open )).
anagrams:-
% we read the URL of the words
http_open('http://www.puzzlers.org/pub/wordlists/unixdict.txt', In, []), read_file(In, [], Out), close(In),
% we get a list of pairs key-value where key = a-word value = <list-of-its-codes> % this list must be sorted
msort(Out, MOut),
% in order to gather values with the same keys
group_pairs_by_key(MOut, GPL),
% we sorted this list in decreasing order of the length of values
predsort(my_compare, GPL, GPLSort),
% we extract the first 6 items
GPLSort = [_H1-T1, _H2-T2, _H3-T3, _H4-T4, _H5-T5, _H6-T6 | _],
% Tnn are lists of codes (97 for 'a'), we create the strings
maplist(maplist(atom_codes), L, [T1, T2, T3, T4, T5, T6] ),
maplist(writeln, L).
read_file(In, L, L1) :-
read_line_to_codes(In, W),
( W == end_of_file ->
% the file is read
L1 = L ;
% we sort the list of codes of the line
msort(W, W1),
% to create the key in alphabetic order
atom_codes(A, W1),
% and we have the pair Key-Value in the result list
read_file(In, [A-W | L], L1)).
% predicate for sorting list of pairs Key-Values % if the lentgh of values is the same % we sort the keys in alhabetic order my_compare(R, K1-V1, K2-V2) :- length(V1, L1), length(V2, L2), ( L1 < L2 -> R = >; L1 > L2 -> R = <; compare(R, K1, K2)).</lang> The result is
[abel,able,bale,bela,elba] [caret,carte,cater,crate,trace] [angel,angle,galen,glean,lange] [alger,glare,lager,large,regal] [elan,lane,lean,lena,neal] [evil,levi,live,veil,vile] true
PureBasic
<lang PureBasic>InitNetwork() ; OpenConsole()
Procedure.s sortWord(word$)
len.i = Len(word$) Dim CharArray.s (len) For n = 1 To len ; Transfering each single character CharArray(n) = Mid(word$, n, 1) ; of the word into an array. Next SortArray(CharArray(),#PB_Sort_NoCase ) ; Sorting the array. word$ ="" For n = 1 To len ; Writing back each single word$ + CharArray(n) ; character of the array. Next ProcedureReturn word$
EndProcedure
tmpdir$ = GetTemporaryDirectory()
filename$ = tmpdir$ + "unixdict.txt"
Structure ana
isana.l anas.s
EndStructure
NewMap anaMap.ana()
If ReceiveHTTPFile("http://www.puzzlers.org/pub/wordlists/unixdict.txt", filename$)
If ReadFile(1, filename$) Repeat word$ = (ReadString(1)) ; Reading a word from a file. key$ = (sortWord(word$)) ; Sorting the word and storing in key$. If FindMapElement(anaMap(), key$) ; Looking up if a word already had the same key$. ; if yes anaMap()\anas = anaMap()\anas+ ", " + word$ ; adding the word anaMap()\isana + 1 Else ; if no anaMap(key$)\anas = word$ ; applying a new record anaMap()\isana + 1 EndIf Until Eof(1) CloseFile(1) DeleteFile(filename$) ;----- output ----- ForEach anaMap() If anaMap()\isana >= 4 ; only emit what had 4 or more hits. PrintN(anaMap()\anas) EndIf Next PrintN("Press any key"): Repeat: Until Inkey() <> "" EndIf
EndIf </lang>
Python
Python 3.2 shell input (IDLE) <lang python>>>> import urllib.request >>> from collections import defaultdict >>> words = urllib.request.urlopen('http://www.puzzlers.org/pub/wordlists/unixdict.txt').read().split() >>> anagram = defaultdict(list) # map sorted chars to anagrams >>> for word in words: anagram[tuple(sorted(word))].append( word )
>>> count = max(len(ana) for ana in anagram.values())
>>> for ana in anagram.values():
if len(ana) >= count:
print ([x.decode() for x in ana])</lang>
Python 3.2.1 groupby (in place sort instead of max) <lang python>import urllib.request, itertools import time words = urllib.request.urlopen('http://www.puzzlers.org/pub/wordlists/unixdict.txt').read().split() print('Words ready') t0 = time.clock() anagrams = [list(g) for k,g in itertools.groupby(sorted(words, key=sorted), key=sorted)] anagrams.sort(key=len, reverse=True) count = len(anagrams[0]) for ana in anagrams:
if len(ana) < count: break print(ana)
t0 -= time.clock() print('Finished in %f s' % -t0)</lang>
Python 2.5 shell input (IDLE) <lang python>>>> import urllib >>> from collections import defaultdict >>> words = urllib.urlopen('http://www.puzzlers.org/pub/wordlists/unixdict.txt').read().split() >>> len(words) 25104 >>> anagram = defaultdict(list) # map sorted chars to anagrams >>> for word in words: anagram[tuple(sorted(word))].append( word )
>>> count = max(len(ana) for ana in anagram.itervalues())
>>> for ana in anagram.itervalues():
if len(ana) >= count:
print ana
['angel', 'angle', 'galen', 'glean', 'lange']
['alger', 'glare', 'lager', 'large', 'regal']
['caret', 'carte', 'cater', 'crate', 'trace']
['evil', 'levi', 'live', 'veil', 'vile']
['elan', 'lane', 'lean', 'lena', 'neal']
['abel', 'able', 'bale', 'bela', 'elba']
>>> count
5
>>></lang>
sort and then group using groupby()
<lang python>>>> import urllib, itertools >>> words = urllib.urlopen('http://www.puzzlers.org/pub/wordlists/unixdict.txt').read().split() >>> len(words) 25104 >>> anagrams = [list(g) for k,g in itertools.groupby(sorted(words, key=sorted), key=sorted)]
>>> count = max(len(ana) for ana in anagrams)
>>> for ana in anagrams:
if len(ana) >= count:
print ana
['abel', 'able', 'bale', 'bela', 'elba']
['caret', 'carte', 'cater', 'crate', 'trace']
['angel', 'angle', 'galen', 'glean', 'lange']
['alger', 'glare', 'lager', 'large', 'regal']
['elan', 'lane', 'lean', 'lena', 'neal']
['evil', 'levi', 'live', 'veil', 'vile']
>>> count
5
>>></lang>
R
<lang R>words <- readLines("http://www.puzzlers.org/pub/wordlists/unixdict.txt") word_group <- sapply(
strsplit(words, split=""), # this will split all words to single letters... function(x) paste(sort(x), collapse="") # ...which we sort and paste again
)
counts <- tapply(words, word_group, length) # group words by class to get number of anagrams anagrams <- tapply(words, word_group, paste, collapse=", ") # group to get string with all anagrams
- Results
table(counts) counts
1 2 3 4 5
22263 1111 155 31 6
anagrams[counts == max(counts)]
abel acert "abel, able, bale, bela, elba" "caret, carte, cater, crate, trace" aegln aeglr
"angel, angle, galen, glean, lange" "alger, glare, lager, large, regal"
aeln eilv "elan, lane, lean, lena, neal" "evil, levi, live, veil, vile" </lang>
Racket
<lang racket>
- lang racket
(require net/url)
(define (get-lines url-string)
(define port (get-pure-port (string->url url-string))) (for/list ([l (in-lines port)]) l))
(define (hash-words words)
(for/fold ([ws-hash (hash)]) ([w words]) (hash-update ws-hash (list->string (sort (string->list w) < #:key (λ (c) (char->integer c)))) (λ (ws) (cons w ws)) (λ () '()))))
(define (get-maxes h)
(define max-ws (apply max (map length (hash-values h)))) (define max-keys (filter (λ (k) (= (length (hash-ref h k)) max-ws)) (hash-keys h))) (map (λ (k) (hash-ref h k)) max-keys))
(get-maxes (hash-words (get-lines "http://www.puzzlers.org/pub/wordlists/unixdict.txt"))) </lang> Output:
'(("neal" "lena" "lean" "lane" "elan") ("trace" "crate" "cater" "carte" "caret") ("regal" "large" "lager" "glare" "alger") ("elba" "bela" "bale" "able" "abel") ("lange" "glean" "galen" "angle" "angel") ("vile" "veil" "live" "levi" "evil"))
Rascal
<lang rascal>import Prelude;
list[str] OrderedRep(str word){ return sort([word[i] | i <- [0..size(word)-1]]); } public list[set[str]] anagram(){ allwords = readFileLines(|http://www.puzzlers.org/pub/wordlists/unixdict.txt%7C); AnagramMap = invert((word : OrderedRep(word) | word <- allwords)); longest = max([size(group) | group <- range(AnagramMap)]); return [AnagramMap[rep]| rep <- AnagramMap, size(AnagramMap[rep]) == longest]; }</lang> Returns: <lang rascal>value: [
{"glean","galen","lange","angle","angel"}, {"glare","lager","regal","large","alger"}, {"carte","trace","crate","caret","cater"}, {"lane","lena","lean","elan","neal"}, {"able","bale","abel","bela","elba"}, {"levi","live","vile","evil","veil"}
]</lang>
REXX
version 1.1, idomatic
This version doesn't assume that the dictionary is in alphabetical order, nor
does it assume the words are in any specific case (lower/upper/mixed).
<lang rexx>/*REXX program finds words with the largest set of anagrams (same size).*/
iFID='unixdict.txt' /*input file identifier, # words.*/
hc=; !.=; #.=0; w=0; words=0; most=0 /*initialize some REXX variables.*/
/* [↓] read entire file by line.*/ do recs=0 while lines(iFID)\==0 /*Got data? Then read a record.*/ x=space(linein(iFID),0) /*pick off a word from the input.*/ L=length(x); if L<3 then iterate /*onesies and twosies can't win. */ if \datatype(x,'M') then iterate /*filter out nonanagramable words*/ words=words+1 /*count of (useable) words. */ z=sortA(x) /*sort the letters in the word. */ !.z=!.z x; #.z=#.z+1 /*append it to !.z, bump the ctr.*/ if #.z>most then do; hc=z; most=#.z; if L>w then w=L; iterate; end if #.z==most then hc=hc z /*append sorted word─►max anagram*/ end /*recs*/ /*hc◄─list of high count anagrams.*/
say '──────────────────────────────' recs 'words in the dictionary file: ' iFID say
do m=1 for words(hc); z=subword(hc,m,1) /*high count anagrams*/ say ' ' left(subword(!.z,1,1),w) ' [anagrams: ' subword(!.z,2)"]" end /*m*/ /* W is the maximum width word. */
say say '───── Found' words(hc) "words (each of which have" #.z-1 'anagrams).' exit /*stick a fork in it, we're done.*/ /*──────────────────────────────────SORTA subroutine────────────────────*/ sortA: procedure; arg char +1 xx _. /*get 1st letter of arg, _.=null.*/ _.char=char /*no need to concatenate 1st char*/
/*[↓] put letters alphabetically.*/ do length(xx); parse var xx char +1 xx; _.char=_.char||char; end /*reassemble word, sorted letters*/
return _.a||_.b||_.c||_.d||_.e||_.f||_.g||_.h||_.i||_.j||_.k||_.l||_.m||,
_.n||_.o||_.p||_.q||_.r||_.s||_.t||_.u||_.v||_.w||_.x||_.y||_.z</lang>
Programming note: the long (wide) assignment for return _.a||... could've been coded as an elegant do loop instead of hardcoding 26 letters,
but since the dictionary (word list) is rather large, a rather expaciated method was used for speed.
output when using the default input (dictionary)
────────────────────────────── 25104 words in the dictionary file: unixdict.txt abel [anagrams: able bale bela elba] angel [anagrams: angle galen glean lange] elan [anagrams: lane lean lena neal] alger [anagrams: glare lager large regal] caret [anagrams: carte cater crate trace] evil [anagrams: levi live veil vile] ───── Found 6 words (each of which have 4 anagrams).
version 1.2, optimized
This optimized version eliminates the sortA subroutine and puts that subroutine's code in-line. <lang rexx>/*REXX program finds words with the largest set of anagrams (same size).*/ iFID='unixdict.txt' /*input file identifier, # words.*/ hc=; !.=; #.=0; w=0; words=0; most=0 /*initialize some REXX variables.*/
/* [↓] read entire file by line.*/ do recs=0 while lines(iFID)\==0 /*Got data? Then read a record.*/ x=space(linein(iFID),0) /*pick off a word from the input.*/ L=length(x); if L<3 then iterate /*onesies and twosies can't win. */ if \datatype(x,'M') then iterate /*filter out nonanagramable words*/ words=words+1 /*count of (useable) words. */ parse upper var x y +1 u _. /*get uppercase X & nullify "_." */ xx='?'y; _.xx=y /*get 1st letter (special case).*/ /*[↓] put letters alphabetically.*/ do length(u); parse var u y +1 u; xx='?'y; _.xx=_.xx||y; end /*reassemble word, sorted letters*/ z=_.?a||_.?b||_.?c||_.?d||_.?e||_.?f||_.?g||_.?h||_.?i||_.?j||_.?k||_.?l||_.?m||, _.?n||_.?o||_.?p||_.?q||_.?r||_.?s||_.?t||_.?u||_.?v||_.?w||_.?x||_.?y||_.?z !.z=!.z x; #.z=#.z+1 /*append it to !.z, bump the ctr.*/ if #.z>most then do; hc=z; most=#.z; if L>w then w=L; iterate; end if #.z==most then hc=hc z /*append sorted word─►hc anagrams*/ end /*recs*/ /*hc◄─list of high count anagrams*/
say '──────────────────────────────' recs 'words in the dictionary file: ' iFID say
do m=1 for words(hc); z=subword(hc,m,1) /*high count anagrams*/ say ' ' left(subword(!.z,1,1),w) ' [anagrams: ' subword(!.z,2)"]" end /*m*/ /* W is the maximum width word. */
say say '───── Found' words(hc) "words (each of which have" #.z-1 'anagrams).'
/*stick a fork in it, we're done.*/</lang>
output is the same as version 1.1
Programming note: the above REXX programs adopted the method that the REXX version 2 uses for extracting each character of a word.
The method is more obtuse, but when invoking the routine tens of thousands of times, this faster method lends itself to heavy use.
version 1.3, faster
<lang rexx>/*REXX program finds words with the largest set of anagrams (same size).*/ iFID='unixdict.txt' /*input file identifier, # words.*/ hc=; !.=; #.=0; ww=0; words=0; most=0 /*initialize some REXX variables.*/
/* [↓] read entire file by line.*/ do recs=0 while lines(iFID)\==0 /*Got data? Then read a record.*/ @=space(linein(iFID),0) /*pick off a word from the input.*/ LL=length(@); if LL<3 then iterate /*onesies and twosies can't win. */ if \datatype(@,'M') then iterate /*exclude non-anagramable words. */ words=words+1 /*count of (useable) words. */ parse upper var @ _ +1 xx _. /*get uppercase @ & nullify "_." */ _._=_ /*get 1st letter (special case).*/ /*[↓] put letters alphabetically.*/ do LL-1; parse var xx _ +1 xx; _._=_._||_; end /*rest of word.*/ /*reassemble word, sorted letters*/ zz=_.a||_.b||_.c||_.d||_.e||_.f||_.g||_.h||_.i||_.j||_.k||_.l||_.m||, _.n||_.o||_.p||_.q||_.r||_.s||_.t||_.u||_.v||_.w||_.x||_.y||_.z !.zz=!.zz @; #.zz=#.zz+1 /*append it to !.zz, bump the ctr.*/ if #.zz>most then do; hc=zz; most=#.zz; if LL>ww then ww=LL; iterate; end if #.zz==most then hc=hc zz /*append sorted word─►hc anagrams*/ end /*recs*/ /*this loop can't have 1-letter vars.*/
say '──────────────────────────────' recs 'words in the dictionary file: ' iFID say
do m=1 for words(hc); z=subword(hc,m,1) /*high count anagrams*/ say ' ' left(subword(!.z,1,1),ww) ' [anagrams: ' subword(!.z,2)"]" end /*m*/ /* WW is the maximum width word. */
say say '───── Found' words(hc) "words (each of which have" #.z-1 'anagrams).'
/*stick a fork in it, we're done.*/</lang>
output is the same as version 1.1
Programming note: the do LL-1 loop takes a lexalphabetical shortcut and uses one-letter indices
(instead of compounded letters), which eliminates the need for setting the xx variable in version 1.2.
Howerver, the precludes the use of one-letter REXX variables in the main do recs loop,
so this method is discouraged for less maintainability and the changing or adding of new code.
Timing notes:
- REXX version 1.2 is about 17% faster than version 1.1
- REXX version 1.3 is about 19% faster than version 1.1
annotated version using PARSE
<lang rexx>u='Halloween' /*the word to be sorted by letter*/ upper u /*fast method to uppercase a var.*/
/*another: u = translate(u) */ /*another: parse upper var u u */ /*another: u = upper(u) */ /*not always available [↑] */
say 'u=' u _.=
do until u== /*keep truckin' until U is null.*/ parse var u y +1 u /*get the next (first) char in U.*/ xx = '?'y /*assign a prefixed char to XX. */ _.xx = _.xx || y /*append it to all the Y chars.*/ end /*until*/ /*U now has the first char gone.*/ /*Note: the var U is destroyed.*/
/* [↓] build sorted letter word. */
z=_.?a||_.?b||_.?c||_.?d||_.?e||_.?f||_.?g||_.?h||_.?i||_.?j||_.?k||_.?l||_.?m||,
_.?n||_.?o||_.?p||_.?q||_.?r||_.?s||_.?t||_.?u||_.?v||_.?w||_.?x||_.?y||_.?z
/*Note: the ? is prefixed to the letter to avoid */ /*collisions with other REXX one-character variables.*/
say 'z=' z</lang> output
u= HALLOWEEN z= AEEHLLNOW
annotated version using a DO loop
<lang rexx>u='Halloween' /*the word to be sorted by letter*/ upper u /*fast method to uppercase a var.*/ L=length(u) /*get the length of the word. */ say 'u=' u say 'L=' L _.=
do k=1 for L /*keep truckin' for L chars. */ y = substr(u,k,1) /*get the next character in U. */ xx = '?'y /*assign a prefixed char to XX. */ _.xx = _.xx || y /*append it to all the Y chars.*/ end /*do k*/ /*U now has the first char gone.*/
/* [↓] build sorted letter word. */
z=_.?a||_.?b||_.?c||_.?d||_.?e||_.?f||_.?g||_.?h||_.?i||_.?j||_.?k||_.?l||_.?m||,
_.?n||_.?o||_.?p||_.?q||_.?r||_.?s||_.?t||_.?u||_.?v||_.?w||_.?x||_.?y||_.?z
say 'z=' z</lang> output
u= HALLOWEEN L= 9 z= AEEHLLNOW
version 2
<lang rexx>/*REXX program finds words with the largest set of anagrams (same size)
- 07.08.2013 Walter Pachl
- sorta for word compression courtesy Gerard Schildberger,
- modified, however, to obey lowercase
- 10.08.2013 Walter Pachl take care of mixed case dictionary
- following Version 1's method
- /
Parse Value 'A B C D E F G H I J K L M N O P Q R S T U V W X Y Z',
With a b c d e f g h i j k l m n o p q r s t u v w x y z
Call time 'R' ifid='unixdict.txt' /* input file identifier */ words=0 /* number of usable words */ maxl=0 /* maximum number of anagrams */ wl.= /* wl.ws words that have ws */ Do ri=1 By 1 While lines(ifid)\==0 /* read each word in file */
word=space(linein(ifid),0) /* pick off a word from the input.*/ If length(word)<3 Then /* onesies and twosies can't win. */ Iterate If\datatype(word,'M') Then /* not an anagramable word */ Iterate words=words+1 /* count of (useable) words. */ ws=sorta(word) /* sort the letters in the word. */ wl.ws=wl.ws word /* add word to list of ws */ wln=words(wl.ws) /* number of anagrams with ws */ Select When wln>maxl Then Do /* a new maximum */ maxl=wln /* use this */ wsl=ws /* list of resulting ws values */ End When wln=maxl Then /* same as the one found */ wsl=wsl ws /* add ws to the list */ Otherwise /* shorter */ Nop /* not yet of interest */ End End
Say ' ' Say copies('-',10) ri-1 'words in the dictionary file: ' ifid Say copies(' ',10) words 'thereof are anagram candidates' Say ' ' Say 'There are' words(wsl) 'set(s) of anagrams with' maxl,
'elements each:'
Say ' ' Do while wsl<>
Parse Var wsl ws wsl Say ' 'wl.ws End
Say time('E') Exit sorta: /**********************************************************************
- sort the characters in word_p (lowercase translated to uppercase)
- 'chARa' -> 'AACHR'
- /
Parse Upper Arg word_p c.= Do While word_p> Parse Var word_p cc +1 word_p c.cc=c.cc||cc End Return c.a||c.b||c.c||c.d||c.e||c.f||c.g||c.h||c.i||c.j||c.k||c.l||, c.m||c.n||c.o||c.p||c.q||c.r||c.s||c.t||c.u||c.v||c.w||c.x||c.y||c.z</lang>
Output:
---------- 25108 words in the dictionary file: unixdict.txt 24819 thereof are anagram candidates There are 6 set(s) of anagrams with 5 elements each: abel able bale bela elba angel angle galen glean lange elan lane lean lena neal alger glare lager large regal caret carte cater crate trace evil levi live veil vile 1.170000
Ruby
<lang ruby>require 'open-uri'
anagram = Hash.new {|hash, key| hash[key] = []} # map sorted chars to anagrams
open('http://www.puzzlers.org/pub/wordlists/unixdict.txt') do |f|
words = f.read.split for word in words anagram[word.split().sort] << word end
end
count = anagram.values.map {|ana| ana.length}.max anagram.each_value do |ana|
if ana.length >= count p ana end
end</lang> Output:
["evil", "levi", "live", "veil", "vile"] ["abel", "able", "bale", "bela", "elba"] ["elan", "lane", "lean", "lena", "neal"] ["alger", "glare", "lager", "large", "regal"] ["angel", "angle", "galen", "glean", "lange"] ["caret", "carte", "cater", "crate", "trace"]
Short version (with lexical ordered result).
<lang ruby>require 'open-uri'
anagrams = open('http://www.puzzlers.org/pub/wordlists/unixdict.txt'){%7Cf%7C f.read.split.group_by{|w| w.each_char.sort} } anagrams.values.group_by(&:size).max.last.each{|group| puts group.join(", ") } </lang>
- Output:
abel, able, bale, bela, elba alger, glare, lager, large, regal angel, angle, galen, glean, lange caret, carte, cater, crate, trace elan, lane, lean, lena, neal evil, levi, live, veil, vile
Scala
<lang scala>val src = io.Source fromURL "http://www.puzzlers.org/pub/wordlists/unixdict.txt" val vls = src.getLines.toList.groupBy(_.sorted).values val max = vls.map(_.size).max vls filter (_.size == max) map (_ mkString " ") mkString "\n"</lang> Output:
abel able bale bela elba angel angle galen glean lange evil levi live veil vile alger glare lager large regal elan lane lean lena neal caret carte cater crate trace
Another take: <lang scala>Source
.fromURL("http://www.puzzlers.org/pub/wordlists/unixdict.txt").getLines.toList .groupBy(_.sorted).values .groupBy(_.size).maxBy(_._1)._2 .map(_.mkString("\t")) .foreach(println)</lang>
Prints:
abel able bale bela elba angel angle galen glean lange evil levi live veil vile alger glare lager large regal elan lane lean lena neal caret carte cater crate trace
Seed7
<lang seed7>$ include "seed7_05.s7i";
include "gethttp.s7i"; include "strifile.s7i";
const type: anagramHash is hash [string] array string;
const func string: sort (in string: stri) is func
result var string: sortedStri is ""; local var integer: i is 0; var integer: j is 0; var char: ch is ' '; begin sortedStri := stri; for i range 1 to length(sortedStri) do for j range succ(i) to length(sortedStri) do if sortedStri[i] > sortedStri[j] then ch := sortedStri[i]; sortedStri @:= [i] sortedStri[j]; sortedStri @:= [j] ch; end if; end for; end for; end func;
const proc: main is func
local var file: dictFile is STD_NULL; var string: word is ""; var string: sortedLetters is ""; var anagramHash: anagrams is anagramHash.value; var integer: length is 0; var integer: maxLength is 0; begin dictFile := openStrifile(getHttp("www.puzzlers.org/pub/wordlists/unixdict.txt")); while hasNext(dictFile) do readln(dictFile, word); sortedLetters := sort(word); if sortedLetters in anagrams then anagrams[sortedLetters] &:= word; else anagrams @:= [sortedLetters] [] (word); end if; length := length(anagrams[sortedLetters]); if length > maxLength then maxLength := length; end if; end while; close(dictFile); for sortedLetters range sort(keys(anagrams)) do if length(anagrams[sortedLetters]) = maxLength then writeln(join(anagrams[sortedLetters], ", ")); end if; end for; end func;</lang>
Output:
abel, able, bale, bela, elba caret, carte, cater, crate, trace angel, angle, galen, glean, lange alger, glare, lager, large, regal elan, lane, lean, lena, neal evil, levi, live, veil, vile
SETL
<lang SETL>h := open('unixdict.txt', "r"); anagrams := {}; while not eof(h) loop
geta(h, word); if word = om or word = "" then continue; end if; sorted := insertion_sort(word); anagrams{sorted} with:= word;
end loop;
max_size := 0; max_words := {}; for words = anagrams{sorted} loop
size := #words; if size > max_size then max_size := size; max_words := {words}; elseif size = max_size then max_words with:= words; end if;
end loop;
for w in max_words loop
print(w);
end loop;
-- GNU SETL has no built-in sort() procedure insertion_sort(A);
for i in [2..#A] loop v := A(i); j := i-1; while j >= 1 and A(j) > v loop A(j+1) := A(j); j := j - 1; end loop; A(j+1) := v; end loop; return A;
end procedure;</lang> Output:
{abel able bale bela elba} {alger glare lager large regal} {angel angle galen glean lange} {caret carte cater crate trace} {elan lane lean lena neal} {evil levi live veil vile}
Smalltalk
<lang Smalltalk>list:= (FillInTheBlank request: 'myMessageBoxTitle') subStrings: String crlf. dict:= Dictionary new. list do: [:val| (dict at: val copy sort ifAbsent: [dict at: val copy sort put: OrderedCollection new]) add: val. ]. sorted:=dict asSortedCollection: [:a :b| a size > b size].</lang> Documentation:
First ask the user for the list. Then create an empty dictionary (a Map). Which maps strings as keys to OrderedCollections as values. For each entry in the list add an entry to the OrderedCollection under the key of the sorted string (and create a new empty OC if there was no previous entry). Then create a SortedCollection sorting by comparing the sizes of the OrderedCollections. The first 6 entries are: an OrderedCollection('evil' 'levi' 'live' 'veil' 'vile') an OrderedCollection('angel' 'angle' 'galen' 'glean' 'lange') an OrderedCollection('alger' 'glare' 'lager' 'large' 'regal') an OrderedCollection('caret' 'carte' 'cater' 'crate' 'trace') an OrderedCollection('abel' 'able' 'bale' 'bela' 'elba') an OrderedCollection('elan' 'lane' 'lean' 'lena' 'neal')
instead of asking for the strings, read the file: <lang smalltalk>d := Dictionary new. 'unixdict.txt' asFilename
readingLinesDo:[:eachWord | (d at:eachWord copy sort ifAbsentPut:[OrderedCollection new]) add:eachWord ].
((d values select:[:s | s size > 1])
sortBySelector:#size) reverse do:[:s | s printCR]</lang>
Output:
OrderedCollection('angel' 'angle' 'galen' 'glean' 'lange') OrderedCollection('abel' 'able' 'bale' 'bela' 'elba') OrderedCollection('elan' 'lane' 'lean' 'lena' 'neal') OrderedCollection('caret' 'carte' 'cater' 'crate' 'trace') OrderedCollection('evil' 'levi' 'live' 'veil' 'vile') OrderedCollection('alger' 'glare' 'lager' 'large' 'regal') OrderedCollection('mate' 'meat' 'tame' 'team') ...
not sure if getting the dictionary via http is part of the task; if so, replace the file-reading with: <lang smalltalk>'http://www.puzzlers.org/pub/wordlists/unixdict.txt' asURI contents asCollectionOfLines do:[:eachWord | ...</lang>
SNOBOL4
Note: unixdict.txt is passed in locally via STDIN. Newlines must be converted for Win/DOS environment. <lang SNOBOL4>* # Sort letters of word
define('sortw(str)a,i,j') :(sortw_end)
sortw a = array(size(str)) sw1 i = i + 1; str len(1) . a = :s(sw1)
a = sort(a)
sw2 j = j + 1; sortw = sortw a<j> :s(sw2)f(return) sortw_end
- # Count words in string
define('countw(str)') :(countw_end)
countw str break(' ') span(' ') = :f(return)
countw = countw + 1 :(countw)
countw_end
ana = table()
L1 wrd = input :f(L2) ;* unixdict.txt from stdin
sw = sortw(wrd); ana<sw> = ana<sw> wrd ' ' cw = countw(ana<sw>); max = gt(cw,max) cw i = i + 1; terminal = eq(remdr(i,1000),0) wrd :(L1)
L2 kv = convert(ana,'array') L3 j = j + 1; key = kv<j,1>; val = kv<j,2> :f(end)
output = eq(countw(val),max) key ': ' val :(L3)
end</lang> Output:
abel: abel able bale bela elba aeglr: alger glare lager large regal aegln: angel angle galen glean lange acert: caret carte cater crate trace aeln: elan lane lean lena neal eilv: evil levi live veil vile
Tcl
<lang tcl>package require Tcl 8.5 package require http
set url http://www.puzzlers.org/pub/wordlists/unixdict.txt set response [http::geturl $url] set data [http::data $response] http::cleanup $response
set max 0 array set anagrams {}
foreach line [split $data \n] {
foreach word [split $line] { set anagram [join [lsort [split $word ""]] ""] lappend anagrams($anagram) $word set max [::tcl::mathfunc::max $max [llength $anagrams($anagram)]] }
}
foreach key [array names anagrams] {
if {[llength $anagrams($key)] == $max} { puts $anagrams($key) }
}</lang> Outputs:
evil levi live veil vile caret carte cater crate trace abel able bale bela elba elan lane lean lena neal angel angle galen glean lange alger glare lager large regal
TUSCRIPT
<lang tuscript>$$ MODE TUSCRIPT,{} requestdata = REQUEST ("http://www.puzzlers.org/pub/wordlists/unixdict.txt")
DICT anagramm CREATE 99999
COMPILE
LOOP word=requestdata -> ? : any character charsInWord=STRINGS (word," ? ") charString =ALPHA_SORT (charsInWord) DICT anagramm APPEND/QUIET/COUNT charString,num,freq,word;" " ENDLOOP
DICT anagramm UNLOAD charString,all,freq,anagrams
index =DIGIT_INDEX (freq) reverseIndex =REVERSE (index) freq =INDEX_SORT (freq,reverseIndex) anagrams =INDEX_SORT (anagrams,reverseIndex) charString =INDEX_SORT (charString,reverseIndex)
mostWords=SELECT (freq,1), adjust=MAX_LENGTH (charString)
LOOP cs=charString, f=freq, a=anagrams IF (f<mostWords) EXIT cs=CENTER (cs,-adjust) PRINT cs," ",f,": ",a ENDLOOP
ENDCOMPILE</lang> Output:
e'i'l'v 5: evil levi live veil vile a'e'l'n 5: elan lane lean lena neal a'c'e'r't 5: caret carte cater crate trace a'e'g'l'n 5: angel angle galen glean lange a'e'g'l'r 5: alger glare lager large regal a'b'e'l 5: abel able bale bela elba
Ursala
Supplying the input file on the command line during compilation makes its contents accessible as a pre-declared identifier. The algorithm is to group the words together that are made from the same unordered lists of letters, then collect the groups together that have the same number of words in them, and then show the collection associated with the highest number. <lang Ursala>#import std
- show+
anagrams = mat` * leql$^&h eql|=@rK2tFlSS ^(~&,-<&)* unixdict_dot_txt</lang> output:
evil levi live veil vile caret carte cater crate trace alger glare lager large regal elan lane lean lena neal angel angle galen glean lange abel able bale bela elba
Vedit macro language
This implementation first sorts characters of each word using Insertion sort in subroutine SORT_LETTERS.
Then the word list is sorted using built-in Sort function.
Finally, groups of words are analyzed and largest groups are recorded.
The word list is expected to be in the same directory as the script. <lang vedit>File_Open("|(PATH_ONLY)\unixdict.txt")
Repeat(ALL) {
Reg_Copy_Block(10, CP, EOL_Pos) // original word Call("SORT_LETTERS") // sort letters of the word EOL IC(' ') Reg_Ins(10) // add the original word at eol Line(1, ERRBREAK)
}
Sort(0, File_Size) // sort list according to anagrams
BOF Search("|F") Search(' ') // first word in the list Reg_Copy_Block(10, BOL_Pos, CP+1) // reg 10 = sorted anagram word Reg_Copy_Block(11, CP, EOL_Pos) // reg 11 = list of words in current group Reg_Empty(12) // reg 12 = list of words in largest groups Reg_Set(13, " ")
- 1 = 1 // words in this group
- 2 = 2 // words in largest group found
Repeat(ALL) {
Line(1, ERRBREAK) if (Match(@10, ADVANCE) == 0) { // same group as previous word? Reg_Copy_Block(11, CP-1, EOL_Pos, APPEND) // add word to this group #1++ } else { // different anagram group Search(" ", ERRBREAK) if (#1 == #2) { // same size as the largest? Reg_Set(12, @13, APPEND) // append newline Reg_Set(12, @11, APPEND) // append word list } if (#1 > #2) { // new larger size of group Reg_Set(12, @11) // replace word list #2 = #1 } Reg_Copy_Block(10, BOL_Pos, CP+1) Reg_Copy_Block(11, CP, EOL_Pos) // first word of new group #1 = 1 }
}
Buf_Quit(OK) // close word list file Buf_Switch(Buf_Free) // output results in a new edit buffer Reg_Ins(12) // display all groups of longest anagram words Return
//////////////////////////////////////////////////////////////////// // // Sort characters in current line using Insertion sort //
- SORT_LETTERS:
GP(EOL_pos) #9 = Cur_Col-1 for (#1 = 2; #1 <= #9; #1++) {
Goto_Col(#1) #8 = Cur_Char #2 = #1 while (#2 > 1) { #7 = Cur_Char(-1) if (#7 <= #8) { break } Ins_Char(#7, OVERWRITE) #2-- Goto_Col(#2) } Ins_Char(#8, OVERWRITE)
} return</lang> Output:
abel able bale bela elba caret carte cater crate trace angel angle galen glean lange alger glare lager large regal elan lane lean lena neal evil levi live veil vile
Visual Basic .NET
<lang vbnet>Imports System.IO Imports System.Collections.ObjectModel
Module Module1
Dim sWords As New Dictionary(Of String, Collection(Of String))
Sub Main()
Dim oStream As StreamReader = Nothing Dim sLines() As String = Nothing Dim sSorted As String = Nothing Dim iHighCount As Integer = 0 Dim iMaxKeyLength As Integer = 0 Dim sOutput As String = ""
oStream = New StreamReader("unixdict.txt") sLines = oStream.ReadToEnd.Split(New String() {vbCrLf}, StringSplitOptions.RemoveEmptyEntries) oStream.Close()
For i As Integer = 0 To sLines.GetUpperBound(0) sSorted = SortCharacters(sLines(i))
If Not sWords.ContainsKey(sSorted) Then sWords.Add(sSorted, New Collection(Of String))
sWords(sSorted).Add(sLines(i))
If sWords(sSorted).Count > iHighCount Then iHighCount = sWords(sSorted).Count
If sSorted.Length > iMaxKeyLength Then iMaxKeyLength = sSorted.Length End If Next
For Each sKey As String In sWords.Keys If sWords(sKey).Count = iHighCount Then sOutput &= "[" & sKey.ToUpper & "]" & Space(iMaxKeyLength - sKey.Length + 1) & String.Join(", ", sWords(sKey).ToArray()) & vbCrLf End If Next
Console.WriteLine(sOutput) Console.ReadKey()
End Sub
Private Function SortCharacters(ByVal s As String) As String
Dim sReturn() As Char = s.ToCharArray() Dim sTemp As Char = Nothing
For i As Integer = 0 To sReturn.GetUpperBound(0) - 1 If (sReturn(i + 1)) < (sReturn(i)) Then sTemp = sReturn(i) sReturn(i) = sReturn(i + 1) sReturn(i + 1) = sTemp i = -1 End If Next
Return CStr(sReturn)
End Function
End Module</lang> Output:
[ABEL] abel, able, bale, bela, elba [AEGLR] alger, glare, lager, large, regal [AEGLN] angel, angle, galen, glean, lange [ACERT] caret, carte, cater, crate, trace [AELN] elan, lane, lean, lena, neal [EILV] evil, levi, live, veil, vile
- Programming Tasks
- Text processing
- ABAP
- Ada
- AutoHotkey
- AutoHotkey examples needing attention
- Examples needing attention
- AWK
- BBC BASIC
- Bracmat
- C
- C++
- C sharp
- Clojure
- CoffeeScript
- Common Lisp
- DRAKMA
- Component Pascal
- D
- E
- Elena
- Erlang
- Euphoria
- F Sharp
- FBSL
- Factor
- Fantom
- Fortran
- GAP
- Go
- Groovy
- Haskell
- Icon
- Unicon
- J
- Java
- JavaScript
- Spidermonkey examples needing attention
- Julia
- K
- Lasso
- Liberty BASIC
- LiveCode
- Lua
- M4
- Maple
- Mathematica
- Maxima
- MUMPS
- NetRexx
- Nimrod
- Oberon-2
- Objeck
- Objeck examples needing attention
- OCaml
- OoRexx
- Oz
- Pascal
- Perl
- Perl 6
- PHP
- PicoLisp
- PL/I
- PowerShell
- Prolog
- PureBasic
- PureBasic examples needing attention
- Python
- R
- Racket
- Rascal
- REXX
- Ruby
- Scala
- Seed7
- SETL
- Smalltalk
- SNOBOL4
- Tcl
- TUSCRIPT
- Ursala
- Vedit macro language
- PARI/GP/Omit
- Visual Basic .NET