Count occurrences of a substring: Difference between revisions

Revision as of 12:41, 16 June 2011

The task is to either create a function, or show a built-in function, that counts the number of non-overlapping occurrences of a substring inside a string. The function should take two arguments: the first is the string to search and the second is the substring to be searched for. It should return an integer count.

<lang pseudocode>print countSubstring("the three truths","th")
3

// do not count substrings that overlap with previously-counted substrings:
print countSubstring("ababababab","abab")
2</lang>
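The non-overlapping rule above can be sketched by hand as follows. This is only an illustrative sketch, not one of the page's solutions: the name `count_substring` and the decision to reject an empty substring are assumptions made for this example. After each match the search resumes just past the end of that match, which is what prevents overlapping matches from being counted.

```python
def count_substring(haystack: str, needle: str) -> int:
    """Count non-overlapping occurrences of needle in haystack."""
    if not needle:
        # Assumption for this sketch: an empty substring is rejected,
        # since the task does not define a count for it.
        raise ValueError("needle must be non-empty")
    count = 0
    start = 0
    while True:
        pos = haystack.find(needle, start)
        if pos == -1:
            return count
        count += 1
        # Resume after the end of this match so matches never overlap.
        start = pos + len(needle)


print(count_substring("the three truths", "th"))  # 3
print(count_substring("ababababab", "abab"))      # 2
```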

ALGOL 68

Works with: ALGOL 68 version Revision 1 - no extensions to language used.

Works with: ALGOL 68G version Any - tested with release 1.18.0-9h.tiny.

ALGOL 68 has no built-in function for this task, hence the need to create a count string in string routine. <lang algol68>#!/usr/local/bin/a68g --script #

PROC count string in string = (STRING needle, haystack)INT: (

 INT start:=LWB haystack, next, out:=0;
 FOR count WHILE string in string(needle, next, haystack[start:]) DO
   start+:=next+UPB needle-LWB needle;
   out:=count
 OD;
 out

);

printf(($d" "$,

 count string in string("aa", "aaaaaaa"),            # expect 3 #
 count string in string("th", "the three truths"),   # expect 3 #
 count string in string("abab", "ababababab"),       # expect 2 #
 count string in string("a*b","abaabba*bbaba*bbab"), # expect 2 #
 $l$

))</lang> Output:

3 3 2 2

Perl

This solution uses a regex, so the substring must not contain regex metacharacters: they would be interpreted as pattern syntax rather than matched literally. <lang perl>sub countSubstring {

 my ($str, $sub) = @_;
 my $count = () = $str =~ /$sub/g;
 return $count;

 # or: return scalar( () = $str =~ /$sub/g );

}

print countSubstring("the three truths","th"), "\n";
print countSubstring("ababababab","abab"), "\n";</lang>

Python

<lang python>>>> "the three truths".count("th")
3
>>> "ababababab".count("abab")
2</lang>
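`str.count` treats its argument as a literal string. If a regular expression is used instead, the substring has to be escaped first so that metacharacters such as `*` match themselves. The following is a minimal sketch under that assumption; the function name `count_substring_re` is made up for this example. `re.findall` returns non-overlapping matches, so the length of its result is the required count.

```python
import re


def count_substring_re(haystack: str, needle: str) -> int:
    """Count non-overlapping literal occurrences of needle using a regex."""
    # re.escape turns metacharacters (e.g. '*') into literal characters.
    return len(re.findall(re.escape(needle), haystack))


print(count_substring_re("the three truths", "th"))      # 3
print(count_substring_re("abaabba*bbaba*bbab", "a*b"))   # 2
```

Without `re.escape`, the pattern `a*b` would mean "zero or more `a` followed by `b`" and would give a different count for the second string.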

Tcl

The regular expression engine is ideal for this task, especially as the ***= prefix makes it interpret the rest of the argument as a literal string to match: <lang tcl>proc countSubstrings {haystack needle} {

   regexp -all ***=$needle $haystack

}
puts [countSubstrings "the three truths" "th"]
puts [countSubstrings "ababababab" "abab"]
puts [countSubstrings "abaabba*bbaba*bbab" "a*b"]</lang>

Output:

3
2
2