Substring: Difference between revisions
Review needed for C implementation.
→{{header|C}}: there is no substr() in C; re-implemented to use only one pass of O(n+m)
Revision as of 00:52, 10 August 2009
You are encouraged to solve this task according to the task description, using any language you may know.
Basic Data Operation
This is a basic data operation. It represents a fundamental action on a basic data type.
In this task display a substring:
- starting from n characters in and of m length;
- starting from n characters in, up to the end of the string;
- whole string minus last character;
- starting from a known character within the string and of m length;
- starting from a known substring within the string and of m length.
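The five requirements above can be sketched as small Python functions. This is only an illustrative sketch: the function names are hypothetical, n is assumed to be a zero-based offset (n characters are skipped), and returning None for a missing character or substring is an assumption of the sketch, not something the task specifies.

```python
def from_n_len(s, n, m):
    """m characters starting n characters in (zero-based offset, assumed)."""
    return s[n:n + m]


def from_n_to_end(s, n):
    """Everything from n characters in up to the end of the string."""
    return s[n:]


def all_but_last(s):
    """Whole string minus the last character."""
    return s[:-1]


def from_marker(s, marker, m):
    """m characters starting at the first occurrence of marker.

    marker may be a single character or a longer substring. Returns None
    when marker does not occur (an assumption of this sketch).
    """
    pos = s.find(marker)
    return None if pos < 0 else s[pos:pos + m]


if __name__ == "__main__":
    s = "abcdefgh"
    print(from_n_len(s, 2, 3))
    print(from_n_to_end(s, 2))
    print(all_but_last(s))
    print(from_marker(s, "d", 3))
    print(from_marker(s, "de", 3))
```

Because Python has no separate character type, one function covers both the "known character" and the "known substring" cases.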
Ada
<lang Ada>with Ada.Text_IO;        use Ada.Text_IO;
with Ada.Strings.Fixed;  use Ada.Strings.Fixed;

procedure Test_Slices is
   Str : constant String := "abcdefgh";
   N : constant := 2;
   M : constant := 3;
begin
   -- A slice of length M starting at N ends at N + M - 1
   Put_Line (Str (N..(N + M - 1)));
   Put_Line (Str (N..Str'Last));
   Put_Line (Str (Str'First..Str'Last - 1));
   Put_Line (Head (Tail (Str, Str'Last - Index (Str, "d", 1)), M));
   Put_Line (Head (Tail (Str, Str'Last - Index (Str, "de", 1) - 1), M));
end Test_Slices;</lang>
Sample output:
bcd
bcdefgh
abcdefg
efg
fgh
C
<lang c>#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *substring(const char *s, int n, int m)
{
  char *result;

  /* n < 0 or m < 0 is invalid */
  if (n < 0 || m < 0)
    return NULL;

  /* make sure string does not end before n
   * and advance the "s" pointer to beginning of substring */
  for ( ; n > 0; s++, n--)
    if (*s == '\0')
      /* string ends before n: invalid */
      return NULL;

  result = malloc(m+1);
  if (result == NULL)
    /* out of memory */
    return NULL;
  result[0] = '\0';
  strncat(result, s, m); /* strncat() will automatically add null terminator
                          * if string ends early or after reading m characters */
  return result;
}

char *str_wholeless1(const char *s)
{
  int slen = strlen(s);
  return substring(s, 0, slen-1);
}

char *str_fromch(const char *s, int ch, int m)
{
  const char *p = strchr(s, ch);
  /* character not found: invalid */
  return p ? substring(s, p - s, m) : NULL;
}

char *str_fromstr(const char *s, const char *in, int m)
{
  const char *p = strstr(s, in);
  /* substring not found: invalid */
  return p ? substring(s, p - s, m) : NULL;
}</lang>
<lang c>/* Print the result (if any), then release it. */
#define TEST(A) do {            \
    char *r = (A);              \
    if (r != NULL)              \
      printf("%s\n", r);        \
    free(r);                    \
  } while(0)

int main(void)
{
  const char *s = "hello world shortest program";

  TEST( substring(s, 12, 5) );                  // get "short"
  /* a negative m is rejected by substring(), so pass the remaining length */
  TEST( substring(s, 6, (int)strlen(s) - 6) );  // get "world shortest program"
  TEST( str_wholeless1(s) );                    // "... progra"
  TEST( str_fromch(s, 'w', 5) );                // "world"
  TEST( str_fromstr(s, "ro", 3) );              // "rog"

  return 0;
}</lang>
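The C version above reads the string only once: it advances n characters, failing if the string ends first, then copies at most m characters. A minimal Python sketch of the same single-pass idea, written for any iterable of characters, might look like the following. The name `substring_one_pass` is hypothetical, and returning None in place of C's NULL is an assumption of this sketch.

```python
from itertools import islice


def substring_one_pass(chars, n, m):
    """Return up to m characters after skipping n, reading chars only once.

    Returns None when n or m is negative, or when the input ends before
    n characters have been skipped (mirroring the NULL result in C).
    """
    if n < 0 or m < 0:
        return None
    it = iter(chars)
    # Skip n characters; a short count means the input ended too early.
    skipped = sum(1 for _ in islice(it, n))
    if skipped < n:
        return None
    # Copy at most m characters; this stops early if the input runs out.
    return "".join(islice(it, m))


if __name__ == "__main__":
    print(substring_one_pass("hello world shortest program", 12, 5))
```

Since the input is consumed through an iterator, its total length never has to be known in advance, which is the point of the one-pass approach.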
Common Lisp
<lang lisp>(let ((string "0123456789")
      (n 2)
      (m 3)
      (start #\5)
      (substring "34"))
  (list (subseq string n (+ n m))
        (subseq string n)
        (subseq string 0 (1- (length string)))
        (let ((pos (position start string)))
          (subseq string pos (+ pos m)))
        (let ((pos (search substring string)))
          (subseq string pos (+ pos m)))))</lang>
E
<lang e>def string := "aardvarks"
def n := 4
def m := 4
println(string(n, n + m))
println(string(n))
println(string(0, string.size() - 1))
println({string(def i := string.indexOf1('d'), i + m)})
println({string(def i := string.startOf("ard"), i + m)})</lang>
Output:
vark
varks
aardvark
dvar
ardv
Forth
<lang forth>2 constant Pos
3 constant Len

\ substrings
s" abcdefgh"  ( addr len )
over Pos + Len        cr type    \ cde
2dup Pos /string      cr type    \ cdefgh
2dup 1-               cr type    \ abcdefg
2dup 'd scan Len min  cr type    \ def
s" de" search if Len min cr type then    \ def
</lang>
Java
<lang java>String x = "testing123";
int n = 2; // example start index (assumed; the original snippet leaves n undefined)
int m = 3; // example length (assumed; the original snippet leaves m undefined)
System.out.println(x.substring(n, n + m));
System.out.println(x.substring(n));
System.out.println(x.substring(0, x.length() - 1));
int index1 = x.indexOf('i');
System.out.println(x.substring(index1, index1 + m));
int index2 = x.indexOf("ing");
System.out.println(x.substring(index2, index2 + m));
//indexOf methods also have an optional "from index" argument which will
//make indexOf ignore characters before that index</lang>
Perl
<lang perl>my $str = 'abcdefgh';
my $n = 2;
my $m = 3;
print substr($str, $n, $m), "\n";
print substr($str, $n), "\n";
print substr($str, 0, -1), "\n";
print substr($str, index($str, 'd'), $m), "\n";
print substr($str, index($str, 'de'), $m), "\n";</lang>
PHP
<lang php><?php
$str = 'abcdefgh';
$n = 2;
$m = 3;
echo substr($str, $n, $m), "\n";
echo substr($str, $n), "\n";
echo substr($str, 0, -1), "\n";
echo substr($str, strpos($str, 'd'), $m), "\n";
echo substr($str, strpos($str, 'de'), $m), "\n";
?></lang>
Python
Python uses zero-based indexing, so the n'th character is at index n-1.
<lang python>>>> s = 'abcdefgh'
>>> n, m, char, chars = 2, 3, 'd', 'cd'
>>> # starting from n=2 characters in and m=3 in length;
>>> s[n-1:n+m-1]
'bcd'
>>> # starting from n characters in, up to the end of the string;
>>> s[n-1:]
'bcdefgh'
>>> # whole string minus last character;
>>> s[:-1]
'abcdefg'
>>> # starting from a known character char="d" within the string and of m length;
>>> indx = s.index(char)
>>> s[indx:indx+m]
'def'
>>> # starting from a known substring chars="cd" within the string and of m length.
>>> indx = s.index(chars)
>>> s[indx:indx+m]
'cde'
>>> </lang>
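The n-1 adjustment used in the session above can be wrapped once so that callers keep counting from 1. The helper below is a hedged sketch: the name `substr_from_nth` and the decision to raise ValueError for out-of-range arguments are assumptions, not part of the original solution.

```python
def substr_from_nth(s, n, m=None):
    """Substring starting at the n'th character, counting from 1.

    With m given, return at most m characters; otherwise return the rest
    of the string. Raises ValueError for n < 1 or m < 0 (a choice made
    for this sketch): with n = 0 the slice s[n-1:] would silently start
    from the last character instead of failing.
    """
    if n < 1:
        raise ValueError("n counts from 1")
    if m is not None and m < 0:
        raise ValueError("m must not be negative")
    start = n - 1  # convert to Python's zero-based index
    return s[start:] if m is None else s[start:start + m]


if __name__ == "__main__":
    print(substr_from_nth("abcdefgh", 2, 3))
    print(substr_from_nth("abcdefgh", 2))
```

Slices that run past the end of the string are clamped by Python, so asking for more characters than remain simply returns what is left.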
Ruby
<lang ruby>str = 'abcdefgh'
n = 2
m = 3
puts str[n, m]
puts str[n..-1]
puts str[0..-2]
puts str[str.index('d'), m]
puts str[str.index('de'), m]</lang>
Smalltalk
The distinction between searching for a single character and searching for a string within another string is rather blurred here. The following code searches for 'w' (a string); we could use $w (a character) instead, and it would make no difference.
<lang smalltalk>|s|
s := 'hello world shortest program'.

(s copyFrom: 13 to: (13+4)) displayNl.
"4 is the length (5) - 1, since we need the index of the
 last char we want, which is included"

(s copyFrom: 7) displayNl.
(s allButLast) displayNl.

(s copyFrom: ((s indexOfRegex: 'w') first)
   to: ( ((s indexOfRegex: 'w') first) + 4) ) displayNl.
(s copyFrom: ((s indexOfRegex: 'ro') first)
   to: ( ((s indexOfRegex: 'ro') first) + 2) ) displayNl.</lang>
The last two examples in particular are rather verbose, so we can extend the String class with position-plus-length methods.
<lang smalltalk>String extend [
  copyFrom: index length: nChar [
    ^ self copyFrom: index to: ( index + nChar - 1 )
  ]
  copyFromRegex: regEx length: nChar [
    |i|
    i := self indexOfRegex: regEx.
    ^ self copyFrom: (i first) length: nChar
  ]
].

"and show it simpler..."

(s copyFrom: 13 length: 5) displayNl.
(s copyFromRegex: 'w' length: 5) displayNl.
(s copyFromRegex: 'ro' length: 3) displayNl.</lang>
Tcl
<lang tcl>set str "abcdefgh"
set n 2
set m 3

puts [string range $str $n [expr {$n+$m-1}]]
puts [string range $str $n end]
puts [string range $str 0 end-1]
# Because Tcl does substrings with a pair of indices, it is easier to express
# the last two parts of the task as a chained pair of [string range] operations.
puts [string range [string range $str [string first "d" $str] end] 0 [expr {$m-1}]]
puts [string range [string range $str [string first "de" $str] end] 0 [expr {$m-1}]]</lang>
Of course, if you were doing 'position-plus-length' a lot, it would be easier to add another subcommand to string, like this:
<lang tcl># Define the substring operation
proc ::substring {string start length} {
    string range [string range $string $start end] 0 $length-1
}
# Plumb it into the language
set ops [namespace ensemble configure string -map]
dict set ops substr ::substring
namespace ensemble configure string -map $ops
# Now show off by repeating the challenge!
set str "abcdefgh"
set n 2
set m 3

puts [string substr $str $n $m]
puts [string range $str $n end]
puts [string range $str 0 end-1]
puts [string substr $str [string first "d" $str] $m]
puts [string substr $str [string first "de" $str] $m]</lang>