Substring: Difference between revisions


Revision as of 11:09, 24 April 2013

In this task display a substring:

starting from n characters in and of m length;
starting from n characters in, up to the end of the string;
whole string minus last character;
starting from a known character within the string and of m length;
starting from a known substring within the string and of m length.

If the program uses UTF-8 or UTF-16, it must work on any valid Unicode code point, whether in the Basic Multilingual Plane or above it. The program must reference logical characters (code points), not 8-bit code units for UTF-8 or 16-bit code units for UTF-16. Programs for other encodings (such as 8-bit ASCII, or EUC-JP) are not required to handle all Unicode characters.
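As a rough illustration of the five cases and of the code-point requirement, here is a minimal Python sketch. Python's `str` is indexed by code point, so slicing never splits a character that lies above the Basic Multilingual Plane. The helper names (`from_n`, `from_n_to_end`, `without_last`, `from_known`) and the sample string are assumptions made for this sketch, and n is treated as a zero-based offset.

```python
def from_n(s: str, n: int, m: int) -> str:
    """Substring starting n characters in, of length m."""
    return s[n:n + m]


def from_n_to_end(s: str, n: int) -> str:
    """Substring starting n characters in, up to the end of the string."""
    return s[n:]


def without_last(s: str) -> str:
    """Whole string minus the last character."""
    return s[:-1]


def from_known(s: str, needle: str, m: int) -> str:
    """Substring of length m starting at the first occurrence of needle.

    needle may be a single character or a longer substring.
    """
    i = s.find(needle)
    if i < 0:
        raise ValueError("needle not found")
    return s[i:i + m]


# U+1D11E (MUSICAL SYMBOL G CLEF) lies above the Basic Multilingual Plane,
# yet it counts as exactly one character when indexing a Python str.
sample = "ab\U0001D11Ecdefgh"

print(from_known("abcdefgh", "de", 3))  # def
```

With `sample`, `from_n(sample, 2, 3)` starts at the G clef and returns three code points, which would not hold if the indices counted UTF-8 or UTF-16 code units.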

Ada

String in Ada is an array of Character elements indexed by Positive: <lang Ada>type String is array (Positive range <>) of Character;</lang> Substring is a first-class object in Ada, an anonymous subtype of String. The language uses the term slice for it. Slices can be retrieved, assigned and passed as a parameter to subprograms in mutable or immutable mode. A slice is specified as: <lang Ada>A (<first-index>..<last-index>)</lang>

A string array in Ada can start at any positive index. This is why the implementation below uses Str'First in all slices. In this concrete case Str'First is 1, but it is intentionally left in the code because the task refers to N as an offset from the beginning of the string rather than an index into the string. In Ada it is unusual to deal with slices in this way; one normally uses a plain string index instead. <lang Ada>with Ada.Text_IO; use Ada.Text_IO; with Ada.Strings.Fixed; use Ada.Strings.Fixed;

procedure Test_Slices is

  Str : constant String := "abcdefgh";
  N : constant := 2;
  M : constant := 3;

begin

  Put_Line (Str (Str'First + N - 1..Str'First + N + M - 2));
  Put_Line (Str (Str'First + N - 1..Str'Last));
  Put_Line (Str (Str'First..Str'Last - 1));
  Put_Line (Head (Tail (Str, Str'Last - Index (Str, "d", 1) + 1), M));
  Put_Line (Head (Tail (Str, Str'Last - Index (Str, "de", 1) + 1), M));

end Test_Slices;</lang> Sample output:

bcd
bcdefgh
abcdefg
def
def

Aikido

Aikido uses square brackets for slices. The syntax is [start:end]. If you want to use length you have to add to the start. Shifting strings left or right removes characters from the ends.

<lang aikido>const str = "abcdefg"
var n = 2
var m = 3

println (str[n:n+m-1])    // pos 2 length 3
println (str[n:])         // pos 2 to end
println (str >> 1)        // remove last character
var p = find (str, 'c')
println (str[p:p+m-1])    // from pos of p length 3

var s = find (str, "bc")
println (str[s:s+m-1])    // pos of bc length 3
</lang>

ALGOL 68

Translation of: python

Works with: ALGOL 68 version Standard - no extensions to language used

Works with: ALGOL 68G version Any - tested with release 1.18.0-9h.tiny

<lang Algol68>main: (

 STRING s = "abcdefgh";
 INT n = 2, m = 3; 
 CHAR char = "d"; 
 STRING chars = "cd";

 printf(($gl$, s[n:n+m-1]));
 printf(($gl$, s[n:]));
 printf(($gl$, s[:UPB s-1]));

 INT pos; 
 char in string("d", pos, s);
 printf(($gl$, s[pos:pos+m-1]));
 string in string("de", pos, s);
 printf(($gl$, s[pos:pos+m-1]))

)</lang>Output:

bcd
bcdefgh
abcdefg
def
def

AutoHotkey

The code contains some alternatives. <lang autohotkey>String := "abcdefghijklmnopqrstuvwxyz"
; also: String = abcdefghijklmnopqrstuvwxyz
n := 12
m := 5

; starting from n characters in and of m length;
subString := SubStr(String, n, m)
; alternative: StringMid, subString, String, n, m
MsgBox % subString

; starting from n characters in, up to the end of the string;
subString := SubStr(String, n)
; alternative: StringMid, subString, String, n
MsgBox % subString

; whole string minus last character;
StringTrimRight, subString, String, 1
; alternatives: subString := SubStr(String, 1, StrLen(String) - 1)
; StringMid, subString, String, 1, StrLen(String) - 1
MsgBox % subString

; starting from a known character within the string and of m length;
findChar := "q"
subString := SubStr(String, InStr(String, findChar), m)
; alternatives: RegExMatch(String, findChar . ".{" . m - 1 . "}", subString)
; StringMid, subString, String, InStr(String, findChar), m
MsgBox % subString

; starting from a known substring within the string and of m length;
findString := "pq"
subString := SubStr(String, InStr(String, findString), m)
; alternatives: RegExMatch(String, findString . ".{" . m - StrLen(findString) . "}", subString)
; StringMid, subString, String, InStr(String, findString), m
MsgBox % subString
</lang>

Output:

lmnop
lmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxy
qrstu
pqrst

AWK

Translation of: AutoHotKey

<lang awk>BEGIN {
    str = "abcdefghijklmnopqrstuvwxyz"
    n = 12
    m = 5

    print substr(str, n, m)
    print substr(str, n)
    print substr(str, 1, length(str) - 1)
    print substr(str, index(str, "q"), m)
    print substr(str, index(str, "pq"), m)
}</lang>

Output:

$ awk -f substring.awk  
lmnop
lmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxy
qrstu
pqrst

BASIC

<lang qbasic>DIM baseString AS STRING, subString AS STRING, findString AS STRING
DIM m AS INTEGER, n AS INTEGER

baseString = "abcdefghijklmnopqrstuvwxyz"
n = 12
m = 5

' starting from n characters in and of m length;
subString = MID$(baseString, n, m)
PRINT subString

' starting from n characters in, up to the end of the string;
subString = MID$(baseString, n)
PRINT subString

' whole string minus last character;
subString = LEFT$(baseString, LEN(baseString) - 1)
PRINT subString

' starting from a known character within the string and of m length;
' starting from a known substring within the string and of m length.
findString = "pq"
subString = MID$(baseString, INSTR(baseString, findString), m)
PRINT subString
</lang>

Output:

lmnop
lmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxy
pqrst

ZX Spectrum Basic

ZX Spectrum Basic unfortunately has no direct way to find a substring within a string; however, a similar effect can be achieved by searching with a FOR loop: <lang basic>10 LET A$="abcdefghijklmnopqrstuvwxyz"
15 LET n=10: LET m=7
20 PRINT A$(n TO n+m-1)
30 PRINT A$(n TO )
40 PRINT A$( TO LEN (A$)-1)
50 FOR i=1 TO LEN (A$)
60 IF A$(i)="g" THEN PRINT A$(i TO i+m-1): LET i=LEN (A$): GO TO 70
70 NEXT i
80 LET B$="ijk"
90 FOR i=1 TO LEN (A$)-LEN (B$)+1
100 IF A$(i TO i+LEN (B$)-1)=B$ THEN PRINT A$(i TO i+m-1): LET i=LEN (A$)-LEN (B$)+1: GO TO 110
110 NEXT i
120 STOP
</lang> Output:

jklmnop
jklmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxy
ghijklm
ijklmno

BBC BASIC

<lang bbcbasic> basestring$ = "The five boxing wizards jump quickly"

     n% = 10
     m% = 5
     
     REM starting from n characters in and of m length:
     substring$ = MID$(basestring$, n%, m%)
     PRINT substring$
     
     REM starting from n characters in, up to the end of the string:
     substring$ = MID$(basestring$, n%)
     PRINT substring$
     
     REM whole string minus last character:
     substring$ = LEFT$(basestring$)
     PRINT substring$
     
     REM starting from a known character within the string and of m length:
     char$ = "w"
     substring$ = MID$(basestring$, INSTR(basestring$, char$), m%)
     PRINT substring$
     
     REM starting from a known substring within the string and of m length:
     find$ = "iz"
     substring$ = MID$(basestring$, INSTR(basestring$, find$), m%)
     PRINT substring$</lang>

Output:

boxin
boxing wizards jump quickly
The five boxing wizards jump quickl
wizar
izard

Bracmat

Translation of: BBC BASIC

<lang bracmat>( (basestring = "The five boxing wizards jump quickly") & (n = 10) & (m = 5)

 { starting from n characters in and of m length: }

& @(!basestring:? [(!n+-1) ?substring [(!n+!m+-1) ?) & out$!substring

 { starting from n characters in, up to the end of the string: }

& @(!basestring:? [(!n+-1) ?substring) & out$!substring

 { whole string minus last character: }

& @(!basestring:?substring [-2 ?) & out$!substring

 { starting from a known character within the string and of m length: }

& (char = "w") & @(!basestring:? ([?p !char ?: ?substring [(!p+!m) ?)) & out$!substring

 { starting from a known substring within the string and of m length: }

& (find = "iz") & @(!basestring:? ([?p !find ?: ?substring [(!p+!m) ?)) & out$!substring & )</lang> Output:

boxin
boxing wizards jump quickly
The five boxing wizards jump quickl
wizar
izard

C

<lang c>#include <stddef.h>

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *substring(const char *s, size_t n, ptrdiff_t m) {

 char *result;
 /* check for null s */
 if (NULL == s)
   return NULL;
 /* negative m to mean 'up to the mth char from right' */
 if (m < 0) 
   m = strlen(s) + m - n + 1;

 /* n < 0 or m < 0 is invalid */
 if (n < 0 || m < 0)
   return NULL;

 /* make sure string does not end before n 
  * and advance the "s" pointer to beginning of substring */
 for ( ; n > 0; s++, n--)
   if (*s == '\0')
     /* string ends before n: invalid */
     return NULL;

 result = malloc(m+1);
 if (NULL == result)
   /* memory allocation failed */
   return NULL;
 result[0]=0;
 strncat(result, s, m); /* strncat() will automatically add null terminator
                         * if string ends early or after reading m characters */
 return result;

}

char *str_wholeless1(const char *s) {

 return substring(s, 0, strlen(s) - 1);

}

char *str_fromch(const char *s, int ch, ptrdiff_t m) {

 return substring(s, strchr(s, ch) - s, m);

}

char *str_fromstr(const char *s, char *in, ptrdiff_t m) {

 return substring(s, strstr(s, in) - s , m);

}

#define TEST(A) do { \

   char *r = (A);		\
   if (NULL == r)		\
     puts("--error--");	\
   else {			\
     puts(r);			\
     free(r);			\
   }				\
 } while(0)

int main() {

 const char *s = "hello world shortest program";

 TEST( substring(s, 12, 5) );		// get "short"
 TEST( substring(s, 6, -1) );		// get "world shortest program"
 TEST( str_wholeless1(s) );		// "... progra"
 TEST( str_fromch(s, 'w', 5) );	// "world"
 TEST( str_fromstr(s, "ro", 3) );	// "rog"

 return 0;

}</lang>

C++

<lang cpp>#include <iostream>

#include <string>

int main() {

 std::string s = "0123456789";

 int const n = 3;
 int const m = 4;
 char const c = '2';
 std::string const sub = "456";

 std::cout << s.substr(n, m)<< "\n";
 std::cout << s.substr(n) << "\n";
 std::cout << s.substr(0, s.size()-1) << "\n";
 std::cout << s.substr(s.find(c), m) << "\n";
 std::cout << s.substr(s.find(sub), m) << "\n";

}</lang>

C#

<lang csharp>using System; namespace SubString {

   class Program
   {
       static void Main(string[] args)
       {
           string s = "0123456789";
           const int n = 3;
           const int m = 2;
           const char c = '3';
           const string z = "345";

           Console.WriteLine(s.Substring(n, m));
           Console.WriteLine(s.Substring(n, s.Length - n));
           Console.WriteLine(s.Substring(0, s.Length - 1));
           Console.WriteLine(s.Substring(s.IndexOf(c,0,s.Length), m));
           Console.WriteLine(s.Substring(s.IndexOf(z, 0, s.Length), m));
       }
   }

} </lang>

Clojure

<lang clojure>(def string "alphabet")
(def n 2)
(def m 4)
(def len (count string))

;starting from n characters in and of m length;
(println
 (subs string n (+ n m)))              ;phab

;starting from n characters in, up to the end of the string;
(println
 (subs string n))                      ;phabet

;whole string minus last character;
(println
 (subs string 0 (dec len)))            ;alphabe

;starting from a known character within the string and of m length;
(let [pos (.indexOf string (int \l))]
  (println
   (subs string pos (+ pos m))))       ;lpha

;starting from a known substring within the string and of m length.
(let [pos (.indexOf string "ph")]
  (println
   (subs string pos (+ pos m))))       ;phab
</lang>

Common Lisp

<lang lisp>(let ((string "0123456789")

     (n 2)
     (m 3)
     (start #\5)
     (substring "34"))
 (list (subseq string n (+ n m))
       (subseq string n)
       (subseq string 0 (1- (length string)))
       (let ((pos (position start string)))
         (subseq string pos (+ pos m)))
       (let ((pos (search substring string)))
         (subseq string pos (+ pos m)))))</lang>

D

Works with: D version 2

<lang d>import std.stdio, std.string;

void main() {

   const s = "the quick brown fox jumps over the lazy dog";
   enum n = 5, m = 3;

   writeln(s[n .. n + m]);

   writeln(s[n .. $]);

   writeln(s[0 .. $ - 1]);

   const i = s.indexOf("q");
   writeln(s[i .. i + m]);

   const j = s.indexOf("qu");
   writeln(s[j .. j + m]);

}</lang> Output:

uic
uick brown fox jumps over the lazy dog
the quick brown fox jumps over the lazy do
qui
qui

Delphi

<lang Delphi>program ShowSubstring;

{$APPTYPE CONSOLE}

uses SysUtils;

const

 s = '0123456789';
 n = 3;
 m = 4;
 c = '2';
 sub = '456';

begin

 Writeln(Copy(s, n, m));             // starting from n characters in and of m length;
 Writeln(Copy(s, n, Length(s)));     // starting from n characters in, up to the end of the string;
 Writeln(Copy(s, 1, Length(s) - 1)); // whole string minus last character;
 Writeln(Copy(s, Pos(c, s), m));     // starting from a known character within the string and of m length;
 Writeln(Copy(s, Pos(sub, s), m));   // starting from a known substring within the string and of m length.

end.</lang>

Output:

E

<lang e>def string := "aardvarks"
def n := 4
def m := 4
println(string(n, n + m))
println(string(n))
println(string(0, string.size() - 1))
println({string(def i := string.indexOf1('d'), i + m)})
println({string(def i := string.startOf("ard"), i + m)})</lang> Output:

vark
varks
aardvark
dvar
ardv

Euphoria

<lang Euphoria>sequence baseString, subString, findString
integer findChar
integer m, n

baseString = "abcdefghijklmnopqrstuvwxyz"

-- starting from n characters in and of m length;
n = 12
m = 5
subString = baseString[n..n+m-1]
puts(1, subString )
puts(1,'\n')

-- starting from n characters in, up to the end of the string;
n = 12
subString = baseString[n..$]
puts(1, subString )
puts(1,'\n')

-- whole string minus last character;
subString = baseString[1..$-1]
puts(1, subString )
puts(1,'\n')

-- starting from a known character within the string and of m length;
findChar = 'o'
m = 5
n = find(findChar,baseString)
subString = baseString[n..n+m-1]
puts(1, subString )
puts(1,'\n')

-- starting from a known substring within the string and of m length.
findString = "pq"
m = 5
n = match(findString,baseString)
subString = baseString[n..n+m-1]
puts(1, subString )
puts(1,'\n')</lang>

Output:

lmnop
lmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxy
opqrs
pqrst

Factor

<lang factor>USING: math sequences kernel ;

! starting from n characters in and of m length

: subseq* ( from length seq -- newseq ) [ over + ] dip subseq ;

! starting from n characters in, up to the end of the string
: dummy ( seq n -- tailseq ) tail ;

! whole string minus last character
: dummy1 ( seq -- headseq ) but-last ;

USING: fry sequences kernel ;
! helper word
: subseq-from-* ( subseq len seq quot -- seq ) [ nip ] prepose 2keep subseq* ; inline

! starting from a known character within the string and of m length;
: subseq-from-char ( char len seq -- seq ) [ index ] subseq-from-* ;

! starting from a known substring within the string and of m length.
: subseq-from-seq ( subseq len seq -- seq ) [ start ] subseq-from-* ;</lang>

Forth

/STRING and SEARCH are standard words. SCAN is widely implemented. Substrings represented by address/length pairs require neither mutation nor allocation.

<lang forth>2 constant Pos
3 constant Len
: Str ( -- c-addr u )  s" abcdefgh" ;

Str Pos /string drop Len type       \ cde
Str Pos /string type                \ cdefgh
Str 1- type                         \ abcdefg
Str char d scan drop Len type       \ def
Str s" de" search 2drop Len type    \ def</lang>
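The address/length idea can be approximated in Python with `memoryview`, which slices a buffer without copying it. The sketch below is only an analogy under stated assumptions: the helper name `slash_string` is made up for this example, and the view operates on bytes (8-bit code units), so it is suitable for ASCII data only and does not meet the code-point requirement of the task.

```python
data = b"abcdefgh"
view = memoryview(data)  # refers to data; no bytes are copied


def slash_string(v: memoryview, n: int) -> memoryview:
    """Drop the first n bytes of the view, loosely like Forth's /STRING."""
    return v[n:]


cde = slash_string(view, 2)[:3]   # still a view into data
start = data.find(b"de")          # loosely like SEARCH: offset of the match
found = view[start:start + 3]

print(bytes(cde).decode("ascii"))    # cde
print(bytes(found).decode("ascii"))  # def
```

Each slice keeps a reference to the original `bytes` object (available as `.obj`), so no new string is allocated until `bytes(...)` is called for printing.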

Fortran

Works with: Fortran version 90 and later

<lang fortran>program test_substring

 character (*), parameter :: string = 'The quick brown fox jumps over the lazy dog.'
 character (*), parameter :: substring = 'brown'
 character    , parameter :: c = 'q'
 integer      , parameter :: n = 5
 integer      , parameter :: m = 15
 integer                  :: i

! Display the substring starting from n characters in and of length m.

 write (*, '(a)') string (n : n + m - 1)

! Display the substring starting from n characters in, up to the end of the string.

 write (*, '(a)') string (n :)

! Display the whole string minus the last character.

 i = len (string) - 1
 write (*, '(a)') string (: i)

! Display the substring starting from a known character and of length m.

 i = index (string, c)
 write (*, '(a)') string (i : i + m - 1)

! Display the substring starting from a known substring and of length m.

 i = index (string, substring)
 write (*, '(a)') string (i : i + m - 1)

end program test_substring</lang> Output:

quick brown fox
quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog
quick brown fox
brown fox jumps

Note that in Fortran positions inside character strings are one-based, i.e. the first character is in position one.
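For readers used to zero-based languages, the one-based convention can be mapped onto zero-based, half-open slices as in the following Python sketch. The function name `substr_1based` is an assumption made for this illustration; the sample string is the one used in the Fortran program above.

```python
from typing import Optional


def substr_1based(s: str, start: int, length: Optional[int] = None) -> str:
    """Return the substring beginning at one-based position start.

    With length omitted, the substring runs to the end of the string.
    """
    if start < 1:
        raise ValueError("start is one-based and must be >= 1")
    begin = start - 1  # convert to a zero-based offset
    end = len(s) if length is None else begin + length
    return s[begin:end]


text = "The quick brown fox jumps over the lazy dog."
print(substr_1based(text, 5, 15))  # quick brown fox
```

A zero-based search result such as `text.index("brown")` has to be incremented by one before it is passed to `substr_1based`.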

GAP

<lang gap>LETTERS;

"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

LETTERS{[5 .. 10]};

"EFGHIJ"</lang>

Go

<lang go>package main

import "fmt"
import "strings"

func main() {

 s := "ABCDEFGH"
 n, m := 2, 3

 fmt.Println(s[n:n+m]) // "CDE"
 fmt.Println(s[n:]) // "CDEFGH"
 fmt.Println(s[0:len(s)-1]) // "ABCDEFG"
 fmt.Println(s[strings.Index(s, "D"):strings.Index(s, "D")+m]) // "DEF"
 fmt.Println(s[strings.Index(s, "DE"):strings.Index(s, "DE")+m]) // "DEF"

}</lang>

Groovy

Strings in Groovy are 0-indexed. <lang groovy>def str = 'abcdefgh'
def n = 2
def m = 3
println str[n..n+m-1]
println str[n..-1]
println str[0..-2]
def index1 = str.indexOf('d')
println str[index1..index1+m-1]
def index2 = str.indexOf('de')
println str[index2..index2+m-1]</lang>

Haskell

Works with: Haskell version 6.10.4

A string in Haskell is a list of chars: [Char]

The first three tasks are simply:

*Main> take 3 $ drop 2 "1234567890"
"345"

*Main> drop 2 "1234567890"
"34567890"

*Main> init "1234567890"
"123456789"

The last two can be formulated with the following function:

<lang Haskell>import Data.List (isPrefixOf, tails)

t45 n c s | null sub = []
          | otherwise = take n . head $ sub
  where sub = filter (isPrefixOf c) $ tails s</lang>

*Main> t45 3 "4" "1234567890"
"456"

*Main> t45 3 "45" "1234567890"
"456"

*Main> t45 3 "31" "1234567890"
""
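The same idea (walk over the suffixes of the string, pick the first one that begins with the search text, and keep its first n characters) can be sketched in Python. This is a loose translation for comparison only; it reuses the name `t45` from the Haskell function above and returns an empty string when nothing matches.

```python
def t45(n: int, c: str, s: str) -> str:
    """First n characters of s starting at the first occurrence of c."""
    # All suffixes of s, longest first, similar to Data.List.tails.
    suffixes = (s[i:] for i in range(len(s) + 1))
    # First suffix that starts with c, or "" when there is none.
    match = next((t for t in suffixes if t.startswith(c)), "")
    return match[:n]


print(t45(3, "45", "1234567890"))  # 456
```

In practice `s.find(c)` does the same job more directly; the generator form is kept here only to mirror the `filter`/`tails` structure.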

HicEst

<lang hicest>CHARACTER :: string = 'ABCDEFGHIJK', known = 'B', substring = 'CDE'
REAL, PARAMETER :: n = 5, m = 8

WRITE(Messagebox) string(n : n + m - 1), "| substring starting from n, length m"
WRITE(Messagebox) string(n :), "| substring starting from n, to end of string"
WRITE(Messagebox) string(1: LEN(string)-1), "| whole string minus last character"

pos_known = INDEX(string, known)
WRITE(Messagebox) string(pos_known : pos_known+m-1), "| substring starting from pos_known, length m"

pos_substring = INDEX(string, substring)
WRITE(Messagebox) string(pos_substring : pos_substring+m-1), "| substring starting from pos_substring, length m"</lang>

Icon and Unicon

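<lang Icon>procedure main()
   # sample values, chosen here for illustration
   s := "aardvarks"; n := 5; m := 3; c := "d"; ss := "ard"
   write( s[n+:m] )
   write( s[n:0] )
   write( s[1:-1] )
   write( s[find(c,s)+:m] )
   write( s[find(ss,s)+:m] )
end</lang>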

J

<lang J>   5{.3}.'Marshmallow'
shmal

  3}.'Marshmallow'

shmallow

  }.'Marshmallow'

arshmallow

  }:'Marshmallow'

Marshmallo

  5{.(}.~ i.&'m')'Marshmallow'

mallo

  5{.(}.~ I.@E.~&'sh')'Marshmallow'

shmal</lang>

Note that there are other, sometimes better, ways of accomplishing this task.

<lang J>   'Marshmallow'{~(+i.)/3 5
shmal</lang>

The taketo / takeafter and dropto / dropafter utilities from the strings script further simplify these types of tasks. <lang J> require 'strings'

  'sh' dropto 'Marshmallow'

shmallow

  5{. 'sh' dropto 'Marshmallow'

shmal

  'sh' takeafter 'Marshmallow'

mallow</lang>

Note also that these operations work the same way on lists of numbers that they do on this example list of characters.

<lang J>   3}. 2 3 5 7 11 13 17 19
7 11 13 17 19

  7 11 dropafter 2 3 5 7 11 13 17 19

2 3 5 7 11</lang>

Java

Strings in Java are 0-indexed. <lang java>String x = "testing123";
int n = 2, m = 3; // example values for the offset and the length
System.out.println(x.substring(n, n + m));
System.out.println(x.substring(n));
System.out.println(x.substring(0, x.length() - 1));
int index1 = x.indexOf('i');
System.out.println(x.substring(index1, index1 + m));
int index2 = x.indexOf("ing");
System.out.println(x.substring(index2, index2 + m));
//indexOf methods also have an optional "from index" argument which will
//make indexOf ignore characters before that index</lang>

JavaScript

The String object has two similar methods: substr and substring.

substr(start, [len]) returns a substring beginning at a specified location and having a specified length.
substring(start, [end]) returns a string containing the substring from start up to, but not including, end.

<lang javascript>var str = "abcdefgh";

var n = 2;
var m = 3;

// * starting from n characters in and of m length;
str.substr(n, m); // => "cde"

// * starting from n characters in, up to the end of the string;
str.substr(n); // => "cdefgh"
str.substring(n); // => "cdefgh"

// * whole string minus last character;
str.substring(0, str.length - 1); // => "abcdefg"

// * starting from a known character within the string and of m length;
str.substr(str.indexOf('b'), m); // => "bcd"

// * starting from a known substring within the string and of m length.
str.substr(str.indexOf('bc'), m); // => "bcd"</lang>
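The difference between the two calling conventions (start plus length versus start plus exclusive end) can be made explicit with a small Python sketch. The functions below borrow the names `substr` and `substring` for illustration only, and they assume non-negative arguments; the special handling that JavaScript applies to negative or swapped arguments is deliberately not reproduced.

```python
from typing import Optional


def substr(s: str, start: int, length: Optional[int] = None) -> str:
    """Start position plus length, in the style of substr(start, [len])."""
    if start < 0 or (length is not None and length < 0):
        raise ValueError("this sketch only handles non-negative arguments")
    end = len(s) if length is None else start + length
    return s[start:end]


def substring(s: str, start: int, end: Optional[int] = None) -> str:
    """Start position plus exclusive end, in the style of substring(start, [end])."""
    if start < 0 or (end is not None and end < 0):
        raise ValueError("this sketch only handles non-negative arguments")
    return s[start:len(s) if end is None else end]


text = "abcdefgh"
print(substr(text, 2, 3))     # cde
print(substring(text, 2, 5))  # cde
```

Both calls select the same three characters: a length of 3 from offset 2 corresponds to an exclusive end of 2 + 3 = 5.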

Julia

<lang julia>julia> s = "abcdefg"
"abcdefg"

julia> n = 3
3

julia> s[n:end]
"cdefg"

julia> m=2
2

julia> s[n:n+m]
"cde"

julia> s[1:end-1]
"abcdef"

julia> s[search(s,'c')]
'c'

julia> s[search(s,'c'):search(s,'c')+m]
"cde"</lang>

LabVIEW

To enhance readability, this task was split into two separate GUIs. In the second, note that "Known Substring" can be a single character.

Lang5

<lang lang5>: cr "\n". ; [] '__A set : dip swap __A swap 1 compress append '__A set execute __A

   -1 extract nip ; : nip swap drop ; : tuck swap over ; : -rot rot rot ; : 0= 0 == ; : 1+ 1 + ;

: 2dip swap 'dip dip ; : 2drop drop drop ; : |a,b> over - iota + ; : bi* 'dip dip execute ; : bi@ dup bi* ;

: comb "" split ; : concat "" join ; : empty? length 0= ; : tail over lensize |a,b> subscript ;

: lensize length nip ; : while do 'dup dip 'execute 2dip rot if dup 2dip else break then loop 2drop ;

: <substr> comb -rot over + |a,b> subscript concat ;

: str-tail tail concat ;

: str-index

   : 2streq  2dup over lensize iota subscript eq '* reduce ;
   swap 'comb bi@ length -rot 0 -rot
   "2dup 'lensize bi@ <="
   "2streq if 0 reshape else '1+ 2dip 0 extract drop then"
   while empty? if 2drop tuck == if drop -1 then else 4 ndrop -1 then ;

'abcdefgh 'str set 2 'n set 3 'm set n m str <substr> str comb n str-tail str "d" str-index m str <substr> str "de" str-index m str <substr></lang>

Liberty BASIC

<lang lb>'These tasks can be completed with various combinations of Liberty Basic's
'built in Mid$()/ Instr()/ Left$()/ Right$()/ and Len() functions, but these
'examples only use the Mid$()/ Instr()/ and Len() functions.

baseString$ = "Thequickbrownfoxjumpsoverthelazydog."
n = 12
m = 5

'starting from n characters in and of m length
Print Mid$(baseString$, n, m)

'starting from n characters in, up to the end of the string
Print Mid$(baseString$, n)

'whole string minus last character
Print Mid$(baseString$, 1, (Len(baseString$) - 1))

'starting from a known character within the string and of m length
Print Mid$(baseString$, Instr(baseString$, "f", 1), m)

'starting from a known substring within the string and of m length
Print Mid$(baseString$, Instr(baseString$, "jump", 1), m)</lang>

Logo

Works with: UCB Logo

The following are defined to behave similarly to the built-in index operator ITEM. As with most Logo list operators, these are designed to work for both words (strings) and lists. <lang logo>to items :n :thing

 if :n >= count :thing [output :thing]
 output items :n butlast :thing

end

to butitems :n :thing

 if or :n <= 0 empty? :thing [output :thing]
 output butitems :n-1 butfirst :thing

end

to middle :n :m :thing

 output items :m-(:n-1) butitems :n-1 :thing

end

to lastitems :n :thing

 if :n >= count :thing [output :thing]
output lastitems :n butfirst :thing

end

to starts.with :sub :thing

 if empty? :sub [output "true]
 if empty? :thing [output "false]
 if not equal? first :sub first :thing [output "false]
 output starts.with butfirst :sub butfirst :thing

end

to members :sub :thing

 output cascade [starts.with :sub ?] [bf ?] :thing

end

; note: Logo indices start at one
make "s "abcdefgh
print items 3 butitems 2 :s ; cde
print middle 3 5 :s  ; cde
print butitems 2 :s ; cdefgh
print butlast :s ; abcdefg
print items 3 member "d :s ; def
print items 3 members "de :s ; def</lang>

Lua

<lang lua>str = "abcdefghijklmnopqrstuvwxyz"
n, m = 5, 15

print( string.sub( str, n, m ) )    -- efghijklmno
print( string.sub( str, n, -1 ) )   -- efghijklmnopqrstuvwxyz
print( string.sub( str, 1, -2 ) )   -- abcdefghijklmnopqrstuvwxy

pos = string.find( str, "i" )
if pos ~= nil then print( string.sub( str, pos, pos+m ) ) end   -- ijklmnopqrstuvwx

pos = string.find( str, "ijk" )
if pos ~= nil then print( string.sub( str, pos, pos+m ) ) end   -- ijklmnopqrstuvwx

-- Alternative (more modern) notation

print ( str:sub(n,m) )    -- efghijklmno
print ( str:sub(n) )      -- efghijklmnopqrstuvwxyz
print ( str:sub(1,-2) )   -- abcdefghijklmnopqrstuvwxy

pos = str:find "i"
if pos then print (str:sub(pos,pos+m)) end   -- ijklmnopqrstuvwx

pos = str:find "ijk"
if pos then print (str:sub(pos,pos+m)) end   -- ijklmnopqrstuvwx
</lang>

Mathematica

The StringTake and StringDrop functions are relevant for this exercise.

<lang Mathematica>n = 2
m = 3
StringTake["Mathematica", {n+1, n+m-1}]

StringDrop["Mathematica", n]

(* StringPosition returns a list of starting and ending character positions for a substring *)
pos = StringPosition["Mathematica", "e"][[1]][[1]]
StringTake["Mathematica", {pos, pos+m-1}]

(* Similar to above *)
pos = StringPosition["Mathematica", "the"][[1]][[1]]
StringTake["Mathematica", {pos, pos+m-1}]
</lang>

MATLAB / Octave

Unicode, UTF-8, and UTF-16 are only partially supported. In some cases, a conversion with unicode2native() or native2unicode() is necessary. <lang Matlab>

   % starting from n characters in and of m length;
       s(n+(1:m))
       s(n+1:n+m)
   % starting from n characters in, up to the end of the string;
       s(n+1:end)
   % whole string minus last character;
       s(1:end-1)
   % starting from a known character within the string and of m length;
       s(find(s==c,1)+[0:m-1])
   % starting from a known substring within the string and of m length. 
       s(strfind(s,pattern)+[0:m-1])

</lang>

Maxima

<lang maxima>s: "the quick brown fox jumps over the lazy dog";
substring(s, 17);      /* "fox jumps over the lazy dog" */
substring(s, 17, 20);  /* "fox" */</lang>

MUMPS

MUMPS has the first position in a string numbered as 1. <lang MUMPS> SUBSTR(S,N,M,C,K)

;show substring operations
;S is the string
;N is a position within the string (that is, n<length(string))
;M is an integer of positions to show
;C is a character within the string S
;K is a substring within the string S
;$Find returns the position after the substring
NEW X
WRITE !,"The base string is:",!,?5,"'",S,"'"
WRITE !,"From position ",N," for ",M," characters:"
WRITE !,?5,$EXTRACT(S,N,N+M-1)
WRITE !,"From position ",N," to the end of the string:"
WRITE !,?5,$EXTRACT(S,N,$LENGTH(S))
WRITE !,"Whole string minus last character:"
WRITE !,?5,$EXTRACT(S,1,$LENGTH(S)-1)
WRITE !,"Starting from character '",C,"' for ",M," characters:"
SET X=$FIND(S,C)-$LENGTH(C)
WRITE !,?5,$EXTRACT(S,X,X+M-1)
WRITE !,"Starting from string '",K,"' for ",M," characters:"
SET X=$FIND(S,K)-$LENGTH(K)
W !,?5,$EXTRACT(S,X,X+M-1)
QUIT

</lang> Usage:

USER>D SUBSTR^ROSETTA("ABCD1234efgh",3,4,"D","23")
 
The base string is:
     'ABCD1234efgh'
From position 3 for 4 characters:
     CD12
From position 3 to the end of the string:
     CD1234efgh
Whole string minus last character:
     ABCD1234efg
Starting from character 'D' for 4 characters:
     D123
Starting from string '23' for 4 characters:
     234e

Nemerle

<lang Nemerle>using System; using System.Console;

module Substrings {

   Main() : void
   {
       def s = "0123456789";
       def n = 3;
       def m = 2;
       def c = '3';
       def z = "345";

       WriteLine(s.Substring(n, m));
       WriteLine(s.Substring(n, s.Length - n));
       WriteLine(s.Substring(0, s.Length - 1));
       WriteLine(s.Substring(s.IndexOf(c,0,s.Length), m));
       WriteLine(s.Substring(s.IndexOf(z, 0, s.Length), m));
   }

}</lang>

NetRexx

Translation of: REXX

<lang NetRexx>/* NetRexx */

options replace format comments java crossref savelog symbols

s = 'abcdefghijk'
n = 4
m = 3

say s
say s.substr(n, m)
say s.substr(n)
say s.substr(1, s.length - 1)
say s.substr(s.pos('def'), m)
say s.substr(s.pos('g'), m)

return </lang>

Output

abcdefghijk
def
defghijk
abcdefghij
def
ghi

newLISP

<lang newLISP>> (set 'str "alphabet" 'n 2 'm 4)
4
> ; starting from n characters in and of m length
> (slice str n m)
"phab"
> ; starting from n characters in, up to the end of the string
> (slice str n)
"phabet"
> ; whole string minus last character
> (chop str)
"alphabe"
> ; starting from a known character within the string and of m length
> (slice str (find "l" str) m)
"lpha"
> ; starting from a known substring within the string and of m length
> (slice str (find "ph" str) m)
"phab"
</lang>

Niue

<lang Niue>( based on the JavaScript code ) 'abcdefgh 's ; s str-len 'len ; 2 'n ; 3 'm ;

( starting from n characters in and of m length ) s n n m + substring . ( => cde ) newline

( starting from n characters in, up to the end of the string ) s n len substring . ( => cdefgh ) newline

( whole string minus last character ) s 0 len 1 - substring . ( => abcdefg ) newline

( starting from a known character within the string and of m length ) s s 'b str-find dup m + substring . ( => bcd ) newline

( starting from a known substring within the string and of m length ) s s 'bc str-find dup m + substring . ( => bcd ) newline </lang>

Objeck

<lang objeck> bundle Default {

 class SubString {
   function : Main(args : String[]) ~ Nil {
     s := "0123456789";

     n := 3;
     m := 4;
     c := '2';
     sub := "456";

     s->SubString(n, m)->PrintLine();
     s->SubString(n)->PrintLine();
     s->SubString(0, s->Size() - 1)->PrintLine();
     s->SubString(s->Find(c), m)->PrintLine();
     s->SubString(s->Find(sub), m)->PrintLine();
   }
 }

} </lang>

OCaml

<lang ocaml># let s = "ABCDEFGH" ;;
val s : string = "ABCDEFGH"

# let n, m = 2, 3 ;;
val n : int = 2
val m : int = 3

# String.sub s n m ;;
- : string = "CDE"

# String.sub s n (String.length s - n) ;;
- : string = "CDEFGH"

# String.sub s 0 (String.length s - 1) ;;
- : string = "ABCDEFG"

# String.sub s (String.index s 'D') m ;;
- : string = "DEF"

# #load "str.cma";;
# let n = Str.search_forward (Str.regexp_string "DE") s 0 in
  String.sub s n m ;;
- : string = "DEF"</lang>

Oz

<lang oz>declare

 fun {DropUntil Xs Prefix}
    case Xs of nil then nil
    [] _|Xr then
       if {List.isPrefix Prefix Xs} then Xs
       else {DropUntil Xr Prefix}
       end
    end
 end

 Digits = "1234567890"

in

 {ForAll
  [{List.take {List.drop Digits 2} 3}     = "345"
   {List.drop Digits 2}                   = "34567890"
   {List.take Digits {Length Digits}-1}   = "123456789"
   {List.take {DropUntil Digits "4"} 3}   = "456"
   {List.take {DropUntil Digits "56"} 3}  = "567"
   {List.take {DropUntil Digits "31"} 3}  = ""
  ]
  System.showInfo}</lang>

Pascal

See Delphi

Perl

<lang perl>my $str = 'abcdefgh';
my $n = 2;
my $m = 3;
print substr($str, $n, $m), "\n";
print substr($str, $n), "\n";
print substr($str, 0, -1), "\n";
print substr($str, index($str, 'd'), $m), "\n";
print substr($str, index($str, 'de'), $m), "\n";</lang>

Perl 6

<lang perl6>my $str = 'abcdefgh';
my $n = 2;
my $m = 3;
say $str.substr($n, $m);
say $str.substr($n);
say $str.substr(0, *-1);
say $str.substr($str.index('d'), $m);
say $str.substr($str.index('de'), $m);</lang>

PHP

PicoLisp

<lang PicoLisp>(let Str (chop "This is a string")

  (prinl (head 4 (nth Str 6)))        # From 6 of 4 length
  (prinl (nth Str 6))                 # From 6 up to the end
  (prinl (head -1 Str))               # Minus last character
  (prinl (head 8 (member "s" Str)))   # From character "s" of length 8
  (prinl                              # From "is a" of length 8
     (head 8
        (seek '((S) (pre? "is a" S)) Str) ) ) )</lang>

Output:

is a
is a string
This is a strin
s is a s
is a str

PL/I

<lang PL/I>
s='abcdefghijk';
n=4; m=3;
u=substr(s,n,m);
u=substr(s,n);
u=substr(s,1,length(s)-1);
u=substr(s,index(s,'def'),m);
u=substr(s,index(s,'g'),m);
</lang>

PowerShell

Since .NET and PowerShell use zero-based indexing, all character indexes have to be reduced by one. <lang powershell># test string
$s = "abcdefgh"
# test parameters
$n, $m, $c, $s2 = 2, 3, [char]'d', 'cd'

# starting from n characters in and of m length
# n = 2, m = 3
$s.Substring($n-1, $m)              # returns 'bcd'

# starting from n characters in, up to the end of the string
# n = 2
$s.Substring($n-1)                  # returns 'bcdefgh'

# whole string minus last character
$s.Substring(0, $s.Length - 1)      # returns 'abcdefg'

# starting from a known character within the string and of m length
# c = 'd', m = 3
$s.Substring($s.IndexOf($c), $m)    # returns 'def'

# starting from a known substring within the string and of m length
# s2 = 'cd', m = 3
$s.Substring($s.IndexOf($s2), $m)   # returns 'cde'</lang>

PureBasic

<lang PureBasic>If OpenConsole()

 Define baseString.s, m, n

 baseString = "Thequickbrownfoxjumpsoverthelazydog."
 n = 12
 m = 5

 ;Display the substring starting from n characters in and of m length.
 PrintN(Mid(baseString, n, m))

 ;Display the substring starting from n characters in, up to the end of the string.
 PrintN(Mid(baseString, n)) ;or PrintN(Right(baseString, Len(baseString) - n + 1))

 ;Display the substring whole string minus last character
 PrintN(Left(baseString, Len(baseString) - 1))

 ;Display the substring starting from a known character within the string and of m length.
 PrintN(Mid(baseString, FindString(baseString, "b", 1), m))

 ;Display the substring starting from a known substring within the string and of m length.
 PrintN(Mid(baseString, FindString(baseString, "ju", 1), m))

 Print(#CRLF$ + #CRLF$ + "Press ENTER to exit")
 Input()
 CloseConsole()

EndIf</lang> Sample output:

wnfox
wnfoxjumpsoverthelazydog.
Thequickbrownfoxjumpsoverthelazydog
brown
jumps

Python

Python uses zero-based indexing, so the n'th character is at index n-1.

<lang python>>>> s = 'abcdefgh'
>>> n, m, char, chars = 2, 3, 'd', 'cd'
>>> # starting from n=2 characters in and m=3 in length;
>>> s[n-1:n+m-1]
'bcd'
>>> # starting from n characters in, up to the end of the string;
>>> s[n-1:]
'bcdefgh'
>>> # whole string minus last character;
>>> s[:-1]
'abcdefg'
>>> # starting from a known character char="d" within the string and of m length;
>>> indx = s.index(char)
>>> s[indx:indx+m]
'def'
>>> # starting from a known substring chars="cd" within the string and of m length.
>>> indx = s.index(chars)
>>> s[indx:indx+m]
'cde'
>>></lang>
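
The same five slices can also be collected in a function, which makes the index arithmetic easier to reuse. The following is only a minimal sketch: the name substring_tasks and its parameter order are assumptions made for illustration, not part of the original entry. It assumes Python 3, where str is indexed by code point, so characters above the Basic Multilingual Plane count as one character each.

```python
def substring_tasks(s, n, m, char, chars):
    """Return the five substrings from the task; n is a 1-based offset.

    Hypothetical helper for illustration only.
    """
    i = s.index(char)    # raises ValueError if char does not occur in s
    j = s.index(chars)   # raises ValueError if chars does not occur in s
    return [
        s[n - 1:n - 1 + m],  # starting from n characters in, of m length
        s[n - 1:],           # starting from n characters in, to the end
        s[:-1],              # whole string minus last character
        s[i:i + m],          # starting from a known character, of m length
        s[j:j + m],          # starting from a known substring, of m length
    ]


print(substring_tasks('abcdefgh', 2, 3, 'd', 'cd'))
```

Called with the values from the session above, the function returns the same five strings in the same order.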

R

Racket

<lang racket>#lang racket

(define str "abcdefghijklmnopqrstuvwxyz")

(define n 10)
(define m 2)
(define start-char #\x)
(define start-str "xy")

;; starting from n characters in and of m length;
(substring str n (+ n m)) ; -> "kl"

;; starting from n characters in, up to the end of the string;
(substring str n) ; -> "klmnopqrstuvwxyz"

;; whole string minus last character;
(substring str 0 (sub1 (string-length str))) ; -> "abcdefghijklmnopqrstuvwxy"

;; starting from a known character within the string and of m length;
(let ([start (caar (regexp-match-positions (regexp-quote (string start-char))
                                           str))])
  (substring str start (+ start m))) ; -> "xy"

;; starting from a known substring within the string and of m length.
(let ([start (caar (regexp-match-positions (regexp-quote start-str)
                                           str))])
  (substring str start (+ start m))) ; -> "xy"
</lang>

Raven

<lang Raven>define println use $s
   $s print "\n" print

"0123456789" as $str

$str 3 2 extract println           # at 4th pos get 2 chars
$str 8 4 extract println           # at 9th pos get 4 chars (when only 1 char available)

$str 3 $str length extract println # at 4th pos get all chars to end of str
$str 3 0x7FFFFFFF extract println  # at 4th pos get all chars to end of str

$str 3 -1 extract println          # at 4th pos get rest of chars but last one
$str 0 -1 extract println          # all chars but last one

"3" as $matchChr                   # starting chr for extraction
4 as $subLen                       # Nr chars after found starting char
$str $matchChr split as $l
"" $l 0 set
$l $matchChr join 0 $subLen extract println

"345" as $matchChrs                # starting chrs for extraction
6 as $subLen                       # Nr chars after found starting chars
$str $matchChrs split as $l
"" $l 0 set
$l $matchChrs join 0 $subLen extract println</lang>

Output:

REBOL

<lang REBOL>REBOL [
	Title: "Retrieve Substring"
	Author: oofoe
	Date: 2009-12-06
	URL: http://rosettacode.org/wiki/Retrieve_a_substring
]

s: "abcdefgh"  n: 2  m: 3  char: #"d"  chars: "cd"

; Note that REBOL uses base-1 indexing. Strings are series values,
; just like blocks or lists so I can use the same words to manipulate
; them. All these examples use the 'copy' function against the 's'
; string with a particular offset as needed.

; For the fragment "copy/part skip s n - 1 m", read from right to
; left. First you have 'm', which we ignore for now. Then evaluate
; 'n - 1' (makes 1), to adjust the offset. Then 'skip' jumps from the
; start of the string by that offset. 'copy' starts copying from the
; new start position and the '/part' refinement limits the copy by 'm'
; characters.

print ["Starting from n, length m:" copy/part skip s n - 1 m]

; It may be helpful to see the expression with optional parentheses:

print ["Starting from n, length m (parens):" (copy/part (skip s (n - 1)) m)]

; This example is much simpler, so hopefully it's easier to see how
; the string start is positioned for the copy:

print ["Starting from n to end of string:" copy skip s n - 1]

print ["Whole string minus last character:" copy/part s (length? s) - 1]

print ["Starting from known character, length m:" copy/part find s char m]

print ["Starting from substring, length m:" copy/part find s chars m]</lang>

Output:

Script: "Retrieve Substring" (6-Dec-2009)
Starting from n, length m: bcd
Starting from n, length m (parens): bcd
Starting from n to end of string: bcdefgh
Whole string minus last character: abcdefg
Starting from known character, length m: def
Starting from substring, length m: cde

REXX

Note: in REXX, the 1st character index of a string is 1, not 0. <lang rexx>/*REXX program demonstrates various ways to extract substrings from a string of characters. */
s='abcdefghijk';  n=4;  m=3        /*define some REXX constants (string, index, length of string).*/
say 'original string:' s           /* [↑]  M  can be zero  (which indicates a null string).      */

u=substr(s,n,m)                    /*starting from  N  characters in and of  M  length.           */
say u

u=substr(s,n)                      /*starting from  N  characters in, up to the end-of-string.    */
say u

u=substr(s,1,length(s)-1)          /*OK:     the whole string except the last character.          */
u=substr(s,1,max(0,length(s)-1))   /*better: this version handles the case of a null string.      */
say u

u=substr(s,pos('def',s),m)         /*starting from a known substr within the string & of M length.*/
say u

u=substr(s,pos('g',s),m)           /*starting from a known char within the string & of M length.  */
say u
                                   /*stick a fork in it sir, we're all done and Bob's your uncle. */</lang>

output

original string: abcdefghijk
def
defghijk
abcdefghij
def
ghi

Ruby

<lang ruby>str = 'abcdefgh'
n = 2
m = 3
puts str[n, m]
puts str[n..-1]
puts str[0..-2]
puts str[str.index('d'), m]
puts str[str.index('de'), m]
puts str[/a.*d/]</lang>

Run BASIC

<lang runbasic>n  = 2
m  = 3
a$ = "abcdefgh"                   ' string to take substrings from
c$ = "d"                          ' known character
s$ = "cd"                         ' known substring
print mid$(a$,n,m)                ' starting from n characters in and of m length
print mid$(a$,n)                  ' starting from n characters in, up to the end of the string
print mid$(a$,1,len(a$)-1)        ' whole string minus last character
print mid$(a$,instr(a$,c$,1),m)   ' starting from a known character within the string and of m length
print mid$(a$,instr(a$,s$,1),m)   ' starting from a known substring within the string and of m length</lang>

SAS

<lang sas>data _null_;

  a="abracadabra";
  b=substr(a,2,3); /* first number is position, starting at 1,
                      second number is length */
  put _all_;

run;</lang>

Sather

<lang sather>class MAIN is

 main is
   s ::= "hello world shortest program";
   #OUT + s.substring(12, 5) + "\n";
   #OUT + s.substring(6) + "\n";
   #OUT + s.head( s.size - 1) + "\n";
   #OUT + s.substring(s.search('w'), 5) + "\n";
   #OUT + s.substring(s.search("ro"), 3) + "\n";
 end;

end;</lang>

Scala

<lang scala>val str = "The good life is one inspired by love and guided by knowledge."
val n = 21
val m = 16

println(str.slice(n, n+m))
println(str.slice(n, str.length))
println(str.slice(0, str.length-1))
println(str.slice(str.indexOf('l'), str.indexOf('l')+m))
println(str.slice(str.indexOf("good"), str.indexOf("good")+m))</lang>

Scheme

Works with: Guile

<lang scheme>(define s "Hello, world!") (define n 5) (define m (+ n 6))

(display (substring s n m)) (newline)

(display (substring s n)) (newline)

(display (substring s 0 (- (string-length s) 1))) (newline)

(display (substring s (string-index s #\o) m)) (newline)

(display (substring s (string-contains s "lo") m)) (newline)</lang>

Sed

<lang bash># 2 chars starting from 3rd
$ echo string | sed -r 's/.{3}(.{2}).*/\1/'
in

# remove first 3 chars
$ echo string | sed -r 's/^.{3}//'
ing

# delete last char
$ echo string | sed -r 's/.$//'
strin

# `r' with two following chars
$ echo string | sed -r 's/.*(r.{2}).*/\1/'
rin
</lang>

Seed7

<lang seed7>$ include "seed7_05.s7i";

const proc: main is func

 local
   const string: stri is "abcdefgh";
   const integer: N is 2;
   const integer: M is 3;
 begin
   writeln(stri[N len M]);
   writeln(stri[N ..]);
   writeln(stri[.. pred(length(stri))]);
   writeln(stri[pos(stri, 'c') len M]);
   writeln(stri[pos(stri, "de") len M]);
 end func;</lang>

Sample output:

bcd
bcdefgh
abcdefg
cde
def

Slate

<lang slate>
s := 'hello world shortest program'.
n := 13.
m := 4.

inform: (s copyFrom: n to: n + m).
inform: (s copyFrom: n).
inform: s allButLast.
inform: (s copyFrom: (s indexOf: $w) to: (s indexOf: $w) + m).
inform: (s copyFrom: (s indexOfSubSeq: 'ro') to: (s indexOfSubSeq: 'ro') + m).
</lang>

Smalltalk

The distinction between searching for a single character and searching for a string within another string is rather blurred. In the following code, instead of using 'w' (a string) we could use $w (a character), but it makes no difference.

<lang smalltalk>|s| s := 'hello world shortest program'.

(s copyFrom: 13 to: (13+4)) displayNl. "4 is the length (5) - 1, since we need the index of the

last char we want, which is included"

(s copyFrom: 7) displayNl. (s allButLast) displayNl.

(s copyFrom: ((s indexOfRegex: 'w') first)

  to: ( ((s indexOfRegex: 'w') first) + 4) ) displayNl.

(s copyFrom: ((s indexOfRegex: 'ro') first)

  to: ( ((s indexOfRegex: 'ro') first) + 2) ) displayNl.</lang>

These last two examples in particular seem rather complex, so we can extend the string class.

Works with: GNU Smalltalk

<lang smalltalk>String extend [

 copyFrom: index length: nChar [
   ^ self copyFrom: index to: ( index + nChar - 1 )
 ]
 copyFromRegex: regEx length: nChar [
   |i|
   i := self indexOfRegex: regEx.
   ^ self copyFrom: (i first) length: nChar
 ]

].

"and show it simpler..."

(s copyFrom: 13 length: 5) displayNl. (s copyFromRegex: 'w' length: 5) displayNl. (s copyFromRegex: 'ro' length: 3) displayNl.</lang>

SNOBOL4

<lang snobol>        string = "abcdefghijklmnopqrstuvwxyz"
        n = 12
        m = 5
        known_char = "q"
        known_str = "pq"

* starting from n characters in and of m length;
        string len(n - 1) len(m) . output

* starting from n characters in, up to the end of the string;
        string len(n - 1) rem . output

* whole string minus last character;
        string rtab(1) . output

* starting from a known character within the string and of m length;
        string break(known_char) len(m) . output

* starting from a known substring <= m within the string and of m length.
        string (known_str len(m - size(known_str))) . output
end</lang>

Output:

 lmnop
 lmnopqrstuvwxyz
 abcdefghijklmnopqrstuvwxy
 qrstu
 pqrst

Tcl

<lang tcl>set str "abcdefgh"
set n 2
set m 3

puts [string range $str $n [expr {$n+$m-1}]]
puts [string range $str $n end]
puts [string range $str 0 end-1]

# Because Tcl does substrings with a pair of indices, it is easier to express
# the last two parts of the task as a chained pair of [string range] operations.
# A maximally efficient solution would calculate the indices in full first.
puts [string range [string range $str [string first "d" $str] end] 0 [expr {$m-1}]]
puts [string range [string range $str [string first "de" $str] end] 0 [expr {$m-1}]]

# From Tcl 8.5 onwards, these can be contracted somewhat.
puts [string range [string range $str [string first "d" $str] end] 0 $m-1]
puts [string range [string range $str [string first "de" $str] end] 0 $m-1]</lang> Of course, if you were doing 'position-plus-length' a lot, it would be easier to add another subcommand to string, like this:

Works with: Tcl version 8.5

<lang tcl># Define the substring operation, efficiently
proc ::substring {string start length} {
    string range $string $start [expr {$start + $length - 1}]
}
# Plumb it into the language
set ops [namespace ensemble configure string -map]
dict set ops substr ::substring
namespace ensemble configure string -map $ops

# Now show off by repeating the challenge!
set str "abcdefgh"
set n 2
set m 3

puts [string substr $str $n $m]
puts [string range $str $n end]
puts [string range $str 0 end-1]
puts [string substr $str [string first "d" $str] $m]
puts [string substr $str [string first "de" $str] $m]</lang>

TUSCRIPT

<lang tuscript>
$$ MODE TUSCRIPT
string="abcdefgh", n=4,m=n+2
substring=EXTRACT (string,#n,#m)
 PRINT substring
substring=Extract (string,#n,0)
 PRINT substring
substring=EXTRACT (string,0,-1)
 PRINT substring
n=SEARCH (string,":d:"),m=n+2
substring=EXTRACT (string,#n,#m)
 PRINT substring
substring=EXTRACT (string,":{substring}:"|,0)
 PRINT substring
</lang> Output:

de
defgh
abcdefg
de
fgh

UNIX Shell

POSIX shells

Works with: Almquist Shell

<lang bash>str="abc qrdef qrghi"
n=6
m=3

expr "x$str" : "x.\{$n\}\(.\{1,$m\}\)"
expr "x$str" : "x.\{$n\}\(.*\)"
printf '%s\n' "${str%?}"
expr "r${str#*r}" : "\(.\{1,$m\}\)"
expr "qr${str#*qr}" : "\(.\{1,$m\}\)"</lang>

def
def qrghi
abc qrdef qrgh
rde
qrd

This program uses expr(1) to capture a substring.

Bourne Shell

Works with: Bourne Shell

<lang bash>str="abc qrdef qrghi"
n=6
m=3

expr "x$str" : "x.\{$n\}\(.\{1,$m\}\)"
expr "x$str" : "x.\{$n\}\(.*\)"
expr "x$str" : "x\(.*\)."

index() {
	i=0 s=$1
	until test "x$s" = x || expr "x$s" : "x$2" >/dev/null; do
		i=`expr $i + 1` s=`expr "x$s" : "x.\(.*\)"`
	done
	echo $i
}
expr "x$str" : "x.\{`index "$str" r`\}\(.\{1,$m\}\)"
expr "x$str" : "x.\{`index "$str" qr`\}\(.\{1,$m\}\)"</lang>

def
def qrghi
abc qrdef qrgh
rde
qrd

zsh

Works with: zsh

Note that the last two constructs won't work with bash as only zsh supports nested string manipulation. <lang bash>
#!/bin/zsh
string='abcdefghijk'
echo ${string:2:3}              # Display 3 chars starting 2 chars in, i.e. 'cde'
echo ${string:2}                # Starting 2 chars in, display to end of string
echo ${string:0:${#string}-1}   # Whole string minus last character
echo ${string%?}                # Shorter variant of the above
echo ${${string/*c/c}:0:3}      # Display 3 chars starting with 'c'
echo ${${string/*cde/cde}:0:3}  # Display 3 chars starting with 'cde'
</lang>

Pipe

This example shows how to cut(1) a substring from a string.

Translation of: AWK

Works with: Almquist Shell

<lang bash>#!/bin/sh
str=abcdefghijklmnopqrstuvwxyz
n=12
m=5

printf %s "$str" | cut -c $n-`expr $n + $m - 1`
printf %s "$str" | cut -c $n-
printf '%s\n' "${str%?}"
printf q%s "${str#*q}" | cut -c 1-$m
printf pq%s "${str#*pq}" | cut -c 1-$m</lang>

Output:

$ sh substring.sh                                                              
lmnop
lmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxy
qrstu
pqrst

cut -c counts characters from 1.
cut(1) runs on each line of standard input, therefore the string must not contain a newline.
One can use the old style `expr $n + $m - 1` or the new style $((n + m - 1)) to calculate the index.
cut(1) prints the substring to standard output. To put the substring in a variable, use one of
- var=`printf %s "$str" | cut -c $n-\`expr $n + $m - 1\``
- var=$( printf %s "$str" | cut -c $n-$((n + m - 1)) )

Vala

<lang vala>
string s = "Hello, world!";
int n = 1;
int m = 3;
// start at n and go m letters
string s_n_to_m = s[n:n+m];
// start at n and go to end
string s_n_to_end = s[n:s.length];
// start at beginning and show all but last
string s_notlast = s[0:s.length - 1];
// start from known letter and then go m letters
int index_of_l = s.index_of("l");
string s_froml_for_m = s[index_of_l:index_of_l + m];
// start from known substring then go m letters
int index_of_lo = s.index_of("lo");
string s_fromlo_for_m = s[index_of_lo:index_of_lo + m];
</lang>

Yorick

<lang yorick>str = "abcdefgh";
n = 2;
m = 3;

// starting from n character in and of m length
write, strpart(str, n:n+m-1);
// starting from n character in, up to the end of the string
write, strpart(str, n:);
// whole string minus last character
write, strpart(str, :-1);
// starting from a known character within the string and of m length
match = strfind("d", str);
write, strpart(str, [match(1), match(1)+m]);
// starting from a known substring within the string and of m length
match = strfind("cd", str);
write, strpart(str, [match(1), match(1)+m]);</lang>