Determine if a string has all the same characters: Difference between revisions

From Rosetta Code
(Added Algol 68)
(→‎{{header|Perl}}: added some Unicode support)
Line 285: Line 285:
  use warnings;
  use feature 'say';
+ use utf8;
+ binmode(STDOUT, ':utf8');
  use List::AllUtils qw(uniq);
+ use Unicode::UCD 'char info';
+ use Unicode::Normalize qw(NFC);

  for my $str (
Line 295: Line 299:
  'tttTTT',
  '4444 444k',
- ) {
- printf qq{\n"$str" (length: %d) has }, length $str;
- my @U = uniq my @S = split //, $str;
+ 'Δ👍👨',
+ "\N{LATIN CAPITAL LETTER A}\N{COMBINING DIAERESIS}\N{COMBINING MACRON}" .
+ "\N{LATIN CAPITAL LETTER A WITH DIAERESIS}\N{COMBINING MACRON}" .
+ "\N{LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON}") {
+ my $len = 0;
+ $len++ while ($str =~ /\X/g);
+ push @res, sprintf qq{\n"$str" (length: %d) has }, $len;
+ my @U = uniq my @S = split //, NFC $str;
  if (1 != @U and @U > 0) {
  say 'different characters:';
  for my $l (@U) {
- printf "'%s' (0x%x) in positions: %s\n",
- $l, ord($l), join ', ', map { 1+$_ } grep { $l eq $S[$_] } 0..$#S;
+ push @res, sprintf "'%s' %s (0x%x) in positions: %s",
+ $l, charinfo(ord $l)->{'name'}, ord($l), join ', ', map { 1+$_ } grep { $l eq $S[$_] } 0..$#S;
  }
  } else {
Line 318: Line 327:

  ".55" (length: 3) has different characters:
- '.' (0x2e) in positions: 1
- '5' (0x35) in positions: 2, 3
+ '.' FULL STOP (0x2e) in positions: 1
+ '5' DIGIT FIVE (0x35) in positions: 2, 3

  "tttTTT" (length: 6) has different characters:
- 't' (0x74) in positions: 1, 2, 3
- 'T' (0x54) in positions: 4, 5, 6
+ 't' LATIN SMALL LETTER T (0x74) in positions: 1, 2, 3
+ 'T' LATIN CAPITAL LETTER T (0x54) in positions: 4, 5, 6

  "4444 444k" (length: 9) has different characters:
- '4' (0x34) in positions: 1, 2, 3, 4, 6, 7, 8
- ' ' (0x20) in positions: 5
- 'k' (0x6b) in positions: 9</pre>
+ '4' DIGIT FOUR (0x34) in positions: 1, 2, 3, 4, 6, 7, 8
+ ' ' SPACE (0x20) in positions: 5
+ 'k' LATIN SMALL LETTER K (0x6b) in positions: 9
+
+ "Δ👍👨" (length: 3) has different characters:
+ 'Δ' GREEK CAPITAL LETTER DELTA (0x394) in positions: 1
+ '👍' THUMBS UP SIGN (0x1f44d) in positions: 2
+ '👨' MAN (0x1f468) in positions: 3
+
+ "ǞǞǞ" (length: 3) has the same character in all positions.</pre>

  =={{header|Perl 6}}==

Revision as of 14:05, 4 November 2019

Determine if a string has all the same characters is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.
Task

Given a character string   (which may be empty, or have a length of zero characters):

  •   create a function/procedure/routine to:
  •   determine if all the characters in the string are the same
  •   indicate if or which character is different from the previous character
  •   display each string and its length   (as the strings are being examined)
  •   a zero─length (empty) string shall be considered as all the same character(s)
  •   process the strings from left─to─right
  •   if       all the same character,   display a message saying such
  •   if not all the same character,   then:
  •   display a message saying such
  •   display what character is different
  •   only the 1st different character need be displayed
  •   display where the different character is in the string
  •   the above messages can be part of a single message
  •   display the hexadecimal value of the different character


Use (at least) these seven test values   (strings):

  •   a string of length   0   (an empty string)
  •   a string of length   3   which contains three blanks
  •   a string of length   1   which contains:   2
  •   a string of length   3   which contains:   333
  •   a string of length   3   which contains:   .55
  •   a string of length   6   which contains:   tttTTT
  •   a string of length   9   which has a blank in the middle:   4444 444k


Show all output here on this page.
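
As a rough orientation before the per-language entries, the requirements above can be sketched in a few lines of Python. This is only one possible reading of the task, not a reference solution: the names `first_difference` and `report` are made up for this illustration, positions are reported 1-based, and "character" is taken to mean a single code point.

```python
def first_difference(s):
    """Return (position, char) for the first character that differs from
    the character before it, scanning left to right, or None if every
    character is the same. Positions are 1-based."""
    for i in range(1, len(s)):
        if s[i] != s[i - 1]:
            return i + 1, s[i]
    return None  # also covers the empty and the one-character string


def report(s):
    """Build the single-line message the task asks for (hypothetical format)."""
    head = f'"{s}" (length {len(s)}): '
    diff = first_difference(s)
    if diff is None:
        return head + "all characters are the same"
    pos, ch = diff
    return head + f"first different character '{ch}' (0x{ord(ch):x}) at position {pos}"


if __name__ == "__main__":
    # The seven test values listed in the task description.
    for s in ["", "   ", "2", "333", ".55", "tttTTT", "4444 444k"]:
        print(report(s))
```

Returning `None` for the "all the same" case keeps the empty string from needing a special branch, which matches the rule that a zero-length string counts as all the same character.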


Related tasks



ALGOL 68

<lang algol68>BEGIN

   # return the position of the first different character in s           #
   #     or UPB s + 1 if all the characters are the same                 #
   OP  FIRSTDIFF = ( STRING s )INT:
       IF UPB s <= LWB s
       THEN
           # 0 or 1 character                                            #
           UPB s + 1
       ELSE
           # two or more characters                                      #
           INT  result := LWB s + 1;
           CHAR c1      = s[ LWB s ];
           FOR s pos FROM LWB s + 1 TO UPB s WHILE s[ s pos ] = c1 DO result +:= 1 OD;
           result
       FI # FIRSTDIFF # ;
   # convert a character to a hex string                                 #
   PROC hex = ( CHAR c )STRING:
       BEGIN
           STRING result := "";
           INT    n      := ABS c;
           IF n = 0
           THEN
               result := "0"
           ELSE
               WHILE n > 0 DO
                   INT d = n MOD 16;
                   n OVERAB 16;
                   IF d < 10
                   THEN REPR ( d + ABS "0" )
                   ELSE REPR ( ( d - 10 ) + ABS "a" )
                   FI +=: result
               OD
           FI;
           result
       END # hex # ;
   # show whether s contains all the same character or the first diff    #
   PROC show first diff = ( STRING s )VOID:
       IF  print( ( """", s, """ (length ", whole( ( UPB s + 1 ) - LWB s, 0 ), "): " ) );
           INT diff pos = FIRSTDIFF s;
           diff pos > UPB s
       THEN
           # all characters the same                                     #
           print( ( "all characters are the same", newline ) )
       ELSE
           # not all characters are the same                             #
           print( ( "first different character """
                  , s[ diff pos ]
                  , """(0x", hex( s[ diff pos ] )
                  , ") at position: "
                  , whole( diff pos, 0 )
                  , newline
                  )
                )
       FI # show first diff # ;
   # task test cases                                                     #
   show first diff( ""          );
   show first diff( "   "       );
   show first diff( "2"         );
   show first diff( "333"       );
   show first diff( ".55"       );
   show first diff( "tttTTT"    );
   show first diff( "4444 444k" )

END</lang>

Output:
"" (length 0): all characters are the same
"   " (length 3): all characters are the same
"2" (length 1): all characters are the same
"333" (length 3): all characters are the same
".55" (length 3): first different character "5"(0x35) at position: 2
"tttTTT" (length 6): first different character "T"(0x54) at position: 4
"4444 444k" (length 9): first different character " "(0x20) at position: 5

Factor

<lang factor>USING: formatting io kernel math.parser sequences ;

: find-diff ( str -- i elt ) dup ?first [ = not ] curry find ;
: len. ( str -- ) dup length "%u — length %d — " printf ;
: same. ( -- ) "contains all the same character." print ;
: diff. ( -- ) "contains a different character at " write ;
: not-same. ( i elt -- )
    dup >hex diff. "index %d: '%c' (0x%s)\n" printf ;
: sameness-report. ( str -- )
    dup len. find-diff dup [ not-same. ] [ 2drop same. ] if ;

{

   ""
   "   "
   "2"
   "333"
   ".55"
   "tttTTT"
   "4444 444k"

} [ sameness-report. ] each</lang>

Output:
"" — length 0 — contains all the same character.
"   " — length 3 — contains all the same character.
"2" — length 1 — contains all the same character.
"333" — length 3 — contains all the same character.
".55" — length 3 — contains a different character at index 1: '5' (0x35)
"tttTTT" — length 6 — contains a different character at index 3: 'T' (0x54)
"4444 444k" — length 9 — contains a different character at index 4: ' ' (0x20)

Go

<lang go>package main

import "fmt"

func analyze(s string) {

   chars := []rune(s)
   le := len(chars)
   fmt.Printf("Analyzing %q which has a length of %d:\n", s, le)
   if le > 1 {
       for i := 1; i < le; i++ {
           if chars[i] != chars[i-1] {
               fmt.Println("  Not all characters in the string are the same.")
               fmt.Printf("  %q (%#[1]x) is different at position %d.\n\n", chars[i], i+1)
               return
           }
       }
   }
   fmt.Println("  All characters in the string are the same.\n")

}

func main() {

   strings := []string{
       "",
       "   ",
       "2",
       "333",
       ".55",
       "tttTTT",
       "4444 444k",
       "pépé",
       "🐶🐶🐺🐶",
       "🎄🎄🎄🎄",
   }
   for _, s := range strings {
       analyze(s)
   }

}</lang>

Output:
Analyzing "" which has a length of 0:
  All characters in the string are the same.

Analyzing "   " which has a length of 3:
  All characters in the string are the same.

Analyzing "2" which has a length of 1:
  All characters in the string are the same.

Analyzing "333" which has a length of 3:
  All characters in the string are the same.

Analyzing ".55" which has a length of 3:
  Not all characters in the string are the same.
  '5' (0x35) is different at position 2.

Analyzing "tttTTT" which has a length of 6:
  Not all characters in the string are the same.
  'T' (0x54) is different at position 4.

Analyzing "4444 444k" which has a length of 9:
  Not all characters in the string are the same.
  ' ' (0x20) is different at position 5.

Analyzing "pépé" which has a length of 4:
  Not all characters in the string are the same.
  'é' (0xe9) is different at position 2.

Analyzing "🐶🐶🐺🐶" which has a length of 4:
  Not all characters in the string are the same.
  '🐺' (0x1f43a) is different at position 3.

Analyzing "🎄🎄🎄🎄" which has a length of 4:
  All characters in the string are the same.

Pascal

<lang pascal>program SameNessOfChar;
{$IFDEF FPC}
   {$MODE DELPHI}{$OPTIMIZATION ON,ALL}{$CODEALIGN proc=16}{$ALIGN 16}
{$ELSE}
  {$APPTYPE CONSOLE}
{$ENDIF}
uses
  sysutils;//Format
const
  TestData : array[0..6] of String =
     ('','   ','2','333','.55','tttTTT','4444 444k');

// Returns the 1-based position of the first character that differs
// from s[1], or 0 if all characters are the same.
function PosOfDifferentChar(const s: String):NativeInt;
var
  i: NativeInt;
  ch:char;
Begin
  result := 0;
  IF length(s) < 2 then
    EXIT;
  ch := s[1];
  // scan up to and including the last character
  For i := 2 to length(s) do
    IF s[i] <> ch then
    Begin
      result := i;
      EXIT;
    end;
end;

procedure OutIsAllSame(const s: String);
var
  l,len: NativeInt;
Begin
  l := PosOfDifferentChar(s);
  len := Length(s);
  write('"',s,'" of length ',len);
  IF l = 0 then
    writeln(' contains all the same character')
  else
    writeln(Format(' is different at position %d "%s" (0x%X)',[l,s[l],Ord(s[l])]));
end;

var
  i : NativeInt;
begin
  For i := Low(TestData) to HIgh(TestData) do
    OutIsAllSame(TestData[i]);
end.</lang>

Output:
"" of length 0 contains all the same character
"   " of length 3 contains all the same character
"2" of length 1 contains all the same character
"333" of length 3 contains all the same character
".55" of length 3 is different at position 2 "5" (0x35)
"tttTTT" of length 6 is different at position 4 "T" (0x54)
"4444 444k" of length 9 is different at position 5 " " (0x20)

Perl

<lang perl>use strict;
use warnings;
use feature 'say';
use utf8;
binmode(STDOUT, ':utf8');
use List::AllUtils qw(uniq);
use Unicode::UCD 'charinfo';
use Unicode::Normalize qw(NFC);

for my $str (
    '',
    '   ',
    '2',
    '333',
    '.55',
    'tttTTT',
    '4444 444k',
    'Δ👍👨',
    "\N{LATIN CAPITAL LETTER A}\N{COMBINING DIAERESIS}\N{COMBINING MACRON}" .
    "\N{LATIN CAPITAL LETTER A WITH DIAERESIS}\N{COMBINING MACRON}" .
    "\N{LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON}") {
    my $len = 0;
    $len++ while ($str =~ /\X/g);             # count grapheme clusters, not code points
    printf qq{\n"$str" (length: %d) has }, $len;
    my @U = uniq my @S = split //, NFC $str;  # compare after canonical composition
    if (1 != @U and @U > 0) {
        say 'different characters:';
        for my $l (@U) {
            printf "'%s' %s (0x%x) in positions: %s\n",
                $l, charinfo(ord $l)->{'name'}, ord($l), join ', ', map { 1+$_ } grep { $l eq $S[$_] } 0..$#S;
        }
    } else {
        say 'the same character in all positions.'
    }
}</lang>

Output:
"" (length: 0) has the same character in all positions.

"   " (length: 3) has the same character in all positions.

"2" (length: 1) has the same character in all positions.

"333" (length: 3) has the same character in all positions.

".55" (length: 3) has different characters:
'.' FULL STOP (0x2e) in positions: 1
'5' DIGIT FIVE (0x35) in positions: 2, 3

"tttTTT" (length: 6) has different characters:
't' LATIN SMALL LETTER T (0x74) in positions: 1, 2, 3
'T' LATIN CAPITAL LETTER T (0x54) in positions: 4, 5, 6

"4444 444k" (length: 9) has different characters:
'4' DIGIT FOUR (0x34) in positions: 1, 2, 3, 4, 6, 7, 8
' ' SPACE (0x20) in positions: 5
'k' LATIN SMALL LETTER K (0x6b) in positions: 9

"Δ👍👨" (length: 3) has different characters:
'Δ' GREEK CAPITAL LETTER DELTA (0x394) in positions: 1
'👍' THUMBS UP SIGN (0x1f44d) in positions: 2
'👨' MAN (0x1f468) in positions: 3

"ǞǞǞ" (length: 3) has the same character in all positions.

Perl 6

Works with: Rakudo version 2019.07.1

The last string demonstrates how Perl 6 can recognize that glyphs made up of different combinations of combining characters can compare the same. It is built up from explicit codepoints to show that each of the glyphs is made up of different combinations.
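
For readers more familiar with Python, the same idea can be approximated with canonical composition (NFC) from the standard `unicodedata` module. This is a sketch under assumptions, not an equivalent of what Perl 6 does: Perl 6 compares whole graphemes, whereas the hypothetical helper `all_same_after_nfc` below only works when a precomposed code point exists, as it does for this particular glyph.

```python
import unicodedata

# Three spellings of the same glyph, built from explicit code points.
decomposed = "A\u0308\u0304"  # A + COMBINING DIAERESIS + COMBINING MACRON
partial = "\u00c4\u0304"      # A WITH DIAERESIS + COMBINING MACRON
composed = "\u01de"           # A WITH DIAERESIS AND MACRON


def all_same_after_nfc(s):
    """True if every code point of the NFC-normalized string is identical."""
    return len(set(unicodedata.normalize("NFC", s))) <= 1


text = decomposed + partial + composed
print(len(text))                 # 6 code points before normalization
print(len(set(text)) <= 1)       # False: the raw code points differ
print(all_same_after_nfc(text))  # True: each spelling composes to U+01DE
```

A multi-code-point grapheme with no precomposed form, such as the flag in the test data below, would still be seen as several different characters by this sketch.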

<lang perl6> -> $str {

   my $i = 0;
   print "\n{$str.perl} (length: {$str.chars}), has ";
   my %m;
   %m{$_}.push: ++$i for $str.comb;
   if %m > 1 {
       say "different characters:";
       say "'{.key}' ({.key.uninames}; hex ordinal: {(.key.ords).fmt: "0x%X"})" ~
       " in positions: {.value.join: ', '}" for %m.sort( *.value[0] );
   } else {
       say "the same character in all positions."
   }


} for

   '',
   '   ',
   '2',
   '333',
   '.55',
   'tttTTT',
   '4444 444k',
   '🇬🇧🇬🇧🇬🇧🇬🇧',
   "\c[LATIN CAPITAL LETTER A]\c[COMBINING DIAERESIS]\c[COMBINING MACRON]" ~
   "\c[LATIN CAPITAL LETTER A WITH DIAERESIS]\c[COMBINING MACRON]" ~
   "\c[LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON]"</lang>
Output:
"" (length: 0), has the same character in all positions.

"   " (length: 3), has the same character in all positions.

"2" (length: 1), has the same character in all positions.

"333" (length: 3), has the same character in all positions.

".55" (length: 3), has different characters:
'.' (FULL STOP; hex ordinal: 0x2E) in positions: 1
'5' (DIGIT FIVE; hex ordinal: 0x35) in positions: 2, 3

"tttTTT" (length: 6), has different characters:
't' (LATIN SMALL LETTER T; hex ordinal: 0x74) in positions: 1, 2, 3
'T' (LATIN CAPITAL LETTER T; hex ordinal: 0x54) in positions: 4, 5, 6

"4444 444k" (length: 9), has different characters:
'4' (DIGIT FOUR; hex ordinal: 0x34) in positions: 1, 2, 3, 4, 6, 7, 8
' ' (SPACE; hex ordinal: 0x20) in positions: 5
'k' (LATIN SMALL LETTER K; hex ordinal: 0x6B) in positions: 9

"🇬🇧🇬🇧🇬🇧🇬🇧" (length: 4), has the same character in all positions.

"ǞǞǞ" (length: 3), has the same character in all positions.

Python

Functional

What we are testing here is the cardinality of the set of characters from which a string is drawn, so the first thought might well be to use set.

On the other hand, itertools.groupby has the advantage of yielding richer information (the list of groups is ordered), for less work.
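
The contrast between the two approaches can be seen in a small side-by-side sketch. The helper names `all_same_by_set` and `runs` are invented for this illustration and are not part of the entry that follows.

```python
from itertools import groupby


def all_same_by_set(s):
    """Cardinality test: the string draws on at most one distinct character."""
    return len(set(s)) <= 1


def runs(s):
    """Ordered (char, run_length) pairs for consecutive equal characters."""
    return [(c, len(list(g))) for c, g in groupby(s)]


s = "tttTTT"
print(all_same_by_set(s))  # False: a yes/no answer only
print(runs(s))             # [('t', 3), ('T', 3)]: order and run lengths are kept
```

Because the groups come back in order, the position of the first change is simply the length of the first run plus one, which the set-based test cannot provide.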

Works with: Python version 3.7

<lang python>'''Determine if a string has all the same characters'''

from itertools import groupby


# firstDifferingCharLR :: String -> Either String Dict
def firstDifferingCharLR(s):
    '''Either a message reporting that no character changes were
       seen, or a dictionary with details of the  first character
       (if any) that differs from that at the head of the string.
    '''
    def details(xs):
        c = xs[1][0]
        return {
            'char': repr(c),
            'hex': hex(ord(c)),
            'index': s.index(c),
            'total': len(s)
        }
    xs = list(groupby(s))
    return Right(details(xs)) if 1 < len(xs) else (
        Left('Total length ' + str(len(s)) + ' - No character changes.')
    )


# TEST ----------------------------------------------------
# main :: IO ()
def main():
    '''Test of 7 strings'''

    print(fTable('First, if any, points of difference:\n')(repr)(
        either(identity)(
            lambda dct: dct['char'] + ' (' + dct['hex'] +
            ') at character ' + str(1 + dct['index']) +
            ' of ' + str(dct['total']) + '.'
        )
    )(firstDifferingCharLR)([
        '',
        '   ',
        '2',
        '333',
        '.55',
        'tttTTT',
        '4444 444'
    ]))


# GENERIC -------------------------------------------------

# either :: (a -> c) -> (b -> c) -> Either a b -> c
def either(fl):
    '''The application of fl to e if e is a Left value,
       or the application of fr to e if e is a Right value.
    '''
    return lambda fr: lambda e: fl(e['Left']) if (
        None is e['Right']
    ) else fr(e['Right'])


# identity :: a -> a
def identity(x):
    '''The identity function.'''
    return x


# fTable :: String -> (a -> String) ->
# (b -> String) -> (a -> b) -> [a] -> String
def fTable(s):
    '''Heading -> x display function -> fx display function ->
       f -> xs -> tabular string.
    '''
    def go(xShow, fxShow, f, xs):
        ys = [xShow(x) for x in xs]
        w = max(map(len, ys))
        return s + '\n' + '\n'.join(map(
            lambda x, y: y.rjust(w, ' ') + ' -> ' + fxShow(f(x)),
            xs, ys
        ))
    return lambda xShow: lambda fxShow: lambda f: lambda xs: go(
        xShow, fxShow, f, xs
    )


# Left :: a -> Either a b
def Left(x):
    '''Constructor for an empty Either (option type) value
       with an associated string.
    '''
    return {'type': 'Either', 'Right': None, 'Left': x}


# Right :: b -> Either a b
def Right(x):
    '''Constructor for a populated Either (option type) value'''
    return {'type': 'Either', 'Left': None, 'Right': x}


# MAIN ---
if __name__ == '__main__':
    main()</lang>
Output:
First, if any, points of difference:

        '' -> Total length 0 - No character changes.
     '   ' -> Total length 3 - No character changes.
       '2' -> Total length 1 - No character changes.
     '333' -> Total length 3 - No character changes.
     '.55' -> '5' (0x35) at character 2 of 3.
  'tttTTT' -> 'T' (0x54) at character 4 of 6.
'4444 444' -> ' ' (0x20) at character 5 of 8.

REXX

<lang rexx>/*REXX program verifies that all characters in a string are all the same (character). */
@chr= '     [character'                         /* define a literal used for  SAY.     */
@all= 'all the same character for string (length'   /*    "   "    "       "    "   "  */
@.=                                             /*define a default for the  @.  array. */
parse arg x                                     /*obtain optional argument from the CL.*/
if x\=''  then       @.1=    x                  /*if user specified an arg, use that.  */
         else do;   @.1=                        /*use this null string if no arg given.*/
                    @.2= '   '                  /* "    "          "    "  "  "    "   */
                    @.3= 2                      /* "    "          "    "  "  "    "   */
                    @.4= 333                    /* "    "          "    "  "  "    "   */
                    @.5= .55                    /* "    "          "    "  "  "    "   */
                    @.6= 'tttTTT'               /* "    "          "    "  "  "    "   */
                    @.7= 4444 444k              /* "    "          "    "  "  "    "   */
              end                               /* [↑]  seventh value contains a blank.*/
    do j=1;    L= length(@.j)                   /*obtain the length of an array element*/
    if j>1  &  L==0     then leave              /*if arg is null and  J>1, then leave. */
    r= allSame(@.j)                             /*R:  ≡0,  or the location of bad char.*/
    if r\==0  then ?= substr(@.j,r,1)           /*if  not  monolithic, obtain the char.*/
    if r==0   then say '   ' @all L"):"  @.j
              else say 'not' @all L"):"  @.j  @chr ?  "('"c2x(?)"'x)  at position"  r"]."
    end   /*j*/

exit                                            /*stick a fork in it,  we're all done. */
/*──────────────────────────────────────────────────────────────────────────────────────*/
allSame: procedure;  parse arg y                /*get a value from the argument list.  */
         if y==''  then return 0                /*if  Y  is null,  then return 0 (zero)*/
        return verify(y, left(y,1) )            /*All chars the same?   Return 0 (zero)*/
                                                /*                else  return location*/</lang>
output   when using the internal default inputs:
    all the same character for string (length 0):
    all the same character for string (length 3):
    all the same character for string (length 1): 2
    all the same character for string (length 3): 333
not all the same character for string (length 3): .55      [character 5 ('35'x)  at position 2].
not all the same character for string (length 6): tttTTT      [character T ('54'x)  at position 4].
not all the same character for string (length 9): 4444 444K      [character   ('20'x)  at position 5].

zkl

<lang zkl>fcn stringSameness(str){ // Does not handle Unicode
   sz,unique,uz := str.len(), str.unique(), unique.len();
   println("Length %d: \"%s\"".fmt(sz,str));
   if(uz<2) println("\tSame character in all positions"); // empty, or one distinct character
   else
      println("\tDifferent: ",
        unique[1,*].pump(List,
          'wrap(c){ "'%s' (0x%x)[%d]".fmt(c,c.toAsc(), str.find(c)+1) })
        .concat(", "));
}</lang>
<lang zkl>testStrings:=T("", "   ", "2", "333", ".55", "tttTTT", "4444 444k");
foreach s in (testStrings){ stringSameness(s) }</lang>

Output:
Length 0: ""
	Same character in all positions
Length 3: "   "
	Same character in all positions
Length 1: "2"
	Same character in all positions
Length 3: "333"
	Same character in all positions
Length 3: ".55"
	Different: '5' (0x35)[2]
Length 6: "tttTTT"
	Different: 'T' (0x54)[4]
Length 9: "4444 444k"
	Different: ' ' (0x20)[5], 'k' (0x6b)[9]