String Character Length: Difference between revisions

From Rosetta Code
Revision as of 06:33, 18 August 2007

Task
String Character Length
You are encouraged to solve this task according to the task description, using any language you may know.
This task has been split off from another task. Its programming examples are in need of review to ensure that they fit the requirements of the new task.

In this task, the goal is to find the character length of a string. This means encodings like UTF-8 need to be handled properly, as there is not necessarily a one-to-one relationship between bytes and characters.

For byte length, see String Byte Length.
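
As a rough illustration of why byte length and character length can differ, the following Python sketch compares the two for a string encoded as UTF-8. The sample string is an assumption chosen for illustration and is not part of the task.

```python
# Illustrative sample (an assumption, not from the task description):
# "na\u00efve" contains one 2-byte character; each hiragana takes 3 bytes.
text = "na\u00efve \u304a\u306f\u3088\u3046"

char_count = len(text)                  # number of Unicode code points
byte_count = len(text.encode("utf-8"))  # number of bytes once encoded as UTF-8

print(char_count, byte_count)
```

Only for pure ASCII text would the two numbers be equal.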

ActionScript

myStrVar.length

Ada

Compiler: GCC 4.1.2

Str    : String := "Hello World";
Length : constant Natural := Str'Length;

AppleScript

count of "Hello World"

AWK

From within any code block:

w=length("Hello, world!")      # static string example
x=length("Hello," s " world!") # dynamic string example
y=length($1)                   # input field example
z=length(s)                    # variable name example

Ad hoc program from command line:

echo "Hello, world!" | awk '{print length($0)}'

From executable script: (prints for every line arriving on stdin)

#!/usr/bin/awk -f
{print"The length of this line is "length($0)}

C

Standard: ANSI C (AKA C89):

Compiler: GCC 3.3.3

 #include <string.h>

 int main(void) 
 {
   const char *string = "Hello, world!";
   size_t length = strlen(string);
          
   return 0;
 }

or by hand:

 int main(void) 
 {
   const char *string = "Hello, world!";
   size_t length = 0;
   
   const char *p = string; /* no cast needed; keeps the const qualifier */
   while (*p++ != '\0') length++;                                         
   
   return 0;
 }

or (for arrays of char only)

 #include <stdlib.h>
 
 int main(void)
 {
   char const s[] = "Hello, world!";
   size_t length = sizeof s - 1;
   
   return 0;
 }

For wide character strings (usually Unicode):

 #include <stdio.h>
 #include <wchar.h>
 
 int main(void) 
 {
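    const wchar_t *s = L"\x304A\x306F\x3088\x3046"; /* Japanese hiragana ohayou */
    size_t length;
 
    length = wcslen(s);
    /* size_t is not int, so cast for printf; sizeof(s) would only give the pointer size */
    printf("Length in characters = %lu\n", (unsigned long) length);
    printf("Length in bytes      = %lu\n", (unsigned long) (length * sizeof(wchar_t)));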
    
    return 0;
 }

C++

Standard: ISO C++ (AKA C++98):

Compiler: g++ 4.0.2

 #include <string> // note: not <string.h>
 
 int main()
 {
   std::string s = "Hello, world!";
   // Always in characters == bytes since sizeof(char) == 1
   std::string::size_type length = s.length(); // option 1: In Characters/Bytes
   std::string::size_type size = s.size();     // option 2: In Characters/Bytes
 }

For wide character strings:

 #include <string>
 
 int main()
 {
   std::wstring s = L"\u304A\u306F\u3088\u3046";
   std::wstring::size_type length = s.length();
 }

C#

Platform: .NET

Language Version: 1.0+

string s = "Hello, world!";
int clength = s.Length;  // In characters
int blength = System.Text.Encoding.UTF8.GetBytes(s).Length; // In bytes (UTF-8 used here as an example encoding)

Clean

Clean Strings are unboxed arrays of characters. Characters are always a single byte. The function size returns the number of elements in an array.

import StdEnv

strlen :: String -> Int
strlen string = size string 

Start = strlen "Hello, world!"

ColdFusion

  #len("Hello World")#

Common Lisp

  (length "Hello World")

Component Pascal

  LEN("Hello, World!")

E

"Hello World".size()

Forth

The 1994 ANS standard does not have any notion of a particular character encoding, although it distinguishes between character and machine-word addresses. (There is some ongoing work on standardizing an "XCHAR" wordset for dealing with strings in particular encodings such as UTF-8.)

Interpreter: ANS Forth

The following code counts the number of UTF-8 characters in a null-terminated string. It relies on the fact that all bytes of a UTF-8 character except the first have the binary bit pattern "10xxxxxx".

binary
: utf8+ ( str -- str )
  begin
    char+
    dup c@
    11000000 and
    10000000 <>
  until ;
decimal

: count-utf8 ( zstr -- n )
  0
  begin
    swap dup c@
  while
    utf8+
    swap 1+
  repeat drop ;
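
The same continuation-byte idea can be sketched in Python for comparison. This is only a minimal sketch: the function name `count_utf8_chars` is hypothetical, it takes a `bytes` object rather than a null-terminated string, and it assumes the input is valid UTF-8.

```python
def count_utf8_chars(data: bytes) -> int:
    """Count UTF-8 characters by counting the bytes that start a character.

    Continuation bytes match the bit pattern 10xxxxxx, so every byte that
    does not match it is the first byte of a character.
    Assumes `data` is well-formed UTF-8; no validation is performed.
    """
    return sum(1 for b in data if (b & 0b11000000) != 0b10000000)


# Four hiragana characters, three bytes each in UTF-8.
ohayou = "\u304a\u306f\u3088\u3046".encode("utf-8")
print(len(ohayou), count_utf8_chars(ohayou))
```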

Haskell

Interpreter: GHCi 6.6, Hugs

Compiler: GHC 6.6

strlen = length "Hello, world!"

IDL

Compiler: any IDL compiler should do

 length = strlen("Hello, world!")

Java

Java encodes strings in UTF-16, which represents each character with one or two 16-bit values. The most commonly used characters are represented by one 16-bit value, while rarer ones like some mathematical symbols are represented by two.

The length method of String objects gives the number of 16-bit values used to encode a string.

String s = "Hello, world!";
int length = s.length();

Since Java 1.5, the actual number of characters can be determined by calling the codePointCount method.

String str = "\uD834\uDD2A"; //U+1D12A
int length1 = str.length(); //2
int length2 = str.codePointCount(0, str.length()); //1
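
For comparison, the relationship between 16-bit values and characters can be reproduced in Python, whose `len` counts code points directly. This is a sketch for illustration only and uses the same character as the Java example; the variable names are arbitrary.

```python
s = "\U0001D12A"  # U+1D12A, the same character as in the Java example

code_points = len(s)                           # Python strings count code points
utf16_units = len(s.encode("utf-16-le")) // 2  # each UTF-16 code unit is 2 bytes

print(code_points, utf16_units)
```

The second number corresponds to what Java's length method reports, the first to what codePointCount reports.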

JavaScript

JavaScript encodes strings in UTF-16, which represents each character with one or two 16-bit values. The most commonly used characters are represented by one 16-bit value, while rarer ones like some mathematical symbols are represented by two.

JavaScript has no built-in way to determine how many characters are in a string. However, if the string only contains commonly used characters, the number of characters will be equal to the number of 16-bit values used to represent the characters.

var str1 = "Hello, world!";
var len1 = str1.length; //13

var str2 = "\uD834\uDD2A"; //U+1D12A represented by a UTF-16 surrogate pair
var len2 = str2.length; //2
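
One way to count characters from the 16-bit values by hand is to skip the second half of each surrogate pair. The following Python sketch shows the idea on a list of integers standing in for UTF-16 code units; the function name `count_code_points` is hypothetical, and well-formed UTF-16 input is assumed.

```python
def count_code_points(units):
    """Count code points in a sequence of UTF-16 code units (integers).

    Each code point contributes exactly one unit that is not a low
    (trailing) surrogate in the range 0xDC00..0xDFFF, so counting those
    units gives the number of characters. Assumes well-formed UTF-16.
    """
    return sum(1 for u in units if not 0xDC00 <= u <= 0xDFFF)


pair = [0xD834, 0xDD2A]  # U+1D12A as a surrogate pair
print(len(pair), count_code_points(pair))
```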

JudoScript

 //Store length of hello world in length and print it
 . length = "Hello World".length();

Lua

Interpreter: Lua 5.0 or later.

 string="Hello world"
 length=#string

mIRC Scripting Language

Interpreter: mIRC

alias stringlength { echo -a Your Name is: $len($$?="Whats your name") letters long! }

OCaml

Interpreter/Compiler: OCaml 3.09

String.length "Hello world";;


Perl

Interpreter: Perl any 5.X

 my $length = length "Hello, world!";

PHP

 $length = strlen('Hello, world!');

PL/SQL

DECLARE
  string VARCHAR2( 50 ) := 'Hello, world!';
  stringlength NUMBER;
BEGIN
  stringlength := length( string );
END;

Python

Interpreter: Python 2.4

length = len("The length of this string will be determined")

Ruby

Library: active_support

 require 'active_support'
 puts "Hello World".chars.length

Scheme

 (string-length "Hello world")

Seed7

 length("Hello, world!")

Smalltalk

 string := 'Hello, world!'.
 string size.

Standard ML

Interpreter: SML/NJ 110.60, Moscow ML 2.01 (January 2004)

Compiler: MLton 20061107

val strlen = size "Hello, world!";

Tcl

Basic version:

 string length "Hello, world!"

Or, more elaborately (requires any Tcl 8.x interpreter; tested on 8.4.12):

 fconfigure stdout -encoding utf-8; #So that Unicode string will print correctly
 set s1 "hello, world"
 set s2 "\u304A\u306F\u3088\u3046"
 puts [format "length of \"%s\" in characters is %d"  $s1 [string length $s1]]
 puts [format "length of \"%s\" in characters is %d"  $s2 [string length $s2]]

UNIX Shell

With external utilities:

Interpreter: any Bourne shell

 string='Hello, world!'
 length=`printf '%s' "$string" | wc -m | tr -dc '0-9'` # wc -m counts characters in the current locale; wc -c would count bytes
 echo $length # if you want it printed to the terminal

With SUSv3 parameter expansion modifier:

Interpreter: Almquist SHell (NetBSD 3.0), Bourne Again SHell 3.2, Korn SHell (5.2.14 99/07/13.2), Z SHell

 string='Hello, world!'
 length="${#string}"
 echo $length # if you want it printed to the terminal


VBScript

Len(string|varname) 

Returns the length of the string|varname. Returns Null if string|varname is Null.

xTalk

Interpreter: HyperCard

 put the length of "Hello World"

or

 put the number of characters in "Hello World"