Strings in Cocoa: Part I
Comparing strings
I imagine it won't be long in your programming before you want to compare some strings for equality. In C you learned that you can do so by using the function strcmp(string1, string2):
Editor's Note -- The example below was updated and corrected on 7/05/01. Thanks to our readers for helping us present as accurate information as possible.
char string1[] = "Yo";
char string2[] = "Yo";
if ( strcmp(string1, string2) == 0) {
// do the following code
}
And the conditional would evaluate to "true", executing the code within the braces of the if statement. In Cocoa, the situation is similar. Remember, whenever we declare a string
NSString *aString;
the variable aString does not actually contain the string object -- it is a pointer to some string object in memory. Another name we had for this type of variable was an object identifier, because it identifies an object in memory rather than holding the object itself. This technical detail has some very important implications, because a pointer is the address of a location in memory. Consider the following situation where we try to compare two string objects using ==:
NSString *string1 = @"A String";
NSString *string2 = @"A String";
Here we have created two NSString constants that have been endowed with the same value. Now, if we used the C equality operator on them,
BOOL result = string1 == string2;
this line of code would not compare the strings -- it would compare the values of their memory addresses. Two separate string objects exist in unique memory locations, so for them the equality statement evaluates to "no" (Objective-C's "false") even when they look equal. Identical @"..." constants turn out to be a special case: the compiler makes such constants unique on a per-module basis, so string1 and string2 above may well point to the same object, and the comparison may evaluate to "yes". That is an implementation detail rather than a comparison of contents, so the result of == tells us nothing reliable about whether two strings hold the same characters.
Now, if we had done the following,
NSString *string1 = @"A String.";
NSString *string2 = string1;
BOOL result = string1 == string2;
the equality operator would indeed return "yes", because both string1 and string2 point to the same object in memory -- they have the same address. The line NSString *string2 = string1; accomplished this by taking the address of the object that string1 points to and assigning it to string2 as well. So the addresses are now equal, as the equality test revealed.
Now, let me make the point clear that if the data type of a variable is int, double, char, or float, the equality operator will work as expected, because these variable data types are not pointers, they actually contain the value of the data. Only object variables (and all pointer variables) fall prey to this phenomenon.
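To see value comparison and address comparison side by side, here is a minimal sketch. The function names are made up for illustration and are not part of Cocoa. The second string in the last function is built as an NSMutableString only because that guarantees a separate object in memory.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES only when both variables hold the same address.
static BOOL pointToSameObject(NSString *a, NSString *b) {
    return a == b;
}

// ints hold their values directly, so == compares the values themselves.
static BOOL primitivesCompareByValue(void) {
    int x = 42;
    int y = 42;
    return x == y;
}

// Assigning one object variable to another copies the address,
// not the characters, so both variables identify the same object.
static BOOL aliasesCompareEqual(void) {
    NSString *string1 = @"A String.";
    NSString *string2 = string1;
    return pointToSameObject(string1, string2);
}

// A mutable string is necessarily a different object from the constant,
// so == reports "no" even though the characters are identical.
static BOOL separateObjectsCompareEqual(void) {
    NSString *string1 = @"A String.";
    NSString *string2 = [NSMutableString stringWithString:string1];
    return pointToSameObject(string1, string2);
}
```

Of the three functions, only separateObjectsCompareEqual() gives "no", and it does so even though both strings read "A String.".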
So, I hope you're convinced now that the ways of old just don't work with object-oriented programming. I want to now show you how we do comparisons and make equality judgments.
Whenever you want to check the equality of objects, you must invoke special comparison methods in the respective classes. In NSString, the most straightforward of these is the method - isEqualToString:, whose argument is an NSString object, and returns a boolean value indicating whether the receiver string is equivalent to the argument string. Now, we can truly test the equivalency of our strings:
NSString *string1 = @"A String.";
NSString *string2 = @"A String.";
BOOL result = [string1 isEqualToString:string2];
The statement really will evaluate to "yes", because the values of the objects that string1 and string2 point to are equal. A more general method, -compare:, allows you to determine whether a string is equal to the receiver of this method, or whether the string would come before or after the receiver string (as in the ordering of a dictionary, lexical ordering). The return type of compare: is a custom Cocoa data type called NSComparisonResult, which has three possible values: NSOrderedAscending, NSOrderedSame, and NSOrderedDescending (these are just constants defined in the Foundation Framework, equal to the integers -1, 0, and 1 respectively). So, we could use compare: in the following fashion:
NSString *string1 = @"aardvark";
NSString *string2 = @"tarsier";
BOOL result = [string1 compare:string2] == NSOrderedAscending;
Because string2 comes after string1 in the alphabet, the message to string1 will return the value NSOrderedAscending, which we compare with NSOrderedAscending using the equality operator, and get "yes" as the value of result. This is equivalent to saying string2 is greater than string1.
By substituting NSOrderedSame or NSOrderedDescending in place of NSOrderedAscending, we can check to see whether the receiver (string1) is the same as the argument (string2), or whether string2 appears sooner in the alphabet than string1.
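Rather than testing against one constant at a time, we can branch on all three results at once. The following is only a sketch: describeOrder is a hypothetical helper of my own naming, not a Foundation function.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: reports how the receiver sorts relative to the
// argument, based on the NSComparisonResult returned by -compare:.
static NSString *describeOrder(NSString *receiver, NSString *argument) {
    switch ([receiver compare:argument]) {
        case NSOrderedAscending:
            return @"before";   // receiver sorts before the argument
        case NSOrderedSame:
            return @"same";     // the two strings compare as equal
        case NSOrderedDescending:
            return @"after";    // receiver sorts after the argument
    }
    return @"unknown";          // not reached for a valid NSComparisonResult
}
```

With the strings from the example, describeOrder(@"aardvark", @"tarsier") gives @"before", and swapping the arguments gives @"after".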
In this system, uppercase letters are "less" than lowercase letters. So the following would evaluate to "yes":
NSString *string1 = @"Aardvark";
NSString *string2 = @"aardvark";
BOOL result = [string1 compare:string2] == NSOrderedAscending;
This is because "Aardvark" occurs before "aardvark" lexically (think in terms of the order of words in a dictionary). If you want to compare strings without regard to case, then -caseInsensitiveCompare: is the method for you. In use, this method would look like the following (using the same strings as the previous example):
BOOL result = [string1 caseInsensitiveCompare:string2] == NSOrderedSame;
And result would be given the value "yes" because the statement evaluates as true.
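If you find yourself making this test often, it can be wrapped in a small function. This is a sketch under my own naming: equalIgnoringCase is hypothetical and does not exist in Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when the two strings match once case is ignored.
static BOOL equalIgnoringCase(NSString *a, NSString *b) {
    return [a caseInsensitiveCompare:b] == NSOrderedSame;
}
```

Here equalIgnoringCase(@"Aardvark", @"aardvark") gives "yes", while [@"Aardvark" isEqualToString:@"aardvark"] gives "no" because that method is case sensitive.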
These are some of the basic methods available for string comparison. If your needs are more demanding than what we covered here, take a closer look at the class documentation, which details many more string comparison methods that give you more flexibility and options.
Finding strings within strings
NSString provides some methods that allow us to search strings for substrings. All of the string search methods return a special data type defined in the Foundation Framework known as NSRange. NSRange is just a C struct with two components: a starting index and a length.
The way ranges work is like this: If we had a string with 100 characters (elements), then the range {50, 50} specifies a substring whose first element is the character at index 50 of the parent string (the 51st character) and which contains 50 elements in total -- that is, the last half of the parent string (remember, strings are arrays in their most fundamental form, and counting always starts from 0).
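The arithmetic is easy to get wrong by one, so here is a small sketch of it. With zero-based counting, a 100-character string has indices 0 through 99, and its last half covers indices 50 through 99. The helper lastHalfRange is hypothetical. NSMakeRange and NSMaxRange are Foundation functions, and NSUInteger assumes a reasonably recent Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the range covering the second half of a string
// with the given number of characters (assumes an even length).
static NSRange lastHalfRange(NSUInteger length) {
    NSUInteger half = length / 2;
    return NSMakeRange(half, half);   // {location, length}
}
```

For a length of 100 this gives a range with location 50 and length 50. NSMaxRange of that range is 100, which is the index just past the last character.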
In the next few examples, we will be using the following string:
NSString *theString = @"Okay, enough about ranges.";
Suppose we want to find where in the parent string the substring "about" can be found. The method we invoke is -rangeOfString:, and here is a snippet of code we could use to illustrate the way it works:
NSString *theString = @"Okay, enough about ranges";
NSString *substring = @"about";
NSRange range = [theString rangeOfString:substring];
int location = range.location;
int length = range.length;
NSString *displayString = [[NSString alloc] initWithFormat:@"Location: %i, length: %i",
location, length];
[textField setStringValue:displayString];
Note that NSRange is not a class; it is a C structure, so we do not type the NSRange variable range with the pointer-star (*) as we do for classes; it is simply NSRange. In the previous example, the range returned by the search method tells us where in the parent string, theString, we can find the substring; thus location is 13, and length is just the length of the substring, 5. If the substring cannot be found in the parent string, then a range with location NSNotFound and length zero is returned, indicating failure.
Additionally, note how we access the elements of a C structure. NSRange is defined as the following structure:
typedef struct _NSRange {
unsigned int location;
unsigned int length;
} NSRange;
Recall from C that components of a struct variable are accessed using the variableName.component construct. So, in our example above, we access the location and length components of range in the same way: range.location, and range.length.
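Since a failed search is reported through the range itself, it is worth checking the result before using it. A minimal sketch follows. The function name containsSubstring is my own and not part of NSString. The check relies on Foundation's NSNotFound constant, which -rangeOfString: puts in the location field when the search fails.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when substring occurs somewhere in parent.
static BOOL containsSubstring(NSString *parent, NSString *substring) {
    NSRange range = [parent rangeOfString:substring];
    // A failed search comes back as {NSNotFound, 0}.
    return range.location != NSNotFound;
}
```

With theString from above, containsSubstring(theString, @"about") gives "yes", while searching for @"strings" gives "no".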
Extracting substrings from strings
Three methods that allow us to extract substrings from a parent string are:
-substringToIndex:
-substringWithRange:
-substringFromIndex:
(which respectively take a substring from the beginning, middle, and end of a parent string.)
The first method, -substringToIndex:, returns a new string which is composed of the characters from the beginning of the receiver string up to, but not including, the character at the specified index. This might be used in the following way:
NSString *aString = @"Running out of ideas for strings.";
NSString *substring = [aString substringToIndex:7];
The result of this operation would be that substring now points to the string object @"Running". The method -substringFromIndex: works in the same way, except now the substring starts at the specified index of the receiver (including the character at the index), and includes all the characters to the end of the receiver. So if we wanted to get the substring "strings." (trailing period included, since it is the last character of the receiver) out of aString, we would do the following:
NSString *substring = [aString substringFromIndex:25];
Finally, we have the method which lets us arbitrarily extract a substring from anywhere within the parent string: -substringWithRange:. The argument to this method is -- as conveniently indicated by the method name (I love that about Objective-C) -- an NSRange. So, we could get the string "ideas" out of the parent string, aString, this way:
NSString *substring = [aString substringWithRange:NSMakeRange(15, 5)];
Here the range starts with the character at index 15, "i", and extends to include the next four characters, giving us a length of 5, "ideas".
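Searching and extracting combine naturally, so that we don't have to count characters by hand. The following is a sketch only: textAfter is a hypothetical helper of my own, and it assumes that returning nil is an acceptable way to report a marker that isn't found.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text that follows the first occurrence
// of marker in parent, or nil when marker is not found.
static NSString *textAfter(NSString *parent, NSString *marker) {
    NSRange found = [parent rangeOfString:marker];
    if (found.location == NSNotFound) {
        return nil;
    }
    // NSMaxRange gives the index just past the end of the match.
    return [parent substringFromIndex:NSMaxRange(found)];
}
```

With aString from above, textAfter(aString, @"for ") gives @"strings.", the same result as asking for the substring from index 25.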
Farewell
We've seen in this column just the fundamentals of working with string objects in Cocoa. Hopefully there is enough here to keep you busy, in addition to equipping you with the confidence to go and explore the more advanced methods of NSString. In the next column I will continue our discussion of strings by talking about how we work with paths, and I will also cover mutable strings and the NSMutableString class. Happy programming to you all! See you next time!
Michael Beam is a software engineer in the energy industry specializing in seismic application development on Linux with C++ and Qt. He lives in Houston, Texas with his wife and son.
-
OK on comparison, now copying strings would be the next step...
2003-09-19 10:47:09 anonymous2
Very useful.
It would be nice to see COPYING strings without memory leaks....
-
sample code?
2002-07-03 18:24:34 driftkop
Hi,
The link to the sample code in this article is broken - where can I find it?
thanks,
- Koen.
-
Comparing @"..." type NSStrings
2001-07-06 15:30:29 halliday
Incidentally, due to the uniqueness properties imposed by NSString upon NSString constants (@"..." type NSStrings), the example code:
NSString *string1 = @"A String";
NSString *string2 = @"A String";
BOOL result = string1 == string2;
evaluates to "yes" rather than "no". The compiler recognizes the second NSString to be the same as one it has already seen and simply reuses it. Since these NSStrings are immutable, there is no danger in doing so.
In fact, there are aspects of Cocoa that strongly rely upon this behavior.
-
Comparing @"..." type NSStrings
2001-07-06 17:02:13 Michael Beam
Is that right? Wow, i didn't realize that. Thanks for pointing that out. I guess it makes perfect sense. By the way, what are some of the features of Cocoa that rely on this fact as you mentioned?
Mike
-
Comparing @"..." type NSStrings
2001-07-07 01:18:41 halliday
Additionally, unless I'm misremembering, SELectors are implemented as unique @"..." type NSString constants as well.
-
Comparing @"..." type NSStrings
2001-07-07 00:47:41 halliday
As the NSString documentation indicates (in its Objective-C version): "The compiler makes such object constants unique on a per-module basis".
Some of the features that rely upon this behavior are Pasteboard Types (they are really @"..." type NSStrings), various Attribute Keys, Exceptions, and Notifications (to name a few). Of course, this is an implementation detail, and is always subject to change.
-
Why not correct the wrong example
2001-07-05 00:42:35 darwinfo
Hi out there,
as it has been discussed already, the string-compare-example is wrong.
Just for curiosity, why hasn't it been corrected in the meantime?
The correct solution is quite simple:
if (strcmp(string1, string2) == 0)
{
....
}
Greetings from Hof/Germany
Peter
-
Why not correct the wrong example
2001-07-06 09:59:28 Derrick Story
The example has been corrected with a dated Editor's Note above it so we all can keep our documentation straight.
Thanks to all who helped!
Derrick
-
Why not correct the wrong example
2001-07-06 17:06:32 briandds
Um, it's still wrong. strcmp() returns zero if the strings are identical, so the code there should be:
if ( !strcmp(string1, string2) ) {
// do the following code
}
There wasn't much substance to this particular article anyway, so I don't know that it matters.
-
Pretty Good Basic Coverage, couple points
2001-07-04 18:29:51 bigboytoddy
As a 10+ year vet of ObjC programming, 15+ of OO, I can say he did a nice job, a bit wordy but inviting to the newbie. Very nice to see. Mike does make a few errors, obviously accolades given to ObjC the language for method/selector names, which has nothing to do with 'Range' in a name, it has to do with authors/creators of the Frameworks/Cluster/Classes. Which brings up a side issue, and likely more important. ObjC is not pure OO, and it still shows. It never claims to be, just working hard to shed its C ancestry. ST which ObjC is based upon, syntax mostly, and some garbage collection ideas, on the other hand doesn't burden the user with the issues of types, macros to make things easier, and also having types defined in a method/selector name. Just a point, and Mike may want to reconsider why he really likes types, if he is an OO expert writing about OO in the first place.
Best wishes to all.
-
Pretty Good Basic Coverage, couple points
2001-07-05 10:59:43 Michael Beam
I'm a little unclear about what you're trying to say in the last half of your comment. I think this is the context of your comment from the article:
"The argument to this method is -- as conveniently indicated by the method name (I love that about Objective-C) -- an NSRange"
And by saying "i love this about Objective-C" i wasn't trying to bring up any weighty issues of differences between OO languages. I was just stating that i like how readable and unambiguous (mostly, anyway) method names and code are. I realize this is a result of the guys at NeXT and Apple, and is a feature of Cocoa, rather than ObjC; I guess i was too loose with my language.
By the way, i never claimed to be an expert at OOP. I said this in my first column, and my intent here has never been to project myself as such. I feel i have a good grasp on it, but i don't have the experience to debate the differences between SmallTalk and Objective-C. I'm just trying to relay to people starting out with Cocoa and Mac OS X development how to use Cocoa and how things fit together as i see it.
It's obvious that you have WAY more experience than me, so i'd love to keep the discussion going to learn more.
-
Thanks!
2001-07-04 03:41:24 kool
That was useful to me! Thanks!
-
%s, %s chars in NSString
2003-04-30 12:16:05 anonymous2
Does anybody know how to deal with special characters in NSString?
For example I want to save in some .string file string template which I will use later for the formatting.
Something like
ID_FORMAT = "%sSomeString %d %d";
Right now when I read that string instead %s and %d I get something else.
In order to use it as format string I need to get back %s and all others formatting strings.
It's probably problem with string coding but which code to use for this?
Any idea?
Thanks
Kolle
LA, CA
-
operator overloading
2001-07-03 12:38:59 joshdavenport
Operator overloading is polymorphic operator?
Is this accomplished (from the language users perspective) by allowing the operator, eg "=", as a method name?
Thanks to who knows.
Josh
-
operator overloading
2001-07-03 15:30:42 puppybane
It depends on the language. In c++, operator overloading is done by declaring a function classname::operator==(const classname * isItEqualToThis)
(I may be wrong about the exact syntax--it's been several years since I used c++)
Operator overloading is a neat feature to have, but it can cause problems. It can make the programmer's life easier (can use ==, etc instead of writing out a function call), but it also encourages bad style. For instance:
MyStringClass *string1 = "blah";
MyStringClass *string2;
string1 = string2;
This code sequence could do one of a number of things, depending on how the implementor wrote the class. It could put the string, "blah" into string2. Or, it could set string2 to point to string1! Or even worse, the implementor could have made "=" the operator for equality, and string1 = string2 could just be a comparison of the two strings!
While a useful ability, operator overloading isn't significantly easier than writing:
[myString isEqualTo:myOtherString]
or
[myString initWithString:myOtherString]
And this method makes the code easy to understand, and unambiguous.
-
operator overloading
2001-07-03 17:05:01 canyonrat
The assignment declaration is:
classname& operator=(const classname& rhs);
For an equality test it's:
bool operator==(const classname& rhs);
This is why the problem that you hypothesize never really happens. Even a language as weakly typed as ObjC is going to catch the difference between reference to object and bool.
The only language that I have ever used that didn't catch the difference between assignment and comparison was Java. Everything else at least issues a warning.
-
Comparing strings
2001-06-30 13:15:07 nriley
char string1[] = "Yo";
char string2[] = "Yo";
if ( string1 == string2 ) {
// do the following code
}
And the conditional would evaluate to "true", executing the code within the braces of the if statement.
Er, yes, but only if the compiler decides to share the area of memory used by the string constants. The equivalent of compare: for C strings is strcmp.
I'm enjoying your article series - keep it up!
--Nicholas
-
Comparing strings
2001-07-04 18:34:01 bigboytoddy
OK, when would a compiler (GCC or others) not decide to share constant memory locations? Is there a directive to GCC to make every single constant a unique instance. I guess there would be, share with me where this would be useful, in DO or other networking usages? I'm very interested. Thanks.
-
Comparing strings
2001-07-02 07:41:35 canyonrat
Actually, comparing string1 and string2 with == didn't work when I compiled Mike's code fragment as a standard tool. This is surprising because both my memory and Help Viewer agree that gcc pools strings by default.
Wouldn't operator overloading be nice in ObjC? Comparing NSString with == should work.
-
Comparing strings
2001-07-03 14:38:00 wcray
> Actually, comparing string1 and string2
> with == didn't work
> when I compiled Mike's code fragment as
> a standard tool.
> This is surprising because...
Actually not unexpected. string1 and string2
were declared as unbounded char arrays, which
gcc assumes aren't constant strings (while
type char *s with initializers are assumed
to be constant).
I believe the logic is that
char foo[] = "yabba dabba";
declares foo to be a character array, and
then the initializer puts data into it, while
char *foo2 = "yabba dabba";
declares foo2 to be a pointer, and the
initializer points it to a string containing
"yabba dabba" that already exists in
the memory space.
-
Comparing strings
2001-07-03 16:44:43 canyonrat
>Actually not unexpected. string1 and string2
>were declared as unbounded char arrays, which
>gcc assumes aren't constant strings (while
>type char *s with initializers are assumed
>to be constant).
That's really good to know. Thanks!
-
Comparing strings
2001-07-04 18:36:41 bigboytoddy
Okay, so when are each used, and why? I appreciate the ANSI C lessons here, as it seems
it is an ANSI issue, which brings me to ask why (again)? Where would this differentiation make a slack ass like me want to know the difference...?
Thanks
-
Comparing strings
2001-07-04 23:33:56 canyonrat
What this is telling me is that you almost always want char* s rather than char[] s. The first form gives gcc permission to be smart about checking whether you have already used the string and, if you have, just reusing it rather than saving a new copy and bloating your code.
For example consider:
char* aString = "hello";
and later
char* bString = "hello";
the compiler has permission to notice that aString == bString and just reuse aString.
The second form says that you might want to change bString later and you don't want that to affect aString so they must be stored separately. But you won't need this second form in ObjC because you can use NSMutableString instead.
Of course the best rule for C style strings is don't use them at all. They are just too complicated and error prone.
-
Comparing strings
2001-07-03 15:53:58 Michael Beam
That's pretty much what i came across when i was playing around with this the other day. I didn't understand the logic, but now i do. This is great stuff!--Learning these language level types of details that is.
-
Comparing strings
2001-06-30 22:07:07 Michael Beam
I see...thanks for the heads up, and i'm glad you're enjoying the articles!






The way ranges work is like this: If we had a string with 100 characters (elements), then the range {49, 50} specifies a substring whose first element is the 49th character of the parent string, and includes the following 50 elements -- that is, the last half of the parent string (remember, strings are arrays in their most fundamental form, and counting always starts from 0).
Is it really so? We have 100 elements numbered from 0 to 99. Therefore the first half of the parent string must be from 0 to 49 or {0,50} and the last half - from 50 to 99 or {50,50} (but not {49,50}!!!).
Isn't it?