jump to navigation

NSData Naughtiness 13 July 2009

Posted by Oliver Mason in Apple, objective-c, programming.
trackback

Well, this is not exactly NSData’s fault, but I ran into a problem (for the second time; the first time I bypassed it with a short-cut) when reading text data from a file.

Occasionally there was random garbage at the end of a line, which I could not understand. Incidentally, I was reading a number of full files in one go each into an NSData instance, and converted that into an NSString with the correct encoding; this I would then tokenise and add to another file. So the garbage was actually at the end of each file. I then found that I can directly initialise an NSString with the contents of a file, and the problem disappeared.

Now I want to produce concordance lines, and I jump into the middle of the file to read a stretch. First I run into trouble with the encoding: as the data is UTF8-encoded, a random jump can end up in the middle of a multi-byte character. NSString does not like that… but here I can just test for that and skip the initial bytes. The same problem obviously also happens at the end, where the final multi-byte character could be incomplete. Again, truncation seems the easy way out.

But I also then had the issue with the occasional random garbage again! NSData seems to be at fault, and this time I can’t bypass it, as NSString can only read a full file. Quick websearch, and the solution crops up (in an aside) on stackoverflow.com: the data that NSData returns from the -bytes method is not zero-terminated, but NSString’s -stringWithUTF8String expects that, hence the random garbage of the unterminated data. In a way I’m surprised that it actually worked most of the time!

Advertisements

Comments»

No comments yet — be the first.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: