8.231. utf8

Although Perl's Unicode support is incomplete, the use utf8 pragma tells the Perl parser to allow UTF-8 in the program text in the current lexical scope. no utf8 tells Perl to switch back to treating the program text as literal bytes in the current lexical scope. You'll probably need use utf8 only for compatibility, since future versions of Perl are expected to standardize on the UTF-8 encoding for source text.

use utf8 has two effects. First, bytes with the high bit set in the source text (in identifiers, string constants, constant regular expressions, and package names) are treated as parts of literal UTF-8 characters. Second, regular expressions within the scope of the utf8 pragma default to character semantics instead of byte semantics. For example:

@bytes_or_chars = split //, $data;  # May split to bytes if
                                    # $data isn't UTF-8
{
    use utf8;                       # Forces char semantics
    @chars = split //, $data;       # Splits characters
}
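The effect on string constants can be seen by measuring the same literal inside and outside the pragma's scope. The following is a minimal sketch, not taken from the original text. It assumes the script file itself is saved as UTF-8 and is run under Perl 5.8 or later, where use utf8 affects only how the source text is read. The variable names $chars and $bytes are illustrative.

```perl
use strict;
use warnings;

my ($chars, $bytes);

{
    use utf8;               # literal is decoded as UTF-8 text
    $chars = length("é");   # counted as one character
}

{
    no utf8;                # literal is left as raw bytes
    $bytes = length("é");   # counted as the two bytes of its UTF-8 encoding
}

print "chars=$chars bytes=$bytes\n";
```

Because the pragma is lexically scoped, each block sees the literal differently even though the bytes in the file are identical.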



Copyright © 2002 O'Reilly & Associates. All rights reserved.