Manual Pages for UNIX Darwin command on man multibyte
MyWebUniversity

Manual Pages for UNIX Darwin command on man multibyte

MULTIBYTE(3) BSD Library Functions Manual MULTIBYTE(3)

NAME

mmuullttiibbyyttee - multibyte and wide character manipulation functions

LLIIBBRRAARRYY

Standard C Library (libc, -lc)

SYNOPSIS

##iinncclluuddee <>

##iinncclluuddee <>

##iinncclluuddee <>

DESCRIPTION

The basic elements of some written natural languages, such as Chinese,

cannot be represented uniquely with single C chars. The C standard sup-

ports two different ways of dealing with extended natural language encod-

ings: wide characters and multibyte characters. Wide characters are an

internal representation which allows each basic element to map to a sin-

gle object of type wchart. Multibyte characters are used for input and output and code each basic element as a sequence of C chars. Individual basic elements may map into one or more (up to MBLENMAX) bytes in a

multibyte character.

The current locale (setlocale(3)) governs the interpretation of wide and

multibyte characters. The locale category LCCTYPE specifically controls

this interpretation. The wchart type is wide enough to hold the largest value in the wide character representations for all locales. Multibyte strings may contain `shift' indicators to switch to and from particular modes within the given representation. If explicit bytes are used to signal shifting, these are not recognized as separate characters

but are lumped with a neighboring character. There is always a distin-

guished `initial' shift state. Some functions (e.g., mblen(3), mbtowc(3) and wctomb(3)) maintain static shift state internally, whereas others store it in an mbstatet object passed by the caller. Shift states are

undefined after a call to setlocale(3) with the LCCTYPE or LCALL cate-

gories. For convenience in processing, the wide character with value 0 (the null wide character) is recognized as the wide character string terminator, and the character with value 0 (the null byte) is recognized as the

multibyte character string terminator. Null bytes are not permitted

within multibyte characters.

The C library provides the following functions for dealing with multibyte

characters: FFuunnccttiioonn DDeessccrriippttiioonn mblen(3) get number of bytes in a character mbrlen(3) get number of bytes in a character (restartable)

mbrtowc(3) convert a character to a wide-character code (restartable)

mbsrtowcs(3) convert a character string to a wide-character string

(restartable)

mbstowcs(3) convert a character string to a wide-character string

mbtowc(3) convert a character to a wide-character code

wcrtomb(3) convert a wide-character code to a character (restartable)

wcstombs(3) convert a wide-character string to a character string

wcsrtombs(3) convert a wide-character string to a character string

(restartable)

wctomb(3) convert a wide-character code to a character

SEE ALSO

mklocale(1), setlocale(3), stdio(3), big5(5), euc(5), gb18030(5), gb2312(5), gbk(5), mskanji(5), utf8(5) STANDARDS These functions conform to ISO/IEC 9899:1999 (``ISO C99''). BSD April 8, 2004 BSD




Contact us      |      About us      |      Term of use      |       Copyright © 2000-2019 MyWebUniversity.com ™