Wx::EncodingConverter

This class converts strings between two
8-bit encodings/charsets. It can also convert from/to Unicode (but only
if wxWidgets was compiled with USE_WCHAR_T set to 1). EncodingConverter
supports only a limited subset of encodings:
FONTENCODING_ISO8859_1..15, FONTENCODING_CP1250..1257 and
FONTENCODING_KOI8.

Note

Please use MBConv classes instead
if possible. CSConv has much better support for various
encodings than EncodingConverter. EncodingConverter is useful only
if you rely on CONVERT_SUBSTITUTE mode of operation (see
Init).

Derived from

Object

See also

FontMapper,
MBConv,
Writing non-English applications

Methods

EncodingConverter.new

EncodingConverter#init

Boolean init(FontEncoding input_enc, FontEncoding output_enc, Integer method = CONVERT_STRICT)

Initializes the conversion. Either the input or the output encoding may
be FONTENCODING_UNICODE, but only if USE_ENCODING is set to 1.
All subsequent calls to Convert
will interpret their argument
as a string in input_enc encoding and will output a string in
output_enc encoding.
You must call this method before calling Convert. You may call
it more than once in order to switch to another conversion.
The method argument determines how Convert behaves when an input character
cannot be converted because it does not exist in the output encoding:

CONVERT_STRICT: follow the behaviour of GNU Recode, i.e. copy unconvertible characters to the output unchanged (their integer values stay the same).
CONVERT_SUBSTITUTE: try some (lossy) substitutions, e.g. replace unconvertible Latin capitals with an acute accent by ordinary capitals, replace en-dash or em-dash by '-', etc.

Both modes guarantee that the output string will have the same length
as the input string.
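
The difference between the two modes can be illustrated with a small plain-Ruby sketch. It does not use wxRuby and is not the library's actual algorithm: the fallback table, the pretend "ASCII-only" output encoding, and the names `fallback_table`, `representable?` and `convert_sketch` are all assumptions made up for this illustration.

```ruby
# Illustrative sketch only: plain Ruby, not wxRuby, and not the library's
# actual algorithm. The fallback table and the ASCII-only output encoding
# are assumptions made up for this example.

# Hypothetical table of lossy substitutions.
def fallback_table
  {
    "\u00C1" => "A", # capital A with acute
    "\u00C9" => "E", # capital E with acute
    "\u2013" => "-", # en-dash
    "\u2014" => "-"  # em-dash
  }
end

# Pretend the output encoding can only represent ASCII characters.
def representable?(char)
  char.ascii_only?
end

# mode is :strict or :substitute (stand-ins for CONVERT_STRICT and
# CONVERT_SUBSTITUTE).
def convert_sketch(text, mode)
  table = fallback_table
  text.each_char.map do |char|
    if representable?(char)
      char
    elsif mode == :substitute && table.key?(char)
      table[char] # lossy replacement, still exactly one character
    else
      char        # no replacement available: copy through unchanged
    end
  end.join
end

sample      = "\u00C1-Z \u2013 \u00D1"
strict      = convert_sketch(sample, :strict)
substituted = convert_sketch(sample, :substitute)
```

In this sketch `strict` is identical to `sample`, while `substituted` has the accented capital and the en-dash replaced. Both results have the same number of characters as the input, mirroring the length guarantee stated above.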

Return value

false if the given conversion is impossible, true otherwise.
(A conversion may be impossible either if you try to convert
to Unicode with a non-Unicode build of wxWidgets or if the input
or output encoding is not supported.)

EncodingConverter#can_convert

Boolean can_convert(FontEncoding encIn, FontEncoding encOut)

Returns true if (any text in) the multibyte encoding encIn can be converted
losslessly to another one (encOut).

Do not call this method with FONTENCODING_UNICODE as either
parameter; it doesn't make sense (the conversion always works in one direction
and always depends on the text to convert in the other).
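
As an analogy for what "losslessly" means here, the following sketch checks convertibility between two 8-bit encodings using only Ruby's built-in transcoding (`String#encode`). It is a standalone illustration, not wxRuby code, and it may not agree with can_convert in every case; the name `lossless_sketch?` is hypothetical.

```ruby
# Standalone analogy using Ruby core transcoding; not wxRuby and not
# necessarily the same rule that can_convert applies.
# Returns true if every character defined in enc_in also exists in enc_out.
def lossless_sketch?(enc_in, enc_out)
  (0..255).all? do |byte|
    raw = [byte].pack("C").force_encoding(enc_in)
    begin
      unicode = raw.encode("UTF-8")
    rescue EncodingError
      next true # byte is not a defined character in enc_in: nothing to lose
    end
    begin
      unicode.encode(enc_out)
      true
    rescue Encoding::UndefinedConversionError
      false # this character has no counterpart in enc_out
    end
  end
end

ascii_to_latin1  = lossless_sketch?("US-ASCII", "ISO-8859-1")
latin1_to_ascii  = lossless_sketch?("ISO-8859-1", "US-ASCII")
latin2_to_latin1 = lossless_sketch?("ISO-8859-2", "ISO-8859-1")
```

Under these assumptions, every ASCII character exists in ISO-8859-1, so the first check succeeds, while the other two fail because accented or language-specific characters have no counterpart in the target encoding.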

EncodingConverter#convert

Boolean convert(char input, char output)
Boolean convert(wchar_t input, wchar_t output)
Boolean convert(char input, wchar_t output)
Boolean convert(wchar_t input, char output)

Converts the input string according to the settings passed to
Init and writes the result to output.

Boolean convert(char str)
Boolean convert(wchar_t str)

Converts the input string in place according to the settings passed to
Init, i.e. writes the result to the
same memory area.

All of the versions above return true if the conversion was lossless and
false if at least one of the characters couldn't be converted and was replaced
with '?' in the output. Note that if CONVERT_SUBSTITUTE was
passed to Init, substitution is considered
a lossless operation.

String convert(String input)

Converts a String and returns a new String object.

Notes

You must call Init before using this method!

The wchar_t versions of the method are not available if wxWidgets was compiled
with USE_WCHAR_T set to 0.
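
Because Init must succeed before Convert is used, a small guard can make the ordering explicit. The helper below is a hypothetical sketch: `convert_checked` is not part of wxRuby, and it only assumes that the object passed in responds to `init` and `convert` with the signatures documented on this page.

```ruby
# Hypothetical helper, not part of wxRuby. It assumes `converter` responds
# to init(input_enc, output_enc, method) and convert(string) as documented.
def convert_checked(converter, text, input_enc, output_enc, method)
  unless converter.init(input_enc, output_enc, method)
    raise ArgumentError, "conversion between these encodings is not supported"
  end
  converter.convert(text)
end

# Possible usage, assuming Wx::EncodingConverter and these constants are
# available in your wxRuby build (untested assumption):
#
#   conv = Wx::EncodingConverter.new
#   out  = convert_checked(conv, text,
#                          Wx::FONTENCODING_ISO8859_2,
#                          Wx::FONTENCODING_CP1250,
#                          Wx::CONVERT_SUBSTITUTE)
```

Raising on a failed Init is one design choice among several; returning nil or falling back to the unconverted string would work equally well, depending on the caller.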

EncodingConverter#get_platform_equivalents

FontEncodingArray get_platform_equivalents(FontEncoding enc, Integer platform = PLATFORM_CURRENT)

Returns the equivalents of the given encoding that are used
under the given platform. Supported platforms:

PLATFORM_CURRENT means the platform this binary was compiled for.

Examples:

current platform   enc         returned value
------------------------------------------------------
unix               CP1250      {ISO8859_2}
unix               ISO8859_2   {ISO8859_2}
windows            ISO8859_2   {CP1250}
unix               CP1252      {ISO8859_1,ISO8859_15}

Equivalence is defined in terms of convertibility:
two encodings are equivalent if you can convert text between
them without losing information (it may, and will, happen
that you lose special chars like quotation marks or em-dashes,
but you shouldn't lose any diacritics or language-specific
characters when converting between equivalent encodings).

Remember that this function does NOT check for the presence of
fonts in the system. It only tells you which encodings are most
suitable. (It usually returns only one encoding.)

Notes

EncodingConverter#get_all_equivalents

FontEncodingArray get_all_equivalents(FontEncoding enc)

Similar to
get_platform_equivalents,
but this one returns ALL
equivalent encodings, regardless of the platform, and including enc itself.

The current platform's encodings come before the others in the array. If enc is in the array,
it is the very first item in it.

[This page automatically generated from the Textile source at 2023-06-09 00:45:29 +0000]