DB2 Database Encoding

This section describes database encoding issues with DB2 and DB2 for z/OS, the related sizing implications, and the actions you should consider taking.

What is the Issue?

When you use a multi-byte character set (MBCS) or encoding, DB2 processes columns with respect to their size in bytes, not their length in characters. This means that a CHAR, VARCHAR, or CLOB column that holds multi-byte characters may store fewer characters than its length specification indicates; how many fewer depends on the byte length of each character stored.

Consider the following illustration:

With single-byte data, the string fits and processing succeeds. With multi-byte data, the string does not fit, which results in overflow errors at run time. Normally, an IBM Cúram Social Program Management web client captures and reports field size errors in a user-friendly manner. In a case such as the one illustrated, however, the client checks the number of characters rather than the byte length, so it does not catch the size mismatch. The user instead receives an "un-handled server exception" error, whose underlying cause is SQLCODE -302.
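The mismatch between character count and byte count can be reproduced outside the database. The following is a minimal Python sketch, not part of Cúram or DB2: the helper name fits_column, the 5-byte column width, and the sample strings are all assumptions chosen for illustration, and UTF-8 is assumed as the database encoding.

```python
def fits_column(value: str, column_bytes: int, encoding: str = "utf-8") -> bool:
    """Hypothetical helper: True if the encoded value fits in column_bytes bytes."""
    return len(value.encode(encoding)) <= column_bytes


# Assumed column definition for illustration: CHAR(5), sized in bytes.
COLUMN_BYTES = 5

single_byte = "Curam"        # 5 characters, 5 bytes in UTF-8
multi_byte = "C\u00faram"    # 5 characters, 6 bytes in UTF-8 (U+00FA takes 2 bytes)

for text in (single_byte, multi_byte):
    chars = len(text)                      # what a character-count check sees
    octets = len(text.encode("utf-8"))     # what a byte-sized column must hold
    print(f"{text!r}: chars={chars}, bytes={octets}, "
          f"fits={fits_column(text, COLUMN_BYTES)}")
```

Both strings pass a check that counts characters, but only the first fits once the byte length is considered, which is the situation in which the client-side validation does not catch the overflow.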

How Cúram Addresses the Issue

Cúram provides modeling and build-time capabilities to resize its database columns to address the issue above. These capabilities are described further in the Cúram Modeling Reference Guide and Cúram Server Developer's Guide.

Because Cúram supports multiple languages out of the box, its support for MBCS data is enabled by default with the maximum expansion set. These expansion settings ensure that new users, testing environments, and so on do not encounter errors caused by their language, encoding, and database sizing. Users may also find that they require MBCS data when they import or copy and paste data from other applications into their Cúram system. However, these defaults may not be appropriate for all environments. The following section describes some considerations for changing these expansion settings.

What You Need to Consider

It is very important to consider your data encoding requirements carefully with respect to DB2 and Cúram in order to avoid unexpected behavior in how the database stores characters.

The preceding illustration represents a boundary case in that the data length matches the maximum column width. In many cases an overflow is unlikely even with MBCS characters, because most data does not reach the maximum defined size; however, you do need to be prepared for the possibility of these errors.

You should use the database character set encoding appropriate to your application and environment. If possible, consider using a single-byte character set (SBCS) and encoding that supports your requirements. For example, CP1252 supports most Western European characters. However, CP1252 (or another SBCS encoding) may not support characters from different or "broader" character sets and encodings (e.g. UTF-8) that users may be accustomed to copying and pasting into their browser for Cúram.
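One way to gauge whether an SBCS encoding covers your data is to test representative sample values against it before you choose. The following is a minimal Python sketch under stated assumptions: the helper name encodable and the sample strings are hypothetical, and Python's cp1252 codec stands in for the database code page, which may not match it in every detail.

```python
def encodable(value: str, encoding: str) -> bool:
    """Hypothetical helper: True if every character in value exists in encoding."""
    try:
        value.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Assumed sample data of the kind users might paste into a form field.
samples = [
    "caf\u00e9",              # Western European accented letter
    "\u20ac100",              # euro sign, present in CP1252
    "\u65e5\u672c\u8a9e",     # Japanese text, not representable in CP1252
]

for text in samples:
    print(f"{text!r}: cp1252={encodable(text, 'cp1252')}, "
          f"utf-8={encodable(text, 'utf-8')}")
```

Any sample that fails for the SBCS encoding but succeeds for UTF-8 suggests that MBCS support, and the corresponding column sizing, is needed for that data.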

When you install your DB2 (or DB2 for z/OS) database, you only need to identify whether you require SBCS or MBCS data and be prepared to take the appropriate action before building your Cúram database: