Unicode migration with an Oracle database

Out of Babylon

Standard Types Instead of National Characters

If the database only occasionally needs to store special characters, such as Arabic or Kanji, you can also use what are known as national language data types (NCHAR, NVARCHAR2, or NCLOB).

Converting the database to Unicode removes this need and the "N" data types can be converted back to "normal" data types. To do this, you simply modify the CREATE command for the corresponding table.

Corrupters

Corrupters can be a difficult point to master because even if the database did everything correctly, the data are still a mess. This can happen when two systems transfer data without converting. For example, say you create the database with the WE8MSWIN1252 character set, and a client sets the variable NLS_LANG=GERMAN_GERMANY.WE8MSWIN1252 (i.e., on the command line in DOS); the database then assumes that no transcoding is necessary because the two character sets are the same. The DOS client thinks that the characters in the query are okay, because no conversion occurs in the return path either. Now, when you use a Windows client for a query (e.g., Dell Toad for Oracle), you suddenly discover that you have just saved a bunch of hieroglyphics (Figure 1).

Figure 1: Faulty character representation after a data transfer without conversion.

This error can only be fixed by unloading the data via a corresponding interface and reloading with the "right" character set. This makes the migration considerably more difficult. Fortunately, these errors are rather rare today because they are quickly identifiable in graphical tools.

Migration

The actual migration takes place in the following four steps. Figure 2 represents the sequence schematically.

Figure 2: Schematic representation of a migration.

Create a new database. The new database is set up to reflect enterprise, and Oracle, specifications with a new character set (typically AL32UTF8). One pleasant side effect is that the database is completely reorganized, which mean that superfluous data (added to the database, for example, by installing sample schemas or older applications) can now be eliminated.

Export the schema definitions as an ASCII file. The next step is to export the schema definitions without the length semantics as an ASCII file (Figure 3). This gives you the option of cleaning up.

Figure 3: Exporting a database schema as an ASCII file.

Oracle offers a package named dbms_metadata for this task. I have had very good experiences with Toad for Oracle. The tool uses a graphical interface to facilitate the process of creating the complete definition, and multiple schemas, as an ASCII file and copying it to the editor. You can then use find and replace, if needed, to make changes (for example converting NVARCHAR2 to VARCHAR2).

Import the schema definitions. Next, run the script you just created with the schema definition against the new database. Before doing so, set the NLS_LENGTH_SEMANTICS parameter for the entire session:

SQL> ALTER SESSION SET NLS_LENGTH_SEMANTICS=CHAR;

This means that all character columns (i.e., CHAR, VARCHAR2, etc.) that do not use explicit length semantics use CHAR as their length semantics. And, non-standard characters will once again fit in the fields.

Import records. The last step is to import the records – that is, to transfer the table content conveniently via Export/Import or Data Pump. When you import the data, be sure to ignore the commands that would create the tables, because they already exist. Also, you must turn off foreign key constraints because it is impossible to determine the data insertion sequence. Otherwise, you might discover that detailed records (e.g., the address of a person) cannot be inserted because the master data for the person (i.e., first name, last name, etc.) is not yet available. The only thing you are interested in is that all data are inserted at the end of the import process; the address details should then be related to the person again.

Buy ADMIN Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

comments powered by Disqus
Subscribe to our ADMIN Newsletters
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs



Support Our Work

ADMIN content is made possible with support from readers like you. Please consider contributing when you've found an article to be beneficial.

Learn More”>
	</a>

<hr>		    
			</div>
		    		</div>

		<div class=