iSeries Data Migration Toolkit
Requirements
Sun Java 2 V1.4 SDK or equivalent.
Key Features
extends the usefulness of IBM iSeries and AS/400 data
completely converts iSeries or AS/400 physical file DDS and data to the Java platform
converts physical file DDS into XML
converts field reference file XML into DTD
produces SQL DDL schemas for creating relational tables
produces source code Java programs that convert iSeries binary data into loader format
uses national EBCDIC to Unicode conversion
generated source code for conversion programs is customizable
substitutes user-defined characters for illegal characters in names
substitutes long descriptive field names for abbreviated field names
Description
The utility of the data processed and stored on IBM iSeries and AS/400 systems may be extended by copying the data to other platforms, where additional processing capability is available. IBM systems employ proprietary encoding schemes to store data, and reformatting this data for use on other platforms requires a description of its structure to guide conversion. The structure of data stored on IBM systems is described using a proprietary language that may be interpreted and translated to produce the equivalent data structure description in a common standard language employed on other platforms.
The iSeries Data Migration Toolkit (DMK) enables users of these IBM systems to easily migrate their data files to any other platform, or into the Java environment of their iSeries systems. It is a collection of tools, executing on these other platforms, that translate physical file Data Description Specifications (DDS) into equivalent Extensible Markup Language (XML) descriptions, and create Java programs from these data structure descriptions for reformatting iSeries binary data for loading into files or databases, or for analysis. The XML descriptions give these other platforms the same record structure description capability that DDS provides on iSeries systems. The toolkit uses and generates standard Java as distributed freely by Sun in their SDK. The DMK is platform-independent, and no software other than a text editor and the SDK is required. A DMK for System/38 DDS and data may be special ordered.
Copying iSeries Files
DDS source files and binary data files are copied from the iSeries or AS/400 system over the user's local area network (LAN) to a connected system. If LAN connectivity to the AS/400 is not available, the source files and binary data are copied to removable media readable by the target system. Translation of DDS source file encoding from EBCDIC into ASCII is normally performed automatically by the means used for copying between systems. Custom translators for iSeries source files with encodings untranslatable into UTF-8 may be special ordered.
DDS to XML Translation
Physical file DDS source files written in the proprietary IBM specification language are translated into XML, using tags derived from the DDS language. The entire DDS specification is translated. Specifications and overrides referencing previously translated field reference files are substituted when the resulting XML description is next used. An example Java program for reading and processing XML is included. The program uses the Simple API for XML (SAX) to read and parse the XML.
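As an illustration of reading such a description with SAX, the following is a minimal sketch and not the example program shipped with the toolkit. The element and attribute names (`file`, `record`, `field`, `name`, `type`, `length`) are hypothetical stand-ins, because the toolkit's actual tag set is not shown here, and the code assumes a current JDK rather than the V1.4 SDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DdsXmlReaderSketch {

    /** Returns "name:type:length" for every hypothetical <field> element in the description. */
    static List<String> readFields(String xml) {
        final List<String> fields = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // SAX reports each start tag as it is parsed; only <field> is of interest here.
                if ("field".equals(qName)) {
                    fields.add(atts.getValue("name") + ":" + atts.getValue("type")
                            + ":" + atts.getValue("length"));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse file description", e);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Assumed shape of a translated physical file description.
        String xml = "<file name=\"CUSTMAST\"><record name=\"CUSTREC\">"
                + "<field name=\"CUSTNO\" type=\"S\" length=\"6\"/>"
                + "<field name=\"NAME\" type=\"A\" length=\"30\"/>"
                + "</record></file>";
        for (String field : readFields(xml)) {
            System.out.println(field);
        }
    }
}
```

SAX suits this task because a file description is read once from top to bottom, so no in-memory document tree is needed.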
Field Reference Files
DDS field reference files, the data dictionary feature of DDS, are first converted to XML, then translated into Document Type Definition (DTD) files to provide substitution entities for XML file descriptions coded with substitutable references. Changes to the field reference XML are propagated to referencing descriptions when next used. This DMK feature parallels the use of field reference files on iSeries systems.
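The substitution mechanism can be sketched with a DTD general entity. This is an illustration under stated assumptions: the element names and the `CUSTNO` entity are hypothetical, and the DTD is written as an internal subset to keep the example self-contained, whereas the toolkit produces separate DTD files.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FieldRefEntitySketch {

    /** Builds a file description whose CUSTNO field comes from a substitution entity. */
    static String describeWith(String custnoLength) {
        return "<!DOCTYPE file [\n"
             // Stand-in for one entry of a DTD generated from a field reference file.
             + "  <!ENTITY CUSTNO '<field name=\"CUSTNO\" type=\"S\" length=\"" + custnoLength + "\"/>'>\n"
             + "]>\n"
             + "<file name=\"ORDERS\"><record name=\"ORDREC\">"
             + "&CUSTNO;"                                   // substitutable reference
             + "<field name=\"QTY\" type=\"S\" length=\"5\"/>"
             + "</record></file>";
    }

    /** Parses the description and returns field name to length, with entities expanded by the parser. */
    static Map<String, String> fieldLengths(String xml) {
        try {
            NodeList nodes = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getElementsByTagName("field");
            Map<String, String> lengths = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element field = (Element) nodes.item(i);
                lengths.put(field.getAttribute("name"), field.getAttribute("length"));
            }
            return lengths;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse file description", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fieldLengths(describeWith("6")));
        // Changing only the entity definition changes what the referencing description yields.
        System.out.println(fieldLengths(describeWith("8")));
    }
}
```

Because the parser expands the entity each time the description is read, a change to the entity definition reaches every referencing description without editing them.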
SQL DDL Schemas
The generated XML descriptions are used by the toolkit to produce ANSI SQL Data Definition Language (DDL) schemas. The schema generator utilizes a commonly-used subset of DDS positional and keyword parameters. Date and time fields are described as plain text fields, because correct date and time translation depends on the combination of source locale and the target database. Custom implementation of these and other keywords may be provided on special order.
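A schema generator of this kind might look roughly like the sketch below. The mapping from DDS data type codes to SQL types is an assumption chosen for illustration, not the toolkit's actual mapping, and the `Field` class and all names are hypothetical. Date, time and timestamp fields (DDS types L, T and Z) are emitted as plain character columns, as described above.

```java
import java.util.Arrays;
import java.util.List;

class DdlSketch {

    /** Minimal field description: DDS name, data type code, length and decimal positions. */
    static final class Field {
        final String name;
        final char type;
        final int length;
        final int decimals;

        Field(String name, char type, int length, int decimals) {
            this.name = name;
            this.type = type;
            this.length = length;
            this.decimals = decimals;
        }
    }

    /** Assumed mapping from DDS data type codes to SQL column types. */
    static String sqlType(Field f) {
        switch (f.type) {
            case 'A': return "CHAR(" + f.length + ")";                        // character
            case 'S': return "NUMERIC(" + f.length + "," + f.decimals + ")";  // zoned decimal
            case 'B': return f.length <= 4 ? "SMALLINT" : "INTEGER";          // binary
            case 'F': return "DOUBLE PRECISION";                              // floating point
            case 'L':                                                         // date
            case 'T':                                                         // time
            case 'Z': return "CHAR(" + f.length + ")";                        // timestamp, as text
            default:
                throw new IllegalArgumentException("unsupported DDS type: " + f.type);
        }
    }

    /** Emits one CREATE TABLE statement with one column per field. */
    static String createTable(String table, List<Field> fields) {
        StringBuilder sql = new StringBuilder("CREATE TABLE ").append(table).append(" (\n");
        for (int i = 0; i < fields.size(); i++) {
            Field f = fields.get(i);
            sql.append("  ").append(f.name).append(' ').append(sqlType(f));
            if (i < fields.size() - 1) {
                sql.append(',');
            }
            sql.append('\n');
        }
        return sql.append(");").toString();
    }

    public static void main(String[] args) {
        System.out.println(createTable("CUSTMAST", Arrays.asList(
                new Field("CUSTNO", 'S', 6, 0),
                new Field("NAME", 'A', 30, 0),
                new Field("OPENED", 'L', 10, 0))));
    }
}
```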
Data Conversion
The DMK generates Java source code programs from XML file descriptions for conversion of iSeries binary data into Unicode comma-delimited text. The conversion programs translate binary data field-by-field and record-by-record. Fields are translated according to their data types and other specifications from the subset of DDS parameters described under SQL DDL Schemas. Text field translation is dependent on the encoding configuration of the source iSeries system as specified by DDS CCSID keyword parameters or as specified by the user. EBCDIC single-byte character set (SBCS) and double-byte character set (DBCS) to Unicode conversions are supported through Java classes from the Java SDK. Some 40 national EBCDIC encodings and their variants are directly supported. Custom encodings may be special ordered.
The Java conversion programs are easily adaptable by programmers to reformat data files into XML for export to other environments, or for direct insertion of records into flat files, or rows into relational tables, without intermediate loader files. An alternative use of the method allows XML-described fixed record length binary data files from any source to be translated, including non-IBM multi-byte encodings. Custom conversion program generators may be special ordered.
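Field-by-field conversion of one record can be sketched as follows. This is not code generated by the toolkit: the record layout (a 6-digit zoned customer number, a 5-character name, a 5-digit zoned balance with 2 decimals, and a 4-byte binary quantity) and all names are hypothetical, and the encoding is assumed to be Cp037, which requires a JDK installation that includes the extended charsets.

```java
import java.math.BigDecimal;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

class RecordConversionSketch {

    // Assumption: source data is CCSID 37. Substitute the encoding that matches the source CCSID.
    static final Charset EBCDIC = Charset.forName("Cp037");

    /** Text field: decode EBCDIC bytes to Unicode and trim leading and trailing blanks. */
    static String text(byte[] rec, int off, int len) {
        return new String(rec, off, len, EBCDIC).trim();
    }

    /** Zoned decimal: the low nibble of each byte is a digit; zone 0xD on the last byte means negative. */
    static BigDecimal zoned(byte[] rec, int off, int len, int decimals) {
        long unscaled = 0;
        for (int i = 0; i < len; i++) {
            unscaled = unscaled * 10 + (rec[off + i] & 0x0F);
        }
        boolean negative = (rec[off + len - 1] & 0xF0) == 0xD0;
        return BigDecimal.valueOf(negative ? -unscaled : unscaled, decimals);
    }

    /** 4-byte binary field: big-endian two's complement, which is ByteBuffer's default byte order. */
    static int binary4(byte[] rec, int off) {
        return ByteBuffer.wrap(rec, off, 4).getInt();
    }

    /** Converts one fixed-length record of the assumed layout into a comma-delimited line. */
    static String toCsv(byte[] rec) {
        String name = text(rec, 6, 5).replace("\"", "\"\"");   // escape embedded quotes
        return zoned(rec, 0, 6, 0).toPlainString()
                + ",\"" + name + "\","
                + zoned(rec, 11, 5, 2).toPlainString()
                + "," + binary4(rec, 16);
    }

    /** Helper for writing byte values above 0x7F without casts. */
    static byte[] bytes(int... values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (byte) values[i];
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] rec = bytes(
                0xF0, 0xF0, 0xF1, 0xF2, 0xF3, 0xF4,   // CUSTNO  001234
                0xE2, 0xD4, 0xC9, 0xE3, 0xC8,         // NAME    SMITH
                0xF1, 0xF2, 0xF3, 0xF4, 0xD5,         // BALANCE -123.45
                0x00, 0x00, 0x00, 0x2A);              // QTY     42
        System.out.println(toCsv(rec));               // 1234,"SMITH",-123.45,42
    }
}
```

A generated program would loop over the data file in record-length blocks and apply the same per-field calls with offsets taken from the XML description.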
Field Name Translation
iSeries programming languages permit the use of special characters (asterisks, etc.) in file, record and field names. Such characters are usually illegal or have special meaning in other programming languages, including Java. The DDS translator provides default or user-defined substitution of such characters.
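One possible form of such a substitution is sketched below. The rule that only letters, digits and underscores are kept is an assumption made for illustration, and the `sanitize` method and its parameters are hypothetical rather than part of the toolkit.

```java
class NameSanitizerSketch {

    /**
     * Replaces every character other than A-Z, a-z, 0-9 and underscore with the
     * given substitute, and prefixes an underscore if the result would be empty
     * or start with a digit.
     */
    static String sanitize(String name, String substitute) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean legal = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9') || c == '_';
            if (legal) {
                out.append(c);
            } else {
                out.append(substitute);
            }
        }
        if (out.length() == 0 || Character.isDigit(out.charAt(0))) {
            out.insert(0, '_');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("CUST#", "_"));     // default-style substitution: CUST_
        System.out.println(sanitize("ORD@NO", "_AT_")); // user-defined substitution: ORD_AT_NO
    }
}
```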
Field Name Substitution
Use of the DDS ALIAS keyword parameter allows the toolkit to optionally substitute long data field names in DDL schemas for the abbreviations imposed by the fixed form DDS field name specification. Custom logic for file, record and field name substitution may also be developed on special order.
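The optional substitution can be sketched as a simple lookup. The field names, alias values and the `useAlias` switch below are hypothetical and only illustrate the idea; they are not the toolkit's interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AliasSketch {

    /** Chooses the ALIAS name when substitution is enabled and an alias exists, else the DDS name. */
    static String columnName(String fieldName, String alias, boolean useAlias) {
        return (useAlias && alias != null && !alias.isEmpty()) ? alias : fieldName;
    }

    /** Builds a comma-separated column list from a map of DDS field name to alias (alias may be null). */
    static String columnList(Map<String, String> aliasesByField, boolean useAlias) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : aliasesByField.entrySet()) {
            if (out.length() > 0) {
                out.append(", ");
            }
            out.append(columnName(e.getKey(), e.getValue(), useAlias));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("CUSTNO", "CUSTOMER_NUMBER"); // field coded with ALIAS(CUSTOMER_NUMBER)
        fields.put("CITY", null);                // field without an ALIAS keyword
        System.out.println(columnList(fields, true));   // CUSTOMER_NUMBER, CITY
        System.out.println(columnList(fields, false));  // CUSTNO, CITY
    }
}
```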
Supported Field Types
Field Type | Data Conversion |
Text | yes |
Zoned Decimal | yes |
Binary | yes |
Floating point | yes |
Hexadecimal | special order |
Date | alphanumeric |
Time | alphanumeric |
Timestamp | alphanumeric |
Supported Keyword Parameters
Keyword Parameter | Description | Translated Into XML | Translated Into DTD | Used In Data Conversion |
ABSVAL | Absolute Value | yes | | |
ALIAS | Alternative Name | yes | yes | |
ALTSEQ | Alternative Collating Sequence | yes | | |
ALWNULL | Allow Null Value | yes | | |
CCSID | Coded Character Set Identifier | yes | | yes |
CHECK | Check | yes | yes | |
CHKMSGID | Check Message Identifier | yes | yes | |
CMP | Comparison | yes | yes | |
COLHDG | Column Heading | yes | yes | |
COMP | Comparison | yes | yes | |
DATFMT | Date Format | yes | yes | special |
DATSEP | Date Separator | yes | yes | special |
DESCEND | Descend | yes | | |
DFT | Default | yes | | |
DIGIT | Digit | yes | | |
EDTCDE | Edit Code | yes | yes | |
EDTWRD | Edit Word | yes | yes | |
FCFO | First-Changed First-Out | yes | | |
FIFO | First-In First-Out | yes | | |
FLTPCN | Floating-Point Precision | yes | yes | yes |
FORMAT | Format | yes | | yes |
LIFO | Last-In First-Out | yes | | |
NOALTSEQ | No Alternative Collating Sequence | yes | | |
RANGE | Range | yes | | |
REF | Reference | yes | | yes |
REFFLD | Referenced Field | yes | | yes |
REFSHIFT | Reference Shift | yes | yes | |
TEXT | Text | yes | no | |
TIMFMT | Time Format | yes | yes | special |
TIMSEP | Time Separator | yes | yes | special |
UNIQUE | Unique | yes | | yes |
UNSIGNED | Unsigned | yes | | |
VALUES | Values | yes | | |
VARLEN | Variable-Length Field | yes | yes | yes |
ZONE | Zone | yes | | |
Supported EBCDIC Encodings
CCSID | SDK Support | Description |
37 | Cp037 | US, Canada, Netherlands, Portugal, Brazil, New Zealand, Australia |
256 | | Netherlands |
273 | Cp273 | Austria, Germany |
277 | Cp277 | Denmark, Norway |
278 | Cp278 | Finland, Sweden |
280 | Cp280 | Italy |
284 | Cp284 | Catalan/Spain, Spanish Latin America |
285 | Cp285 | United Kingdom, Ireland |
290 | | Japan Katakana (extended) range |
297 | Cp297 | France |
300 | | Japan, English |
420 | Cp420 | Arabic |
423 | | Greece |
424 | Cp424 | Hebrew |
500 | Cp500 | Belgium, Canada, Switzerland, International Latin-1 |
833 | | Korea (extended range) |
834 | | Korea Host DBCS |
835 | | DBCS Traditional Chinese Host |
838 | Cp838 | Thailand extended SBCS |
870 | Cp870 | Latin 2 Multilingual |
871 | Cp871 | Iceland |
875 | Cp875 | Greek |
880 | | Cyrillic Multilingual |
892 | | EBCDIC, OCR A |
893 | | EBCDIC, OCR B |
905 | | Turkey Latin-3 |
918 | Cp918 | Pakistan (Urdu) |
924 | | Latin 9 |
930 | Cp930 | Japanese Katakana-Kanji mixed with 4370 UDC, superset of 5026 |
933 | Cp933 | Korean Mixed with 1880 UDC, superset of 5029 |
935 | Cp935 | Simplified Chinese Host mixed with 1880 UDC, superset of 5031 |
937 | Cp937 | Traditional Chinese Host mixed with 6204 UDC, superset of 5033 |
939 | Cp939 | Japanese Latin Kanji mixed with 4370 UDC, superset of 5035 |
1025 | Cp1025 | Multilingual Cyrillic: Bulgaria, Bosnia, Herzegovina, Macedonia (FYR) |
1026 | Cp1026 | Latin-5 Turkey |
1027 | | Japanese (Latin) Extended |
1069 | | Latin 4 |
1087 | | Symbol Set (Adobe) |
1097 | Cp1097 | Iran (Farsi)/Persian |
1110 | | Latin 2 Multilingual |
1112 | Cp1112 | Baltic Multilingual |
1113 | | Latin 6 |
1122 | Cp1122 | Estonia |
1123 | Cp1123 | Ukraine |
1130 | | Vietnamese |
1132 | | Lao |
1136 | | Hitachi Katakana |
1137 | | Devanagari |
1140 | Cp1140 | Variant of Cp037 with Euro character |
1141 | Cp1141 | Variant of Cp273 with Euro character |
1142 | Cp1142 | Variant of Cp277 with Euro character |
1143 | Cp1143 | Variant of Cp278 with Euro character |
1144 | Cp1144 | Variant of Cp280 with Euro character |
1145 | Cp1145 | Variant of Cp284 with Euro character |
1146 | Cp1146 | Variant of Cp285 with Euro character |
1147 | Cp1147 | Variant of Cp297 with Euro character |
1148 | Cp1148 | Variant of Cp500 with Euro character |
1149 | Cp1149 | Variant of Cp871 with Euro character |
1153 | | Latin 2 Multilingual with euro |
1154 | | Cyrillic Multilingual with euro |
1155 | | Turkey with euro |
1156 | | Baltic Multi with euro |
1157 | | Estonia with euro |
1158 | | Cyrillic, Ukraine with euro |
1164 | | Vietnamese with euro |
1165 | | Latin 2 EBCDIC/Open Systems |
1364 | | EBCDIC |
1388 | | EBCDIC |
4396 | | Japanese Host DB including 1880 |
5026 | | EBCDIC, Subset of 933 |
5035 | | EBCDIC |
5123 | | EBCDIC |
8482 | | Host SBCS Katakana |
9030 | | Thailand |
28709 | | SBCS Traditional Chinese Host (w/ euro update) |
The AS/400 source code examples referenced on this page are from "Application System/400 Application Development by Example", SC41-9852-00, International Business Machines Corporation, 1991.