DCOM Explained
by Rosemary Rock-Evans
Digital Press
ISBN: 1555582168   Pub Date: 09/01/98



How does translation actually work?

There are two parts to the process of translating instructions to the DBMS: the translation of the DML, and the translation of the names used for records, data items, and so on.

Name translation

The translation of names is an optional component of the database connectivity software, but it adds further transparency if it is available and can be used to simplify names which may be obscure or may have been restricted by bizarre DBMS rules (TOTAL, for example, only allowed data item names to be eight characters long).

The translation of names is usually handled via a Directory/Dictionary, which contains the names the developer can use to access the data and the names that are actually used in the DBMSs themselves. The creation of this Directory is a one-off job, but it must of course be kept up-to-date as the database designs change.
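The Directory lookup described above can be sketched as a simple mapping. This is only an illustration under stated assumptions: the names in NAME_DIRECTORY and the helper functions are hypothetical, and a real product would hold the Directory in its own store rather than in program code.

```python
# Hypothetical Directory: developer-facing name -> name actually used in the DBMS.
# Every name here is invented for illustration; the physical names are kept to
# eight characters to mimic the kind of restriction mentioned for TOTAL.
NAME_DIRECTORY = {
    "customer": "CUSTMAST",
    "customer_number": "CUSTNO",
    "customer_surname": "CUSTSNAM",
}


def to_dbms_name(logical_name: str) -> str:
    """Return the physical DBMS name registered for a developer-facing name."""
    try:
        return NAME_DIRECTORY[logical_name]
    except KeyError:
        # A missing entry means the Directory was not kept up to date.
        raise KeyError(f"no Directory entry for {logical_name!r}") from None


def translate_names(logical_names):
    """Translate a list of developer-facing names, preserving their order."""
    return [to_dbms_name(name) for name in logical_names]
```

For example, translate_names(["customer", "customer_number"]) yields the physical names the DBMS expects, and an unknown name raises an error instead of being passed through unchanged.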

Translation of the DML

The translation between the DMLs of various DBMSs may seem a simple enough concept at first, but the complexities involved in true translation only start to emerge once one examines the differences between the DBMSs. In essence, a translation tool has to be able to smooth out and provide one interface which covers:

  The differences between the basic underlying models used by the DBMS vendors
  The differences in DML
  The differences in dialect between supposed standard DMLs

Translation of the underlying models-We have seen what the main types of DBMS are. Each of these types of DBMS uses different concepts, and the database connectivity software has to translate first and foremost between concepts before it can translate the DML itself, for example:

  Two-level networks-Two-level network DBMSs use Master Data records and Detail Data records with Elements, synonym chains, link paths, Manual Masters, and Automatic Masters (indexes).
  Hierarchical-Hierarchical DBMSs use segments, links, root segments, indexes, and data items or fields.
  Network (Codasyl)-Codasyl (network) databases are based on the concept of records (types), data items, sets, and areas. Sets can be ordered, indexes are allowed on records, and sets can contain one or more record types.
  Relational-Relational DBMSs are based on tables, columns, and foreign keys, and tables can be indexed. The link between tables is implemented implicitly using foreign keys, not explicitly using sets.

Although it is possible to provide some correlation between concepts such as the record (table, segment, Data Record, etc.), data item (column, element, field), and set (link, link path, embedded foreign key), not all concepts do translate. If the developer has used an ordered set, for example, in IDMS, no direct equivalent exists in the relational world. The closest concept might be an index on a specific key of the relational DBMS, which is itself ordered. One area of particular complexity is the use of the concepts themselves to record integrity rules.

Codasyl DBMSs can be designed to incorporate many of the rules of data integrity within their design (the use of mandatory and optional sets, for example); only recently have relational databases incorporated similar checks by using stored procedures, and some types of DBMS have no built-in integrity checking mechanism. There are thus basic differences in the way integrity is handled by each DBMS—automatically by using the design, automatically by using stored procedures, or not at all.

Equally important is the fact that even DBMSs within the same family may not support all the concepts. Stored procedures, for example, are a feature of relational DBMSs. They can be used to not only implement integrity rules and validation procedures (e.g., validate date) but to perform calculations (e.g., calculate age from date of birth). But not all relational DBMSs support stored procedures.

The different DMLs (Data Manipulation Languages)-Each of the different types of DBMS uses a different DML. The relational language includes update commands such as INSERT, UPDATE, and DELETE, and similar commands exist in other DBMS types. The correspondence between update commands in the various DBMS types is thus reasonably clear cut, although the DMLs of nonrelational DBMSs often contain more explicit references to the locking to be used and the creation of indexes.

The main differences are to be found in the commands used for “navigation” around the data in the database. Relational language queries are what is called nonprocedural in that they do not require the programmer to navigate his way around the database. Codasyl and two-level network databases, for example, do require navigation, and the programmer defines the query by traversing sets or links, by direct access, or by access using indexes. Whereas a programmer using a relational database may use commands such as SELECT * FROM and SELECT statements that join tables (for example, SELECT C1, C2, C3 FROM T1, T2 WHERE T1.C4 = T2.C5), a programmer using a Codasyl database would use commands such as:

FIND CUSTOMER DB_KEY IS CUSTOMER_NO
FIND NEXT ORDER WITHIN CUST-HAS-ORDER
FIND OWNER WITHIN PROD-HAS-ORDLINE

If the Codasyl-based DML being input to the translation process requires a traversal of a set, the relational DML being output from the translation process has to provide the equivalent, for example, access to a table via an index or specific foreign key.
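As a rough sketch of what such an equivalent might look like, the example below uses Python's built-in sqlite3 module to stand in for a relational DBMS. The table layout, column names, index name, and sample rows are all assumptions made for illustration; the idea is only that walking the orders in a set such as CUST-HAS-ORDER becomes a single query on a foreign key, supported by an index.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical customer and orders tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (customer_no INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            order_no    INTEGER PRIMARY KEY,
            customer_no INTEGER REFERENCES customer(customer_no)
        );
        -- The index plays the role of the set's access path.
        CREATE INDEX orders_by_customer ON orders (customer_no, order_no);
        INSERT INTO customer VALUES (1, 'Smith'), (2, 'Jones');
        INSERT INTO orders VALUES (10, 1), (11, 2), (12, 1);
    """)
    return conn


def orders_within_cust_has_order(conn, customer_no):
    """Relational stand-in for locating a customer and then repeating
    FIND NEXT ORDER WITHIN CUST-HAS-ORDER until the set is exhausted."""
    rows = conn.execute(
        "SELECT order_no FROM orders WHERE customer_no = ? ORDER BY order_no",
        (customer_no,),
    )
    return [order_no for (order_no,) in rows]
```

Calling orders_within_cust_has_order(build_demo_db(), 1) returns the order numbers held for customer 1. The ORDER BY clause approximates the ordering that an ordered set would otherwise provide implicitly, which is one of the points a translator has to make explicit.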

There are two essentially different approaches being used to get at data.

