PRIDE®-DBEM
Data Base Engineering Methodology
PHASE 4 - ENTERPRISE PHYSICAL DATA BASE DESIGN
ACTIVITY A - DEVELOP HARDWARE/SOFTWARE APPROACH


    BUSINESS PURPOSE

    The purpose of this activity is to determine an enterprise-wide physical data base design strategy based on logical data base design specifications. Consideration is given to both hardware and software. The activity is performed by Data Base Administration and Data Communications Administration.  

    OVERVIEW

    Data Base Administration (DBA) and Data Communications Administration (DCA) review the preliminary system and data base designs to become familiar with data and processing requirements. From these specifications, DBA/DCA will devise a method of implementation that will be reviewed with management in the next activity. Because this process can be extensive, a Detail estimate/schedule is initially prepared only for Activity A. By the end of the activity, DBA/DCA will have greater awareness of design requirements and will be in a position to prepare a complete Detail estimate/schedule for the remaining activities of Phase 4 (B - E).

    In contrast with the logical models, the Enterprise Physical Data Base Model (EPDBM) is quite heterogeneous. It may have to represent the structures of various DBMS products, conventional files, and even manual files, and it must also cope with existing structures that may or may not be represented in the logical models.

    Data Base Administration is faced with the following hardware/software choices for each newly created record type within a new or existing ELDBM Object:

    • Manual File
    • Conventional Computer File
    • Hierarchical DBMS
    • Network DBMS
    • Relational DBMS

    The following sections discuss the situations in which each selection is preferable.

    MANUAL FILES

    There are numerous situations where manual files are the best selection. Signed documents, contracts, and many other kinds of paper documents are still stored in most enterprises and will remain in use until we become a completely paperless society (if that ever happens). Although we should not document every piece of paper in a Data Base Model, some are of great importance for the conduct of business. If they were not included in these models, the business intelligence provided would be grossly incomplete.

    Documents can be defined as inputs and outputs depending on how they are used. If they are used to collect data (as is the case for most forms), they are regarded as inputs. If they are printed reports, they are regarded as outputs. There may be occasions where a form is used as a "turnaround document." In other words, it may be used to collect data and then be presented as a report. In this situation, the form/report is documented as both an input and an output.

    CONVENTIONAL COMPUTER FILES

    In the absence of a DBMS, this is the only computer-based choice. Even if a DBMS is available, one may choose to use conventional files for a variety of reasons, depending on the characteristics of the applications involved. In general, conventional files can be sequential, indexed, or direct access, with various options that depend on the particular operating system being used.

    If a response time of hours or more is acceptable to all applications that use a record, and relationships with other records have little or no use, batch processing of sequential files may be the most economical solution. For example, many large commercial banks process hundreds of thousands of transactions a day that come in the mail or through Clearing Houses. There is no point in providing response times of seconds to these transactions (which would be near the limit of the fastest DBMS with currently available hardware) since they have a daily processing cycle.

    At the other extreme, some real time control applications requiring split-second response times must also use flat direct access files since available DBMS products are slower.

    These are just some of the situations where the use of conventional files is preferred to a DBMS, even considering that the conventional files do not have the same support for sharing, data independence, reliability and other benefits of DBMS technology.
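
    The difference between sequential (batch) and direct access can be sketched as follows. This is only an illustration, not part of the methodology: the record layout, the function names, and the in-memory buffer standing in for a disk file are all assumptions.

```python
import io
import struct

# Hypothetical fixed-length record: 4-byte account number, 8-byte balance in cents.
RECORD = struct.Struct("<Iq")


def build_file(balances):
    """Write records so that record i sits at byte offset i * RECORD.size."""
    buf = io.BytesIO()  # stands in for a conventional file on disk
    for account, cents in enumerate(balances):
        buf.write(RECORD.pack(account, cents))
    return buf


def batch_total(buf):
    """Sequential pass: read every record in physical order (batch style)."""
    buf.seek(0)
    total = 0
    while chunk := buf.read(RECORD.size):
        _, cents = RECORD.unpack(chunk)
        total += cents
    return total


def direct_read(buf, account):
    """Direct access: compute the offset from the key and read one record."""
    buf.seek(account * RECORD.size)
    return RECORD.unpack(buf.read(RECORD.size))


ledger = build_file([1500, 0, 98000, 250])
```

    Here `batch_total(ledger)` touches every record once, which suits a daily processing cycle, while `direct_read(ledger, 2)` fetches a single record without scanning the file. Neither function offers the sharing, data independence, or recovery support of a DBMS.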

    HIERARCHICAL DBMS

    Hierarchical DBMS packages have been very popular for many years, probably because much of human endeavor is hierarchical, especially if viewed within a limited environment. Most basic (operational level) information systems can use trees as the basic data structure. The restrictions of this structure appear when higher level integration is required for control or policy level information systems. Since hierarchical products are the oldest on the market, their manufacturers have had ample time for fine tuning. This puts them among the fastest of the DBMS types, and in some high transaction rate applications it makes them the only choice, provided that the limitations of the hierarchical structure are acceptable.

    IBM's IMS is the foremost representative of the hierarchical DBMS packages and the one most widely used. Its terminology will therefore be used in our descriptions of the mapping from the EPDBM to a hierarchical DBMS.

    The main price to be paid in using such a DBMS is the additional program complexity needed to handle multiple hierarchies, which will always be present in an integrated environment.
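
    A small sketch may help show where that complexity comes from. It uses plain Python dictionaries rather than IMS itself, and the CUSTOMER and ORDER segments, their fields, and the function names are hypothetical.

```python
# Hypothetical hierarchy: CUSTOMER root segments, each owning ORDER child segments.
customers = {
    "C1": {
        "name": "Acme",
        "orders": [
            {"order": "O1", "part": "P7"},
            {"order": "O2", "part": "P9"},
        ],
    },
    "C2": {
        "name": "Bolt",
        "orders": [{"order": "O3", "part": "P7"}],
    },
}


def orders_for_customer(db, customer_id):
    """Follows the hierarchy top-down: locate one root, then read its children."""
    return [child["order"] for child in db[customer_id]["orders"]]


def orders_for_part(db, part_id):
    """Cuts across the hierarchy: every root and every child must be visited."""
    return [
        child["order"]
        for root in db.values()
        for child in root["orders"]
        if child["part"] == part_id
    ]
```

    The first function matches the shape of the tree and stays simple. The second asks a question the tree was not organized around, so the program must either scan every customer or maintain a second hierarchy rooted at the part.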

    NETWORK DBMS

    A network DBMS is nearly as fast as a hierarchical DBMS and has extra flexibility in providing relationships. A network is clearly more complex than a hierarchy, but simpler than a large number of interconnected hierarchies which would be required for an integrated Enterprise Data Base.

    The CODASYL Data Base Task Group (DBTG) specification, which is closely followed by a number of commercial DBMS products, enhances the portability of applications among different machines. This may be of paramount importance in some situations.

    Another major advantage of this approach is the ability to explicitly specify and enforce properties related to referential integrity (through set membership clauses).

    Networks are appropriate for operational as well as control and policy level applications due to the ability to "navigate" through access paths and find related records. However, this places the burden of selecting access paths on the programmer, increasing complexity and reducing data independence.
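
    The following sketch imitates owner/member sets in plain Python to show both points: an integrity rule attached to the set, and navigation along paths chosen by the programmer. It is loosely modeled on the CODASYL idea and is not the DBTG data definition or manipulation language; the class, set names, and functions are hypothetical.

```python
class SetType:
    """A named owner-to-member relationship, loosely modeled on a CODASYL set."""

    def __init__(self, name):
        self.name = name
        self.members = {}   # owner key -> member keys, in insertion order
        self.owner_of = {}  # member key -> owner key

    def add_owner(self, owner):
        self.members.setdefault(owner, [])

    def connect(self, owner, member):
        # Integrity rule: a member may only be connected to an existing owner.
        if owner not in self.members:
            raise KeyError(f"{self.name}: no owner {owner!r}")
        self.members[owner].append(member)
        self.owner_of[member] = owner


customer_orders = SetType("CUSTOMER-ORDER")
part_orders = SetType("PART-ORDER")
for customer in ("C1", "C2"):
    customer_orders.add_owner(customer)
for part in ("P7", "P9"):
    part_orders.add_owner(part)


def store_order(order, customer, part):
    """Store an order as a member of both sets, or reject it entirely."""
    if customer not in customer_orders.members:
        raise KeyError(f"no customer {customer!r}")
    if part not in part_orders.members:
        raise KeyError(f"no part {part!r}")
    customer_orders.connect(customer, order)
    part_orders.connect(part, order)


def customers_for_part(part):
    """Navigate PART -> ORDER members -> CUSTOMER owner; the path is hand-coded."""
    return [customer_orders.owner_of[order] for order in part_orders.members[part]]


store_order("O1", "C1", "P7")
store_order("O2", "C1", "P9")
store_order("O3", "C2", "P7")
```

    An order record belongs to two sets at once, which a single hierarchy cannot express directly. Note that `customers_for_part` spells out the access path step by step; if the sets were reorganized, this function would have to be rewritten.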

    RELATIONAL DBMS

    The advantages of the relational approach have been well publicized and are very real. Among the most important are:

    • Enhanced Data Independence - the freedom from having to follow access paths in searching for data greatly reduces program complexity, increasing productivity. Even more important, it allows changes to these access paths to be transparent to the programs that use them.

    • Conceptual Soundness and Simplicity - the relational approach involves a simple data structure that is easily understood by any user, together with operators that behave in a well-defined way. This simplifies communication with end-users, allowing them to access the data base directly through ad-hoc queries.

    The advantages are great in terms of usability and productivity for both end-users and DP professionals, and they are enhanced by the standardization of the SQL language. The price to be paid is increased resource utilization, which is small in most practical cases. In terms of performance, all but very high volume, transaction oriented applications can usually be satisfied.
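
    As an illustration of data independence, the sketch below runs the same ad-hoc SQL query before and after an index is added. It uses Python's built-in `sqlite3` module only because it is readily available; the tables, columns, and index name are hypothetical and do not come from the methodology.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id TEXT PRIMARY KEY,
        customer_id TEXT REFERENCES customer(id),
        part TEXT
    );
    INSERT INTO customer VALUES ('C1', 'Acme'), ('C2', 'Bolt');
    INSERT INTO orders VALUES
        ('O1', 'C1', 'P7'), ('O2', 'C1', 'P9'), ('O3', 'C2', 'P7');
    """
)

# The query states what is wanted, not which access path to follow.
QUERY = """
    SELECT c.name
    FROM customer AS c
    JOIN orders AS o ON o.customer_id = c.id
    WHERE o.part = ?
    ORDER BY c.name
"""


def customers_for_part(part):
    return [row[0] for row in conn.execute(QUERY, (part,))]


before = customers_for_part("P7")
# Change the physical access path; the query text above is untouched.
conn.execute("CREATE INDEX orders_by_part ON orders (part)")
after = customers_for_part("P7")
```

    The program text is identical before and after the index is created, and `before` equals `after`. Choosing whether to use the new index is left to the DBMS, which is where the extra resource utilization mentioned above comes from.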

    HARDWARE SELECTION

    In addition to the software choices, final hardware decisions must be made at this point. A wide range of alternatives is available in general, but in practice one is usually confined to existing hardware except in very large and important projects. We will not discuss these alternatives in detail here; they include dedicated database machines and networks of microcomputers, minicomputers, and/or mainframes. Data communications and distributed data base hardware and software may be required and must be fully cost justified at this point.

    Selection of the proper technology for a particular implementation is a technically complex task, particularly in an integrated environment, since one has to balance the potentially conflicting performance requirements of various applications. In many cases, Data Base Administration must compromise performance of some applications in order to optimize others. The political implications of this may be far reaching and are certainly harder to deal with than the technical issues.

    The decisions on what to optimize must be taken, if not with the approval of all parties involved, at least with their knowledge. Few things can be more disastrous than a sudden performance degradation on a user's system caused by the integration or optimization of some other system.

    Very often, the best technical solution for a new implementation conflicts with the existing DBMS. Although it is not unusual to have more than one DBMS in an installation, Data Base Administration should avoid excessive proliferation of different packages not only because extra effort is needed to coordinate them, but also because of the additional expertise that has to be developed and maintained to support the different DBMS packages.

    The logical structures in the ELDBM can be implemented through any DBMS. Therefore, once the ELDBM is in place, it can serve as a guide to which systems are to be developed or redesigned. This constitutes a good opportunity to standardize and move towards the use of as few different DBMS packages as possible.

    WHO SHOULD PARTICIPATE?

    Data Base Administration and Data Communications Administration are primarily responsible for formulating the Enterprise Physical Data Base Design. The effort required to devise a method of physical implementation should not be underestimated, since it may require considerable thought and research. For this reason, DBA/DCA may seek assistance from a variety of functions as part of this process. Those who may lend assistance include:

    • Data Engineering - who can explain objects, views, and relationships in the Enterprise Logical Data Base Model (ELDBM).

    • Systems Engineering - who can explain objects, views, and relationships in the Application Logical Data Base Model (ALDBM), along with processing requirements for an information system. For example, Systems Engineering should be able to anticipate transaction volume, processing speed (as derived from Frequency/Offset/Response Time), proposed processing method (e.g., interactive, batch, manual processing), file processing (Create/Update/Reference), file volatility and hit ratio, etc.

    • Software Engineering - who can propose file access methods, and anticipated program work file requirements.

    • DP Operations - who participates in an advisory capacity, particularly in hardware planning.
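
    As an example of how figures supplied by Systems Engineering might feed the decision, the sketch below computes a hit ratio, taken here in its common sense as the fraction of a file's records accessed in one processing run. The function names and the 0.5 threshold are assumptions made purely for illustration; the methodology does not prescribe a cutoff.

```python
def hit_ratio(records_accessed, records_in_file):
    """Fraction of the file's records touched by one processing run."""
    if records_in_file <= 0:
        raise ValueError("records_in_file must be positive")
    return records_accessed / records_in_file


def suggest_access(records_accessed, records_in_file, threshold=0.5):
    """Suggest an access method; the threshold is a hypothetical tuning value."""
    ratio = hit_ratio(records_accessed, records_in_file)
    # A run that touches most of the file favors one sequential pass;
    # a run that touches few records favors direct access.
    return "sequential" if ratio >= threshold else "direct"
```

    For instance, a run that reads 200 of 1,000 records has a hit ratio of 0.2, and `suggest_access(200, 1000)` leans towards direct access. In practice the choice would also weigh response time, volatility, and the other factors listed above.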
     

    STEPS IN EXECUTION

    1. Data Engineering reviews the specifications resulting from Phases 1 and 2. Based on the specifications, Data Engineering prepares a Detail estimate and schedule to complete the phase. This is based, in part, on the Order-of-Magnitude estimate/schedule resulting from Phase 1. The Detail estimate/schedule is reviewed with Project Management for approval.

    2. Data Engineering defines the objects, views, and data elements associated with the ELDBM using FD, RD, and DD resource definitions in the IRM. Systems Engineering and Enterprise Engineering are consulted during resource definition.

      NOTE: If Phase 2 was performed correctly, Data Engineering should not have to invent any new data elements.

    3. Data Engineering relates the logical files (objects) to the ELDBM (as represented by a single FD resource). The ELDBM should be linked to the host enterprise.

    4. Data Engineering prepares an "ELDBM Narrative" to describe the objects within the data base, along with a "Data Structure Display." These deliverables are retained for inclusion in the final Phase 3 design manual.

   


Copyright © 1971-2009 by M. Bryce & Associates
Palm Harbor, Florida, USA
All rights reserved.