PRIDE ® -DBEM
Data Base Engineering Methodology
PHASE 2 - APPLICATION LOGICAL DATA BASE DESIGN

ACTIVITY A   ACTIVITY B   ACTIVITY C   FUNCTIONAL MATRIX   CHECKLIST   FORMS

TRANSLATE THIS PAGE TO... Chinese (simple)   Japanese       Dutch   French     German     Italian    
Free Translation courtesy of ALS      Chinese (traditional)   Korean       Portuguese       Russian       Spanish         

CONTENTS

"A data element has only one logical definition, but may be represented physically in many different ways."
- Bryce's Law

This section contains the following:


 
    BUSINESS PURPOSE

    The purpose of this phase is to design or modify an Application Logical Data Base Model (ALDBM) for an information system. Several events occur during this phase:

    • Application Logical objects are defined, complete with views, and data elements.

    • Relationships between objects are established via the views.

    • A schematic of the Application Logical Data Base Model is prepared and reviewed for accuracy.

    • A technical review of the Application Logical Data Base Model is performed to validate its integrity.
     

    METHODOLOGY NAVIGATION

    A Phase 2 is initiated following a Phase 1. There is a one-to-one relationship between Phase 2 and an information system. In other words, a Phase 2 will be executed for each system identified in Phase 1 (normally there is just one). As such, the Project Manager may bind the system number to the project/phase key. For example:

    PD-00123 - Project TS - Trundle Sales System 2 - Phase 2 identifier

    Even if the scope of the project is to only modify a small portion of a system (such as a single sub-system, procedure, program or module), the whole system identifier should be used.

    The formal deliverable resulting from Phase 2 is an "ALDBM Design Manual" consisting of a description of the Application Logical Data Base Model. This is reviewed with Data Resource Management who must decide whether to:

    • Approve the project for continuation.

    • Request revisions to the phase findings.

    • Discontinue the project.

    Following Phase 2, the DBEM project will proceed to Phase 3, "Enterprise Logical Data Base Design," where the Application Logical Data Base Models will be merged into the Enterprise model.  

    GENERAL DISCUSSION

    The Application Logical Data Base Model (ALDBM) represents a logical view of the data resources required by a single system (application). It does not represent "global" data resources, but rather a "local" view. The "global" view is defined by the Enterprise Logical Data Base Model (ELDBM) as defined in Phase 3. The ALDBM is intended to represent the major facts and events pertaining to a specific application. There is one such model for each Information System.

    As opposed to the ELDBM, the ALDBM does not have stability as a major objective, but rather is designed to satisfy the needs of one system at one particular time. As object views change, so do the corresponding ALDB Models. Each ALDBM is a consistent subset of the ELDBM, with redundancy and overlaps being allowed among the various ALDB Models.

    Because the ALDBM is defined for a system, it is normally prepared by Systems Engineering, the architect of the system. By this time, the Systems Engineer should be the leading authority regarding the data requirements for the system and, thereby, is the in the best position to define the ALDBM. During this phase, Systems Engineering is supervised by Data Engineering who provides assistance in logical data base design related problems.  

    BASIC CONSTRUCTS

    The ALDBM is composed of a set of interrelated Objects as used in a specific application. To briefly recap the Object Concept (as described in the Methodology Overview), Objects represent business facts and events. Facts are somewhat tangible in nature; e.g., Products, Parts, Employees, Customers, Vendors, etc. Events represent some form of interaction between facts; e.g., Shipments, Orders, Billings, Payments, Purchases, etc. Each object is uniquely identified by a single data element referred to as the "primary basic grouping." AN OBJECT IS AN OBJECT ONLY WHEN A DATA ELEMENT HAS BEEN CREATED TO UNIQUELY IDENTIFY IT.

    Objects are decomposed into "views" to describe different aspects (perspectives) of the object. There are three types of views:

    1. Identification Views - to uniquely identify the object. All objects have an identification view.

    2. Characteristic Views - describe specific internal characteristics about an individual object, such as Types of Addresses, Customer Contacts, etc. Characteristic Views normally apply to factual objects and are used to uniquely identify a multiple-occurrence data element; for example, a customer may have many types of addresses (billing, shipping, mailing), and many contacts (people).

    3. Relationship Views - are used to establish a relationship between two or more objects. This is typically used by event related objects to bind two factual objects; for example:
    CUSTOMER ---------- ORDER ---------- PRODUCT (Fact) (Event) (Fact)  

    BASIC GROUPING

    Each view consists of the data elements that are used to identify, describe or quantify the object. A "basic grouping" (logical key) is established for each view. The basic grouping uniquely identifies the view and the data elements within it. The "basic grouping" is composed of one or more of the following:

    • Primary Basic Grouping - the data element used to uniquely identify the overall object. All "views" must include this data element in the basic grouping. It is the element that binds all common views together. An "Identification View" will only require the "Primary Basic Grouping" data element in its "basic grouping."

    • Secondary Keys - two types are available:

      - For "Characteristic Views" secondary keys are used to uniquely identify a single occurrence of data. For example, "Contact Number" is used to uniquely identify a single person. "Type Address" uniquely identifies a specific type of address. This approach is similar to a given "Line Number" to uniquely identify a single text line in a body of text.

      - For "Relationship Views" the secondary keys represent "foreign keys" to other objects. For example, to qualify "Quantity Ordered," a logical record is established with "Order Number" to represent the primary basic grouping (the Order Object); however, the "basic grouping" will also include "Customer Number" and "Product Number" to explicitly define the relationship and to uniquely identify "Quantity Ordered." In other words, "Customer Number" is the primary basic grouping for the Customer Object, and "Product Number" is the primary basic grouping for the Product Object.

     

    OBJECT RELATIONSHIPS

    Objects relate to one and other through "Relationship Views," not "Characteristic Views." Under this approach, a "Relationship View" is used to link other objects; for example:

    ---------------- ---------------- ----------------- | CUSTOMER | | ORDER | | PRODUCT | |IDENTIFICATION|-------| RELATIONSHIP |-------|IDENTIFICATION | | VIEW | | VIEW | | VIEW | ---------------- ---------------- -----------------

    These relationships are implicitly defined by the "basic grouping" of the "Relationship View." In this example, the "basic grouping" for the "Order Relationship View" would be: "Order Number," "Customer Number," and "Product Number." In other words, "Customer Number" and "Product Number" are foreign keys to other objects.

    To explicitly define the relationships between the objects pointers are defined from the "Relationship View" to the "Identification Views." Further, it is necessary to establish rules about the relationships. For example; Can a customer place many orders? Does an order relate to many products? This knowledge reflects important business rules and will be used as the integrity constraints when defining the physical data base. The typical types of relationship rules between objects include one-to-many (1:M), many-to-many (M:M), and many-to-one (M:1). For example:

    ---------------- ---------------- ----------------- | CUSTOMER | | ORDER | | PRODUCT | |IDENTIFICATION|<---->>| RELATIONSHIP |<<--->>|IDENTIFICATION | | VIEW | | VIEW | | VIEW | ---------------- ---------------- ----------------- 1:M M:M

    The first relationship between Customer/Order represents a one-to-many relationship; a Customer can place many Orders, but an Order pertains to a single Customer. The Order/Product Relationship expresses a many-to-many relationship; a single Order may be for many Products, and one Product may be used by many Orders.  

    APPLICATION LOGICAL DATA BASE MODEL (ALDBM) BASIC CONSTRUCTS

    --------------- | SYSTEM & | | SUB-SYSTEMS | --------------- | ----------------------------------- V V ------------- ------------- |FD-(OBJECT)| |FD-(OBJECT)| |-----------| |-----------| | CUSTOMER | | ORDER | | OBJECT | | OBJECT | ------------- ------------- | | ------------------ ------------------ V V V V ------------ ------------ ------------ ------------ | RD-(VIEW)| | RD-(VIEW)| | RD-(VIEW)| | RD-(VIEW)| |----------| |----------| |----------| |----------| | CUSTOMER | | CUSTOMER |<----| ORDER | | ORDER | | ADDRESS | | IDENT | | RELATION.| | IDENT | ------------ ------------ ------------ ------------ V V V V BG-CUSTOMER NO. BG-CUSTOMER NO. BG-ORDER NO. BG-ORDER NO. BG-TYPE ADDRESS DD-NAME BG-CUSTOMER NO. DD-DATE DD-ADDRESS BG-PRODUCT NO. DD-CITY DD-QUANTITY ORDERED DD-STATE/PROVINCE DD-ZIP/POSTAL CODE DD-COUNTRY

    NOTES: The pointers represent relationships between the various resources as physically recorded in the IRM. The relationship between the Order and the Customer would be used to indicate types of relationships between views (e.g., 1:M, M:M). Also, it implies a relationship to a Product Identification View.

    • FD represents File Description
    • RD represents Record Description
    • BG represents the "Basic Grouping" data elements.
    • DD represents Data Definitions
     

    ALDBM/ELDBM RELATIONSHIP

    Each ALDBM Object corresponds to an ELDBM Object. As new Objects and/or data elements are uncovered by Systems Engineering, they are created first in an ALDBM and later integrated into the ELDBM by Data Engineering during Phase 3. In this way, each ALDBM Object will eventually become a subset of the corresponding ELDBM Object. To illustrate:

    PAYROLL SYSTEM ELDBM BENEFITS SYSTEM ALDBM ALDBM ------------- ------------- ------------- |FD-(OBJECT)| |FD-(OBJECT)| |FD-(OBJECT)| |-----------| |-----------| |-----------| | EMPLOYEE | | EMPLOYEE | | EMPLOYEE | | OBJECT | | OBJECT | | OBJECT | ------------- ------------- ------------- | | | | | | V V V ------------- ------------- ------------- | RD-(VIEW) | | RD-(VIEW) | | RD-(VIEW) | |-----------| |-----------| |-----------| | EMPLOYEE |<-----------| EMPLOYEE |----------->| EMPLOYEE | | VIEW | | VIEW | | VIEW | ------------- ------------- -------------

    This means that there may be many different types of Employee Objects which contain subsets of data to describe the object for a particular system. However, there may be only one "global" Employee Object expressed by the ELDBM to accommodate all ALDB Models. This "application" versus "enterprise" view implies an evolutionary approach to data base design, which is natural.

    The integration of ALDBM Objects into the ELDBM is developed through an iterative process of reviews and modifications to either model by Data Engineering. Since Objects are identified through the same criteria for both logical models, they should be compatible. In fact, in a hypothetical situation of a completely new enterprise, the first Application Logical Data Base Model becomes the Enterprise Model.

    To satisfy the requirements of Data Base Engineering, the ALDBM must be:

    1. Comprehensive enough to satisfy all of the data elements needed to support the information requirements of the system being developed/modified.

    2. Easy to modify independently, that is: without causing changes to other Applications.

    3. Independent from the particular hardware or software used and even from the DBMS type (hierarchical, network, relational) used for implementation.

    4. Described in business terms, without excessive detail, to allow Users to clearly understand the meaning of data and also be able to check its accuracy.

    5. Compatible with the ELDBM.
     

    ALDBM TO APDBM RELATIONSHIPS

    During Phase 5, Data Base Administration will design and develop the Application Physical Data Base Model (APDBM). This will consist of physical files and records that must store the data needed to satisfy the ALDBM. There is not always a one-to- one relationship between logical and physical structures. One logical file may be implemented by multiple physical files. Conversely, one physical file may implement many logical files. As we will see in the physical design phases (4 and 5), physical implementation is based on what is cost-effective and may take many forms (e.g., indexed files, indexed sequential, flat files, hierarchical/network/relational DBMS, etc.). The point is, the relationships between logical and physical must be mapped in the IRM to demonstrate that the physical implementation satisfies the logical. Again, this mapping is performed in Phase 5.

    ------------- | RD-(VIEW) | APPLICATION LOGICAL |-----------| DATA BASE MODEL | EMPLOYEE | (ALDBM) | VIEW | ------------- | -------------------------- | | V V ------------- ------------- |RD-(RECORD)| |RD-(RECORD)| APPLICATION PHYSICAL |-----------| |-----------| DATA BASE MODEL | EMPLOYEE |<-----------| EMPLOYEE | (APDBM) | TABLE | | RECORD | ------------- -------------  

    RECORD DESCRIPTION (RD)

    As this discussion indicates, the Record Description (RD) resource in the IRM is used for a variety of purposes. In a logical context, the RD represents a "View"; in a physical context, the RD represents a "Record." The RD is also used to represent other physical structures, such as: screen panels, print maps, input transactions, output data, messages, tables, etc. All of these structures exhibit the same characteristics; all require some form of unique key, and all contain data elements.

    The File Description (FD) is used to represent logical files (Objects), physical files, and data bases. Files consist of records, thereby an FD/RD relationship is required. A data base represents a global view of data, therefore an FD/FD relationship is used to map all of the files within a data base.

    The RD is also the primary resource used to map the relationships between all four data base models. Although the File Description (FD) provides for FD-to-FD relationships, it is the RD that is used to connect the models.

    The following diagram illustrates all of the BASIC relationships needed to map the four data base models in the IRM. Additional pointers are available for different purposes, particularly for complicated physical file structures.  

    IRM RESOURCE RELATIONSHIPS - MAPPING THE FOUR DATA BASE MODELS

    NOTES:
    ALDBM - Application Logical Data Base ModelFD - File Description
    ELDBM - Enterprise Logical Data Base ModelRD - Record Description
    EPDBM - Enterprise Physical Data Base Model 
    APDBM - Application Physical Data Base Model 

    This diagram illustrates all of the BASIC relationships needed to map the four data base models. Additional pointers are available for different purposes, particularly for complicated physical file structures.  

    ALDBM REPRESENTATION

    There are three complementary ways to graphically express the ALDBM:

    • ALDBM Diagram - this is a simple graphic consisting of boxes to represent "objects" and arrows to represent relationships between objects. The types of arrows drawn on the graphic are used to represent relationship restraints. A single arrow represents "one" relationship, double-arrows represent "many" relationships; for example:
    ------------ ------------ ------------ | CUSTOMER |<--------->>| ORDER |<<-------->>| PRODUCT | ------------ ------------ ------------ 1:M M:M

    • The first relationship between Customer/Order represents a one-to-many relationship; a Customer can place many Orders, but an Order pertains to a single Customer.

      The second relationship expresses a many-to-many relationship; an Order may be for many different Products, and a Product can relate to many Orders.

    • ALDBM Matrices - the ALDBM Diagram may be fine for small applications but may be confusing for larger applications. To assist in this regard, matrices are used to express the same relationships; for example:

      CUSTOMER ORDER PRODUCT SHIPMENT BILLING
    CUSTOMER   1:M   1:M 1:M
    ORDER M:1   M:M    
    PRODUCT   M:M   M:M M:M
    SHIPMENT M:1   M:M    
    BILLING M:1   M:M    

      Two types of ALDBM Matrices are available. The ALDBM Views Matrix expresses the relationships between logical records. The ALDBM Objects Matrix expresses the relationships between logical files; this is deduced from subordinate record relationships.

    • Additional Matrices - During Phase 5, Data Base Administration will define the Application Physical Data Base Model (APDBM). In order to substantiate that the APDBM correctly implements the ALDBM, additional matrices will be prepared in Phase 5 to represent:

      • ALDBM/APDBM Files Matrix
      • ALDBM/APDBM Records Matrix

      These relations are recorded on the RD/RD relationship in the IRM. They are used for defining the integrity constraints of the data base. Ultimately, they express business rules. It should be noted that comparable versions of these relationship graphics are available for the ELDBM.

    • Text reports - a "Data Structure Display" is a structured list of FD/RD/DD relationships. It is used to express the basic structure of the ALDBM. The display is supported by an "Application Logical Files Discussion" which provides a textual explanation of each logical file (Object).
     

    HOW DATA ELEMENTS ARE USED

    The data element is the basic building block for all data resources. It is used to identify, describe and quantify the objects of the business. In this respect, it represents the facts about the business. The data element can be related to a variety of information resources, for different purposes. Data elements are documented in the IRM using the Data Description (DD) resource and can be related to:

    IR - INFORMATION REQUIREMENTS

    The IR/DD relationship expresses all of the data elements, both primary and generated, needed to support the information requirement.

    OD - OUTPUT DESCRIPTIONS

    The OD/DD relationship is used to represent all of the data elements, both primary and generated, that will be physically contained on the output. Data will also be used as the Sort/Access Key for the output.

    ID - INPUT DESCRIPTIONS

    Data will be related to inputs to express its Sort/Access Key.

    FD - FILE DESCRIPTIONS

    Data will be related to files to express keys:

    For logical files (Objects), a single indicative data element is selected as the "primary basic grouping."

    For physical files, one or more data elements can be appointed the Sort/Access Key to the file.

    RD - RECORD DESCRIPTIONS

    In records, data will be used both as keys for the record, and as part of the record's content set (the collection of data elements contained in the record).

    For logical records (Views), one or more indicative data elements are selected as the "basic grouping."

    For physical files, one or more data elements can be appointed the Sort/Access Key to the record.

    DD - DATA DESCRIPTIONS

    Data to data relationships are used to express three different types of data dependencies:

    1. The Source of a Generated Data Element - to express how a data element is calculated based on other data elements (primary as well as other generated data elements).

    2. To express a Group Data Element (a concatenation of primary data elements). For example "Telephone Number" may be described as a group data element consisting of "Area Code," "Exchange," and "Account Number."

    3. To express "Grouping" dependencies between data elements. This is used to express relationships between primary/indicative data elements and those that are primary/descriptive and quantitative. Such dependencies are used to express the meaning of the subordinate descriptive/quantitative items. For example, the descriptive data element "Name" can be related to "Customer Number" whereby it represents "Customer Name."

    Of all of these resources, DBEM is primarily concerned with FD, RD, and DD resources.  

    DATA DEFINITION (DD)

    Perhaps the most difficult task in all of DBEM is to correctly define a data element. This requires considerable skill. Historically, analysts and programmers have defined nothing more than a program label. Little insight was recorded in terms of the business facts and events being recorded. In order to properly define the ALDBM, it is necessary to supplant this physical perspective with a logical perspective.

    Defining logical files and records is relatively easy in comparison to defining data elements. This is not said to intimidate the reader, but rather to issue a strong warning as to the ramifications of superficial work.

    Systems Engineering will still be responsible for defining data requirements, but it is Data Engineering's responsibility to assure that data is defined correctly with a high degree of precision. If left unresolved in the ALDBM, problems will only cascade to the other data base models in subsequent phases.

    Although a complete description of the characteristics of a data element are provided in the Methodology Overview (see "Anatomy of a Data Element"), the following recaps the major points in defining data logically:

    • A data element has one logical definition but can be physically implemented many different ways.

    • The objective is to demonstrate that each data element is unique and not redundant with any other data element. Because of this, each data element must be classified according to its characteristics. A Data Taxonomy is used in this regard.

    • Each data element is identified by a unique control number. A standard business name and definition is provided to describe the data element in standard business terminology (such as a Webster/Oxford-style definition).

    • Data comes in two forms: primary and generated. Primary data refers to data in its virgin state; as introduced to a system from an external source (such as a person or department). Generated data refers to data that relies on other data elements in order to calculate a result (such as through some form of mathematics). A group data element is another form of generated data; it represents a concatenation of primary data elements. Generated data is perhaps the most important type of data to end-users. Primary data is important to the developers of the system.

      Logical data base models only make use of primary data elements. Physical data base models can make use of both (primary and generated).

    • There are three types of data elements: Indicative, Descriptive, and Quantitative.

      Indicative data is used to identify objects, either in part or in full. This means that a data element may either be strong enough to uniquely identify a whole object (e.g., "Customer Number," "Order Number," "Territory Code"), or a view within an object, such as "Type Address," "Contact Number," etc. View-identifiers also uniquely identify a single occurrence of data such as where a line number is used to uniquely identify a single line of text.

    • For primary data elements that are descriptive or quantitative, their dependency to indicative items are required to give them meaning (to put them in a specific context). For example, the data element "Name" should be linked to indicative data elements "Product Number," "Customer Number," etc.
     

    APPROACH TO DESIGN

    Ideally, File/Record/Data Definitions should have been created within ISEM Phase 2 (System Design) by Systems Engineering. If not, they are defined here in DBEM Phase 2. Subsequently, Systems Engineering is responsible for defining the relationships between objects using RD-to-RD Relationships.

    During System Design, should Systems Engineering encounter FD/RD/DD structures in another ALDBM that satisfy the data requirements in the new application, the Systems Engineer may elect to incorporate these data structures into the new ALDBM. As part of this decision to re-use existing resources, Systems Engineering must consider the time frame by which the existing file is created/updated/referenced in other sub-systems. If compatible with the new application's requirements, it is perfectly acceptable to include the data structures in the new ALDBM. The Systems Engineer should, of course, check with Systems Resource Management to assure that this is acceptable.

    During the preliminary design phases of system development/ modification projects, or as part of documentation projects, records and data elements are created and normalized and relationships are determined. Each such model is then integrated into the Enterprise Logical Data Base Model by Data Engineering during Phase 3.  

    DESCRIPTION OF PHASE ACTIVITIES

    Activity A - Define Application Objects

    Systems Engineering reviews the data requirements for all of the Information Requirements within an application. From them, an ALDBM is created consisting of File, Record and Data Descriptions. The files are attached accordingly to sub-systems.

    Data Engineering supervises resource definitions.

    Activity B - Design Application Logical DB Model

    Systems Engineering defines the relationships between Objects using RD-to-RD relationships. Data Engineering supervises these relationships and referees discrepancies in resource relationships.

    Systems Engineering then assembles the deliverables produced in the phase and packages them into a single phase review document which is evaluated by Quality Assurance prior to conducting the formal phase review with management.

    Activity C - Phase 2 Review

    Systems Engineering conducts a formal review of the Phase 2 deliverables with Data Resource Management, Systems Resource Management, Project Management, and Data Engineering. The review is used to evaluate the work performed thus far and revise as required. At this time, management will review the formal Phase 2 "ALDBM Design Manual" consisting of:

    • Phase Cover Page
    • Application Logical Files Discussion
    • ALDBM Objects Matrix
    • ALDBM Views Matrix
    • ALDBM Diagram
    • Data Structure Display
    • Project Plan
    • Project Estimate/Schedule Recap
    • Phase Review Checklist

    Based on this report and subsequent review meeting, management may elect to revise parts of the report, discontinue the project, or approve it for continuation.

   


Copyright © 1971-2009 by M. Bryce & Associates
Palm Harbor, Florida, USA
All rights reserved.