| PRIDE | ® | -DBEM |
| ||||||
"Whereas the logical Data Base will remain relatively static,
the physical data base changes dynamically."
- Bryce's Law
This section contains the following:
Copyright © 1971-2006 by
M. Bryce & Associates
Palm Harbor, Florida, USA
All rights reserved.
The purpose of this phase is to design or modify the Enterprise Physical Data Base Model (EPDBM). Several events occur during this phase:
METHODOLOGY NAVIGATION
A Phase 4 is initiated following completion of a normal
Phase 3 (ISEM related). There is a one-to-one relationship
between Phase 4 and the Enterprise Physical Data Base Model
as represented by an FD resource in the IRM. As such, the
Project Manager may bind the FD number to the project/phase key.
For example:
The formal deliverable resulting from Phase 4 is an "EPDBM
Design Manual" consisting of a description of the Enterprise
Physical Data Base Model. This is reviewed with Data Resource
Management who must decide whether to:
Following Phase 4, the DBEM project will proceed to Phase
5, "Application Physical Data Base Design," where physical file
structures will be created to implement the logical data base
design for a single application.
GENERAL DISCUSSION
Phase 4 represents a major transitional stage in the
methodology: going from logical data base design to physical.
Whereas Data Engineering had primary responsibility for
executing the early phases of DBEM, responsibility now shifts to
Data Base Administration and Data Communications Administration
who are responsible for the design and development of physical
hardware/software related to the collection, storage and
retrieval of data.
The Enterprise Physical Data Base Model (EPDBM) is a global
view of all of the files used to store data, this includes
manual files, computer related files, and any other storage
media. As such, there is only one EPDBM associated with a
single enterprise.
Whereas the ELDBM is a relatively static view of all
corporate data, the EPDBM will be more volatile. It reflects
the technology used for implementation at any given time. To
create the EPDBM, Objects of the ELDBM may be regrouped or split
in different ways, according to performance, security and
administrative considerations. In other words physical files
and records will not necessary be designed in the same manner
as their logical counterparts.
The model consists of multi-level aggregations of data
within one or more structures (Network, Hierarchical,
Relational, flat files), which will depend on the type of
technology and particular product(s) being used. This narrative
refers to these aggregates as data bases, files, records and
data elements, but relates them to the terminology used in a
particular product. Some products refer to this level of
modeling as the "Schema" or "Logical Data Bases" and reserves
the term Physical Model to descriptions at a lower level of
abstraction, sometimes involving even hardware descriptions.
DBEM considers this level as the Physical Model and reserves
the term Logical for the independent levels of abstraction.
The EPDBM provides data base designers with the freedom to
choose the best technology for each new application, either a
DBMS, disk, tape or even manual files, keeping it independent
from the logical models. Logical files and records in the ELDBM
are ultimately defined to reflect the rules of the business,
independent of physical considerations. A straightforward
implementation with the same logical and physical structures
could possibly produce poor performance or place unreasonable
physical constraints on the data base. Even more important,
the physical implementation may be changed in response to
performance tuning, new technology, or even varying usage loads
without impacting the logical models known to Users and Systems
Engineering. In particular, distributed data bases and data
communications procedures are implemented and documented at this
level only, and not in the logical models.
The Enterprise Physical Data Base Model must satisfy the
following requirements:
The number of aggregation levels, their names and contents
vary widely with each file management technique, which makes it
impossible to refer to all of them in this narrative. When
describing the procedures to convert the ELDBM to either flat
files or each of the major DBMS types (Hierarchical, Network and
Relational) standard terminology is used within each one. It is
quite possible to have several DBMS packages as well as any
other types of files within the EPDBM. Each product's
terminology and constructs is used in the parts of the EPDBM it
implements.
LOGICAL VERSUS PHYSICAL RESOURCE STRUCTURES
The differences between logical and physical data
structures can be considerable. Aside from using the same
type of IRM resources to model them in the IRM (FD, RD, DD
resources), there is little else that the two models share
in common.
The logical model is governed based on some very
specific rules. For example, "primary basic groupings,"
the three types of "views," fact/event relationships, use
of indicative data, etc. These rules were created to
develop simple data abstractions about the various objects
used in the operation of the business. Ultimately, they
promote data independence from the physical environment.
The physical model is totally different. It was created to
model the physical environment, and, as such, is dependent on
it. This is why it will evolve as technology evolves. Unlike
the logical model where the "basic grouping" requires indicative
data, the physical model can use whatever sort/access key is
convenient for the storage and processing of data. Further,
records can be merged and separated as required. This reflects
the fact that the physical model is dependent on the physical
file management tools and techniques in use.
FORMS OF PHYSICAL FILES
The media used to implement physical files can take many
forms: disk, tape, diskettes, cards, manual files, even memory.
Defining a file in memory is somewhat controversial, yet
important. If programs are going to write working files to
memory as part of their normal processing, understanding the
layout of the file is just as important as if it were written to
disk. Data Base Administration needs to know the capabilities
of physical files and processing requirements in order to
advise DP Operations for hardware capacity planning.
Computer files may use a variety of file management
techniques: Sequential, Index Sequential, Random, or DBMS
specific. The decision to use a particular file management
technique should be based on what is effective and economical,
not on what is technically fashionable. For example, not
everything requires a DBMS. There is little point to using a
DBMS if it is going to be used as nothing more than an elegant
access method.
When selecting a suitable file management technique,
Data Base Administration must consider the following:
This highlights the fact that Data Base Administration
requires information about the various applications in order to
arrive at a suitable physical file design. This includes data
and processing specifications as recorded in the IRM by Systems
Engineering. Data Communications will also use these
specifications to devise a data transmission/conversion
strategy.
SELECTING A DATA BASE MANAGEMENT SYSTEM (DBMS)
When should you consider a DBMS? Any time when data is
going to be shared between applications on the same computer.
As with any file management technique, selection should be
based on the criteria mentioned above. Because there are
significant differences between various file management
techniques, it is unlikely that a single DBMS will be able to
serve all of the needs of a company. In fact, DBMS packages
will proliferate in companies as data bases become more
distributed.
Currently, there are essentially three different types of
DBMS products commercially available:
Although the intent of the three types of DBMS' is to promote
the sharing of data and simplify data access, their
implementations are significantly different. To
illustrate:
DBMS PROS AND CONS
Obviously there are trade-offs between the various
approaches. One is not necessarily better than the other.
Bottom-line: Data Base Administration must resolve how well
the DBMS fulfills the data and processing requirements of the
various applications.
In order to represent the different structures found in the
major DBMS products within a standard framework, it is necessary
to draw a correspondence among the various levels of aggregation
they support. The table below shows such a correspondence and
the IRM resources used to represent them:
While the first two levels (field and record) are quite
similar for most DBMS packages, the clustering provided by an
IMS Physical Data Base, a CODASYL Area or a DB2 Tablespace are
quite different. However, these and higher level aggregations
are still represented as FD resources.
ENTERPRISE PHYSICAL DATA BASE MODEL (EPDBM) BASIC CONSTRUCTS
NOTES: The pointers represent relationships
between the various resources as physically recorded in the IRM.
RD-to-RD and FD-to-FD relationships depend on the selected file
management technique.
MANUAL FILES
Manual files, which are used to store documents, follow a
slightly different relationship:
ELDBM TO EPDBM RELATIONSHIPS
Data Base Administration has the responsibility of
mapping the relationships between the ELDBM and the EPDBM.
As mentioned, there is not always a one-to-one relationship
between logical and physical structures. One logical file
may be implemented by multiple physical files. The point is,
the relationships between logical and physical must be mapped
in the IRM to demonstrate that the physical implementation
satisfies the logical.
EPDBM TO APDBM
The EPDBM represents the global view of data storage
requirements. During Phase 5, Data Base Administration must
consider physical requirements for individual systems. This is
where the Application Physical Data Base Model (APDBM) is
defined. Ultimately, the APDBM represents subsets of the EPDBM.
Relationships between the EPDBM and the APDBM can be expressed
using RD-to-RD and FD-to-FD relationships.
RECORD DESCRIPTION (RD)
As this discussion indicates, the Record Description (RD)
resource in the IRM is used for a variety of purposes. In a
logical context, the RD represents a "View"; in a physical
context, the RD represents a "Record." The RD is also used to
represent other physical structures, such as: screen panels,
print maps, input transactions, output data, messages, tables,
etc. All of these structures exhibit the same characteristics;
all require some form of unique key, and all contain data
elements.
The File Description (FD) is used to represent logical
files (Objects), physical files, and data bases. Files consist
of records, thereby an FD/RD relationship is required. A data
base represents a global view of data, therefore an FD/FD
relationship is used to map all of the files within a data base.
The RD is also the primary resource used to map the
relationships between all four data base models. Although the
File Description (FD) provides for FD-to-FD relationships, it is
the RD that is used to connect the models.
The following diagram illustrates all of the BASIC
relationships needed to map the four data base models in the
IRM. Additional pointers are available for different purposes,
particularly for complicated physical file structures.
IRM RESOURCE RELATIONSHIPS - MAPPING THE FOUR DATA BASE MODELS
This diagram illustrates all of the BASIC relationships needed to map the four data base models. Additional pointers are available for different purposes, particularly for complicated physical file structures. EPDBM REPRESENTATION Graphical techniques to represent physical file management techniques may vary from vendor to vendor. Data Base Administration should use whatever technique is available. To represent FD/RD/DD structures in the IRM, a variety of matrices are available:
Aside from the matrices, the other major deliverable resulting from this phase are file layouts which detail the physical data structures.
DESCRIPTION OF PHASE ACTIVITIES
Activity A - Develop Hardware/Software Approach
Data Base Administration (DBA) and Data Communications
Administration (DCA) reviews the current ELDBM and ALDBM
designs as created in the previous phases. From this,
they determine the most suitable method of implementation
for physically collecting, storing, and retrieving data.
This includes both hardware and software.
DBA/DCA review the Data Base Approach with management to obtain
approval. Hardware/Software acquisitions are decided here.
Activity C - Define Enterprise Physical DB Model
Data Base Administration designs/modifies/documents the
Physical Data Base either according to the specifications
of the previous activity or existing operational
statements. Data Communications Administration designs/
modifies network software as required.
Activity D - Create D.D.L. Statements
Data Base Administration creates the necessary Data
Definition Statements (SCHEMA, DBD, CREATE, etc) to
implement the design of the previous activity. Data
Communications Administration creates the corresponding
network statements.
Also, relationships between ELDBM models and the EPDBM
model are established. This is based on RD-to-RD
relationships.
DBA/DCA then assembles the deliverables produced
in the phase and packages them into a single phase review
document which is evaluated by Quality Assurance prior to
conducting the formal phase review with management.
DBA/DCA conducts a formal review of the Phase 4
deliverables with Data Resource Management, Data
Engineering, and Project Management. The review is used
to evaluate the work performed thus far and revise as
required. At this time, management will review the formal
Phase 4 "EPDBM Design Manual" consisting of:
Based on this report and subsequent review meeting,
management may elect to revise parts of the report,
discontinue the project, or approve it for continuation.
| ||||