“All men are liable to error; and most men are, in many points, by passion or
interest, under temptation to it.”
— John Locke
Essay Concerning Human Understanding, 1690

IN THIS CHAPTER, YOU WILL LEARN:
-
How to balance a DFD against the data dictionary;
-
How to balance a DFD against a process specification;
-
How to balance a process specification against the data dictionary;
-
How to balance an ERD against the DFD and process specification;
-
How to balance an ERD against the data dictionary; and
-
How to balance a DFD against the state-transition diagram.
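In the past five chapters, we have examined several important modeling tools
for structured analysis: dataflow diagrams, the data dictionary, process
specifications, entity-relationship diagrams, and state-transition diagrams.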
Each of these tools, as we have seen, focuses on one critical aspect of the
system being modeled. It is important to keep this in mind, for it means that
the person reading the model is also focusing on one critical aspect, that is,
the aspect to which his or her attention is being drawn by the modeling tool
itself. Because the underlying system has so many different dimensions of
complexity, we want the dataflow diagram to focus the reader’s attention on the
system functions without letting data relationships distract his or her
attention; we want the entity-relationship diagram to focus attention on the
data relationships without letting the functional characteristics distract his
or her attention; and we want the state-transition diagram to focus attention on
the timing characteristics of the system without the distractions of functions
or data.
But there comes a time for pulling all
the modeling tools together, and that is what this chapter is all about.
The situation faced by the systems modeler is somewhat analogous to the ancient
fable of the three blind wise men in India who stumbled up against an elephant.
As Figure 14.1 illustrates, they came to three different opinions about the
“reality” they were dealing with after touching different parts of the elephant:

Figure 14.1: Three
blind men touching an elephant
-
One blind man touched the sharp end of one of the elephant’s long tusks. “Aha,”
he said, “what we have here is a bull. I can feel its horns.”
-
The second blind man touched the bristly hide of the elephant. “Without a
doubt,” he said, “this is a ... what? A porcupine? Yes, indeed, a porcupine!”
-
The third blind man felt one of the elephant’s thick legs and said, “This must
be a tree that we’re dealing with.”
Similarly, when modeling three different
aspects of a system (functions, data, and timing), as well as modeling detailed
characteristics of the system in a data dictionary and set of process
specifications, it is easy to develop several different inconsistent
interpretations of that one reality. This is a particularly serious danger on
large projects, where various people and various special interest groups are
likely to be involved. It is also a danger whenever the project team (and/or
the user community) involves people with very different
backgrounds.
There is another reason for focusing on model consistency: whatever errors exist
will eventually be found, but they become increasingly difficult and expensive
to correct later in the project. Indeed, any errors that are introduced into the
requirements model during the systems analysis phase are likely to be propagated
and magnified during the design and implementation phases of the project. This
is a particularly serious danger on large projects where the systems analysis is
often done by different people (or even different companies!) than the design
and implementation. Thus, [Martin, 1984] points out that 50% of the errors that
are detected in a system and 75% of the cost of error removal are associated
with errors in the systems analysis phase. And studies in [Boehm, 1981] have
shown that the cost of correcting an error goes up exponentially in later stages
of a project; it is ten times cheaper to fix a systems analysis error during the
systems analysis phase of the project than it is to fix the same error during
the design phase.
Some of these errors are, of course,
simple errors in each individual model (e.g., a dataflow diagram with an
infinite sink). And some of the errors can be characterized as wrong
interpretations of what the user really wanted. But many of the more difficult
and insidious errors are intermodel errors, that is, inconsistencies between one
model and another. A structured specification in which all the modeling tools
have been cross-checked against each other for consistency is said to be
balanced.
The most common balancing error involves
a missing definition: something defined (or described) in one model is
not appropriately defined in another model. We will see several examples of this
in the following sections (e.g., a data store shown on the DFD but not defined
in the data dictionary, or an object in the ERD not shown as a corresponding
data store on the DFD). The second common type of error is one of
inconsistency: the same “reality” is described in
different, contradictory ways in two different models.
We will examine several major aspects of
balancing:
-
Balancing the dataflow diagram against the data dictionary.
-
Balancing the dataflow diagram against the process specifications.
-
Balancing the process specifications against the data dictionary.
-
Balancing the ERD against the DFD and process specifications.
-
Balancing the ERD against the data dictionary.
-
Balancing the DFD against the state-transition diagram.
As we will see, the balancing rules are
all very straightforward; they require very little intelligence or creativity to
carry out. But they must be carried out, and carried out
diligently.

14.1 BALANCING THE DFD AGAINST THE DD
The rules for balancing the dataflow
diagram against the data dictionary are as follows:
-
Every dataflow (i.e., an arrow on the DFD) and every data store must be defined
in the data dictionary. If it is missing in the data dictionary, the dataflow or
data store is considered to be undefined.
-
Conversely, every data element and every data store defined in the data
dictionary must appear someplace on a DFD. If it does not appear, the offending
data element or data store is a “ghost”: something defined but not “used” in the
system. This can happen if the data elements are defined to correspond with an
early version of the DFD; the danger is that the DFD may be changed (i.e., a
dataflow or data store may be deleted) without a corresponding change to the
data dictionary.
This means, of course,
that the systems analyst must painstakingly review
both the DFDs and the data dictionary to
ensure that they are balanced. It doesn’t matter which
model is examined first, though most analysts begin
with the DFD to ensure that all the elements are
defined in the data dictionary. Like all the other balancing
activities in this
chapter, it is a tedious chore and one that is well suited
to automated support.
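The two rules above amount to a pair of set differences, which is one reason the check lends itself to automation. The following is a minimal sketch in Python, assuming that the names on the DFD and in the data dictionary can be compared directly; the function name and the sample model are hypothetical and not part of the notation used in this book.

```python
def balance_dfd_against_dd(dfd_names, dd_names):
    """Return (undefined, ghosts) for one DFD set and one data dictionary.

    undefined: dataflows or stores drawn on the DFD with no dictionary entry.
    ghosts:    dictionary entries that appear on no DFD.
    """
    dfd_names, dd_names = set(dfd_names), set(dd_names)
    return sorted(dfd_names - dd_names), sorted(dd_names - dfd_names)


# Hypothetical model, invented for illustration only.
dfd = {"ORDERS", "customer-order", "invoice"}
dd = {"ORDERS", "customer-order", "shipping-label"}

undefined, ghosts = balance_dfd_against_dd(dfd, dd)
print("undefined:", undefined)
print("ghosts:", ghosts)
```

In this simplified form, a data element that is only a component of a dataflow would be reported as a ghost; a fuller check would first expand each dataflow into its components.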
14.2 BALANCING THE DFD AGAINST THE PROCESS SPECIFICATION
Here are the rules for balancing the DFD
against the process specifications:
-
Every bubble in the DFD
must be associated with a lower-level DFD or a process specification,
but not both. Thus, if the DFD shows a bubble that is identified as 1.4.2,
then
there must either be a corresponding figure identified as Figure 1.4.2
whose bubbles are identified as 1.4.2.1, 1.4.2.2, and so on, or the structured
specification must contain a process specification for bubble 1.4.2. If both
exist, the model is unnecessarily (and dangerously) redundant.
-
Every process specification must have an associated bottom-level bubble in the
DFD. Since the process specification does require a lot of work, one would think
it highly unlikely that there would be “tramp” process specifications floating
around a system. But it can happen: the process specification may have been
written for a preliminary version of the DFD, after which a revision process
might eliminate some of the DFD bubbles.
-
Inputs and outputs must
match. The DFD will show incoming and outgoing flows for each bubble, as
well as connections to stores. These should be evident in the process specification,
too: thus, we should expect to see a READ statement (or GET, or INPUT, or
ACCEPT, or some other similar verb) corresponding to each incoming dataflow
and
a WRITE (or PUT, or DISPLAY, etc.) for each outgoing
dataflow.
Note that these comments apply
specifically to processing bubbles. For the control bubbles in a DFD,
there are correspondences between the bubbles and associated state-transition
diagrams, as discussed in Section 14.6.
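The three rules for processing bubbles can be sketched as follows. This is an illustration under stated assumptions, not a definitive tool: bubbles are identified by their numbers (such as 1.4.2), the flows that a process specification reads and writes are assumed to have been extracted already, and all function names and sample data are hypothetical.

```python
def check_bubbles(bubbles, child_diagrams, pspecs):
    """Classify bubble numbers by how they are specified.

    bubbles:        numbers of all processing bubbles on the DFDs
    child_diagrams: numbers of bubbles that have a lower-level DFD
    pspecs:         numbers for which a process specification exists
    """
    bubbles, child_diagrams, pspecs = set(bubbles), set(child_diagrams), set(pspecs)
    return {
        # Both a lower-level DFD and a process specification: redundant.
        "redundant": sorted(bubbles & child_diagrams & pspecs),
        # Neither one: the bubble is not specified at all.
        "unspecified": sorted(bubbles - child_diagrams - pspecs),
        # A specification with no bubble: a "tramp" process specification.
        "tramp_specs": sorted(pspecs - bubbles),
    }


def check_io(dfd_flows, spec_flows):
    """Compare one bubble's flows on the DFD with those its specification uses.

    Each argument is a dict with "in" and "out" collections of flow names.
    The result lists, per direction, the names found on only one side.
    """
    return {
        direction: sorted(set(dfd_flows[direction]) ^ set(spec_flows[direction]))
        for direction in ("in", "out")
    }


# Hypothetical model, invented for illustration only.
report = check_bubbles(
    bubbles={"1.4.1", "1.4.2", "1.4.3"},
    child_diagrams={"1.4.2"},
    pspecs={"1.4.2", "1.4.9"},
)
io = check_io(
    {"in": {"order"}, "out": {"invoice"}},
    {"in": {"order"}, "out": {"invoice", "receipt"}},
)
print(report)
print(io)
```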
14.3 BALANCING THE PROCESS SPECS AGAINST THE DFD AND DD
The rules for balancing the process specifications against the dataflow diagram
and data dictionary can be described as follows. Each data reference in the
process specification (typically a noun) must satisfy one of the following
rules:
-
It matches the name of a dataflow or data store connected to the bubble
described by the process specification, or
-
It is a local term, explicitly defined in the process specification, or
-
It appears as a component in a data dictionary entry for a dataflow or data
store connected to the bubble.
Thus, the data elements X and Y appear in the process
specification shown in Figure 14.2, but do
not appear as a connected dataflow in
the DFD shown in Figure 14.3. However, the
data dictionary, a fragment of which is shown in
Figure 14.4, indicates that X and Y are components
of Z; and in
Figure 14.3 we see that Z is indeed a dataflow
connected to the bubble, so we conclude that the
model is
balanced [1].
PROCESS SPECIFICATION 3.5: COMPUTE WIDGET FACTOR
* P AND Q ARE LOCAL TERMS USED FOR INTERMEDIATE RESULTS *
P = 3.14156 * X
Q = 2.78128 * Y - 13
WIDGET-FACTOR = P * Q + 2
Figure 14.2: A process specification component of a system model

Figure 14.3: A
DFD component of a system model
X = * horizontal component of frammis factor *
    * units: centimeters; range: 0 - 100 *
Y = * vertical component of frammis factor *
    * units: centimeters; range: 0 - 10 *
Z = * frammis factor, as defined by Dr. Frammis *
    X + Y
Figure 14.4: A
data dictionary component of a system model
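The reasoning applied to Figures 14.2 through 14.4 can be sketched in Python as a check that every data reference satisfies at least one of the three rules. The function name is hypothetical, the dictionary is expanded only one level deep, and, since Figure 14.3 is not reproduced here, treating WIDGET-FACTOR as an outgoing dataflow of the bubble is an assumption; the text confirms only that Z is connected.

```python
def unresolved_references(references, connected, local_terms, dictionary):
    """Return the data references that satisfy none of the three rules.

    references:  data names used in the process specification
    connected:   dataflows and stores attached to the bubble on the DFD
    local_terms: names explicitly defined inside the specification
    dictionary:  maps a name to the list of its components
    """
    connected, local_terms = set(connected), set(local_terms)
    # Components of every connected dataflow or store (one level deep only).
    components = set()
    for name in connected:
        components.update(dictionary.get(name, ()))
    allowed = connected | local_terms | components
    return sorted(set(references) - allowed)


references = {"X", "Y", "P", "Q", "WIDGET-FACTOR"}   # from Figure 14.2
local_terms = {"P", "Q"}                             # declared local in Figure 14.2
dictionary = {"Z": ["X", "Y"]}                       # from Figure 14.4
connected = {"Z", "WIDGET-FACTOR"}                   # WIDGET-FACTOR is assumed

balanced = unresolved_references(references, connected, local_terms, dictionary)
without_z = unresolved_references(references, {"WIDGET-FACTOR"}, local_terms, dictionary)
print(balanced)
print(without_z)
```

If Z were not connected to the bubble, X and Y would be reported as unresolved, which is the imbalance the third rule is meant to catch.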
14.4 BALANCING THE DATA DICTIONARY AGAINST THE DFD AND PROCESS SPECIFICATIONS
From the discussion above, it can be seen that the data dictionary is consistent
with the rest of the model if it obeys the following rule:
-
Every entry in the data dictionary must be referenced by a process
specification, a DFD, or another data dictionary entry.
This assumes, of course, that we are modeling the essential behavior of a
system. A complex, exhaustive data dictionary of an existing implementation of a
system may contain some data elements that are no longer used.
One could also argue that the data
dictionary might be planned in such a way that it permits future expansion; that
is, it contains elements that are not needed today, but might be useful in the
future. A good example of this is a data dictionary that contains elements that
may be useful for ad hoc inquiry. The project team, perhaps in concert with the
user, can determine whether this kind of unbalanced model is indeed
appropriate. However, it is important at least to be aware of
the occurrence of such deliberate decisions.
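One way to stay aware of such decisions is to record them explicitly and let the check skip them. The sketch below assumes that a dictionary entry counts as used when a DFD, a process specification, or another dictionary entry refers to it; the function name, the `deliberate` parameter, and the sample entries are hypothetical.

```python
def unreferenced_entries(dictionary, dfd_names, pspec_references, deliberate=()):
    """Return dictionary entries that nothing else in the model refers to.

    dictionary:       maps each entry name to the list of its components
    dfd_names:        dataflow and store names that appear on the DFDs
    pspec_references: data names used in the process specifications
    deliberate:       entries kept on purpose (e.g., for future expansion)
    """
    referenced = set(dfd_names) | set(pspec_references)
    # An entry is also in use if it is a component of another entry.
    for components in dictionary.values():
        referenced.update(components)
    return sorted(set(dictionary) - referenced - set(deliberate))


# Hypothetical model, invented for illustration only.
dictionary = {"Z": ["X", "Y"], "X": [], "Y": [], "credit-rating": []}

unplanned = unreferenced_entries(dictionary, {"Z"}, {"X", "Y"})
planned = unreferenced_entries(dictionary, {"Z"}, {"X", "Y"},
                               deliberate={"credit-rating"})
print(unplanned)
print(planned)
```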
14.5 BALANCING THE ERD AGAINST THE DFD AND PROCESS SPECIFICATIONS
The entity-relationship diagram, as we saw in Chapter 12, presents a very
different view of a system than does the dataflow diagram. However, there are
some relationships that must hold in order for the overall system model to be
complete, correct, and consistent:
-
Every store on the DFD
must correspond to an object type, or a relationship, or a combination
of an object type and relationship (i.e., an associative object type) on
the
ERD. If there is a store on the DFD that does not appear on the ERD,
something is wrong; and if there is an object or relationship on the ERD
that does not
appear on the DFD, something is wrong.
-
Object
names on the ERD and data store names on the DFD
must match. As we saw in Chapters
9 and 12,
the convention in this book is to use the plural
form (e.g., CUSTOMERS) on the DFD and the
singular form on the ERD.
-
The data dictionary entries
must apply to both the DFD model and the ERD model. Thus the data dictionary
entry for the above example should include definitions for both the object
on the ERD and the store on the DFD. This would imply a data dictionary
definition such as the following:
CUSTOMERS = {CUSTOMER}
CUSTOMER = name + address + phone-number + ...
The data dictionary entries
for the singular form (e.g., CUSTOMER) must
provide the meaning and composition of a single instance
of the set of objects referred to (in the singular)
in the
ERD and (in the plural) in the data store of a DFD.
The data dictionary entries for the plural form (e.g., CUSTOMERS)
provide the meaning and the composition of the set
of instances.
Similarly, there are rules for ensuring
that the ERD is consistent with the process specification
portion of the function-oriented model (keep in
mind that the process specifications are the
detailed components of the model whose graphical “incarnation” is
the DFD). The rules are that the combined set
of all process specifications must, in their entirety:
-
Create and delete
instances of each object type and relationship shown in the
ERD. This can be understood by looking at the DFD shown in Figure 14.5: as we
know, the CUSTOMERS store corresponds to the CUSTOMER object.
Something must be capable of creating and deleting instances of a customer,
which means that some bubble within the DFD must have a dataflow connected to
the CUSTOMERS store. But the actual work of writing to the store (i.e.,
creating or deleting an instance of the related CUSTOMER object in the
ERD) must take place inside the bubble, which means that it must be documented
by the process specification associated with the
bubble.
-
Set values for each data element attributed to each instance of each object
type, and use (or read) values of each data element [2].

Figure 14.5: Creating
and deleting ERD instances
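The correspondence between stores and objects, and the requirement that something create and delete each object, can be sketched as follows. This is an illustration only: the store-to-object mapping is assumed to come from dictionary entries of the form CUSTOMERS = {CUSTOMER}, the operations are assumed to have been extracted from the process specifications as (verb, object) pairs, and all function names and sample data other than CUSTOMERS and CUSTOMER are hypothetical.

```python
def balance_erd_against_dfd(stores, erd_objects, store_definitions):
    """Return a list of (problem, name) pairs.

    stores:            store names on the DFD (plural form)
    erd_objects:       object and relationship names on the ERD (singular form)
    store_definitions: maps a store to its object, taken from dictionary
                       entries such as CUSTOMERS = {CUSTOMER}
    """
    erd_objects = set(erd_objects)
    problems, matched = [], set()
    for store in sorted(stores):
        obj = store_definitions.get(store)
        if obj is None:
            problems.append(("store-without-definition", store))
        elif obj not in erd_objects:
            problems.append(("store-without-object", store))
        else:
            matched.add(obj)
    for obj in sorted(erd_objects - matched):
        problems.append(("object-without-store", obj))
    return problems


def missing_operations(erd_objects, documented):
    """Return the (verb, object) pairs that no process specification documents."""
    return sorted(
        (verb, obj)
        for obj in erd_objects
        for verb in ("create", "delete")
        if (verb, obj) not in documented
    )


# Hypothetical model, invented for illustration only.
problems = balance_erd_against_dfd(
    stores={"CUSTOMERS", "ORDERS"},
    erd_objects={"CUSTOMER", "ITEM"},
    store_definitions={"CUSTOMERS": "CUSTOMER"},
)
missing = missing_operations({"CUSTOMER"}, {("create", "CUSTOMER")})
print(problems)
print(missing)
```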
14.6 BALANCING THE DFD AGAINST THE STATE-TRANSITION DIAGRAM
The state-transition diagram can be considered balanced against the dataflow
diagram if the following rules are met:
-
Every control bubble in the DFD has associated with it a state-transition
diagram as its process specification. Similarly, every state-transition diagram
in the overall system model must be associated with a control process (bubble)
in the DFD.
-
Every condition in the state-transition diagram must correspond to an incoming
control flow into the control process associated with the state-transition
diagram. Similarly, every incoming control flow on the control bubble must be
associated with an appropriate condition on the corresponding state-transition
diagram.
-
Every action in the
state-transition diagram must correspond to an outgoing control flow in
the control process associated with the state-transition diagram. Similarly,
every outgoing control flow on the control bubble must be associated with an
appropriate action on the corresponding state-transition
diagram.
These
correspondences are illustrated in Figure 14.6.

Figure 14.6: Correspondences
between the DFD and STD
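For a single control bubble and its state-transition diagram, the second and third rules reduce to comparing two pairs of sets. The sketch below assumes that conditions and actions carry the same names as the control flows they correspond to; the function name and the sample names are hypothetical.

```python
def balance_std_against_control_bubble(conditions, actions, incoming, outgoing):
    """Compare one state-transition diagram with its control bubble.

    conditions, actions: names used on the state-transition diagram
    incoming, outgoing:  control flows entering and leaving the control bubble
    """
    conditions, actions = set(conditions), set(actions)
    incoming, outgoing = set(incoming), set(outgoing)
    return {
        "conditions_without_flow": sorted(conditions - incoming),
        "flows_without_condition": sorted(incoming - conditions),
        "actions_without_flow": sorted(actions - outgoing),
        "flows_without_action": sorted(outgoing - actions),
    }


# Hypothetical model, invented for illustration only.
result = balance_std_against_control_bubble(
    conditions={"start-button-pressed", "tank-full"},
    actions={"enable-pump"},
    incoming={"start-button-pressed"},
    outgoing={"enable-pump", "sound-alarm"},
)
print(result)
```

The first rule (one state-transition diagram per control bubble, and vice versa) would be checked separately, in the same way as bubbles are matched to process specifications.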
14.7 SUMMARY
Note that all the balancing rules
presented in this chapter have been presented
as if you were going to personally examine
all the components of a system model to spot
potential errors and inconsistencies. This would
imply
that you should lay out, on the floor or on a
very large bulletin board, all the DFDs, process
specifications,
ERDs, STDs, and data dictionary, and then
walk from one to the other, carefully
checking that everything is in place.
As this edition of the book is being prepared, that is precisely what you would
have to do in most systems development organizations around the world. The
balancing rules we have presented in this chapter can be automated, and there
are already a number of relatively inexpensive PC-based workstation tools that
will carry out some or all of the error-checking mechanically. Unfortunately,
they are not widely deployed and used in systems development organizations.
We have seen exactly the same phenomenon
in a number of other fields. One could
argue that the proliferation of cheap word-processing
systems has obviated the need for learning
script writing;
indeed, one might argue that the availability
of spelling checkers has even obviated
the
need for learning how to spell. And the
universal availability of
pocket calculators has obviated the need
to learn how to do long division. And the
ubiquitous presence of automatic-shift
cars has obviated the need to learn
how to drive stick-shift cars.
Indeed, I can’t think of any compelling
reason for teaching someone in North
America how to drive a stick-shift car as we enter
the 21st century. Nor can I think of
any
reason for emphasizing the art
of calligraphy and handwriting (except,
perhaps, as an art form) in an age
where word-processing
systems are about to be replaced by
voice-recognition systems.
But I can appreciate the need for learning
the basic principles of long division,
even if one is supremely confident
that one will
never be without a
pocket calculator; if nothing else,
as Joshua Schwartz of Harvard University points
out,
it helps us to know whether the answer
we have produced with our
calculator has the decimal point in
the right place.
Perhaps one could even argue the merits
of learning script handwriting when
home computers are still not as widespread as
televisions and telephones, and when
only a small percentage of U.S. schools
are prepared to teach the mechanical
skills of typing. Script handwriting is technologically
obsolete, and it is painful for computer-literate
parents (not
to mention computer-literate children!)
to be forced to learn this ancient,
primitive
communication skill; but it is probably
still
a necessary skill in
today’s society. After all, it was
only a few years ago that most parents stopped
teaching
their children how to replace the spark
plugs, change the oil,
and fix a flat tire on their automobile.
Similarly, I am convinced that a professional systems analyst needs to
understand the principles of balancing presented in this chapter. As a systems
analyst, you may have no alternative but to carry out these error-checking rules
by hand unless proper software engineering tools are used within your
organization. The manual error-checking process will normally be validated in a
walkthrough environment; walkthroughs are discussed in Appendix D.

REFERENCES
-
James Martin, An Information Systems Manifesto. Englewood Cliffs, N.J.: Prentice-Hall, 1984.
-
Barry Boehm, Software Engineering Economics. Englewood Cliffs, N.J.: Prentice-Hall, 1981.

QUESTIONS AND EXERCISES
-
Why is it important to balance the models of a system specification? What are
the dangers of an unbalanced specification?
-
Why is it important to find errors in a system model as early as possible?
-
What percentage of the cost of error removal is associated with the systems
analysis phase of a project?
-
What are the two most common forms of balancing errors?
-
What parts of the system model must the DFD be balanced against?
-
What parts of the system model must the ERD be balanced against?
-
What parts of the system model must the STD be balanced against?
-
What parts of the system model must the data dictionary be balanced against?
-
What parts of the system model must the process specification be balanced
against?
-
Are there any other components of the system model that must be balanced?
-
What are the rules for balancing the DFD against the data dictionary?
-
Under what conditions could an item be defined in the data dictionary without
appearing somewhere on a DFD?
-
What are the rules for balancing the DFD against the process specifications?
-
What would happen if a process specification were written for a nonprimitive
(or nonatomic) bubble in the DFD?
-
Should there be a process specification for control processes in the DFD? If
so, should it take the same form as a process specification for a normal
process?
-
What are the rules for balancing the process specification against the DFD and
data dictionary?
-
What are “tramp data”?
-
Under what conditions is it acceptable for a term (or data reference) in the
process specification to not be defined in the data dictionary?
-
What are the rules for balancing the data dictionary against the DFD and process
specification?
-
Under what conditions is it possible that the project team might deliberately
put items into the data dictionary that are not in the DFD?
-
What are the rules for balancing the ERD against the DFD?
-
What is the convention for matching names in the ERD with stores in the DFD?
-
What are the rules for balancing the ERD against the process specification?
-
What are the rules for balancing the STD against the DFD?
-
Under what conditions is it valid not to have an STD in a system model?
-
How should the balancing rules presented in this chapter be carried out in a
typical systems development project? Who should be responsible for seeing that
it gets done?
-
If you have an automated systems analysis workstation, is it necessary to learn
the balancing rules presented in this chapter? Why or why not?
-
If the system models have been balanced, can we be confident that they are
correct? Why or why not?
-
Point out three balancing errors in the following system model.

-
Should the STD be balanced against the ERD? Why or why not?

FOOTNOTES
-
[1] However, it may be worth doing some further checking at this point: if X is
the only component of Z that is used in the process specification, we could
seriously question why Z was shown as an input in the first place. That is, the
remaining components of Z may be “tramp data” that “float” through the bubble
without being used. This often reflects a model of an arbitrary implementation
of a system, rather than a model of the essential behavior of the system.
-
[2]
Note that the situation may be somewhat more complicated:
the bubble shown on the DFD may not be a bottom-level
bubble. Thus, it is possible that the bubble labeled
ENTER-NEW-CUSTOMER in Figure 14.5 may be
described by a lower-level dataflow diagram, not
by a process specification. If this is the case,
then one of the lower-level bubbles (perhaps not
one level, but several levels below) will be a primitive
and will access the store directly. Recall from
Chapter 9
that our convention on the DFD is that the store
is shown at the highest level where it is an interface
between two bubbles, and it is repeated in every
lower-level diagram.