CHAPTER 17: The Essential Model

 

“Look to the essence of a thing, whether it be a point of doctrine, of practice, or of interpretation.”

— Marcus Aurelius
Meditations VIII

IN THIS CHAPTER, YOU WILL LEARN:

  1. The four major system models in the life cycle;
  2. Why modeling the user’s current system is dangerous;
  3. The distinction between essential and implementation models; and
  4. How to “logicalize” an implementation model.

In the previous section (Chapters 9 through 16), we examined a number of modeling tools that every systems analyst should have at his or her disposal. However, given these tools, what kind of models should we build? Should we build a model of the user’s current implementation of a system? Should we build a model of the proposed new implementation? Or a model that is independent of the implementation technology? Or all three? These questions are addressed in the next several chapters.

We begin by examining the classical structured analysis approach to developing system models; as we will see, there are major problems with this approach. We will then discuss the essential model, which is the primary systems analysis model that we recommend building. Finally, we discuss some guidelines for constructing an essential model from an existing implementation model.

17.1 THE CLASSICAL MODELING APPROACH AND WHY IT DIDN'T WORK

17.1.1 The Four System Models

When structured analysis was first introduced, it was commonly argued that the systems analyst should develop four distinct models. These are shown in Figure 17.1.

Figure 17.1: The four system models

 

The current physical model is a model of the actual system that the user is presently using. It may be a manual system, an automated system, or a mixture of the two. Typically, the processes (bubbles) in the dataflow diagram for the current physical system are named for the people, organizational units, or computer systems that do the work of transforming inputs into outputs. An example is shown in Figure 17.2. Note also that the dataflows typically show physical forms of data being transported from bubble to bubble; also, the data stores may be represented by file folders, disk files, or some other technology.

Figure 17.2: A current physical model

The current logical model is a model of the pure or essential requirements being carried out by the user’s current system. Thus, arbitrary implementation details are removed, and the resulting model shows what the system would do if perfect technology were available [2]. An example of a current logical model is shown in Figure 17.3.

Figure 17.3: The current logical model

The new logical model is a model of the pure or essential requirements of the new system that the user wants. In the ideal case (from the systems analyst’s point of view), it is the same as the current logical model; that is, it contains the same functions and the same data. This situation could occur if the user were completely satisfied with the functionality of the current system, but dissatisfied with its implementation [1]. In most cases, though, the user will ask for additional functions: “While you’re at it, could you add another transaction to take care of the following situation. ...” Or the user may ask that the system keep track of a new form of data. Thus, while 80% to 90% of the new logical model may be identical to the current logical model, there are likely to be at least a few changes and additions.

The new physical model is a model showing the implementation constraints imposed by the user. One of the most important such constraints is the determination of the automation boundary (i.e., the determination of which functions in the new system will be automated and which will be performed manually). The new physical model corresponds to what we now call the user implementation model, which we discuss in more detail in Chapter 21.

17.1.2 Why the Classical Approach Didn't Work

The classical approach described above was based on three major assumptions:

  1. The systems analyst may not be very familiar with the application or business area: he may be an expert in computer technology, but only superficially knowledgeable about banking, insurance, inventory control, or whatever area the user is working in. Because of this, it is important for the systems analyst to begin with a current physical model as a way of educating himself. The model he draws will be relatively easy to verify, because it will contain a number of physical landmarks that can be observed in the user’s current physical environment. Having gathered this background information, the systems analyst can continue by transforming the physical model into a logical model.
  2. The user may be unwilling or unable to work with a new logical model at the beginning of a project. The most common reason for this is suspicion of the systems analyst’s ability to develop a logical model of the new system. Even if the systems analyst thinks that he is an expert in the user’s business area, the user may not agree. “Why should I trust you to design a new system for me,” she will ask, “when you don’t even understand how my business works now?” Also, some users find it difficult to look at an abstract system model with no recognizable landmarks; they may need a model of the current physical system as a way of familiarizing themselves with the process of structured analysis and assuring themselves that the analyst hasn’t overlooked anything. (An alternative is the prototyping approach discussed in Chapter 5.)
  3. The transformation of a current logical model into a new logical model does not require much work and, in particular, does not require much wasted work. As indicated above, the user will typically add some new functions, or new data, to the system she already has, but most (if not all) of the existing logical (or essential) system remains intact.

These assumptions have indeed turned out to be correct in many projects. However, they ignore a much larger danger: the process of developing a model of the current system may require so much time and effort that the user will become frustrated and impatient, and ultimately cancel the project. To appreciate this, you must keep in mind that:

  • Some users (and some managers and some programmer-analysts) regard any form of systems analysis as a waste of time — as a way of “resting up” until the real work of the project (i.e., coding) begins. This has become increasingly true in recent years, as the pace of business quickens.
  • Many users are understandably dubious about the merits of carefully modeling a system that, by definition, will be superseded and replaced as a result of the development of the new system.

The problem occurs most often because the systems analyst gets carried away with the task of modeling the current system and begins to think of it as an end in itself. Thus, instead of drawing just the top-level dataflow diagram(s) and documenting a few key process specifications, the systems analyst often draws every dataflow diagram, documents every process specification, and develops a complete data dictionary.

Unfortunately, this approach almost always involves a great deal of wasted time. Indeed, you can normally expect that as much as 75% of the physical model will be thrown away in the transition from current physical to current logical; or to put it another way, the current physical model is typically three to four times as large as the current logical model. This is because of redundancy (the same function being carried out in several different parts of the current system, and several data elements being duplicated or triplicated), and because of verification, validation, and error-checking that are appropriate in the current physical system but not appropriate in the current logical system [3].
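
The two figures quoted above are the same claim stated two ways. As a rough check, assume that a fraction f of the current physical model is discarded and that what remains is the current logical model (the symbol f is introduced here only for illustration):

```latex
\frac{\text{size of current physical model}}{\text{size of current logical model}}
  = \frac{1}{1-f},
\qquad f = \tfrac{2}{3} \;\Rightarrow\; 3,
\qquad f = \tfrac{3}{4} \;\Rightarrow\; 4 .
```

So a physical model three to four times the size of the logical model corresponds to discarding roughly two-thirds to three-quarters of it.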

All this may seem rather obvious to the casual reader. However, in project after project, systems analysts have been observed getting so involved in the process of modeling that they have forgotten the user’s ultimate objective: to produce a working system. As Steve McMenamin (co-author of [McMenamin and Palmer, 1984]) points out, “Bubbles don’t compile.” [4]

Consequently, this book recommends that the systems analyst should avoid modeling the user’s current system if at all possible. The modeling tools discussed in Part II should be used to begin, as quickly as possible, to develop a model of the new system that the user wants. This new system, referred to in classical structured analysis textbooks as the new logical system, is referred to here as the essential model of the system.

There will occasionally be a situation where the systems analyst must build a model of the user’s current system; this is true, for example, if the systems analyst needs to model the current physical system in order to discover what the essential processes really are. This situation is discussed further in Section 17.3.

17.2 THE ESSENTIAL MODEL

17.2.1 What It Is

The essential system model is a model of what the system must do in order to satisfy the user’s requirements, with as little as possible (and ideally nothing) said about how the system will be implemented. As mentioned earlier, this means that our system model assumes that we have perfect technology available and that it can be readily obtained at zero cost.

Specifically, this means that when the systems analyst talks with the user about the requirements of the system, the analyst should avoid describing specific implementations of processes (bubbles in the dataflow diagram) in the system; that is, he or she should not show the system functions being carried out by humans or an existing computer system. As illustrated by Figure 17.4(a) and (b), these are arbitrary choices of how the system might be implemented; but this is a decision that should be delayed until the systems design activity has begun [5]. Figure 17.4(c) shows a more appropriate essential model of what the system function must carry out regardless of its eventual implementation.

Figure 17.4(a): A model of how a system function will perform its job

 

Figure 17.4(b): Another model of how the system function will be performed

 

Figure 17.4(c): A model of what the system function is

 

The same is true of dataflows and data stores: the essential model should describe the content of dataflows and data stores, without describing the medium (e.g., disk or tape) or physical organization of the data.

17.2.2 Difficulties in Building an Essential Model

While the guidelines above may seem simple and obvious, it often turns out to be very difficult to completely eliminate all implementation details from the essential model. The most common examples of implementation details are:

  • Arbitrary sequencing of activities in a dataflow model. The only sequencing on the dataflow diagram should be that required by data (e.g., bubble 2 may require a data element produced by bubble 1 and thus cannot begin its work until bubble 1 has finished) or by the sequencing of events external to the system.
  • Unnecessary files (i.e., data stores that would not be required given perfect technology). Temporary files (or intermediate files) are required in an implementation model because processes are scheduled to do their work at different times (e.g., an overnight batch program produces a file used by the daytime, on-line system); they are also introduced in implementation models for backup and recovery purposes, because the implementation technology is error prone, as are the people who operate the computers.
  • Unnecessary error-checking and validation of data and processes inside the system. Such validation activities are necessary in an implementation model, because one must work with error-prone processes (e.g., some functions are carried out by humans, who are notoriously error prone) and noisy paths of data between processes.
  • Redundant or derived data. Redundant data elements are sometimes included in data stores for the sake of efficiency; while this is usually a reasonable thing to do, it should be done during the design phase of the project, not during the modeling of essential functions and data. Also, the systems analyst may inadvertently include data elements that can be derived, or computed, from values of other data elements.
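
The first point above, that the only sequencing in an essential model should be the sequencing required by data, can be illustrated with a small sketch. This is not a notation used in this book; it assumes a hypothetical representation in which each bubble is listed with the data elements it consumes and produces, and all bubble and data element names are invented for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical bubbles: name -> (data elements consumed, data elements produced).
PROCESSES = {
    "bubble 1": ({"order"}, {"priced order"}),
    "bubble 2": ({"priced order"}, {"invoice"}),
    "bubble 3": ({"customer inquiry"}, {"customer response"}),
}


def data_required_predecessors(processes):
    """Map each process to the processes whose outputs it consumes."""
    producers = {}
    for name, (_, outputs) in processes.items():
        for element in outputs:
            producers.setdefault(element, set()).add(name)

    predecessors = {name: set() for name in processes}
    for name, (inputs, _) in processes.items():
        for element in inputs:
            # Elements with no internal producer arrive from the environment.
            predecessors[name] |= producers.get(element, set()) - {name}
    return predecessors


predecessors = data_required_predecessors(PROCESSES)
# Any ordering consistent with the data dependencies is acceptable.
one_valid_order = list(TopologicalSorter(predecessors).static_order())
```

Here bubble 2 must wait for bubble 1 because it needs a data element that bubble 1 produces, while bubble 3 has no data-required predecessor. Any additional ordering shown in a model of these three bubbles would be an arbitrary implementation choice.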

17.2.3 Components of the Essential Model

The essential model consists of two major components:

  1. Environmental model
  2. Behavioral model

The environmental model defines the boundary between the system and the rest of the world (i.e., the environment in which the system exists). It is discussed in more detail in Chapter 19; as we will see, it consists of a context diagram, an event list, and a short description of the purpose of the system.

The behavioral model describes the required behavior of the insides of the system necessary to interact successfully with the environment. Chapters 20 and 21 describe a strategy for deriving the behavioral model; the model consists of the familiar dataflow diagrams, entity-relationship diagrams, state-transition diagrams, data dictionary entries, and process specifications that we have discussed earlier in the book.

17.3 WHAT TO DO IF YOU MUST BUILD AN IMPLEMENTATION MODEL

As mentioned earlier in this chapter, there are circumstances where you may find it necessary or desirable to build an implementation model before you build the essential model of the system. Typically, this will happen because the user is not convinced that you understand the business well enough to model a new system, or because you have decided on your own that you need to study the current environment before proposing a new system.

If you decide to proceed in this fashion, the primary thing you must remember is that your main objective is to get a general understanding and a general overview of the existing system. It is not your objective to document the current system in minute detail. Thus, it will probably be useful and appropriate to create one or more levels of dataflow diagrams for the current system, and to generate an entity-relationship diagram. It might also be useful to write process specifications for a few of the more critical (or obscure) functions in the system, and to collect some of the physical documents that would represent a physical data dictionary. But you should not try to write process specifications for all the functions, nor should you try to develop a complete data dictionary for the existing system.

When you have finished developing the model of the current implementation, your next job is to logicalize it (i.e., to remove as many implementation-oriented details as possible). This will usually include the following steps:

  • Look for essential flows that have been arbitrarily packaged together in the same medium and separate them. For example, you may find that in the current system, several data elements are being transmitted together from one computer to another computer via a common telecommunications link; or you may find that several unrelated data elements are being copied onto a paper form to be transmitted to various functions.
  • Look for aggregate or packaged flows that are sent to bubbles (representing people, computers, etc.) that don’t need all the data in those flows. Thus, Figure 17.5(a) shows a process, COMPUTE FRAMMIS FACTOR, that requires only data element X; meanwhile, another process, COMPUTE WIDGET FACTOR, requires only data element Y. For convenience, the current implementation has packaged X and Y into an aggregate data element Z; logicalizing this model would result in the dataflow diagram shown in Figure 17.5(b).
  • Distinguish between the essential work done by a process and the identification of the processor shown in the implementation model. The processor might be a person or a computer or some other form of technology; and an individual processor might be carrying out fragments of one or more essential processes, or several essential processes in their entirety. As we will see in Chapter 20, the essential processes should be grouped together if they are triggered by the same external event.
  • Eliminate processes whose only purpose is to transport data from one place to another within the system. Also, eliminate the bubbles responsible for physical input and output between the system and the external environment. A physical model of a system might show, for instance, a courier or messenger function; it should be eliminated in the essential model. And many physical DFDs have processes with names like “obtain input from user” or “print report”; these, too, should be eliminated.
  • Eliminate processes whose job is to verify data that are both produced inside the system and used inside the system. Since we are assuming perfect technology in the essential model, such internal verification and cross-checking is not necessary. It is appropriate, though, to provide error-checking for data brought into the system from the external environment. Thus, any processes whose names are “double check ... ” or “verify ... ” or “validate ... ” or “edit ... ” should be regarded with suspicion, unless they exist at the boundary of the system and are dealing with external inputs.
  • Look for situations where essential stores have been packaged together into the same implementation store (e.g., disk files, tape files, or paper files); this is very common in second-generation systems and in systems that have been optimized over a period of years to handle large volumes of data efficiently. Separate the content of the store from the medium of storage.
  • Remove any data elements from stores if they are not used by any process; also, remove data elements from stores if they can be computed, or derived, directly from other data elements. (Note that derived data elements and redundant copies of data elements may be reinserted later when the implementation model is developed during systems design.)
  • Finally, remove any data stores that exist only as an implementation-dependent time delay between processes. These include intermediate files, report files, spooling files, and the like.
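
Several of the steps above are mechanical enough to sketch in code. The following is a minimal illustration, not a tool described in this book: it assumes each bubble in the physical DFD has been tagged by the analyst as transport-only or as sitting at the system boundary, and every class, function, and data element name is hypothetical.

```python
from dataclasses import dataclass

# Name prefixes that the text says should be regarded with suspicion.
SUSPECT_PREFIXES = ("double check", "verify", "validate", "edit")


@dataclass(frozen=True)
class Process:
    name: str
    at_system_boundary: bool = False  # deals with external inputs
    transport_only: bool = False      # only moves data or does physical I/O


def review_process(process):
    """Return 'eliminate', 'suspect', or 'keep' for one physical-DFD bubble."""
    if process.transport_only:
        return "eliminate"
    is_checker = process.name.lower().startswith(SUSPECT_PREFIXES)
    if is_checker and not process.at_system_boundary:
        return "suspect"
    return "keep"


def logicalize_store(elements, used_elements, derived_elements):
    """Drop data elements that no process uses or that can be derived."""
    return {e for e in elements if e in used_elements and e not in derived_elements}


bubbles = [
    Process("print report", transport_only=True),
    Process("verify invoice total"),
    Process("validate customer order", at_system_boundary=True),
    Process("compute invoice total"),
]
verdicts = {p.name: review_process(p) for p in bubbles}

order_store = logicalize_store(
    elements={"quantity", "unit price", "extended price", "old route code"},
    used_elements={"quantity", "unit price", "extended price"},
    derived_elements={"extended price"},  # derivable as quantity * unit price
)
```

A verdict of "suspect" is only a prompt for the analyst to look more closely; deciding whether a checking process is truly internal still requires judgment about where the system boundary lies.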

Figure 17.5(a): A physical model

 

Figure 17.5(b): The logicalized version
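
The unpacking of the aggregate flow Z in Figure 17.5 can also be sketched as a small computation. This is an illustration only; it assumes the analyst has already recorded which data elements each process actually needs, and the dictionary layout and function name are hypothetical.

```python
# Physical model: the aggregate flow Z (X packaged with Y) goes to both processes.
PACKAGED_FLOWS = {"Z": {"X", "Y"}}
PHYSICAL_INPUTS = {
    "COMPUTE FRAMMIS FACTOR": {"Z"},
    "COMPUTE WIDGET FACTOR": {"Z"},
}
# What each process really requires, as stated in the text.
NEEDED = {
    "COMPUTE FRAMMIS FACTOR": {"X"},
    "COMPUTE WIDGET FACTOR": {"Y"},
}


def unpack_flows(physical_inputs, packaged_flows, needed):
    """Replace aggregate flows with only the elements each process needs."""
    logical = {}
    for process, flows in physical_inputs.items():
        elements = set()
        for flow in flows:
            # Expand an aggregate into its components; leave other flows alone.
            elements |= packaged_flows.get(flow, {flow})
        logical[process] = elements & needed[process]
    return logical


LOGICAL_INPUTS = unpack_flows(PHYSICAL_INPUTS, PACKAGED_FLOWS, NEEDED)
```

After unpacking, COMPUTE FRAMMIS FACTOR receives only X and COMPUTE WIDGET FACTOR receives only Y, which corresponds to the logicalized diagram.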

 

17.4 SUMMARY

The concept of an essential model seems quite natural, but it is not as easy to achieve on real-world projects as you might think. Most users are so involved in the implementation details of their current system that it is hard for them to focus on the “perfect technology” view of a system. And it is equally difficult for many veteran systems analysts, for they have spent so many years building systems that it is difficult for them to avoid making implementation assumptions as they describe a system.

Remember that it is critically important to develop the essential model of a system, for (as noted several times throughout this book) most large information systems have a lifetime of 10 to 20 years. During that period of time, we can expect computer hardware technology to improve by at least a factor of a thousand, and probably closer to a factor of a million or more. A computer that is a million times faster, smaller, and cheaper than today’s computer is indeed close to perfect technology; we must begin today modeling our systems as if we had that technology available to us.

REFERENCES

  1. Tom DeMarco, Structured Analysis and Systems Specification. New York: YOURDON Press, 1978.
  2. Chris Gane and Trish Sarson, Structured Systems Analysis: Tools and Techniques. Englewood Cliffs, N.J.: Prentice-Hall, 1978.
  3. Edward Yourdon, Managing the Systems Life Cycle. New York: YOURDON Press, 1982.
  4. Victor Weinberg, Structured Analysis. New York: YOURDON Press, 1978.
  5. Steve McMenamin and John Palmer, Essential Systems Analysis. New York: YOURDON Press, 1984.

QUESTIONS AND EXERCISES

  1. What are the four models recommended by classical systems analysis textbooks?
  2. What is a current physical model?
  3. Give three examples of physical processes (bubbles).
  4. Give three examples of physical stores.
  5. Give three examples of physical dataflows.
  6. What is a current logical model?
  7. What is the difference between a current physical model and a current logical model?
  8. What is perfect technology in the context of this chapter?
  9. What is a new logical model?
  10. What is the difference between a current logical model and a new logical model?
  11. Under what circumstances could the current logical model and the new logical model for a system be the same?
  12. What degree of overlap should the systems analyst expect to see between the current logical and new logical model of a system?
  13. What is a new physical model?
  14. What is another name for the new physical model?
  15. What is the major constraint that the new physical model describes?
  16. What are the three major assumptions that the classical approach to structured analysis is based on?
  17. Research Project: In your organization, what percentage of projects have systems analysts who are not intimately familiar with the user’s business area? Is this a reasonable percentage in your opinion? Is it changing?
  18. What are the two major reasons why a user might have trouble reading and understanding a logical model?
  19. What is the major problem with the classical approach to structured analysis?
  20. Why are some users dubious about the merits of modeling their current system?
  21. How much of the current physical model is likely to be thrown away in the transition to a current logical model?
  22. What are the reasons that the current physical model is so much larger than the current logical model of a system?
  23. What is a synonym for new logical model?
  24. What kind of error-checking is appropriate in a logical model? What kind is inappropriate? Why?
  25. Give a definition of the essential model of a system.
  26. What does constructive procrastination mean in the context of this chapter?
  27. When, in a systems development project, should the decision be made about implementing a function (i.e., a process in the DFD) with a person versus a computer?
  28. What are the four common errors or mistakes typically made by systems analysts when trying to create an essential model?
  29. Why should temporary files not be shown in an essential model?
  30. When should temporary files be shown in a system model? Why?
  31. When should redundant data be shown in a system model?
  32. When should derived data be shown in a system model?
  33. What are the two components of the essential model of a system?
  34. What is the purpose of the environmental model of a system?
  35. What is the purpose of the behavioral model of a system?
  36. If you have to document the current implementation of a system, what should you be careful to avoid?
  37. Is it a good idea to document all the dataflows in the current implementation of a system? Why or why not?
  38. Is it a good idea to document all the process specifications in the current implementation of a system? Why or why not?
  39. Is it a good idea to document all the elements of the data dictionary in the current implementation of a system? Why or why not?
  40. When logicalizing a current physical model, what should you do with essential flows that have been packaged in the same medium?
  41. When logicalizing a current physical model, what should you do with packaged flows sent to processes that don't need all the data?
  42. When logicalizing a current physical model, what should you do with processes whose only purpose is to transport data from one place to another?
  43. When logicalizing a current physical model, what should you do with bubbles whose only purpose is to verify data that are created within the system?
  44. When logicalizing a current physical model, what should you do with essential stores that have been packaged in the same medium?
  45. When logicalizing a current physical model, what should you do with data elements that exist in stores but are not used anywhere in the system?
  46. When logicalizing a current physical model, what should you do with temporary files that are found in the current physical system?

FOOTNOTES

  1. [1] There are many possible reasons for this. The system may be implemented on computer hardware that is now obsolete or on hardware whose manufacturer has gone out of business. Or the system’s performance or response time may be inadequate. Or the user may ask that some manually maintained data (e.g., paper files) be computerized. Or, as is increasingly common these days, the software may be so poorly documented that it can no longer be maintained or modified.
  2. [2] Perfect technology can be interpreted as computer hardware that costs no money, takes up no space, consumes no power, generates no heat, runs at infinite speed (i.e., carries out any computation in zero time), stores an infinite amount of data (any or all of which can be retrieved in zero time), never breaks down, and never makes mistakes.
  3. [3] Regardless of whether we are building a logical (essential) or physical (implementation) model, it is usually appropriate to perform some error-checking of data that come into the system from the external world. However, as data are transmitted from place to place within the system, the logical (essential) model does no error-checking, because it assumes that the system will be implemented with perfect technology. In the physical (implementation) model, especially a model of the current physical system, the error-checking is vital because (1) some of the processing is error prone, especially if carried out by humans, (2) the transportation of data from one process to another may be error prone, depending on the communications medium used, and (3) the storage and retrieval of data from physical data stores may be an error-prone activity.
  4. [4] Eventually, bubbles will compile. That is, the combination of dataflow diagrams, data dictionary, and rigorous process specifications can become input to a code generator that will produce executable programs. However, even in this case, the effort to produce a complete, detailed physical model is a waste of time. Nobody wants a computerized replica of the current system.
  5. [5] A popular term for this is “constructive procrastination.” My colleague, Steve Weiss, prefers “safe deferral,” which is less pejorative, and is indeed the principle upon which the top-down approach is based.