Chapter 22

Moving Into Design

For close designs and crooked counsels fit,
Sagacious, bold, and turbulent of wit,
Restless, unfix’d in principles and place,
In power unpleas’d, impatient of disgrace;
A fiery soul, which working out its way,
Fretted the pygmy-body to decay ...

-- John Dryden
Absalom and Achitophel, 1680




IN THIS CHAPTER, YOU WILL LEARN:


  1. The three levels of systems design;
  2. The three major criteria for evaluating a design;
  3. How to draw a structure chart; and
  4. How to use coupling and cohesion to evaluate a design.


When the user implementation model has been completed, the job of systems analysis is officially over. Everything past that point becomes a matter of implementation. The visible part of this work is programming and testing, which we will discuss in Chapter 23. However, programming should be preceded by a higher-level activity: design.

As a systems analyst, you may not feel that you are interested in the details of systems design or program design; however, as we saw in the previous chapter, the work of the systems analyst and the work of the designer cannot always be separated. Especially in the area of the user implementation model, the analyst must make sure that she or he understands the user’s requirements, while the designer must ensure that those requirements can be realistically implemented with current computer technology. Thus, it is important for you to have some understanding of the process that the designer goes through when your job is finished.

There is another reason for being interested in systems design: you may find yourself doing the job! Especially on small- and medium-sized systems, the same individual is often expected to document the user requirements and also develop the design. Thus, you may be expected to decide the best way of mapping the model of user requirements onto a configuration of different CPUs; you may have to decide how the logical data model (which was documented with ERDs) can best be implemented by a database management system; and you may have to decide how the system functions should be allocated to different tasks within each processor.

It is not the purpose of this book to discuss the activities of systems design in great detail; this is better accomplished in books devoted to the subject, such as [Page-Jones, 1988], [Yourdon and Constantine, 1989], [Ward and Mellor, 1985], [Jackson, 1975], [Orr, 1977], and others. However, we will briefly examine the major stages of design and some of the more important objectives that a system designer should try to achieve. Since systems design and program design are indeed subjects unto themselves, you should definitely examine the references at the end of this chapter if you need additional information.




THE STAGES OF DESIGN

The activity of design involves developing a series of models, in much the same way that the systems analyst develops models during the systems analysis phase of a project. The specific design models and their relationship to the systems analysis models discussed in this book are illustrated in Figure 22.1.

The most important models for the designer are the systems implementation model and the program implementation model. The systems implementation model is further divided into a processor model and a task model.


Figure 22.1: Analysis models and design models; source: Image:Figure221.graffle


The Processor Model

The first job that the systems designer faces is deciding how the essential model (or, to be more accurate, the automated portion of the user implementation model) will be allocated to major pieces of hardware and system software technology. At the level of the processor model, the systems designer is primarily trying to decide how the essential model should be allocated to different processors (CPUs) and how those processors should communicate with one another. There are typically a variety of choices:


  • The entire essential model can be allocated to a single processor. This is often referred to as the mainframe solution.
  • Each bubble in the essential model’s Figure 0 DFD can be allocated to a different CPU (typically a mini or micro computer).[1] This is often referred to as the distributed solution.
  • A combination of mainframes, minis, and micros can be chosen to minimize costs, maximize reliability, or achieve some other objective.

Just as processes must be assigned to appropriate hardware components, data stores must be similarly allocated. Thus, the designer must decide whether a store will be implemented as a database on processor 1 or processor 2. Since most stores are shared by many processes, the designer may also have to decide whether duplicate copies of the store need to be assigned to different processors. The activity of allocating processes and stores to processors is illustrated in Figure 22.2.


Figure 22.2: Allocating processes and stores to processors; source: Image:Figure222.graffle


Notice that anything other than a single-processor implementation will involve some mechanism for communicating between processors; what we have traditionally shown as dataflows must now be specified in physical terms. Some of the choices available to the systems designer for processor-to-processor communication are:


  • A direct connection between the processors. This could be implemented by connecting the processors with a cable, a channel, or a local area network. This kind of communication will generally permit data to be transmitted from one processor to another at speeds ranging from 50,000 bits per second (often abbreviated as 50 Kbps) to several million bits (megabits) per second.
  • A telecommunications link between the processors, typically via the Internet. This is especially common if the processors are physically separated by more than a few hundred feet. Depending on the nature of the telecommunications link, data will typically be transmitted between the processors at speeds ranging from 28.8 Kbps to as high as several megabits per second.[2]
  • An indirect link between the processors. Data may be written onto magnetic tape, floppy disk, punched cards, or some other storage medium on one processor and then physically carried to another processor to be used as input.


The last case (sometimes known as the “sneaker interface”) is somewhat extreme, but it illustrates an important point: processor-to-processor communication is generally much, much slower than communication between processes (bubbles) within the same processor. Thus, the systems designer will generally try to group processes and stores that have a high volume of communication within the same processor.
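The grouping argument can be made concrete with a small calculation. The following is a minimal sketch, not a design tool: the process names, dataflow volumes, and allocations are hypothetical, and the link speeds are illustrative values taken from the ranges quoted above. It ignores latency and protocol overhead.

```python
# Illustrative link speeds in bits per second (assumed values, not measurements).
LINK_BPS = {
    "direct_slow": 50_000,     # low end of a direct connection
    "direct_fast": 2_000_000,  # "several million bits per second"
    "dialup": 28_800,          # slow telecommunications link
}


def transfer_seconds(num_bytes: int, bits_per_second: int) -> float:
    """Time to move num_bytes over a link, ignoring latency and overhead."""
    return num_bytes * 8 / bits_per_second


def cross_processor_bytes(flows: dict, allocation: dict) -> int:
    """Sum the dataflow volume that crosses a processor boundary.

    flows: maps (source_process, destination_process) -> bytes per transaction
    allocation: maps process name -> processor name
    """
    return sum(
        volume
        for (src, dst), volume in flows.items()
        if allocation[src] != allocation[dst]
    )


# Hypothetical dataflows between three bubbles.
flows = {("edit", "process"): 400, ("process", "report"): 50}

# Two candidate allocations of bubbles to processors.
together = {"edit": "cpu1", "process": "cpu1", "report": "cpu2"}
apart = {"edit": "cpu1", "process": "cpu2", "report": "cpu2"}

print(cross_processor_bytes(flows, together))  # high-volume flow stays local
print(cross_processor_bytes(flows, apart))     # high-volume flow crosses the link
print(transfer_seconds(400, LINK_BPS["dialup"]))
```

Comparing the two allocations shows why the designer would keep the edit and process bubbles together: the first allocation sends far fewer bytes per transaction across the slow link.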

A variety of factors must be taken into account by the systems designer as he or she makes these allocations. Typically, the major issues are these:


  • Cost. Depending on the nature of the system, a single-processor implementation may be the cheapest, or it may not be. For some applications, a group of low-cost microcomputers might be the most economical solution; for others, implementation on the organization’s existing mainframe computer might be the most practical and economical.[3]
  • Efficiency. The systems designer is generally concerned with response time for on-line systems and turn-around time for batch computer systems. Thus, the designer must choose processors and data storage devices that are fast enough and powerful enough to meet the performance requirements specified in the user implementation model. In some cases, the designer may choose a multiple-processor implementation so that different parts of the system can be carried out in parallel, thus speeding up overall response time. At the same time, the designer must be concerned about the inefficiency of processor-to-processor communication, as discussed earlier.


For example, suppose that the designer sees that the system contains an edit function and a process function, as shown in Figure 22.3. By putting each function in a separate processor, the designer knows that the system will be able to edit one transaction while simultaneously carrying out the processing for another transaction, thus presumably improving the efficiency of the overall system. On the other hand, the edited transactions will have to be sent from one CPU to another; this may be very efficient if it can be done through a direct hardware connection, or it may be very inefficient if the communication takes place via slow telecommunication lines.


Figure 22.3: Processor-to-processor communication; source: Image:Figure223.graffle


  • Security. The end user may have security requirements that dictate the placement of some (or all) processors and/or sensitive data in protected locations. Security requirements may also dictate the nature of (or the absence of) processor-to-processor communication; for example, the designer may be precluded from transmitting data from one processor to another over ordinary telephone lines if the information is confidential.
  • Reliability. The end user will typically specify reliability requirements for a new system; these requirements may be expressed in terms of mean time between failures (MTBF), mean time to repair (MTTR), or system availability.[4] In any case, this can have a dramatic influence on the kind of processor configuration chosen by the designer: he or she may decide to separate the system processes into several different processors so that some portion of the system will be available even if other parts are rendered inoperable because of a hardware failure. Alternatively, the designer may decide to implement redundant copies of processes and/or data on multiple processors, perhaps even with spare processors that can take over in the event of a failure. This is shown in Figure 22.4; even if the processing CPU should fail (which is perhaps more likely, because it is a large, complex mainframe computer), the individual edit CPUs can continue operating -- collecting transactions, editing them, and storing them for later processing. Similarly, if one of the edit CPUs breaks down, the others can presumably continue operating.
  • Political and operational constraints. The hardware configuration may also be influenced by political constraints imposed directly by the end user, by other levels of management in the organization, or by the operations department in charge of maintaining and operating all computer systems. This may lead to a specific choice of hardware configuration, or it may preclude the choice of certain vendors. Similarly, environmental constraints (e.g., temperature, humidity, radiation exposure, dust/dirt, vibration) may be imposed upon the designer, and this can have an enormous influence on the processor configuration that he or she chooses.


Figure 22.4: Multiple processors for reliability; source: Image:Figure224.graffle
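The reliability trade-off can be estimated with the availability formula from endnote 4. The sketch below is illustrative only: the MTBF and MTTR figures are invented, and the redundancy calculation assumes that processors fail independently and that any one surviving copy can carry the load, which a real configuration may not satisfy.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """availability = MTBF / (MTBF + MTTR), as given in endnote 4."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_availability(single: float, copies: int) -> float:
    """Availability of a group of interchangeable processors.

    Assumes independent failures and that one working copy is enough;
    the group is down only when every copy is down at the same time.
    """
    return 1 - (1 - single) ** copies


# Hypothetical figures: a processor that fails every 99 hours on average
# and takes 1 hour to repair.
one_cpu = availability(mtbf_hours=99, mttr_hours=1)
two_cpus = redundant_availability(one_cpu, copies=2)

print(f"single processor: {one_cpu:.4f}")
print(f"two redundant processors: {two_cpus:.4f}")
```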

The Task Model

Once the processes and stores have been allocated to processors, the designer must, on a processor by processor basis, assign processes and data stores to individual tasks within each processor. The notion of a task is common to virtually every brand of computer hardware, though the terminology may differ from vendor to vendor: some vendors will use the term partition, while others use the terms job step, overlay, or control point. Regardless of the term, Figure 22.5 shows how a typical processor divides its available storage into separate areas, each managed by a central operating system. The systems designer generally has to accept the vendor’s operating system as a given (though she or he may be able to choose between several different operating systems for a given computer), but the designer does have the freedom to decide which portions of the essential model assigned to that processor should be allocated to individual tasks within the processor.


Figure 22.5: Organization of tasks within a processor; source: Image:Figure225.graffle


Note that processes within the same processor may need to communicate through some form of intertask communication protocol. The mechanism for doing this varies from one vendor to another, but it is almost universally true that the communication takes place through the vendor’s operating system, as illustrated by Figure 22.6. Just as transmission of data from one processor to another processor is relatively slow and inefficient, the communication of data (or control signals) from one task to another task within the same processor is also inefficient. Communication between processes in the same task is usually much more efficient. Thus, the systems designer will generally try to keep within the same task those processes that have a high volume of communication.


Figure 22.6: Intertask communication within a processor; source: Image:Figure226.graffle
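As a rough illustration of intertask communication, the sketch below uses two Python threads as stand-ins for tasks and a queue.Queue as a stand-in for the channel that the operating system would provide. This is an analogy under stated assumptions, not a model of any vendor's mechanism; the function names and the editing rule (trim and uppercase) are hypothetical.

```python
import queue
import threading


def edit_task(outbox: queue.Queue, transactions: list) -> None:
    """First task: edit each transaction and hand it to the next task."""
    for raw in transactions:
        outbox.put(raw.strip().upper())
    outbox.put(None)  # sentinel: no more transactions


def process_task(inbox: queue.Queue, results: list) -> None:
    """Second task: receive edited transactions until the sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:
            break
        results.append(item)


def run(transactions: list) -> list:
    """Start both tasks, wait for them to finish, and return the results."""
    channel = queue.Queue()  # every item passes through this shared channel
    results = []
    tasks = [
        threading.Thread(target=edit_task, args=(channel, transactions)),
        threading.Thread(target=process_task, args=(channel, results)),
    ]
    for task in tasks:
        task.start()
    for task in tasks:
        task.join()
    return results


print(run([" order 1 ", "order 2"]))
```

Every item crosses the channel individually, which hints at why a high volume of intertask traffic is costly compared with a direct call inside one task.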


Within an individual processor, it is not always clear whether activities are occurring synchronously or asynchronously; that is, it isn’t always clear whether only one thing can be happening at a time or multiple things at a time. Typically, each individual processor only has a single CPU, which can only be executing instructions for one process at a time; however, if one process is waiting for some input or output from a storage device (e.g., disk, tape, CRT terminal, etc.), the processor’s operating system can switch control to another task. Thus, the systems designer can often pretend that each task is an independent, asynchronous activity.

The Program Implementation Model

Finally, we reach the level of an individual task; at this point, the systems designer has already accomplished two levels of process and data storage allocation. Within an individual task, the computer typically operates in synchronous fashion: only one activity at a time can take place. A common model for organizing the activity within a single, synchronous unit is the structure chart, which shows the hierarchical organization of modules within one task. The major components of a structure chart are shown in Figure 22.7.


Figure 22.7: Components of a structure chart; source: Image:Figure227.graffle


You would read this small structure chart in the following way:


  • Module A is the top-level executive module of the system consisting of modules A and B. The reason that A is identified as the top-level (superordinate) module is not because it is topologically above module B, but rather because no other module calls it. Module B, on the other hand, is said to be subordinate to module A. (Module A is presumed to be called or invoked by the operating system of the computer.)


  • Module A contains one or more executable instructions, including a call to module B. This call may be implemented as a CALL statement in languages like FORTRAN; or a PERFORM statement or CALL USING statement in COBOL; or simply by invoking the name of B in other languages. The structure chart deliberately avoids describing how many times module A actually calls module B; that depends on the internal program logic within module A. Thus, there may be a statement of the following kind in module A:


    IF nuclear-war-begins
        CALL Module-B
    ELSE
        ...


in which case module B may never be called. But there might also be a program statement in module A of the following kind:


    DO WHILE there are more orders in ORDERS file
        CALL Module-B
    ENDDO


in which case module B may be called several thousand times.


  • When module B is called, module A’s execution is suspended. Module B begins executing at its first executable statement; when it finishes, it exits or returns to module A. Module A then resumes its execution at the point where it left off.
  • Module A may or may not pass input parameters to module B as part of its call, and module B may or may not return output parameters when it returns to module A. In the example shown in Figure 22.7, module A passes parameters X and Y to module B, and module B returns parameters P and Q. Detailed definitions of X, Y, P, and Q would normally be found in a data dictionary. The actual mechanics of transmitting the parameters will vary from one programming language to another.
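The reading of Figure 22.7 given above can be sketched in code. In this hypothetical example, the computation inside module B (a sum and a product) is invented purely for illustration; only the calling relationship and the parameter names X, Y, P, and Q come from the structure chart.

```python
def module_b(x: int, y: int) -> tuple:
    """Subordinate module: receives X and Y, returns P and Q."""
    p = x + y  # placeholder computation (assumption)
    q = x * y  # placeholder computation (assumption)
    return p, q


def module_a(orders: list) -> list:
    """Top-level module: calls B once per order.

    How many times B is called depends on A's internal logic,
    which the structure chart deliberately does not show.
    """
    results = []
    for x, y in orders:
        # A is suspended here until B returns, then resumes.
        p, q = module_b(x, y)
        results.append((p, q))
    return results


print(module_a([(2, 3), (4, 5)]))  # B is called twice
print(module_a([]))                # B is never called
```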


An example of a complete structure chart is shown in Figure 22.8. Note that it contains four levels of modules; this would normally represent a program of about 500 to 1000 program statements, assuming that each module represents about 50 to 100 program statements.[5]


Figure 22.8: An example of a structure chart; source: Image:Figure228.graffle


There is an obvious question at this point: How does the systems designer transform a network model of processes in a dataflow diagram into the synchronous model represented by a structure chart? Several books on systems design, including [Page-Jones, 1988] and [Yourdon and Constantine, 1989], discuss this question in great detail. As Figure 22.9 illustrates, there is a cookbook strategy for transforming the network dataflow model into a synchronous structure chart model; indeed, the strategy is generally referred to as transform-centered design. Transform-centered design is only one of several strategies for converting a network dataflow model into a synchronous, hierarchical model; [Page-Jones, 1988], [Yourdon and Constantine, 1989], and [Ward and Mellor, 1985] discuss a variety of such strategies. Note that each process bubble in the dataflow diagram shown in Figure 22.9 becomes a module in the derived structure chart; this is a realistic situation if the processes are relatively small and simple (e.g., if the process specification is less than a page of structured English). In addition to the modules that implement the dataflow processes, the structure chart also contains modules to coordinate and manage the overall activity, as well as modules concerned with bringing input into the system and getting output out of the system.


Figure 22.9: Transform-centered design strategy; source: Image:Figure229.graffle
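The shape of a structure chart produced by transform-centered design can be suggested with a short sketch. The example below is an assumption-laden illustration rather than a reproduction of Figure 22.9: the invoicing domain, the module names, and the record layout are all hypothetical. It shows one module for input, one for the bubble that does the real work, one for output, and an executive module that coordinates them.

```python
def read_order(line: str) -> dict:
    """Input module: turn one raw line into an order record."""
    customer, quantity, unit_price = line.split(",")
    return {
        "customer": customer.strip(),
        "quantity": int(quantity),
        "unit_price": float(unit_price),
    }


def compute_total(order: dict) -> float:
    """Transform module: corresponds to one process bubble in the DFD."""
    return order["quantity"] * order["unit_price"]


def format_invoice(order: dict, total: float) -> str:
    """Output module: prepare the result for delivery."""
    return f"{order['customer']}: {total:.2f}"


def produce_invoices(lines: list) -> list:
    """Executive module: coordinates the others and does no detailed work."""
    invoices = []
    for line in lines:
        order = read_order(line)
        total = compute_total(order)
        invoices.append(format_invoice(order, total))
    return invoices


print(produce_invoices(["Acme, 3, 2.50"]))
```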


Other design strategies use the entity-relationship diagram or other forms of data-structure diagrams as a starting point in deriving the appropriate structure chart; see [Jackson, 1975] and [Orr, 1977] for more information about such design strategies. Another alternative is to use an object-oriented design/programming strategy when one has reached the level of inter-task communication within a processor.[6]

DESIGN GOALS AND OBJECTIVES

In addition to achieving the design objectives specified in the user implementation model, the designer is also concerned with overall quality of the design. The ultimate ability of the programmers to implement a high-quality, error-free system depends very much on the nature of the design created by the designer; similarly, the ability of the maintenance programmers to make changes to the system after it has been put into operation depends on the quality of the design.

The field of structured design contains a number of detailed guidelines that help the designer determine which modules, and which interconnections between the modules will best implement the requirements specified by the systems analyst; all the books listed at the end of this chapter elaborate on those guidelines. The two most important guidelines are coupling and cohesion; these as well as some other common guidelines are discussed next.

  • Cohesion. The degree to which the components of a module (typically the individual computer instructions that make up the module) are necessary and sufficient to carry out one, single, well-defined function. In practice, this means that the systems designer must ensure that she or he does not split essential processes into fragmented modules; and the designer must ensure that she or he does not gather together unrelated processes (represented as bubbles on the DFD) into meaningless modules. The best modules are those that are functionally cohesive (i.e., modules in which every program statement is necessary in order to carry out a single, well-defined task). The worst modules are those that are coincidentally cohesive (i.e., modules whose program statements have no meaningful relationship to one another at all).[7]
  • Coupling. The degree to which modules are interconnected with or related to one another. The stronger the coupling between modules in a system, the more difficult it is to implement and maintain the system, because a modification to one module will then necessitate careful study, as well as possible changes and modifications, to one or more other modules. In practice, this means that each module should have simple, clean interfaces with other modules, and that the minimum number of data elements should be shared between modules. And it means that one module should not modify the internal logic or data of another module; this is known as a pathological connection. (The dreaded ALTER statement in COBOL is a preeminent example.)
  • Module size. If possible, each module should be small enough that its program listing will fit on one page (or, alternatively, so that it can be displayed on one screen of a CRT). Of course, sometimes it is not possible to determine how large a module will be until the actual program statements have been written; but initial design activities will often give the designer a strong clue that the module is going to be large and complex. If this is the case, the large complex module should be broken into one or more levels of submodules. (On rare occasions, designers create modules that are overly trivial, for example, modules consisting of two to three lines of code. In this case, several such modules can be aggregated together into a larger supermodule.)
  • Span of control. The number of immediate subordinates that can be called by a manager module is known as the span of control. A module should not call more than approximately half a dozen lower-level modules. The reason for this is to avoid complexity: if a module has, say, 25 lower-level modules, then it will probably contain so much complex program logic (in the form of nested IF statements, nested DO-WHILE iterations, etc.) that nobody will be able to understand it. The solution to such a situation is to introduce an intermediate level of manager modules, just as a manager in a human organization would do if he found that he was trying to directly supervise 25 immediate subordinates.[8]
  • Scope of effect/scope of control. This guideline suggests that any module affected by the outcome of a decision should be subordinate (though not necessarily immediately subordinate) to the module that makes the decision. It is somewhat analogous to a management guideline that says that any employee affected by the outcome of a manager’s decision (i.e., within the scope of effect of the decision) should be within the manager’s scope of control (i.e., working somewhere in the hierarchy of people that reports to the manager). Violating this guideline in a structured design environment usually leads to unnecessary passing of flags and switches (which increases the coupling between modules), or redundant decision making, or (worst of all) pathological connections between modules.
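The cohesion and coupling guidelines above can be illustrated with a few small functions. This is a hedged sketch: the module names follow the examples in endnote 7, but the bodies, the tax calculation, and the shared variable are invented for illustration.

```python
def compute_net_salary(gross: float, tax_rate: float) -> float:
    """Functionally cohesive: every statement serves one well-defined task.

    Loosely coupled: everything it needs arrives through its parameters.
    """
    return round(gross * (1 - tax_rate), 2)


def validate_customer_address(address: str) -> bool:
    """Functionally cohesive: one task (here, a deliberately trivial check)."""
    return bool(address.strip())


def miscellaneous_functions(action: str, value, extra=None):
    """Coincidentally cohesive: unrelated jobs selected by a flag.

    The caller must know the flag values, which also raises coupling.
    """
    if action == "salary":
        return round(value * (1 - extra), 2)
    if action == "address":
        return bool(value.strip())
    raise ValueError(f"unknown action: {action}")


# Tighter coupling: the function depends on data shared outside its interface,
# so a change to current_tax_rate anywhere silently changes its result.
current_tax_rate = 0.2


def net_salary_using_shared_data(gross: float) -> float:
    return round(gross * (1 - current_tax_rate), 2)


print(compute_net_salary(1000.0, 0.2))
print(miscellaneous_functions("address", "12 Main St"))
```

A maintenance programmer can change compute_net_salary after reading only its own definition; changing miscellaneous_functions or the shared variable requires studying every caller.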

SUMMARY

There is much more to learn about design, but with this introduction you should understand the process that the system designer goes through. As we have seen, the first step is to map the essential model of user requirements onto a configuration of processors. Then, within each processor, the designer must decide how to allocate processes and data to different tasks. Finally, we must organize the processes within each task into a hierarchy of modules, using the structure chart as our modeling tool.

Note also that additional processes and data repositories will probably have to be added to the implementation model to accommodate the specific features of the implementation technology. For example, additional processes may be needed for error-checking, editing, and validation activities that were not shown in the essential model; other processes may be necessary for transporting dataflows between CPUs. Once this is accomplished, programming can begin. The subjects of programming and testing are discussed in Chapter 23.

REFERENCES

  1. Meilir Page-Jones, The Practical Guide to Structured Systems Design, 2nd ed. Englewood Cliffs, N.J.: Prentice-Hall, 1988.
  2. Edward Yourdon and Larry L. Constantine, Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design. Englewood Cliffs, N.J.: Prentice-Hall, 1989.
  3. Paul Ward and Steve Mellor, Structured Development for Real-Time Systems, Volume 3. New York: YOURDON Press, 1986.
  4. Michael Jackson, Principles of Program Design. New York: Academic Press, 1975.
  5. Ken Orr, Structured Systems Development. New York: YOURDON Press, 1977.
  6. Grady Booch, James Rumbaugh, and Ivar Jacobson. The Unified Modeling Language User Guide. Addison Wesley, 1999.
  7. Peter Coad and Edward Yourdon. Object-Oriented Design. Prentice Hall/Yourdon Press, 1991.
  8. Sally Shlaer and Stephen J. Mellor. Object-Oriented Systems Analysis: Modeling the World in Data. Prentice Hall/Yourdon Press, 1988.

QUESTIONS AND EXERCISES

  1. What activity follows the development of the user implementation model in a typical systems development project?
  2. What are the three major stages of design in a typical systems development project? What models are developed during these three stages?
  3. Why are models important during the design phase of a project?
  4. What is the major purpose of the processor model during the design activity?
  5. Give three examples of how the processes in an essential model could be mapped onto CPUs in an implementation model.
  6. What decisions must be made during the processor modeling activity about data stores that were identified in the essential model?
  7. List three common methods for interprocessor communication.
  8. What factors should the designer take into account when choosing one of these three methods? Which of these factors do you think is most important?
  9. If you are working on a systems development project where reliability is a high priority, how would this affect your decision about allocating essential processes and essential stores to different processors?
  10. Give an example of how political constraints could influence the allocation of essential tasks and essential stores to different processors.
  11. What is a task model in the context of this chapter? What are its components?
  12. Give three common synonyms for task.
  13. Under what circumstances could different tasks be operating at the same time?
  14. Research Project: Pick a common computer and operating system. Describe how different tasks operating under the control of the operating system can communicate with each other. What is the typical overhead (in terms of CPU time, memory utilization, and any other significant hardware resources) for such intertask communication?
  15. Give a definition of the program implementation model. What are its components?
  16. How should the designer transform an asynchronous, network-oriented DFD essential model into a synchronous, hierarchical model?
  17. Under what conditions does each bubble in the essential model become a module in the program implementation model?
  18. List two common design strategies. Give a brief description of each.
  19. What is the primary objective that the designer is trying to achieve when he or she translates the essential model into an implementation model?
  20. What other objectives does the designer usually try to achieve when he or she creates an implementation model?

ENDNOTES

  1. Before the mid-1990s, this was not realistic for anything other than a trivial system. If a system had 500 bottom-level bubbles in its essential model DFD, would it be realistic to consider implementing the system with 500 separate CPUs? With today’s technology, the answer often turns out to be “yes,” particularly if the separate CPUs actually belong to the end-users (who are presumed to have acquired them, and paid for them, out of their own budget).
  2. In many cases, the speed of the telecommunications link depends on whether the end-user is accessing the system through a dial-up connection and his/her own modem, or whether the connection is accomplished via hardware, modems, and telecommunication lines acquired by the development team. In the former case, the development team typically has little or no control over the nature of the equipment or the speed of the connection. Indeed, that becomes a design constraint in many cases: the system needs to be designed so that it will still behave in an acceptable fashion even if the user dials in with a reasonably slow modem.
  3. Keep in mind that there is a budget for the entire project, which should have been determined as part of the analysis process (see Chapter 5). Thus, the designer must choose the most efficient system that fits within the budget. However, keep in mind also the fact that budgets can change: the budgets developed during the analysis phase of the project were only estimates and may be subject to revision if the designer can show that more money needs to be spent for an acceptable implementation.
  4. System availability is usually defined as the percentage of time that the system is available for use. It can be calculated based on MTBF and MTTR as follows: availability = MTBF/(MTBF+MTTR)
  5. Of course, a module called EXTRACT CHARACTER does not sound as if it would require 50 to 100 statements; it might only require two to three statements in a typical high-level programming language. In a lower-level machine-oriented language, though, many more statements would typically be required.
  6. Indeed, one could interpret the essential model portrayed in Figures 22.1 and 22.2 as an object model, though it obviously does not show such OO concepts as class hierarchies, and it is not diagrammed with popular notations such as UML (see [Booch, 1999] for details of the UML notation). But if one interprets each “bubble” as an object/class, and each “dataflow” between the bubbles as a message between objects, then there is not a fundamental difference between the “structured” view of the system and an “object-oriented” view; see [Shlaer and Mellor, 1988] for more discussion of this concept.
  7. Examples of functionally cohesive modules are CALCULATE-SQUARE-ROOT, COMPUTE-NET-SALARY, and VALIDATE-CUSTOMER-ADDRESS. An example of a coincidentally cohesive module is MISCELLANEOUS-FUNCTIONS.
  8. There is an exception to this, known as a transaction center. If the manager module makes one simple decision in order to invoke only one of the immediate subordinates, then the program logic within that manager module will probably be fairly simple. In this case, we do not have to worry about the manager’s span of control.
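The transaction center described in endnote 8 might look like the following sketch. The transaction codes and subordinate modules are hypothetical; the point is only that the manager makes a single simple decision, so its logic stays simple however many subordinates it has.

```python
def add_customer(record: str) -> str:
    return f"added {record}"


def delete_customer(record: str) -> str:
    return f"deleted {record}"


def change_customer(record: str) -> str:
    return f"changed {record}"


# One entry per immediate subordinate; adding a subordinate adds one entry
# and no new decision logic.
SUBORDINATES = {
    "ADD": add_customer,
    "DELETE": delete_customer,
    "CHANGE": change_customer,
}


def transaction_center(code: str, record: str) -> str:
    """Manager module: one decision selects exactly one subordinate."""
    try:
        handler = SUBORDINATES[code]
    except KeyError:
        raise ValueError(f"unknown transaction code: {code}") from None
    return handler(record)


print(transaction_center("ADD", "Smith"))
```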