CS 371: Software Engineering Lecture Notes

I have HTML-ified most of my CS 371 lecture notes, which you can find here. These lecture notes are perpetually incomplete and are no substitute for your own attendance and understanding of what is discussed in class.

lecture #1 began here

Read the Syllabus. Get a CS account. Get the Textbooks. Skim Lethbridge chapter 1. Read in detail those Chapter 1 sections whose topics are mentioned in class.

Lecture 1 covered the syllabus and included a big discussion of possible projects.

lecture #2 began here

What is Software Engineering?

Software engineering can be described in terms of analysis (front end software engineering; breaking apart a problem into pieces) and synthesis (back end software engineering; constructing a solution from available or new components). By analogy to other engineering disciplines, we can infer that software engineering attempts to establish the kinds of methods and tools that enable software projects to be built predictably within prescribed schedules and budgets, meeting the customer's requirements of functionality and reliability. Lethbridge's definition is:

software engineering is the process of solving customers' problems by the systematic development and evolution of large, high-quality software systems within cost, time, and other constraints.

But software engineering is radically different from other engineering disciplines, in that:

the end product is abstract, not a concrete object like a bridge
costs are almost all human; materials are an ever shrinking fraction
easy to fix bugs, but hard to test and validate
software never wears out...but the hardware/os platforms it runs on do
variations in application domain are open-ended, may require extensive new non-software, non-engineering knowledge for each project

Phases of Software Development

Whether you follow the "Waterfall Model" or the "Spiral Model", software development includes the following aspects. The Waterfall Model says to do these things in order. The Spiral Model says to repeat these tasks in a cycle, adding detail and making corrections, until you know what you are doing. Note that many or most of these tasks produce documents, not programs, as their end result.

Requirements Analysis
Software Design
Coding
Testing
Maintenance

In this class we will approximate the function of a software engineering group on a project, emphasizing the requirements analysis through testing tasks and possibly neglecting the most important phase: maintenance!

Things Learned in Past Semesters

Design is more difficult than coding

Communicating and committing are more difficult than technical issues

Integration is more difficult than designing and coding your own stuff

Java has many ugly warts:

static vs. non-static
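constructors are not inherited by subclasses; every subclass constructor must call a superclass constructor (explicitly via super(...), or implicitly when the superclass has a no-argument constructor)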
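A minimal sketch of both warts. The classes Piece and Pawn are hypothetical, invented only for this illustration.

```java
// Hypothetical classes for illustration; not taken from any course project.
class Piece {
    static int count = 0;   // static: one copy shared by the whole class
    final String name;      // non-static: one copy per instance

    Piece(String name) {
        this.name = name;
        count++;
    }

    // A static method has no "this", so it can only use static members
    // directly; referring to "name" in here would be a compile error.
    static int howMany() {
        return count;
    }
}

class Pawn extends Piece {
    // Constructors are not inherited: Piece(String) does not give Pawn a
    // Pawn(String) constructor, so we must declare one ourselves.
    Pawn(String square) {
        // Piece has no no-argument constructor, so the superclass
        // constructor has to be called explicitly.
        super("pawn@" + square);
    }
}

class WartsDemo {
    public static void main(String[] args) {
        Piece king = new Piece("king");
        Pawn pawn = new Pawn("a2");
        System.out.println(king.name + ", " + pawn.name);
        System.out.println("pieces created: " + Piece.howMany());
    }
}
```

Deleting the Pawn constructor makes the class fail to compile, because the default constructor Java would supply tries to call a Piece() constructor that does not exist.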

The Development Team

Large projects are more of a challenge of organization and communication than they are of "programming". The development team may be organized into roles corresponding to the development phases (analyst, designer, programmer, tester, etc.), but perhaps a more crucial distinction is between the customer who is a domain expert and the developer who knows mainly about software. Generalizing, we have content specialists (customer, user, artist, librarian, manager) and functionality specialists (programmer, tester, manager) and one of the main goals of software engineering must be to bridge the Great Divide between these two spheres.

The Methods and Tools

You will be learning in this course many current state of the art methods and tools used in software engineering. "Front end" tools are documentation tools designed to help content specialists and programmers talk to (and understand) each other. We will use a subset of UML, a popular diagramming notation for requirements analysis and software design. A drawing tool called Dia, available on our Linux systems, is good at drawing UML diagrams. Our Dia diagrams will be incorporated into HTML analysis and design documents.

"Back end" tools emphasize synthesis of solutions based on designs. We will use a graphical user interface builder to generate user interfaces, the "make" program to compile many files containing many persons' work together, and the "cvs" source code revision control system.

Some quotes on tools, relevant to software engineering

Software development is like war:
"We shall not fail or falter; we shall not weaken or tire. Neither the sudden shock of battle nor the long-drawn trials of vigilance and exertion will wear us down. Give us the tools and we will finish the job." -- Sir Winston Churchill

Do you only need one or two tools?
"An apprentice carpenter may want only a hammer and saw, but a master craftsman employs many precision tools. Computer programming likewise requires sophisticated tools to cope with the complexity of real applications, and only practice with these tools will build skill in their use." -- Robert Kruse

Why is Software Engineering Crucial?

Because the larger a program gets, and the more features you add, the more bugs you get. Why? Because things get too complex for us to handle. Until we can solve this unsolvable puzzle, Moore's Law is limited or revoked by our inability to utilize hardware, just as we are unable to utilize our own brain (wetware).

Belady-Lehman observed:

[D. Berry, The Inevitable Pain of Software Development, Monterey Workshop 2002]

So, Software Engineering is All About Pain

Software Engineering, it turns out, is mainly about pain. Dan Berry, one of software engineering's luminary founding fathers, had this to say last October about software engineering methods:

Each method, if followed religiously, works. Each method provides the programmer a way to manage complexity and change so as to delay and moderate the B-L upswing. However, each method has a catch, a fatal flaw, at least one step that is a real pain to do, that people put off. People put off this painful step in their haste to get the software done and shipped out or to do more interesting things, like write more new code. Consequently, the software tends to decay no matter what. The B-L upswing is inevitable.

Dr. Berry goes on to give the following examples:

Software Method	Pain
Build-and-fix	doesn't scale up
Waterfall Model	it is impossible to fully understand and document complex software up front
Structured Programming	Change is a nightmare: patch or redesign from scratch
Requirements Engineering	Haggling over requirements is a royal pain.
Extreme Programming	Writing adequate test cases is a pain
Rapid Prototyping	We can't bear to throw away the prototype!
Formal Methods	Writing formal specification, and verifying it, may be a pain. Changing requirements is definitely a pain.
Code inspections	Documentation prep for inspection is a pain; nobody wants to be inspected.
"Daily Builds"	Testing, regression testing, and possibly reworking your latest change to not break someone else's latest change is a pain.

My goal for this course is to maximize your learning while minimizing your pain.

Object Modeling

Before we do object oriented programming, we do object oriented modeling. A model is a representation of a real-world process or concept in graphic or narrative form. Your goal in modeling is to create a model space that closely corresponds to reality. Your goal in modeling is also to discern what parts of reality are relevant to the application at hand. What the application needs to do determines which pieces of information to represent, and how to store them. Light has been modeled variously as particles or waves; people might be modeled as (SSN, income) number pairs, or as pliable 3D solid bodies, etc.

Reality Model space

Person
Name: Clint Jeffery
Email: jeffery@cs.nmsu.edu
Phone: 646-3480
Quest: grail
Favorite Color: blue

An object is an application-specific group of related information, used to represent some part of a model. "Numbers" or "arrays" are not application-specific, but "rate schedules" or "teams" might very well be objects in applications that work with those ideas.

Fields and Methods

The information in an object is called a field (a.k.a. member variable). Fields are tightly controlled and modified according to the rules of the application domain from which that object originates (laws of physics, corporate policies, etc.). This control is achieved by means of encapsulation: you can't get at the information directly, you must go through a public interface consisting of a set of methods (a.k.a. member functions).

Reality Model space

Person
Name: Clint Jeffery
Email: jeffery@cs.nmsu.edu
Phone: 646-3480
Quest: grail
Favorite Color: blue
+email(msg)
+phone(msg)
+turnin(assignment)
-read_email()
-send_email()
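The Person box above can be sketched in Java roughly as follows. The "+" members become public methods and the "-" members become private ones; the method bodies, the camelCase names (readEmail, sendEmail), and the extra answerAllMail method are assumptions added so the sketch does something observable.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the Person box; behavior is invented for illustration.
class Person {
    // Fields are private: code outside the class cannot touch them.
    private final String name;
    private final String emailAddress;
    private final String phoneNumber;
    private final List<String> inbox = new ArrayList<>();
    private final List<String> outbox = new ArrayList<>();
    private final List<String> voicemail = new ArrayList<>();

    Person(String name, String emailAddress, String phoneNumber) {
        this.name = name;
        this.emailAddress = emailAddress;
        this.phoneNumber = phoneNumber;
    }

    // "+" members: the public interface other objects may call.
    public void email(String msg) { inbox.add(msg); }
    public void phone(String msg) { voicemail.add(msg); }
    public void turnin(String assignment) { inbox.add("turnin: " + assignment); }

    // Hypothetical extra method (not in the diagram) so that a caller can
    // observe the effect of the private helpers below.
    public int answerAllMail() {
        int answered = 0;
        while (readEmail() != null) {
            answered++;
            sendEmail("Re: message " + answered);
        }
        return answered;
    }

    public String getName() { return name; }

    // "-" members: private helpers, reachable only from inside Person.
    private String readEmail() {
        return inbox.isEmpty() ? null : inbox.remove(0);
    }

    private void sendEmail(String msg) {
        outbox.add(msg);
    }
}

class PersonDemo {
    public static void main(String[] args) {
        Person p = new Person("Clint Jeffery", "jeffery@cs.nmsu.edu", "646-3480");
        p.email("Question about the syllabus");
        p.turnin("homework 1");
        System.out.println(p.getName() + " answered " + p.answerAllMail() + " messages");
        // p.inbox or p.readEmail() here would not compile: both are private.
    }
}
```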

Objects are categorized and grouped with other similar objects into classes. A class is a set of objects which represent similar things in the application domain. We say the objects that are members of a particular class are instances of that class. What classes belong in an application depends on the application domain. They might be real world entities (cars, people, machines in factories, products in warehouses, asteroids, ...) or concepts (mathematical points in 3D space).

There are lots of different kinds of cars that could be represented by the same class, and share the same code. Classes define attributes and methods which form some kind of approximation or projection of the real world thing or idea. What attributes you choose depends on what the application will need to do with the objects. A set of methods forms a public interface to the objects of a given class; attributes are not accessed directly.

A method is a function that:

operates on an instance
can read or write information from the instance's fields
can call other methods to do part of its work
can ask another object to do part of its work (via its public interface)
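The four properties in the list above can be seen together in one small sketch. Team and Player are hypothetical classes chosen only because "teams" were mentioned earlier as plausible objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes for illustration.
class Player {
    private final String name;
    private int points = 0;

    Player(String name) { this.name = name; }

    public String getName() { return name; }
    public void addPoints(int n) { points += n; }   // writes a field
    public int getPoints() { return points; }       // reads a field
}

class Team {
    private final List<Player> roster = new ArrayList<>();

    // Operates on an instance: "this" team's roster is the one modified.
    public void join(Player p) { roster.add(p); }

    public int totalPoints() {
        int total = 0;
        for (Player p : roster) {
            // Asks another object to do part of the work, going through
            // its public interface rather than reading p.points directly.
            total += p.getPoints();
        }
        return total;
    }

    public double averagePoints() {
        // Calls another method of the same object to do part of its work.
        return roster.isEmpty() ? 0.0 : (double) totalPoints() / roster.size();
    }
}

class TeamDemo {
    public static void main(String[] args) {
        Player a = new Player("Ann");
        Player b = new Player("Bob");
        a.addPoints(10);
        b.addPoints(20);
        Team t = new Team();
        t.join(a);
        t.join(b);
        System.out.println(t.totalPoints() + " points, average " + t.averagePoints());
    }
}
```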

lecture #3 began here

Book Learning!

Read: Lethbridge Chapter 2!

It is titled "review of object orientation", but it is OK if it is not review for some of you.

Book Learning!

Read: Lethbridge Chapter 7! This is all about user-centered design and use cases, which I believe should come first as a way of trying to identify the classes we will need for OO class design later.

Homework #1

Let's get started!

Boehm's Top 10

Barry Boehm is one of the titans of the software engineering field, a leading proponent of such concepts as the "Spiral Model".

Finding and fixing a software problem after delivery of the product is 100 times more expensive than defect removal during requirements and early design phases.
Nominal software development schedules can be compressed up to 25% (by adding people, money, etc.) but no more.
Maintenance costs twice what development costs.
Development and maintenance costs are primarily a function of size, e.g. the number of source lines of code, in the product.
Variations in humans account for the greatest variations in productivity.
The ratio of software to hardware costs went from 15:85 in 1955 to 85:15 in 1985 and continues to grow in favor of software as the dominant cost.
Only about 15% of the development effort is in coding.
Application products cost three times as much per instruction as individual programs; systems software products cost nine times as much.
Walk-throughs catch 60% of the errors
Many software processes obey a Pareto distribution:
- 20% of the modules contribute 80% of the cost
- 20% of the modules contain 80% of the errors!
- 20% of the errors consume 80% of the repair budget
- 20% of the modules take 80% of the execution time
- 20% of the tools are used 80% of the time

Source: Barry Boehm, Industrial Metrics Top 10 List, IEEE Software 4:5, Sept. 1987.

Use Cases and Class Extraction

You can identify classes from a software specification document by looking for "interesting" nouns, where interesting implies there are some pieces of information to represent in your application, and operations to perform on them. You can also identify classes by developing use cases from the specification document. Use cases are formatted descriptions of "discrete" tasks. By "discrete", we mean an individual standalone thing a user does while using the system.

If you look through the tasks mentioned in a specification document, you can identify a set of candidates. Example candidate tasks for a "wargame":

Combat
Roll dice
Move pieces
Perform the Missions Phase

Example candidate tasks for Parker Brothers' Monopoly game:

Buy property
Roll dice
Move piece
Count money
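As a rough sketch of where such a task list can lead, the nouns in the Monopoly tasks (property, dice, piece, money) suggest a few candidate classes, and the verbs suggest their methods. Everything below is a simplified assumption for illustration, not a design for the real game; the names Dice, Property, and MonopolyPlayer are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical candidate classes; rules are heavily simplified.
class Dice {
    private final Random rng;

    Dice(long seed) { this.rng = new Random(seed); }

    // "Roll dice": the sum of two six-sided dice.
    int roll() {
        return (rng.nextInt(6) + 1) + (rng.nextInt(6) + 1);
    }
}

class Property {
    final String name;
    final int price;
    MonopolyPlayer owner;   // null until someone buys it

    Property(String name, int price) {
        this.name = name;
        this.price = price;
    }
}

class MonopolyPlayer {
    static final int BOARD_SIZE = 40;   // squares on a standard board

    private int money;
    private int position = 0;
    private final List<Property> deeds = new ArrayList<>();

    MonopolyPlayer(int startingMoney) { this.money = startingMoney; }

    // "Buy property": fails if it is already owned or too expensive.
    boolean buy(Property p) {
        if (p.owner != null || money < p.price) {
            return false;
        }
        money -= p.price;
        p.owner = this;
        deeds.add(p);
        return true;
    }

    // "Move piece": wrap around the board.
    void move(int steps) { position = (position + steps) % BOARD_SIZE; }

    // "Count money".
    int countMoney() { return money; }

    int getPosition() { return position; }
}
```

A typical turn would then read `player.move(dice.roll())` followed by an optional `player.buy(...)`, which is close to how the tasks were phrased in the first place.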

Entire books have been written about use cases. Use cases are also described in Chapter 11 of the Unicon book; some of today's examples may be found there.

Terminology

actor: role that an external entity plays in a system
use case (or just "case"): depiction of some aspect of system functionality that is visible to one or more actors.
extension: a use case that illustrates a different or deeper perspective on another use case
use: a use case that re-uses another use case.

We have previously introduced use cases and use case diagrams. Lethbridge defines a use case as:

A use case is a typical sequence of actions that an actor performs in order to complete a given task.

Now we will expand on the discussion of use cases, use case diagrams, and look at examples.

Use Case Descriptions

Drawing an oval and putting the name of a task in it is not very helpful by itself; for each use case you need to add a detailed use case description. Your first homework assignment is to "go and do this" for your semester project.

Section 7.3 of the text explains the format of use case descriptions. Each use case has many or all of the following pieces of information. The items in bold would be found in any reasonable use case description.

Name: The name of the use case.
Actors: What participants are involved in this task.
Goals: What those people are trying to accomplish.
Preconditions: The initial state or event that triggers this task.
Summary: Short paragraph stating what this task is all about.
Related use cases: What use cases does this use case use or extend? What uses/extends this use case?
Steps: The most common sequence of actions that are performed for this task. Lethbridge divides actions into two columns: user input is given in the left column, while system response is given in the right column. The two column format is optional, but saves on paper and may improve clarity. The steps are numbered, so there is no ambiguity in using both columns on each line.
Alternatives: Some use cases may vary the normal sequence of steps.
Postconditions: what does this task produce?

A simple generic use case for a "file open" operation might look like:

Open File
Summary: A user performs this task in order to view a document. The user specifies a filename and the document is opened in a new window.
Steps:

1. Choose "Open" from the menu bar.
2. System displays a File Open dialog.
3. User selects a filename and clicks "OK".
4. System closes the dialog and opens the file in a new window.

Alternative: If the user clicks Cancel in step 3, no file is opened.

Lethbridge-style two column format is nicely motivated in Example 7.2 from the text, which has enough steps that two columns save enough space to matter. When you start having trouble fitting the whole use case description on a page, there are substantial benefits to a compact format.

Exit parking lot, paying cash
Actor: car driver
Goal: to leave the parking lot
Precondition: driver previously entered the parking lot, picked up a ticket, and has stayed in the lot long enough that they must pay to leave.
Summary: driver brings their vehicle to an exit lane, inserts their ticket into a machine, and pays the amount shown on the machine.
Related use case: exit parking lot, paying via credit card.
Steps:

User input	System response
1. Drive to exit lane, triggering a sensor.	2. System prompts driver to insert their ticket.
3. Insert ticket.	4. System displays amount due.
5. Insert money into slot until cash in exceeds amount due.	6. System returns change (if any) and raises exit barrier.
7. Drive through exit, triggering a sensor.	8. System lowers exit barrier.

Alternative: User crashes through exit barrier with rambars on front of truck in step 1. (just kidding)

Lethbridge Example 7.3 gives you one more look at use case descriptions. This one is for a library management application.

Check out item for a borrower
Actor: Checkout clerk (regularly), chief librarian (occasionally)
Goal: Help the borrower borrow the item, and record the loan
Precondition: The borrower wants to borrow a book, and must have a library card and not owe any fines. The item must be allowed for checkout (not on reserve, not from reference section, not a new periodical, etc.)
Steps:

User input	System response
1. Scan item's bar code and borrower's library card.	2. Display confirmation that the loan is allowed, give due date.
3. Stamp item with the due date.	
4. Click "OK" to check out item to borrower.	5. Record the loan and display confirmation that record has been made.

Alternative: the loan may be denied for any number of interesting reasons in step 2 (see preconditions).

Now, ask yourself: why is Dr. J convinced that use case descriptions are a vital software engineering job in the initial stages of requirements analysis? The first person to tell me can stop by for a delicious San Saba chocolate covered almond in my office.

Use Case Diagrams and Examples

One reason to do a use case diagram is simply to summarize or catalog what tasks are part of the system; a sort of table of contents. But the main reason use case diagrams exist is to show who does what, when different users (actors) participate in different (overlapping) tasks. If you only have one actor, or there are no tasks in which multiple actors interact, there may be no reason that you have to do a use case diagram.

Consider the textbook example, Figure 7.1.

There are three actors (Registrar, Student, Professor), and there are five use cases. The "Find information about course" use case is vague and probably the three actor types can find out different information from each other. They are not typically involved in the same instance of finding out information about a class, so the example could be better.

Figure 7.2 illustrates a bunch of more exotic use case diagram items, namely actors and use cases that use or extend other actors and use cases.

Unlike last semester's textbook, Lethbridge covers the main thing about use cases, the use case descriptions, fairly well. For use case diagrams, we need to look for or construct some more examples. Lethbridge omits one interesting category of actor in use case diagrams, namely external system actors. A computer program may interact with external entities that are not humans; they may be remote database servers, for example.

Figures 11-1 and 11-2 of the Unicon book give some more examples of use cases.

lecture #4 began here

Class Diagrams

Class diagrams are the "meat and potatoes" of object-oriented analysis and design. We will begin our discussion of them today, and continue next week. Class diagrams describe more detailed, more implementation-oriented things than use case diagrams.

Class diagrams can present varying levels of detail about the classes in them. Some class diagrams may have nothing more than the class name for each class; others may hold the full list of fields and methods. When more space is taken by class details, there is room for fewer classes per diagram, so you often have "overview diagrams" that show many classes and their connections, supplemented by "detail diagrams" that show more information about closely related classes.

Associations

Perhaps the main purpose for class diagrams is to identify and depict relationships between objects that will be needed in the running system. An association is the word we use for such a relationship. We draw a line between the rectangles for classes to depict an association. There are three major types of associations:

inheritance
aggregation
user defined

Inheritance: the Un-Association

We have discussed how inheritance is not really an association: it is a relationship between kinds of things, in the design and maybe in the programming language type system, whereas associations are relationships between instances (objects) at run-time. Inheritance is so vital that many class diagrams focus specifically on a large inheritance class hierarchy, similar to a biological taxonomy of species. Inheritance is usually a static feature of a design, although there exist languages in which instances can change who they inherit from at runtime.
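In Java, for instance, inheritance is written into the class declarations and fixed at compile time, which is what makes it a relationship between kinds rather than between instances. A small sketch with hypothetical shape classes (not the hierarchy from the textbook):

```java
// Hypothetical three-level hierarchy for illustration.
abstract class Shape {
    abstract double area();

    // Inherited unchanged by every subclass.
    String describe() {
        return getClass().getSimpleName() + " with area " + area();
    }
}

class Rectangle extends Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override
    double area() { return width * height; }
}

// "Square is a kind of Rectangle" is stated once, in the declaration.
// No Square object can later decide to inherit from something else.
class Square extends Rectangle {
    Square(double side) {
        super(side, side);
    }
}

class ShapeDemo {
    public static void main(String[] args) {
        Shape[] shapes = { new Rectangle(2.0, 3.0), new Square(2.0) };
        for (Shape s : shapes) {
            System.out.println(s.describe());
        }
    }
}
```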

Here is an example class hierarchy from the Lethbridge book (chapter 2):

Aggregation: the Simplest Association

Aggregation, the parts-whole relationship, is perhaps the most useful association of all of them. Many many complex things are made up of an assembly of simpler items. There are at least two flavors of aggregation, static and dynamic. Static aggregation is lifelong aggregation; the parts cannot exist apart from the whole, or enter or leave the whole. Dynamic aggregation is more like a team whose members can come and go. Here is an example of a massive chain of aggregations with a familiar theme:

Association Details

There are many details added to associations to show more information about the relationship. Some of these details are discussed in Chapter 5 in your text.

link: just as classes have instances at runtime called objects, associations have instances at runtime called links. Links occasionally are so important and complicated that they need their own attributes. The main information about them is usually their lifetime, and what instances they are connecting.
multiplicity: a.k.a. cardinality, it is the number of object instances per link instance in a given relationship
qualifier: some many-to-one relationships have a unique key used to traverse the association.
roles: the different ends of an association may have differing roles associated with them. Especially useful if both ends of an association connect the same class.
composition: there is a special kind of aggregation called composition, which denotes aggregations in which the component parts have no existence apart from the whole thing. The relationship is hardwired, static, or constant. Composition is marked using a filled diamond; hollow diamond means a regular (transitory, or dynamic) aggregation.
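Java has no keyword for either diamond, so the difference between composition and ordinary aggregation shows up only in who creates the parts and who is allowed to add or remove them. A minimal sketch, using hypothetical House/Room and Club/Member classes:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Room {
    final String label;
    Room(String label) { this.label = label; }
}

// Composition (filled diamond): the House creates its own Rooms and never
// lets outside code add or remove them.
class House {
    private final List<Room> rooms = new ArrayList<>();

    House(String... labels) {
        for (String label : labels) {
            rooms.add(new Room(label));
        }
    }

    // Read-only view: callers can look at the parts but not change the set.
    List<Room> rooms() {
        return Collections.unmodifiableList(rooms);
    }
}

class Member {
    final String name;
    Member(String name) { this.name = name; }
}

// Aggregation (hollow diamond): Members exist independently of any Club
// and may come and go.
class Club {
    private final List<Member> members = new ArrayList<>();

    void join(Member m) { members.add(m); }
    void leave(Member m) { members.remove(m); }
    int size() { return members.size(); }
}

class AggregationDemo {
    public static void main(String[] args) {
        House h = new House("kitchen", "hall");
        Member ann = new Member("Ann");
        Club chess = new Club();
        chess.join(ann);
        chess.leave(ann);   // ann still exists after leaving the club
        System.out.println(h.rooms().size() + " rooms, " + chess.size() + " members");
    }
}
```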

Here are some more class diagram figures from some old software engineering text (Pfleeger). One point here is that it is logical to start with a simple sketch of classes and basic relationships, and add many details later on.

As a larger example of class diagrams and associations, consider a previous semester's project. They produced two overlapping class diagrams, one focusing mainly on cards and card decks and one focusing on characters, units, and the map. We can look at these two diagrams and consider what they did right, and what needs to be changed for the campaign game. We can also work, as an example, some of the classes and relationships for the Monopoly game.

lecture #5 began here

User-defined Association Examples

Here is an association you might see in a human resources application:

Person

employee employer
Works-for

Company

What are some example instances of this association?

Here is a more detailed version of that association:

Person
name
SSN
address
salary
job title
employee employer
* Works-for
Company
name
address

There is a multiplicity, since many people may work for the same company. But what if a given person works for more than one company?
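One possible Java rendering of the diagram as drawn (many employees per company, at most one employer per person) is sketched below. Only the name attributes are carried over; the method names are assumptions, and the code makes no attempt to settle the multiple-employer question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Company {
    private final String name;
    private final List<Person> employees = new ArrayList<>();   // the "*" end

    Company(String name) { this.name = name; }

    String getName() { return name; }

    List<Person> getEmployees() {
        return Collections.unmodifiableList(employees);
    }

    // Package-private: meant to be called only from Person.setEmployer,
    // so that both ends of each link stay consistent.
    void addEmployee(Person p) { employees.add(p); }
    void removeEmployee(Person p) { employees.remove(p); }
}

class Person {
    private final String name;
    private Company employer;   // the "employer" role: zero or one

    Person(String name) { this.name = name; }

    String getName() { return name; }
    Company getEmployer() { return employer; }

    // Creates, replaces, or (with null) removes this person's Works-for link.
    void setEmployer(Company c) {
        if (employer != null) {
            employer.removeEmployee(this);
        }
        employer = c;
        if (c != null) {
            c.addEmployee(this);
        }
    }
}

class WorksForDemo {
    public static void main(String[] args) {
        Company acme = new Company("Acme");
        Person ann = new Person("Ann");
        Person bob = new Person("Bob");
        ann.setEmployer(acme);   // link (ann, acme)
        bob.setEmployer(acme);   // link (bob, acme)
        System.out.println(acme.getName() + " employs " + acme.getEmployees().size());
    }
}
```

Each successful call to setEmployer with a non-null company corresponds to one link, that is, one instance of the Works-for association.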

Here is an association you might need for a geography application:

Country
name
Has-capital
City
name

Now, what are some examples of this association? Give me some instances -- and their "links". To include more information in this association, we need to know:

How many capitals can a country have?
How many countries can a city be capital of?
Does every country have a capital? Vice-versa?

lecture #6 began here

Reading

For coverage of this week's lectures, read Lethbridge Chapter 8, ESPECIALLY Section 8.2.

Statecharts

A statechart, or state diagram, depicts dynamic properties of a system. See p. 276 of the text. A statechart consists of

a set of states: drawn as circles, ovals, or rectangles, with a label or number inside.
a set of transitions: drawn as arrows from one state to another.
a start state, and a set of final states

Some of you may be getting an eerie sense of deja vu at this point. Statecharts are a non-trivial extension of finite automata, because they have:

instead of "input symbols", events associated with transitions: these may be complex, synchronous or asynchronous
events may have conditions, drawn inside square brackets
activities
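To make states, events, and conditions concrete, here is a sketch of a statechart coded by hand. It is loosely based on the parking lot exit use case from earlier; the particular states, event names, and the use of cents are assumptions, not a diagram from the text.

```java
// Hypothetical statechart for a parking lot exit gate.
enum GateState { IDLE, AWAITING_TICKET, AWAITING_PAYMENT, BARRIER_UP }

class ExitGate {
    private GateState state = GateState.IDLE;   // start state
    private int amountDue;                      // in cents
    private int cashIn;                         // in cents

    GateState getState() { return state; }

    // Each method is an event. An event that has no transition out of the
    // current state is simply ignored.
    void carArrives() {
        if (state == GateState.IDLE) {
            state = GateState.AWAITING_TICKET;
        }
    }

    void ticketInserted(int amountDueCents) {
        if (state == GateState.AWAITING_TICKET) {
            amountDue = amountDueCents;
            cashIn = 0;
            state = GateState.AWAITING_PAYMENT;
        }
    }

    // Event with a condition: the transition to BARRIER_UP is taken only
    // when the guard [cashIn >= amountDue] holds.
    void moneyInserted(int cents) {
        if (state != GateState.AWAITING_PAYMENT) {
            return;
        }
        cashIn += cents;
        if (cashIn >= amountDue) {
            state = GateState.BARRIER_UP;   // activity: raise barrier, return change
        }
    }

    int changeDue() {
        return state == GateState.BARRIER_UP ? cashIn - amountDue : 0;
    }

    void carLeaves() {
        if (state == GateState.BARRIER_UP) {
            state = GateState.IDLE;   // activity: lower barrier
        }
    }
}
```

Unlike a plain finite automaton, the object carries extra data (amountDue, cashIn) that the guard consults, so the same event can either stay in AWAITING_PAYMENT or move on.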

Statechart Diagram Examples

lecture #7 began here

HW1

More Statechart Examples

Next week we will work on statechart examples relevant to your semester project.

Collaboration Diagrams (Chapter 9)

Collaboration diagrams are the next form of UML diagram we will describe in this course. Collaboration diagrams describe how a set of objects interacts by calling from method to method.

Collaboration Diagram Examples

A collaboration where an actor pushes a button to get an elevator to his floor. The control object checks how long the job queues of all the elevators are and chooses the shortest. It then creates a job order object and invokes it by putting it in a queue. The elevator object runs concurrently and picks up jobs from the queues. The elevator is an active object, meaning that it executes concurrently with its own thread of control.

The object MainWindow receives the message NewCustomer, and creates a Customer object. A CustomerWindow is created, and the customer object is then passed to the CustomerWindow which allows for update of the customer data.
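The MainWindow collaboration just described might correspond to code along these lines. The message numbers in the comments are an assumed numbering, and the updateName method stands in for whatever editing the real window would allow.

```java
// Hypothetical classes matching the object names in the description.
class Customer {
    private String name = "";

    String getName() { return name; }
    void setName(String name) { this.name = name; }
}

class CustomerWindow {
    private final Customer customer;

    // The Customer is passed in, so the window edits the same object that
    // MainWindow created rather than a copy.
    CustomerWindow(Customer customer) {
        this.customer = customer;
    }

    Customer getCustomer() { return customer; }

    // Stand-in for the user updating the customer data in the window.
    void updateName(String newName) {
        customer.setName(newName);
    }
}

class MainWindow {
    // Message 1: newCustomer() arrives at MainWindow.
    CustomerWindow newCustomer() {
        Customer c = new Customer();      // 1.1: create the Customer
        return new CustomerWindow(c);     // 1.2: create the window, passing c
    }
}

class CollaborationDemo {
    public static void main(String[] args) {
        CustomerWindow w = new MainWindow().newCustomer();
        w.updateName("Pat");
        System.out.println(w.getCustomer().getName());
    }
}
```

Each arrow in the diagram becomes a method call or constructor call, and the objects at the two ends of the arrow become the caller and the receiver.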

A collaboration diagram that summarizes sales results.

A Collaboration Diagram for a Printer Server

Collaboration Diagram with Simple Numbering

Collaboration Diagram with Decimal Numbering

lecture #8 began here

Object Oriented Design: Adding Detail

This week's lectures will include material from Lethbridge Chapter 5. You can view object oriented design as a process of adding detail to class diagrams. We will look at as many examples of this process as we can.

Where Do We Go Next?

For detailed design, we need to reorganize/regroup and assign teams to go into the details of various aspects of gameplay. A Big Question we need to decide (today) is: one team, or two?

Homework #3: Detailed Design

You are dividing labor right now for work on phase 2, adding details and corrections to your project analysis, and merging ideas you find in other teams' homework #2. Make sure your names are on any submitted course documents. In fact, I need to know who did which parts of which documents, so I don't lump everyone together under the same grade. If you are asked to prepare part of a document you don't understand, better ask your classmates and/or instructor what is needed for that part.

lecture #9 began here

User Interface Design

By the next round of turnin, we will need to establish a fairly complete user interface design for things like the main screen. User Interface Design is the subject of an entire course (CS 485) and for our purposes we will have to settle for a rudimentary and primitive introduction.

User interface design starts from what tasks/activities the application is to support. You will probably discover a few tasks in this phase that we haven't identified previously and that require a dialog. But mainly we need to design dialogs and sequences of actions to perform the specific tasks in our use cases.

Aspects of User Interfaces

look: this is the most obvious part of user interface design, but not the most important part
feel: which clicks perform which operations, and how many clicks each one takes. Does it feel like you are directly manipulating the objects on the screen, or like you are following a long sequence of orders you receive from the program?
metaphors: users can quickly learn an unfamiliar task, or quickly interpret an unfamiliar graphic, if a familiar metaphor is used. Examples: "desktop metaphor"
mental model: a user interface provides the user with a particular mental model of how they view the system. designing that model will determine many aspects of the user interface (what info to show, what tasks to support)
navigation rules: navigation through large structures which don't all fit on the screen is a central issue for many (most) applications.

A few Obvious User Interface Tips

Minimize # of clicks for common tasks
Provide all the information that's needed on a single screen
Strive for "direct manipulation"
Modeless is usually better than modal
Be familiar and consistent with other applications

Interpersonal Communications: Some Rules of Engagement

1. Respect your classmates, even when you disagree or they are wrong. "Treat others the way you would like to be treated" - Jesus. I am not impressed by, and will not tolerate for long, group "leaders" who disrespect their teammates publicly. If you have a problem with one of your team members' contributions, discuss it with them privately. If you cannot resolve it through polite discussion with the individual, discuss it RESPECTFULLY within your group, and if there is a problem that can't be resolved internally, see me. Part of your grade will be based on whether I determine that you respected your classmates or not.
2. Accept group decisions even when you disagree. "The Needs of the Many Outweigh the Needs of the Few...or the One" - Spock. There has to be some mechanism for making decisions, whether it is democracy, dictatorship, or whatever. Those decisions should be made based on what's best for the group, not what makes an individual look good.
3. You must include all group members in decisions. I want to hear from no more team members who are surprised about something that affects them.
4. You should do your best to contribute to your team. "From each according to his abilities" - Marx. The easiest way to fail this course is to not contribute to your team. If you do your best, make your contribution, and the team discards it, that is not your problem or fault. If you don't do your best to help your team succeed, don't be surprised at the grade you get.
5. E-mail is not a good medium for resolving problems. I have found through many long years that e-mail does not work well at conveying emotions. Using e-mail to try to resolve problems can easily make them worse. Of course, sometimes you have no choice, but basically e-mail is easily misinterpreted. Human faces and intonation are lost, and people do not type as well as they talk. When there is a problem, your best bet is to e-mail to set up a meeting to discuss it. Your next best bet is to think, and rethink, what you are planning to send by e-mail. Ask: how will this person react to this e-mail? Have I respected them? Will they understand my situation? Will they feel I am attacking them, or trying to help?

Example of how not to use e-mail for interpersonal communications:

From: ralph
To: cjeffery
Date: Wed, Apr 22
Subject: Carping
I'm more than a bit tired of beating you about the ears in hopes that you'll rearrange your priorities, work habits, or whatever it takes to get your research on track.
I'll assess the situation in a couple of weeks. If I'm still not satisfied with your progress, I'll put it in writing.

This e-mail may have accomplished a certain motivational goal, but it did not improve the working relationship between sender and recipient.

Why are collaboration diagrams useful?

Collaboration diagrams help connect the use cases and classes. Each scenario (== sequence of steps for a use case description) can lead to a collaboration diagram.

Design Buzzwords and Vague Concepts

Guess what, the book talks about design. Assignment: read Chapter 5 in full at this point if you haven't already. It gives step-by-step procedures for performing the tasks of detail design you need for your project homeworks 3-4.

Here are some buzzwords and ideas that relate to design:

Design methods

1. modular decomposition: top-down breaking up function into parts
2. data-oriented decomposition: top-down breaking up information into parts
3. event-oriented decomposition: identifying what changes are to be made, and when they occur
4. outside-in design: blackbox I/O orientation
5. object-oriented design: relationships between data

Things that get designed

1. Architecture: interaction between programs and their environment, including other programs
2. Code: algorithms and data structures, starting with equations, pseudocode, etc.
3. Executable/package: how is this system going to be installed and run on user machines?

Common architectures

Pipes, layers, client-server, peer-to-peer, ring ...

Some common breakdowns

poor prioritization
failure to consider constraints on the solution (missing requirements)
failing to perform mental simulations of complex multi-step activities
failing to track and return to subproblems which aren't solved yet
failing to expand/merge/integrate subsolutions into a complete whole

"Good" Design

Book material on design resides in chapters 7-9; skim them and look for good parts.

Low coupling: Coupling refers to the interdependences between components. Components need to be as independent as possible. The book defines many kinds of coupling, including content coupling, control coupling, stamp coupling, and data coupling.
High cohesion: Cohesion refers to the degree to which a component is focused and connected internally (it is almost "internal coupling"). Bad cohesion has a single component doing unrelated tasks. Bad cohesion may coincide with lots of duplicate code (the same thing repeated with slight changes for different tasks). The book defines levels of cohesion: coincidental, logical, temporal, procedural, communicational, sequential, functional.
Minimal complexity: There are several types of complexity, but in general, complexity is bad, and the goal is to minimize it while meeting requirements. Most of the complexity measures that are out there measure the complexity of code, but we are talking about design right now. Designs that are complex, or designs that poorly address the application domain and requirements, lead to complex code. Bad programmers can of course create complex code from even good designs.
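
As a rough illustration of the coupling idea above, the sketch below contrasts a caller that reaches into another class's data with one that goes through a small interface. The Account and Teller classes are hypothetical names invented for this example; they are not from the textbook.

```java
// Hypothetical class used only to illustrate coupling.
class Account {
   int cents;                          // exposed internal representation

   // A small interface that callers can depend on instead of the field.
   void deposit(int amount) { cents += amount; }
   int balance() { return cents; }
}

class Teller {
   // Tighter coupling: depends on how Account stores its data, so a
   // change to that representation breaks this method.
   static void tightDeposit(Account a, int amount) { a.cents += amount; }

   // Looser coupling: depends only on simple data passed to a method.
   static void looseDeposit(Account a, int amount) { a.deposit(amount); }
}
```

Both methods produce the same balance today; the difference is how much of Teller must change if Account's representation changes.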

lecture #10 began here

Good/Bad Design Examples

Vince Kellen's Coupling and Cohesion Examples give examples of bad cohesion, indicated by excessive module length/size, along with other goodies.
Our textbook has an on-line "knowledge base" including a set of design principles and a software quality section.
Wikipedia entries on coupling and cohesion give some concrete examples

Comments on Homework

Produce a readable document: I have to be able to read your hard copies, I really mean it! If you can't format your diagrams to fit the page, or print a lot of text so small I can't read it with my glasses on, or don't have enough toner in your printer: fix it.
Avoid "buzzword infection": Students learning software engineering or UML are exposed to many new terms. When writing software engineering documentation, keep the technical buzzwords out of your application domain descriptions unless they really belong there. Example: in use cases you learn about actors, so in your descriptions of the application domain, the word "actors" starts being used to refer to many non-actor things.

CVS and Version Control

CVS stands for "Concurrent Versions System". It is a software version control system, whose primary function is to aid in the coordination of programmers on large projects. Compared with earlier tools SCCS (source code control system) and RCS (revision control system), CVS offers some significant advantages:

It allows all programmers to edit any file at any time. Earlier tools used a "locking" system to allow only one programmer to edit a file at a time.
It semi-automatically merges changes by multiple programmers; if the edits do not conflict it is fully automatic, and if the edits are to the same place in the program, it notes the conflict, shows both versions, and requires the programmer(s) to resolve the conflicts manually.
CVS works on multiple platforms (e.g. UNIX and Windows) and since it is open source, everyone can use it. Previous systems were not very portable (RCS) or proprietary and commercial (SCCS, PVCS, etc).
CVS works over the internet, making it awesome for coordinating the development of public open source projects with personnel scattered around the world.

For these good technical and political reasons, CVS will be our source code control system of choice. There is a newer system called subversion which is said to compete favorably with CVS; feel free to check it out, it seems to be on our system, and to use mostly the same commands as CVS.

Major CVS Commands

CVS works using a "repository" which is a database for source files. You can specify which repository to use via a command line option, but it is usually done by setting a CVSROOT environment variable. This can be set to either a local directory or an internet machine and directory name. When you checkout a project from a repository, your project directory remembers which repository it belongs to, so there is no problem using CVS with many different projects in different repositories; you don't have to keep resetting CVSROOT all the time.

Unless you are creating your own repository, the first command you need is

cvs checkout projectname

which grabs a copy of a named project from the repository. The various cvs commands that manipulate the repository have the syntax

   cvs command [filenames...]

The other commands you need immediately for CVS include:

cvs diff [filenames...]: Show any differences between your file and the version in the repository
cvs update [filenames...]: Merge in any changes others have committed to the repository. If you have changed lines that others have changed, the conflict is reported and both copies of the changed lines are left in for you to merge by hand.
cvs commit [filenames...]: Merge your changes into the repository.
cvs log [filenames...]: Show history of changes that were made to a file or files.

There are a few other CVS commands, and command-line options, that you may find useful; read the manuals! One option of special interest is -r tag, which lets you ask for older versions from the repository instead of the current version. This may help if the current repository gets broken. :-) Use it with care, however; when you go back to an earlier version, the repository doesn't think any changes you make apply to the current version. You may need to use Dr. J's "Texas Three-Step".

Demonstration of CVS on the current class project

HW #4

lecture #11 began here

Comments on Homework

States in statecharts must correspond to (sets of) values of attributes in some class: make explicit
Arrows/transitions in statecharts must be identified/labeled with the events that cause them.
Methods in many use case descriptions need to migrate to class diagrams
Some teams will need to split class diagram into multiple pieces
Do not just copy text from rules (duh) -- analyze and adapt
Many HW3 use case diagrams still have things that aren't use cases in them
User tasks != character "tasks" in the simulation
Tie your credits a bit closer to your materials

Implementing UML Constructs

At the beginning of the semester, we said... languages vary in their ability to implement OO designs.

We had a lab where there was some UML-to-C++ going on, but we need a lot more details in our designs, and we need a lot more practice with going from UML to code. Now let's return to that topic and consider the implementation of associations in your semester projects.
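
As one possible starting point, a one-to-many association that can be navigated in both directions might be sketched in Java as below. The Team and Player classes and their methods are hypothetical names for illustration; they are not taken from the class project.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical "one" side of a one-to-many association.
class Team {
   private final String name;
   private final List<Player> players = new ArrayList<>();

   Team(String name) { this.name = name; }

   String getName() { return name; }

   // Adding a player updates both ends so the link stays consistent.
   void addPlayer(Player p) {
      if (p.getTeam() != null) {
         p.getTeam().players.remove(p);   // leave the old team first
      }
      players.add(p);
      p.setTeam(this);
   }

   int playerCount() { return players.size(); }
}

// Hypothetical "many" side: each Player belongs to at most one Team.
class Player {
   private final String name;
   private Team team;   // null until the player joins a team

   Player(String name) { this.name = name; }

   String getName() { return name; }
   Team getTeam() { return team; }

   // Intended to be called only from Team.addPlayer.
   void setTeam(Team t) { team = t; }
}
```

The design choice here is to put the bookkeeping in one method (addPlayer) so the two ends of the association cannot disagree.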

Filenames

Go with InitialCaps on classnames and filenames. Unless you use 8.3 filenames, you don't have infinite portability, but that is OK.

Documenting Code

I have alluded to JavaDoc; is there a similar tool for C++?

JavaDoc comments look like this example. They must be placed immediately BEFORE the thing they describe. The first sentence should always be a summary of the described entity.

/**
 * This is a beautiful JavaDoc comment.
 */

Our UML documentation can be referenced (once we rename doc/ to doc-files/ in our project) with a comment like:

/**
 * The UML class diagram for this code is
 * <A href="doc-files/myclass.html"> here </A>
 */

JavaDoc comments can include a large number of interesting "tags", signalled by the at sign (@). Some are standalone (@see, @deprecated, etc) and some are "in-line", meaning they are enclosed by curly braces in order to clearly mark the end of the tag, as in {@link #getDiplomacy()}.

For our class the best tags are: @author, {@link...}, @param, @return, @see, and @version. There are other tags; read the man page.
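
A small sketch of how these tags might look on a class and its methods follows. The Health class, its members, and the author name are hypothetical, made up only to show where each tag goes.

```java
/**
 * Tracks the hit points of one character in a game.
 * This class is a hypothetical example, not project code.
 *
 * @author A. Student
 * @version 1.0
 */
class Health {
   private int points;

   /**
    * Creates a health record with a starting value.
    *
    * @param start initial hit points; negative values are treated as 0
    */
   Health(int start) {
      points = Math.max(0, start);
   }

   /**
    * Returns the current hit points.
    *
    * @return the current hit points, never negative
    * @see #damage(int)
    */
   int getPoints() {
      return points;
   }

   /**
    * Subtracts damage, stopping at zero. Use {@link #getPoints()} to
    * read the result.
    *
    * @param amount hit points to remove
    * @return true if the character still has hit points left
    */
   boolean damage(int amount) {
      points = Math.max(0, points - amount);
      return points > 0;
   }
}
```

Note that the first sentence of each comment is the summary, as described above.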

lecture #12 began here

Midterm Exam: let's go ahead and see what you know on Friday March 10. Review text chapters 2,4,5,7,8,9, focusing on sections that relate to material discussed in class or activities done in homeworks or labs.

CVS Repository

To create it I had to say

mkdir ~/371
mkdir ~/371/CVSROOT
cd ~/html/courses/371/fps # where my sources were
cvs import FPS mumbledy fumbledy
chgrp -R cs371 ~/371

To get your copy you should say

setenv CVSROOT /home/uni1/jeffery/371
newgrp cs371
cvs checkout FPS

To modify files in the repository

newgrp cs371
cvs update, commit, etc.

A Few Nuggets from Lethbridge

Avoid unnecessary generalizations. At what point do you stop making subclasses, and just handle differences in instances, if any, via a field value?
Have a generalization key.
Constraints; OCL exists
Tasks in developing a UML model are more than just drawing classes and relationships. They include identifying responsibilities for each class, which help lead to the methods the class needs.
Class criteria include: don't define two classes with different names, if one of them is really the same thing as the other. Don't define classes for things that should be instances of other classes. Don't define classes for things that have no meaningful data. Don't define classes for things that have no meaningful behavior (i.e. methods). Don't define classes for things that are too vague. Don't define classes for things irrelevant to this application domain or this particular application within that domain.

lecture #13 began here

CVS check: 6 students have done enough lab work to stick their names into credits.cpp:

#include <iostream>

// Prints the project credits, one contributor per line.
void credits()
{
   std::cout << "FPS - a CS371 project" << std::endl;
   std::cout << "Clint Jeffery" << std::endl;
   std::cout << "Michael Simmons" << std::endl;
   std::cout << "Scott Blauert" << std::endl;
   std::cout << "Gustavo Jimenez" << std::endl;
   std::cout << "Jeston Uhl" << std::endl;
   std::cout << "Curtis Wyatt" << std::endl;
   std::cout << "Philip Killough" << std::endl;
}

Design Patterns

Some of this material can be found in Lethbridge Chapter 6.

Architect Christopher Alexander wrote extensively about the concept of patterns in building, and software design patterns were inspired by his work. Design Patterns create a common vocabulary for "known good" solutions that seem to recur over and over again. If we do not recognize and exploit these recurrences, we reinvent the wheel over and over again.

Design Patterns in software got popularized by a "Gang of Four" authors (Gamma, Helm, Johnson and Vlissides) who wrote a book in 1995. Like most New Things in software engineering, it spawned a cult following and a legion of wannabes inventing "design patterns" for things like wallpaper. But the original idea is valid and many of the design patterns in the Design Patterns book are legitimate.

The original design patterns book described in detail 23 reoccurring patterns in software, divided into three main categories: creational, structural, and behavioral. Each pattern is described exceedingly well in prose and outlined in a UML class diagram, after which example implementations are sketched in C++ or Smalltalk. Within all three categories a great deal of similarity can be observed, such as the heavy use of abstract classes; enough to suggest the existence of meta-patterns.

The design patterns fad has died down, but the concept of design patterns has been thoroughly institutionalized by the software engineering community.

What is a Design Pattern?

Minimalist Definition

A quad-tuple consisting of:

a pattern name

a description of what problem it solves

a description of the solution

an assessment of consequences and implications of the pattern

Expanded Definition

Name and Classification

Intent

Also Known As

Motivation

Applicability

Structure (e.g. UML)

Participants

Collaborations

Consequences

Implementation

Sample Code

Known Uses

Related Patterns

How Design Patterns Solve Design Problems

finding objects - if the pattern says you need one, you need one

determining granularity - several patterns address granularity explicitly

specifying interfaces - patterns describe part or all of the public interfaces of the classes in them

specifying implementations - patterns may include known-efficient code samples

code reuse - "design reuse facilitates code reuse"

How to Select a Design Pattern

GoF suggest several ways, such as

look for which design problem above affects you, then look for design patterns that pertain to it

scan all the patterns' Intent sections

study how patterns interrelate

study patterns of similar purpose - to tell when to use which

I would just add that, first you familiarize yourself with a bunch of design patterns, and then when doing design you recognize which pattern to use via deja vu.

How to Use Design Patterns

Buy the GoF book, read the pattern in detail

Look at the sample code to get a concrete feel for it

Apply (translate) the pattern Structure section to your application classes

Adapt the sample code when it is appropriate to do so; otherwise write your own

Design Patterns versus Pattern Languages

Students of Christopher Alexander were quick to note that the design patterns book is mostly just a catalog of buzzwords and misses the point. Design patterns should be more about the rules for connecting patterns than about the individual patterns themselves, which we mostly already know or could invent when needed. I started getting into this with a mathematician colleague in San Antonio, but then moved to Las Vegas and then Las Cruces, and have been too busy to pursue this to its appropriate conclusion.

Some Cynical Observations About Design Patterns

Of course these were not new inventions, they were a catalog of tried and true methods. That is OK.

Design Patterns proponents are trying to create a common vocabulary of buzzwords, to reduce the cost of communication and increase the level of understanding when software engineers are talking with one another.

Name That Design Pattern

If we do well enough on this quiz, we have a common vocabulary, and have either already learned design patterns, or don't actually need them.

Provide an interface for creating objects without specifying their concrete classes.

Separate the construction of a complex object from its representation.

Let subclasses decide which class to instantiate for a specified object-creation interface.

Specify objects to create using a prototypical instance; create objects by copying.

Ensure a class has only one instance, and provide a global point of access.

Convert the interface of a class into another interface, expected by clients.

Decouple abstraction from implementation, so the two can vary independently.

Compose objects into tree structures to represent part-whole hierarchies. Treat individuals and composites uniformly.

Attach additional responsibilities to an object dynamically.

Provide a single unified interface to a set of subsystem interfaces.

Use sharing to support large numbers of fine-grained objects efficiently.

Provide a surrogate or placeholder for another object.

Give more than one object a chance to handle an incoming message. Pass the request along the chain until an object handles it.

Encapsulate a request as an object.

Interpret sentences in a language by defining and operating on an internal data structure representation of that language.

Provide a way to access the elements of an aggregate object sequentially.

Define an object that encapsulates how a set of objects interact.

Capture and externalize an object's internal state so that the object can be restored to this state later.

Create a mechanism such that when one object changes its state, all its dependent observers are notified and updated.

Allow an object to alter its behavior when its internal state changes, appearing to have changed its class.

Define a set of encapsulated, interchangeable, algorithms; allow algorithms to vary independently of their clients.

Define the skeleton of an algorithm, deferring some steps to subclasses.

Represent an operation on elements of an object structure; enable new operations without changing the element classes.

lecture #14 began here

The Patterns

Abstract Factory

Provide an interface for creating objects without specifying their concrete classes.

Builder

Separate the construction of a complex object from its representation.

Factory Method

Let subclasses decide which class to instantiate for a specified object-creation interface.
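
A minimal Java sketch of this idea, under the assumption of a made-up game domain: Monster, Orc, Troll, Level, Cave, and Swamp are all hypothetical names, not GoF sample code.

```java
// Product interface and two hypothetical concrete products.
interface Monster { String name(); }
class Orc implements Monster { public String name() { return "orc"; } }
class Troll implements Monster { public String name() { return "troll"; } }

// The creator declares the factory method; subclasses pick the class.
abstract class Level {
   protected abstract Monster createMonster();   // the factory method

   // Written against Monster only; it never names a concrete class.
   String describeEncounter() {
      return "Encounter: " + createMonster().name();
   }
}

class Cave extends Level {
   protected Monster createMonster() { return new Orc(); }
}

class Swamp extends Level {
   protected Monster createMonster() { return new Troll(); }
}
```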

Prototype

Specify objects to create using a prototypical instance; create objects by copying.

Singleton

Ensure a class has only one instance, and provide a global point of access.
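
One common way to write this in Java is sketched below. The Settings class and its volume field are hypothetical; other implementations (lazy creation, enums) are also in use.

```java
// Hypothetical game-wide settings object; only one instance may exist.
final class Settings {
   // Created once when the class is initialized, which the JVM does
   // in a thread-safe way.
   private static final Settings INSTANCE = new Settings();

   private int volume = 5;

   // A private constructor keeps other classes from calling "new".
   private Settings() { }

   // The global point of access.
   static Settings getInstance() { return INSTANCE; }

   int getVolume() { return volume; }
   void setVolume(int v) { volume = v; }
}
```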

Adapter

Convert the interface of a class into another interface, expected by clients.

Bridge

Decouple abstraction from implementation, so the two can vary independently.

Composite

Compose objects into tree structures to represent part-whole hierarchies. Treat individuals and composites uniformly.
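
A small sketch of the part-whole idea, assuming a hypothetical inventory of coins and bags (Item, Coin, and Bag are illustrative names, and the weights are made up):

```java
import java.util.ArrayList;
import java.util.List;

// Common interface for leaves and composites.
interface Item {
   int weight();
}

// Leaf: a single item with a fixed weight.
class Coin implements Item {
   public int weight() { return 1; }
}

// Composite: a container that holds other Items, including other Bags.
class Bag implements Item {
   private final List<Item> contents = new ArrayList<>();

   void add(Item item) { contents.add(item); }

   // A bag weighs its own (assumed) empty weight plus its contents.
   public int weight() {
      int total = 2;
      for (Item item : contents) {
         total += item.weight();
      }
      return total;
   }
}
```

Callers can ask any Item for its weight without checking whether it is a single coin or a bag of bags.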

Decorator

Attach additional responsibilities to an object dynamically.

Facade

Provide a single unified interface to a set of subsystem interfaces.

Flyweight

Use sharing to support large numbers of fine-grained objects efficiently.

Proxy

Provide a surrogate or placeholder for another object.

Chain of Responsibility

Give more than one object a chance to handle an incoming message. Pass the request along the chain until an object handles it.

Command

Encapsulate a request as an object.

lecture #15 began here

Interpreter

Interpret sentences in a language by defining and operating on an internal data structure representation of that language.

Iterator

Provide a way to access the elements of an aggregate object sequentially.

Mediator

Define an object that encapsulates how a set of objects interact.

Memento

Capture and externalize an object's internal state so that the object can be restored to this state later.

Observer

Create a mechanism such that when one object changes its state, all its dependent observers are notified and updated.
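
A minimal sketch of this mechanism in Java follows. Scoreboard, ScoreListener, and ScoreDisplay are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Observers implement this interface to be told about changes.
interface ScoreListener {
   void scoreChanged(int newScore);
}

// The subject keeps a list of observers and notifies them on each change.
class Scoreboard {
   private final List<ScoreListener> listeners = new ArrayList<>();
   private int score;

   void addListener(ScoreListener l) { listeners.add(l); }

   void setScore(int s) {
      score = s;
      for (ScoreListener l : listeners) {
         l.scoreChanged(score);
      }
   }
}

// One hypothetical observer: remembers the last score it was told about.
class ScoreDisplay implements ScoreListener {
   int lastSeen = -1;
   public void scoreChanged(int newScore) { lastSeen = newScore; }
}
```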

State

Allow an object to alter its behavior when its internal state changes, appearing to have changed its class.
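
The sketch below shows one way this might look for a two-state door. Door, DoorState, Open, and Closed are hypothetical names; the single "push" event stands in for whatever events label the transitions in your statechart.

```java
// Each state is an object that handles events and returns the next state.
interface DoorState {
   DoorState push();      // event: someone pushes the door
   String name();
}

class Open implements DoorState {
   public DoorState push() { return new Closed(); }
   public String name() { return "open"; }
}

class Closed implements DoorState {
   public DoorState push() { return new Open(); }
   public String name() { return "closed"; }
}

// The context delegates to its current state object, so its behavior
// changes as the state changes.
class Door {
   private DoorState state = new Closed();

   void push() { state = state.push(); }
   String status() { return state.name(); }
}
```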

Strategy

Define a set of encapsulated, interchangeable, algorithms; allow algorithms to vary independently of their clients.
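
A brief sketch, assuming a hypothetical scoring problem: ScoringRule, KillsOnly, KillsMinusDeaths, and Match are names invented here to show interchangeable algorithms behind one interface.

```java
// The strategy interface: one interchangeable algorithm.
interface ScoringRule {
   int score(int kills, int deaths);
}

class KillsOnly implements ScoringRule {
   public int score(int kills, int deaths) { return kills; }
}

class KillsMinusDeaths implements ScoringRule {
   public int score(int kills, int deaths) { return kills - deaths; }
}

// The client holds a rule and never needs to know which one it is.
class Match {
   private ScoringRule rule;

   Match(ScoringRule rule) { this.rule = rule; }

   void setRule(ScoringRule rule) { this.rule = rule; }

   int result(int kills, int deaths) { return rule.score(kills, deaths); }
}
```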

Template Method

Define the skeleton of an algorithm, deferring some steps to subclasses.
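
As an illustration, the sketch below fixes the order of three steps in a base class and lets a subclass supply two of them. Report and CreditsReport are hypothetical names.

```java
// The abstract class fixes the order of the steps; subclasses fill them in.
abstract class Report {
   // The template method is final so subclasses cannot reorder the steps.
   final String render() {
      return header() + "\n" + body() + "\n" + footer();
   }

   protected abstract String header();
   protected abstract String body();

   // A step with a default that subclasses may override.
   protected String footer() { return "-- end --"; }
}

class CreditsReport extends Report {
   protected String header() { return "Credits"; }
   protected String body() { return "CS 371 team"; }
}
```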

Visitor

Represent an operation on elements of an object structure; enable new operations without changing the element classes.

Antipatterns

lecture #16 began here

Current CVS:

I put my checkout copy at: http://www.cs.nmsu.edu/~jeffery/courses/371/FPS/
The goal of sticking it all up on the web: easier discussion in class and meetings.: Needs to have some HTML underpinnings to be usable though, index.html files in each directory for example.
Client: 2200 lines: 9 .h files, 1300 lines; 9 .cpp files, 950 lines
Server: 300 lines, 700 more in send_structs/: 2 .h files, 70 lines; 3 .cpp files, 240 lines

lecture #17 began here

Coding

"You must know your organization's standards and procedures before you begin to write code" - Pfleeger.

Documenting code

The first main standard most programmers will run into is documentation standards for code. Make it clear and easy to follow. This helps you return to your own work after a break, and it certainly helps others use it, during for example the testing or maintenance phase.

You should write comments in your code -- not on every line, as you should for assembly language, but on every "paragraph", meaning every method, and every control structure or algorithm within the method that is interesting or complex enough that you estimate your classmates might not understand.

Indentation Conventions

We will adopt an indentation standard for shared source code on this project, of 3 spaces per level. Example:

class A {
   public int a;
   public static void main(String argv[])
   {
      int x;
      for (x = 0; x < 10; x++) {
         if (x % 2 == 0)
            System.out.println("even");
         else
            System.out.println("odd");
      }
   }
}

Header comments

There is a standard header for each source file module, and a standard header for each method or function.

	source file header	method header
Java	header.java	meth.java

Standards - part 2

"The most critical standard is the need for a direct correspondence between the program design components and the program code components" - Pfleeger

Programming Guidelines

In addition to documentation, there are guidelines you can adopt for how you implement your control structures, algorithms, and data structures.

Control Structures

There are various maxims to follow here, such as "don't use goto's unless you have to" or "keep your loops as simple and shallow as possible". People didn't just make these up for fun - there is strong statistical evidence that many software projects fail when these are not followed.

Algorithms

There are classic easy slow algorithms, and complex fast algorithms. Do not use complex fast algorithms unless you actually need them. They have hidden costs:

the extra programming time cost - especially in a prototype situation like this class
the extra testing time cost - complexity requires more test cases
extra understanding time - for maintainers and people who have to call your code

Data Structures

Complex data structures are often adopted in order to allow faster algorithms. Complex data representations are also sometimes adopted to save space.

General Guidelines

localize input and output: actually, localize all platform interactions and system-dependent code
include your pseudocode: don't just throw it away, include it as header comments and such
revise and rewrite, don't just patch: if the code is getting complex, maybe it's time to revisit the design and/or rewrite a module; don't just stick on a bandaid and pretend it's OK. (OOP = ostrich-oriented programming)

Typical Glitches

filenames, e.g. player.java renamed to Player.java

missing fields

wrong field names

array trouble

using strings or integers when you need objects

public variables vs. "good" practice

Fun with Vectors and Hashtables

Java features Vector and Hashtable classes for accessing collections of ordered objects (Vector) and objects with associated string names (Hashtable). Container classes like these are fundamental building blocks.

Vectors are used as follows:

feature	Java
finding it	import java.util.*
declaring it	Vector v;
constructing it	v = new Vector();
sizing it	v.size();
subscripting it	v.elementAt(i);
using the elements	((Property)v.elementAt(i)).m()
adding elements	v.addElement(x);
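
The calls in the table can be tried out with a short program like the sketch below. It assumes a Java version with generics, so the element type is declared once and the cast from the table is not needed; VectorDemo and join are hypothetical names.

```java
import java.util.Vector;

class VectorDemo {
   // Joins the elements of a Vector using the calls from the table.
   static String join(Vector<String> v) {
      StringBuilder sb = new StringBuilder();
      for (int i = 0; i < v.size(); i++) {
         if (i > 0) sb.append(",");
         sb.append(v.elementAt(i));   // typed, so no cast is needed
      }
      return sb.toString();
   }

   public static void main(String[] args) {
      Vector<String> v = new Vector<>();
      v.addElement("rock");
      v.addElement("paper");
      System.out.println(v.size() + " elements: " + join(v));
   }
}
```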

Now, what about the important Hashtable type?

feature	Java
finding it	import java.util.*
declaring it	Hashtable T;
constructing it	T = new Hashtable();
sizing it	T.size();
inserting into it	T.put("foo", new Integer(5));
looking up things	T.get("foo");
using the elements	((Property)T.get("foo")).m()

This syntax is long-winded and cumbersome compared to more powerful OO languages, but it gets the job done, and at least the syntax and public interface are pretty consistent. When you learn how to use one collection, you've learned most of what you need to use any of the other Java collection classes.
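
Here is a short sketch of put, get, and size in use, counting how often each word occurs. It assumes a Java version with generics and autoboxing; HashtableDemo and count are hypothetical names.

```java
import java.util.Hashtable;

class HashtableDemo {
   // Counts how many times each word occurs in the array.
   static Hashtable<String, Integer> count(String[] words) {
      Hashtable<String, Integer> t = new Hashtable<>();
      for (String w : words) {
         Integer old = t.get(w);          // null if the key is absent
         t.put(w, old == null ? 1 : old + 1);
      }
      return t;
   }
}
```

Note that get returns null for a missing key, so the result must be checked before it is used as a number.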

Comments from the Code

Test your compile on OUR system before you cvs commit it. Don't - repeat - don't check in something that doesn't compile.

Respect class formatting conventions: use < 80 column lines.

I want to be able to print hard copies without special tools
some data transmission methods, such as some e-mailers, auto break long lines, breaking the code in the process

lecture #18 began here

Software Measurement

Software metrics is the art of measuring aspects of software. To measure is to assign numbers or symbols to attributes of entities in order to describe them. Interpreting these measures requires some model of what they were measuring.

People measure many different kinds of things. Measures can be categorized as reporting information about software process, products, or resources. Besides these broad categories, there are direct vs. indirect measures, and internal vs. external measures.

Direct vs. Indirect

There are direct measures (lines of code, size in bytes, running time in seconds) and indirect measures (quality, functionality, complexity). Some measures report static properties of program source code or data, others measure dynamic properties of program execution.

While direct measures are "easy" to observe they are not necessarily very meaningful/useful. Indirect measures allow more useful interpretation -- if they accurately reflect what they claim to measure. There is a tendency to use a couple of direct measures (say, #bugs, and #KLOC) to come up with a new measure via a simple equation (say, #bugs/#KLOC) and claim this is a new, indirect measure. Does quality = #bugs/#KLOC?

Internal vs. External

Engineers may want measures that estimate how long something will take, or how much progress has been made (project monitoring), measures of the effectiveness of tools or processes (process improvement).

Managers on the other hand may want measures that validate what they are doing. Most software engineering texts I have read totally deny that managers might use metrics in evaluating their engineers. For example, in a text by Pressman it was noted that normalized metrics data are used to evaluate the software process, but never the individual people. I suspect this is inevitably false, and that assessing developers is another common use of metrics.

Classifying Software Measures

Size-oriented, direct measures
- Lines of code
- Execution speed
- Memory size
- Bug reports per customer-week
Function-oriented, indirect measures
- # of user-input activities
- # of user-output views
- # of user "inquiries" (for databases)
- # of files
- # of external interfaces (network, etc)
- # of non-trivial algorithms used
You can weight the indirect measures as (simple, average, or complex). The sum of the weighted scores = "function points" of the program. Unfortunately, the resulting measure is pretty subjective.
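
The weighted sum can be sketched as below. The weights 3, 4, and 6 are an assumption made for this example; real function point schemes assign different weights to different categories. FunctionPoints and its methods are hypothetical names.

```java
class FunctionPoints {
   // Assumed weights for simple, average, and complex items.
   static final int SIMPLE = 3, AVERAGE = 4, COMPLEX = 6;

   // counts[i] = number of items rated i (0=simple, 1=average, 2=complex)
   static int weighted(int[] counts) {
      return counts[0] * SIMPLE + counts[1] * AVERAGE + counts[2] * COMPLEX;
   }

   // Total = sum of the weighted scores over all measured categories.
   static int total(int[][] categories) {
      int sum = 0;
      for (int[] counts : categories) {
         sum += weighted(counts);
      }
      return sum;
   }
}
```

For instance, one category with two simple items and one average item, plus another with one complex item, would total 2*3 + 1*4 + 1*6 = 16 under these assumed weights. The subjectivity enters when deciding which rating each item gets.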

Are Lines of Code a Good Measure of Program Size?

For example, can we use them to measure how productive someone has been? Pay people per line of code? Assign course grades based on it?

#lines of code per programmer per day seems invariant across languages, so maybe LOC is a metric of effort put in.
languages vary in terms of #lines of code to implement a feature set, so maybe LOC does not measure relative size across languages.
humans have static limits on how many lines we can navigate well. This varies widely; for most programmers it is from 5K to 50K lines.
unmonitored programmers can easily defeat a metric if it's used for pay or merit.

OK, but how else are you supposed to measure program "size"? Some software engineers use "function points", counting each major unit of functionality (algorithm, input handler, output formatter, etc) weighted by how difficult it was. This is pretty inexact, but still gives one a way to attempt to report how much the software can do, not just how much the programmers have written.

How do you Measure Progress

A student (Michael Simmons) once claimed his team's semester project was 75% complete the last week of the semester. What metrics would allow a manager to determine whether this was true or not?

Compare delivered code against deadlines/milestones that were projected. You might come up with a graph of how many "projected months" you are completing per real month.
The manager could measure team progress/effort in terms of thousands of lines of code written (does lots of code imply lots of accomplishment?)
Function points delivered, compared with total function points needed?
FURPS - functionality, usability, reliability, performance, supportability
Found and fixed defects -- if the bug count is a lot higher than it was a month ago, that might or might not be an indication of health.
Defect frequency (defects per 1000 test hours) -- given enough testing, when testing is producing fewer and fewer bug reports, you have an indicator of progress, and when the frequency drops below a certain threshold the software might be considered stable.

Measuring Quality

"Quality" means different things to different people. Many people would say it is some combination of the following

correctness: degree to which software performs its required function. Unit of measure: defects per 1000 lines of code
maintainability: ease with which a program can be corrected if an error is encountered. Unit: mean-time-to-change.
integrity: ability to withstand attacks on security
usability: skill required to learn the system, time it takes to become proficient with it, net increase in productivity over the old system, user attitudes towards the system

Tools That Measure Execution Time

Why measure time? Ancient reason: CPU cycles cost $. Real time cost $. Modern reasons: many programs have real-time requirements (say, a server that must process a transaction in 100 milliseconds or less in order to keep up with load, or an embedded software system that must talk to hardware). Another modern reason: GUI's must keep up with human users who as a race mostly have ADD. Psychology says it is better to be predictable than to be fast on average but way slow part of the time (like the web is). One tool that measures time: UNIX time(1) command.
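
Inside a Java program, a rough wall-clock measurement can be taken with System.nanoTime, as in the sketch below. Stopwatch and millis are hypothetical names, and a single run like this is only a crude estimate since JIT compilation and other system activity affect the result.

```java
class Stopwatch {
   // Runs the task once and returns the elapsed wall-clock time in ms.
   static double millis(Runnable task) {
      long start = System.nanoTime();
      task.run();
      return (System.nanoTime() - start) / 1e6;
   }

   public static void main(String[] args) {
      double ms = millis(() -> {
         long sum = 0;
         for (int i = 0; i < 1_000_000; i++) {
            sum += i;
         }
      });
      System.out.println("elapsed ms: " + ms);
   }
}
```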

Profiling and performance tuning. For C and UNIX, the standard tool is gprof. Compile and link with -pg, run the program, then run gprof on the executable and the gmon.out file that the run produced. What about Java? JVMPI is standard, but is there a standard profiler? Does Netbeans have one?

lecture #19 began here

Complexity

Complexity is perhaps the most fun software metric out there. How complex code is affects its understandability, maintainability, testability, and modifiability. CS algorithms courses tend to talk about theoretical complexity in terms of big-Oh notation, as in O(n). Software engineers are interested in this, but usually use the term complexity in a broader sense. Two algorithms might have the same big-Oh value, but one may be vastly preferable if it is less subject to bugs and easier to read and maintain.

Some ways to measure relative complexity of two competing solutions to a problem:

1. nominal: Giving names to different solutions, without comparison power. "Solution A uses dynamic programming; solution B uses decision tables."
2. ordinal: Ordering solutions, we can say something is more complex without saying how much ("A is more complex than B").
3. interval: "A is 18 units of difficulty more complex than B" - if we can define units, we can calculate interval distances, but unless we know where the "origin" is, we still have less comparison power than...
4. ratio: Comparing in which the ratio between two values is meaningful, thanks to knowing where the 0 is ("A is 3 times as difficult as B").

Design Complexity versus Implementation Complexity

It is possible to evaluate designs, and assess one design as being "more complex" than another design. What measures of design complexity can you think of? What kind of measurement (nominal, ordinal, interval, ratio) can we achieve for designs?

Halstead's "Software Science"

n1 = unique operators		# n1 and n2 form the
n2 = unique operands		# "vocabulary" n

N1 = total # operators		# N1 and N2 form the
N2 = total # operands		# program "length" N

(2/n1 * n2/N2) * (N * log₂(n)) = "program intelligence content"

Published claims that this correlates well with total programming and debugging time
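As a rough sketch of the arithmetic only, the formula above can be computed from the four counts. How operators and operands are counted for a given language is not specified here, so the counts passed in below are made-up numbers, and the method names are illustrative.

```java
// Sketch: Halstead measures computed from hand-supplied counts.
class HalsteadSketch {
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /** N * log2(n), where n = n1 + n2 (vocabulary) and N = N1 + N2 (length). */
    static double volume(int uniqueOperators, int uniqueOperands,
                         int totalOperators, int totalOperands) {
        int vocabulary = uniqueOperators + uniqueOperands;
        int length = totalOperators + totalOperands;
        return length * log2(vocabulary);
    }

    /** The (2/n1 * n2/N2) factor from the formula above. */
    static double level(int uniqueOperators, int uniqueOperands, int totalOperands) {
        return (2.0 / uniqueOperators) * ((double) uniqueOperands / totalOperands);
    }

    /** "Program intelligence content" = level * volume. */
    static double intelligenceContent(int uniqueOperators, int uniqueOperands,
                                      int totalOperators, int totalOperands) {
        return level(uniqueOperators, uniqueOperands, totalOperands)
             * volume(uniqueOperators, uniqueOperands, totalOperators, totalOperands);
    }

    public static void main(String[] args) {
        // Made-up counts: n1 = 4, n2 = 4, N1 = 8, N2 = 8.
        // Vocabulary n = 8, length N = 16, so the result is about 0.25 * 48 = 12.
        System.out.println(intelligenceContent(4, 4, 8, 8));
    }
}
```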

McCabe's Cyclomatic Complexity

For a program flow graph G with E edges and N nodes, V(G) = E-N+2 (here N counts nodes; it is not Halstead's program length). Wasn't that easy? Actually constructing the flow graph can be a non-trivial exercise, but the flow graph is often already constructed as part of a compiler implementation, so if that compiler's source code is available it may be easy to get McCabe's number.
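Once a flow graph exists, the calculation itself is tiny. The sketch below assumes a connected flow graph with a single entry and a single exit, given as a list of directed edges between non-negative integer node ids; that representation is an assumption made for illustration.

```java
// Sketch: McCabe's number from an edge list, V(G) = E - N + 2.
class CyclomaticSketch {
    /** Each edge is {fromNode, toNode}; node ids are small non-negative ints. */
    static int complexity(int[][] edges) {
        int maxId = -1;
        for (int[] edge : edges) {
            maxId = Math.max(maxId, Math.max(edge[0], edge[1]));
        }
        // Count the distinct nodes that appear in at least one edge.
        boolean[] seen = new boolean[maxId + 1];
        int nodes = 0;
        for (int[] edge : edges) {
            for (int id : edge) {
                if (!seen[id]) {
                    seen[id] = true;
                    nodes++;
                }
            }
        }
        return edges.length - nodes + 2;
    }

    public static void main(String[] args) {
        // if/else: 0 = condition, 1 = then, 2 = else, 3 = join
        int[][] ifElse = { {0, 1}, {0, 2}, {1, 3}, {2, 3} };
        // while loop: 0 = entry, 1 = condition, 2 = body, 3 = exit
        int[][] whileLoop = { {0, 1}, {1, 2}, {2, 1}, {1, 3} };
        System.out.println(complexity(ifElse));    // 4 - 4 + 2 = 2
        System.out.println(complexity(whileLoop)); // 4 - 4 + 2 = 2
    }
}
```

Each decision point adds one to the number, so straight-line code scores 1 and a single if/else or loop scores 2.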

lecture #20 began here

lecture #21 began here

lecture #22 began here

lecture #23 began here

Software Testing

Testing is the process of looking for errors (not the process of demonstrating the program is correct). Passing a set of tests does not guarantee the program is correct, but failing a set of tests does guarantee the program has bugs.

Testing is best done by someone other than the person who wrote the code. This is because the person who wrote the code would write tests that reflect the assumptions and perspectives they already hold, and so cannot be objective.

Kinds of errors include:

Syntax & semantics errors: typos, language misuse
Logic errors
I/O errors: formatting & parsing failures, network handshake trouble, ...
Design errors: misinheritance
Documentation errors: program does one (reasonable) thing, document says another

Kinds of testing include:

black box: tests written from specifications, cast in terms of inputs and their expected outputs
white box: tests written with the program source code in hand, for example, to guarantee that every line of code has been executed in one or more tests.
unit testing
integration testing
system testing
regression testing

Coverage Testing

Coverage means: writing tests that execute all the code. Since a significant portion of errors are due to simple typos and logic mistakes, if we execute every line of code we are likely to catch all such "easy" errors.

There are at least two useful kinds of coverage: statement coverage (executing every statement), and path coverage (executing every path through the code). Statement coverage is not sufficient to catch all bugs, but path coverage tends to suffer from a combinatorial explosion of possibilities. Exhaustive path coverage may not be an option, but some weaker forms of path coverage are useful.
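A small made-up example may help show the gap between the two kinds of coverage. In the sketch below (the share method is hypothetical), two tests execute every statement and both pass, yet one of the four paths still fails. The pathCount helper shows why exhaustive path coverage blows up: k independent if statements in sequence give 2^k paths.

```java
// Sketch: full statement coverage that still misses a bug on one path.
class CoverageSketch {
    /** Hypothetical function; the bug shows only when both flags are true. */
    static int share(int total, boolean reset, boolean divide) {
        int parts = 4;
        int result = total;
        if (reset) {
            parts = 0;
        }
        if (divide) {
            result = total / parts;   // divides by zero if reset was also true
        }
        return result;
    }

    /** Number of paths through k independent if statements in sequence. */
    static long pathCount(int independentIfs) {
        return 1L << independentIfs;
    }

    public static void main(String[] args) {
        // These two tests together execute every statement, and both pass.
        System.out.println(share(100, true, false));   // 100
        System.out.println(share(100, false, true));   // 25
        // The untested path (true, true) throws ArithmeticException.
        try {
            share(100, true, true);
        } catch (ArithmeticException e) {
            System.out.println("bug found only by testing the third path");
        }
        System.out.println(pathCount(2) + " paths here, " + pathCount(20) + " for 20 ifs");
    }
}
```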

lecture #24 began here

A note on Java "extends" (non-Java programmers can skip this section)

Last night around 11pm when I tried to make, I came across a baffling error that looked like:

StarSystemGame.java:4: cannot resolve symbol
symbol  : constructor Game ()
location: class Game
  {
  ^
1 error

It seemed to be complaining that Game() must provide a default constructor in order for any subclass to be allowed. But Game() should not have a default constructor, so I tried hard to understand why Java complains. In looking for an answer, I first looked at other code in our project that was using extends to see if it would show how to "do things right" so it would compile. I came across the following bogosity:

   // Constructor methods
   /**
    * This empty constructor method was added by the Galactic and Province 
    * Team to allow Sovereign to extend Charactr.
    */
   public Charactr()
   {

   }

This pretty much proves that the error is common, and that one "solution" is to add a default constructor that makes no sense. But, wanting a better fix and some understanding, I went to google, and after some searching, came across some notes from Stanford which pretty much explain the situation.

In brief: the compiler error message seems to blame the parent for the child's mistakes (a common pattern in the real world). The correct solution is

ALWAYS DEFINE A CONSTRUCTOR FOR THE SUBCLASS.

Without it, Java supplies a default constructor which calls the superclass default constructor (which doesn't exist and generates an error message). This issue is compounded by the fact that

JAVA DOES NOT INHERIT CONSTRUCTORS.

The reason the child got a default constructor added was because it did not inherit the parent's constructor.

This analysis got StarSystemGame compiling OK by adding the constructor:

  public StarSystemGame(String sourceFile) {
     super(sourceFile);
     }

I thought I was done, and feeling pretty clever, but there was one problem: subclass Sovereign DOES define its constructor. And taking out the dummy constructor from Charactr.java still causes Sovereign.java to fail. Darn!

I thought: maybe since Java doesn't inherit constructors, the subtype must define all the same constructors its supertype does, but that didn't work. Then I thought: maybe the subclass constructor always calls a superclass constructor, and if you don't do it yourself, it gets done unto you. This seems to be the case. So for class Sovereign to compile OK without the bogus superclass Charactr default constructor, subclass Sovereign's constructor has to call a superclass constructor. The following works, although I was just guessing at the parameter names due to TERRIBLE PARAMETER NAMES IN THE Charactr CONSTRUCTOR!

   super('?', name, combatRating, 0, (Environ)null, "", "", "", "", "",
         enduranceRating, 0, leadershipRating, 0, 0, 0, false);

Questions: what is the "scom" parameter? Characters do not have a space combat value (they do have a space leadership value). The constructor is confused about this because class Unit requires all units to have a space combat value. The spacecombat attribute should be moved down into class MilitaryUnit where it belongs. This got me wondering whether that first Charactr constructor was being used at all. Apparently not. Sovereign constructors were being passed a space leadership rating, but the superclass didn't seem to have a field to store it in. I added spaceLeadership to class Charactr. Code for parsing characters from .dat files probably needs to know about space leadership and do the right thing with it.
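The constructor rules worked out in this section can be reproduced in a few lines. This is a stand-alone sketch with hypothetical class names (BaseGame, SpaceGame), not the project's actual classes.

```java
// Sketch: constructors are not inherited, and every subclass constructor
// calls some superclass constructor, implicitly or explicitly.
class BaseGame {
    final String sourceFile;

    // Because this constructor is declared, Java does NOT generate a
    // no-argument constructor for BaseGame.
    BaseGame(String sourceFile) {
        this.sourceFile = sourceFile;
    }
}

class SpaceGame extends BaseGame {
    // The subclass must declare its own constructor and name the superclass
    // constructor to call. Omitting the super(...) line would make the
    // compiler insert super(), which does not exist in BaseGame.
    SpaceGame(String sourceFile) {
        super(sourceFile);
    }
}

// This would NOT compile: the generated BrokenGame() calls BaseGame().
// class BrokenGame extends BaseGame { }

class ConstructorSketch {
    public static void main(String[] args) {
        SpaceGame game = new SpaceGame("galaxy.dat");   // file name is made up
        System.out.println(game.sourceFile);            // prints galaxy.dat
    }
}
```

Adding an empty no-argument constructor to the parent, as in the Charactr code above, also silences the compiler, but it lets subclasses create half-initialized parents.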

lecture #25 began here

Homework #6

Unit Testing

One method of unit testing that is statistically shown to be cost-effective is to read the source code! This may be done with the aid of a partner or team, performing either a walkthrough (where the code author sets the agenda and others review) or an inspection (where the leadership sets the agenda).

Whole books have been written about methods of writing good tests, much of which boils down to: write tests to challenge the boundary conditions and assumptions that programmers typically make when writing code.

JUnit HowTo / CppUnit Docs

JUnit is for Java, CppUnit is a port of it for C++, installed locally at /home/uni1/jeffery/371/cppunit-1.10.2/ but not yet tested.

Example of (white box) testing: Testing Loops

If your job is to write tests for some procedure that has a loop in it, you can write tests that:

skip the loop entirely
execute only one iteration of the loop
execute two iterations
execute M iterations, for some random 2 < M < N-1
execute (N-1), N, and (N+1) iterations

where N is the maximum number of allowable passes.
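The loop checklist above might translate into tests like the following. This is a sketch in plain Java rather than JUnit so that it runs with no extra libraries; the sumFirst procedure and the limit of 10 passes are made up for illustration.

```java
// Sketch: loop tests for 0, 1, 2, M, N-1, N, and N+1 iterations.
class LoopTestSketch {
    static final int MAX_ITEMS = 10;   // N, the maximum number of allowable passes

    /** Hypothetical procedure under test: sums the first count values. */
    static int sumFirst(int[] values, int count) {
        if (count < 0 || count > MAX_ITEMS || count > values.length) {
            throw new IllegalArgumentException("count out of range: " + count);
        }
        int sum = 0;
        for (int i = 0; i < count; i++) {
            sum += values[i];
        }
        return sum;
    }

    /** Test data: an array of n ones, so the expected sum equals the count. */
    static int[] ones(int n) {
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = 1;
        }
        return data;
    }

    public static void main(String[] args) {
        int n = MAX_ITEMS;
        int[] data = ones(n + 1);
        int[] legalCounts = { 0, 1, 2, 5, n - 1, n };   // skip, one, two, M, N-1, N
        for (int count : legalCounts) {
            if (sumFirst(data, count) != count) {
                throw new AssertionError("wrong sum for count " + count);
            }
        }
        try {
            sumFirst(data, n + 1);                       // N+1 must be rejected
            throw new AssertionError("N+1 passes were accepted");
        } catch (IllegalArgumentException expected) {
            // rejected as it should be
        }
        System.out.println("all loop tests passed");
    }
}
```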

Public versus Accessors

In reading your HW4, I noticed one or more teams declaring "everything is public". While public methods are normal, public fields are only normal in a rapid prototyping context. For this reason, I am not surprised to see the actual Java code using lots of privates where the design claimed fields were public. One point I would like to reiterate is: if a field is public there is no reason to write accessor get/set methods. In fact, a big advantage of declaring a field public for rapid prototyping purposes is that you get to defer writing these methods.

lecture #26 began here

L&L on Testing

Software Testing is described in Chapter 10 of the Lethbridge text. Are you fully buzzword cognizant?

failure: an unacceptable behavior exhibited by a system
defect: a flaw in any aspect of the system including the requirements, the design and the code, that contributes or may potentially contribute to the occurrence of failures.
error: a slip-up or inappropriate decision by a software developer that leads to the introduction of a defect into the system

Coverage testing clarification: "all possible paths" is impractical due to combinatorial explosion. "all nodes" is inadequate because it misses too much. The right compromise is "cover all edges".

lecture #27 began here

More Lethbridge on Testing

Testing is like detective work?

Lethbridge makes an unfortunate analogy between programmers and criminals; they have a modus operandi, and once you find what type of bugs a programmer is writing in one place, the programmer may well repeat similar bugs elsewhere in the code.

In selecting test cases, look for equivalence classes

You usually cannot test all the possible inputs to a program or parameters to a procedure that you wish to test. If you can identify what ranges of values ought to evoke different kinds of responses, it will help you minimize test cases to: one representative from each class of expected answer, plus extra tests at the boundaries of the equivalence classes to make sure the ranges are nonoverlapping.
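As an illustration, take a made-up procedure that classifies an exam score from 0 to 100 (the classify method and the cutoff of 60 are assumptions, not from Lethbridge). There are four equivalence classes: below range, fail, pass, and above range. The sketch tests one representative from each class plus the values on either side of every boundary.

```java
// Sketch: equivalence-class and boundary tests for a hypothetical classifier.
class EquivalenceSketch {
    /** Hypothetical procedure under test. */
    static String classify(int score) {
        if (score < 0 || score > 100) {
            return "invalid";
        }
        if (score < 60) {
            return "fail";
        }
        return "pass";
    }

    public static void main(String[] args) {
        // One representative per class, then both sides of each boundary.
        int[] inputs = { -5, 30, 80, 150, -1, 0, 59, 60, 100, 101 };
        String[] expected = { "invalid", "fail", "pass", "invalid",
                              "invalid", "fail", "fail", "pass", "pass", "invalid" };
        for (int i = 0; i < inputs.length; i++) {
            if (!classify(inputs[i]).equals(expected[i])) {
                throw new AssertionError("wrong class for " + inputs[i]);
            }
        }
        System.out.println(inputs.length + " tests cover 4 classes and 3 boundaries");
    }
}
```

Ten tests stand in for every possible int input; the boundary pairs are where an off-by-one in a comparison would show up.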

Some (Lethbridge) Bug Categories

Incorrect logic (< instead of >; missing an = somewhere, etc.)

Performing a calculation in the wrong place

Not terminating a loop or a recursion

Not establishing preconditions; not checking your inputs

Not handling null conditions

Not handling singleton/nonsingleton conditions

Off by one errors

Precedence errors

Use of bad algorithms

Not using enough bits/not enough precision

Accumulating a large numerical error

Testing for floating point equality

Deadlock/livelock

Insufficient response time (on minimal configurations)

Incompatibilities with specific hardware/software configurations

Resource leaks

Failure to recover from crashes
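Two of the numerical categories in the list above (testing for floating point equality, and accumulating numerical error) are easy to demonstrate. The sketch below uses a hypothetical nearlyEqual helper; the tolerance of 1e-9 is an arbitrary choice for illustration, and a suitable tolerance depends on the application.

```java
// Sketch: why == on doubles is a bug category, and one common workaround.
class FloatBugSketch {
    /** Hypothetical helper: compare within an absolute tolerance. */
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    /** Adds 0.1 repeatedly; each addition contributes a small rounding error. */
    static double sumOfTenths(int terms) {
        double sum = 0.0;
        for (int i = 0; i < terms; i++) {
            sum += 0.1;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(0.1 + 0.2 == 0.3);                    // false
        System.out.println(nearlyEqual(0.1 + 0.2, 0.3, 1e-9));   // true
        System.out.println(sumOfTenths(10) == 1.0);              // false
        System.out.println(nearlyEqual(sumOfTenths(10), 1.0, 1e-9));   // true
    }
}
```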

lecture #28 began here

Inspections

Lethbridge p. 383

Idea: examine source code looking for defects.

Roles:

author
moderator: runs the meeting; establishes and enforces the "rules"
secretary: records defects when they are found
paraphrasers: step through the document, explaining it in their own words

Writing Formal Test Cases (Lethbridge 10.8)

A test case is a set of instructions for detecting a particular class of defects in a software system, by causing them to occur. A test case often involves running many tests aimed at that particular defect to be detected.
A test plan is a set of test cases.

A test plan should be written before testing starts. Lethbridge points out that it can be developed right after requirements are identified. In recent years the "extreme programming" community has argued in favor of writing the test cases first, before coding. If you can't come up with satisfactory test cases, you certainly don't know the problem yet well enough to code a solution or know whether your program is in fact a solution.

A test case includes: a test case #/id, a descriptive name, a set of instructions, a description of the expected (correct) results, and (for some tests) instructions on how to restore the system back to normal after the test.

Test cases are generally assigned a priority; Lethbridge suggests they be designated priority I (critical), II (general), and III (cosmetic).
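One way to picture the parts of a formal test case is as a record with one field per part. The sketch below is only an illustration: the class and field names are assumptions, and the sample test case is invented, not taken from Lethbridge or from the semester project.

```java
// Sketch: the fields of a formal test case as a plain data class.
class FormalTestCaseSketch {
    /** Stand-ins for Lethbridge's priority I, II, and III. */
    enum Priority { CRITICAL, GENERAL, COSMETIC }

    static final class TestCase {
        final String id;
        final String name;
        final Priority priority;
        final String instructions;
        final String expectedResult;
        final String cleanup;   // empty when no restore step is needed

        TestCase(String id, String name, Priority priority,
                 String instructions, String expectedResult, String cleanup) {
            this.id = id;
            this.name = name;
            this.priority = priority;
            this.instructions = instructions;
            this.expectedResult = expectedResult;
            this.cleanup = cleanup;
        }

        String summary() {
            return id + " [" + priority + "] " + name;
        }
    }

    /** An invented example test case. */
    static TestCase example() {
        return new TestCase(
            "TC-001",
            "Reject empty login name",
            Priority.CRITICAL,
            "Start the client; leave the name field blank; press Login.",
            "An error message is shown and no session is created.",
            "");
    }

    public static void main(String[] args) {
        System.out.println(example().summary());
    }
}
```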

In-class exercise: if we wanted test cases for our semester project, what should be in them?

Testing Strategies for Large Systems

Integration Testing

The "big bang": what happens when you wire it all together? How do you avoid the big bang?

Top Down vs. Bottom Up Testing

Top down = work from the user interface gradually deeper into the system
Bottom up = test individual units first
Sandwich testing = test UI and lowest levels first, because they are the easiest to test
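Top-down testing usually relies on stubs: placeholder versions of lower layers that return canned answers, so the upper layer can be tested before the real lower layer is integrated. The sketch below uses hypothetical names (ScoreStore, ScoreScreen) and is not taken from the project code.

```java
// Sketch: testing an upper layer against a stub of the layer below it.
class TopDownSketch {
    /** Lower layer that the user interface depends on (hypothetical). */
    interface ScoreStore {
        int highScore(String player);
    }

    /** Upper layer under test. */
    static final class ScoreScreen {
        private final ScoreStore store;

        ScoreScreen(ScoreStore store) {
            this.store = store;
        }

        String render(String player) {
            return player + ": " + store.highScore(player);
        }
    }

    /** Stub: a canned answer in place of the real storage code. */
    static final class StubScoreStore implements ScoreStore {
        public int highScore(String player) {
            return 42;
        }
    }

    public static void main(String[] args) {
        ScoreScreen screen = new ScoreScreen(new StubScoreStore());
        System.out.println(screen.render("ann"));   // prints ann: 42
    }
}
```

Bottom-up testing does the reverse: a small driver program calls the real lower layer directly, before any user interface exists. Replacing stubs with real layers one at a time is one way to avoid the big bang.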

lecture #29 began here

Project Status

src/ has 428 lines in 5 files.
What is this WidgetMasterOmega/ thing? Was it part of the design? Is there supposed to be widgetmaster stuff up in src?
Client has 3443 lines in 23 files.
Server has 229 lines in 4 files
Server/include has 200 lines in 7 files
Server/send_structs has 691 lines in 7 files
Server/source has 825 lines in 5 files

The Rest of Software Engineering

So, what parts of software engineering didn't we cover this semester that we would like to have covered?

"Graduate Level" Topics

Over time, topics that used to be graduate level tend to become undergraduate level. Which of these topics ought to be covered in CS 371, and what parts of 371 should we remove to make room?

Software Architecture: Architecture involves "bigger picture" aspects of designing a software system that involves multiple programs/threads/processes. Besides "client server", what architectures are out there?
Component-Based Software Engineering: Components are independently-compiled, installable, reusable software elements that are usable by multiple applications. What kinds of components are widely used?
Software Process: It's one thing to define what notations are used for various phases of software development. It's another to define what tasks, in what order, a programming team should perform in order to go about the whole development process in an efficient manner.
Program Verification: Loosely, this is the study of how to prove programs correct. What does "correct" mean?
Dynamic Analysis: This is the study of program execution behavior, which may aid in detecting bugs, understanding what is going on, or confirming correctness.

lecture #30 began here

Final Exam Review

The final exam will cover the whole course, with an emphasis on software engineering "back-end" issues.

Review your UML, especially parts you missed on the midterm -- what is the difference between a state and an event? When I ask you to draw a statechart, knowing what a state is will be important. What is the difference between a class and an association?

There were in general more misunderstandings about statecharts than about class diagrams, and more misunderstandings about collaboration diagrams than about statecharts. What is the purpose of a collaboration diagram?

What tasks are involved in the so-called "back end" of software development?

When you take a UML class diagram and write the code that corresponds to it, how do you implement each of the things you see in the diagram? The closer the language corresponds back to the diagramming notation, the easier it will be to implement from a design and keep them in sync.

What is software testing? What is its purpose? What kinds of software testing are there?

What are software metrics? What categories of measurement are there? What specific measures?

What are the most frequent kinds of bugs you have encountered this semester? What has been the hardest part about doing the project? What has been the hardest part about doing the coding? What methods worked, and what methods did not work very well?

Is it harder to write new code, or to modify someone else's existing code? Why?

What is CVS? How does it work? What are the major CVS commands? Suppose you get a "conflicts during merge" message from CVS; what causes it and what do you do about it?

What other software tools did we use this semester? What do they do, how do you use them?

Were there any C++ language features you had to learn this semester in order to implement your project? If so, what were they? Were there any problems related to using the language or class libraries?