Computer Programming Principles/Printable version


Computer Programming Principles

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/Computer_Programming_Principles

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 License.

The Problem

Most software is written because there is a problem that needs to be solved. Before you can address a problem, you need to identify it clearly. In this section you will learn basic skills for identifying problems.


Problem Statement

A problem statement is a description of the issues that need to be addressed. A good problem statement answers:

  1. What is the problem? This should help explain why a program is needed.
  2. Who has the problem? This should help identify who is likely to use the program.

If the answers are too general, further thought and creativity may be needed to narrow them down.


The Problem/Trial And Error

Suppose you have a function foo that takes the arguments a and b, and foo needs to call an API, bar, whose documentation is very vague. What needs to be passed to bar? Should a or b come first? After repeated attempts, you find that bar is almost completely undocumented. Worse, the code is closed source. What are you going to do?
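
One possible answer is to make the trial and error systematic: call the function with every ordering of the arguments, using values whose results are easy to tell apart, and write down what happens. The sketch below is only an illustration. The body of bar is a hypothetical stand-in (with a real closed API you could not read it), and probe is a helper name invented for this example.

```python
from itertools import permutations


def bar(first, second):
    """Hypothetical stand-in for the undocumented, closed-source API."""
    return first - second


def probe(func, named_args):
    """Call func with every ordering of the arguments and record the outcome."""
    results = {}
    for order in permutations(named_args):
        values = [named_args[name] for name in order]
        try:
            results[order] = func(*values)
        except Exception as exc:  # record failures instead of stopping
            results[order] = exc
    return results


# Choose values whose results are easy to tell apart.
observations = probe(bar, {"a": 10, "b": 3})
for order, outcome in observations.items():
    print(order, "->", outcome)
```

Comparing the recorded outcomes against what foo is supposed to produce shows which ordering, if any, is the right one.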


Methodologies/Reverse Programming

Reverse programming is a process of coding an initial application, or a functional portion of one, without an upfront design, through an inventive top-down process.

The method can be demonstrated using a simple math tree. First you start with your answer, say 9. Then you decide how you would arrive at 9; of the many possibilities, you choose 4+5. Now you have 9=4+5. You can proceed to replace the 4 and the 5 with further expressions. Suppose 4=2*2 and 5=3+2, so now the formula is 9=(2*2)+(3+2). The process can continue indefinitely. The point is that you did not know up front how you would arrive at 9, but you were able to start immediately without planning.

In software development it is possible to start with the end result in the same way. At each stage of working backwards you can use constants, and then replace those constants with functions, objects, and so on.

The nice thing about this process is that it helps you avoid "over-designing", because the choice at each step backwards is usually the simplest one.
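
The math tree above can be written out as code, one step at a time. This is only a sketch of the idea; the function names (answer_v1, four, five, and so on) are invented for this example and are not part of the method itself.

```python
# Step 1: start with the end result as a constant.
def answer_v1():
    return 9


# Step 2: replace the constant with the simplest expression that produces it.
def answer_v2():
    return 4 + 5


# Step 3: replace each remaining constant with a function of its own.
def four():
    return 2 * 2


def five():
    return 3 + 2


def answer_v3():
    return four() + five()


# Every version gives the same result, so each step can be checked as you go.
print(answer_v1(), answer_v2(), answer_v3())
```

Because every intermediate version runs, you always have a working program while the constants are gradually replaced.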

Reverse programming was developed by William Egge, who video recorded his code edits and then reviewed them. He was motivated by the difficulty of complex problems and the feeling of being overwhelmed by them, and wanted an easier method that "flowed" without having to design up front.


Maintaining/Testing

Introduction

Software testing is used to gain valuable information about a program's quality and can be used to uncover problems. Software testing is only as good as the conditions tested for: untested conditions can still cause a program to behave in unexpected ways. Knowing the program's requirements and priorities is a major step in preparing for software testing. Software testing can help you find answers to:

  • Are the requirements for the program satisfied?
  • Are there any gaps in the program's requirements?
  • Are there defects or failures in the program's design?
  • Are there any unexpected errors?
  • Does the program work as expected?

Build scripts

Try to set up "one-button testing". That makes it much more convenient to type a little, then hit a button that

  • saves the file you just edited,
  • compiles the application with all the appropriate options (such as "-fno-emit-frame", if necessary), and
  • runs a few quick tests, to make sure your "improvements" don't accidentally break something else.

Spending an hour to code up a few tests and set up one-button testing may *seem* like more hassle than it is worth. Manually compiling the file and manually running through the various parts of the application to make sure they work may take far less than an hour. But trust me, you are going to edit, compile, and test most programs many, many times. And a year from now, when you make just one tiny little change, wouldn't you much rather push the button and be done with it, rather than

  • manually compile the file
  • manually run through the application and see that it suddenly doesn't work any more
  • pull out your hair until
  • hours later, you remember you needed to include "-fno-emit-frame"
  • manually re-compile the file, this time with "-fno-emit-frame"
  • start testing all over from the beginning.

There are lots of ways to set up an automated build system.
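
As one small illustration, the "button" can be a script that runs each step in order and stops at the first failure. This is a minimal sketch, not a recommendation of a particular build tool. The helper run_steps and the two placeholder commands in STEPS are assumptions made for this example; a real project would put its own compile command (with all of its required options) and its own test command there.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each (name, command) step in order; stop at the first failure."""
    for name, command in steps:
        print(f"== {name}")
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"FAILED: {name}")
            return False
    print("All steps passed")
    return True


# Placeholder steps: replace them with your project's real compile and
# test commands, including every option the compiler needs.
STEPS = [
    ("compile", [sys.executable, "-c", "print('compiling...')"]),
    ("quick tests", [sys.executable, "-c", "assert 1 + 1 == 2"]),
]

ok = run_steps(STEPS)
```

Because the options live in the script rather than in your memory, the build a year from now uses the same options as the build today.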

One-button testing is just one part of the continuous integration recommended by some programmers.

Even lawyers can see the advantages of automated build scripts. [1]


Maintaining/Debugging

Introduction

Debugging is the art of diagnosing errors in programs and determining how to correct them. "Bugs" come in a variety of forms, including: coding errors, design errors, complex interactions, poor user interface designs, and system failures. Learning how to debug a program effectively, then, requires that you learn how to identify which sort of problem you're looking at, and apply the appropriate techniques to eliminate the problem.

Bugs are found throughout the software life cycle. The programmer may find an issue, a software tester might identify a problem, or an end user might report an unexpected result. Part of debugging effectively involves using the appropriate techniques to get necessary information from the different sources of problem reports.

The most common types of mistakes when programming are:

  • Programming without thinking
  • Writing code in an unstructured manner

Bugs in Detail

What are these different kinds of bugs, then?

With coding errors, the source of the problem is erroneous or improper code written by the programmer. Some examples of coding errors include:

  • Disregarding adopted conventions.
  • Calling the wrong function ("moveUp", instead of "moveDown")
  • Using the wrong variable names in the wrong places ("moveTo(y, x)" instead of "moveTo(x, y)")
  • Failing to initialize a variable when it is required ("y = x + 1", where x has not been set).
  • Skipping a check for an error return.
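
Two of the errors listed above can be shown in a few lines. The sketch below is illustrative only: move_to and parse_count are hypothetical functions invented for this example, and returning None to signal an error is just one possible convention.

```python
def move_to(x, y):
    """Hypothetical function: return the new position as an (x, y) pair."""
    return (x, y)


column, row = 3, 7

wrong = move_to(row, column)      # arguments swapped: runs without complaint
right = move_to(x=column, y=row)  # keyword arguments make the order explicit


def parse_count(text):
    """Return an int, or None to signal an error (hypothetical convention)."""
    return int(text) if text.isdigit() else None


count = parse_count("12a")
if count is None:                 # the check that is easy to skip
    count = 0

print(wrong, right, count)
```

Note that the swapped call does not fail; it silently produces the wrong position, which is what makes this kind of error hard to spot.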

Software users readily see some design errors, while in other cases design flaws make a program more difficult to improve or fix, and those flaws are not obvious to a user. Obvious design flaws are often demonstrated by programs that run up against the limits of a computer, such as available memory, available disk space, available processor speed, and overwhelmed input/output devices. More difficult design errors fall into several categories:

  • Failure to hide complexity
  • Incomplete or ambiguous "contracts"
  • Undocumented side effects

Complex interactivity bugs arise in scenarios where multiple parts of a single program, multiple programs, or multiple computers interact.

Poor user interface designs often lead users to use the program in ways that accomplish something other than what they intend. For example, a "search" page for a web site might have an option for "case-insensitive" searching. When the option is hard for the user to find or see, that user might report a bug that some of their data is "lost", simply because it is not found by the case-sensitive search.

Sometimes, computer hardware simply fails, and it usually does so in wildly unexpected ways. Determining that the problem lies not with the software itself, but with the computer(s) on which it runs, is usually complicated by the fact that the person debugging the software may not have access to the hardware that shows the problem.

Preventing Bugs

No discussion of debugging software would be complete without a discussion of how to prevent bugs in the first place. No matter how well you write code, if you write the wrong code, it won't help anyone. If you create the right code, but users cannot work the user interface, you might as well have not written the code. In short, a good debugger should keep an open mind about where the problem might lie.

Although it is outside the scope of this discussion to describe the myriad techniques for avoiding bugs, many of the techniques here are equally useful after the fact, when you have a bug and need to uncover it and fix it. Thus, a brief discussion follows.

Understand the Problem

In order to write effective software, the developer must solve the problem the user needs solved. Users, naturally enough, do not think in terms of strict algorithms, windowing systems, web pages, or command-line interfaces, so they may not think of problems in the same way that the developer does.

To address this difference, sit down with the intended user, and ask them what they want from the software. Users frequently want more than software can actually deliver, or have contradictory aims, such as software that does more, but doesn't require that they learn anything new. In short, ask the users what their goals are. Absent those goals, users will keep reporting bugs that do not add up to a coherent whole.

Effective Processes

Development Tools

Unit Testing

Unit testing means checking what happens in every state that the current module can enter. You should therefore prepare a "test list" that defines the possible inputs for the current module.
For example, suppose we have a program that gets positive numbers from the user and processes them. First we need to check whether the input is a number (it could be a character), and then whether it is positive. Checking here means entering the input and seeing what happens.
Hint: When you start to write this test list you will notice that it is quite hard to predict all the possibilities. If you can, ask someone who did not help write the module to help; it can be fruitful.
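
The positive-number example above could be written as a test list along these lines. This is a sketch under stated assumptions: parse_positive_number is a hypothetical module under test, and the entries in TEST_LIST are only a starting point, not a complete list of possibilities.

```python
def parse_positive_number(text):
    """Return the input as a float if it is a positive number, else raise ValueError."""
    try:
        value = float(text)
    except ValueError:
        raise ValueError(f"not a number: {text!r}") from None
    if not value > 0:
        raise ValueError(f"not positive: {text!r}")
    return value


# Each entry is (input, expected result); None means "should be rejected".
TEST_LIST = [
    ("5", 5.0),
    ("0.25", 0.25),
    ("0", None),    # zero is not positive
    ("-3", None),   # negative number
    ("x", None),    # a character, not a number
    ("", None),     # empty input
]


def run_test_list():
    """Return the list of entries whose actual result differs from the expected one."""
    failures = []
    for text, expected in TEST_LIST:
        try:
            actual = parse_positive_number(text)
        except ValueError:
            actual = None
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


print(run_test_list())
```

Keeping the list as data makes it easy for someone else to add the cases you did not think of.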

Documenting Code

Basic debugging steps

Although each debugging experience is unique, certain general principles can be applied in debugging. This section particularly addresses debugging software, although many of these principles can also be applied to debugging hardware.

The basic steps in debugging are:

  • Recognize that a bug exists
  • Isolate the source of the bug
  • Identify the cause of the bug
  • Determine a fix for the bug
  • Apply the fix and test it

Recognize a bug exists

Detection of bugs can be done proactively or passively.

An experienced programmer often knows where errors are more likely to occur, based on the complexity of sections of the program as well as possible data corruption. For example, any data obtained from a user should be treated suspiciously. Great care should be taken to verify that the format and content of the data are correct. Data obtained from transmissions should be checked to make sure the entire message (data) was received. Complex data that must be parsed and/or processed may contain unexpected combinations of values that were not anticipated, and not handled correctly. By inserting checks for likely error symptoms, the program can detect when data has been corrupted or not handled correctly.
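
As an illustration of checking that an entire message was received, the sketch below prefixes each message with its length and a CRC-32 checksum and verifies both on receipt. The framing format and the names pack and unpack are assumptions made for this example, not an established protocol.

```python
import zlib


def pack(payload: bytes) -> bytes:
    """Prefix the payload with its length and CRC-32 (hypothetical framing)."""
    header = len(payload).to_bytes(4, "big") + zlib.crc32(payload).to_bytes(4, "big")
    return header + payload


def unpack(message: bytes) -> bytes:
    """Return the payload, or raise ValueError if the message is incomplete or corrupted."""
    if len(message) < 8:
        raise ValueError("message shorter than header")
    length = int.from_bytes(message[:4], "big")
    checksum = int.from_bytes(message[4:8], "big")
    payload = message[8:]
    if len(payload) != length:
        raise ValueError(f"expected {length} bytes, received {len(payload)}")
    if zlib.crc32(payload) != checksum:
        raise ValueError("checksum mismatch")
    return payload


message = pack(b"hello")
print(unpack(message))
```

A truncated or altered message now fails loudly at the point where it is read, instead of causing a puzzling error much later.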

If an error is severe enough to cause the program to terminate abnormally, the existence of a bug becomes obvious. If the program detects a less serious problem, the bug can be recognized, provided error and/or log messages are monitored. However, if the error is minor and only causes the wrong results, it becomes much more difficult to detect that a bug exists; this is especially true if it is difficult or impossible to verify the results of the program.

The goal of this step is to identify the symptoms of the bug. Observing the symptoms of the problem, the conditions under which it is detected, and what workarounds, if any, have been found will greatly help in the remaining steps of debugging the problem.

Isolate source of bug

This step is often the most difficult (and therefore rewarding) step in debugging. The idea is to identify what portion of the system is causing the error. Unfortunately, the source of the problem isn't always the same as the source of the symptoms. For example, if an input record is corrupted, an error may not occur until the program is processing a different record, or performing some action based on the erroneous information, which could happen long after the record was read.

This step often involves iterative testing. The programmer might first verify that the input is correct, then that it was read correctly, then that it was processed correctly, and so on. For modular systems, this step can be made a little easier by checking the validity of data passed across the interfaces between modules. If the input to a module was correct but its output was not, then the source of the error is within that module. By iteratively testing inputs and outputs, the debugger can narrow the location of the error down to a few lines of code.

Skilled debuggers are often able to hypothesize where the problem might be (based on analogies to previous similar situations), and test the inputs and outputs of the suspected areas of the program. This form of debugging is an instance of the scientific method. Less skilled debuggers often step sequentially through the program, looking for a place where the behavior of the program differs from what is expected. Note that this is still a form of the scientific method, as the programmer must decide which variables to examine when looking for unusual behavior. Another approach is to use a "binary search" type of isolation process. By testing sections near the middle of the data or processing flow, the programmer can determine whether the error happens during earlier or later sections of the program. If no data problems are detected at the midpoint, then the error is probably later in the process.
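
The "binary search" approach can be sketched for a program that passes data through a sequence of stages. The function first_bad_stage, the example stages, and the validity check are all hypothetical. The sketch also assumes that once the data has gone bad it stays bad in every later stage; without that assumption a binary search can point at the wrong place.

```python
def first_bad_stage(stages, data, is_valid):
    """Return the index of the first stage whose output fails is_valid, or None.

    Assumes that once the data is bad, it stays bad in every later stage.
    """
    def output_after(count):
        result = data
        for stage in stages[:count]:
            result = stage(result)
        return result

    if not is_valid(data):
        raise ValueError("the input is already invalid")
    if is_valid(output_after(len(stages))):
        return None
    low, high = 0, len(stages)  # output is good after `low` stages, bad after `high`
    while high - low > 1:
        mid = (low + high) // 2
        if is_valid(output_after(mid)):
            low = mid           # no problem yet: the error is later
        else:
            high = mid          # already bad: the error is at or before mid
    return high - 1


stages = [
    lambda xs: [x + 1 for x in xs],
    lambda xs: [x * 2 for x in xs],
    lambda xs: [x - 100 for x in xs],  # deliberate bug: makes values negative
    lambda xs: sorted(xs),
]
culprit = first_bad_stage(stages, [1, 2, 3], lambda xs: all(x >= 0 for x in xs))
print(culprit)
```

With four stages the search needs only two midpoint checks after the initial end-to-end check; the saving grows as the number of stages grows.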

Identify cause of bug

Having found the location of the bug, the next step is to determine the actual cause of the bug, which might involve other sections of the program. For example, if it has been determined that the program faults because a field is wrong, the next step is to identify why the field is wrong. This is the actual source of the bug, although some would argue that the inability of a program to handle bad data can be considered a bug as well.

A good understanding of the system is vital to successfully identifying the source of the bug. A trained debugger can isolate where a problem originates, but only someone familiar with the system can accurately identify the actual cause behind the error. In some cases it might be external to the system: the input data was incorrect. In other cases it might be due to a logic error, where correct data was handled incorrectly. Other possibilities include unexpected values, where the initial assumptions were that a given field can have only "n" values, when in fact, it can have more, as well as unexpected combinations of values in different fields (field x was only supposed to have that value when field y was something different). Another possibility is incorrect reference data, such as a lookup table containing incorrect values relative to the record that was corrupted.

Having determined the cause of the bug, it is a good idea to examine similar sections of the code to see if the same mistake is repeated elsewhere. If the error was clearly a typo, this is less likely, but if the original programmer misunderstood the initial design and/or requirements, the same or similar mistakes could have been made elsewhere.

Determine fix for bug

Having identified the source of the problem, the next task is to determine how the problem can be fixed. An intimate knowledge of the existing system is essential for all but the simplest of problems. This is because the fix will modify the existing behavior of the system, which may produce unexpected results. Furthermore, fixing an existing bug can often either create additional bugs, or expose other bugs that were already present in the program, but never exposed because of the original bug. These problems are often caused by the program executing a previously untested branch of code, or under previously untested conditions.

In some cases, a fix is simple and obvious. This is especially true for logic errors where the original design was implemented incorrectly. On the other hand, if the problem uncovers a major design flaw that permeates a large portion of the system, then the fix might range from difficult to impossible, requiring a total rewrite of the application.

In some cases, it might be desirable to implement a "quick fix", followed by a more permanent fix. This decision is often made by considering the severity, visibility, frequency, and side effects of the problem, as well as the nature of the fix, and product schedules (e.g., are there more pressing problems?).

Fix and test

After the fix has been applied, test the system for two purposes: (1) to confirm that the fix handles the original problem correctly, and (2) to make sure the fix hasn't created any undesirable side effects.

For large systems, it is a good idea to have regression tests, a series of test runs that exercise the system. After significant changes and/or bug fixes, these tests can be repeated at any time to verify that the system still executes as expected. As new features are added, additional tests can be included in the test suite.
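
A regression test can be as small as one test case per fixed bug. The sketch below uses Python's standard unittest module; the function average and the empty-list bug it describes are hypothetical, invented only to show the pattern.

```python
import unittest


def average(values):
    """Hypothetical function under test; imagine an earlier version crashed on an empty list."""
    if not values:
        return 0.0
    return sum(values) / len(values)


class AverageRegressionTests(unittest.TestCase):
    def test_typical_values(self):
        self.assertEqual(average([2, 4, 6]), 4.0)

    def test_empty_list_bug_stays_fixed(self):
        # Added when the empty-list crash was fixed, so it cannot silently return.
        self.assertEqual(average([]), 0.0)


def run_suite():
    """Run the regression tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageRegressionTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_suite()
```

Each new bug fix adds one more test to the class, so the suite grows into a record of every problem the system has already had.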

Steps to reduce debugging

There are concrete steps that can be taken to reduce the amount of time spent debugging software. These are listed in the sections below.

The correct mindset

Probably the most important thing you can do when you are starting to debug a program is to realize that you don't understand what is going on. Programmers who are convinced that their program should work fine are less likely to find errors simply because they are refusing to admit their confusion. If the program behaved the way you think it does, you wouldn't be debugging; the program would be working fine. Even when the program appears to work, if you examine it with the thought that there is at least one bug remaining and you are going to find it, then you are more likely to find something wrong with the program.

Start at the source

The time when you are most aware of where problems are likely to arise is usually when you are first designing and writing the code. By inserting integrity checks at various places within the program, problems can be detected and reported by the program itself. In addition to detecting problems, consideration should be given to how best to handle each error. Options include:

  • Report error, set invalid fields to a default value, and continue
  • Report error, discard the record associated with the invalid value, and continue
  • Report error, transfer invalid record into separate file/table so the user can examine and possibly correct the problem
  • Report error and terminate the program
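
The four options above can be sketched as policies of a single function. Everything here is an assumption made for illustration: the name process_records, the policy names, and the use of in-memory lists where a real program might write a separate file or table.

```python
POLICIES = ("default", "discard", "quarantine", "terminate")


def process_records(records, is_valid, on_invalid="discard", default=None):
    """Apply one handling policy to invalid records.

    Returns (accepted, quarantined, errors); every invalid record is reported.
    """
    if on_invalid not in POLICIES:
        raise ValueError(f"unknown policy: {on_invalid}")
    accepted, quarantined, errors = [], [], []
    for index, record in enumerate(records):
        if is_valid(record):
            accepted.append(record)
            continue
        errors.append(f"record {index} is invalid: {record!r}")  # always report
        if on_invalid == "default":
            accepted.append(default)       # substitute a default and continue
        elif on_invalid == "quarantine":
            quarantined.append(record)     # set aside for the user to examine
        elif on_invalid == "terminate":
            raise SystemExit(errors[-1])   # report and stop the program
        # "discard": drop the record and continue
    return accepted, quarantined, errors


accepted, quarantined, errors = process_records(
    [5, -1, 8], lambda value: value > 0, on_invalid="quarantine")
print(accepted, quarantined, errors)
```

Making the policy an explicit parameter forces the decision to be made when the code is written, rather than discovered when the first bad record arrives.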

Treat user input with suspicion

Any data that originated from users (including external systems) should be treated with suspicion. Carefully validate all such input data, performing both syntactic and semantic integrity checks. Invalid data are a common source of programming errors. Think not just of data entered in error, but of malicious data as well, as in buffer overflow exploits.

If data are entered interactively by users, you can provide appropriate error messages and allow the user to correct the invalid field(s). If data are not from an interactive source, then the erroneous records should be handled as described above.
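
The difference between a syntactic check and a semantic check can be shown with a date field. This is a sketch only: validate_birth_date, the YYYY-MM-DD format, and the rule that a birth date cannot be in the future are assumptions chosen for the example. The function returns a message instead of raising, so that an interactive program can show it to the user and ask again.

```python
import re
from datetime import date

DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def validate_birth_date(text, today):
    """Return (date, None) if valid, or (None, message) describing the problem."""
    if len(text) > 10:                  # bound the size before any parsing
        return None, "input too long"
    match = DATE_PATTERN.fullmatch(text)
    if match is None:                   # syntactic check: right shape?
        return None, "expected YYYY-MM-DD"
    year, month, day = (int(part) for part in match.groups())
    try:
        value = date(year, month, day)  # semantic check: a real calendar date?
    except ValueError:
        return None, "no such calendar date"
    if value > today:                   # semantic check: plausible in context?
        return None, "date is in the future"
    return value, None


today = date(2024, 1, 1)
for text in ["1990-02-28", "1990-02-30", "28/02/1990", "2999-01-01"]:
    print(text, "->", validate_birth_date(text, today))
```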

Use of log files

Programs that write information to log files can provide significant information that can be used to analyze what was going on before, during, and after problems are encountered. The number of entries to be searched can be reduced by creating various log files, such as a separate log for each major component of the system, plus one log file strictly for errors. Each entry should be date/time stamped so that entries from different logs can be correlated.
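
One way to arrange this, sketched with Python's standard logging module, is to give each component its own log file and to send errors to a shared error log as well. The helper make_logger, the component name "parser", and the file names are assumptions made for this example; the temporary directory is used only so that the sketch can run anywhere.

```python
import logging
import os
import tempfile


def make_logger(component, log_dir):
    """Create a logger that writes to <component>.log and copies errors to errors.log."""
    logger = logging.getLogger(component)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    # %(asctime)s time-stamps every entry so that logs can be correlated.
    formatter = logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")

    component_handler = logging.FileHandler(os.path.join(log_dir, f"{component}.log"))
    component_handler.setFormatter(formatter)
    logger.addHandler(component_handler)

    error_handler = logging.FileHandler(os.path.join(log_dir, "errors.log"))
    error_handler.setLevel(logging.ERROR)  # only errors reach the shared log
    error_handler.setFormatter(formatter)
    logger.addHandler(error_handler)
    return logger


log_dir = tempfile.mkdtemp()
parser_log = make_logger("parser", log_dir)
parser_log.info("read 120 records")
parser_log.error("record 17 has no customer id")
```

After this runs, parser.log contains both entries, while errors.log contains only the error, which keeps the number of entries to search small when something goes wrong.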

Test suites

A standard set of tests that can be run on demand can help find errors before they make it into production. These test cases should be automated as much as possible to reduce the effort required to run them. As new features are added to the system, additional tests should be created to exercise those features.

Change one thing at a time

When making a lot of changes, apply them incrementally. Add one change, and then test that change thoroughly before starting on the next change. This will reduce the number of possible sources of new bugs. If several different changes are applied at the same time, then it is much more difficult to identify the source of the problem. Furthermore, minor errors in different areas can interact to produce errors that never would have happened if those changes had been applied one at a time.

Back out changes that have no effect

If you make a change to fix a problem, but the program still behaves the same, back out those changes before proceeding. The fact that your changes didn't do anything indicates one of several things:

  • The problem is not where you think it is
  • The area you modified either isn't being called, or isn't being called the way you think it is
  • Assuming the section you changed wasn't executed, you might have introduced new bugs that won't appear until you fix the current bug

Try another port of your Application

Programs that are available on different architectures (e.g. operating systems such as MS Windows, Mac OS X, or Linux, or processors such as Intel Pentium, PowerPC, or DEC Alpha) sometimes behave differently on other systems (especially for subsequent errors). Sometimes it is far easier to find the error on a different architecture.

Think of similar situations

When a bug has been found, think of other places where the same mistake might have been made. Check those places and see if the same problem exists there as well.

Finding User Interface Bugs

Finding Design Bugs

Finding Coding Errors

Not every kind of program is debugged in the same way and not all techniques can be used on all types of programs.

The main tool in debugging is the debugger. This is software that runs alongside the newly written program and allows you to pause the program and inspect memory addresses, the stack, and various other normally invisible parts of your program.

Another method of debugging is the log file. Outputting the contents of certain variables can provide valuable information on how your program performs. Outputting a string containing the name of the function when the function is called can be useful in locating where an error is introduced. For finding where a program crashes, it is more practical to use the debugger.
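
Writing the function name on every call does not have to be done by hand. The sketch below uses a Python decorator to record each call and its result. The names logged, call_log, and scale are invented for this example, and a plain list stands in for the log file so that the example stays self-contained.

```python
import functools

call_log = []  # stand-in for a log file


def logged(func):
    """Record the function name and arguments on every call, and the result on return."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_log.append(f"enter {func.__name__} args={args} kwargs={kwargs}")
        result = func(*args, **kwargs)
        call_log.append(f"leave {func.__name__} result={result!r}")
        return result
    return wrapper


@logged
def scale(value, factor=2):
    return value * factor


scale(21)
for entry in call_log:
    print(entry)
```

If the last entry in the log is an "enter" line with no matching "leave" line, the error was raised inside that function.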

Large programs are hard to debug; small programs are (relatively) easy to debug. So the key is to turn a large program into a lot of small programs for debugging. This is called "unit testing" and involves compiling a part of your program (a routine, a collection of related routines, a module, or even a complete subsystem) with extra code that allows it to run without the rest of the code in place.
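
The "extra code" is often a stand-in for the parts of the program that are missing. In the sketch below, total_owed is the routine being tested and fake_fetch_invoices replaces a database layer that is not present. Both names, and the invoice format, are hypothetical and chosen only for this example.

```python
def total_owed(customer_id, fetch_invoices):
    """Unit under test: sums the unpaid invoices obtained through fetch_invoices."""
    return sum(
        invoice["amount"]
        for invoice in fetch_invoices(customer_id)
        if not invoice["paid"]
    )


def fake_fetch_invoices(customer_id):
    """Test-only code standing in for the real (absent) database layer."""
    return [
        {"amount": 100, "paid": False},
        {"amount": 40, "paid": True},
        {"amount": 15, "paid": False},
    ]


# The routine runs and can be checked without the rest of the program.
owed = total_owed(7, fake_fetch_invoices)
print(owed)
```

Passing the dependency in as a parameter is one design choice that makes this kind of substitution easy; other languages and projects achieve the same effect in other ways.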

Full-screen applications (especially games) can be hard to debug because you won't be able to see the debugger's output. One solution is to use a null-modem cable, a second computer, and a terminal program (e.g. HyperTerminal), and send the output of the debugger through the null-modem cable to the second computer.

For example, in DOS with gdb and a serial null-modem cable:
Configure the port with the mode command: mode COM2: 9600,n,8,1,none
Redirect the output to COM2 by adding >COM2 when you invoke the debugger.

Further reading

Most languages support their own special techniques for debugging:

Some platforms have special debugging techniques:


Appendixes/Copying