Object-Oriented design tips

An introduction to Object-Oriented design

Introduction

This node outlines some basic guidelines and rules for designing class hierarchies and OOP programs. It will occasionally delve into the microscopic issues of class design.

If you don't know what OOP is, check here for an explanation. If you're not too interested in program design, go read this.

This introduction is split into a number of tips, each of which should give you additional insight into the OOP method.

Tip 1: Classes should represent a concept or thing

Classes can be said to "exist" in two places - solution space and implementation space.

Classes in solution space represent something that is part of the real-world problem we're trying to solve. Examples are words, fonts and letters in a word processor. Classes in implementation space represent concepts or things that exist only as part of our program, but are integral to the implementation (arrays, linked lists, smart pointers).

Only define a class if you can clearly link it to some concept either in solution space or implementation space. Think about your problem, or listen to your clients talking about it. The nouns they use are likely to be the classes you'll want to define in solution space, and the verbs are the operations you'll want to define on those classes.
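As a rough illustration of the noun/verb idea, here is a minimal C++ sketch built on the word processor example. The class names Font, Word and Document, and all of their members, are hypothetical and chosen only for this sketch. Font, Word and Document live in solution space; std::vector is the implementation-space class that Document happens to use.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Solution space: a font is something the user of a word processor talks about.
class Font {
public:
    Font(std::string name, int pointSize)
        : name_(std::move(name)), pointSize_(pointSize) {}
    const std::string& name() const { return name_; }
    int pointSize() const { return pointSize_; }
private:
    std::string name_;
    int pointSize_;
};

// Solution space: "word" is a noun from the problem, so it becomes a class.
// "Set the word in another font" is a verb, so it becomes an operation.
class Word {
public:
    Word(std::string text, Font font)
        : text_(std::move(text)), font_(std::move(font)) {}
    void setFont(const Font& font) { font_ = font; }
    const Font& font() const { return font_; }
    std::size_t length() const { return text_.size(); }
private:
    std::string text_;
    Font font_;
};

// Solution space again, but implemented with std::vector, which is an
// implementation-space class: no client ever asks for "a vector".
class Document {
public:
    void append(const Word& word) { words_.push_back(word); }
    std::size_t wordCount() const { return words_.size(); }
private:
    std::vector<Word> words_;
};

// Hypothetical usage: build a two-word document and report its size.
inline std::size_t exampleWordCount() {
    Document doc;
    doc.append(Word("Hello", Font("Times", 12)));
    doc.append(Word("world", Font("Times", 12)));
    assert(doc.wordCount() == 2);
    return doc.wordCount();
}
```

Every class above can be linked to a concept in one of the two spaces, which is the test this tip asks you to apply before defining a class.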

Tip 2: Ask not what your program does, but what it does it to

When you're designing your software, think first about the objects and concepts the solution needs to represent, and later about what exactly these objects and concepts will do. If you start to plan towards particular functionality early in the development phase, your choice of classes may be biased towards some specific solution. This can be a problem.

As software evolves, it tends to generalise. Clients will call for new functionality to be implemented, and if your design isn't flexible, you're in trouble. If your design was flexible from the start (i.e. it concentrated on the data rather than the functionality), then you won't have as much trouble adding new functionality later on.

This is one of the most fundamental maxims of OO design.

Tip 3: Allow as little communication as possible between classes

This includes all types of relationships, and the reasoning for it is rooted primarily in stability. Every time a class talks to another class, it increases the possibility for errors to occur in three places (discounting the actual code doing the asking):

The code doing the receiving.
The type of the return value.
The type of the data sent.

Why? Because these three things are subject to change, and the change may well be outside your control if you're working on a commercial project, or you might just plain forget about possible errors when you change one of these things. Classes which are forever talking to each other allow errors to propagate through them like wildfire.

By minimising dependencies, you minimise the number of things that might break when you change something.

A closely related idea is formalised in the Interface Segregation Principle (clients should not be forced to depend on interfaces they don't use), which also explains some other reasons why minimising communication is a good idea.
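One way to keep the conversation between two classes small is to have the caller depend on a narrow interface rather than on the whole class. The sketch below is only an illustration under assumed names: TextSource, Document and WordCounter are hypothetical, and C++ is used simply because a concrete language is needed. WordCounter can only call text(), so changes to the rest of Document cannot break it.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <utility>

// Narrow interface: the single question a consumer is allowed to ask.
class TextSource {
public:
    virtual ~TextSource() = default;
    virtual std::string text() const = 0;
};

// Document offers more than text(), but consumers of TextSource never see it.
class Document : public TextSource {
public:
    explicit Document(std::string body) : body_(std::move(body)) {}
    std::string text() const override { return body_; }
    void setBody(const std::string& body) { body_ = body; }
private:
    std::string body_;
};

// WordCounter talks to TextSource only: one call, one return type, no data sent.
class WordCounter {
public:
    int count(const TextSource& source) const {
        const std::string text = source.text();
        int words = 0;
        bool inWord = false;
        for (unsigned char c : text) {
            if (std::isspace(c)) {
                inWord = false;
            } else if (!inWord) {
                inWord = true;   // first character of a new word
                ++words;
            }
        }
        return words;
    }
};

// Hypothetical usage of the two classes together.
inline int exampleCount() {
    Document doc("one two  three");
    WordCounter counter;
    const int words = counter.count(doc);
    assert(words == 3);
    return words;
}
```

The three places where errors can creep in are still there, but each has been reduced to a single, small point of contact.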

Tip 4: Know what "isa" relationships are

When designing class hierarchies, you're going to be implementing quite a lot of "isa" relationships through inheritance. An "isa" relationship is something resembling the following (language-neutral syntax):

function Breathe( Human h );
function Study( Student s );

class Human {
  ...
};

class Student inherits class Human {
  ...
};

We've defined two functions: Breathe(), which takes an argument of type Human, and Study(), which takes an argument of type Student. The class Student "isa" Human, because it inherits from the class Human. A Human, however, is not a Student. Proof of this is that while Breathe(Student) is a legal call, Study(Human) is not. Both Humans and Students breathe, but only Students study.
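For readers who prefer something they can compile, here is one possible C++ rendering of the Human and Student example. The member functions breathe() and study() and the returned strings are assumptions added so that the calls have something observable to do.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

class Human {
public:
    virtual ~Human() = default;
    std::string breathe() const { return "breathing"; }
};

// Public inheritance expresses "isa": every Student is a Human.
class Student : public Human {
public:
    std::string study() const { return "studying"; }
};

std::string Breathe(const Human& h) { return h.breathe(); }
std::string Study(const Student& s) { return s.study(); }

// The compiler agrees with the prose: a Student converts to a Human,
// but a Human does not convert to a Student.
static_assert(std::is_convertible<Student*, Human*>::value,
              "Student isa Human");
static_assert(!std::is_convertible<Human*, Student*>::value,
              "Human is not a Student");

inline void isaExample() {
    Human human;
    Student student;
    assert(Breathe(human) == "breathing");
    assert(Breathe(student) == "breathing");  // legal: Student isa Human
    assert(Study(student) == "studying");
    // Study(human);  // would not compile: a Human is not a Student
}
```

Uncommenting the last call produces a compile error, which is the "proof" described above.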

"Isa" relationships are the most common use of inheritance, and are implemented through public inheritance. A good general rule, at least at first (when you're ready to break it, you'll know), is to only use public inheritance when you want an "isa" relationship, and not any other time.

Tip 5: Know what "is-implemented-in-terms-of" relationships are

...And the difference between them and "isa" relationships.

It's common to get the two mixed up, and use public inheritance for something other than "isa" relationships.

An "is-implemented-in-terms-of" relationship occurs when you have a class that relies on the features of another class to do its job, but isn't a derivative of the other class (in the way a Student is a derivative of Human). Put another way, "is-implemented-in-terms-of" means what it says: one class is implemented by using another.

Imagine a class CPU that relies on classes ALU (Arithmetic Logic Unit) and RAM to do its job. A CPU is clearly not an ALU or a RAM, but it needs them to do its job. Here's how you would represent this using a technique called layering:

class CPU {
  ALU myALU;
  RAM myRAM;
  ...
};

ALU and RAM are both classes, and myALU and myRAM are said to be layered objects. They are data members of the class CPU, and it can now access their features to do its job. The alternative approach, using public inheritance, would be:

class CPU inherits class ALU and class RAM {
  ...
};

This would give the class CPU access to the members of the other classes, but is impractical for two reasons:

It breaks the rules of encapsulation. Clients would know what other classes were used in the implementation of CPU, and be able to access the features of them directly. This is information they don't need, and in some cases can be downright harmful.
If we had a function defined to take an ALU as a parameter, it could also take a CPU. This is crazy, because a CPU is not an ALU.
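The layered version of CPU might look something like this in C++. This is a sketch only: the operations add(), read(), write(), store(), load() and addCells() are hypothetical and exist just to show CPU using its layered objects without exposing them.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

class ALU {
public:
    int add(int a, int b) const { return a + b; }
};

class RAM {
public:
    explicit RAM(std::size_t size) : cells_(size, 0) {}
    // at() throws std::out_of_range for an invalid address.
    int read(std::size_t address) const { return cells_.at(address); }
    void write(std::size_t address, int value) { cells_.at(address) = value; }
private:
    std::vector<int> cells_;
};

// CPU is implemented in terms of ALU and RAM: they are private data members,
// so clients cannot reach them or even tell that they are there.
class CPU {
public:
    explicit CPU(std::size_t memorySize) : myRAM(memorySize) {}
    void store(std::size_t address, int value) { myRAM.write(address, value); }
    int load(std::size_t address) const { return myRAM.read(address); }
    // Hypothetical instruction: mem[dest] = mem[a] + mem[b].
    void addCells(std::size_t a, std::size_t b, std::size_t dest) {
        myRAM.write(dest, myALU.add(myRAM.read(a), myRAM.read(b)));
    }
private:
    ALU myALU;
    RAM myRAM;
};

// Unlike the inheritance version, a CPU cannot be passed where an ALU is expected.
static_assert(!std::is_base_of<ALU, CPU>::value, "CPU is not an ALU");
static_assert(!std::is_convertible<CPU*, RAM*>::value, "CPU is not a RAM");

inline int layeringExample() {
    CPU cpu(8);
    cpu.store(0, 2);
    cpu.store(1, 3);
    cpu.addCells(0, 1, 2);
    assert(cpu.load(2) == 5);
    return cpu.load(2);
}
```

Both objections to the inheritance version disappear here: the ALU and RAM are hidden behind CPU's own interface, and a function that takes an ALU will refuse a CPU at compile time.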

Tip 6: Don't put data members in the public interface of a class

(Unless you can think of a really good reason to do so).

Data members are variables that are members of a class. The public interface of a class is the part of the class that other classes can manipulate.

Allowing other classes to mess around with your class's internal data is just a bad idea. They could mutate your class in ways it was never meant to be mutated, or put invalid data in the fields, for instance. It's a lot better to let other classes access your data members only through functions of the form get_datamember() and set_datamember(), so that all access happens in a way you control.

This also helps satisfy the Uniform Access Principle - if everything in the public interface of your class is a member function, then your clients don't need to think about whether they should put () after the name of the member.
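A minimal C++ sketch of the get_datamember() / set_datamember() pattern follows. The Font class and its point_size member are hypothetical, and throwing std::invalid_argument is just one assumed way of rejecting bad data; returning an error code would serve equally well.

```cpp
#include <cassert>
#include <stdexcept>

class Font {
public:
    explicit Font(int point_size) { set_point_size(point_size); }

    int get_point_size() const { return point_size_; }

    // The setter is the only way in, so it can refuse invalid data.
    void set_point_size(int point_size) {
        if (point_size <= 0) {
            throw std::invalid_argument("point size must be positive");
        }
        point_size_ = point_size;
    }

private:
    int point_size_ = 1;  // never reachable directly from outside the class
};

// Hypothetical usage: an invalid update is rejected and the old value survives.
inline bool rejectsInvalidSize() {
    Font font(12);
    assert(font.get_point_size() == 12);
    try {
        font.set_point_size(-3);
    } catch (const std::invalid_argument&) {
        return font.get_point_size() == 12;
    }
    return false;  // reached only if the bad value was accepted
}
```

Had point_size_ been public, nothing would stop a client from writing -3 into it directly. Note also that every member in the public interface is used with (), which is the uniformity mentioned above.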

Further reading:

Uniform Access Principle
