A philosophy of programming which emphasizes algorithms as the central concepts of software, as opposed to objects. Abstraction is viewed as a tradeoff rather than a virtue.

The primary tools in generic programming are:
Generic Programming's intellectual founder and main proponent (in 2000) is Alexander Stepanov. His enormously successful Standard Template Library, which he wrote in collaboration with Ming Lee when they both worked for Hewlett-Packard Corporation, was an expression of these ideas.

Generic Programming is, in part, a reaction to OOP, many of whose techniques have been overhyped, and whose misuse has damaged countless schedules and hurt program performance. Of couse there are other negative reactions to OOP.

Despite this, Stepanov has chosen C++ as the only language his ideas can be adequately expressed in, at least for the time being.

Many people who have been exposed to Generic Programming seem to think that Stepanov's objections to OO are limited to the notion of polymorphism, the central overhyped feature in the silly propaganda about the wonderfulness of OOP. These people imply that if polymorphism were de-emphasized, he wouldn't object so much. Maybe so, but my own reading between the lines seems to indicate that he doesn't think encapsulation's all its cracked up to be.

Overview

The Java 1.5 incarnation of generic programming is the new language feature called Generics. Generics were originally based on Generic Java, but they have grown noticeably since then, to be more flexible. Generics have a marked similarity to C++'s template classes, but of course there are some differences (it would just be no fun if Java and C++ could agree on anything):

  • There are some syntactical differences.
  • Implementation differences. Duh.
  • Generics are more verbose (more on this later).
  • Generics result in less code, because they don't create multiple copies of each class.
  • Generics are slower, because they don't create multiple copies of each class.

How to use a generic (parameterized) class

The best example of well-thought-out generic classes is the Collections library. Before Java 1.5 (codename "Tiger"), you would create and use a linked list like this:

LinkedList list = new LinkedList();
list.add(new Integer(5));
list.add(new Integer(10));
int sum = 0;
Iterator iter = list.iterator();
while (iter.hasNext())
  sum += ((Integer)(iter.next())).intValue();

The part most relevant to Generics is the cast to Integer. Because a LinkedList holds a list of Object references, the compiler doesn't know what type to expect, so we have to explicitly tell it, if we want to use the object for much.

But, with the introduction of Generics, we can be more concise:1

LinkedList<Integer> list = new LinkedList<Integer>();
list.add(new Integer(5));
list.add(new Integer(10));
int sum = 0;
Iterator<Integer> iter = list.iterator();
while (iter.hasNext())
  sum += iter.next().intValue();

Note the two changes: the use of <Integer>, and the removal of the cast. We tell the compiler that our list will only hold Integers, and so it does the cast for us. It also knows that the iterator() method of a LinkedList<Integer> returns an Iterator<Integer>, so the extra type information is not lost.

Why Generics?

As we just saw, Generics save you some typing. Not so many casts or parentheses, which is nice. But they do much more.

  • They make code more readable: anyone reading "LinkedList<Integer>" will know that the list can only hold Integers.
  • They promote code reuse: nobody will ever have to write an IntList or a StringList.
  • They allow us to guarantee run-time type safety at compile time.2

That last point is a Big Deal. Consider the pre-Generics version. What if someone had said list.add("Hello, World!")? The compiler would allow that, because the add() method takes Objects, and a String is certainly an Object. But at run time, the summation loop would fail, because String can't be cast to Integer. It totally sucks to discover bugs when they show up on the client's computer, printing their filthy stack trace all over the terminal. But in the Generics version, that line will fail to compile. Depending on your compiler, you will get a different message. javac gives the slightly-helpful "cannot find symbol add(java.lang.String) in class java.util.LinkedList<java.lang.Integer>", while Eclipse offers the more friendly "The method add(Integer) in the type LinkedList<Integer> is not applicable for the arguments (String)." This prevents a lot of problems, and in fact one of the stated goals of Generics is to ensure that all type errors are found at compile time.

How to write a generic class

So how do you write new generic classes? It's pretty simple:

public class Pointer<T> {
  private T ref;
  public Pointer(T init) {ref = init;}
  public T get() {return ref;}
  public void set(T newRef) {ref = newRef;}
}

As you can see, the <T> creates a new psuedo-class, technically called a type parameter, which can be used just like a regular class for the rest of the definition of Pointer. If you need multiple type parameters, just separate them by commas, as in HashMap<K, V>.

Sometimes it can be useful to give type parameters to a single method, and not to a whole class. That's easy too:

public static <T> T getFirst(List<T> list) {return list.get(0);}

Advanced Generics

Wildcards

The above information is enough to make effective use of Generics in many cases. But it doesn't cover everything. Consider, for example, a method to save all the objects in a list to backing store. We have an interface for doing that, because we are good little OO programmers:

public interface Savable {
  public void save(OutputStream out);
}
public class Image implements Savable { /* stuff */ }
public class Text implements Savable { /* stuff */ }

Now we want to write a saveAll method, that saves a whole list of Savables to one OutputStream. Our first attempt would look something like:3

public static void saveAll(List<Savable> list, OutputStream out) {
  for (Savable s : list)
    s.save(out);
}

And indeed, this first attempt will work, sorta. If someone gives us a List<Savable>, we will save them all to disk, as advertised. But what if they have a List<Image>? They should be able to use our method for that, too, since all Images are Savable. But they can't. It is illegal to assign a List<Image> to a List<Savable>. To see why, consider:

List<Image> images = new LinkedList<Image>();
List<Savable> saveables = images; // create an alias
saveables.add(new Text("Some text here")); // Adding a Text to List<Image>...
Image i = images.get(0); // DANGER!

The above code does not compile. If it did, it would fail at runtime, and the whole point is never to do that. So, because it's illegal to add a Savable to a List of Image, it's illegal to assign a List of Image to a List of Savable. But there is a way around this, using wildcards:

public static void saveAll(List<? extends Savable> list, OutputStream out) {
  for (Savable s : list)
    s.save(out);
}

The wildcard syntax here means, "A list passed to this method will be a holder for some subclass of Savable." This tells the compiler that it is legal to get a Savable object from the list, but that it is not legal to put one into the list. This allows flexible, polymorphic methods, and preserves type safety.

Likewise, we could use List<? super Savable> to say the opposite: putting Savables in is legal, but getting them out is not. If you don't care at all what the type is, and just want to get Objects out and not put anything in, you can say List<?>.

Reusing wildcards

Sometimes you will have two Generic objects available, and you want to assert some kind of relationship between them. For example, consider an addAll method, which adds all elements from one list to another. Obviously they must have compatible types. I'll first show you what it looks like, then describe what it means:

public static <T> void addAll(List<T> dest, List<? extends T> src) {
  // Implementation
}

Here we're saying that dest is a list of Ts, and src is a list that can hold Ts, or maybe some subclass of T. Therefore, it must be legal to take things you get from src and put them into dest. When you call this method, the compiler infers the type of T by looking at the type of dest.

Here's one last hairy example from the Collections framework. It's the most complicated you'll ever run across in practice, though of course you can generate artificial ones that are worse.

static<T extends Comparable<? super T>> void sort(List<T> list) {
  // Implementation
}

So, it's saying that the list must contain objects which are willing to compare themselves to some superset of themselves. This is necessary to make it possible to sort a List<Integer>, where Integer implements Comparable<Number> (which it doesn't really; this is just an example).

Implementation

I'll just give a brief overview here, mainly to describe how it's different from C++'s templates. You can look it up on Sun's website if you want to know the gory details. In C++, whenever you need an instance of a template class, the compiler generates a specialized version of the class for you. This means it is optimized for the parameters you use, but it means you have an extra copy of the template class for every different parameter you want to use.

Java does not do this. Instead, all instances of (for example) LinkedList, no matter what their type parameters, use the same class definition. Also, they internally store Objects, just like before Generics came around. But, they also hold a Class object defining what their type parameter is. Then, any time a LinkedList object uses its type parameter, the compiler casts the underlying Object to the right type. So, this incurs a bit more runtime overhead, because the number of casts is non-negligible, but it doesn't require as much code as the C++ version.

History

It should be noted that C# introduced generics before Java did. They were intentionally left out of Java in earlier versions, possibly because they seemed too C++-ish, until C# released its own Generics feature. It's possible that Java introduced them for fear that otherwise everyone might flock to C#. The C# version of templates is similar, but (of course) not identical to Java's. I have no experience with C#'s generics, so maybe someone else should node that here.

Conclusions

Generics are a good example of the generic programming paradigm, and they significantly increase programmer productivity for anyone who uses them effectively. Java programmers should learn this stuff; it will save you a lot of grief. You should also learn everything in Java 1.5, because its primary purpose is to increase productivity, and it definitely does.


1 In fact we can write much more concise code than this, if we use other features from Java 1.5. But in this example, I only wanted to illustrate the changes made by Generics.
2 Sorta. What it really guarantees is no type errors "out of the blue". Any potentially unsafe code will either trigger a warning, or refuse to compile without a cast. So, in some sense, any type errors you get are ones you explicitly opened yourself up to.
3 See Java 1.5 for an explanation of the for loop syntax used here. It's similar to the foreach in many languages, and it's just a shorthand for the Iterator stuff we did earlier.

Introduction

Generic programming is a philosophy of programming that emphasizes the algorithm, invented/discovered by Alexander Stepanov. Its most famous incarnation is in the STL of C++; however it is quite distinct. Best to say that C++, as a multi-paradigmatic language, also supports generic programming.

At work we have a fairly complex Message class. There are at least two different possible views of the contents of Message, neither of which exists in contiguous memory. Just last week I got asked...

"Do we have a substring find algorithm on Messages? And what about regular expressions? Do we have algorithms for those? I looked and couldn't find any relevant method in class Message."

Algorithms

Algorithms are a sequence of steps. For example, an algorithm for finding a minimal element is:

  1. Initialise the current minimal element to the first element
  2. For every succeeding element, if it is smaller than the current minimal element, set the current minimal element to that element
  3. At the end, the current minimal element is a minimal element.

This algorithm is horribly unspecified! Typically, you'll find it completely specified: the elements are assumed to be stored in an array, they are assumed to be of type (say) float, the size of the array is assumed to be known, etc. Given these properties, it is possible to prove that the algorithm works. But all of these details are completely irrelevant to the algorithm.

C++ comes with a standard library. For historical reasons, the std::string class has a "find substring" algorithm; for equally historical reasons, it doesn't have a "find regexp" algorithm. std::string is not a good example of generic programming.

The header file <algorithm> contains the find_end algorithm; if your read it carefully you'll see it's just a "find substring" algorithm, and that it seems to work on std::strings.

Generic programming suggests we look at the algorithm as the important part, then see what is required for it to work.

Concepts

What does an algorithm need in order to work? Types for the arguments (and return value) that satisfy certain requirements.

The requirements a type needs to fulfill in order for an algorithm to work on it are part of the description of the algorithm. In the above "minimal element" example, we need these requirements:

  • The elements must be ordered
  • A non-empty Range of elements must be given as input
  • Given an element, it must be possible to access the next element
  • It must be possible to determine when we have reached the end of the range
Note that the algorithm works even if the elements are only partially ordered! Of course, in that case we shall only find a (rather than the) minimal element -- but the same algorithm delivers.

It turns out that some collections of properties get used by many algorithms. These are called concepts. The "minimal element" algorithm uses a (nonempty) Range specified by Forward Iterators that can be dereferenced to a Comparable value type. In English:

  • A pair of iterators b,e, guaranteeing that if b is advanced enough times it will equal e;
  • b ≠ e;
  • The type of the iterator b supports advancing to the next element;
  • The type of the iterator b can be saved and used later;
  • The value type *b can be compared.

That's it! In one stroke, our simple algorithm will work for... anything for which it can work!

"Both views of Message are defined in terms of methods returning start and end iterators. These are bidirectional iterators. The value types are some unsigned type, so:

  1. For finding substrings, the std::find_end algorithm (which we didn't write) works unchanged;
  2. The boost::regex::match algorithm (which we didn't write, either) works if you add a small definition to adapt it to our value type.
And almost every other algorithm that doesn't require Random Access Iterators also works for Message."

Satisfying a concept sounds like inheritance, but isn't (necessarily) that. It is more like satisfying an interface -- but not in the object oriented sense. Algorithms are instantiated for particular types. Indeed, something like "forward iterator" is not an OO interface -- that would require specification of the value type!

C++ concepts specifically allow for non-objects -- the typical random access iterator is a naked pointer.

Libraries

The upshot of generic programming is that you get supremely capable libraries. Libraries are composed of algorithms -- bits of code -- and objects -- things you can give the algorithms to use. Algorithms are specified in terms of the concepts they require in order to work correctly. Objects are specified in terms of the concepts they implement.

The libraries are extensible in two ways:

  • New algorithms can be constructed, their required concepts elucidated, and take as inputs existing objects;
  • New objects can be created, the concepts they satisfy specified, and passed as inputs to existing algorithms.
Typically, of course, libraries will also contain various adaptors, to create new objects that bend existing objects to fit new concepts.

Implementations

Stepanov originally hoped to create a generic library for Ada. Compilers were unready, and template programming not understood well enough, to allow this to happen. Eventually he created the Standard Template Library for C++, which got standardized.

The STL supplies mostly algorithms, but it is more famous for its containers: a set of models of many of the concepts needed for the algorithms. Ironically, the containers are so well-known because of their genericity: they fit so many different situations that it often makes little sense to create a new container.

STL algorithms are pretty basic (and the containers, while good, are rather boring). The brilliance is, again, in their genericity. Even better, it turns out that STL algorithms are every bit as efficient as hand-written code!

std::find_end works with both views of our Messages, despite the fact the Stepanov didn't know of our Message types.

For further concepts, algorithms, and objects, the place to turn is BOOST -- the staging ground for new C++ libraries.

Java 5 (aka Java 1.5, don't ask me what happened to "1.") salvages the mess that are Java (object-oriented, not generic) Collections by adding some measure of genericity. It also tries to formalize concepts in the language. It does not try to match C++ STL efficiency, that being left to the Java compiler and JITs.

The upcoming version of C++, ISO C++ 0x, may support concept checking. Thanks to the magic/horror of C++ template metaprogramming, this requires rather fewer changes to the language than might be expected. Run-time efficiency will not be compromised, of course -- this is C++.

Log in or register to write something here or to contact authors.