Questionable instance of C code, in which a negative number is used to index an array. For example:

char st[5];
st[-1] = 0;

This code looks at the memory address that st is located at, backs up one byte (assuming you're on a system with one-byte characters), and writes the value 0 to it. Since array indexing is another syntax for dereferencing pointer arithmetic, the code is functionally equivalent to:

char st[5];
*(st-1) = 0;

In the above example, there is no way to determine what is stored in memory directly before st; the code might run flawlessly, it may overwrite some arbitrary variable, or it may crash with a Segmentation Fault.

There are occasional instances where negative indexing could come in handy. If a pointer p points to the nth element of an array, then p[-m] (where m is a positive integer) refers to the element m places before the one the pointer refers to. Extreme care must be taken, however, to ensure that m < n at all times.


Why did I node this? The Frankensteinian project I'm debugging for work right now has fifty-six instances of negative pointer indexing.

I feel I should draw attention to Cabaal's tiny but helpful writeup for brackets, which recommends the use of &#91; and &#93; to put [ or ] into a writeup without linking. That might be intuitively obvious to some, but not to everybody...

C

In C, negative array indexing is useful when used in conjunction with some pointer into the array.

Unlike Pascal, C arrays always start at index 0. This is a Good Thing. Except that sometimes it isn't.

Say we want to translate the 8 directions


1 2 3
8 0 4
7 6 5

from their offsets in X and Y to the direction number. That is, we want to translate (-1,-1) to 7, (+1,-1) to 5, (0,+1) to 2 and so on. "if"s are slow; we want to use data directed programming to do the translation by lookup.

We'll use a nice array (note that the Y coordinate increases from bottom up in mathematics, but from top down in programming -- so we've reversed the order of the rows):

int to_dir[3][3] = {
  { 7, 6, 5 },
  { 8, 0, 4 },
  { 1, 2, 3 }
};

To determine the direction number of (dx,dy), we need to lookup to_dir[dy+1][dx+1]. This is ugly and error-prone: it is quite hard to forget we need to add all those 1's.

Here's a better way to do it. Add the definition

int (*dir)[3] = (int(*)[3])&to_dir[1][1];
to the code. Yes, it's ugly. But the typecast is needed for good cause: we're trying to reassign a pointer to arrays of 3 ints into the middle of an array of arrays of 3 ints, rather than to the beginning of a row!

Now use dir[-1][-1], and its friends dir[i][j] with -1≤i,j≤+1 to access the directions. No possibility of forgetting to add 1's -- it's already there inside dir. For added safety, you'd probably prevent access to to_dir by giving it minimal scope, and exporting only dir.

Perl and others

In Perl and others (like Python)too, negative array indexing is not wasted. Perl doesn't have pointer arithmetic (mostly because it doesn't have pointers). Instead, negative indices are counted from the array's end: $ARGV[-1] is the last command line argument.

This almost (but not quite) parallels the behaviour of many UN*X utilities, of course. For instance, vi's "G" command goes to the line given as argument. So "123G" goes to line 123. Line 0 is the last (not first!) line, and is the default; just "G" goes to the last line. And negative line numbers go to lines counted down from the last line.

ISO C++ with generic programming and Gorgonzola cheese

Below, gorgonzola demonstrates how beautifully C++ lets you build array-like classes that allow various forms of negative array indexing. As with everything else in the language, you can do it. The code is incredibly long-winded, but C++ programmers are used to that.

Superficially, prime fodder for exhibition in the C++ Programming Language Freakshow (tm), ariels Esq. (prop.). But in actual fact, it's too simple. After all, the code is only for 1-dimensional arrays. We're interested in doing the same for multi-dimensional matrices. This, I'm sure, will require just a little more code.

If this were Algol or Pascal, we might simply issue the following declarations to implement ariels' example matrix:

type compass = array [-1..1, -1..1] of integer;
const conv = compass [7, 6, 5, 8, 0, 4, 1, 2, 3];  { HIGHLY compiler-dependent }
{ const conv: compass = ((7, 6, 5), (8, 0, 4), (1, 2, 3));  Delphi }
and simply use the conv structured constant. But Pascal is not the favorite programming language of many people (more's the pity) and we must look to other languages to solve our problem.


ariels' method for his C example essentially works like this: Convert the array into a pointer, fiddle with the pointer, convert it back into an array. Now, I personally can't see how this would work any way other than the way he describes, but unfortunately, a C++ compiler isn't supposed to compile it.

You can convert an array to a pointer1 then add to it. But you can't convert a pointer back to an array, period. Converting a pointer into an array via static_cast is explicitly forbidden2 by the C++ Standard, and the section on reinterpret_cast3 does not mention it as one of the allowed conversions. Even if it were allowed, it would result in unspecified behavior.

If we only wanted a one-dimensional array, we wouldn't have to convert it back. We could do something like

const int *zero_based = { 7, 6, 5, 8, 0, 4, 1, 2, 3 };
const int *translated = zero_based+4;
and access translated[-4] thru translated[4] with no problems4. Unfortunately, to get a multidimensional array, you need to cast back to a multidimensional array type, which you can't do.


So, we're left with the problem of how to do negative array indexing in C++. The answer is "design a class to do it", as you might expect with anything complicated in C++. Examining the Pascal example above, we find there are four things to implement:

  1. Allocating and maintaining access to an array.
  2. Providing for negative array indexing.
  3. Restricting the allowed index values.
  4. Extending the array to multiple dimensions.

The most flexible type of array would also allow it to be rotated, such that sub-arrays could be accessed from different dimensions, but that's extraneous to our purpose. But examining and discarding this task helps us in our design: Examining the task further, we see that issues 3 and 4 can also be put aside for now. Issue 3 can be solved using the Constrained pattern to design a restricted index type for the array. Issue 4 can be solved with recursive templates, or by designing proxy classes that allow different slices into the array in much the same way as C++'s valarray template.

We're left with allocating and maintaining acces to the array, as well as providing for negative array indexing. Believe it or not, C++ already contains a class that'll work for us! We can simply use a std::map<long, value-type> for some value-type we want to collect. This technique is the best one if we have a fairly sparse array, or if we're going to access the array exclusively via the map's subscripting operator, operator[]() Unfortunately, map iterators don't work like array iterators. A secondary consideration is the greater amount of overhead for a map versus an array or a std::vector So if we want to use sandard algorithms on the elements of the array class, we'll have to roll our own.

But thinking about standard container classes introduces the notion that we could at least design an array class analogous to a standard one. We have the choice of emulating two Standard C++ Container classes: std::vector and std::valarray5. We can't use either directly, as both require an unsigned integral type as an index. We could simply wrap one of these classes and provide an offset:

#include <vector>

template<class T, class allocator=std::Allocator<T> >
class nvector
{
 long offset;
 std::vector<T, allocator> wrapped;
 public:
 typedef long index_type;
 T &operator[] (long i) { return wrapped [i+offset]; }
 T const &operator[] (long i) const { return wrapped [i+offset]; }
 //
 // Now, imagine all the other vector boiler plate, forwarded to the 
 // wrapped vector
 //
};

However, since we've dropped the implied requirement for all indexes to start at zero, we've opened up the possibility of modifying the array at both ends (similar to a std::list). We will probably be better off making our own implemntation that takes care of both ends, using vector as our inspiration:

template<class T, class allocator=std::Allocator<T> >
class devector
{
 private:
 T *blkbeg;
 T *arrbeg;
 T *arrend;
 T *blkend;
 long begindex;
 void ensure_index (long i); // implmentation left to your imagination
 public:
 T *begin() { return arrbeg; }
 T const *begin() const { return arrbeg; }
 T *end() { return arrbeg; }
 T const *end() const { return arrend; }
 size_t size() { return arrend-arrbeg; }
 size_t capacity() { return blkend-blkbeg; }
 typedef long index_type;
 T &operator[] (long i)
 {
  ensure_index (i);
  return *(arrbeg+begindex); 
 }
 T const &operator[] (long i) const
 {
  if (i < begindex) return T();
  if (begindex-i >= size() return T();
  return *(arrbeg+begindex);
 } 
 //
 // Now, imagine all the other nice sequencey stuff, properly implemented
 //
};

As suggested above, devector < devector <some-type> > would be a (somewhat inflexible) two-dimensional array that allowed negative indexing.


References to International Standard ISO-IEC 14882, Programming Languages -- C++, published by the American National Standards Institute, © 1998 Information Technology Industry Council:
1Section 4.2, Array-to-pointer conversion
2Section 5.2.9.6, static_cast
3Section 5.2.10, reinterpret_cast
4Section 5.2.1, Subscripting

5I hear you laughing, ariels. valarray is so complex as to be almost useless, but it is our best starting point for a rotatable, sliceable array that also allows negative indexing.

Log in or register to write something here or to contact authors.