**Introduction**

In programming languages and mathematical notation, there are a few
different ways to do array indexing - the first choice is to start at 0
or at 1. If you want to refer to an *interval* (or slice) [ a :
b ] of an array, there is a second choice - should the interval
include or exclude point b. And thirdly, there is the option of using
negative array indexing, and deciding what that should mean.

The node "Array indexes as measure of a language's analness" mentions that there are different schools of thought on this subject. Personally I believe there is One True Way of indexing, and all the rest are just bad design. In this node I will explain what the right way to do indexing and slicing is, and why. Not incidentally, Python designer Guido van Rossum had just the same thoughts, and I will give Python examples.

**One True Way of Indexing**

Given an array/list (from now on list, since that's the Python
term) L, the One True Way of indexing is:

- L[0] is the first item of the list.
- L[a:b] is the slice of all the items L[x],
for a <= x < b. So this excludes the endpoint, it is the
*half-open interval*that is often written [ a, b ) in math. - Negative indices count from the end of the list. That is, L[-x] = L[length(L)-x] for x > 0.

What makes one way of indexing better than the other? It's better if it can lead to simpler expressions in common situations. Off-by-one errors are an extremely common source of bugs. The fewer "-1" or "+1" terms clutter your expressions the better. The One True Way of indexing leads to very elegant expressions, especially when intervals and negative indexing are involved. Without the intervals, I don't think there is much difference between starting with 0 or 1. With them...

Python writes a slice L[0:a] as L[:a], and L[a:len(L)] as L[a:] (and L[:] for the identity slice, or copy of the original list).

**Examples**

In the following list, the equations hold for the One True Way;
in *italics* are the equivalent expressions assuming that we
start counting at 1, an interval includes the end point, and there is
no negative indexing:

- The length of a slice L[a:b] is b-a (if b >= a, 0 otherwise) (
*b-a+1*) - The first n characters of L: L[:n] (
*also L[:n]*) - The last n characters of L: L[-n:] (
*L[len(L)-n+1:]*) - The identity slice. L == L[0:len(L)] == L[:] (
*L[1:len(L)]*) - The empty slice: L[a:a] is empty for any a. (
*perhaps L[a:a-1]?*) - A slice of length n, from point a, is L[a:a+n] (
*L[a:a+n-1]*) - For any index a, L == L[0,a]+L[a,len(L)] == L[:a]+L[a:] (
*L[1:a]+L[len(L)-a+1]*) - Given a slice L[a:b] and an index c between them, L[a:b] == L[a:c]+L[c:b] (works for negative indices as well) (
*L[a:c-1]+L[c:b]*)

The negative indexing makes it easy to do array operations 'backwards', and most of the above equations still hold for negative a and b. Not a +1 or -1 in sight for the True Way! On the other hand, I'm even in doubt whether I wrote all the expressions correctly for the other option...

**Visualization help**

One thing people often complain about is that this style of
indexing is counter-intuitive. People start counting items at 1, and
when they say "items 1 to 3" they mean items 1, 2 *and* 3.

If you have trouble with this, you should visualize lists a little
differently; the indices refer to points *between* the items in the
list. This way, there is no ambiguity possible (no different interpretations), and the math works out exactly the same. Consider the array with the letters of the alphabet:

+---+---+---+---+---+---+---+---+---+------- | a | b | c | d | e | f | g | h | i | .... +---+---+---+---+---+---+---+---+---+------ 0 1 2 3 4 5 6 7 8 9

Here, it is clear that [2:5] refers to c, d, and e, and nothing else. It's also natural that you would slice *between* numbers and not right through them.

The same ideas work well in Computer Science, correctness proofs and the like; this is not just for programming languages. Use half-open intervals.