From where should the index start?

Lost User

BobJanova wrote:

outside of certain computing circles?

Actually you need to grow the group significantly. It is math circles, which are the basis for all sciences. We use a non-zero base index when it comes to physical things that exist in the world, because we imply the 0th element (nothing) by talking about the object. In math and the sciences you cannot imply this, and must therefore define it. The zero-based index dates back to before computers. It just so happens that computers made it more common knowledge.

Computers have been intelligent for a long time now. It just so happens that the program writers are about as effective as a room full of monkeys trying to crank out a copy of Hamlet.

Lost User

You have a very good point. I don't use for loops at all, unless the algorithm requires it. If I am iterating through a collection there is rarely a need to know if I am at 0 or 1. I just care about the order, and that is why we have sorting algorithms out of the box. It is all hidden and makes it very easy. When there is a need your loop takes that into account.


Lost User

If you try to page within an array, list, or any other indexed structure, anything that is not zero-based is a pain.
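The paging arithmetic behind that remark can be sketched in C as follows. This is only an illustration under assumed conventions: the function names are hypothetical, and "one-based" here means both pages and element indices start at 1.

```c
#include <assert.h>
#include <stddef.h>

/* Zero-based pages over zero-based indices: one multiplication. */
size_t page_start_zero_based(size_t page, size_t page_size)
{
    return page * page_size;
}

/* One-based pages over one-based indices: shift down, multiply, shift up. */
size_t page_start_one_based(size_t page, size_t page_size)
{
    assert(page >= 1); /* page 0 does not exist in this scheme */
    return (page - 1) * page_size + 1;
}

/* Which page holds a given element? */
size_t page_of_zero_based(size_t index, size_t page_size)
{
    assert(page_size > 0);
    return index / page_size;
}

size_t page_of_one_based(size_t index, size_t page_size)
{
    assert(index >= 1 && page_size > 0);
    return (index - 1) / page_size + 1;
}
```

With a page size of 10, zero-based page 2 starts at index 20, while one-based page 3 (the same page) starts at index 21. The extra -1/+1 pair is where off-by-one mistakes tend to creep in.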

Dr Walt Fair PE

So why do you think a compiler couldn't figure that out for you? If we're going to write a compiler and define how we set up indices, I don't see any problem. If your algorithm depends on constants that could be changed outside the interface, that doesn't sound like good algorithm design to me. You're right, I don't work on large teams. But I was always told that encapsulation is a good thing in a team environment. In that case, the algorithms are basically a black box to the rest of the team and it shouldn't matter how they are written internally. For example, I have some simultaneous PDEs to solve with sufficient boundary conditions that my natural (physically meaningful) indices go from 2 to N-2. But when we solve the resulting difference equations, the algorithms always start with index 1 in FORTRAN or index 0 in C. It really doesn't matter much, since those details are hidden. The mathematicians use index = 1 as the base, so sticking with that is much more readable from a numerical methods point of view. In fact, when translating the algorithms, I usually get it working with a base-1 array, realizing I'm wasting the memory for the 0 entries. Then I shift the loop indices and increment the values and test everything again, then finally replace the indices and reduce the array size by 1. Doing it that way allows me to implement complicated array manipulations without introducing errors that can't be detected. I see no reason whatsoever that the compiler couldn't do those rote manipulations.
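One way to keep textbook-style 1-based indices in C without the wasted slot is to put the shift in a single accessor. The sketch below is not the poster's code: `at1` and `second_difference` are hypothetical names, and the three-point stencil is just a stand-in for a difference equation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: one-based access u(1)..u(n) over zero-based storage.
   The index shift lives here and nowhere else. */
double *at1(double *u, size_t n, size_t i)
{
    assert(i >= 1 && i <= n);
    return &u[i - 1];
}

/* Three-point stencil u(i-1) - 2*u(i) + u(i+1), written with the
   one-based indices a numerical-methods text would use.
   Valid for interior points i = 2..n-1. */
double second_difference(double *u, size_t n, size_t i)
{
    assert(i >= 2 && i + 1 <= n);
    return *at1(u, n, i - 1) - 2.0 * *at1(u, n, i) + *at1(u, n, i + 1);
}
```

For the samples 1, 4, 9, 16, 25 the stencil at i = 3 gives 4 - 18 + 16 = 2. The assertions catch an out-of-range index at the accessor instead of letting it read a neighbouring slot silently.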

CQ de W5ALT

Walt Fair, Jr., P. E. Comport Computing Specializing in Technical Engineering Software

Lost User

I am starting to wonder if you are actually reading what I am posting. I never said the compiler can't figure it out. I am saying it shouldn't have to. If you change the base index to whatever one specific algorithm needs, integration is a nightmare. Yes, a black box is great, BUT you need a consistent API. Imagine if everyone defined their own integers, i.e. one system said its integers can go out to 32 bits, another said 33, and still another said 34. The complexity of integrating the systems would get ridiculous real fast. The same thing goes for your starting index. If I pass you a collection and also have to pass you the starting 'label' (as some are referring to the index), I have added complexity to the API. If that alone is not enough, when I string 3 or 3 dozen algorithms together and have to do the integration testing, I can't find my data. Encapsulation is important, but you still have to test integration of the components. Now if everything miraculously works from the get-go and you need never touch it again (Wake up, Alice, you're dreaming!!!), you are fine. But more likely something will not work during integration testing or when an upgrade occurs. To narrow down which component is failing you will have to follow the data. If you are using differently based indices you will spend an enormous amount of time calculating back to a single base (be it 1 or 0).
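To make the API cost concrete, here is a rough C sketch of a collection that carries its own starting label. `BasedArray`, `based_get` and `rebase_label` are invented for illustration and do not come from any real library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container whose first valid label is chosen by the producer. */
typedef struct {
    const int *data;  /* zero-based storage */
    size_t     count; /* number of elements */
    long       base;  /* label of the first element, e.g. 0, 1 or 2 */
} BasedArray;

/* Every consumer has to translate the label before touching storage. */
int based_get(const BasedArray *a, long label)
{
    assert(label >= a->base && (size_t)(label - a->base) < a->count);
    return a->data[label - a->base];
}

/* Following one element across two components with different bases. */
long rebase_label(long label, long from_base, long to_base)
{
    return label - from_base + to_base;
}
```

Each extra component with its own base adds another `rebase_label` step when tracing a value through the pipeline. A single agreed base removes both the `base` field and the conversion.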


Dr Walt Fair PE

Collin Jasnoch wrote:

I am starting to wonder if you are actually reading what I am posting.

Same here.

Collin Jasnoch wrote:

I never said the compiler can't figure it out. I am saying it shouldn't have to.

And that's where we disagree. Peace. That's why there is a multiplicity of languages to choose from and why we don't all work on the same things.


Dan Neely

...and by starting at one instead of zero, each iteration of the loop will be doing *ListBox1.Items[p+(n-1)*s] (give or take the amount of C++ I've forgotten over the years) each time you wrote ListBox1.Items[i] instead of *ListBox1.Items[p+(n)*s]. If you really have a problem with the concept of zero-based indexing, invent a language that doesn't have pure arrays and only allows indexing collections via a foreach-style construct. Don't forget to define the behavior when the body of the loop adds or removes an item from the collection. E.g., this isn't legal C#:

foreach (var item in myCollection)
{
    if (item.readyToDelete)
        myCollection.Remove(item);
}

Instead you have to write:

// Walk backward so a removal never shifts an index not yet visited.
for (int i = myCollection.Count - 1; i >= 0; i--)
{
    if (myCollection[i].readyToDelete)
        myCollection.Remove(myCollection[i]);
}

Don't forget that while the example I gave would be fairly trivial, you would also need to implement code to handle DeleteSevenRandomItemsInCollectionAndAddThreeNewItemsAtRandomLocations(), DeleteItemIfIndexIsEvenAndAddANewItemBeforeIfItIsOdd() and all the other horrible cases that urgent programmers would write, which give pseudorandom results if the iteration order varies and offer no logical pattern in which the collection can be iterated. MS didn't allow adding/removing from the collection in a foreach loop for a reason.
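The same remove-while-iterating pattern can be sketched in C over a plain array. This is an analogue, not a translation of the C# above: the `Item` struct and `remove_ready` are assumed names, and removal is done by shifting the tail with `memmove`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    int  id;
    bool ready_to_delete;
} Item;

/* Removes flagged items in place and returns the new count. */
size_t remove_ready(Item *items, size_t count)
{
    assert(items != NULL || count == 0);
    /* Visit indices count-1 down to 0. size_t is unsigned, so the loop
       counts i from count down to 1 and uses i - 1 as the index instead
       of testing for i >= 0, which would always be true. */
    for (size_t i = count; i > 0; i--) {
        size_t idx = i - 1;
        if (items[idx].ready_to_delete) {
            /* Close the gap by shifting the tail left one slot. */
            memmove(&items[idx], &items[idx + 1],
                    (count - idx - 1) * sizeof(Item));
            count--;
        }
    }
    return count;
}
```

Going backward means a removal only moves elements that have already been visited, so no index is skipped. With zero-based indices the loop bounds are simply "count down to 0", though the unsigned counter needs the `i - 1` idiom shown here.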

Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius

Training a telescope on one's own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt

Rob Grainger

I'm not so sure. The point of a language is to abstract away such technical details (which are getting close to C/assembly style here). There's no reason a language can't implicitly do this conversion at compile time, making it a moot point. Designing language features to make computation more efficient is a fallacy: languages should be designed to make things easier for programmers, then implemented to achieve the desired performance. That said, I still prefer 0-based indexing; this just seems the wrong reason to me, heavily steeped in the C tradition of having pointers and arrays be interchangeable.

Rob Grainger

I fundamentally disagree. We have always begun counting at 1; we do not begin at zero but omit it. That's an incredibly weird outlook. Get out more - you're spending too much time with programmers. Consider bananas - if I have three bananas I count them 1, 2, 3. Yet, if I have no bananas, I'm unlikely to begin counting them at all. I don't think we're omitting the zero - it was never there to begin with. Zero was introduced as a concept much later than counting, and is one of the most significant developments in mathematics, but it's not involved in counting.

Stuart Rubin

In general, I try to make my variables "domain specific" and single purpose. So, if my function returns "milliamps" but can also return an "error code", I will usually have the function return an enumerated "error code" like

typedef enum {ERR_NONE, ERR_BAD_READING, ERR_BUSY} AdcErrCode;

and pass the returned "milliamps" value by reference. When I'm feeling really disciplined, I may even define the data type like "typedef uint16_t Milliamp". For your example, I would certainly return a separate error code like

typedef enum {STR_SRCH_FOUND, STR_SRCH_NOT_FOUND, STR_SRCH_BAD_STR} StrSearchErr;

or something and return the index by reference. Typedefing your domain-specific types adds ZERO additional run time or memory overhead. Thoughtful, explicit typedefs also make your static code analyzers work a lot better. Note that these are examples from a C point of view. So, the point is not whether to use 0 or 1 as the first index, or what your error code should be, but to be extremely explicit in your data types (even in loosely typed languages) and variable names. An "index" in your search is not necessarily the same as a letter "count".
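A short C sketch of that pattern for a substring search might look like this. The enum values are taken from the post; `StrIndex` and `str_search` are assumed names, and `strstr` from the standard library does the actual searching.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { STR_SRCH_FOUND, STR_SRCH_NOT_FOUND, STR_SRCH_BAD_STR } StrSearchErr;

/* Domain-specific type: a zero-based position within a string. */
typedef size_t StrIndex;

/* The status is the return value; the position comes back through
   index_out and is written only when the needle is found. */
StrSearchErr str_search(const char *haystack, const char *needle,
                        StrIndex *index_out)
{
    assert(index_out != NULL);
    if (haystack == NULL || needle == NULL) {
        return STR_SRCH_BAD_STR;
    }
    const char *hit = strstr(haystack, needle);
    if (hit == NULL) {
        return STR_SRCH_NOT_FOUND;
    }
    *index_out = (StrIndex)(hit - haystack);
    return STR_SRCH_FOUND;
}
```

Because "found at position 0" and "not found" travel through different channels, a match at the first character can no longer be mistaken for a false or failed result, whichever base the index uses.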

Mark AJA

A bit off subject, but sometimes I want the HTML <LI> element to start at 0 instead of 1, or to allow numbers under 1, but <LI VALUE="0"> will not set it to 0. I ended up using a table instead of a <UL> with <LI>s.

Lost User

Rob Grainger wrote:

We have always begun counting at 1; we do not begin at zero but omit it. That's an incredibly weird outlook.

Actually it is a scientific way of looking at it.

Rob Grainger wrote:

I don't think we're omitting the zero - it was never there to begin with. Zero was introduced as a concept much later than counting, and is one of the most significant developments in mathematics, but it's not involved in counting.

No, it was not introduced at the same time as counting. "Hey Ugh. Get me 3 sticks for fire".... "1.. 2... 3" Counting has been around for a while, but math proofs and formulas which require enumeration did require it and still do (ohhh wait... Is that what computers are doing??? Math formulas? Hmmm, I thought they were just for pretty pictures of Salma Hayek). If you are familiar with math proofs you should know the most common value that you are converging from or to is 0. If you are using a 1-based index (or counting and omitting 0) you will make the algorithms more complicated. Here is another way to show we are indeed omitting in human language. If you write down "10", we know you mean "10". A computer, however, will use the data before it. If you did nothing to the data it will not be "10" but could be 26310 or 28510 and so on and so on. Now I know what you are thinking: the compiler handles that, so why can't it do this for us? Because whether you wrote "10" or "0010" has absolutely no effect on the algorithm. Starting your array from 1 does. If you do not grasp that you may need to study mathematics a little more.


C Grant Anderson

Except on Tuesdays when it should start at a random number and use base 6 or 18 depending on if it is raining or not.

jocstar

In my everyday life (at least for the first 365 days) I was in year[0] :)

Alan Balkany

A rule I follow for indexes/ordinals is: "If it's visible through the GUI, start at 1. If it's internal to the program, start at 0." The zero-based index is elegant because it's naturally the offset from a base. The one-based index is intuitive because in the real world, counting starts with one (i.e. the first item).
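That rule amounts to converting at the display boundary only. A minimal C sketch, with hypothetical function names and an invented label format:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats "Item k of n" for display; index is the internal zero-based
   position, and the +1 happens only here, at the GUI boundary. */
int format_position(char *buf, size_t buf_size, size_t index, size_t count)
{
    assert(index < count);
    return snprintf(buf, buf_size, "Item %zu of %zu", index + 1, count);
}

/* Converts a one-based number typed by the user back to an internal index. */
size_t index_from_user(size_t shown_number)
{
    assert(shown_number >= 1);
    return shown_number - 1;
}
```

Keeping both conversions in one pair of functions means the rest of the program never has to know which convention the user sees.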

SeattleC

Array indices start at zero because inside the hardware what you need is the offset from the beginning of the array to index the element. It sounds silly, but the cost of subtracting one from every index expression adds up to a noticeable cost in a decent-sized program. Optimization in this year's compilers is probably good enough to overcome this cost in statically typed languages, but these languages are all old enough to vote now. In a byte-code language or interpreter, you pay this price forever. Getting confused between 0 (false) and 0 (first character) results from a lack of imagination in the design of string handling routines. A find-substring function could return the substring or the empty string, instead of the index. It could return a slice of the string, an iterator, etc. It seems like every third-year CS undergrad wants to design their own programming language after taking the compilers class. It's like a rite of passage. It's probably best to wait until they've written significant programs in several languages before they actually do it.
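The offset argument can be written out in C pointer arithmetic. This is a sketch with made-up function names; whether the extra subtraction survives optimization depends on the compiler and is not claimed here.

```c
#include <assert.h>
#include <stddef.h>

/* Zero-based: the index already is the offset from the base address. */
const int *element_zero_based(const int *base, size_t i)
{
    return base + i; /* address = base + i * sizeof(int) */
}

/* One-based: the naive form subtracts one on every access. */
const int *element_one_based(const int *base, size_t i)
{
    assert(i >= 1);
    return base + (i - 1);
}

/* A loop written with one-based bounds, going through the accessor above. */
long sum_with_one_based_loop(const int *base, size_t n)
{
    long total = 0;
    for (size_t i = 1; i <= n; i++) {
        total += *element_one_based(base, i);
    }
    return total;
}
```

Both accessors reach the same memory: zero-based index 2 and one-based index 3 yield the same pointer. The difference is only the `i - 1` that has to be computed, or optimized away, somewhere.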

Antonino Porcino

I've designed a scripting language which was intended to be used by entry-level programmers. With that in mind, I adopted 1-based indexing (both for strings and arrays) in the hope of making it simpler to understand. My reasoning was that it is easier for unskilled coders to remember that the 1st element is [1] and not [0]. Now that a lot of code has been written in that language, I really regret my choice. On average I see that code is slightly longer than it would be if it were 0-based, and mistaken access to the [0] element often causes problems. At a certain point I decided to revert to 0-based indexing, but it was too late because there was too much code to fix. So my advice is, stick to 0-based indexing, especially if your audience has a background in C/C++/Java/C#.

Nikunj_Bhatt

Seeing all the replies, I think you have provided a very strong PRACTICAL answer. All the others are just stating their own thoughts and not going deep into the actual problem. However, I would still like to use a 1-based index. I could be wrong at present; I should first rewrite some of my programs with a 1-based index instead of a 0-based one and check. I don't think using a 1-based index would create any problem. You may be facing the problem because entry-level programmers will move on to other, higher-generation languages in future, and so from the start they think in terms of the methods those languages provide. And I don't think a 1-based index would create any noticeable delay, because the 1-based index would exist only for the programmer; the machine code can use a 0-based index, the conversion can be applied at compile time, and execution speed will not be affected. Programmers should not need to become comfortable with the computer's language; languages can be designed to be comfortable for programmers. As a matter of common sense, if I tell someone (as I am a teacher, I often say this to my students) to display the value of the 1st element of an array, what would they understand? The first element of the array, which is array[0], OR array[1]? I then have to say "display the value of the 0th element"! Isn't this strange (actually, weird)?

wbaxter37

I was going to say I really don't care, but then remembered my stint in Rocky Mountain BASIC (on HP technical computers for engineering and test) where you could use any set of indices you wanted. I ended up settling on 0 for all arrays because it made bounds checking a lot easier. Remember the 3-week rule: After 3 weeks you'll look at your code and mutter "Why the hell did I do that and what does it do?". It's a lot easier if there is consistency in your code. After years of C/C++/C# and recent Ruby I have no problem with 0-based indices.

Alan Burkhart

I would favor a 1-based index instead of zero based. "First" is "1st" not "0st". Humans automatically think of first as being #1. No blue ribbon ever had a "0st" on it. Back in the day when BASIC allowed us to choose how to index, I always chose one-based indexing. And, I've always thought it a bit clumsy that a For-Next loop had to stop at Count-1 instead of Count.
