From where should the index start?

Stefan_Lang

Personally, I consider any function that returns -1 (or 0) to indicate an error, but an actual value otherwise, to be bad design: an error indication and a meaningful value shouldn't be represented by one and the same number! Instead, the function result and the error indication should be separated, either by passing an output variable by reference and using the return value as an error indicator only, or by using exceptions. This problem is not related to indexing at all, however, just to function design. Independently of that, if I were to design a new language, I would consider who would or should use it. If it is meant mostly as an alternative to existing languages, then I might be inclined to stick with the usual semantics of basing everything at 0. OTOH, I find a 1-based index more natural, and therefore it would be more suitable for a language that someone would use to learn programming from scratch.
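
A minimal C++ sketch of the separation described above, assuming C++17 for std::optional. The function names find_sentinel and find_position are made up for illustration; neither comes from any particular library.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Sentinel style: the result and the "not found" signal share one int.
int find_sentinel(const std::string& text, char wanted) {
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == wanted) return static_cast<int>(i);
    }
    return -1;  // only unambiguous because -1 is never a valid 0-based index
}

// Separated style: an empty optional means "not found"; any contained
// value is a real position, whatever index base the language uses.
std::optional<std::size_t> find_position(const std::string& text, char wanted) {
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == wanted) return i;
    }
    return std::nullopt;
}
```

With the second form, the choice of index base no longer constrains which value is free to act as the error marker.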

Stefan_Lang

I agree. The compiler should - indeed usually must - use a 0-based index. But that shouldn't be a restriction for a high-level programming language. For all I care, an index range could be any sequence of numbers from a to b, no matter what a and b are - depending on the problem at hand it might even make sense to use negative values! For instance, consider an array that stores voltages for a balance dial on a radio which can only be set to discrete values: the center value would be 0 (naturally!), values to the left might be -1, -2, -3, etc., whereas values to the right would be +1, +2, +3, etc. Now, why would I use indexes from 0 to N (or 1 to N) to store the appropriate voltage level for each setting? It would be much more natural and easier to grasp if I could just use an index range of -N to N instead! Consider index values as integral key values that are used to quickly retrieve associated values in a simple map - and what do you get? Right, freely definable index ranges. (Oh, and please don't nail me if my example of 'voltage levels' being set by a balance dial isn't accurate; I'm a software guy, not an electrician ;) )
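
One way to sketch a freely definable index range in C++. RangedArray and make_balance_table are hypothetical names, and the voltages are invented purely for illustration; the storage underneath is still 0-based.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: valid indexes run from lo to hi inclusive.
template <typename T>
class RangedArray {
public:
    RangedArray(int lo, int hi)
        : lo_(lo), hi_(hi), data_(static_cast<std::size_t>(hi - lo + 1)) {}

    T& operator[](int index) {
        assert(index >= lo_ && index <= hi_);  // bounds check in debug builds
        // Translate the user-facing index to the 0-based storage slot.
        return data_[static_cast<std::size_t>(index - lo_)];
    }

    int lower() const { return lo_; }
    int upper() const { return hi_; }

private:
    int lo_;
    int hi_;
    std::vector<T> data_;
};

// Fills a table for dial positions -n..n with made-up voltages.
RangedArray<double> make_balance_table(int n) {
    RangedArray<double> volts(-n, n);
    for (int pos = -n; pos <= n; ++pos) {
        volts[pos] = 0.5 * pos;  // illustrative values only
    }
    return volts;
}
```

The only cost over a plain array is one subtraction per access, which is the index-to-slot translation the compiler would otherwise do on the programmer's behalf.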

Kabwla Phone

Let me reverse your question... Where do you want your index to end? Suppose the character/element you are looking for is the last in the string/list. Should retrieving the value be implemented as zero-based, List[List.Count - 1], or one-based, List[List.Count]?

Antonino Porcino

Another argument for 1-based indexing: when you ask novices to display the last element in an array, it's hard to explain why it's array[count-1] instead of array[count]. Also consider that sometimes the language syntax itself forces the choice of index numbering. For example, in BASIC the FOR...TO...NEXT statement uses "<=" as its end-condition check, so 1-based indexing is more natural: FOR i=1 TO Count instead of FOR i=0 TO Count-1.
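
The two conventions can be put side by side in C++. This is only a sketch: at_one_based is a made-up helper that presents a 0-based std::vector as 1-based, and all three functions assume a non-empty vector and a position within range.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: position runs from 1 to items.size().
int at_one_based(const std::vector<int>& items, std::size_t position) {
    return items[position - 1];
}

// 0-based: the last element sits at count - 1.
int last_zero_based(const std::vector<int>& items) {
    return items[items.size() - 1];
}

// 1-based: the last element sits at count, and an inclusive "<=" loop
// (in the spirit of BASIC's FOR i = 1 TO Count) visits every element.
int sum_one_based(const std::vector<int>& items) {
    int total = 0;
    for (std::size_t i = 1; i <= items.size(); ++i) {
        total += at_one_based(items, i);
    }
    return total;
}
```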

Stefan_Lang

I've thought a bit more about the ideas and suggestions in your original post as well as the responses (including my own). I've realized two things: First, as mentioned before, the problem you describe about string functions and checking whether they succeeded is not related to index values at all, or at least shouldn't be, and if you consider that a problem and are about to design a new language, it should be easy for you to avoid that trap. For instance, you could attach an error flag to each object type, including built-in types. Or you could define that all functions return an object of type 'ErrorResult' which has both the actual result value(s) and the error state as components. I'm sure there are many other possible ways. Second, 0 or 1 are not the only possible base values for an index. It might be the exception rather than the rule, but in a certain context, just about any range might make sense. You could even allow enum types to serve as an index, in which case the term 'base' actually loses its meaning, at least from the PoV of the developer. Let's say you have an array with one element for each weekday. Now, what would you consider the 'base index' of such an array, when your code looks like this:

enum weekdays { Monday, Tuesday, /* ... */ Sunday };
menu[Sunday].lunch = "chicken curry";

You could also allow strings or other types as an index, obliterating the boundaries between maps and arrays, but then it would be hard to determine the actual array size, not to mention implement its layout.
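
A fuller C++ sketch of the enum-as-index idea, with the size still known at compile time. The names Weekday, Menu and WeeklyMenu are assumptions made for this example, not part of the snippet above.

```cpp
#include <array>
#include <cstddef>
#include <string>

enum class Weekday { Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday };
constexpr std::size_t kWeekdayCount = 7;

struct Menu {
    std::string lunch;
};

// Hypothetical wrapper: callers index by Weekday only, so the numeric
// base of the underlying storage never appears in user code.
class WeeklyMenu {
public:
    Menu& operator[](Weekday day) {
        return slots_[static_cast<std::size_t>(day)];
    }
    std::size_t size() const { return slots_.size(); }

private:
    std::array<Menu, kWeekdayCount> slots_{};
};
```

Because the enum has a fixed number of values, the layout stays a plain contiguous array, which sidesteps the sizing problem that string keys would introduce.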

Eric A Carter

If I am reading this correctly, you are stating that VB.NET has to recalculate the value of (ListBox1.Items.Count-1) through EVERY ITERATION of the loop; does anyone else see the flaw in that logic? The trouble with your example is this: you are making the assumption that the compiler will not optimise the above code at compile time. If you can show me that VB.NET (or any modern high-level language, for that matter) won't ensure that in almost all cases (ListBox1.Items.Count-1) is precalculated and placed in a register when compiled into machine code, I'll show you a programming language that needs a serious rewrite. Another thing that compilers like to optimise for is the ability to test against 0, which often uses less code and is several cycles faster in many cases than tests against non-0 values. While most compilers will quite happily refactor a (0 to (count-1)) loop to take advantage of the faster test, a (1 to count) loop may force the compiler to use more/slower code in order to preserve the indexing, thereby wasting cycles. While this may not be true all of the time (and probably not so much now as it was when I first started coding), the point is that 0-based indexing can often provide a hint to the compiler to use a faster increment/decrement strategy. I think it's obvious now which camp I stand in. I think of it like this: if everything started at 1, then I would be a year older than I really am, and I'm old enough as it is. Besides, when I was born I would have been 1 year old by default, which is clearly wrong no matter how bad you are at mathematics. In life as well as code, one must always start from 0 and work up from there. :-D

"All programming is an exercise in caching." - Michael Abrash, GPBB

Nikunj_Bhatt

I don't know whether the compiler would optimize ListBox1.Items.Count-1 in the For loop or not. I think the compiler will not optimize it, because Items.Count may change while the statements inside the loop are executing, and therefore it must be calculated whenever this statement is encountered. And about the example of life: if you compare life with programming, then you would be 1 year + 9 months old at birth. Your age could be 1.5 years at some point in time, but there is no array element with the index 1.5 in any programming language. Although I am not good at maths, I know that Zero means no-thing and 1 means there is at least One thing. You can say that there are Zero apples on the desk when there is no apple on the desk, but doesn't it look wrong to say that there is element Zero in an array when the array has One element? This is like a paradox.
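
The concern about a count that changes inside the loop can be illustrated in C++, where a for condition is re-evaluated before every iteration. This sketch says nothing about what VB.NET's For...To actually does with its bound; both function names are made up, and the vector is passed by value so the caller's copy is untouched.

```cpp
#include <cstddef>
#include <vector>

// Bound re-read every iteration: the loop notices the removal and stops
// early. Returns how many iterations ran.
int count_iterations_live_bound(std::vector<int> items) {
    int iterations = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0) items.pop_back();  // the body changes the count
        ++iterations;
    }
    return iterations;
}

// Bound captured once before the loop: the removal goes unnoticed, so a
// body that read items[i] on the final pass would run past the end.
int count_iterations_cached_bound(std::vector<int> items) {
    const std::size_t count = items.size();
    int iterations = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (i == 0) items.pop_back();
        ++iterations;
    }
    return iterations;
}
```

For a three-element vector the first function runs two iterations and the second runs three, so hoisting the bound is only a safe optimisation when the compiler can prove the body leaves the count alone.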

Nikunj_Bhatt

You are right that

Nagy Vilmos wrote:

Where we can, we should not be using indexes today.

But

Nagy Vilmos wrote:

When we do, need to respect how things are stored.

I think you are forgetting the question. The sentences in my first post in this thread are:

Now suppose you are going to design a new language and forget that any programming language already exists. Then from where would you start the index number? Zero or One or something else?

In the question, I am clearly asking "if you are going to design a new language" & "forget that any programming language already exists". So, I was expecting answers that keep these words in mind. If you can design a new language, you can also define how things will be stored. Otherwise nothing innovative exists in your mind; you may be following a crowd of mad people.

Eric A Carter

1. I stand corrected - I just ran a few simple tests which prove your case. I still hold that in certain situations, 0-based can produce faster machine-level code than non-0; clearly this is not one of them, due to runtime recalculation concerns.

2. Technically you are correct, but unless there are some really pedantic people out there, I know of no one who would say they just turned 'X years and X weeks' old on their birthday (depending on how many weeks they gestated for - not everyone spends 9 months in the womb).

In some ways, the difference between 0 and 1 in this discussion is moot, because eventually everything comes down to 0-based ADDRESSING in machine code anyway. As far as 'indexing' (an entirely different thing to 'addressing', mind you) in a high-level language is concerned, there are 3 basic options which are entirely preference-dependent:

a. 0-based indexing, mimicking addressing to nullify/trivialise index-to-address calculations;
b. 1-based indexing, mimicking the human perception of quantity;
c. N-based indexing, allowing the user to specify the base index to complement a particular indexing system used 'natively' by a code block.

Personally, I prefer the 0-based approach, because the machine works that way for addressing anyway, and any other indexing system can be emulated (for almost no cost these days) by simply adding an offset value: 1-based = index + 1, 12-based = index + 12, -345-based = index + -345; not hard at all. If I were to write a high-level language of my own, however, I admit I'd probably code in the option of preset array offsets, with a default offset of 0 if one wasn't specified at array allocation. All of these options have been used to great effect in many languages over the years, and all of them have pros and cons. If you're writing a high-level/scripting language, your best bet is to go with the indexing system that works best for you first, and if others like it, that's a bonus.

Nikunj_Bhatt

You are right that:

Eric A. Carter wrote:

the difference between 0 and 1 in this discussion is moot, because eventually everything comes down to 0-based ADDRESSING in machine code anyway

From this:

Eric A. Carter wrote:

Personally, I prefer the 0-based approach, because the machine works that way for addressing anyway

:doh: :( :(( Throughout this thread I have found that most people are thinking about the computer/compiler (machine) and not about the programmer (human). Are we (programmers, humans) writing code for the sake of the machine? Are the variety of languages made to ease the work of compilers, or to ease programming for programmers (humans)? And I don't think everyone is writing their thoughts after considering what they would do if they were to build a new language while keeping in mind that no programming language exists yet.