Nested loops
-
Making an arbitrary rule that allows no more than one nested loop (two loops) *is* indeed a silly rule. See here[^] for an example where such a rule makes no sense. Your data structures, or database, or whatever, may require more looping. Unrolling such loops usually results in harder-to-understand and often less efficient code. It may not be possible to re-engineer or re-factor the code without redoing the entire system, which is often an unachievable goal for legacy and cost reasons. If this is a new system, then yeah, by all means go for it, if it's possible and truly better. I have found that often it's not really better.
"If your actions inspire others to dream more, learn more, do more and become more, you are a leader." - John Quincy Adams
"You must accept one of two basic premises: Either we are alone in the universe, or we are not alone in the universe. And either way, the implications are staggering." - Wernher von Braun
Every situation is different, but too many levels of loops can overwhelm the maintenance programmer with complexity. Functions/methods exist to chop the logic into manageable chunks. After the first couple of loops, the rest can be abstracted away into a subroutine. This makes the function understandable, and if more detail is needed the subroutine can then optionally be examined.
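For illustration, here's a minimal C# sketch of that kind of extraction (the types and names are invented for the example, not taken from any post):

class LineItem { public decimal Price; }
class Order { public LineItem[] LineItems = new LineItem[0]; }

static class OrderProcessing
{
    // Two readable outer loops; the would-be third level lives behind a named helper.
    public static void ApplyDiscounts(Order[][] ordersPerCustomer)
    {
        for (int c = 0; c < ordersPerCustomer.Length; c++)            // customers
        {
            for (int o = 0; o < ordersPerCustomer[c].Length; o++)     // orders per customer
            {
                DiscountLineItems(ordersPerCustomer[c][o]);           // third loop hidden here
            }
        }
    }

    static void DiscountLineItems(Order order)
    {
        foreach (var item in order.LineItems)                         // the abstracted-away loop
            item.Price *= 0.9m;
    }
}

The reader sees at a glance what the function does, and only drills into DiscountLineItems() if the detail matters.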
"Microsoft -- Adding unnecessary complexity to your work since 1987!"
-
That's just silly. Use as many nested loops as required for the algorithm and no more. It may be possible to re-factor and re-engineer the algorithm to use fewer loops, but that may make the code more complex and harder to understand. Judicious use of the KISS principle is, I think, best.
"If your actions inspire others to dream more, learn more, do more and become more, you are a leader." - John Quincy Adams
"You must accept one of two basic premises: Either we are alone in the universe, or we are not alone in the universe. And either way, the implications are staggering." - Wernher von Braun
ahmed zahmed wrote:
Use as many nested loops as required for the algorithm and no more.
I totally agree. But... One loop makes a problem linear in the number of items in the loop. Two loops generally make it O(n squared). Three loops make it O(n cubed). In most practical cases, once you get to three nested loops, you need to ask yourself, "Am I using an efficient algorithm to compute this result?" While there exist O(n cubed) algorithms, in the workaday world you usually don't see them. Another thing: multiply-nested loops so often happen because somebody didn't really understand what they were iterating over. Many a gnarly function of hundreds of lines and loops nested three or four deep have I analyzed and converted to a far simpler form. The deep nesting is often a signal that you're "doing it wrong".
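To make that concrete, here's a small C# sketch (hypothetical types, not from the thread) of the classic case: a nested scan that is really a lookup in disguise.

using System.Collections.Generic;
using System.Linq;

class Invoice { public int Id; public bool Paid; }
class Payment { public int InvoiceId; }

static class Matching
{
    // Nested scan: O(invoices * payments)
    public static void MarkPaidNested(List<Invoice> invoices, List<Payment> payments)
    {
        foreach (var payment in payments)
            foreach (var invoice in invoices)
                if (invoice.Id == payment.InvoiceId)
                    invoice.Paid = true;
    }

    // Same result with one loop and a dictionary: O(invoices + payments)
    public static void MarkPaidIndexed(List<Invoice> invoices, List<Payment> payments)
    {
        var byId = invoices.ToDictionary(inv => inv.Id);
        foreach (var payment in payments)
            if (byId.TryGetValue(payment.InvoiceId, out var invoice))
                invoice.Paid = true;
    }
}

The inner loop wasn't really iteration at all; it was a search, and naming it as such removes both a nesting level and an order of complexity.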
-
Following up on the Variable names thread below, I would like to ask: what level of nested loops is acceptable in general? Most people would agree that three levels is acceptable, but I would say stop at two. The third loop gets a bit messy.
for(int i=0; i<100; i++) { //Okay
    for(int j=0; j<100; j++) { //Acceptable
        for(int k=0; k<100; k++) { //Messy
        }
    }
}
Shameel wrote:
Most people would agree that three levels is acceptable, but I would say stop at two.
Not sure I have ever seen three, certainly not recently; if I did, it was only because it was spanning an array with 3 dimensions. I very seldom encounter two. I suspect that if this is something you see more often than, say, once a year, then either someone is designing something wrong, or your problem domain itself leads to solutions that require it. If the first, it is a design problem, not a code problem. If the second, I would expect those working in that space to be comfortable with the idiom (more so than people working in other domains), so it isn't a problem.
-
That's just silly. Use as many nested loops as required for the algorithm and no more. It may be possible to re-factor and re-engineer the algorithm to use fewer loops, but that may make the code more complex and harder to understand. Judicious use of the KISS principle is, I think, best.
"If your actions inspire others to dream more, learn more, do more and become more, you are a leader." - John Quincy Adams
"You must accept one of two basic premises: Either we are alone in the universe, or we are not alone in the universe. And either way, the implications are staggering." - Wernher von Braun
I like short functions - anything longer than 10 lines is suspicious, IMO, and I have never encountered a case where I absolutely needed more than 200 lines in one function. If you have three levels of nesting, there's room to extract a function and give it a decent, descriptive and reasonable name (occasionally a long one, but that's IMO still better than writing comments - compilers don't check comments). Thus, I'd say nesting beyond three is rarely justified. And extracting smaller functions just makes the code more readable, not worse.
-
mark merrens wrote:
c) Two: anything else is lunacy and must be stamped out: 2 dimensions is more than enough for anybody!
That statement pretty much reminds me of what someone once supposedly said:
No one will ever need more than 640KB of memory
In geometry, I can easily think of problems that require 6 or more nested loops. In tensor analysis, double that. I could imagine that top-notch physicists and mathematicians may occasionally need even more. How do you think they model their ideas about 10-, 20- or higher-dimensional space-time? You cannot refactor away the need for a nested loop. End of story.
Suppose you have some operation on multidimensional arrays. You don't write distinct versions for each number of dimensions; you use recursion. Recursion for such cases doesn't require more than one nesting level.
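A rough C# sketch of what that might look like, assuming a flat row-major layout (the function and parameter names are mine, purely for illustration):

using System.Linq;

static class NdScale
{
    // Multiply every element of an n-dimensional array (stored flat, row-major)
    // by a constant. One loop per call; the recursion replaces the nested loops.
    public static void ScaleSlice(double[] data, int[] dims, int dim, int offset, double factor)
    {
        if (dim == dims.Length)                 // recursed down to a single element
        {
            data[offset] *= factor;
            return;
        }

        // number of elements covered by one index step at this dimension
        int stride = dims.Skip(dim + 1).Aggregate(1, (a, b) => a * b);

        for (int i = 0; i < dims[dim]; i++)     // the only loop at this level
            ScaleSlice(data, dims, dim + 1, offset + i * stride, factor);
    }
}

// Usage: scale a 3 x 4 x 5 array by 2.0
// var data = new double[3 * 4 * 5];
// NdScale.ScaleSlice(data, new[] { 3, 4, 5 }, dim: 0, offset: 0, factor: 2.0);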
-
for (int index = 0; index < 1000000; index++)
{
    int k = index % 100;
    int j = (index / 100) % 100;
    int i = index / 10000;
}
Don't think so. That might be the worst possible way to do it.
for (int i = 0; i < 100; i++)
{
    innerLoop(i);
}
Great. So you thought you found the actual logic? Nope, Chuck Testa: a useless outer loop that tells you nothing about what's going on. This might work if you have code bubbles[^], but otherwise you have to keep turning stones one by one until you finally find the interesting part. It's probably the "accepted way" anyway, but it sucks. Any possible way to do this sucks. Fortunately it's not very common.
You assume innerLoop() is what the inner loop should be. Suppose you need to multiply an n-dimensional array by a constant. How about multiplySliceBy(index, multiplier)? As long as you have multiplySliceBy() defined for dimension counts from 1 up to an arbitrary upper limit, I think it's self-explanatory code.
-
If that's in your satnav, you'd have to drive really slowly for all those nested function calls to complete before you arrive wherever you wanted it to direct you to.
That was the case with ancient compilers/interpreters. Nowadays function calls are cheap in most cases (unless you put huge frames on the stack - i.e. your functions have very long parameter lists, I mean several dozen parameters). If you're smart and put all parameters except the loop counters/indexes in a struct, you only need to pass a pointer to the struct - it's something pointer-like even when you pass an object in C# or Java, so it can be made very cheap. In C++, you can declare the functions inline, which tells the compiler it may produce code as if there weren't any functions. Thus, you get the best of both worlds: optimized code for the computer and readable code for human readers.
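If it helps, here is a small C# sketch of the "bundle the parameters" idea (all names invented for the example):

class LoopContext
{
    public double[] Samples;
    public double Gain;
    public double Offset;
    // ...any further parameters live here instead of in every call signature
}

static class Processor
{
    public static void Process(LoopContext ctx, int rows, int cols)
    {
        for (int row = 0; row < rows; row++)
            ProcessRow(ctx, row, cols);        // cheap call: one reference plus two ints
    }

    static void ProcessRow(LoopContext ctx, int row, int cols)
    {
        for (int col = 0; col < cols; col++)
        {
            int i = row * cols + col;
            ctx.Samples[i] = ctx.Samples[i] * ctx.Gain + ctx.Offset;
        }
    }
}

Note that C# classes are reference types, so passing ctx copies only a reference, as described above.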
-
You assume innerLoop() is what the inner loop should be. Suppose you need to multiply an n-dimensional array by a constant. How about multiplySliceBy(index, multiplier)? As long as you have multiplySliceBy() defined for dimension counts from 1 up to an arbitrary upper limit, I think it's self-explanatory code.
-
I like short functions - anything longer than 10 lines is suspicious, IMO, and I have never encountered a case where I absolutely needed more than 200 lines in one function. If you have three levels of nesting, there's room to extract a function and give it a decent, descriptive and reasonable name (occasionally a long one, but that's IMO still better than writing comments - compilers don't check comments). Thus, I'd say nesting beyond three is rarely justified. And extracting smaller functions just makes the code more readable, not worse.
Moving the code of inner loops into functions doesn't reduce the nesting; it often complicates understanding and seldom makes things simpler.
"If your actions inspire others to dream more, learn more, do more and become more, you are a leader." - John Quincy Adams
"You must accept one of two basic premises: Either we are alone in the universe, or we are not alone in the universe. And either way, the implications are staggering." - Wernher von Braun
-
For me personally, if you are at the point of needing to go deeper than two levels, then you need to rethink the approach.
Lobster Thermidor aux crevettes with a Mornay sauce, served in a Provençale manner with shallots and aubergines, garnished with truffle pate, brandy and a fried egg on top and Spam - Monty Python Spam Sketch
'tis said by Linus Torvalds that "If you need more than 3 levels of indentation, you're screwed anyway, and should fix your program."[^] While this is a quote from 1995, it is certainly a guideline worth thinking about. When you have many levels of nesting, what are the options?
- Collapse your 2D array into a flat array and reference it using a multiplicative index (much like the underlying work done with a 2D array anyway).
- Call a function at a given depth, splitting the complexity into separate pieces and cutting the indentation.
- Rework the program logic to require less indentation in the first place, such as using external variables or handling exceptions separately.
These all have their good and bad points, and as always it depends on circumstances. As people here have pointed out, situations such as iterating across 3D constructs do often require additional levels of indentation. It's down to the situation at hand; if it makes things easier to read/maintain, then simplify things where you can. If "simpler" is more twisty and fragmented, then just tolerate the extra indentation and get back to work.
Sometimes a fist in the face says more than a thousand honeyed words.
-
Following up on the Variable names thread below, I would like to ask: what level of nested loops is acceptable in general? Most people would agree that three levels is acceptable, but I would say stop at two. The third loop gets a bit messy.
for(int i=0; i<100; i++) { //Okay
    for(int j=0; j<100; j++) { //Acceptable
        for(int k=0; k<100; k++) { //Messy
        }
    }
}
I always use one more than necessary:
for($ocd=0; $ocd
But other than that one special idiosyncrasy, use as many loops as are necessary. It's helpful to use iterators whenever possible, for languages that support them. Three loops are what I typically see.
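For languages with iterator support, a quick C# sketch of what that buys you (names are illustrative only):

using System.Collections.Generic;

static class GridWalk
{
    // The nested traversal is written once, inside the iterator...
    public static IEnumerable<(int Row, int Col)> Cells(int rows, int cols)
    {
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                yield return (r, c);
    }
}

// ...so every caller gets a single flat loop:
// foreach (var (row, col) in GridWalk.Cells(100, 100))
//     grid[row, col] = 0;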
-
I like short functions - anything longer than 10 lines is suspicious, IMO, and I have never encountered a case where I absolutely needed more than 200 lines in one function. If you have three levels of nesting, there's room to extract a function and give it a decent, descriptive and reasonable name (occasionally a long one, but that's IMO still better than writing comments - compilers don't check comments). Thus, I'd say nesting beyond three is rarely justified. And extracting smaller functions just makes the code more readable, not worse.
This will blow a circuit in you: I work at a place where we have some really old legacy code (from the 1980s). Functions are 500 to 2000 lines long and have very large switch statements (many cases). [The code was originally done in C, ported to C++, and some of it ported to C#.] Yes, it sucks badly. But how would you refactor the code to make the functions smaller? What design pattern(s) would you use to make it meaningful (show off your software design skills)? Thanks.
-
Following up on the Variable names thread below, I would like to ask: what level of nested loops is acceptable in general? Most people would agree that three levels is acceptable, but I would say stop at two. The third loop gets a bit messy.
for(int i=0; i<100; i++) { //Okay
    for(int j=0; j<100; j++) { //Acceptable
        for(int k=0; k<100; k++) { //Messy
        }
    }
}
So, how would you loop through a 3D array? :laugh: Your Messy loop is only going through a million iterations, and since it isn't doing anything, that's about a millisecond or two of cost. If it did something in the middle that cost a second to run, that IS messy (over 11 and a half days). It's also extremely short, and therefore easy to read.
My point of view is: do what needs to be done, but with the least amount of work needed, and do it in the least confusing way you can think of. (Sometimes those goals clash; other times they clash because I'm not thinking clearly.)
Years ago, a challenge was issued where I was taking classes. It was not part of the class, but it was interesting, and it made me learn more about C# than the classes did. A fellow student calculated that with our current computers, it would take 6 months to figure out the puzzle. I said to myself, "No way! However, I can see how the way I'm currently trying it would take too long, and I have to think about the right solution." I didn't get a complete solution, but I found some right (and wrong) answers in 5 minutes. Long after class, I went back to the puzzle and found solutions in 40 minutes. I knew StringBuilder was more efficient, but I wanted to find out how efficient, so I replaced my string math with builder math and cut it to 20 minutes. Some more optimizations got it down to 4 minutes. The class machines would have done it in 8 minutes. I can do it now in 2 minutes.
If you use indeterminate logic (determine whether it makes sense to go on before doing so, which makes the point where you decide not to go on indeterminate), I'm at a loss for how to calculate how long it will take before I've got code that takes the time it needs. So, take the worst solution I can think of: 40 nested loops, each looping 7 times. Estimate the cost of validating the numbers in the puzzle as zero, so the estimate is based on the looping alone. How long would it take? About 400,000,000,000,000,000 years (approximately, using the class computers). Speed the computer up to 50 GHz and you can divide that estimate by 100, which is still a pretty long wait. :-D
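The back-of-the-envelope figure is roughly right. A quick C# check (the iterations-per-second number is just an assumption):

using System;

class LoopEstimate
{
    static void Main()
    {
        double iterations = Math.Pow(7, 40);          // 40 nested loops of 7: ~6.4e33 iterations
        double perSecond = 5e8;                       // assume ~500 million empty iterations per second
        double years = iterations / perSecond / (365.25 * 24 * 3600);
        Console.WriteLine($"{iterations:E2} iterations, about {years:E2} years");
        // prints roughly 6.37E+33 iterations, about 4.04E+17 years
    }
}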
-
Suppose you have some operation on multidimensional arrays. You don't write distinct versions for each number of dimensions; you use recursion. Recursion for such cases doesn't require more than one nesting level.
Recursion? :confused: Sorry, but that makes even less sense to me than just extracting an independent function: you have to add additional parameters to initialize and control the recursion level, and that usually makes the code harder to maintain than just nesting loops. IMHO that completely defeats the OP's purpose. Also, I cannot think of a good example where you actually have to fully traverse an N-dimensional array with N > 3 (TBH I cannot even think of a good example with N == 3). In the examples I have been thinking of, this is simply not the case: there are two-dimensional arrays of spline patches containing two-dimensional grids of control points that define segments for each spline patch, and points within each patch are defined by two parameters. A function that determines the intersection with another surface will have to iterate over the 2D set of patches, and within a patch over the 2D set of grid points, just to find out whether there could be an intersection within that segment. Then it iterates again to find the exact parameter pairs of the points of the intersection curve. Recursion is simply not an option here. (Yes, you could extract a function in this example, e.g. for determining the intersection with just one spline patch, but then you'll sacrifice some potential for optimization, and you'll add the need to merge the resulting partial intersection curves. Again, it would complicate the code, not simplify it.)
-
Following up on the Variable names thread below, I would like to ask: what level of nested loops is acceptable in general? Most people would agree that three levels is acceptable, but I would say stop at two. The third loop gets a bit messy.
for(int i=0; i<100; i++) { //Okay
    for(int j=0; j<100; j++) { //Acceptable
        for(int k=0; k<100; k++) { //Messy
        }
    }
}
I wrote a Sudoku program and it used 9 nested loops, as you can imagine. It may be easy to write code with very few nested loops for a simple function, but when many nested loops are more readable, just write them.
-
That was the case with ancient compilers/interpreters. Nowadays function calls are cheap in most cases (unless you put huge frames on the stack - i.e. your functions have very long parameter lists, I mean several dozen parameters). If you're smart and put all parameters except the loop counters/indexes in a struct, you only need to pass a pointer to the struct - it's something pointer-like even when you pass an object in C# or Java, so it can be made very cheap. In C++, you can declare the functions inline, which tells the compiler it may produce code as if there weren't any functions. Thus, you get the best of both worlds: optimized code for the computer and readable code for human readers.
Florin Jurcovici wrote:
That was the case with ancient compilers/interpreters.
If anything, modern CPU architectures make these things more complex than they have ever been before: pipelining, prefetching, synchronising processes - you name it. Passing function arguments is just one of many things to take care of, and many of these things didn't even exist 20 years ago. Actually, because modern compilers are so much smarter and modern hardware so much more capable, the relative cost of function calls in fact increases: within a single function, prefetching and pipelining allow the CPU to process a whole series of consecutive commands much faster. Interrupting the control flow by calling another function, however, prevents all this optimization! Suddenly the whole processing chain comes to a full stop! The context is saved, the new context created, and a new pipeline slowly set into motion. The context change isn't the big thing - you're right about that. But preventing your highly optimized, parallel, pipelining, multicore CPU from exploiting its inherent parallelism is going to adversely affect program performance.
This, by the way, is also one of the bigger challenges of parallel programming: spreading a simple loop (nested or not) over multiple processors only helps up to a certain point; the cost of context changes, and of not being able to take full advantage of CPU-internal parallelism, is much higher than many people think. You may not be aware of it, but you are using parallel programming as well: the compiler will take advantage of your CPU's or multi-CPU system's inherent parallelism as much as it can. But it cannot parallelize code across function call boundaries! Even if your code doesn't lend itself well to parallelization, at the very least the CPU can load chunks of the big array you're iterating through into the L2 cache, so reading and writing will be much faster. If you extract your inner loops into another function, however, the higher-level function has no way of knowing which data that function accesses, and therefore won't preload anything. Instead, each function call will load the data range needed there. That's fine if the first call accesses only the first part of the big array, and the next calls access consecutive parts. But there's no guarantee of that: maybe the first call reads the 1st, 11th, 21st, 31st, etc. element, and the second call reads the 2nd, 12th, 22nd, etc. In that case,
-
This will blow a circuit in you: I work at a place where we have some really old legacy code (from the 1980s). Functions are 500 to 2000 lines long and have very large switch statements (many cases). [The code was originally done in C, ported to C++, and some of it ported to C#.] Yes, it sucks badly. But how would you refactor the code to make the functions smaller? What design pattern(s) would you use to make it meaningful (show off your software design skills)? Thanks.
I don't know the code, so I can't say anything for sure. But if I correctly recognize the pattern, the code screams to be refactored into something properly object-oriented - I suppose many of the switch statements dispatch on the same variable/set of values. Use extract method to split the large methods, then extract class to pull out the smaller, specialized classes, then extract a common interface for all the small classes, then change the switch statements to just call a method on an interface. It probably won't be that simple, and it will involve a lot of (though not just) manual work, but you should eventually get somewhere, IMO.
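In case it helps, here's a minimal C# sketch of the direction I mean (the interface, handler names and cases are invented; the real code would obviously dictate the shape):

using System;
using System.Collections.Generic;

interface ICommandHandler
{
    void Execute(string payload);
}

// Each former switch case becomes a small, testable class.
class PrintHandler : ICommandHandler
{
    public void Execute(string payload) => Console.WriteLine(payload);
}

class SaveHandler : ICommandHandler
{
    public void Execute(string payload) { /* formerly the body of "case SAVE:" */ }
}

static class Dispatcher
{
    static readonly Dictionary<string, ICommandHandler> Handlers =
        new Dictionary<string, ICommandHandler>
        {
            ["print"] = new PrintHandler(),
            ["save"]  = new SaveHandler(),
            // ...one entry per former case label
        };

    public static void Dispatch(string command, string payload)
    {
        if (Handlers.TryGetValue(command, out var handler))
            handler.Execute(payload);
        else
            throw new ArgumentException("Unknown command: " + command);
    }
}

The giant switch shrinks to a dictionary lookup, and each case can then be extracted, understood and ported one class at a time.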