Fighting a monster
-
I started to dive into a new (for me) very large code base. One of the files is 9500 lines. Got me wondering: what's the largest single source file you have ever met? I'm not talking about automatically generated source files, but those written by humans. On the same note, when do you think it's time to break a file into smaller pieces? For me, it is somewhere around 1000 lines.
Mircea
-
Line count is not a default metric for me. If all the code is concise and organized well and has purpose, who cares about line count...I don't.
-
Well, one of the worst I ever saw had a single method that had over 25K lines. It was so convoluted, it blew up the Cyclomatic Complexity calculations.
-
Oil and gas FORTRAN programs would typically run that long. No comments. 8 letter names. Engineers. My source gets split when its description gets too cumbersome (indicating code smell): e.g. "the charge, pursuit, retreat adapter".
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
-
What language? C# should never have been released to the wild without partial classes. :mad: For me, it's not directly about the size. I wouldn't break up a file simply because it's larger than X units. The point is that being "too big" can make things hard to find, and "too hard to find" is what matters more than simply "size". Similarly, having two pieces of code in separate files makes it easier to have them open in two windows beside each other for whatever reason you may need to do that. Splitting the code for an application into several files makes version control easier and reduces change conflicts when multiple developers are working on the same code base. And code sharing between unrelated applications is easier when the applications share a minimum of code and don't share code they don't actually rely on. The lower and more common the code is, the more granular it should be. My library code is in single-method (possibly a family of overloaded methods) files, and each application can include only the parts it requires. Higher level -- application-specific -- code should probably be separated more by functional area; frontend, backend, configuration, administration, etc. Multi-file projects just make everything better for a team of developers. Size itself doesn't matter.
-
What language? C# should never have been released to the wild without partial classes. :mad: For me, it's not directly about the size. I wouldn't break up a file simply because it's larger than X units. The point is that being "too big" can make things hard to find, and "too hard to find" is what matters more than simply "size". Similarly, having two pieces of code in separate files makes it easier to have them open in two windows beside each other for whatever reason you may need to do that. Splitting the code for an application into several files makes version control easier and reduces change conflicts when multiple developers are working on the same code base. And code sharing between unrelated applications is easier when the applications share a minimum of code and don't share code they don't actually rely on. The lower and more common the code is, the more granular it should be. My library code is in single-method (possibly a family of overloaded methods) files, and each application can include only the parts it requires. Higher level -- application-specific -- code should probably be separated more by functional area; frontend, backend, configuration, administration, etc. Multi-file projects just make everything better for a team of developers. Size itself doesn't matter.
I *wish* C++14 and above had partial classes
To err is human. Fortune favors the monsters.
-
No idea of the line count, but almost certainly assembler code: maybe 1/2MB or thereabouts? The assembler we were using only supported single files: no includes, no relocatable blocks, no linker. For a 32KB ROM, that's only 16 chars per line so it's probably about right - maybe a little conservative. Since the ROM was full, and most common instructions one byte long, maybe 28K lines or so?
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony "Common sense is so rare these days, it should be classified as a super power" - Random T-shirt AntiTwitter: @DalekDave is now a follower!
-
I *wish* C++14 and above had partial classes
To err is human. Fortune favors the monsters.
You can't split a class into multiple files in C++ to be compiled together into one executable? I'm pretty sure you can, but I only ever dabbled in C++ back in the day. Or are you saying that you want to have different parts of a class compiled into separate executables (DLLs)?
-
You can't split a class into multiple files in C++ to be compiled together into one executable? I'm pretty sure you can, but I only ever dabbled in C++ back in the day. Or are you saying that you want to have different parts of a class compiled into separate executables (DLLs)?
No, it would be nice to be able to declare a class twice in two different files and have it compiled into the same binary. C++ does not let you do that, at least not to my knowledge, unless they added it after C++17. For example, I have a draw class that handles all the drawing operations in my graphics library. It would be nice to segregate the different drawing primitives into different files, but the only way to do that is to delegate and forward or to use multiple C++ implementation files for a single class, but you still wind up with all the method declarations for all the drawing primitives in the same header. You can hack around it using the preprocessor, by #include-ing class fragments, but that's techy. Edit: You can use inheritance to approximate it, but that still runs you into visibility issues.
To err is human. Fortune favors the monsters.
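For anyone following along, here is a rough sketch of the inheritance approximation mentioned above. Everything in it (`DrawState`, `DrawPoints`, `DrawLines`, the `pixels_` buffer, the file names in the comments) is made up for illustration and is not taken from the graphics library being discussed. It assumes each fragment could sit in its own header, and it shows the visibility cost: the shared state has to be `protected` instead of `private` so that every fragment can reach it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ---- draw_state.h (hypothetical) ----
// Shared state. It cannot be private, or the fragments below could not use it.
class DrawState {
public:
    std::size_t pixel_count() const { return pixels_.size(); }
protected:
    std::vector<int> pixels_;  // stand-in for a real frame buffer
};

// ---- draw_points.h (hypothetical) ----
class DrawPoints : public virtual DrawState {
public:
    void point(int v) { pixels_.push_back(v); }
};

// ---- draw_lines.h (hypothetical) ----
class DrawLines : public virtual DrawState {
public:
    void line(int from, int to) {
        assert(from <= to);  // this sketch only handles ascending ranges
        for (int v = from; v <= to; ++v) pixels_.push_back(v);
    }
};

// ---- draw.h (hypothetical) ----
// The class users see: it just stitches the fragments together.
// Virtual inheritance keeps a single shared DrawState.
class Draw : public DrawPoints, public DrawLines {};
```

With this, `Draw d; d.point(7); d.line(1, 3);` leaves four entries in the shared buffer. The price is that anything deriving from `DrawState` can touch `pixels_`, which is the visibility problem in a nutshell.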
-
Well, it was a long time ago but back in pre-history when I was writing COBOL (no longer on cards thank God) the main program in an overnight suite I supported was (I think) 21,000 lines of code (well, code + blank space). We did everything we could to avoid printing the thing (360 pages), so the listing was frequently annotated by hand. Of course that didn't really help when the actual code had typos that still allowed it to compile. As overnight on-call support, I'd frequently get bleeped (by pager) and have to hook up the 600baud teletype to my landline, and get the relevant bits of memory dump. (Most issues were S0C7 faults). Yes, for some reason data was never properly validated; actual code bugs were rare, but get a non-numeric character in a character column in data and the whole shooting match failed.
-
I have to deal with a program that has several files, one of which is 88K lines and about 2.5MB. To add to the misery, there are thousands of global variables. We have rewritten most applications based on this but not all of them and it's an on-going thing.
"They have a consciousness, they have a life, they have a soul! Damn you! Let the rabbits wear glasses! Save our brothers! Can I get an amen?"
-
You can't split a class into multiple files in C++ to be compiled together into one executable? I'm pretty sure you can, but I only ever dabbled in C++ back in the day. Or are you saying that you want to have different parts of a class compiled into separate executables (DLLs)?
Yes, you can. Class definition has to stay in one file but implementation can go in any number of files.
Mircea
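A minimal sketch of that arrangement, shown as one listing so it can be compiled as is. The `Draw` class, its members, and the file names in the comments are all hypothetical; the point is only that the class is defined once while the member definitions could each be moved to a separate .cpp file that includes the header.

```cpp
#include <cassert>
#include <string>
#include <vector>

// ---- draw.h (hypothetical): the one and only class definition ----
class Draw {
public:
    void line(int x0, int y0, int x1, int y1);
    void rect(int x, int y, int w, int h);
    const std::vector<std::string>& log() const { return log_; }
private:
    void record(const std::string& what);
    std::vector<std::string> log_;  // records what was drawn, for the demo
};

// ---- draw_core.cpp (hypothetical) ----
void Draw::record(const std::string& what) { log_.push_back(what); }

// ---- draw_line.cpp (hypothetical) ----
void Draw::line(int x0, int y0, int x1, int y1) {
    record("line " + std::to_string(x0) + "," + std::to_string(y0) +
           " -> " + std::to_string(x1) + "," + std::to_string(y1));
}

// ---- draw_rect.cpp (hypothetical) ----
void Draw::rect(int x, int y, int w, int h) {
    assert(w >= 0 && h >= 0);
    // A rectangle is drawn as its four edges, then noted in the log.
    line(x, y, x + w, y);
    line(x + w, y, x + w, y + h);
    line(x + w, y + h, x, y + h);
    line(x, y + h, x, y);
    record("rect");
}
```

Every primitive still has to be declared in the single header, which is the limitation raised earlier in the thread; only the bodies get spread out.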
-
Anything past one pizza.
-
No, it would be nice to be able to declare a class twice in two different files and have it compiled into the same binary. C++ does not let you do that, at least not to my knowledge, unless they added it after C++17. For example, I have a draw class that handles all the drawing operations in my graphics library. It would be nice to segregate the different drawing primitives into different files, but the only way to do that is to delegate and forward or to use multiple C++ implementation files for a single class, but you still wind up with all the method declarations for all the drawing primitives in the same header. You can hack around it using the preprocessor, by #include-ing class fragments, but that's techy. Edit: You can use inheritance to approximate it, but that still runs you into visibility issues.
To err is human. Fortune favors the monsters.
Hmm, odd, I was sure I had done that back in the 80s or 90s. I must be mistaken. PIEBALD goes spelunking and finds some old C++ code... Ah, you are correct (of course), I was mistaken for the most part. I see by my code that the class has to be defined in one place, but that the implementations of the members can be separated out into other files -- and I'm using #include to combine the code together. When C# 1 was first released, a class had to be fully defined and implemented in one file -- which was horrible -- but C# 2 added partial classes (and interfaces), with which a class definition and implementation can be spread across multiple files. I do see that something similar could probably be accomplished with C++ (and maybe C# 1) by using the C-preprocessor, but that wouldn't be as clean.
-
Yes, you can. Class definition has to stay in one file but implementation can go in any number of files.
Mircea
But with C# (2 and newer), the definition can also be spread across files; not just the implementation. C# combines the definition and implementation together (other than abstract members).
-
But with C# (2 and newer), the definition can also be spread across files; not just the implementation. C# combines the definition and implementation together (other than abstract members).
I know, but my monster is C/C++ :-D
Mircea
-
I know, but my monster is C/C++ :-D
Mircea
If it ain't broke, don't break it.
-
I was told of two extreme cases from the software for the ITT System 12 phone switch: The largest 'struct' definition (the language used was CHILL, not C, so terms are different) ran to 8300 lines. Printed 72 lines per page, this single type definition would fill a 115 page book. The linker for this system maintained a symbol export table for each module. Early linker versions used a signed 16-bit integer to index this table, so it was limited to 32768 exported symbols. This limit was exceeded - System 12 defined modules exporting more than 32 Ki symbols. I can understand importing that many symbols, but exporting from a single module!?! The maintainer of this linker, a university classmate of mine (he was also the one who told me about that struct), said that they made a quick fix, changing the index type to an unsigned integer to allow for 65536 exported symbols. But if anyone can break a 32 Ki limit, they can break a 64 Ki limit, too. So in the next major revision, the index type was changed to 32 bits. Hopefully, no one will export more than 4 billion symbols from a single module :-)
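The three limits can be checked with a few lines of C++. This is only an illustration of the arithmetic, not the actual linker code (which I never saw); `max_symbols` and `fits` are names invented here, and the sketch assumes the table is indexed from 0 up to the largest value the index type can hold.

```cpp
#include <cstdint>
#include <limits>

// Number of table slots addressable by a non-negative index of type Index:
// indices 0 .. max, so max + 1 entries.
template <typename Index>
constexpr std::uint64_t max_symbols() {
    return static_cast<std::uint64_t>(std::numeric_limits<Index>::max()) + 1;
}

// Would a module exporting `exported` symbols fit in a table indexed by Index?
template <typename Index>
constexpr bool fits(std::uint64_t exported) {
    return exported <= max_symbols<Index>();
}

static_assert(max_symbols<std::int16_t>() == 32768, "signed 16-bit index");
static_assert(max_symbols<std::uint16_t>() == 65536, "unsigned 16-bit index");
static_assert(max_symbols<std::uint32_t>() == 4294967296ULL, "unsigned 32-bit index");
```

A module with, say, 40000 exports fails `fits<std::int16_t>` but passes `fits<std::uint16_t>`, which is the situation the quick fix addressed.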
-
Well, it was a long time ago but back in pre-history when I was writing COBOL (no longer on cards thank God) the main program in an overnight suite I supported was (I think) 21,000 lines of code (well, code + blank space). We did everything we could to avoid printing the thing (360 pages), so the listing was frequently annotated by hand. Of course that didn't really help when the actual code had typos that still allowed it to compile. As overnight on-call support, I'd frequently get bleeped (by pager) and have to hook up the 600baud teletype to my landline, and get the relevant bits of memory dump. (Most issues were S0C7 faults). Yes, for some reason data was never properly validated; actual code bugs were rare, but get a non-numeric character in a character column in data and the whole shooting match failed.
DerekT-P wrote:
the listing was frequently annotated by hand
Wasn't that common practice in the 1970s-80s? In my student days, I was an intern in a company making 16bit minis and 32bit superminis, running their own OS (in those days, 'Unix' was hardly known at all outside universities). I managed to isolate a bug in the OS, and went to the responsible guy. For quite a while, he flipped back and forth in his huge OS source printout, before exclaiming a "There!", dug out his ballpoint pen and wrote the code fix into the listing. What makes me remember it better than I would otherwise: This OS was written in a language about midway between assembler and K&R C. He didn't write his fix in that language. He didn't write the assembler instructions. He wrote down the numeric instruction codes, in octal format. I guess that this qualifies for being an 'oldtimer' :-)
-
Original Pascal had no module concept. All code had to go in a single file. Open source is not as new as Linux people would have us believe! The Pascal P4 compiler was always freely available. I picked it up as a university freshman and studied it on my own alongside working on the '101 Introduction to Programming' hand-in exercises. I'll say that it gave me a head start in programming ... The compiler source was between 30,000 and 35,000 lines. For quite a few years following, I was really bothered by the emerging C practice of creating a separate source file for every single function: How can you find anything at all in the source when you have to open hundreds of files for searching? Tools for searching across an entire directory tree weren't very developed then - not until the C practice had become more widespread. Today we have the tools. We also have FOSS. Even today I am appalled when I have to handle a zillion files, each containing 70-100 lines of open source license/copyleft blurb, followed by a five line function. In every single one of a zillion files!