Learning a big new codebase
-
Do you have any recommended strategies for a junior developer when attempting to learn a large new codebase? One of my goals is to make some commits on something like ASP.NET MVC (.NET Core now), Entity Framework, Node.js, or some other major project on GitHub. Not surprisingly, however, when I open the project file for these, it can be tough to figure out where to even start. Of course I can view the issues and try my hand at solving one, but I found that even that often requires a general idea of the project's moving parts. Do you have any suggestions or resources on breaking down a big project like this into bite-sized chunks that can be learned over time, in hopes of making a serious contribution? One strategy I've tried is looking at the classes that I am familiar with from using the software, and also looking at the unit tests to get an idea of what's happening. Thanks.
TheOnlyRealTodd wrote:
when I open the project file for these, it can be tough trying to figure out where to even start.
Getting them to compile is the hardest step! :-D
TheOnlyRealTodd wrote:
and also looking at the unit tests to get an idea of what's happening
That's a good approach, especially single-stepping through the tests. And write down everything you learn, along with your questions about what you don't understand yet. Personally, I think most of these open source sites could benefit greatly from some "programmer documentation", as you're not the only one with a "how the elephant does this stuff work?" question. And writing that might be the most useful contribution at first! Marc
-
S Houghtelin wrote:
read the project documents
...while keeping in mind that the actual product probably deviates substantially from the original documentation.
dandy72 wrote:
...while keeping in mind that the actual product probably deviates substantially from the original documentation.
Hence the "If you are lucky to work at a company that has decent documentation practices."
It was broke, so I fixed it.
-
Play with it without looking at the code, first -- get the grand view, and get to know the whys and wherefores.
I wanna be a eunuchs developer! Pass me a bread knife!
-
To analyse a large code base, you could try running a dependency tool such as DeepEnds.
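To give a feel for what a dependency tool reports, here is a minimal sketch in Python. It is not how DeepEnds works and does not use it; it only illustrates the general idea of mapping which modules depend on which. The function names `imported_modules` and `dependency_graph` are made up for this example, and it only understands Python `import` statements.

```python
import ast


def imported_modules(source):
    """Return the set of top-level module names imported by Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are skipped in this sketch.
            names.add(node.module.split(".")[0])
    return names


def dependency_graph(sources):
    """Map each module name to the other modules in `sources` that it imports.

    `sources` is a dict of {module_name: source_text}; imports of modules
    outside that dict (standard library, third party) are ignored.
    """
    graph = {}
    for name, source in sources.items():
        internal = sources.keys() - {name}
        graph[name] = sorted(imported_modules(source) & internal)
    return graph
```

Modules with many incoming edges tend to be the core pieces worth reading first, while modules with no internal dependencies are often the easiest place to start.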
-
If you are lucky to work at a company that has decent documentation practices, read the project documents. To get an overall idea of what a project is about, read the specification document. Then read the code description, if there is one. Also, try to follow the flow charts. These are standard documents in medical device design and manufacturing. If you are into database or web design, good luck! :laugh:
It was broke, so I fixed it.
-
To analyse a large code base, you could try running a dependency tool such as DeepEnds.
Not to be confused with Depend® incontinence products. :)
-
The essence of your question is how to organize and sequence your code reading so that you reach a reasonable level of proficiency in the time you have allotted, while also reducing the time you spend studying code without merit. Frankly, the latter is not possible, because you cannot judge the code until you have already learned it. My suggestion is to refuse, at the outset, to learn and study code written in throw-away computer languages.
-
dandy72 wrote:
...while keeping in mind that the actual product probably deviates substantially from the original documentation.
Hence the "If you are lucky to work at a company that has decent documentation practices."
I've been at this for 40 years and have yet to find a company that had more than completely minimal documentation at a level that could help a developer. It has always been a learn-as-you-go process. Most developers do NOT document their work.
-
As always, working backwards is a good approach. You can find previously FIXED items and review the posted code changes that fixed each one. I would recommend building and testing the previous version and verifying the bug. Apply the fix. Verify the bug is gone. Once you get decent at that, get realistic. It takes approximately 5,000 hrs to master a new skill. Assuming you have mastered programming in general, let's assume a large code base will take you about 1,000 hrs for a solid basic understanding (half a work year). Yeah, it is easy to jump in and hack away, but actually mastering a code base is another matter. This gets to the REASON others suggest you support a code base that you already use, like, and would like to extend. BTW, as you set up your environment to test and validate prior updates, consider reviewing and enhancing the documentation that helps others get to where you got to.
-
I've been at this for 40 years and have yet to find a company that had more than completely minimal documentation at a level that could help a developer. It has always been a learn-as-you-go process. Most developers do NOT document their work.
In the medical device industry, if we do not have documentation, we will not be able to sell our device. It is a requirement, and for good reason. Would you want to be on the operating table being monitored by devices with software of unknown provenance? "Most developers do NOT document their work," and we wonder why the quality of the software out there sucks. That's called winging it, and in my opinion it is unprofessional. If a developer is unable or unwilling to maintain at least some level of documentation, I would not be inclined to hire them or to keep them in my employ.
It was broke, so I fixed it.
-
S Houghtelin wrote:
read the project documents
...while keeping in mind that the actual product probably deviates substantially from the original documentation.
The sad part in all the comments on this topic is that not one suggested writing some documentation for the project. Documentation is always someone else's responsibility. Two years ago I was handed 100 KLOC of undocumented but production-critical cowboy code. The programmer who wrote it was adamant that "the code is self documenting". It wasn't. It took 18 months to document it to the point where it could be maintained... barely. If you REALLY want to contribute to a project, write something other than code. "Everyone complains about the weather, but no one does anything about it."
-
If you can find an issue that does not have a UnitTest, contribute a UnitTest that reproduces the issue.
I found the Doxygen tool to be very helpful. It can show some call traces and is very convenient for navigating objects.
-
It's admirable wanting to get involved and commit to an open source project, but my suggestion would be only get involved in a project if it's something you use/reference as part of some other development project you are working on, and there are improvements or fixes that would benefit your own project that you believe would also be of benefit to others.
Wastedtalent wrote:
only get involved in a project if ...
I disagree. There is great value in studying large, commercial-grade software for its own sake. Even if the OP never commits but just spends hours or days spelunking one of the codebases he mentions, he is bound to be enriched by the process. Many of the techniques of enterprise-scale coding can't be taught in books.
-
In the medical device industry, if we do not have documentation, we will not be able to sell our device. It is a requirement, and for good reason. Would you want to be on the operating table being monitored by devices with software of unknown provenance? "Most developers do NOT document their work," and we wonder why the quality of the software out there sucks. That's called winging it, and in my opinion it is unprofessional. If a developer is unable or unwilling to maintain at least some level of documentation, I would not be inclined to hire them or to keep them in my employ.
That's certainly good to know. I wasn't talking about end-user documentation, though, I was talking about the documentation that would help a developer. I wonder if the code behind those medical devices is documented any better than what I've seen in a dozen or so other industries?
-
Wastedtalent wrote:
only get involved in a project if ...
I disagree. There is great value in studying large, commercial-grade software for its own sake. Even if the OP never commits but just spends hours or days spelunking one of the codebases he mentions, he is bound to be enriched by the process. Many of the techniques of enterprise-scale coding can't be taught in books.
I agree. I also think studying a project and getting involved in it are two very different things, and the OP was talking about making commits.
-
That's certainly good to know. I wasn't talking about end-user documentation, though, I was talking about the documentation that would help a developer. I wonder if the code behind those medical devices is documented any better than what I've seen in a dozen or so other industries?
We have to comply with GMP, UL, ISO, FDA, CE and EU standards, among others. We are required to have our documentation internally and externally reviewed and accepted by the regulatory bodies. Every aspect of the product needs to go through risk and hazard analysis and be QA-tested using the very documents the software developer wrote. If the software and document do not match, it needs to be corrected and retested. This doesn't mean bugs can't get through, but certainly the obvious glaring stuff rarely does. This is why it takes forever and a massive amount of $₤€ to get a new product out. End-user documentation is also very regulated, but thankfully I don't have to deal with that aspect.
It was broke, so I fixed it.
-
We have to comply with GMP, UL, ISO, FDA, CE and EU standards, among others. We are required to have our documentation internally and externally reviewed and accepted by the regulatory bodies. Every aspect of the product needs to go through risk and hazard analysis and be QA-tested using the very documents the software developer wrote. If the software and document do not match, it needs to be corrected and retested. This doesn't mean bugs can't get through, but certainly the obvious glaring stuff rarely does. This is why it takes forever and a massive amount of $₤€ to get a new product out. End-user documentation is also very regulated, but thankfully I don't have to deal with that aspect.
Good to hear, and also makes a lot of sense. Most of the organizations I worked with were not producing software that could wind up being "life critical" like that. Glad to hear that someone does it.
-
In GREP we trust. Use grep, find in files, or find usages to see all the references to a particular class and its public methods. Start the new dev with a small task: a bug fix or minor enhancement. Then have the new dev document every class and method that contributes in some way to the scenario. Have the new dev document every other class and method that depends on the code that is changed, all the way up to the UI or interface. Also, I absolutely concur with the idea of debugging and examining the stack. If unit tests and integration tests are available, then run these in the debugger. If a developer went to the trouble of writing unit tests, then the code under test must be important.
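If grep or an IDE's "find usages" is not at hand, the same search can be sketched in a few lines of Python. This is only a rough stand-in, not a replacement for either tool: the function name `find_usages` and the default list of file extensions are assumptions made for this example, and it matches whole words in text rather than understanding the language.

```python
import re
from pathlib import Path


def find_usages(root, symbol, extensions=(".cs", ".js", ".py")):
    """Return (path, line_number, line) for each whole-word match of symbol.

    Walks every file under `root` whose suffix is in `extensions`.
    """
    pattern = re.compile(r"\b" + re.escape(symbol) + r"\b")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in extensions:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits


# Example call (the class name is hypothetical):
#   for path, number, line in find_usages(".", "OrderService"):
#       print(f"{path}:{number}: {line}")
```

The whole-word match keeps a search for one class from also returning longer names that merely contain it, which cuts down the noise when documenting everything that touches the code being changed.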