Learning a big new codebase
-
Do you have any recommended strategies for a junior developer when attempting to learn a large new codebase? One of my goals is to make some commits on something like ASP.NET MVC (.NET Core now), Entity Framework, Node.js, or some other major project on GitHub. Not surprisingly, however, when I open the project file for these, it can be tough trying to figure out where to even start. Of course I can view the issues and try my hand at solving one, but I found that even that often requires a general idea of the project's moving parts. Do you have any suggestions or resources on breaking down a big project like this into bite-sized chunks that can be learned over time in hopes of a serious contribution? One strategy I've tried is looking at the classes that I am familiar with from using the software and also looking at the unit tests to get an idea of what's happening. Thanks.
In GREP we trust. Use GREP, find in files, or find usages to see all the references to a particular class and its public methods. Start the new dev with a small task: a bug fix or minor enhancement. Then have the new dev document every class and method that contributes in some way to the scenario. Have the new dev document every other class and method that depends on the code that is changed, all the way up to the UI or interface. Also, I absolutely concur with the idea of debugging and examining the stack. If unit tests and integration tests are available, then run these in the debugger. If a developer went to the trouble to write unit tests, then the code under test must be important.
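For anyone without grep or an IDE's "find usages" to hand, here is a minimal sketch of the same idea in Python. The function name `find_usages`, the default file suffixes, and the `OrderService` identifier in the usage note are all made up for illustration; this is not how any particular IDE implements the feature, just a whole-word text search over a source tree.

```python
import re
from pathlib import Path


def find_usages(root, identifier, suffixes=(".cs", ".js", ".py")):
    """Return (path, line_number, line) for each whole-word match of identifier.

    This is a plain text search, so it also matches comments and strings;
    it does not understand the language the way an IDE does.
    """
    # \b keeps "OrderService" from matching inside "OrderServiceFactory".
    pattern = re.compile(rf"\b{re.escape(identifier)}\b")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and files that are not source code we care about.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling something like `find_usages("src", "OrderService")` gives a list you can walk through one file at a time, which is roughly the "document every class and method that contributes" exercise described above.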
-
I agree. I also think studying a project and getting involved in it are two very different things, and the OP was talking about making commits.
I also suggest that getting involved in that depth with a project that you use regularly will increase your interest and commitment to the process. Doing work on a project that you have little interest in, just for the sake of it, will soon feel like a thankless task. I've only ever contributed to projects that I have a direct interest in, because the first step in contributing to any product is using it.
-
Could I also echo the responses from some of the other repliers that there is a chronic shortage of documenters for most open source products? They are like rocking horse poop. If you want to hone your coding skills and can find a project that deeply interests you, then hack away. If you are just looking to contribute to a project that matters to you, then documenters are always welcomed with open arms.
-
I think your first impulse (find and solve an issue) was the right one. At least you have a "goal" in mind; the rest ("reading code") gets old pretty fast. Ultimately, you will find out your value is in seeing the big picture quickly, and prioritizing what needs to be done. A lot of code never gets executed or deals with fringe cases; better to focus on the stuff that actually gets run; i.e. the "buggy" parts.
-
The sad part in all the comments on this topic is that not one suggested writing some documentation for the project. Documentation is always someone else's responsibility. Two years ago I was handed 100KLOC of undocumented but production-critical cowboy code. The programmer who wrote it was adamant that "the code is self documenting". It wasn't. It took 18 months to document it to the point where it could be maintained... barely. If you REALLY want to contribute to a project, write something other than code. "Everyone complains about the weather, but no one does anything about it."
Documentation can be self documenting as well. The number of times I despair when I see a summary of a method which basically repeats the method name. Code should be simple and self-explanatory as to the implementation. If it isn't, then it probably needs to be refactored. A method can explain its function in its name, so there is no need to repeat it (as an example, I saw the documentation for an attribute "rtpHeaderExpected" given as "expects an rtp header"). Documentation is useful when it explains the why of code, not the what (which is what the code should explain). So yes to documentation, but only when it's useful!
-
In the medical device industry if we do not have documentation, you will not be able to sell your device. It is a requirement and for good reason. Would you want to be on the operating table being monitored by devices with software of unknown provenance? "Most developers do NOT document their work." and we wonder why the quality of the software out there sucks. That's called winging it, and in my opinion it is unprofessional. If a developer is unable or unwilling to maintain at least some level of documentation, I would not be inclined to hire them or to keep them in my employ.
It was broke, so I fixed it.
S Houghtelin wrote:
In the medical device industry if we do not have documentation, you will not be able to sell your device
S Houghtelin wrote:
"Most developers do NOT document their work." and we wonder why the quality of the software out there sucks
The only way comparing the software industry with the medical device industry could be fair is if software was priced to match said medical devices. Don't blame developers for not documenting their code. That decision is not made by them.
-
Sadly this is very true. When there is a clock ticking down the profit margins, documentation is usually the first casualty.
We're philosophical about power outages here. A.C. come, A.C. go.
-
Since you're a novice, first focus on what you are comfortable with. Pick that layer. Pick up a tool like NDepend or Nitriq and see how the layers interact. Then, and only then, start playing on the keyboard.
-
Here are a few tips.
If nothing else, fix layout issues (indentation, spacing, etc.) and add sensible comments where they help. The act of tidying up layout and having to think about what a small section of code is doing, in its own right, will help build your understanding of the bigger picture. From there it probably won't take long for you to start spotting refactoring opportunities.
If you do decide to make changes, start with the small, trivial things, since these will often be overlooked or tolerated for the sake of the big things. Build up a testing regime for your changes BEFORE you make the changes.
Don't focus on code structure or control too much. That will become obvious. The key to any code base is how it organizes its data and moves it around. This you can analyse and diagram. Remember, fundamentally all any software is really about is moving data from A to B. With this in mind, pay special attention to the interfaces between modules, components and systems. This is likely where the most problems are, especially when either side of the interface has been independently developed. Also look for places where data is transformed from one form to another, e.g. conversions and lookups. These are another source of faults.
If you are able to run the software, another way of gaining understanding is to add detailed logging/tracing of the software's operation as it is running. In this context, look to trace the initial state of variables and when variables change, the function calls (including explicit variable values) and function returns, and error events in the code. Those three categories of logging should be enough for you to home in on most problems with the code when it is running. This is of course verbose and has performance implications, so make sure it can be turned off or removed from the release product entirely.
Finally, study design patterns and identify where they have been used in the code, either intentionally or unwittingly. You may be lucky and the patterns may be explicitly named, e.g. WidgetFactory or WangleAdaptor. Design patterns are not the be-all and end-all, but they are a useful shorthand for common development problems.
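To make the logging/tracing tip a bit more concrete, here is a rough sketch of a switchable trace decorator in Python. It covers two of the three categories mentioned (function calls with their argument values and returns, plus error events); tracing variable changes would need something more invasive. The names `traced`, `enabled`, and `sink` are invented for this example, and a real project would probably wire the switch to its own configuration.

```python
import functools
import logging

logger = logging.getLogger("trace")  # hypothetical logger name


def traced(enabled=True, sink=logger.debug):
    """Decorator factory: log calls, returns and errors of the wrapped function.

    When enabled is False the original function is returned untouched,
    so a release build pays no tracing cost at all.
    """
    def decorate(func):
        if not enabled:
            return func

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Function call, including the explicit argument values.
            sink(f"call {func.__name__} args={args!r} kwargs={kwargs!r}")
            try:
                result = func(*args, **kwargs)
            except Exception as exc:
                # Error event: record it, then let it propagate unchanged.
                sink(f"error {func.__name__}: {exc!r}")
                raise
            # Function return, including the returned value.
            sink(f"return {func.__name__} -> {result!r}")
            return result

        return wrapper

    return decorate
```

You would apply it as `@traced(enabled=SOME_DEBUG_FLAG)` above the functions on the path you are studying, where `SOME_DEBUG_FLAG` stands in for whatever switch your project uses. Passing a list's `append` method as `sink` is a handy way to capture the trace in a test.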
-
Look big, but start small. First get an overview of what it's fundamentally trying to achieve. But then, look at a function/class at the bottom that needs a little maintenance and update it. Maybe only a little refactoring to make it clearer, or updating the inline documentation. Then follow up into the calling classes and see what's happening there. Like pulling a piece of knitted wool, each tug takes you further into unraveling it until you've touched every part of the main codebase. And then review/throw away most of your changes. You had no clue what you were doing at the start.
-
The two main tips I have are: 1) Use Doxygen to generate a hyperlinked, annotated version of the source code that you can navigate in your browser. 2) Do code reviews: this gives you an opportunity to ask questions about the code that developers are currently active in. This only works if you have a functioning code review system in place, of course.
-
Go small at first. Follow the code paths from UI to the finish line for that particular path. I cannot stress this technique enough. Pick a path that is relevant to your current task or project. Don't try to learn the entire system at one time.
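If stepping through in a debugger is not practical, one possible way to see a single path end to end is to record which functions run when you trigger it. The sketch below does that in Python with the standard `sys.setprofile` hook. The functions `handle_click`, `validate` and `save_order` are invented stand-ins for a UI handler and the code beneath it, and `record_call_path` is a name made up for this example.

```python
import sys


def record_call_path(entry_point, *args, **kwargs):
    """Run entry_point and return a list of (depth, function_name) tuples.

    Only Python-level calls are recorded; calls into C builtins are ignored.
    """
    calls = []
    depth = 0

    def profiler(frame, event, arg):
        nonlocal depth
        if event == "call":
            calls.append((depth, frame.f_code.co_name))
            depth += 1
        elif event == "return":
            depth -= 1

    sys.setprofile(profiler)
    try:
        entry_point(*args, **kwargs)
    finally:
        # Always remove the hook, even if the traced code raises.
        sys.setprofile(None)
    return calls


# Hypothetical stand-ins for one path from the UI down to storage.
def validate(order):
    return bool(order)


def save_order(order):
    return f"saved {order}"


def handle_click(order):
    if validate(order):
        return save_order(order)
    return None
```

Running `record_call_path(handle_click, "book")` yields the handler at depth 0 with `validate` and `save_order` beneath it at depth 1. On a real codebase the list gets long quickly, so it helps to filter by module or file name before reading it.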
-
Yes, I agree with you. This is one of the paths.