Unpopular opinions: LINQ
-
You may find this other unpopular opinion article by Joe Duffy - The 'premature optimization is evil' myth[^] - interesting, as it does a nice analysis of how LINQ
Quote:
makes writing inefficient code very easy
I started reading it, and it seems we're on the same page as far as general approach toward optimization. Optimization starts in the design phase, during requirements gathering. Performance is either an explicit or an unwritten requirement of any application; no application can take forever to do its work, and how long is acceptable is a question of design. Optimization continues through project planning - choosing the platforms and tools, and even the right data structures and patterns to accomplish your tasks. You don't garbage collect driver code. You don't use a Dictionary where a LinkedList would be more appropriate. These are design decisions - the first one high level, the second one more specific, but still design decisions.

Only after that does the phrase "optimization is evil" come into play. Because at this point, if you're optimizing, you're optimizing something you should have optimized during design, or you're trying to bit twiddle to work around something that, again, should have been optimized during design. It's way more efficient to optimize up front during the design and planning phase, rather than after the fact, when you are locked in and your options for improving performance are limited to bit twiddling. Unless you're doing embedded, though, counting cycles isn't important. Optimizations should be done on the algorithmic level - look for a Big O figure, not how to shave a cycle here or there.
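To make the Dictionary-versus-LinkedList point concrete, here's a minimal sketch (my own illustrative example, not from the article): picking the right structure at design time changes the Big O, which matters far more than any cycle shaving afterwards.

```csharp
using System;
using System.Collections.Generic;

class RightStructure
{
    static void Main()
    {
        const int n = 10_000;
        var list = new List<int>();
        for (int i = 0; i < n; i++) list.Add(i);

        // O(n) per membership test: List.Contains scans the whole list.
        bool inList = list.Contains(n - 1);

        // O(1) expected per test: choosing a hash-based structure up front
        // is a design decision, not a bit-twiddling optimization.
        var set = new HashSet<int>(list);
        bool inSet = set.Contains(n - 1);

        Console.WriteLine(inList && inSet); // True
    }
}
```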
Real programmers use butterflies
-
I'm not a fan of LINQ. I love functional programming, but .NET's enumerator paradigm is not up to the task. It creates too many objects too quickly to compete with a real grown-up functional language, where iteration is highly optimized because it's a first-class operation. I've benchmarked LINQ against hand-written pseudo-functional operations that do the same thing. It was not encouraging. For things that make heavy use of functional computation, like parser generators, where your LINQ query might be half a page, it's a Bad Idea(TM).

Worse, I think its use has been over-encouraged by Microsoft. It makes green developers write even worse code, and makes it harder for a seasoned developer to understand the performance implications of the code they are writing (and I'm not talking about bit twiddling here, I'm talking about figuring out your Big O expression). I tend to avoid it, preferring - at least in C# - to make my iteration operations explicit and long-hand. If .NET had a truly optimized iteration paradigm - one that didn't create new objects for every single iteration operation** - I might consider using it.

** Yes, I understand that LINQ combines multiple operations into a single iteration *sometimes* - in practice it's not often enough to make up for the overhead of enumerators.

Now, there's a case where all of the above doesn't matter, and that's PLINQ. Theoretically, for a large enough operation that can be highly parallelized, the overhead of enumerators suddenly isn't the biggest part of the performance equation. What I mean is, it essentially pays for itself. Also, given the issues with synchronization and other cross-task communication (is your operation clustered over a network?), enumerators are actually not a bad idea, since you can lock behind them or RPC behind them.
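For the curious, here's the shape of the comparison I mean - a minimal sketch, not my actual benchmark code. The two methods compute the same thing; the LINQ chain allocates an enumerator per operator when iterated, while the long-hand loop does not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class LinqVsLoop
{
    // LINQ version: Where and Select each contribute their own
    // enumerator object to the chain when it is iterated.
    public static int SumOfSquaresLinq(IEnumerable<int> source) =>
        source.Where(x => x % 2 == 0).Select(x => x * x).Sum();

    // Long-hand version: one loop, one enumerator, no operator chain.
    public static int SumOfSquaresLoop(IEnumerable<int> source)
    {
        int sum = 0;
        foreach (int x in source)
            if (x % 2 == 0)
                sum += x * x;
        return sum;
    }

    static void Main()
    {
        var data = Enumerable.Range(1, 10).ToArray();
        Console.WriteLine(SumOfSquaresLinq(data)); // 220
        Console.WriteLine(SumOfSquaresLoop(data)); // 220
    }
}
```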
Contrast that with C++ iterators, which are usually lightly wrapped pointer ops, and you realize their limitations fast: in order to enable all of the stuff you need to make iteration operations work with each other in parallel, you have to wrap every iterator operation anyway, making it as "heavy" as an enumerator in .NET, not counting the general overhead of running managed code. So basically, PLINQ is where LINQ finally covers its costs - where its advantages outweigh its disadvantages. All of this, of course, is one developer's opinion. And some of it doesn't necessarily apply to business software, where performance almost doesn't matter for most scenarios.
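The PLINQ case is one line of difference - here's a minimal sketch (an illustrative workload, not a claim about where the break-even point actually lies): once the work per element is large enough and partitions cleanly across cores, the enumerator overhead stops dominating.

```csharp
using System;
using System.Linq;

class PlinqSketch
{
    static void Main()
    {
        // A CPU-bound, trivially partitioned workload: AsParallel()
        // fans the Select/Sum out across cores, so the per-operator
        // enumerator cost is amortized over real work.
        long total = Enumerable.Range(1, 1_000_000)
            .AsParallel()
            .Select(n => (long)n * n)
            .Sum();

        Console.WriteLine(total); // 333333833333500000
    }
}
```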
Real programmers
"I tend to... make my iteration operations explicit and long hand." Thanks! I thought I was the only one. So much more readable when I come back later, IMHO.
-
honey the codewitch wrote:
Until developers abuse it with cross code interdependencies, making selective linking useless.
As someone said, you can write poorly-crafted code in any language. :sigh:
Freedom is the freedom to say that two plus two make four. If that is granted, all else follows. -- 6079 Smith W.
This reminds me of Flon's Axiom from 45 years ago: "There does not now, nor will there ever, exist a programming language in which it is the least bit hard to write bad programs."
-
I think the premise of Linq was to separate database operations (T-SQL) from C# developers. Personally, I think they have replaced T-SQL with something that is not easy to learn and understand - exactly the opposite of what T-SQL is. If one does not do Linq each and every day, but only once in a while, then it is difficult for someone asked to maintain Linq code to quickly comprehend what is going on.

Second, in Microsoft's universe there are ample resources (developers, fast computers, fast internet connections), and Azure is free. Most of their big clients are in a similar situation. And Microsoft can afford to hire the best of the best. That is not necessarily true for their big clients, and probably not true for the bottom of their customer pyramid. Very few within Microsoft have to maintain complex codebases for long periods of time, and we can see what happens when they have to. How many repeated breaking bugs have you seen in VS? New stuff is the focus, not fixing what is broken. Consider any product demo or new-feature demo: it is thrown together by the back-office developers to demonstrate the concepts of the new capabilities, carefully avoiding any and all complexities that would pop up in a production environment. Its lifetime is a few weeks, because by then Microsoft devs have moved on to the next thing, a new iteration, and the terminology has changed.

If performance is not an issue, and one would prefer to keep database operations separate from the dedicated C# team, then Linq is the way to go. You just need to ask yourself: is this Microsoft flavor of the day the best solution for me? And to the person who thought the orderby and groupby operations were easy to understand: you should make a YT video and disperse your knowledge. One can watch five Linq videos on the subject and be none the wiser.
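Since groupby comprehension came up, here's a tiny example of the query syntax next to the long-hand equivalent (my own illustrative sketch, not from any of those videos) - it's exactly the kind of thing that reads as opaque if you don't write it daily:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class GroupBySketch
{
    static void Main()
    {
        string[] words = { "apple", "avocado", "banana", "blueberry", "cherry" };

        // Query-syntax groupby: terse, but dense with implicit machinery.
        var byLetter = from w in words
                       group w by w[0] into g
                       orderby g.Key
                       select g;

        // The same grouping spelled out long-hand with a dictionary.
        var manual = new SortedDictionary<char, List<string>>();
        foreach (var w in words)
        {
            if (!manual.TryGetValue(w[0], out var list))
                manual[w[0]] = list = new List<string>();
            list.Add(w);
        }

        foreach (var g in byLetter)
            Console.WriteLine($"{g.Key}: {string.Join(", ", g)}");
        // a: apple, avocado
        // b: banana, blueberry
        // c: cherry
    }
}
```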
-
honey the codewitch wrote:
Now, there's a case where all of the above doesn't matter, and that's PLINQ.
...that's where my argument ends. PLINQ on an in-memory database (SQLite). Everything else is storage, and that's either blobs or relational.
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
I'm a jerk, I guess. I don't haggle. I charge based on how much I like the project, and I don't take jobs I don't like. I am fair about my invoices, and I itemize my time, but again, I don't haggle. Pay me or find someone else. I cost what I cost, and I'm always told I'm worth it when I'm told anything. One of my current clients actually told me I "walk on water," so I had to reduce his expectations for fear of drowning. :laugh:
Real programmers use butterflies
I wish I had that luxury, but to me (work > no work), and many of my potential customers would rather work with Excel, an old system, or even paper than pay "too much" for their custom software :sigh: My hourly rate is fine though, so there's some room for haggling, but only for specific projects. I don't doubt that if I cut my rate in half they'd still haggle though :laugh: And sometimes bringing a customer in is more important than money, if I know they'll bring me more work in the future. At the end of the day I've got bills to pay, and the only way to pay them is with paying customers.
Best, Sander Azure DevOps Succinctly (free eBook) Azure Serverless Succinctly (free eBook) Migrating Apps to the Cloud with Azure arrgh.js - Bringing LINQ to JavaScript
-
I have always hated LINQ, because it involves learning new syntax and introduces limitations that make my coding life more difficult. It's for people - as in new Microsoft employees - educated in OOP but apparently not able to handle SQL. What's bizarre is that now, with .NET Core 3.1, 5.0, etc., I can't use LINQ to update the database, because I use database views and stored procs to manage the data, and EF Core has no capability to handle that in LINQ. So I have to use their Exec methods, which, besides requiring me to list all the parms and values, have limitations regarding return values: those have to be predefined as classes in the database context and models, even for a simple integer return.
-
Yeah, that's a whole different can of worms I didn't (and couldn't, due to lack of experience) address in my OP. Microsoft's DB access has always left a lot to be desired. Every time they introduce some new RDBMS API layer, it just builds on what's there without seeming to add much (LINQ over the DB - and I'm basing this on your assessment above) - or when it does, it's brittle as hell, like the Entity Framework. I would appreciate the syntax, personally, for *functional programming*, but I don't consider the .NET enumerator paradigm up to the task of first-class functional operations like guided iteration. It's too inefficient, for starters. So LINQ builds on something that's not up to the job. My secondary criticism has to do with syntax and learning curve, which I briefly addressed in my OP, but on reflection I suppose my main criticism is the introduction of LINQ into an imperative language (C#), rather than, say, making F# less arcane for people who are used to languages like Haskell.
Real programmers use butterflies
-
The other issue is that for those of us who write in VB.Net, the LINQ syntax is convoluted and ugly. At least in C# the LINQ syntax is concise. As for clarity, simple LINQ statements can be clear, but a loop can always be clearer. The other issue with LINQ is trying to figure out what the actual type in the IEnumerator is. This is why C# added the var keyword: LINQ doesn't lend itself to clean typing.
Actually, I would discourage the use of var and encourage creating a class up front. This forces you to design what you want up front, which not only makes designing easier but, more importantly, lets you serialize the class (with its groups, lists, etc.) to an XML file and review it to make sure you are getting what you want. Var, IMHO, is a terrible way to go, especially for long-term maintenance should something unexpected pop up. Serializing a class is extremely useful, especially for groupings, joins, SelectMany, etc. The tree-view-like structure in a serialized XML file is incredible to look at for complicated solutions.

I guess another thing I am saying is that using var promotes backwards design: create the Linq and hope for the best, versus knowing what you want and designing Linq to get you there, using serialization to test. Further, having the class greatly improves readability, a concern some have expressed. It's a lot easier to know what Linq is doing if you can see what it produces.

Regarding speed: I use PLinq for Next Generation Sequencing of the human genome - six billion data points - and it is plenty fast. Same goes for my CODIS search engine. CODIS is what you see on crime-lab TV shows: here is the DNA profile from a crime scene; are there any matches? PLinq was a godsend for that complicated process and is literally a million-fold faster than Sql. Example: a 47-minute Sql search was 18 seconds for PLinq.
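A minimal sketch of that workflow (class names are made up for illustration): project the query into a named class instead of an anonymous type, then run it through XmlSerializer to eyeball the structure. Note that XmlSerializer needs a public type with public properties and a parameterless constructor - which anonymous types can't provide, reinforcing the design-up-front point.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// The result shape, designed up front (hypothetical example class).
public class LetterGroup
{
    public string Letter { get; set; }
    public List<string> Words { get; set; }
}

class SerializeCheck
{
    static void Main()
    {
        string[] words = { "ant", "ape", "bee" };

        // Project into the named class rather than var/anonymous types.
        List<LetterGroup> groups = words
            .GroupBy(w => w[0].ToString())
            .Select(g => new LetterGroup { Letter = g.Key, Words = g.ToList() })
            .ToList();

        // Serialize to XML to review whether the query produced
        // the structure you intended.
        var ser = new XmlSerializer(typeof(List<LetterGroup>));
        using var sw = new StringWriter();
        ser.Serialize(sw, groups);
        Console.WriteLine(sw.ToString());
    }
}
```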
MeziLu
-
Usually it's not terrible - not orders of magnitude slower for what most people seem to use it for: queries in business software. However, don't use it for what I'd call "real" functional programming. If you're going to write a parser generator or scanner generator, for example, you don't want to compute your tables using LINQ. In that case it *will* be orders of magnitude slower than most anything you could write by hand. And I guess now you can tell what kind of software I write. :laugh:
Real programmers use butterflies
Ah, I understand. Thank you for the explanation! Full disclosure: I don't write generators. I prefer shouldering the burden of driving something data-driven from the get-go rather than going through the intermediate step of writing a generator (that generates something that gets the actual job done). I find the one-step approach way easier to debug, to adapt to future changing requirements (which will of course change, because that's what requirements simply do), and to teach to someone freshly joining the team. Should I ever be explicitly required to write a generator (instead of getting things done one way or another), I'll heed your words.
-
It's not so much about the code generation per se, but the type and number of iterations you'll be doing. Consider the following source file that generates an LR table. The code is ugly because the algo is ugly; there's not much way around it. See the accompanying article for an explanation of the algo if you want: Downloads: GLR Parsing in C#: How to Use The Most Powerful Parsing Algorithm Known[^] The point in showing you this is the iteration code to generate things like the LRFA state graph. When I say generate above, I'm not talking about code generation, but simply computation of tables. Trying to do those things - massive recursive iteration - with LINQ is just a mug's game.
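To give a feel for the pattern (a generic sketch, not the actual generator code linked above): table construction is full of iterate-until-fixpoint worklist loops like this, where the set being iterated grows as you iterate - a shape that maps naturally onto an explicit loop and awkwardly onto LINQ operators.

```csharp
using System;
using System.Collections.Generic;

class ClosureSketch
{
    // Worklist-style transitive closure over a graph: the kind of
    // grow-while-you-iterate computation LR state construction needs.
    public static HashSet<int> Closure(int start, Dictionary<int, int[]> edges)
    {
        var seen = new HashSet<int> { start };
        var work = new Stack<int>();
        work.Push(start);
        while (work.Count > 0)
        {
            int node = work.Pop();
            if (!edges.TryGetValue(node, out var next)) continue;
            foreach (int n in next)
                if (seen.Add(n))   // enqueue only genuinely new nodes
                    work.Push(n);
        }
        return seen;
    }

    static void Main()
    {
        var edges = new Dictionary<int, int[]>
        {
            [0] = new[] { 1, 2 },
            [1] = new[] { 2 },
            [2] = new[] { 0, 3 },
        };
        Console.WriteLine(Closure(0, edges).Count); // 4
    }
}
```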
Real programmers use butterflies