Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. General Programming
  3. C#
  4. Linq TO SQL DistinctBy Question

Linq TO SQL DistinctBy Question

Scheduled Pinned Locked Moved C#
databasequestioncsharplinqhelp
4 Posts 3 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • K Offline
    K Offline
    Kevin Marois
    wrote on last edited by
    #1

    I hope this is clear. If not, I can explain further. I have a table that [looks like this](https://1drv.ms/u/s!AlkRTpT49yCMmgO6gF-1gD-8UnlR) I will be running this Linq to SQL query repeatedly

    var maxRev = 2;
    var jobSequenceSheets = (from jss in dc.JobSequenceSheets
    where jss.JobId == jobId &&
    jss.RevisionNumber == maxRev
    select jss).OrderBy(x => x.Plan)
    .ThenBy(x => x.Elevation)
    .DistinctBy(c => new { c.Plan, c.Elevation})
    .ToList();

    So, of you look at the image, given RevisionNumber 2, I should get back only two records with Plan & Elevations of 1A and 2A. There are two rows for 2A because of the Lot. I don't care about the lot. All I care about for a new feature I'm working on is that I get back a distinct set of Plan/Elevations with the SAME ID'S EACH TIME IT'S RUN. Since there are two rows for 2A, given this query, I should get back only 2 rows (not 3). Can I expect to get back the same ID's each time? Does DistinctBy use the FIRST matching row it finds? I would expect this query to give me back rows 22607 and 22608 each run. Thanks

    If it's not broken, fix it until it is. Everything makes sense in someone's mind. Ya can't fix stupid.

    OriginalGriffO Richard DeemingR 2 Replies Last reply
    0
    • K Kevin Marois

      I hope this is clear. If not, I can explain further. I have a table that [looks like this](https://1drv.ms/u/s!AlkRTpT49yCMmgO6gF-1gD-8UnlR) I will be running this Linq to SQL query repeatedly

      var maxRev = 2;
      var jobSequenceSheets = (from jss in dc.JobSequenceSheets
      where jss.JobId == jobId &&
      jss.RevisionNumber == maxRev
      select jss).OrderBy(x => x.Plan)
      .ThenBy(x => x.Elevation)
      .DistinctBy(c => new { c.Plan, c.Elevation})
      .ToList();

      So, of you look at the image, given RevisionNumber 2, I should get back only two records with Plan & Elevations of 1A and 2A. There are two rows for 2A because of the Lot. I don't care about the lot. All I care about for a new feature I'm working on is that I get back a distinct set of Plan/Elevations with the SAME ID'S EACH TIME IT'S RUN. Since there are two rows for 2A, given this query, I should get back only 2 rows (not 3). Can I expect to get back the same ID's each time? Does DistinctBy use the FIRST matching row it finds? I would expect this query to give me back rows 22607 and 22608 each run. Thanks

      If it's not broken, fix it until it is. Everything makes sense in someone's mind. Ya can't fix stupid.

      OriginalGriffO Offline
      OriginalGriffO Offline
      OriginalGriff
      wrote on last edited by
      #2

      Since you are creating a new instance for each item you check in the comparer, it will return every instance as a new, distinct element, since it will compare instance references and by definition new returns different references each time it is called. Have a look here: C# – DistinctBy extension[^] - it explains the extension method quite well.

      "I have no idea what I did, but I'm taking full credit for it." - ThisOldTony "Common sense is so rare these days, it should be classified as a super power" - Random T-shirt AntiTwitter: @DalekDave is now a follower!

      "I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
      "Common sense is so rare these days, it should be classified as a super power" - Random T-shirt

      Richard DeemingR 1 Reply Last reply
      0
      • OriginalGriffO OriginalGriff

        Since you are creating a new instance for each item you check in the comparer, it will return every instance as a new, distinct element, since it will compare instance references and by definition new returns different references each time it is called. Have a look here: C# – DistinctBy extension[^] - it explains the extension method quite well.

        "I have no idea what I did, but I'm taking full credit for it." - ThisOldTony "Common sense is so rare these days, it should be classified as a super power" - Random T-shirt AntiTwitter: @DalekDave is now a follower!

        Richard DeemingR Offline
        Richard DeemingR Offline
        Richard Deeming
        wrote on last edited by
        #3

        Not quite. An anonymous type uses value equality, not reference equality:

        Anonymous Types | Microsoft Docs[^]:

        Because the Equals and GetHashCode methods on anonymous types are defined in terms of the Equals and GetHashCode methods of the properties, two instances of the same anonymous type are equal only if all their properties are equal.

        Also, the DistinctBy operator which was added in .NET 6 uses a different approach from the blog you linked to: runtime/Distinct.cs at ebba1d4acb7abea5ba15e1f7f69d1d1311465d16 · dotnet/runtime · GitHub[^] And the answer will also depend on whether the DistinctBy method gets translated to SQL, or evaluated on the client.


        "These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer

        "These people looked deep within my soul and assigned me a number based on the order in which I joined" - Homer

        1 Reply Last reply
        0
        • K Kevin Marois

          I hope this is clear. If not, I can explain further. I have a table that [looks like this](https://1drv.ms/u/s!AlkRTpT49yCMmgO6gF-1gD-8UnlR) I will be running this Linq to SQL query repeatedly

          var maxRev = 2;
          var jobSequenceSheets = (from jss in dc.JobSequenceSheets
          where jss.JobId == jobId &&
          jss.RevisionNumber == maxRev
          select jss).OrderBy(x => x.Plan)
          .ThenBy(x => x.Elevation)
          .DistinctBy(c => new { c.Plan, c.Elevation})
          .ToList();

          So, of you look at the image, given RevisionNumber 2, I should get back only two records with Plan & Elevations of 1A and 2A. There are two rows for 2A because of the Lot. I don't care about the lot. All I care about for a new feature I'm working on is that I get back a distinct set of Plan/Elevations with the SAME ID'S EACH TIME IT'S RUN. Since there are two rows for 2A, given this query, I should get back only 2 rows (not 3). Can I expect to get back the same ID's each time? Does DistinctBy use the FIRST matching row it finds? I would expect this query to give me back rows 22607 and 22608 each run. Thanks

          If it's not broken, fix it until it is. Everything makes sense in someone's mind. Ya can't fix stupid.

          Richard DeemingR Offline
          Richard DeemingR Offline
          Richard Deeming
          wrote on last edited by
          #4

          It depends. Are you using the DistinctBy method added in .NET 6, or a different implementation? And does the ORM you're using translated DistinctBy to SQL, or does it evaluate it on the client? If it evaluates it on the client, and you're using the .NET 6 method or something equivalent, then technically it will return the first item it encounters within each group. However, this is an implementation detail, and you cannot rely on it. And since you don't order by the ID, you can't guarantee that the database will return the records in any ID-related order. If you always want the lowest ID, then you need to be explicit:

          var jobSequenceSheets = dc.JobSequenceSheets
          .Where(jss => jss.JobId == jobId)
          .Where(jss => jss.RevisionNumber == maxRev)
          .GroupBy(jss => new { jss.Plan, jss.Elevation }, (_, items) => items.OrderBy(jss => jss.ID).First())
          .OrderBy(jss => jss.Plan).ThenBy(jss => jss.Elevation)
          .ToList();

          NB: Depending on your ORM, you might need to stick an .AsEnumerable() between the second .Where(...) and the .GroupBy(...) to force client evaluation of the grouping.


          "These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer

          "These people looked deep within my soul and assigned me a number based on the order in which I joined" - Homer

          1 Reply Last reply
          0
          Reply
          • Reply as topic
          Log in to reply
          • Oldest to Newest
          • Newest to Oldest
          • Most Votes


          • Login

          • Don't have an account? Register

          • Login or register to search.
          • First post
            Last post
          0
          • Categories
          • Recent
          • Tags
          • Popular
          • World
          • Users
          • Groups