Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. Database & SysAdmin
  3. Database
  4. SQLite: Correlated "EXISTS" vs uncorrelated "IN"

SQLite: Correlated "EXISTS" vs uncorrelated "IN"

Scheduled Pinned Locked Moved Database
discussiondatabasesqlitevisual-studioregex
6 Posts 5 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • M Offline
    M Offline
    Midi_Mick
    wrote on last edited by
    #1

    Just after opinions here: which of these would be the best practice for an SQLite database:

    DELETE FROM Messages WHERE Account = @acct AND NOT EXISTS (SELECT * FROM temp.IdList WHERE MailId=Id)
    or
    DELETE FROM Messages WHERE Account = @acct AND Id NOT IN (SELECT MailId FROM temp.IdList)

    Indexes exist for both Messages(Account, Id) and temp.IdList(MailId) Each table may hold in excess of 20000 rows From what I understand, the correlated exists will need to be evaluated for each row of the Messages table, but will stop evaluating as soon as it finds a match (and being indexed, should be pretty quick). However, with the IN expression, the full result set is evaluated only once, but the comparison will be against that full set. Cheers, Mick

    Richard Andrew x64R G C J M 5 Replies Last reply
    0
    • M Midi_Mick

      Just after opinions here: which of these would be the best practice for an SQLite database:

      DELETE FROM Messages WHERE Account = @acct AND NOT EXISTS (SELECT * FROM temp.IdList WHERE MailId=Id)
      or
      DELETE FROM Messages WHERE Account = @acct AND Id NOT IN (SELECT MailId FROM temp.IdList)

      Indexes exist for both Messages(Account, Id) and temp.IdList(MailId) Each table may hold in excess of 20000 rows From what I understand, the correlated exists will need to be evaluated for each row of the Messages table, but will stop evaluating as soon as it finds a match (and being indexed, should be pretty quick). However, with the IN expression, the full result set is evaluated only once, but the comparison will be against that full set. Cheers, Mick

      Richard Andrew x64R Offline
      Richard Andrew x64R Offline
      Richard Andrew x64
      wrote on last edited by
      #2

      Just going by your analysis, I would say the NOT EXISTS would be better. As you say if the table is indexed it should be pretty quick.

      The difficult we do right away... ...the impossible takes slightly longer.

      1 Reply Last reply
      0
      • M Midi_Mick

        Just after opinions here: which of these would be the best practice for an SQLite database:

        DELETE FROM Messages WHERE Account = @acct AND NOT EXISTS (SELECT * FROM temp.IdList WHERE MailId=Id)
        or
        DELETE FROM Messages WHERE Account = @acct AND Id NOT IN (SELECT MailId FROM temp.IdList)

        Indexes exist for both Messages(Account, Id) and temp.IdList(MailId) Each table may hold in excess of 20000 rows From what I understand, the correlated exists will need to be evaluated for each row of the Messages table, but will stop evaluating as soon as it finds a match (and being indexed, should be pretty quick). However, with the IN expression, the full result set is evaluated only once, but the comparison will be against that full set. Cheers, Mick

        G Offline
        G Offline
        GuyThiebaut
        wrote on last edited by
        #3

        I cannot speak for SQLite however I can for SQL Server(which perhaps does not help very much here). It is best to never use a NOT IN statement, where the results returned could contain many rows, as it will need to evaluate for every single row in the inner select. Whereas with a NOT EXISTS or a LEFT JOIN with a IS NOT NULL condition, which is another option, it will evaluate as false as soon as it hits the first row where there is a match. So basically I agree with your evaluation - for what it's worth I always use NOT EXISTS nowadays.

        “That which can be asserted without evidence, can be dismissed without evidence.”

        ― Christopher Hitchens

        1 Reply Last reply
        0
        • M Midi_Mick

          Just after opinions here: which of these would be the best practice for an SQLite database:

          DELETE FROM Messages WHERE Account = @acct AND NOT EXISTS (SELECT * FROM temp.IdList WHERE MailId=Id)
          or
          DELETE FROM Messages WHERE Account = @acct AND Id NOT IN (SELECT MailId FROM temp.IdList)

          Indexes exist for both Messages(Account, Id) and temp.IdList(MailId) Each table may hold in excess of 20000 rows From what I understand, the correlated exists will need to be evaluated for each row of the Messages table, but will stop evaluating as soon as it finds a match (and being indexed, should be pretty quick). However, with the IN expression, the full result set is evaluated only once, but the comparison will be against that full set. Cheers, Mick

          C Offline
          C Offline
          CHill60
          wrote on last edited by
          #4

          I agree with the two previous comments - NOT EXISTS will give better performance. I found this article some time ago - again it's for SQL rather than SQLite but the principles still apply ... NOT IN vs. NOT EXISTS vs. OUTER APPLY vs. OUTER JOIN[^]

          1 Reply Last reply
          0
          • M Midi_Mick

            Just after opinions here: which of these would be the best practice for an SQLite database:

            DELETE FROM Messages WHERE Account = @acct AND NOT EXISTS (SELECT * FROM temp.IdList WHERE MailId=Id)
            or
            DELETE FROM Messages WHERE Account = @acct AND Id NOT IN (SELECT MailId FROM temp.IdList)

            Indexes exist for both Messages(Account, Id) and temp.IdList(MailId) Each table may hold in excess of 20000 rows From what I understand, the correlated exists will need to be evaluated for each row of the Messages table, but will stop evaluating as soon as it finds a match (and being indexed, should be pretty quick). However, with the IN expression, the full result set is evaluated only once, but the comparison will be against that full set. Cheers, Mick

            J Offline
            J Offline
            Jorgen Andersson
            wrote on last edited by
            #5

            NOT IN and NOT EXISTS are not directly comparable since they treat nulls differently. EXISTS returns TRUE or FALSE only, IN is trivalent and returns TRUE, FALSE or NULL. Use the one that gives you the result you expect. (Most probably NOT EXISTS)

            Wrong is evil and must be defeated. - Jeff Ello

            1 Reply Last reply
            0
            • M Midi_Mick

              Just after opinions here: which of these would be the best practice for an SQLite database:

              DELETE FROM Messages WHERE Account = @acct AND NOT EXISTS (SELECT * FROM temp.IdList WHERE MailId=Id)
              or
              DELETE FROM Messages WHERE Account = @acct AND Id NOT IN (SELECT MailId FROM temp.IdList)

              Indexes exist for both Messages(Account, Id) and temp.IdList(MailId) Each table may hold in excess of 20000 rows From what I understand, the correlated exists will need to be evaluated for each row of the Messages table, but will stop evaluating as soon as it finds a match (and being indexed, should be pretty quick). However, with the IN expression, the full result set is evaluated only once, but the comparison will be against that full set. Cheers, Mick

              M Offline
              M Offline
              Midi_Mick
              wrote on last edited by
              #6

              Ok guys - I guess the verdict's in. It's what I thought initially thought would be the case too. Its just that the doco on Correlated vs Uncorrelated sub-queries threw me a little, where correlated sub-queries are re-evaluated for each row of the main query. I do agree, however, that re-evaluating the sub-query will be significantly better than searching the full result set. Cheers people.

              1 Reply Last reply
              0
              Reply
              • Reply as topic
              Log in to reply
              • Oldest to Newest
              • Newest to Oldest
              • Most Votes


              • Login

              • Don't have an account? Register

              • Login or register to search.
              • First post
                Last post
              0
              • Categories
              • Recent
              • Tags
              • Popular
              • World
              • Users
              • Groups