Dealing with Duplicate Records

    xdavidx
    wrote:
    #1

    I am writing an IRC bot that keeps track of people's usernames and IP addresses whenever they join my channel. It's a pretty active channel, so I am adding records quite often. I was just curious how I should deal with duplicate records. The way I see it, I have 3 options:

    1. Before adding a record to my database, check to see if it already exists.
    2. Add the record to the DB, then once a week or so run a procedure to purge all duplicate records.
    3. Don't worry about having dupes in the DB.

    Option 3 is obviously the easiest :-D but the least efficient (I think?). I am planning on doing searches and reports against the data, so with that in mind, which option is my best bet?
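For reference, option 2 (a periodic purge) could look roughly like the sketch below. It assumes SQLite through Python's standard `sqlite3` module; the table name `joins`, its columns, and the sample rows are hypothetical, since the thread never says which database or schema the bot uses. The purge keeps the earliest row of each (username, ip) pair and relies on SQLite's implicit `rowid`.

```python
import sqlite3

# Hypothetical schema: one row per channel join, no constraints at all.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE joins (username TEXT NOT NULL, ip TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO joins (username, ip) VALUES (?, ?)",
    [
        ("alice", "10.0.0.1"),
        ("alice", "10.0.0.1"),  # exact duplicate
        ("bob", "10.0.0.2"),
        ("alice", "10.0.0.3"),  # same user, different IP: not a duplicate
        ("bob", "10.0.0.2"),    # exact duplicate
    ],
)


def purge_duplicates(conn):
    """Keep the earliest row of each (username, ip) pair; return rows deleted."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "DELETE FROM joins WHERE rowid NOT IN ("
            "SELECT MIN(rowid) FROM joins GROUP BY username, ip)"
        )
    return cur.rowcount


removed = purge_duplicates(conn)
remaining = conn.execute(
    "SELECT username, ip FROM joins ORDER BY rowid"
).fetchall()
print(removed, remaining)
```

Between purges, any report run against the table still sees the duplicates, which is the main drawback of this option compared with option 1.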

      Christian Graus
      wrote:
      #2

      Why not add a constraint to your database and make the username a primary key? Then check whether the record exists before adding it, and offer the user the chance to update their record (assuming they have passed some check to ensure they are the original owner of the username).

      Christian

      No offense, but I don't really want to encourage the creation of another VB developer. - Larry Antram 22 Oct 2002
      C# will attract all comers, where VB is for IT Journalists and managers - Michael P Butler 05-12-2002
      Again, you can screw up a C/C++ program just as easily as a VB program. OK, maybe not as easily, but it's certainly doable. - Jamie Nordmeyer - 15-Nov-2002

        leppie
        wrote:
        #3

        Adding a UNIQUE constraint to that column will prevent you from adding another row with the same value. :)

        WebBoxes - Yet another collapsible control, but it relies on a "graphics server" for dynamic pretty rounded corners, cool arrows and unlimited font support.
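A minimal sketch of the UNIQUE-constraint suggestion, assuming SQLite through Python's standard `sqlite3` module. The `users` table, its columns, and the `add_user` helper are hypothetical names chosen for illustration; the thread does not say which database the bot uses. With the constraint in place the database itself rejects the second insert, and the bot only has to handle the error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the UNIQUE constraint makes the database reject repeats.
conn.execute(
    "CREATE TABLE users (username TEXT NOT NULL UNIQUE, ip TEXT NOT NULL)"
)


def add_user(conn, username, ip):
    """Insert a row; return False if the UNIQUE constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (username, ip) VALUES (?, ?)",
                (username, ip),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = add_user(conn, "xdavidx", "10.0.0.1")
second = add_user(conn, "xdavidx", "10.0.0.2")  # same username: rejected
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(first, second, row_count)
```

Catching the constraint error avoids a separate "does it exist?" query, so there is no window between the check and the insert in which another writer could add the same username.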

          mwilliamson
          wrote:
          #4

          The simple way to solve your problem is to run

              DELETE FROM Table WHERE sUserName = 'whatever'
              INSERT INTO Table (...) VALUES (...)

          every time. This way you will have no dupes. You can also make a stored procedure along these lines (T-SQL style):

              IF EXISTS (SELECT * FROM Table WHERE sUserName = 'whatever')
                  UPDATE Table SET ....
              ELSE
                  INSERT INTO Table (...) VALUES (...)
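Both approaches above can be sketched as follows, assuming SQLite through Python's standard `sqlite3` module rather than a stored procedure. The `users` table and the helper names `replace_user` and `update_or_insert` are hypothetical. Each helper runs its statements inside one transaction, so a failure between the two statements cannot leave the username missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with no constraint, so the helpers must prevent dupes.
conn.execute("CREATE TABLE users (username TEXT NOT NULL, ip TEXT NOT NULL)")


def replace_user(conn, username, ip):
    """Delete-then-insert: remove any existing rows, then add the new one."""
    with conn:  # one transaction: both statements commit or neither does
        conn.execute("DELETE FROM users WHERE username = ?", (username,))
        conn.execute(
            "INSERT INTO users (username, ip) VALUES (?, ?)", (username, ip)
        )


def update_or_insert(conn, username, ip):
    """Update the existing row if there is one, otherwise insert."""
    with conn:
        cur = conn.execute(
            "UPDATE users SET ip = ? WHERE username = ?", (ip, username)
        )
        if cur.rowcount == 0:  # no row matched, so this username is new
            conn.execute(
                "INSERT INTO users (username, ip) VALUES (?, ?)",
                (username, ip),
            )


replace_user(conn, "alice", "10.0.0.1")
replace_user(conn, "alice", "10.0.0.2")
update_or_insert(conn, "bob", "10.0.0.3")
update_or_insert(conn, "bob", "10.0.0.4")
rows = conn.execute(
    "SELECT username, ip FROM users ORDER BY username"
).fetchall()
print(rows)
```

Checking `rowcount` after the UPDATE replaces the separate `IF EXISTS` query. Either helper keeps only the latest IP per username, which may or may not be what the reports need.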

            Tatham
            wrote:
            #5

            I see the fastest and easiest option as being number 1. However, what is your DB running on: Access, SQL Server? If you're running on either of those (at least with SQL Server), it would be extremely easy to implement option 1. Also, if you're effectively storing a hashtable of usernames and IPs, there are rarely going to be dupes. I might sign in 5 times over 5 days and be listed as 5 unique entries, one for each of my new IPs. Very few IRC go'ers will have a static IP.

            Tatham Oddie (VB.NET/C#/ASP.NET/VB6/ASP/JavaScript) tatham@e-oddie.com +61 414 275 989
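To illustrate the point about username/IP pairs, here is a small sketch that assumes SQLite through Python's standard `sqlite3` module. The `joins` table, the `record_join` helper, the user names, and the documentation-range IP addresses are all hypothetical. It treats a record as a duplicate only when both the username and the IP repeat, and uses `INSERT OR IGNORE`, which is SQLite-specific; other databases would need their own equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: uniqueness applies to the (username, ip) pair.
conn.execute(
    "CREATE TABLE joins ("
    "username TEXT NOT NULL, ip TEXT NOT NULL, UNIQUE (username, ip))"
)


def record_join(conn, username, ip):
    """Record a join; a repeated (username, ip) pair is silently skipped."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO joins (username, ip) VALUES (?, ?)",
            (username, ip),
        )


# A user on a dynamic IP joins on five days with five different addresses.
for day in range(1, 6):
    record_join(conn, "dynamic_user", f"203.0.113.{day}")

# A user on a static IP joins three times from the same address.
for _ in range(3):
    record_join(conn, "static_user", "198.51.100.7")

counts = dict(
    conn.execute("SELECT username, COUNT(*) FROM joins GROUP BY username")
)
print(counts)
```

Under these assumptions the dynamic-IP user ends up with five rows and the static-IP user with one, so duplicates only matter for users whose address stays the same between joins.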

              Rein Hillmann
              wrote:
              #6

              Tatham wrote: "Very few IRC go'ers will have a static IP."

              You mean except for those people who IRC from behind corporate gateways and those on cable and DSL modems (with static IPs)? :rolleyes:

                Tatham
                wrote:
                #7

                Reinout Hillmann wrote: "who IRC from behind corporate gateways"

                IRC isn't exactly the most popular thing behind a company firewall, and using it behind a COMPANY firewall means you're not doing your job.

                Reinout Hillmann wrote: "and those on Cable and DSL modems"

                Not everybody has cable/DSL.

                Reinout Hillmann wrote: "(with static IPs)"

                Even with cable/DSL, if you turn the modem off and then back on, your IP can still change. FEW will have a static IP. Not NONE, FEW.
