Converting to a Relational Database [modified]
-
In my company we are using a SQL Server 2005 Database to manage all our client and job records. However this database was originally designed by someone in the company who didn't know much about the database. We are reaching a point in our growth were this database needs to be converted to a relational database. So I copied the database locally to work with it. I used Access (2010) to analyze the database and set up the tables. When I go through the wizard I get to the last step where it shows all the data that seems to be similar so I can select what each record should be (correcting typographical errors step). This will eliminate duplicate client names, site names, etc. However there are over 10,000 records in the database, and to go through this manually would take a long time, as it seems there is no way to start it and finish it at a later time. Does anyone have an recommendation on how I can go about making this flat database into a relational database relatively easily? There is approximately between 16,000 and 17,000 records. Thanks in advanced for any help or suggestions.
modified on Friday, March 11, 2011 11:48 PM
-
In my company we are using a SQL Server 2005 Database to manage all our client and job records. However this database was originally designed by someone in the company who didn't know much about the database. We are reaching a point in our growth were this database needs to be converted to a relational database. So I copied the database locally to work with it. I used Access (2010) to analyze the database and set up the tables. When I go through the wizard I get to the last step where it shows all the data that seems to be similar so I can select what each record should be (correcting typographical errors step). This will eliminate duplicate client names, site names, etc. However there are over 10,000 records in the database, and to go through this manually would take a long time, as it seems there is no way to start it and finish it at a later time. Does anyone have an recommendation on how I can go about making this flat database into a relational database relatively easily? There is approximately between 16,000 and 17,000 records. Thanks in advanced for any help or suggestions.
modified on Friday, March 11, 2011 11:48 PM
Hi, One possibility is to use 'temporary' tables and
DISTINCT
queries. For example if you have a table with column SiteName and you have duplicate site names, you could do something like:SELECT DISTINCT SiteName INTO NewSites FROM Sites;
TRNCATE TABLE Sites;
INSERT INTO Sites SELECT * FROM NewSites;
DROP TABLE NewSites;In the real situation you most likely have more coĺumns and possibly other conditions. This is why I chose this 'temporary' table approach since you can easily check if the result is fine before you delete the original data and possibly do the duplicate elimination in smaller parts if that's more suitable. This can also be done in-place if you have some kind of mechanism to identify the row. If a table has an Id that's preferred and if it doesn't you could perhaps modify this http://www.codeproject.com/Tips/159984/How-to-remove-duplicate-rows-in-SQL-Server-2005-wh.aspx[^] to your needs.
The need to optimize rises from a bad design.My articles[^]
-
In my company we are using a SQL Server 2005 Database to manage all our client and job records. However this database was originally designed by someone in the company who didn't know much about the database. We are reaching a point in our growth were this database needs to be converted to a relational database. So I copied the database locally to work with it. I used Access (2010) to analyze the database and set up the tables. When I go through the wizard I get to the last step where it shows all the data that seems to be similar so I can select what each record should be (correcting typographical errors step). This will eliminate duplicate client names, site names, etc. However there are over 10,000 records in the database, and to go through this manually would take a long time, as it seems there is no way to start it and finish it at a later time. Does anyone have an recommendation on how I can go about making this flat database into a relational database relatively easily? There is approximately between 16,000 and 17,000 records. Thanks in advanced for any help or suggestions.
modified on Friday, March 11, 2011 11:48 PM
DisIsHoody wrote:
This will eliminate duplicate client names, site names, etc. However there are over 10,000 records in the database, and to go through this manually would take a long time, as it seems there is no way to start it and finish it at a later time. Does anyone have an recommendation on how I can go about making this flat database into a relational database relatively easily?
Second did not cause the first problem. That same problem can occur in a relational database. As per the other posting you can use distinct to eliminate duplicates but only on exact matches. Consider the following entries. Mikes Auto 431 W. Main Street Mike Auto 431 W. Main Street Mikes Auto 431 West Main St. Which of them are duplicates? Can one create code that eliminate duplicates from a list like this? Yes. Is it worth it for 10,000 records? Depends.
-
In my company we are using a SQL Server 2005 Database to manage all our client and job records. However this database was originally designed by someone in the company who didn't know much about the database. We are reaching a point in our growth were this database needs to be converted to a relational database. So I copied the database locally to work with it. I used Access (2010) to analyze the database and set up the tables. When I go through the wizard I get to the last step where it shows all the data that seems to be similar so I can select what each record should be (correcting typographical errors step). This will eliminate duplicate client names, site names, etc. However there are over 10,000 records in the database, and to go through this manually would take a long time, as it seems there is no way to start it and finish it at a later time. Does anyone have an recommendation on how I can go about making this flat database into a relational database relatively easily? There is approximately between 16,000 and 17,000 records. Thanks in advanced for any help or suggestions.
modified on Friday, March 11, 2011 11:48 PM
Watch using Access to SQL as Microsoft helps you by creating columns that are NOT what is wanted. I have to live with postal code being a float because the person moving data from Access to SQL did not know what they were doing. As long as we only do work in the US it is not a major problem but comparison can be tricky.