Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. Database & SysAdmin
  3. Database
  4. Stored Procedure to Handle CSV

Stored Procedure to Handle CSV

Scheduled Pinned Locked Moved Database
databasealgorithmshelptutorialannouncement
7 Posts 3 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • F Offline
    F Offline
    Franklin Smith
    wrote on last edited by
    #1

    I have done some searching, and I do understand how to do some of what I want to do. But I can't just do a bulk insert into the database I'm already using. Sample Data

    Real Estate Number,Owner,Street Number,Street,Street Type,Street Direction,Unit,City,Zip Code
    004306-5000,MANWARREN JAMES P,10933,BRIDGES,RD,,,Jacksonville,32225-
    013092-0164,WILSON DANTE J,8438,MC GIRTS VILLAGE,LN,,,Jacksonville,32225
    013100-1035,LEWIS GARNET A,4435,JACKSON HOLE,CT,,,Jacksonville,32225
    048496-0040,ALLEN JON R,2830,WICKWIRE,ST,,2,Jacksonville,32225-
    112862-1000,CITY OF JACKSONVILLE,0,FT CAROLINE,RD,,,Jacksonville,32225
    112863-0000,ONAS CORPORATION,0,FT CAROLINE,RD,,,Jacksonville,32225

    Database structure

    SELECT [PropertyID]
    ,[RealEstateNumber]
    ,[Name]
    ,[StreetNumber]
    ,[StreetName]
    ,[Type]
    ,[Direction]
    ,[Unit]
    ,[City]
    ,[ZipCode]
    ,[LastReviewed]
    FROM [DuvalFL].[dbo].[PropertyLookupInformation]

    PropertyID is a uniqueidentifier. It defaults to newid() LastReviewed is a datetime. I need to get the CSV file(File name changes each time) into a temp table. I don't have a problem with an actual table being made to do a temporary insert and then dropping it at the end of the procedure. Truncate the Zip Code field to 5 characters.(some of them have a "-" after the first 5 digits) If the Real Estate Number exists in the database then update the record. Otherwise Insert a new record. I could do all this from my program, but the DB is faster than using for or foreach statements to process each record.

    M P 2 Replies Last reply
    0
    • F Franklin Smith

      I have done some searching, and I do understand how to do some of what I want to do. But I can't just do a bulk insert into the database I'm already using. Sample Data

      Real Estate Number,Owner,Street Number,Street,Street Type,Street Direction,Unit,City,Zip Code
      004306-5000,MANWARREN JAMES P,10933,BRIDGES,RD,,,Jacksonville,32225-
      013092-0164,WILSON DANTE J,8438,MC GIRTS VILLAGE,LN,,,Jacksonville,32225
      013100-1035,LEWIS GARNET A,4435,JACKSON HOLE,CT,,,Jacksonville,32225
      048496-0040,ALLEN JON R,2830,WICKWIRE,ST,,2,Jacksonville,32225-
      112862-1000,CITY OF JACKSONVILLE,0,FT CAROLINE,RD,,,Jacksonville,32225
      112863-0000,ONAS CORPORATION,0,FT CAROLINE,RD,,,Jacksonville,32225

      Database structure

      SELECT [PropertyID]
      ,[RealEstateNumber]
      ,[Name]
      ,[StreetNumber]
      ,[StreetName]
      ,[Type]
      ,[Direction]
      ,[Unit]
      ,[City]
      ,[ZipCode]
      ,[LastReviewed]
      FROM [DuvalFL].[dbo].[PropertyLookupInformation]

      PropertyID is a uniqueidentifier. It defaults to newid() LastReviewed is a datetime. I need to get the CSV file(File name changes each time) into a temp table. I don't have a problem with an actual table being made to do a temporary insert and then dropping it at the end of the procedure. Truncate the Zip Code field to 5 characters.(some of them have a "-" after the first 5 digits) If the Real Estate Number exists in the database then update the record. Otherwise Insert a new record. I could do all this from my program, but the DB is faster than using for or foreach statements to process each record.

      M Offline
      M Offline
      Mycroft Holmes
      wrote on last edited by
      #2

      I'm pretty sure you will not be able to use the table with propertyid as the target for bulkcopy. Column count and I think format must be exact for bulk copy to work. I load everyting into a temp table designed specifically to receive the data, all fields are varchar and after loading I run a stored proc to do the transforms into the final table. This article mat help CSV to Table to CSV [^]

      Never underestimate the power of human stupidity RAH

      F 1 Reply Last reply
      0
      • M Mycroft Holmes

        I'm pretty sure you will not be able to use the table with propertyid as the target for bulkcopy. Column count and I think format must be exact for bulk copy to work. I load everyting into a temp table designed specifically to receive the data, all fields are varchar and after loading I run a stored proc to do the transforms into the final table. This article mat help CSV to Table to CSV [^]

        Never underestimate the power of human stupidity RAH

        F Offline
        F Offline
        Franklin Smith
        wrote on last edited by
        #3

        I know the bulk insert won't work on the CSV itself. I know that the below code is incorrect to do much of what I want. Problem is I don't know how make it all work together. The first part should work as is, I just need to pass the file name to the procedure. Edited the first part here to actually work. Plus adding in all the normalization updates I know need to be there.

        CREATE TABLE [dbo].[TempCsvImport](
        [RealEstateNumber] [nvarchar](11) NOT NULL,
        [Name] [nvarchar](100) NOT NULL,
        [StreetNumber] [nvarchar](10) NOT NULL,
        [StreetName] [nvarchar](50) NOT NULL,
        [Type] [nvarchar](10) NOT NULL,
        [Direction] [nvarchar](10) NULL,
        [Unit] [nvarchar](20) NULL,
        [City] [nvarchar](20) NOT NULL,
        [ZipCode] [nvarchar](10) NOT NULL)
        GO

        SET QUOTED_IDENTIFIER OFF

        DECLARE @fn varchar(8000)
        SELECT @fn = 'C:\Users\owner\Documents\Visual Studio 2010\Projects\UPA Data Miner\UPA Data Miner\bin\Debug\634556001928942124.csv'
        DECLARE @cmd varchar(8000)
        SET @cmd = "BULK INSERT dbo.[TempCsvImport] FROM '" + @fn + "' WITH(CODEPAGE='RAW',FIELDTERMINATOR = ',', ROWTERMINATOR='\n')"

        EXEC(@cmd)

        GO

        UPDATE [dbo].[TempCsvImport]
        SET [ZipCode] = SUBSTRING(ZipCode,1,5)
        WHERE LEN([ZipCode]) != 5

        GO

        UPDATE [dbo].TempCsvImport
        SET RealEstateNumber = REPLACE(RealEstateNumber,"-","")
        WHERE CHARINDEX("-",RealEstateNumber) != 0

        The next part is where I have no clue how to make it all work. The '@' parameters all have information from a record in the TempCsvImport

        /*For each record in TempCsvImport*/

        SELECT * FROM [dbo].[PropertyLookupInformation]
        WHERE [dbo].[PropertyLookupInformation].[RealEstateNumber] = @RealEstateNumber
        /*If a record is found*/
        UPDATE [dbo].[PropertyLookupInformation]
        SET [Name] = @Name
        ,[StreetNumber] = @StreetNumber
        ,[StreetName] = @StreetName
        ,[Type] = @Type
        ,[Direction] = @Direction
        ,[Unit] = @Unit
        ,[City] = @City
        ,[ZipCode] = @ZipCode
        ,[LastReviewed] = CURRENT_TIMESTAMP
        WHERE RealEstateNumber = @RealEstateNumber

        /*Otherwise insert*/

        INSERT INTO [dbo].[PropertyLookupInformation]
        ([PropertyID]
        ,[RealEstateNumber]
        ,[Name]
        ,[StreetNumber]
        ,[StreetName]
        ,[Type]
        ,[Direction]
        ,[Unit]
        ,[City]
        ,[ZipCode]
        ,[LastReviewed])
        VALUES
        ( NEWID()
        ,@RealEstateNumber

        M 1 Reply Last reply
        0
        • F Franklin Smith

          I know the bulk insert won't work on the CSV itself. I know that the below code is incorrect to do much of what I want. Problem is I don't know how make it all work together. The first part should work as is, I just need to pass the file name to the procedure. Edited the first part here to actually work. Plus adding in all the normalization updates I know need to be there.

          CREATE TABLE [dbo].[TempCsvImport](
          [RealEstateNumber] [nvarchar](11) NOT NULL,
          [Name] [nvarchar](100) NOT NULL,
          [StreetNumber] [nvarchar](10) NOT NULL,
          [StreetName] [nvarchar](50) NOT NULL,
          [Type] [nvarchar](10) NOT NULL,
          [Direction] [nvarchar](10) NULL,
          [Unit] [nvarchar](20) NULL,
          [City] [nvarchar](20) NOT NULL,
          [ZipCode] [nvarchar](10) NOT NULL)
          GO

          SET QUOTED_IDENTIFIER OFF

          DECLARE @fn varchar(8000)
          SELECT @fn = 'C:\Users\owner\Documents\Visual Studio 2010\Projects\UPA Data Miner\UPA Data Miner\bin\Debug\634556001928942124.csv'
          DECLARE @cmd varchar(8000)
          SET @cmd = "BULK INSERT dbo.[TempCsvImport] FROM '" + @fn + "' WITH(CODEPAGE='RAW',FIELDTERMINATOR = ',', ROWTERMINATOR='\n')"

          EXEC(@cmd)

          GO

          UPDATE [dbo].[TempCsvImport]
          SET [ZipCode] = SUBSTRING(ZipCode,1,5)
          WHERE LEN([ZipCode]) != 5

          GO

          UPDATE [dbo].TempCsvImport
          SET RealEstateNumber = REPLACE(RealEstateNumber,"-","")
          WHERE CHARINDEX("-",RealEstateNumber) != 0

          The next part is where I have no clue how to make it all work. The '@' parameters all have information from a record in the TempCsvImport

          /*For each record in TempCsvImport*/

          SELECT * FROM [dbo].[PropertyLookupInformation]
          WHERE [dbo].[PropertyLookupInformation].[RealEstateNumber] = @RealEstateNumber
          /*If a record is found*/
          UPDATE [dbo].[PropertyLookupInformation]
          SET [Name] = @Name
          ,[StreetNumber] = @StreetNumber
          ,[StreetName] = @StreetName
          ,[Type] = @Type
          ,[Direction] = @Direction
          ,[Unit] = @Unit
          ,[City] = @City
          ,[ZipCode] = @ZipCode
          ,[LastReviewed] = CURRENT_TIMESTAMP
          WHERE RealEstateNumber = @RealEstateNumber

          /*Otherwise insert*/

          INSERT INTO [dbo].[PropertyLookupInformation]
          ([PropertyID]
          ,[RealEstateNumber]
          ,[Name]
          ,[StreetNumber]
          ,[StreetName]
          ,[Type]
          ,[Direction]
          ,[Unit]
          ,[City]
          ,[ZipCode]
          ,[LastReviewed])
          VALUES
          ( NEWID()
          ,@RealEstateNumber

          M Offline
          M Offline
          Mycroft Holmes
          wrote on last edited by
          #4

          I never do this type of operation from within the database, too fragile with permissions and operations that are not part of a database operation. Like checking for the source file, moving it to archive and general IO work. Break your process into 2 operations, does the bulk load work! Now that you have the data in a table (change it to a temp table later). You logic of using variables implies a cursor, something to avoid if possible. What is wrong with a normal update and insert query?

          Update D Set D,Field = S.Field
          --Select *
          from DestTable D inner join SourceTable S on D.Key = S.Key

          Insert DestTable
          Select Field1....
          From SourceTable
          Where KeyField not in(select keyfield from desttable)

          Never underestimate the power of human stupidity RAH

          F 1 Reply Last reply
          0
          • M Mycroft Holmes

            I never do this type of operation from within the database, too fragile with permissions and operations that are not part of a database operation. Like checking for the source file, moving it to archive and general IO work. Break your process into 2 operations, does the bulk load work! Now that you have the data in a table (change it to a temp table later). You logic of using variables implies a cursor, something to avoid if possible. What is wrong with a normal update and insert query?

            Update D Set D,Field = S.Field
            --Select *
            from DestTable D inner join SourceTable S on D.Key = S.Key

            Insert DestTable
            Select Field1....
            From SourceTable
            Where KeyField not in(select keyfield from desttable)

            Never underestimate the power of human stupidity RAH

            F Offline
            F Offline
            Franklin Smith
            wrote on last edited by
            #5

            All the code where I created the temp table and all works like a charm. The update/insert phase is where I have a problem. It has to do with my limited knowledge of SQL. Of course during all this I found out the data was being inserted wrong so I have to redo what I've already inserted which gives me a chance to fix the problem. Thank god it isn't live data. But I finally decided to go with

            SET ANSI_WARNINGS OFF
            MERGE PropertyLookupInformation as targetDB
            USING TempCsvImport as sourceDB
            ON targetDB.RealEstateNumber = sourceDB.RealEstateNumber
            WHEN MATCHED
            THEN
            UPDATE
            SET Name = sourceDB.Name
            ,StreetNumber = sourceDB.StreetNumber
            ,StreetName = sourceDB.StreetName
            ,Type = sourceDB.Type
            ,Direction = sourceDB.Direction
            ,Unit = sourceDB.Unit
            ,City = sourceDB.City
            ,ZipCode = sourceDB.ZipCode
            ,LastReviewed = CURRENT_TIMESTAMP
            WHEN NOT MATCHED BY TARGET
            THEN
            INSERT
            (PropertyID
            ,RealEstateNumber
            ,Name
            ,StreetNumber
            ,StreetName
            ,Type
            ,Direction
            ,Unit
            ,City
            ,ZipCode
            ,LastReviewed)
            VALUES
            ( NEWID()
            ,sourceDB.RealEstateNumber
            ,sourceDB.Name
            ,sourceDB.StreetNumber
            ,sourceDB.StreetName
            ,sourceDB.Type
            ,sourceDB.Direction
            ,sourceDB.Unit
            ,sourceDB.City
            ,sourceDB.ZipCode
            ,CURRENT_TIMESTAMP)
            ;

            The only thing I'm not sure of doing it this way is performance issues. Now just to get it put into a stored proc and we are set and ready to go.

            M 1 Reply Last reply
            0
            • F Franklin Smith

              All the code where I created the temp table and all works like a charm. The update/insert phase is where I have a problem. It has to do with my limited knowledge of SQL. Of course during all this I found out the data was being inserted wrong so I have to redo what I've already inserted which gives me a chance to fix the problem. Thank god it isn't live data. But I finally decided to go with

              SET ANSI_WARNINGS OFF
              MERGE PropertyLookupInformation as targetDB
              USING TempCsvImport as sourceDB
              ON targetDB.RealEstateNumber = sourceDB.RealEstateNumber
              WHEN MATCHED
              THEN
              UPDATE
              SET Name = sourceDB.Name
              ,StreetNumber = sourceDB.StreetNumber
              ,StreetName = sourceDB.StreetName
              ,Type = sourceDB.Type
              ,Direction = sourceDB.Direction
              ,Unit = sourceDB.Unit
              ,City = sourceDB.City
              ,ZipCode = sourceDB.ZipCode
              ,LastReviewed = CURRENT_TIMESTAMP
              WHEN NOT MATCHED BY TARGET
              THEN
              INSERT
              (PropertyID
              ,RealEstateNumber
              ,Name
              ,StreetNumber
              ,StreetName
              ,Type
              ,Direction
              ,Unit
              ,City
              ,ZipCode
              ,LastReviewed)
              VALUES
              ( NEWID()
              ,sourceDB.RealEstateNumber
              ,sourceDB.Name
              ,sourceDB.StreetNumber
              ,sourceDB.StreetName
              ,sourceDB.Type
              ,sourceDB.Direction
              ,sourceDB.Unit
              ,sourceDB.City
              ,sourceDB.ZipCode
              ,CURRENT_TIMESTAMP)
              ;

              The only thing I'm not sure of doing it this way is performance issues. Now just to get it put into a stored proc and we are set and ready to go.

              M Offline
              M Offline
              Mycroft Holmes
              wrote on last edited by
              #6

              Brownie points for using the MERGE functions, I dislike them only b/c I consider separate update/insert queries simpler to support. You may be right to suspect the performance but I have no helpful info on that.

              Never underestimate the power of human stupidity RAH

              1 Reply Last reply
              0
              • F Franklin Smith

                I have done some searching, and I do understand how to do some of what I want to do. But I can't just do a bulk insert into the database I'm already using. Sample Data

                Real Estate Number,Owner,Street Number,Street,Street Type,Street Direction,Unit,City,Zip Code
                004306-5000,MANWARREN JAMES P,10933,BRIDGES,RD,,,Jacksonville,32225-
                013092-0164,WILSON DANTE J,8438,MC GIRTS VILLAGE,LN,,,Jacksonville,32225
                013100-1035,LEWIS GARNET A,4435,JACKSON HOLE,CT,,,Jacksonville,32225
                048496-0040,ALLEN JON R,2830,WICKWIRE,ST,,2,Jacksonville,32225-
                112862-1000,CITY OF JACKSONVILLE,0,FT CAROLINE,RD,,,Jacksonville,32225
                112863-0000,ONAS CORPORATION,0,FT CAROLINE,RD,,,Jacksonville,32225

                Database structure

                SELECT [PropertyID]
                ,[RealEstateNumber]
                ,[Name]
                ,[StreetNumber]
                ,[StreetName]
                ,[Type]
                ,[Direction]
                ,[Unit]
                ,[City]
                ,[ZipCode]
                ,[LastReviewed]
                FROM [DuvalFL].[dbo].[PropertyLookupInformation]

                PropertyID is a uniqueidentifier. It defaults to newid() LastReviewed is a datetime. I need to get the CSV file(File name changes each time) into a temp table. I don't have a problem with an actual table being made to do a temporary insert and then dropping it at the end of the procedure. Truncate the Zip Code field to 5 characters.(some of them have a "-" after the first 5 digits) If the Real Estate Number exists in the database then update the record. Otherwise Insert a new record. I could do all this from my program, but the DB is faster than using for or foreach statements to process each record.

                P Online
                P Online
                PIEBALDconsult
                wrote on last edited by
                #7

                When I've used BCP in the past, I've created a special table to hold the raw data (all as varchar) and used a trigger to move (and convert) the data to the real destination. This technique also allows for adding foreign keys as necessary. On the other hand, I only use BCP for occasional imports, if I have data I need to import frequently I write a console app or Windows Service to do it.

                1 Reply Last reply
                0
                Reply
                • Reply as topic
                Log in to reply
                • Oldest to Newest
                • Newest to Oldest
                • Most Votes


                • Login

                • Don't have an account? Register

                • Login or register to search.
                • First post
                  Last post
                0
                • Categories
                • Recent
                • Tags
                • Popular
                • World
                • Users
                • Groups