General Architecture query - SSIS / XML / BizTalk [modified]

Design and Architecture
Tags: database, sql-server, csharp, sysadmin, business
7 Posts, 4 Posters
• bgallagher_lib (#1):

Hi, I am working on a project to process three files: two CSVs and one XML. These files are moved from a file share to a SQL Server database table using BizTalk. The XML file is transformed into the same flat file format as the two CSVs by a C# component in SSIS, and then all three flat files are processed by SSIS packages. There is a lot of business logic in the SSIS transformations, and the packages also do several look-ups over linked servers. All lookups and transforms are done on a row-by-row basis, which is slow. Any errors that occur are written to a separate database table depending on the business object that caused the error (e.g. BusObj1_error, BusObj2_error, BusObj3_error). Basically, I was hoping someone could suggest a better architecture that would improve performance, allow scalability and flexibility, and let many developers work as a team on the same pieces of functionality. For example:
- Put validation rules in a database table rather than hard-coding them in SSIS.
- Instead of using different error tables, use a single error table with an ErrorTypeId FK to an ErrorType table (a sketch of this follows after this post).
- Migrate all transformations from SSIS to C# so that multiple developers can work on different business logic classes at the same time.
Thanks

    modified on Wednesday, August 3, 2011 5:28 AM
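As a rough illustration of the single-error-table idea above (all table and column names here are hypothetical, not taken from the project):

    -- Hypothetical shared error log replacing BusObj1_error, BusObj2_error, BusObj3_error.
    CREATE TABLE dbo.ErrorType
    (
        ErrorTypeId  int IDENTITY(1,1) PRIMARY KEY,
        Name         varchar(100) NOT NULL,       -- e.g. 'MissingLookup', 'BadFormat'
        Description  varchar(400) NULL
    );

    CREATE TABLE dbo.ImportError
    (
        ImportErrorId   int IDENTITY(1,1) PRIMARY KEY,
        ErrorTypeId     int NOT NULL REFERENCES dbo.ErrorType (ErrorTypeId),
        BusinessObject  varchar(50)  NOT NULL,    -- 'BusObj1', 'BusObj2', 'BusObj3'
        SourceFile      varchar(260) NULL,
        SourceRow       int          NULL,
        RawData         varchar(max) NULL,        -- the offending row, kept verbatim
        LoggedAt        datetime     NOT NULL DEFAULT GETDATE()
    );

One error table keeps the error-handling path identical for every business object, and the ErrorType lookup carries the classification that the separate per-object tables encode today.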


• jschell (#2), in reply to bgallagher_lib (#1):

The first step when something is "slow" is to determine what the goal is to make it not "slow" (are you looking for 10% better or 20,000% better?). The second step is to determine why it is slow, specifically which parts make it slow. Myself, I would suspect SSIS and the linked sources. Putting the raw data directly into the database, in work tables, and then processing it there to produce the final result would probably be much faster.
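A minimal sketch of the work-table approach, assuming the per-row lookups currently go out over a linked server (the server, database, and table names below are made up):

    -- Pull the lookup data across the linked server once, into a local work table...
    SELECT CustomerCode, CustomerId
    INTO   #CustomerLookup
    FROM   LINKEDSRV.RemoteDb.dbo.Customer;

    -- ...then resolve every row in one set-based join
    -- instead of one remote call per row.
    UPDATE w
    SET    w.CustomerId = l.CustomerId
    FROM   dbo.WorkTable    AS w
    JOIN   #CustomerLookup  AS l
           ON l.CustomerCode = w.CustomerCode;

The win comes from replacing N network round trips with one bulk copy and one join that the local query optimizer can plan.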


• bgallagher_lib (#3), in reply to jschell (#2):

Thanks, jschell. What would you say is the ideal architecture for what I have described?


• Mycroft Holmes (#4), in reply to bgallagher_lib (#1):

Fucking ETL. Whoever put the transform in ETL should be taken out and shot. Load the data into a staging table that exactly reflects the source data: NO transforms, no data types, just make all the fields varchar(###) big enough to hold your data. Now you have the data in a platform that is designed to manipulate data. Write a stored procedure that transforms the data from the staging table into your target table. This gives you a fast, repeatable process that you can manage in T-SQL, not in some obscure data/row/field/column transform object in your ETL tool. If you don't have T-SQL skills then hire someone who does; they are not uncommon.

BizTalk, and every other transform tool I have ever had to use, is garbage; I can write a bulk copy and stored procedure process that will outperform them every time. The only tool that has come close in performance is Informatica, and that will cost you $100k+ to implement.

          Never underestimate the power of human stupidity RAH
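A minimal sketch of the bulk-copy-then-stored-procedure pattern described above; the file path, table names, columns, and date/number formats are illustrative assumptions only:

    -- 1. Staging table mirrors the flat file: every column is varchar, no constraints.
    CREATE TABLE dbo.Staging_Orders
    (
        OrderNumber   varchar(50),
        CustomerCode  varchar(50),
        OrderDate     varchar(50),
        Amount        varchar(50)
    );
    GO

    -- 2. Bulk copy the file straight in, with no transforms on the way.
    BULK INSERT dbo.Staging_Orders
    FROM '\\fileshare\inbound\orders.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);
    GO

    -- 3. One set-based stored procedure does the typing, the lookups, and the load.
    CREATE PROCEDURE dbo.Load_Orders
    AS
    BEGIN
        INSERT INTO dbo.Orders (OrderNumber, CustomerId, OrderDate, Amount)
        SELECT s.OrderNumber,
               c.CustomerId,
               CONVERT(datetime, s.OrderDate, 103),   -- dd/mm/yyyy; adjust to the feed
               CONVERT(decimal(18,2), s.Amount)
        FROM   dbo.Staging_Orders AS s
        JOIN   dbo.Customer       AS c
               ON c.CustomerCode = s.CustomerCode;
    END;
    GO

Rows that fail the conversions or the join could be diverted to a shared error table like the one sketched under the first post, which keeps the whole load restartable from the staging data.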


• bgallagher_lib (#5), in reply to Mycroft Holmes (#4):

            Thanks for the response, Mycroft. I would tend to agree with you on using SQL tables and procs.


• jschell (#6), in reply to bgallagher_lib (#3):

              What I would say is that the "ideal" architecture depends on specific details that one can only learn by studying the business requirements and needs in depth.


• S Douglas (#7), in reply to Mycroft Holmes (#4):

                Mycroft Holmes wrote:

                f***ing ETL, whoever put the transform in ETL should be taken out and shot.

LOL, tell us how you really feel. :) Not that I disagree; the tools I've seen in action make Access look like a rockstar (Tibco and something from Oracle).


                Common sense is admitting there is cause and effect and that you can exert some control over what you understand.
