Design a system to transform source files to a target file
-
I need to design a system that will process hundreds of source files (in different formats) and convert them to one target file. All the source files are fixed-width or delimited files.

There should be two interfaces:
1. Command line: to run the transform job through batch.
2. Web user interface: to define the format of a source file and the mapping from that source file to the target file, i.e. a one-time job for each source file.

What is the correct approach?
- Should I create one staging table per source file to stage the data?
- Should I create the staging table at run time, when the user defines the layout of the source file? (Most are fixed-width mainframe files.)
- Should I create only one generic table with around 100 columns, all of type varchar?

I am looking for the best approach to design the system. Performance is very critical for this app: there are hundreds of files, and we need to transform all of them daily within a fixed time window.

Thanks in advance,
Akshay
Lucky akky keep smiling
-
I've built many of these and not once have I used a command line interface. I use a service to do timed and repeated processing.

Generally I use a separate staging file/database for each source file or set of files. I find you can usually group the files into sets that share the same data structure. I also use stored procs to do the processing from the staging tables to the final data table. This may be frowned upon, as it is less flexible than a full ETL tool, but I find it suits my style.

I ended up with a WinForms app that lets the user configure a file for loading: defining the title and data rows and the delimiter or column widths, and creating a staging table with varchar fields so I can either BCP or bulk copy into the staging table. I then assign one of about eight procedures to process the data into the final data table.

BCP in SQL Server 2005/2008 is more fragile than in 2000, so I use bulk copy a lot. It is slower but more robust.
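To make the per-file configuration idea concrete, here is a minimal sketch in Python of how a stored layout (title rows to skip, plus either column widths or a delimiter) might drive the parsing step before the bulk load. The `FileLayout` class, its field names, and `parse_rows` are hypothetical and only illustrate the idea; the actual app described above is a WinForms tool and its internals are not shown in this thread.

```python
import csv
import itertools
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class FileLayout:
    """Hypothetical per-file configuration, as a user might define it once."""
    name: str
    columns: List[str]
    widths: Optional[List[int]] = None  # set for fixed-width files
    delimiter: Optional[str] = None     # set for delimited files
    header_rows: int = 0                # title rows to skip before the data


def parse_rows(layout: FileLayout, lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield one list of string fields per data row, ready for a bulk load."""
    data = itertools.islice(lines, layout.header_rows, None)
    if layout.widths is not None:
        # Fixed-width: slice each line at the configured column boundaries.
        for line in data:
            line = line.rstrip("\r\n")
            fields, pos = [], 0
            for width in layout.widths:
                fields.append(line[pos:pos + width].strip())
                pos += width
            yield fields
    elif layout.delimiter is not None:
        # Delimited: let the csv module handle quoting and splitting.
        yield from csv.reader(data, delimiter=layout.delimiter)
    else:
        raise ValueError(f"layout {layout.name!r} has neither widths nor delimiter")


# Example with made-up data: one title row, then one fixed-width record.
fixed = FileLayout("accounts", ["id", "name", "amount"],
                   widths=[4, 10, 6], header_rows=1)
fixed_rows = list(parse_rows(fixed, ["HEADER\n", "0001John Smith000150\n"]))
```

Every field stays a string here, which matches the all-varchar staging table: type conversion and validation are left to the procedure that moves data from staging to the final table.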
-
Create the stage table at run time, depending on the format of the target file.
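As a rough sketch of what creating the stage table at run time could look like, the Python function below builds a CREATE TABLE statement with all-varchar columns from a list of column names and widths. The function name `staging_table_ddl`, the `stg_accounts` table, and the square-bracket quoting (SQL Server style) are assumptions for illustration only; adjust the quoting and types for your database.

```python
import re
from typing import List, Tuple

# Only plain identifiers are accepted, because the names come from user input
# and are spliced into the SQL text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def staging_table_ddl(table: str, columns: List[Tuple[str, int]]) -> str:
    """Build a CREATE TABLE statement with one VARCHAR column per field."""
    if not columns:
        raise ValueError("at least one column is required")
    for name in [table] + [col for col, _ in columns]:
        if not _IDENT.match(name):
            raise ValueError(f"invalid identifier: {name!r}")
    for col, width in columns:
        if width <= 0:
            raise ValueError(f"column {col!r} needs a positive width")
    cols = ",\n".join(f"    [{col}] VARCHAR({width})" for col, width in columns)
    return f"CREATE TABLE [{table}] (\n{cols}\n);"


# Hypothetical layout: a two-column fixed-width file.
ddl = staging_table_ddl("stg_accounts", [("id", 4), ("name", 10)])
print(ddl)
```

The identifier check matters because table and column names cannot be passed as query parameters, so validating them is the main defence against malformed or malicious layout definitions.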