Fuzzy Grouping performance issue
Hello All,

We have an SSIS package that uses a Fuzzy Grouping transformation in its Data Flow. It takes two columns from a source table and writes the output, including the match scores, to a different table. This is how we are doing it:

1. Load the required data from the source table using an OLE DB connection (source)
2. Sort the data
3. Apply Fuzzy Grouping (using a dedicated database instead of tempdb, with MinSimilarity = 0.6)
4. Send the results to the destination table using an OLE DB connection (destination)

The input table has millions of records. The package takes too long to execute, and sometimes it fails after running for 12 hours.

Any suggestions for improving performance are welcome. Appreciate your help.
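To make the role of the similarity threshold concrete, here is a minimal Python sketch of threshold-based grouping. It is not the algorithm SSIS uses internally: the scoring function (`difflib.SequenceMatcher`), the greedy strategy, and the names `similarity` and `fuzzy_group` are assumptions chosen for illustration only. It does show why the work grows quickly with row count, since every row may be compared against many existing groups.

```python
from difflib import SequenceMatcher

# Same threshold value as in the package; the scoring itself is NOT SSIS's.
MIN_SIMILARITY = 0.6


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (illustrative scoring only)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_group(values, min_similarity=MIN_SIMILARITY):
    """Greedy grouping: each value joins its best-matching canonical value
    if the score meets the threshold, otherwise it starts a new group.

    Returns a list of (value, canonical_value, score) tuples.
    """
    canonicals = []
    results = []
    for value in values:
        best, best_score = None, 0.0
        # Inner loop over existing groups: this is what makes large inputs slow.
        for candidate in canonicals:
            score = similarity(value, candidate)
            if score >= min_similarity and score > best_score:
                best, best_score = candidate, score
        if best is None:
            canonicals.append(value)
            results.append((value, value, 1.0))
        else:
            results.append((value, best, best_score))
    return results


if __name__ == "__main__":
    sample = ["Microsoft Corp", "Microsoft Corporation", "Oracle", "Orcale"]
    for value, canonical, score in fuzzy_group(sample):
        print(f"{value!r} -> {canonical!r} (score {score:.2f})")
```

In this sketch, a lower threshold lets more rows join existing groups, while a higher one creates more groups and therefore more comparisons per row.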
- Ashish