Using Change Data Capture to Replicate data
-
I am building a reporting database that is pulling data from several source and different databases. The plan so far is to turn on change data capture on the source databases and process the log generated using SSIS. I can find a few papers and sample online on how to process the ubiquitous INSERT/UPDATE/DELETE log and it is helpful but I fail to find any good sample or tool with multiple tables. I can't see how I can process the items in the right order and keep integrity. Let's say we have a classic database with Product, Order and OrderLine. And this scenario:
8:00 - Create a product P1
8:01 - Create an Order O1
8:02 - Create an Order Line L1
8:03 - Delete the Order Line L1
8:04 - Delete the Order O1
8:05 - Create a new Order O1
8:06 - Create a new Order Line L1The cdc tables will contain For Product
8:00 INSERT P1
For Order
8:01 INSERT O1
8:04 DELETE O1
8:05 INSERT O1For Order Line
8:02 INSERT L1 Order:O1 Product:P1
8:03 DELETE L1
8:06 CREATE L1 Order:O1 Product:P1To keep the referential integrity of the reporting database, I would need to: - insert order lines after orders - delete order lines before orders Because of this I can't really treat a CDC at a time. As I see it the only way would be to process all CDC simulataneously in datetime order. Is there ways to do that? The source databases are Oracle and MS SQL and possibly text files. The reporting database is MS SQL
-
I am building a reporting database that is pulling data from several source and different databases. The plan so far is to turn on change data capture on the source databases and process the log generated using SSIS. I can find a few papers and sample online on how to process the ubiquitous INSERT/UPDATE/DELETE log and it is helpful but I fail to find any good sample or tool with multiple tables. I can't see how I can process the items in the right order and keep integrity. Let's say we have a classic database with Product, Order and OrderLine. And this scenario:
8:00 - Create a product P1
8:01 - Create an Order O1
8:02 - Create an Order Line L1
8:03 - Delete the Order Line L1
8:04 - Delete the Order O1
8:05 - Create a new Order O1
8:06 - Create a new Order Line L1The cdc tables will contain For Product
8:00 INSERT P1
For Order
8:01 INSERT O1
8:04 DELETE O1
8:05 INSERT O1For Order Line
8:02 INSERT L1 Order:O1 Product:P1
8:03 DELETE L1
8:06 CREATE L1 Order:O1 Product:P1To keep the referential integrity of the reporting database, I would need to: - insert order lines after orders - delete order lines before orders Because of this I can't really treat a CDC at a time. As I see it the only way would be to process all CDC simulataneously in datetime order. Is there ways to do that? The source databases are Oracle and MS SQL and possibly text files. The reporting database is MS SQL
Wow in 20+ year of writing reporting systems I have never had to consider this for of replication, imagine reading the logs! Why do normal replication tools (something else I know little about) not meet your requirements.
Never underestimate the power of human stupidity RAH
-
I am building a reporting database that is pulling data from several source and different databases. The plan so far is to turn on change data capture on the source databases and process the log generated using SSIS. I can find a few papers and sample online on how to process the ubiquitous INSERT/UPDATE/DELETE log and it is helpful but I fail to find any good sample or tool with multiple tables. I can't see how I can process the items in the right order and keep integrity. Let's say we have a classic database with Product, Order and OrderLine. And this scenario:
8:00 - Create a product P1
8:01 - Create an Order O1
8:02 - Create an Order Line L1
8:03 - Delete the Order Line L1
8:04 - Delete the Order O1
8:05 - Create a new Order O1
8:06 - Create a new Order Line L1The cdc tables will contain For Product
8:00 INSERT P1
For Order
8:01 INSERT O1
8:04 DELETE O1
8:05 INSERT O1For Order Line
8:02 INSERT L1 Order:O1 Product:P1
8:03 DELETE L1
8:06 CREATE L1 Order:O1 Product:P1To keep the referential integrity of the reporting database, I would need to: - insert order lines after orders - delete order lines before orders Because of this I can't really treat a CDC at a time. As I see it the only way would be to process all CDC simulataneously in datetime order. Is there ways to do that? The source databases are Oracle and MS SQL and possibly text files. The reporting database is MS SQL
hello veri nice
-
I am building a reporting database that is pulling data from several source and different databases. The plan so far is to turn on change data capture on the source databases and process the log generated using SSIS. I can find a few papers and sample online on how to process the ubiquitous INSERT/UPDATE/DELETE log and it is helpful but I fail to find any good sample or tool with multiple tables. I can't see how I can process the items in the right order and keep integrity. Let's say we have a classic database with Product, Order and OrderLine. And this scenario:
8:00 - Create a product P1
8:01 - Create an Order O1
8:02 - Create an Order Line L1
8:03 - Delete the Order Line L1
8:04 - Delete the Order O1
8:05 - Create a new Order O1
8:06 - Create a new Order Line L1The cdc tables will contain For Product
8:00 INSERT P1
For Order
8:01 INSERT O1
8:04 DELETE O1
8:05 INSERT O1For Order Line
8:02 INSERT L1 Order:O1 Product:P1
8:03 DELETE L1
8:06 CREATE L1 Order:O1 Product:P1To keep the referential integrity of the reporting database, I would need to: - insert order lines after orders - delete order lines before orders Because of this I can't really treat a CDC at a time. As I see it the only way would be to process all CDC simulataneously in datetime order. Is there ways to do that? The source databases are Oracle and MS SQL and possibly text files. The reporting database is MS SQL