Code Project / General Programming / C#

Streaming data from one datalayer to another

Evilfish2000
#1

    I have a problem where I am creating a data layer for a project, and there is a scenario where I don't really know which solution is better. The main function of the project/service is to regularly check for changes in several databases and copy the information to a data warehouse (one big database, really). One scenario is when you add a new table to be tracked in the system. A specific task copies the entire table from the site database to the warehouse. It does this by opening a data reader, generating a batch INSERT script until it reaches a limit, and then executing the script against the data warehouse. In pseudo code it looks like this:

    using (IDbCommand clientCommand = Connection.CreateCommand())
    {
        clientCommand.CommandText = clientSql;
        using (IDataReader clientReader = clientCommand.ExecuteReader())
        {
            while (clientReader.Read())
            {
                AddDataToList(clientReader);

                // Flush a batch once the limit is reached.
                if (List.Count >= 100)
                {
                    using (var com = warehouseConnection.CreateCommand())
                    {
                        com.CommandText = GetInsertScript(List);
                        AddedCount += com.ExecuteNonQuery();
                    }
                    List.Clear();
                }
            }
            // (any remaining rows in List are flushed the same way after the loop)
        }
    }

    It's more complicated than that, with more checks and SQL code. I see two main issues with this:

    1: Database logic sits in the business logic, and I want to move it over to my data layer. That would make the code testable and easier to understand.

    2: I do not like that we wait for data from the client, and then wait for data to be written to the other database, when this could be done by two tasks that send and receive information asynchronously via a ConcurrentQueue, for instance. That would speed up the process, which is also an issue in some cases.

    The problem I am having is that I can't get all the data in one go and then add it to the warehouse. Sometimes the tables are huge, which can lead to memory issues in the service. So I want to process the data as it comes in through the client reader and continuously add it to the warehouse. But I also want the database logic in the data layer. Adding the data is simple; retrieving it is a bit more complicated. Researching this issue has given me two choices:

    Option 1 (pull from the data layer): I could use the yield keyword to return rows as they come through the reader in my data layer:

    public IEnumerable<object[]> GetTableContent()
    {
        using (var command = new SqlCommand("command", SqlConnection))
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                var items = new object[reader.FieldCount];
                reader.GetValues(items);
                yield return items;
            }
        }
    }
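
    The two-task idea from issue 2 could be sketched like this (a rough sketch only; `WriteBatch`, `RowQueue`, the batch size and capacity are illustrative names and numbers, not from any existing code). One task pulls rows from the data layer into a bounded queue, while a second drains it and writes batches to the warehouse:

    ```csharp
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    public class TableTransfer
    {
        // Bounded queue so the reader cannot run far ahead of the writer
        // (BlockingCollection wraps a ConcurrentQueue by default).
        private readonly BlockingCollection<object[]> _rows =
            new BlockingCollection<object[]>(boundedCapacity: 1000);

        public Task RunAsync(IEnumerable<object[]> source)
        {
            var producer = Task.Run(() =>
            {
                foreach (var row in source)   // e.g. GetTableContent()
                    _rows.Add(row);
                _rows.CompleteAdding();       // signal end of data
            });

            var consumer = Task.Run(() =>
            {
                var batch = new List<object[]>();
                foreach (var row in _rows.GetConsumingEnumerable())
                {
                    batch.Add(row);
                    if (batch.Count >= 100)
                    {
                        WriteBatch(batch);    // hypothetical warehouse insert
                        batch.Clear();
                    }
                }
                if (batch.Count > 0)
                    WriteBatch(batch);        // flush the remainder
            });

            return Task.WhenAll(producer, consumer);
        }

        private void WriteBatch(List<object[]> batch)
        {
            // Build and execute the INSERT script here,
            // as in the pseudo code above.
        }
    }
    ```

    This is the ConcurrentQueue approach with backpressure added: the bounded capacity keeps memory flat even for huge tables.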

    Pete OHanlon
    #2

      To be honest, neither solution really appeals to me. Have you considered using a database server to do the ETL instead? SQL Server, for instance, provides excellent ETL capabilities.

      Advanced TypeScript Programming Projects

    Evilfish2000
    #3

        That is a very interesting suggestion. I can't use it at the moment, because the system is not set up for that, but it's clear that this is the way to go. Thank you!
