Data structure design for data aggregation
-
Hello everyone,

I would like your advice on how to design a data structure that is as efficient as possible for the following scenario.

One thread creates data items (each composed of an ID and content) and outputs them to a queue or some other structure (the data structure can be chosen to suit the scenario). This thread produces data very frequently. Another thread aggregates the data (for each ID, it aggregates the content and writes the result to a file). The aggregation thread runs much less often: it sleeps for 10 minutes, aggregates, and then sleeps again.

I am looking for a solution that balances two goals:
1. Minimal performance impact on the data creation thread;
2. An aggregation thread that works as efficiently as possible and consumes as little memory as possible.

Any advice about how to design the data structures?

Currently,
- I am stupidly using a List, which the data creation thread appends to. I think appending to a List has less performance impact on the data creation thread than inserting into a Dictionary. Am I correct?
- The other thread reads the List from beginning to end, using the ID as the key into a Dictionary. Since there may be duplicate IDs, when I insert into the Dictionary I check whether it already contains the key; if it does, I update the existing entry, otherwise I insert a new one.
- I lock the whole List to make this thread-safe. Is locking the whole List too heavy? Are there any smarter ways? :-)

Thanks in advance,
George
-
I think this would be better done with a database. Databases have built-in functions for maintaining IDs. They also have functions for aggregating data in most of the ways you can think of.
-
Thanks Christian, Is there a built-in data structure in C# like multimap in C++, which allows one key to map to multiple values? Regards, George
-
I don't think so.
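For what it is worth, the closest thing I am aware of is the read-only ILookup that Enumerable.ToLookup produces (System.Linq, so this assumes .NET 3.5 or later). It is built once from a sequence and cannot be added to afterwards, so it is not a true multimap. A minimal sketch, where the class and method names are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: groups (ID, content) pairs so that one ID
// maps to every content value that was recorded for it.
public static class LookupDemo
{
    public static ILookup<int, string> Group(IEnumerable<KeyValuePair<int, string>> pairs)
    {
        // ToLookup keeps all values per key, unlike Dictionary,
        // which allows only one value per key.
        return pairs.ToLookup(p => p.Key, p => p.Value);
    }
}
```

Indexing the result with an ID that was never added gives an empty sequence rather than throwing. If the map has to be filled incrementally, a Dictionary whose values are Lists is the usual substitute.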
-
Thanks Christian, If not using a database, what is the best solution you can think of using memory-based data structures? Regards, George
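
One memory-based option, sketched here under assumptions rather than offered as the definitive answer: keep the List, but have the aggregation thread swap it for an empty one while holding the lock, and do all the Dictionary work afterwards without the lock. The producer then only ever waits for a single Add or a reference swap. Item, SwapBuffer, Drain and Aggregate are hypothetical names, and the int ID and string content types are guesses, since the thread does not specify them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type: the thread only says each item has an ID and content.
public struct Item
{
    public readonly int Id;
    public readonly string Content;
    public Item(int id, string content) { Id = id; Content = content; }
}

public sealed class SwapBuffer
{
    private readonly object _gate = new object();
    private List<Item> _active = new List<Item>();

    // Producer side: the lock is held only for one List.Add.
    public void Add(Item item)
    {
        lock (_gate) { _active.Add(item); }
    }

    // Aggregator side: swap in an empty list under the lock and
    // return the full one, which the producer can no longer touch.
    public List<Item> Drain()
    {
        List<Item> fresh = new List<Item>();
        lock (_gate)
        {
            List<Item> full = _active;
            _active = fresh;
            return full;
        }
    }

    // Groups drained items by ID. Runs outside the lock, so the
    // producer is not blocked while the aggregation happens.
    public static Dictionary<int, List<string>> Aggregate(List<Item> items)
    {
        Dictionary<int, List<string>> byId = new Dictionary<int, List<string>>();
        foreach (Item item in items)
        {
            List<string> contents;
            if (!byId.TryGetValue(item.Id, out contents))
            {
                contents = new List<string>();
                byId[item.Id] = contents;
            }
            contents.Add(item.Content);
        }
        return byId;
    }
}
```

The aggregation thread would call Drain() after each sleep, pass the result to Aggregate, and write the groups to the file. The drained list and the dictionary can be discarded after each pass, so memory is bounded by what was produced in one interval. TryGetValue also does the "contains key, then update or insert" check with a single lookup.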