Can you run the same function from two backgroundWorkers?
-
What's taking so long is that I have to search for particular lines in the email, and they aren't always in the same order or on the same line. So I have to search for the Message-ID, the From line, the Subject line, the actual text of the email, whether there is an attachment and what its name could be, and then watch for the end-of-email marker; if I've found everything, I just end it there. So yes, there are some regexes in there, but I think only a few lines for the one thing I'm looking for (I'd need to look at that part again to see what it's for). I am doing that line by line, though; isn't there a way with the StreamReader to search for whatever you are looking for (let's say I need to find "From: " but not "Received: from ")? That is why I'm doing it line by line: I can look at the start of the line and determine whether it is what I need. This is where I'm writing that "master file" rather than the individual ones. I could probably skip the write to the CSV file and go straight to the database, but when I first wrote this 9 months ago I had some problems (I don't remember what they were), hence the CSV file. This is also my first large application in C#; up until 1 1/2 years ago I was mainly doing VB, VBA & Access in M$ land.
One should not perform multiple passes on a (text) file; just read it once, or read a part, skip some, read some more, and never go back. Anything else is bound to be slow. If you need to search back and forth, store it all in memory or use a database. From your description it really sounds like a DB is in order. IMO you should thoroughly rethink the whole approach; a sub-optimal approach will not get fixed by throwing in some multi-threading. :)
Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum
Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.
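A minimal sketch of the single-pass idea, assuming the three header names shown in the sample message in this thread. `MailSummary` and `HeaderScanner` are hypothetical names, not part of the original program. Each line is read once and tested at its start, which is also what keeps "From: " apart from "Received: from ".

```csharp
using System;
using System.IO;

// Hypothetical container for the fields mentioned in the thread.
public sealed class MailSummary
{
    public string MessageId;
    public string From;
    public string Subject;
}

public static class HeaderScanner
{
    // Reads the header block once, top to bottom, and never rewinds.
    public static MailSummary Scan(TextReader reader)
    {
        var summary = new MailSummary();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // An empty line separates the headers from the body, so stop there.
            if (line.Length == 0) break;

            // Testing the start of the line keeps "From: " apart
            // from "Received: from ".
            if (line.StartsWith("From: ", StringComparison.OrdinalIgnoreCase))
                summary.From = line.Substring(6).Trim();
            else if (line.StartsWith("Subject: ", StringComparison.OrdinalIgnoreCase))
                summary.Subject = line.Substring(9).Trim();
            else if (line.StartsWith("Message-ID: ", StringComparison.OrdinalIgnoreCase))
                summary.MessageId = line.Substring(12).Trim();
        }
        return summary;
    }
}
```

Because `Scan` takes a `TextReader`, the same code would work with a `StreamReader` over the server connection or a `StringReader` over a saved message. Folded (multi-line) headers and the message body are deliberately left out of this sketch.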
-
That seems reasonable. I don't know anything about reading messages from a mail server, but... Each thread can read and parse one message and report back the result to be written. Whether or not the thread also downloads the message I don't know, but that should be doable. So you can have a class that distributes work to a bunch of threads. The process on the thread performs the work and reports back when finished. For writing, you can have an event handler that locks a stream when it writes.
I don't "download" the message, just read it from the server and write the needed info to the file. This is an example of the first section of the mail that I'm working with...
+OK 670581 octets
Return-Path: <email address>
Received: from hrndva-omtalb.email host ([ip address])
by hrndva-imta01.email host with ESMTP
id <20100324163246244.LLFZ11363@hrndva-imta01.email host>
for <email it's going to>; Wed, 24 Mar 2010 16:32:46 +0000
Return-Path: <email address>
X-Authority-Analysis: v=1.0 c=1 a=Y--C8wIrtp4A:10 a=ed-Ggqp32-PxgnFQ28IA:9 a=gFYqYUHr3cvJf5tUtWv3jj12YwYA:4 a=wPNLvfGTeEIA:10 a=SSmOFEACAAAA:8 a=Xz8RjLcVAAAA:8 a=bvyAQD6M8USi_luE8VwA:9 a=zkXRgtjM-mmOsYMX5XAA:7 a=lSBj04H3UYGbvfZ5gLKUj7ga3v4A:4 a=TQY7aazGoy4vupPYzM8A:9 a=A9QQSRYdmLSsclXicRDPQfuie2oA:4 a=IKIoO-ieCDEA:10 a=l42U5Vqe35IA:10 a=OU-3oeRcviPOZ7V7:21 a=r3OCwUNA-PGXxiAt:21
X-Cloudmark-Score: 0
X-Originating-IP: IP Address
Received: from [IP Address] ([IP Address] helo=computer it's from (I think))
by hrndva-oedge02.email host (envelope-from <email address>)
(ecelerity 2.2.2.39 r()) with ESMTP
id 8E/A4-28072-8AE3AAB4; Wed, 24 Mar 2010 16:32:45 +0000
Received: from 127.0.0.1 (AVG SMTP 8.5.437 [271.1.1/2767]); Wed, 24 Mar 2010 12:31:41 -0500
Message-ID: <006301cacb77$ddf18460$ae02a8c0@pc it's from>
From: "Name" <emailaddress>
To: "person it's going to" <their email>
Subject: kinda obvious but using this data
Date: Wed, 24 Mar 2010 12:31:41 -0500
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----=_NextPart_000_005F_01CACB4D.F4FC82B0"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2800.1983
Disposition-Notification-To: "Person from" <email address>
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1983
There is a lot more after that, but this should give you an idea... hope I replaced all the stuff I should have :confused:
-
Luc Pattyn wrote:
consider having a single database
I concur.
Luc Pattyn wrote:
lots of regexes
Hadn't thought of that, but yeah, good point.
PIEBALDconsult wrote:
Luc Pattyn wrote: lots of regexes
Here are the only lines:
    Regex objLongDollar = new Regex("\\d+,\\d+\\.\\d+");
    Regex objNumRd = new Regex("\\d+rd");
    Regex objNumSt = new Regex("\\d+st");
This is in the parse message section...
-
What's taking so long is that I have to search for particular lines in the email, and they aren't always in the same order or on the same line. So I have to search for the Message-ID, the From line, the Subject line, the actual text of the email, whether there is an attachment and what its name could be, and then watch for the end-of-email marker; if I've found everything, I just end it there. So yes, there are some regexes in there, but I think only a few lines for the one thing I'm looking for (I'd need to look at that part again to see what it's for). I am doing that line by line, though; isn't there a way with the StreamReader to search for whatever you are looking for (let's say I need to find "From: " but not "Received: from ")? That is why I'm doing it line by line: I can look at the start of the line and determine whether it is what I need. This is where I'm writing that "master file" rather than the individual ones. I could probably skip the write to the CSV file and go straight to the database, but when I first wrote this 9 months ago I had some problems (I don't remember what they were), hence the CSV file. This is also my first large application in C#; up until 1 1/2 years ago I was mainly doing VB, VBA & Access in M$ land.
Isn't some of that available as properties or something? Without having seen what the emails look like, I'd recommend reading the whole email into one string and using one RegEx to extract what you need.
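One way the "whole email in one string, one Regex" suggestion might look, as a sketch only. `HeaderRegex` and the choice of three header names are assumptions for illustration. `RegexOptions.Multiline` makes `^` match at the start of every line, so a "Received: from" line cannot be mistaken for the From header.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HeaderRegex
{
    // One pattern, applied once to the whole message text.
    private static readonly Regex Header = new Regex(
        @"^(?<name>Message-ID|From|Subject):[ \t]*(?<value>[^\r\n]*)",
        RegexOptions.Multiline | RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns the first value found for each header name.
    public static Dictionary<string, string> Extract(string message)
    {
        var found = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (Match m in Header.Matches(message))
        {
            string name = m.Groups["name"].Value;
            // Keep only the first hit so a body line that happens to
            // start with "From:" does not overwrite the real header.
            if (!found.ContainsKey(name))
                found[name] = m.Groups["value"].Value.Trim();
        }
        return found;
    }
}
```

This sketch still scans the body after the headers end; cutting the string at the first blank line before matching would avoid that.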
-
PIEBALDconsult wrote:
Luc Pattyn wrote: lots of regexes
Here are the only lines:
    Regex objLongDollar = new Regex("\\d+,\\d+\\.\\d+");
    Regex objNumRd = new Regex("\\d+rd");
    Regex objNumSt = new Regex("\\d+st");
This is in the parse message section...
The result will depend on how large the search object is, and how often you execute such regexes. When I care about performance, I avoid the Regex class. I use string methods, maybe a StringBuilder, maybe a character array, maybe several nested loops, but no regexes. Regexes are good for compact code when performance does not matter at all, and readability is not a primary concern either. Here[^] is the report on a little experiment I once performed. :)
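As an illustration of the string-methods approach, here is a sketch of a check that behaves roughly like the `\d+rd` and `\d+st` patterns quoted above. `SuffixFinder` is a hypothetical name, and the sketch accepts only ASCII digits, whereas `\d` in .NET also matches other Unicode digits. Whether it is actually faster for a given workload would need measuring.

```csharp
using System;

public static class SuffixFinder
{
    // Returns true when text contains at least one ASCII digit
    // immediately followed by the given suffix, e.g. "3rd" or "21st".
    public static bool ContainsNumberWithSuffix(string text, string suffix)
    {
        if (string.IsNullOrEmpty(suffix))
            throw new ArgumentException("suffix must not be empty", "suffix");

        int index = text.IndexOf(suffix, StringComparison.Ordinal);
        while (index >= 0)
        {
            // One preceding digit is enough: \d+ needs one or more.
            if (index > 0 && text[index - 1] >= '0' && text[index - 1] <= '9')
                return true;

            // Not preceded by a digit; look for the next occurrence.
            index = text.IndexOf(suffix, index + 1, StringComparison.Ordinal);
        }
        return false;
    }
}
```

A call such as `SuffixFinder.ContainsNumberWithSuffix(line, "rd")` would stand in for `objNumRd.IsMatch(line)` under these assumptions.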
-
MacRaider4 wrote:
As you can see there is nothing different between them
My sight tells otherwise. I suggest you start believing the error messages you are getting. :)
Luc Pattyn wrote:
My sight tells otherwise. I suggest you start believing the error messages you are getting.
Ok I've been staring at this for much too long, and have looked at it many times and I'm not seeing the difference other than one has a 2 and the other a 3. :~
-
That seems reasonable. I don't know anything about reading messages from a mail server, but... Each thread can read and parse one message and report back the result to be written. Whether or not the thread also downloads the message I don't know, but that should be doable. So you can have a class that distributes work to a bunch of threads. The process on the thread performs the work and reports back when finished. For writing, you can have an event handler that locks a stream when it writes.
Ok, so to speed up that section you are suggesting I create a class that passes work to, let's say, 4 background workers? That sounds really good, but I've never done anything like that, and how would I then return that info back to the Form? With a return? I would also lose my updates on the progressBar, would I not? Ok, my brain is really starting to hurt now; thankfully I've only got 10 min left in my day... will have to get back to this tomorrow! Thank you all for everything thus far...
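On the two questions above (returning data to the form and keeping progress updates), a sketch of how several `BackgroundWorker` instances can share one `DoWork` method. `WorkRange`, `SharedWorkerDemo`, and the placeholder loop are assumptions for illustration, not the poster's code. The result travels back through `e.Result`, and progress through `ReportProgress`.

```csharp
using System;
using System.ComponentModel;

// Hypothetical description of one worker's slice of messages.
public sealed class WorkRange
{
    public int Start;
    public int End;
}

public static class SharedWorkerDemo
{
    // One DoWork method serves every worker: 'sender' is the worker
    // that raised the event, e.Argument is that worker's own range.
    public static void ParseRange(object sender, DoWorkEventArgs e)
    {
        var worker = (BackgroundWorker)sender;
        var range = (WorkRange)e.Argument;
        int total = range.End - range.Start + 1;
        int done = 0;
        for (int i = range.Start; i <= range.End; i++)
        {
            // Placeholder: read and parse message number i here.
            done++;
            worker.ReportProgress(done * 100 / total);
        }
        e.Result = done; // delivered to RunWorkerCompleted as e.Result
    }

    // Builds a worker wired to the shared DoWork method and to the
    // caller's progress and completion handlers.
    public static BackgroundWorker CreateWorker(
        ProgressChangedEventHandler onProgress,
        RunWorkerCompletedEventHandler onCompleted)
    {
        var worker = new BackgroundWorker { WorkerReportsProgress = true };
        worker.DoWork += ParseRange;
        worker.ProgressChanged += onProgress;
        worker.RunWorkerCompleted += onCompleted;
        return worker;
    }
}
```

In a form, each worker would be started with something like `worker.RunWorkerAsync(new WorkRange { Start = 1, End = 50 })`. When the workers are created on the UI thread of a Windows Forms application, `ProgressChanged` and `RunWorkerCompleted` are raised on that thread, so the handlers can update a progressBar directly.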
-
Luc Pattyn wrote:
My sight tells otherwise. I suggest you start believing the error messages you are getting.
Ok I've been staring at this for much too long, and have looked at it many times and I'm not seeing the difference other than one has a 2 and the other a 3. :~
one BaCKgroUNDworKEr isn't the other. :)
-
MacRaider4 wrote:
return that info back to the Form
Well, I question the use of a form at all; I'd use a Windows Service, but that's just me. You can have a Service that pulls the data into the database and then the form pulls it (already fluffed and folded) from there. Or you could use an event.
-
I'll have to look up services, as I've never done anything with them before. Though I will say this project has made me a better C# programmer; at this rate, in another year I'll be answering some of these questions for other people. Someone else mentioned doing this all in one pass, and now that I'm looking back at my code I think that is a very good idea. Is this something I could do with the service or event? I could do my initial pass to get the number, then have a couple of workers work on the list, storing the data in arrays. Once those are done, combine the arrays, or better yet load the arrays straight into the database, which should take no time at all even with checking that the message isn't already there?
-
MacRaider4 wrote:
have the arrays loaded straight into the database
Right. The Service would periodically (once a minute?) query the email server for messages; if there are some, get them, process them, and stick the results in the database. You could still use a thread to process each message if necessary. Depending on your needs, you could then have the same Service host a WCF Web Service that your client application can use to get the data. :thumbsup:
-
That was my original intent once I got it working; I just didn't know about the service part. So let me see if I have this right now:
1. Log into the server and get the number of messages.
2. Decide whether I need to use a bgw, and how many.
3. Do the work with no workers or a couple of workers:
   a. have the worker/s log in with the number of account/s each is processing
   b. process the "entire" message all at once and store it in an array
4. Update the database.
5. Have the form check for updates?
Does this sound about right?
-
Yeah, basically. But remember that I'm not familiar with reading messages from an email server, so I don't understand the "a. have the worker/s log in with the number of account/s each is processing" part. I would have a worker read a message, process it, and stick it in the database; then maybe get another. Or read all the available messages and pass them to the workers. There are many ways to skin this cat.
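A sketch of the "read all the available messages and pass them to the workers" variant, under stated assumptions: `MessageDispatcher` and its `Process` step are hypothetical, and a `TextWriter` stands in for whatever finally stores the result (a file or a database insert). One lock guards the shared queue and another guards the writer, so only one thread writes at a time.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

public sealed class MessageDispatcher
{
    private readonly Queue<string> pending = new Queue<string>();
    private readonly object queueLock = new object();
    private readonly object writeLock = new object();
    private readonly TextWriter output;

    public MessageDispatcher(IEnumerable<string> rawMessages, TextWriter output)
    {
        foreach (string message in rawMessages) pending.Enqueue(message);
        this.output = output;
    }

    // Starts workerCount threads and blocks until the queue is drained.
    public void Run(int workerCount)
    {
        var threads = new List<Thread>();
        for (int i = 0; i < workerCount; i++)
        {
            var thread = new Thread(WorkLoop);
            threads.Add(thread);
            thread.Start();
        }
        foreach (Thread thread in threads) thread.Join();
    }

    private void WorkLoop()
    {
        while (true)
        {
            string raw;
            lock (queueLock)
            {
                if (pending.Count == 0) return; // nothing left to do
                raw = pending.Dequeue();
            }

            string result = Process(raw); // runs outside both locks

            lock (writeLock)
            {
                output.WriteLine(result); // one writer at a time
            }
        }
    }

    // Hypothetical stand-in for the real parsing step.
    private static string Process(string raw)
    {
        return raw.Trim().ToUpperInvariant();
    }
}
```

The order of the output lines depends on thread scheduling, so anything that needs the original order would have to carry a message number along with the result.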
-
Still can't get my 3rd worker to "work": I'm still getting "backgroundWorker3 does not exist in the current context" on the first occurrence in each line of InitializeBackgroundWorker3. So that's putting a hindrance on everything.

    public Form1()
    {
        InitializeComponent();
        InitializeBackgroundWorker();
        InitializeBackgroundWorker2();
        InitializeBackgroundWorker3();

        btnGetMessageInfo.Enabled = false;
        btnCancelConnection.Enabled = false;
    }

    private void InitializeBackgroundWorker()
    {
        backgroundWorker1.DoWork += new DoWorkEventHandler(backgroundWorker1_DoWork);
        backgroundWorker1.RunWorkerCompleted += new RunWorkerCompletedEventHandler(backgroundWorker1_RunWorkerCompleted);
        backgroundWorker1.ProgressChanged += new ProgressChangedEventHandler(backgroundWorker3_ProgressChanged);
    }

    private void InitializeBackgroundWorker2()
    {
        backgroundWorker2.DoWork += new DoWorkEventHandler(backgroundWorker2_DoWork);
        backgroundWorker2.RunWorkerCompleted += new RunWorkerCompletedEventHandler(backgroundWorker2_RunWorkerCompleted);
        backgroundWorker2.ProgressChanged += new ProgressChangedEventHandler(backgroundWorker2_ProgressChanged);
    }

    private void InitializeBackgroundWorker3()
    {
        backgroundWorker3.DoWork += new DoWorkEventHandler(backgroundWorker3_DoWork);
        backgroundWorker3.RunWorkerCompleted += new RunWorkerCompletedEventHandler(backgroundWorker3_RunWorkerCompleted);
        backgroundWorker3.ProgressChanged += new ProgressChangedEventHandler(backgroundWorker3_ProgressChanged);
    }

So what I have now is this: when you click the connect button (the first thing you can do), it logs into the server and gets the total number of messages and the size of each one, stores that in a global variable, and displays some info on the form. Then it figures out how many workers to use (based on the number of messages) and assigns start and end values for each worker. I'm now writing the work for the workers (focusing on 1 and 2, since only those work). It's moving along, though slowly.
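The start and end assignment described above could be sketched like this. `RangePlanner` is a hypothetical helper, and it assumes messages are numbered from 1, as POP3 does. It splits the message numbers into contiguous ranges whose sizes differ by at most one.

```csharp
using System;
using System.Collections.Generic;

public static class RangePlanner
{
    // Splits message numbers 1..messageCount into at most workerCount
    // contiguous { start, end } ranges of nearly equal size.
    public static List<int[]> Split(int messageCount, int workerCount)
    {
        var ranges = new List<int[]>();
        if (messageCount <= 0 || workerCount <= 0) return ranges;

        // Never create more ranges than there are messages.
        int workers = Math.Min(workerCount, messageCount);
        int baseSize = messageCount / workers;
        int remainder = messageCount % workers;

        int start = 1;
        for (int i = 0; i < workers; i++)
        {
            // The first 'remainder' workers take one extra message.
            int size = baseSize + (i < remainder ? 1 : 0);
            ranges.Add(new[] { start, start + size - 1 });
            start += size;
        }
        return ranges;
    }
}
```

For example, `RangePlanner.Split(10, 3)` yields the ranges 1 to 4, 5 to 7, and 8 to 10. Splitting by count ignores message size, so a range containing a few very large messages could still finish last.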
modified on Friday, February 11, 2011 1:33 PM