HTTPModule and a Word Corpus
-
I'm writing an HTTPModule to summarize the text of the web pages on my web server. I already have the summarization process, but I need a big corpus of words to support it. My problem is: where do I put this corpus in memory so that it can be accessed quickly at runtime?

Right now I keep the corpus in a static variable that is populated when my HTTPModule is called, so the next time a page loads the corpus is already in memory. I still have the problem of loading the corpus the first time someone loads a page: it takes a while to read it from a file.

Is there any way to call my HTTPModule when IIS starts, or something like that? Do you have any other ideas that could help me? Would a Windows Service work? Could I call a Windows Service that holds the corpus in memory, and would that be fast enough?

Thank you very much! :) Sorry for my English... :(
Bruno Conde. pharaoh
-
IDEA 1 - Object Serialization
If reading your corpus of words involves a lot of parsing, consider loading it into a structure that supports serialization. You load and parse the corpus once, serialize the resulting object, and on later startups deserialize it instead. This should give you some performance benefit, since you won't have to parse anything again. I did this for a financial analysis application: I would parse XML documents, store the information in the structures I needed, then serialize them for use when the application restarted. It improved performance (memory usage and loading speed).

IDEA 2 - OnBeginRequestAsync
You could load your corpus in the OnBeginRequestAsync method. If needed, it would load on a separate worker thread and, when complete, store the corpus in your static variable. While the corpus loads, any request that needs it could "wait" or inform the user that the summary feature is loading and will be available shortly.

"we must lose precision to make significant statements about complex systems." -deKorvin on uncertainty
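
The two ideas above can be combined, and a rough sketch may make that easier to picture. This is only an illustration under assumptions, not the code from either poster: `CorpusCache`, `EnsureLoading`, `TryGet` and both file paths are hypothetical names, the corpus is assumed to be a plain text file reduced to word counts, and `Task.Run` assumes .NET 4.5 or later. A simple `BinaryWriter` format stands in for whichever serializer you prefer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Hypothetical holder for the corpus; every name here is illustrative.
public static class CorpusCache
{
    private static readonly object Gate = new object();
    private static Task<Dictionary<string, int>> _loading;

    // IDEA 2: start the load on a worker thread, once, and hand back the task.
    public static Task<Dictionary<string, int>> EnsureLoading(string textPath, string cachePath)
    {
        lock (Gate)
        {
            if (_loading == null)
                _loading = Task.Run(() => Load(textPath, cachePath));
            return _loading;
        }
    }

    // Non-blocking check: a request can summarize if this returns true,
    // or show a "summary feature is loading" message if it returns false.
    public static bool TryGet(out Dictionary<string, int> corpus)
    {
        Task<Dictionary<string, int>> task;
        lock (Gate) { task = _loading; }
        if (task != null && task.Status == TaskStatus.RanToCompletion)
        {
            corpus = task.Result;
            return true;
        }
        corpus = null;
        return false;
    }

    private static Dictionary<string, int> Load(string textPath, string cachePath)
    {
        // IDEA 1: reuse the pre-parsed binary form when it exists and is not stale.
        if (File.Exists(cachePath) &&
            File.GetLastWriteTimeUtc(cachePath) >= File.GetLastWriteTimeUtc(textPath))
            return ReadCache(cachePath);

        // Otherwise parse the text file (assumed: whitespace-separated words).
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (var line in File.ReadLines(textPath))
        {
            foreach (var word in line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries))
            {
                int n;
                counts.TryGetValue(word, out n); // n stays 0 for a new word
                counts[word] = n + 1;
            }
        }
        WriteCache(cachePath, counts);
        return counts;
    }

    private static void WriteCache(string path, Dictionary<string, int> counts)
    {
        using (var writer = new BinaryWriter(File.Create(path)))
        {
            writer.Write(counts.Count);
            foreach (var pair in counts)
            {
                writer.Write(pair.Key);
                writer.Write(pair.Value);
            }
        }
    }

    private static Dictionary<string, int> ReadCache(string path)
    {
        using (var reader = new BinaryReader(File.OpenRead(path)))
        {
            int count = reader.ReadInt32();
            var counts = new Dictionary<string, int>(count, StringComparer.OrdinalIgnoreCase);
            for (int i = 0; i < count; i++)
            {
                string key = reader.ReadString();
                counts[key] = reader.ReadInt32();
            }
            return counts;
        }
    }
}
```

In a web application, `EnsureLoading` could be called from `Application_Start` in Global.asax or from the module's `Init` method, so the load begins as soon as the application starts rather than inside the first page that needs a summary. Request handlers would then call `TryGet` and fall back to a "loading" message until it succeeds. Note that the application itself normally starts on the first request, so this shortens the wait rather than removing it entirely.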