Dynamic / Run-time casting
-
I'm writing an app that downloads and processes documents from a web site. Since the document types (i.e., ContentType from the HTTP header) are unknown until they are downloaded, I have created a parent object called WebDocument. After downloading and determining the type, I would like to cast this generic parent object to a specific child type such as HtmlDocument, MsWordDocument, PdfDocument, etc. How can I dynamically cast like this? It would be nice if I could somehow use the string returned from the HTTP ContentType to do this. This dynamic casting becomes even more of a concern since I would like to allow other developers to create plugins to handle other types of documents.
-
If WebDocument is your parent and MsWordDocument etc. are specialised children, then it makes OO sense to do the cast. Windows usually just associates a program with an extension and that's that, and you can ask the shell to execute that program for you by attempting to execute the document. I suppose the question is how much information you need to store about the different kinds of documents, and whether it can be stored more simply as a string attribute (e.g. the MIME type received, stored as a string) rather than as lots of specialised classes. Those sorts of things are design considerations though, something only you can decide! :-)

/**********************************
Paul Evans, Dorset, UK.
Personal Homepage "EnjoySoftware" @ http://www.enjoysoftware.co.uk/
**********************************/
-
I don't want to execute the files, I want to process them. Unfortunately you can't rely on file extensions when dealing with the web. I can use a .asp, .pl, .php, etc to return any type (not just an HTML document). This is why it is so important that I use dynamic casting. Also, I need to do more than just store information about the document. I need to process each type of document differently (hence the specialized child classes). So, for example, if it's an HTML document, I want to run an HTML parser on it or check the validity of its links. If it's a GIF, I may want to process its formatting or read its internal comments. However, it's not really the specific processing I have a question about - I need to know how to cast a parent object to an inherited child object without knowing the specific type at design time.
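
For reference, here is a minimal sketch of why a plain cast cannot do this on its own. It assumes a non-abstract WebDocument with an HtmlDocument subclass, as described in the question; CastDemo and its two helper methods are hypothetical names used only for illustration. In C#, a cast never changes the run-time type of an existing object, so a downcast only succeeds when the object was constructed as the child type in the first place.

```csharp
using System;

public class WebDocument { }

public class HtmlDocument : WebDocument { }

// Hypothetical helper class, not part of the original design.
public static class CastDemo
{
    // "as" yields null when the object was not created as an HtmlDocument.
    public static HtmlDocument TryAsHtml(WebDocument doc)
    {
        return doc as HtmlDocument;
    }

    // An explicit cast on the wrong run-time type throws InvalidCastException.
    public static bool HardCastFails(WebDocument doc)
    {
        try
        {
            HtmlDocument html = (HtmlDocument)doc;
            return html == null; // reached only if the cast succeeded
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

Calling CastDemo.TryAsHtml(new WebDocument()) returns null, while CastDemo.TryAsHtml(new HtmlDocument()) returns the same object typed as HtmlDocument. The practical consequence is that the specific child has to be chosen at construction time, after the content type is known.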
-
I did something like this on a project. My solution was a variation of the GoF Bridge design pattern. The idea is to separate the abstraction (the base WebDocument class) from its implementation (child classes of WebDocument).

1. Make WebDocument an abstract class, so it can't be instantiated. Add an abstract method that child classes must override and that will be called to process the document (e.g. Process). Create a static factory method on WebDocument to instantiate the appropriate handler class. You'll need to pass some information into the method so it can decide which class to create (the HTTP header, etc.).

    using System.Web; // HttpRequest, HttpContext

    public abstract class WebDocument
    {
        protected HttpRequest Request;

        public static WebDocument CreateInstance(HttpRequest r)
        {
            switch (r.ContentType)
            {
                case "text/html":
                    return new HtmlDocument(r);
                // ...
                default:
                    // Fallback for unsupported types (see step 2).
                    return new UnhandledDocument(r);
            }
        }

        public abstract void Process();
    }

2. Create a class for each document type you need to handle (PDF, Word, etc.) that inherits from WebDocument. Create something like UnhandledDocument to process documents that you don't currently support.

    public class HtmlDocument : WebDocument
    {
        public HtmlDocument(HttpRequest r)
        {
            this.Request = r;
        }

        public override void Process()
        {
            // do something with this.Request
        }
    }

3. Write client code something like this:

    HttpRequest req = HttpContext.Current.Request;
    WebDocument d = WebDocument.CreateInstance(req);
    d.Process();

Can you see how the abstraction (WebDocument) is separated from an implementation (HtmlDocument)? Supporting a new document type is as easy as creating the implementation class and adding it to CreateInstance, and it affects no other code. The client doesn't know, or need to know, the instance type. All it is responsible for is getting an instance of WebDocument to process a request. Hope this helps. It certainly helped me!
-
Thank you CBoland! This is very helpful. I didn't want to resort to switch or if-then-else statements to pick a type (which is why I was asking about dynamic casting). However, it looks like very clean code, and I may end up doing it this way. I'll probably modify the static CreateInstance() method to check for plugins which implement the new IWebDocument interface. This way I (or other developers on the project) can easily distribute updates for old versions, while new versions can simply add another line to the switch statement.
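
One way the plugin check could look, as a rough sketch only: replace the switch with a lookup table keyed by the content-type string, and let plugins add their own entries. Everything here is an assumption for illustration. The Register method, the Factories table, and the use of a raw byte[] plus a header string (instead of HttpRequest) are not from the posts above; they just keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

public abstract class WebDocument
{
    // Hypothetical registry: content type -> function that builds the handler.
    private static readonly Dictionary<string, Func<byte[], WebDocument>> Factories =
        new Dictionary<string, Func<byte[], WebDocument>>(StringComparer.OrdinalIgnoreCase);

    protected byte[] Content;

    protected WebDocument(byte[] content)
    {
        Content = content;
    }

    // Built-in code or a plugin calls this once per content type it handles.
    public static void Register(string contentType, Func<byte[], WebDocument> factory)
    {
        Factories[contentType] = factory;
    }

    public static WebDocument CreateInstance(string contentTypeHeader, byte[] content)
    {
        // Drop parameters such as "; charset=utf-8" before the lookup.
        string mediaType = (contentTypeHeader ?? "").Split(';')[0].Trim();

        Func<byte[], WebDocument> factory;
        if (Factories.TryGetValue(mediaType, out factory))
        {
            return factory(content);
        }
        return new UnhandledDocument(content); // nothing registered for this type
    }

    public abstract void Process();
}

public class HtmlDocument : WebDocument
{
    public HtmlDocument(byte[] content) : base(content) { }

    public override void Process()
    {
        // parse the HTML, check links, etc.
    }
}

public class UnhandledDocument : WebDocument
{
    public UnhandledDocument(byte[] content) : base(content) { }

    public override void Process()
    {
        // no type-specific processing available
    }
}
```

With this shape, supporting a new type is a single call such as WebDocument.Register("text/html", bytes => new HtmlDocument(bytes)), and CreateInstance itself never has to change. How plugins get discovered and loaded is a separate question that this sketch does not address.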
-
I see what you are saying. OK, if you are going that route, may I suggest that the generic version (the one that handles anything that falls through all the known document types) store the data from the stream in a byte array. That way it can at least reproduce the data exactly as it was by replaying the stream, while still implementing the interface according to whatever spec you give it.
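
A minimal sketch of that fallback, under stated assumptions: the members of IWebDocument are not given anywhere in this thread, so Process and WriteTo below are hypothetical, as is the UnhandledDocument constructor that takes a Stream. The class simply buffers the downloaded bytes and can write them back out unchanged.

```csharp
using System;
using System.IO;

// Hypothetical interface shape; the real IWebDocument members are not specified.
public interface IWebDocument
{
    void Process();
    void WriteTo(Stream destination);
}

public class UnhandledDocument : IWebDocument
{
    private readonly byte[] rawBytes;

    public UnhandledDocument(Stream source)
    {
        // Copy the whole response into memory so it can be replayed later.
        using (MemoryStream buffer = new MemoryStream())
        {
            source.CopyTo(buffer); // Stream.CopyTo requires .NET 4 or later
            rawBytes = buffer.ToArray();
        }
    }

    public int Length
    {
        get { return rawBytes.Length; }
    }

    public void Process()
    {
        // Unknown type: nothing to parse, the bytes are kept as they arrived.
    }

    public void WriteTo(Stream destination)
    {
        // Reproduce the original data byte for byte.
        destination.Write(rawBytes, 0, rawBytes.Length);
    }
}
```

Buffering everything in memory is fine for small documents; for very large downloads a temporary file might be a better backing store.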