Outlook Express data format
-
Does anyone know what the format of the Outlook Express storage file is, or where to find a description of it? My problem in a nutshell is this: I want to be able to filter out all of the spam emails from my Hotmail accounts. I've looked around and tried many of the shareware/trialware programs out there that claim to do this, but have had no luck getting them to work. So, instead of trying to find a program that does what I want, I'll have to (try to) write one instead. The only problem is that I don't know the format of the DBX files that Outlook Express uses. Before I spend a lot of time trying to figure out the format by example, I thought I'd see if anyone here knows what it is or has seen it described anywhere. What would be even better is if someone has already written a class that handles this (though I'm not too optimistic about that). So, can anyone help me with this? Thanks in advance for any help or pointers! Mike
-
FYI, MSDN has articles about the Outlook Express format! Thanks, Ramu
-
can't help with the DB layout. but, IMO, it's probably easier to go at it at the source - at the mail transport level. i think Outlook Express supports MAPI. if so, you could write an app that scans your inbox looking for messages that match patterns, and deletes the ones that do. this app does something like that: http://www.smalleranimals.com/snoop.htm. it's not too complicated to do. -c
Garbage collection, making life better - for weenies!
Image Processing - now with extra cess.
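For what it's worth, the "match patterns" part of that idea can be sketched independently of how the messages are fetched. The snippet below is only an illustration: the pattern list and the function name `looks_like_spam` are made up for the example, it checks only the Subject and From headers of a raw message, and it does not touch MAPI or delete anything.

```python
import re
from email import message_from_string

# Hypothetical patterns, purely for illustration; a real filter would
# need a list tuned to the spam actually being received.
SPAM_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"free money", r"act now", r"\bviagra\b")
]


def looks_like_spam(raw_message):
    """Return True if the Subject or From header matches any pattern."""
    msg = message_from_string(raw_message)
    haystack = " ".join([msg.get("Subject", ""), msg.get("From", "")])
    return any(p.search(haystack) for p in SPAM_PATTERNS)
```

How the raw message text is obtained (MAPI, POP3, or reading the DBX file) is a separate problem; this only decides whether a given message matches.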
-
File format for Outlook Express 5.0 DBX files

Header:
These are the values we use to see what type the file is.

0x0   Signature [16/32 bytes]
        = 4A4D4636 03001000                    /* OE4 SIGNATURE */
        = CFAD12FE C5FD746F 66E3D111 9A4E00C0  /* OE5 Email DBX */
        = CFAD12FE C6FD746F 66E3D111 9A4E00C0  /* OE5 Folder DBX */
      As I said, we use these values to identify it, but we don't know what these values mean.
0x5C  Highest Email ID [4 bytes]
      This is the current highest. The next email will have a number one higher than this.
0x7C  File Size [4 bytes]
      Total size of file.
0xC4  Item Count [4 bytes]
      Number of items stored in this DBX file. Appears to be accurate. We use it as a second check that we haven't gone wrong whilst reading the indexes.
0xE4  Index Pointer [4 bytes]
      File offset pointing to a page of Data Indexes. This page can even be a page of Indexes pointing to pages of indexes. This area needs to be explored a little more.

Indexes: (pointed to by the pointer in 0xE4)
This area is a little blurry, but it appears to work.

0x0   Self [4 bytes] - Current Offset
0x4   Unknown [4 bytes]
0x8   Table Pointer [4 bytes] - Pointer to another of these tables
0xC   Parent [4 bytes] - If this is a table of indexes which is referenced by another table above it, this will point to the parent's table
0x10  Unknown [1 byte]
0x11  Pointer Count [1 byte] - Number of pointers in this table
0x12  Unknown [2 bytes]
0x14  Index Count [4 bytes] - I'm not sure
Size = 0x18

Straight after this comes [Pointer Count] entries like this:

0x0   Index Pointer [4 bytes] - Pointer to a data block
0x4   Table Pointer [4 bytes] - Pointer to another Table of indexes
0x8   Index Count [4 bytes] - Not sure. If non-zero [Table Pointer] has more indexes?
Size = 0x0C

These [Index Pointer] items point to offsets in the file which are data items. At present Emails and Folders have been worked on.
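As a rough illustration of the header and index layout above, here is a minimal Python sketch. It rests on several assumptions that the notes do not confirm: the signatures are taken as raw bytes in file order, all 4-byte fields are read as little-endian, a pointer value of 0 is treated as "no table", and the traversal order is a guess. The function names (`parse_header`, `walk_index`) are hypothetical.

```python
import struct

# Signatures from the notes, assumed to be raw bytes in file order.
SIGNATURES = {
    bytes.fromhex("4A4D463603001000"): "OE4",
    bytes.fromhex("CFAD12FEC5FD746F66E3D1119A4E00C0"): "OE5 Email DBX",
    bytes.fromhex("CFAD12FEC6FD746F66E3D1119A4E00C0"): "OE5 Folder DBX",
}


def read_u32(data, offset):
    """Read a 4-byte unsigned integer (little-endian is an assumption)."""
    return struct.unpack_from("<I", data, offset)[0]


def parse_header(data):
    """Identify the file type and pull out the documented header fields."""
    kind = next(
        (name for sig, name in SIGNATURES.items() if data.startswith(sig)),
        None,
    )
    if kind is None:
        raise ValueError("unrecognised signature")
    return {
        "kind": kind,
        "highest_email_id": read_u32(data, 0x5C),
        "file_size": read_u32(data, 0x7C),
        "item_count": read_u32(data, 0xC4),
        "index_pointer": read_u32(data, 0xE4),
    }


def walk_index(data, table_offset, seen=None):
    """Collect data-block offsets from an index table and any child tables."""
    if seen is None:
        seen = set()
    # Assumption: 0 means "no table"; 'seen' guards against pointer loops.
    if table_offset == 0 or table_offset in seen:
        return []
    seen.add(table_offset)
    # 0x8: pointer to another of these tables.
    offsets = walk_index(data, read_u32(data, table_offset + 0x8), seen)
    pointer_count = data[table_offset + 0x11]
    entry = table_offset + 0x18  # entries start right after the table header
    for _ in range(pointer_count):
        index_pointer = read_u32(data, entry)
        child_table = read_u32(data, entry + 0x4)
        if index_pointer:
            offsets.append(index_pointer)
        offsets += walk_index(data, child_table, seen)
        entry += 0x0C  # each entry is 0x0C bytes
    return offsets
```

With the whole file read into memory (for example `data = open(path, "rb").read()`, where `path` is whichever DBX file is being inspected), `walk_index(data, parse_header(data)["index_pointer"])` returns a list of offsets whose length can be compared against the Item Count field as the second check the notes mention. A corrupt pointer will surface as a `struct.error` or `IndexError` rather than being handled.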
Item Header:

0x0  Self [4 bytes] - current offset
0x4  Size [4 bytes] - size of block that follows
0x8  Unknown [2 bytes]
0xA  Count [1 byte] - number of items in the block before the data starts
0xB  Unknown [1 byte]
Size = 0x0C

Then comes [Count] number of items:

0x0  type [1 byte] - specifies what the data is
0x1  value [3 bytes] - actual data

The following types have been found for Emails:

Type - Value
0x02 -
0x04 - buffer pointer to file offset of email data
0x05 - buffer pointer to asciiz string cont
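The item header and its list of type/value entries could be read along these lines. This is a sketch only: it assumes little-endian fields, treats the 3-byte value as an unsigned integer, and returns the values raw without interpreting any of the types. The name `parse_item` is hypothetical.

```python
import struct


def parse_item(data, offset):
    """Read an item header at 'offset' plus its [Count] type/value entries."""
    # 0x0 Self and 0x4 Size, both 4 bytes (little-endian is an assumption).
    self_ptr, size = struct.unpack_from("<II", data, offset)
    count = data[offset + 0xA]  # number of entries before the data starts
    entries = []
    pos = offset + 0x0C  # entries begin right after the 0x0C-byte header
    for _ in range(count):
        entry_type = data[pos]
        # 3-byte value, assumed unsigned and little-endian.
        value = int.from_bytes(data[pos + 1:pos + 4], "little")
        entries.append((entry_type, value))
        pos += 4
    return {"self": self_ptr, "size": size, "entries": entries}
```

Comparing the returned `self` value with the offset that was passed in is a cheap sanity check that the index pointer really landed on an item header.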
-
Wow!!! That looks like fun. Thanks for the response. I think you've given me enough to think about as to whether I want to continue on with this or to try and find a program that does what I need (or at least close to what I need). At least now, if I look at one of the files, I should have a better idea of what is going on. Thanks again!
-
Thanks for the reply, Chris! After seeing the great (and daunting) data that Ernest posted, I am beginning to think you're right. I was hoping for something that I could do in a couple of days (it doesn't have to be pretty, it just needs to work), but this has the appearance of a full-fledged, several-week project. More time than I want to put in, I think.
-
Thanks for the response! If you could post some of the URLs, that would be great! I've searched many different times and didn't find anything that I thought was useful. It is entirely possible that I didn't search with the right keywords or that I just overlooked articles with the relevant information. Thanks again!