MIME Attachment Names Containing Extended Characters Fail
-
Hello all,

While working on a project that emails files with international filenames, I've come across an unusual issue. If I attach a file with a US-ASCII-only filename, the name can be more than 200 characters long without errors. If I include an extended character, the name is encoded as UTF-8, and the length at which it gets funky is very small (< 40 characters).

To define funky, here's an example filename after it goes bad: =utf-8BSU5GT1JNw4FUSUNBX0ltcGFjdF9Bc3Nl

It looks like a UTF-8 encoded string with a UTF-8 decoding instruction or a MIME boundary... not sure which. Has anyone seen this before -- and what are the rules / limitations of filenames? I tried emailing the file by hand through Outlook and it handles it, so I don't think it is a MIME-specific limitation.

Sample code:
using System.IO;
using System.Net.Mail;
using System.Net.Mime;

class Program
{
    private const string DOMAIN = "foobar.com";
    private const string SMTPHOST = "mail." + DOMAIN;
    private const string FROM = "chadwick.posey@" + DOMAIN;
    private const string TO = FROM;

    static void Main(string[] args)
    {
        MailMessage msg = new MailMessage(FROM, TO, "Subject", "Body");

        // Create a dummy file whose name contains a single non-ASCII character (Á).
        string name = "AAAAAA_AAAAAAAAAAAA_AAAAAAA_AAAA - IIIIIII CCCCCCCCCC DD IIIIIIÁIIII_20111018_091327.pptx";
        string path = Path.Combine(Path.GetTempPath(), name);
        File.WriteAllText(path, "blah");

        // Attach it; the attachment name is taken from the file name.
        Attachment att = new Attachment(path, new ContentType("application/vnd.openxmlformats-officedocument.presentationml.presentation"));
        msg.Attachments.Add(att);

        SmtpClient client = new SmtpClient(SMTPHOST, 25);
        client.Send(msg);
    }
}
So far I've tried setting the attachment's NameEncoding to UTF8 and to UTF32; neither worked. Setting ContentDisposition.FileName on the attachment fails because the name does not consist of US-ASCII characters only. Any suggestions on how to get it to include the full filename with the accent / extended characters intact?

Thanks,
Chadwick
============================= I'm a developer, he's a developer, she's a developer, Wouldn't ya like to be a developer too?
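For anyone wondering what the mangled name above actually contains, the part after "=utf-8B" can be decoded by hand. This is only a sketch, and it assumes that part is the Base64 payload of a MIME encoded-word (the usual "=?utf-8?B?...?=" form) that lost its "?" delimiters somewhere along the way. The class and method names are made up for illustration.

```csharp
using System;
using System.Text;

static class DecodeMangledName
{
    // Assumption: the payload is Base64 over UTF-8 bytes, as the "utf-8" and "B"
    // markers in the mangled name suggest.
    public static string DecodePayload(string base64)
    {
        byte[] bytes = Convert.FromBase64String(base64);
        return Encoding.UTF8.GetString(bytes);
    }

    static void Main()
    {
        // The text that follows "=utf-8B" in the example filename from the post.
        string payload = "SU5GT1JNw4FUSUNBX0ltcGFjdF9Bc3Nl";

        // Decodes to: INFORMÁTICA_Impact_Asse
        Console.WriteLine(DecodePayload(payload));
    }
}
```

So the bad name looks like a truncated chunk of the real filename that was left in its encoded form, which fits the idea that something goes wrong when the encoded name gets long.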
-
Chadwick Posey wrote:
and what are the rules / limitations of filenames
Long filenames under Windows can be up to 255 UTF-16 characters. (Source) Lots of older apps (and drivers) will still be using the MAX_PATH value and coding everything in ASCII, cutting back your effective storage space.
Chadwick Posey wrote:
Any suggestions on how to get it to include the full filename with the accent / extended characters intact?
Is it an option to archive the files in a ZIP-file and to attach that? I'm not sure whether it would work, but it'd be the first alternative that I'd go for.
Bastard Programmer from Hell :suss:
-
I will definitely look into that. Ideally I'd like to send the bare file with the name intact, but this will be my last-resort option, I think. I figured that even if the filename limit was 255 with single-byte characters, it would support 128 or so with double-byte characters, but it is significantly less. I think at 36 characters (72 bytes) it starts exhibiting the strange behavior. I'm wondering if the whole MIME-wrapping thing (I think the MIME source is limited to 75 characters wide or something) is throwing something off when the name is in UTF-8.

Thanks,
Chadwick
============================= I'm a developer, he's a developer, she's a developer, Wouldn't ya like to be a developer too?
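On the 75-character guess: RFC 2047 does cap a single encoded-word at 75 characters, delimiters included, so a long non-ASCII name has to be split into several encoded-words, each holding whole characters. The sketch below shows one way such a split could be done by hand. It is not what System.Net.Mail does internally, and the names EncodedWordSketch and MaxBytesPerWord are hypothetical. Strictly speaking, RFC 2231 is the mechanism defined for parameter values such as filename, so this also assumes a receiving client that accepts encoded-words there.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class EncodedWordSketch
{
    // "=?utf-8?B?" (10 chars) + "?=" (2 chars) leaves 63 chars for Base64 text.
    // Base64 comes in groups of 4 chars per 3 bytes, so 60 chars = 45 bytes.
    const int MaxBytesPerWord = 45;

    public static List<string> Encode(string text)
    {
        var words = new List<string>();
        var chunk = new StringBuilder();
        int chunkBytes = 0;

        for (int i = 0; i < text.Length; )
        {
            // Keep surrogate pairs together so no character is split across words.
            int len = char.IsSurrogatePair(text, i) ? 2 : 1;
            string ch = text.Substring(i, len);
            int chBytes = Encoding.UTF8.GetByteCount(ch);

            if (chunkBytes + chBytes > MaxBytesPerWord)
            {
                words.Add(Wrap(chunk.ToString()));
                chunk.Clear();
                chunkBytes = 0;
            }

            chunk.Append(ch);
            chunkBytes += chBytes;
            i += len;
        }

        if (chunk.Length > 0)
        {
            words.Add(Wrap(chunk.ToString()));
        }
        return words;
    }

    static string Wrap(string s)
    {
        return "=?utf-8?B?" + Convert.ToBase64String(Encoding.UTF8.GetBytes(s)) + "?=";
    }

    static void Main()
    {
        string name = "AAAAAA_AAAAAAAAAAAA_AAAAAAA_AAAA - IIIIIII CCCCCCCCCC DD IIIIIIÁIIII_20111018_091327.pptx";
        foreach (string word in Encode(name))
        {
            Console.WriteLine("{0} chars: {1}", word.Length, word);
        }
    }
}
```

Each printed word stays within the 75-character limit. In a real header the words would be separated by folding whitespace, and the receiving client would decode and concatenate them.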
-
For posterity, I'm including my findings here. I believe it to be a bug in .NET, and have submitted it as such to Connect here: Microsoft Connect

It appears to stem from the way encoded headers are wrapped: if the filename exceeds a certain number of bytes when encoded, the system double-encodes the filename for some reason. Odd... hope they fix it. If anyone can test the sample code I uploaded to Microsoft and verify that it fails, I would greatly appreciate the help.

Update: We worked around the issue by zipping the file using ASCII characters. It has been patched in a .NET 4 Framework hotfix available here: Microsoft Connect Download
============================= I'm a developer, he's a developer, she's a developer, Wouldn't ya like to be a developer too?
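For anyone who needs the zip workaround, here is a rough sketch of one way to do it. It assumes .NET 4.5 or later, where System.IO.Compression.ZipArchive is available (plain .NET 4 would need a third-party zip library), and it assumes the archive gets an ASCII-only name while the entry inside keeps the original filename. The helper name ZipWithAsciiName and the archive name "attachment.zip" are hypothetical; this is not necessarily how the workaround in the update was built.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Net.Mail;

static class ZipAttachmentSketch
{
    // Wraps one file in an in-memory zip and returns it as an attachment
    // whose own name is plain ASCII.
    public static Attachment ZipWithAsciiName(string sourcePath, string asciiZipName)
    {
        var buffer = new MemoryStream();

        // leaveOpen: true keeps the MemoryStream usable after the archive is closed.
        using (var zip = new ZipArchive(buffer, ZipArchiveMode.Create, true))
        {
            // The entry inside the archive keeps the original (non-ASCII) name.
            ZipArchiveEntry entry = zip.CreateEntry(Path.GetFileName(sourcePath));
            using (Stream entryStream = entry.Open())
            using (FileStream source = File.OpenRead(sourcePath))
            {
                source.CopyTo(entryStream);
            }
        }

        buffer.Position = 0;
        return new Attachment(buffer, asciiZipName, "application/zip");
    }

    static void Main()
    {
        string name = "IIIIIIÁIIII_20111018_091327.pptx";
        string path = Path.Combine(Path.GetTempPath(), name);
        File.WriteAllText(path, "blah");

        using (Attachment att = ZipWithAsciiName(path, "attachment.zip"))
        {
            Console.WriteLine("{0} ({1} bytes)", att.Name, att.ContentStream.Length);
        }
    }
}
```

The returned attachment can be added to MailMessage.Attachments as in the original sample. Since the header then only carries an ASCII name, the encoded-word wrapping never comes into play.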