Multiple Languages
-
A few months ago I developed an application for a customer in Bangkok, Thailand, that sends data from Pocket PC devices in the field to a central web server (MS SQL Server database). Data are also entered on PCs in the central office and synchronized with the field. The Pocket PCs have their own database, so offline processing is possible. To communicate I use XML and SOAP through a Web Service. We do not use MS SQL Server on the devices but the old-fashioned database built into the OS (no money for CALs).
A few days ago the whole system stopped. Clients in the field could not communicate with the MS SQL Server backend anymore. After debugging I found the following problem:
1. We have address fields in two languages. Users can enter the address in English and in an alternative language (Chinese, Thai, Khmer, etc.).
2. The customer set up the English fields as VARCHAR and the alternative-language fields as NVARCHAR.
3. In the organisation's headquarters in Bangkok (Thailand) a data entry clerk entered a new Bangkok address.
4. She was a little bit confused, so she entered the Thai address in the English fields and the English address in the alternative fields. :^) For your info: Thai is a very complicated script. Up to 4 characters are combined in one position. You write an "I" above a consonant and a "U" under a consonant.
5. Now none of the devices could sync anymore, because the VARCHAR field containing Thai data produced an unhandled XML parse error. :( SQL Server returned garbage which the XML parser (.NET CF) could not handle.
The point I would like to make: programming for multi-language environments can be very tricky, especially with Asian languages. Be careful! Consider all cases!
Chris ;) Vientiane, Lao PDR -- modified at 23:23 Monday 16th January, 2006
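For anyone who wants to see how a failure like the one in step 5 can come about, here is a minimal Python sketch. It is not the original system: the post only says that SQL Server "returned garbage", so the exact mechanism is an assumption. The sketch assumes the Thai text ended up as single-byte code page data (cp874, the Windows Thai code page, is used as a stand-in) inside an XML payload that the parser expects to be UTF-8. The `<address>` element and the sample address are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample value, hypothetical: "Sukhumvit Road" in Thai.
THAI_ADDRESS = "ถนนสุขุมวิท"


def build_payload(text, encoding):
    """Wrap text in a tiny XML document with no encoding declaration.

    Without a declaration the parser assumes UTF-8, so any other
    byte encoding of the text is effectively garbage to it.
    """
    return b"<address>" + text.encode(encoding) + b"</address>"


def parses(payload):
    """Return True if the payload is well-formed XML, False otherwise."""
    try:
        ET.fromstring(payload)
        return True
    except ET.ParseError:
        return False


good = build_payload(THAI_ADDRESS, "utf-8")  # what the parser expects
bad = build_payload(THAI_ADDRESS, "cp874")   # single-byte Thai code page

print("UTF-8 payload parses:", parses(good))
print("cp874 payload parses:", parses(bad))
```

The cp874 bytes are not valid UTF-8, so the parser rejects the whole document rather than just the one field, which matches the "every device stops syncing" symptom described above.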
-
yes, I gather I18n applied to Asian languages (I'm in Australia so 'asia' for me is the Philippines, Thailand, Singapore etc) is quite an interesting subject. Why don't you write and post an article here on CP demonstrating some of this? You obviously have the experience ... 'g' -- modified at 23:41 Monday 16th January, 2006
-
Garth J Lancaster wrote:
I'm in Australia so 'asia' for me is the Philippines, Thailand, Singapore etc
Are you reminded of us only when you play cricket? ;P Cheers, Vikram.
"When I read in books about a "base class", I figured this was the class that was at the bottom of the inheritence tree. It's the "base", right? Like the base of a pyramid." - Marc Clifton.
-
Vikram Shannon wrote:
Are you reminded of us only when you play cricket?
chuckle - not at all Vikram :-) I only implied that, while 'Asia' technically/geographically does include India and, I think, Pakistan (I forget the exact definition), I think in relative terms to where I am. I experienced this 'relativity aspect' when I was in the UK recently - my UK friends would typically think of an 'Asian' as being a person primarily of Indian extraction, while down here we think of an Asian as being more of Chinese ethnic extraction perhaps... Please don't take offence at this next quip - you must understand us Aussies are probably the most politically incorrect race on earth; we must remind ourselves that other people often misinterpret what we say as 'fun' as being racist ... I call the Indians I work with 'curry munchers', while they call me a 'sheep shagger' (I was born in New Zealand, if that explains it). So no, I'm not just reminded of Indians when we play cricket - I just don't put you into the same ethnic/geographical group. I hope none of that is offensive - I'll delete it if so 'g' -- modified at 4:27 Tuesday 17th January, 2006
-
Garth J Lancaster wrote:
I hope none of that is offensive - I'll delete it if so
Hell, no - did you miss that emoticon? :-D
Garth J Lancaster wrote:
Please don't take offence at this next quip - you must understand us Aussies are probably the most politically incorrect race on earth; we must remind ourselves that other people often misinterpret what we say as 'fun' as being racist ... I call the Indians I work with 'curry munchers', while they call me a 'sheep shagger' (I was born in New Zealand, if that explains it)
:laugh::laugh::laugh: Cheers, Vikram.
"When I read in books about a "base class", I figured this was the class that was at the bottom of the inheritence tree. It's the "base", right? Like the base of a pyramid." - Marc Clifton. don`t try to be clever ass wid me while you can`t.. - Adnan Siddiqi.
-
Vikram Shannon wrote:
Hell, no - did you miss that emoticon?
no, but I am conscious that not everybody shares the same views - there are rednecks and racial bigots everywhere 'g'
-
dl4gbe wrote:
The point I would like to make: programming for multi-language environments can be very tricky, especially with Asian languages. Be careful! Consider all cases!
No kidding! We have an app in use in Thailand among 40+ other countries and it's not an easy task. In the situation you outlined it's traditional to use nvarchar for every field to avoid this sort of problem.
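To illustrate why nvarchar everywhere sidesteps the problem, here is a small Python sketch (an illustration only, not code from either application). NVARCHAR stores text as UTF-16, so `utf-16-le` stands in for it here; `cp1252` stands in for the code page behind a VARCHAR column, which is an assumption since the actual collation was not mentioned. The sample addresses are made up.

```python
# Hypothetical sample values for the two address fields.
ENGLISH_ADDRESS = "Sukhumvit Road, Bangkok"
THAI_ADDRESS = "ถนนสุขุมวิท"


def survives(text, encoding):
    """Return True if text can be encoded and decoded back unchanged."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeEncodeError:
        # The encoding has no representation for some character.
        return False


for label, text in (("English", ENGLISH_ADDRESS), ("Thai", THAI_ADDRESS)):
    for encoding in ("cp1252", "utf-16-le"):
        print(label, encoding, survives(text, encoding))
```

With a Unicode column it no longer matters which field the Thai text is typed into: both addresses survive the round trip, whereas the single-byte code page only holds the English one.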
"Hello, hello, what's all this shouting, we'll have no trouble here! This is a Local Shop for Local People, there's nothing for you here!" -Edward Tattsyrup
-
Umm... well, defining the nvarchar address and varchar address fields in the way described is not exactly article worthy, more like a bad example for The Daily WTF than anything else.
-
John Cardinal wrote:
is not exactly article worthy
I agree, probably not, it shows bad design decisions, but I was inviting him to write a decent I18n article (sounds like you could do a good job yourself). Technical issues aside (nvarchar vs varchar), I was interested to see if applying I18n to Thai/Chinese 'Asian' languages etc. would be harder than applying it to Indian languages such as Hindi (that would make a better article than the pure 'technicalities'). Then again, if 'applying I18n' is a methodology, it might not matter what the underlying character representation is .. 'g'
-
All languages are the same for 90% of the work of internationalization. Use locales properly in the app, leave extra room for localized versions, expect both right-to-left and left-to-right display formats, and don't use concatenated strings to form displays (i.e. don't munge together the word "Customer" with the word "List" to form "Customer List", because that will break in many languages). Store all dates and times in UTC and only convert them for display and input processing. Don't assume all the printers in the world use non-metric paper sizes. Throw the idea of a "character" out the window for just about any language and don't ever allow yourself to think it again. Don't assume Unicode means two bytes per character: in UTF-8, Thai and CJK text takes three bytes per code point, and in UTF-16 anything outside the Basic Multilingual Plane takes four.
Working in .NET, much of the work is done for you in the System.Globalization namespace, and pretty much every class that could conceivably have any locale-related issues comes with locale-accepting versions. When we first wrote our major app back in '99 in C++ for Win32 this was nothing short of a nightmare; now it's a walk in the park with .NET.
There are copious amounts of info available all over the internet on localization and internationalization. A good week of study and you know all you need to know. Most programmers should start at Unicode.org just to get their minds around the whole language issue, as it will probably be the most complex part for any experienced programmer with preconceived notions about these things.
The only tricky part of internationalization, even in .NET, that I personally have had to wrestle with is word breaking and indexing for a search engine built into our application. That is where the idiosyncrasies of individual languages start to show big time. Thai is a special case for word breaking and formatting, as it is written without spaces between words; "sentences" are just long uninterrupted strings of text. Windows does a good job of presenting it, but actually having to break it apart into searchable "words" is a tricky business. CJK languages (Chinese, Japanese and Korean) pose their own challenges, but at least there are some rules for "word" breaking, and there is a neat trick that can be applied to any language without whitespace to separate words (including Thai), which I put into practice in an article here on Unicode word breaking.
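The post does not say what the trick is, so the following is only a guess at the general idea, not the method from the article: one common way to index text that has no spaces is to break it into overlapping character n-grams (bigrams here) and search on those. A minimal Python sketch, with made-up function names and sample street names:

```python
def char_ngrams(text, n=2):
    """Return overlapping n-grams of code points; short text yields itself."""
    if len(text) <= n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_index(docs, n=2):
    """Map each n-gram to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for gram in char_ngrams(text, n):
            index.setdefault(gram, set()).add(doc_id)
    return index


def search(index, query, n=2):
    """Return ids of documents containing every n-gram of the query."""
    grams = char_ngrams(query, n)
    if not grams:
        return set()
    return set.intersection(*(index.get(g, set()) for g in grams))


# Hypothetical sample data: two Bangkok street names, no spaces.
DOCS = {1: "ถนนสุขุมวิท", 2: "ถนนสีลม"}
INDEX = build_index(DOCS)

print(search(INDEX, "สุขุมวิท"))  # candidate documents for "Sukhumvit"
print(search(INDEX, "ถนน"))       # candidate documents for "road"
```

The result is a set of candidates, not exact matches: a document can contain every bigram of the query without containing the query itself, so a real search would verify the hits. The sketch also slices by code point, which can split a Thai consonant from its vowel mark, and a query shorter than the n-gram size finds nothing. It needs no dictionary and no language-specific rules, though, which is what makes this kind of approach attractive for Thai and CJK text.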
"Hello, hello, what's all this shouting, we'll have no trouble here! This is a Local Shop fo
-
excellent - thanks for the info John .. I'll file this away - while my target market is 'English speaking' at the moment, I should definitely be aware of this .. have a great day Garth