A plea to Japanese (or Asian) Language Web Developers

  • Kornfeld Eliyahu Peter

    The line-breaking rules of Japanese are very permissive: you can break a line (almost) anywhere. Unfortunately, even those rules are only implemented in Firefox... I checked some Japanese sites (in the past, not recently) and found no specific handling of line breaks, so it seems to me that a fluent Japanese reader can figure out what a text is about regardless of where the lines break... If, however, you insist on preventing breaks (as there is no 'right' or 'exact' place to break), you can use CSS, but that of course may (probably will) change the exact layout of your site... Try to come up with a fluid CSS design in which a change in the exact width of a text block alters your overall page design a bit but does not break it...

    Skipper: We'll fix it. Alex: Fix it? How you gonna fix this? Skipper: Grit, spit and a whole lotta duct tape.

    dan sh
    wrote on last edited by
    #6

    So, this is why they are slow readers. They need to first figure out what the sentence actually means.

    • J Jeremy Falcon

      Also, here is a Bootstrap website supporting more than one language to help get your motor running: http://en.houbovypark.cz/

      Jeremy Falcon

      Simon Lee Shugar
      wrote on last edited by
      #7

      Thanks for the help, I'll have a look into it. I was wondering if the Japanese language used certain characters to symbolise a block of characters that must be read together. The web is an interesting weave indeed.

      Simon Lee Shugar (Software Developer) www.simonshugar.co.uk "If something goes by a false name, would it mean that thing is fake? False by nature?" By Gilbert Durandil

      • G Gary Wheeler

        The book is a bit old and may be out of print, but you might take a look at "CJKV Information Processing" by Ken Lunde. It was published by O'Reilly in 1999. The ISBN is 1-56592-224-7. Chapter 7, "Typography", includes a discussion on text wrapping issues. I can tell you from experience in developing localized applications that the general rule is to never construct text from stock phrases. Every piece of text in the application should be grammatically complete. The only time you should substitute one piece of text into another for display to the user is for parameter values (numbers, filenames, etc.). Absolutely nothing will impress your user base less than a clumsy UI that makes grammatical mistakes like a three-year-old on Valium.

        Software Zen: delete this;

        Simon Lee Shugar
        wrote on last edited by
        #8

        Gary Wheeler wrote:

        I can tell you from experience in developing localized applications that the general rule is to never construct text from stock phrases.

        I think this might be the best advice yet. It's not just translating text from English to Japanese, or translating from English to Japanese in the context of the application, but in fact translating English to Japanese in the context of the application and in the context of the design... Interesting. (Substitute the languages at will.)

        Simon Lee Shugar (Software Developer) www.simonshugar.co.uk "If something goes by a false name, would it mean that thing is fake? False by nature?" By Gilbert Durandil

        • S Simon Lee Shugar

          I've added this question to the Web Development forum but it was suggested I add a plea into the lounge also. I'd appreciate the help - the link and question are below! Link: http://www.codeproject.com/Messages/5102431/How-do-you-deal-with-Japanese-Asian-languages-in-r.aspx "This is something I have come up against recently and that is dealing with the Japanese language in responsive web applications. Changing the formation of a sentence can change the entire context or meaning. How do we deal with this - if a sentence is too long for a field and spills onto the next line? Is there anywhere I can read up on how this is handled? Any information would be appreciated! Example: 私は、コードが好き = I , like the code; 私は、コードが好 き = I , code is good Can"

          Simon Lee Shugar (Software Developer) www.simonshugar.co.uk "If something goes by a false name, would it mean that thing is fake? False by nature?" By Gilbert Durandil

          Anthony Appleyard
          wrote on last edited by
          #9

          Google Translate's Japanese-to-English translator makes the same mistake as yours. It seems to have a bug: it treats end-of-line as end-of-sentence. Its French-to-English version did the same with a French sentence: Le chat mangeait la souris. AS The cat ate it mouse. Le chat mangeait la souris. AS The cat ate the mouse. This may happen only with very short lines.

            User 10331519
            wrote on last edited by
            #10

            Those sentences are identical; Japanese is not grammatically affected by line wrapping. As the other commenter suggests, the problem lies with Google Translate (which, by the way, generally produces gibberish from Japanese).
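A quick way to check the point above is to strip the wrapping whitespace and compare the two strings. This is only a minimal sketch: the helper name `strip_wrapping` is made up for illustration, and removing every whitespace character is only reasonable for text that is purely Japanese (mixed Latin text would lose its word spaces).

```python
def strip_wrapping(text: str) -> str:
    """Drop all whitespace (spaces, newlines, ideographic spaces) from text.

    Assumption: the text is Japanese only, where spaces carry no meaning.
    """
    return "".join(ch for ch in text if not ch.isspace())


unwrapped = "私は、コードが好き"
wrapped = "私は、コードが好 き"  # the space stands in for the line break

# Once the whitespace is gone, both strings are character-for-character equal.
print(strip_wrapping(wrapped) == unwrapped)
```

Normalizing text like this before pasting it into a translation tool avoids the end-of-line problem described earlier in the thread.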

              Mario Prawirosudiro
              wrote on last edited by
              #11

              It does, but this generally only happens in katakana/hiragana, not kanji. For example, a small 'tsu' indicates a long consonant. And, the presence of 'u' after anything that ends with 'o' (like 'to' or 'ko') indicates a long 'o'. The same goes for an 'i' after anything that ends with 'e'. I can't put down the exact letters here, but I'm sure you'll be able to find many sources for katakana/hiragana online. That said, I've never seen any split-up words in any Japanese texts I've seen (admittedly, they're mostly games and manga ;) ), and there are cases where two or more words are joined together to provide context. So if you're looking for a way to split words up, my suggestion is don't.

              • J Jeremy Falcon

                Without a specific question I cannot give a specific answer, and keep in mind I've only studied Japanese for a few months, but your issue has nothing to do with a different language... a different font would give you the same issue. Perhaps even a different text size (for the visually impaired) would give you this problem as well. There's no one exact answer, except to remember one core, fundamental difference in design philosophy between print and the web. The web is live and fluid. Print is not. So in other words you need to design your page in such a way that it works in more than one scenario. If for whatever reason you switch languages and it causes a line wrap in one language where in English there was none, then your UI layout needs to handle it. In the case where this is not acceptable, consider having one layout for one language and another for a different language. http://www.nomensa.com/blog/2010/7-tips-and-techniques-for-multi-lingual-website-accessibility Points 6 and 7 in that link talk about this a bit more.

                Jeremy Falcon

                jibalt
                wrote on last edited by
                #12

                Whoosh!

                  Kirk 10389821
                  wrote on last edited by
                  #13

                  I help maintain a site that supports 11 languages including Japanese. We have a phrase-based system we use, and we use outsourced translators and in-house spot checkers familiar with different languages. We build new interfaces in English (because that is our primary language). We try to design for the fact that English is usually the shortest way to say most things (in characters, due to the volume of 2-, 3- and 4-letter words) by putting labels ABOVE fields, not in front of them. Then we send out all of the phrases to be translated. We then review the pages in every language to make sure something drastic did not happen to the flow. We also use the same group(s) to translate, so they know our background/context. They also require that ALL translations are reviewed by a second person and edited before they get back to us. HTH

                    Simon Lee Shugar
                    wrote on last edited by
                    #14

                    Thanks Kirk. This is most likely the option I am going to suggest. Gary from an earlier post said something similar, and both replies seem to be the most sensible solution. We have a similar process to yours; I think it is just the design we need to be more wary of. Thanks again!

                    Simon Lee Shugar (Software Developer) www.simonshugar.co.uk "If something goes by a false name, would it mean that thing is fake? False by nature?" By Gilbert Durandil

                      Kirk 10389821
                      wrote on last edited by
                      #15

                      No worries. Real experience counts for a lot. Also, we use SUBSTITUTION-based phrases; it turns out this makes many things easier to translate, because the text goes out in larger chunks. Example: "All E-Mail replies will be sent to {EMAIL_REPLY_TO} and will be sent from {EMAIL_FROM} so be sure to white list this account." By doing this, if the language REQUIRES a completely different ordering, it will come back properly: "XXX {EMAIL_FROM} YYY {EMAIL_REPLY_TO} ZZZ" "AAA {EMAIL_REPLY_TO} BBB {EMAIL_FROM} CCC" Trust me. A lot of output has embedded data from the web site. Think about a simple email to change your password using the attached link. Without such control, where do you put the link? How would that affect the wording? How about a link in case they DID NOT choose to reset their password?
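The substitution idea above can be sketched with named placeholders, so each translation decides where the values go. This is an illustration only, not Kirk's actual system: the `PHRASES` table, the `render` helper, the "xx" language code and the example.com addresses are all assumptions, and the second phrase is the stand-in "XXX ... ZZZ" text from the post rather than a real translation.

```python
# Hypothetical phrase table: one whole, grammatically complete phrase per language.
PHRASES = {
    "en": (
        "All E-Mail replies will be sent to {EMAIL_REPLY_TO} and will be "
        "sent from {EMAIL_FROM} so be sure to white list this account."
    ),
    # Stand-in for a language that needs the opposite placeholder order.
    "xx": "XXX {EMAIL_FROM} YYY {EMAIL_REPLY_TO} ZZZ",
}


def render(lang: str, **values: str) -> str:
    """Fill the named placeholders of the phrase for the given language."""
    return PHRASES[lang].format(**values)


for lang in ("en", "xx"):
    print(render(lang,
                 EMAIL_REPLY_TO="support@example.com",
                 EMAIL_FROM="noreply@example.com"))
```

Because the placeholders are matched by name rather than by position, the calling code stays the same no matter how a translator reorders the sentence. A translation that needs literal braces would have to escape them as `{{` and `}}`.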

                        DanKorn
                        wrote on last edited by
                        #16

                        If you think Japanese is fun, try some bidirectional text, that is, mixing left-to-right with right-to-left text (such as English and Arabic). Good luck trying to make sense of how lines should wrap!

                          RASPeter
                          wrote on last edited by
                          #17

                          Blocks of kanji that should be read together as a single word are quite common, actually. It's similar to how we string together Greek and Latin roots to make new words. For example: Locomotive = 機関車

                            User 11817871
                            wrote on last edited by
                            #18

                            Japanese newspapers still flow text right to left and vertically, top to bottom! What is even more impressive is that Japanese OCR software, such as Fuji Xerox's DocuWorks Desk, understands this!

                              Colorado_Bill
                              wrote on last edited by
                              #19

                              As many others have mentioned, Google Translate is awful when it comes to Japanese. I have been studying Japanese for quite some time and even I am sometimes misled by the weird formatting of various web pages. Since no one else has mentioned it, I will try to help a bit here with the language side of it -- typically you don't see Japanese words broken up between lines, since it makes them harder to read (esp. with kanji that have multiple readings). What this means is that "ideally" you would break up sentences on "particle" and word boundaries. Particles are "markers" that help identify the subject, object, etc. Parsing for these is way beyond this type of discussion, and requires some working knowledge of the language. Also, to make it harder yet, spaces are neither necessary NOR required in Japanese writing (western-style punctuation has crept in though). As a rough algorithm though (if you cannot read or parse for real particles/words) you could assume any kanji followed by hiragana is a word (until you hit a period or another kanji or katakana). Also, runs of katakana are single words too (typically foreign words, like "code" in your case). In your case: 好き (すき) is a single word (to like/love, depending on context) and shouldn't be broken up with a line break (IMHO). Following this algorithm would break up the sentence like: 私は, ____ コード ____ が ___ 好き which gives four "words" -- it turns out that for this case you have 2 particles, は and が (for those keeping score), but keeping them attached to their prior "word" isn't typically too confusing to read. This was probably too long-winded for this question but I hope it helped some.

                              Bill
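The rough algorithm in the post above can be sketched as follows. This is a heuristic under stated assumptions, not real Japanese word segmentation: the function names are made up, the script test uses only the basic Unicode blocks, punctuation is attached to the chunk before it, and whitespace is ignored. Proper segmentation needs a morphological analyzer and knowledge of the language.

```python
def script_of(ch: str) -> str:
    """Classify one character by Unicode block (basic ranges only)."""
    code = ord(ch)
    if 0x4E00 <= code <= 0x9FFF:   # CJK Unified Ideographs
        return "kanji"
    if 0x3040 <= code <= 0x309F:   # Hiragana
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:   # Katakana, including the long-vowel mark
        return "katakana"
    return "other"                 # punctuation, Latin, digits, ...


def rough_chunks(text: str) -> list[str]:
    """Split text into chunks that should not be broken across lines."""
    chunks: list[str] = []
    prev = None
    for ch in text:
        if ch.isspace():
            continue  # spaces and line breaks carry no meaning here
        kind = script_of(ch)
        if prev is None:
            new_chunk = True
        elif kind == "other":
            new_chunk = False               # punctuation stays with its word
        elif prev == "other":
            new_chunk = True                # anything after punctuation starts fresh
        elif kind == "hiragana":
            new_chunk = prev == "katakana"  # hiragana trails kanji or hiragana
        else:
            new_chunk = kind != prev        # a kanji or katakana run starts a word
        if new_chunk:
            chunks.append(ch)
        else:
            chunks[-1] += ch
        prev = kind
    return chunks


print(rough_chunks("私は、コードが好き"))  # ['私は、', 'コード', 'が', '好き']
```

The result matches the four "words" listed in the post, and the wrapped variant of the sentence (with a space before き) yields the same chunks. A layout could then allow line breaks only between chunks.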

                                Mario Prawirosudiro
                                wrote on last edited by
                                #20

                                Indeed they are. I really meant to say syllables. In most cases, kanji letters, when strung together, act as a single syllable. For example, the kanji for 'person' could be read 'hito' when standalone, or 'jin' when it's a part of a word. However, in most cases, they're standalone syllables, unlike (for example) 'ho' + 'u', which is read 'hoo' (long 'o'). At least that is what I know. I don't have much formal training when it comes to Japanese. I mostly learn it out of self interest.

                                  RASPeter
                                  wrote on last edited by
                                  #21

                                  I think you actually mean katakana, not kanji. Japanese uses two phonetic character sets: katakana and hiragana. Historically, hiragana was used by women and katakana was used by men, and you can still see that history in the characters themselves. Hiragana tend to be more round and have loops, while katakana tend to be more angular. In modern usage, hiragana is used for native Japanese words and katakana is used for foreign words. Kanji are Chinese characters, and each is a word in itself, representing a distinct concept. They are not used as syllables, because that's what katakana and hiragana were created for. When kanji are strung together it is to merge the concepts together to describe a new thing that can't be described adequately by any existing single character (again, just like we do with Greek and Latin roots). All the characters in both katakana and hiragana were derived from kanji that have the same pronunciation, and most (possibly all?) retain the meaning of the original kanji, even though that's not typically how they're used. It's pretty common to see kanji and hiragana together, and there are two general cases. First, small hiragana are sometimes placed above or below a kanji character as a pronunciation guide, called furigana, because children are generally taught hiragana first and gradually introduced to kanji as they get older. Second, hiragana are often added to the end of a kanji word (one or more characters) to indicate verb conjugation, because Japanese has verb tenses and Mandarin (which kanji were actually created for) does not. Anyway, I hope I didn't go too overboard there. I actually do have some formal training in both Chinese and Japanese. Not enough to claim even moderate fluency, sadly, but enough to understand how the writing systems work.

                                    Mario Prawirosudiro
                                    wrote on last edited by
                                    #22

                                    Yes, sorry. I meant one kanji letter is usually one syllable, though in some cases it could be more than one (like the kanji for mountain in 'Fujiyama'). Like I said in my original post, katakana/hiragana in many cases require more than one letter to produce a syllable. When I said 'syllable', it's from the perspective of someone who's used to the Latin alphabet, thus 'yama' is two syllables, though it's written with one kanji. And 'he + i' (which is read 'hee' with a long 'e') is a single syllable, though it's written with two hiragana. So what I'm trying to say is, while you could split words consisting only of kanji across two lines with ease, that might not be the case with words containing hiragana/katakana, for that reason, as it might confuse the reader. Or at least make it harder for them to read. Now, I don't know what kind of provision the Japanese language has for dealing with word splitting for line breaks, but personally, I've never seen any split words in the texts (*cough*manga*cough*) I've read. I don't know about websites though.

                                      RASPeter
                                      wrote on last edited by
                                      #23

                                      If katakana/hiragana are present in typed Japanese there will also be spaces. There might not be if it's pure kanji, but anyone who can read that is already having to figure out a lot from context (verb tense, etc) so putting a line break in the wrong spot is not likely to increase their cognitive load significantly.
