Thoughts on Internationalization
-
We're getting pressure from one of our customers to internationalize our software product. All of our currency and dates are handled correctly or get fixed quickly. We have about 70% of the words translated via resx files. There are also some database translations where we allow customization. All of this works. However, since it isn't 100% complete, it came up in a discussion with management. One of the devs wants to remove the resx files and put all translations in a database table (actually 3). Curious if anybody out there has any strong opinions on whether database-only vs. resx translations are better or not. There are articles out there and Stack Overflow questions, but most of it is older. Is resx still in favor? Is it a good choice? My feeling is that re-working all of the resx for some new custom format isn't a good use of our time. Thanks for your thoughts.
Hogan
Right. "Time" or "lag". Resource files are easier and faster to update versus a "resource management system" sitting on a server (IMO). You can easily write a file parser at some point to report on your "resources".
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
-
snorkie wrote:
We're getting pressure from one of our customers to internationalize our software product
Basically if you want to sell in France and/or Quebec (Canada) you better be able to support French.
snorkie wrote:
One of the devs wants to remove the resx files and put all translations in a database table (actually 3)
Certainly not something I would want to see happen. The UI is just going to end up caching that every single time anyway. What happens if someone does a browser refresh? Do you load each page or everything all at once? If you have 1,000 distinct text items on a new page, do you really want to do a pass through the cache (1,000 separate database calls)? There would of course be process (not code) considerations for how translations get into the database during normal feature deliveries. Does it end up being treated as a database update, which means there are also rollback considerations if a feature fails? The same consideration applies to all non-prod boxes, such as Developer and QA. What happens if the database is down? I think the dev who wants this should be required to write the conversion Epic along with all of the stories and designs needed to support it, then cost it out and present that cost to management. I suspect it will be nixed by management, even presuming the dev has the willingness and expertise to fully write out the plan. It should include, at a minimum:
- A specific analysis of why this is better.
- How the UI uses it, and specifically what needs to change in the UI.
- Performance impacts.
- Deployment steps for changes.
- Removing the old code.
- How this will be supported with the current translation service.
snorkie wrote:
There are also some database translations where we allow customization.
Presumably a customer can change this. That should not be a consideration for this case. However if a customer wants to support users with different language needs does that existing design account for that?
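To make the caching concern above concrete, here is a minimal sketch (Python, since the thread never settles on a stack; the fake_db table, load_locale, and key names are all invented for illustration) of a translation store that pulls a whole locale in one batch instead of issuing one database round-trip per label:

```python
# Sketch of the caching trade-off: load a locale's strings in one batch
# instead of one database call per label. fake_db stands in for a real
# translations table keyed by (locale, key).
fake_db = {
    ("fr", "menu.file"): "Fichier",
    ("fr", "menu.edit"): "Édition",
    ("en", "menu.file"): "File",
    ("en", "menu.edit"): "Edit",
}

class TranslationCache:
    def __init__(self, db):
        self.db = db
        self.locales = {}   # locale -> {key: text}
        self.queries = 0    # how many "database" round-trips we made

    def load_locale(self, locale):
        """One batch query per locale, not one query per string."""
        self.queries += 1
        self.locales[locale] = {
            key: text for (loc, key), text in self.db.items() if loc == locale
        }

    def get(self, locale, key, fallback="en"):
        if locale not in self.locales:
            self.load_locale(locale)
        table = self.locales[locale]
        if key in table:
            return table[key]
        # Fall back to the default language rather than failing outright.
        if fallback not in self.locales:
            self.load_locale(fallback)
        return self.locales[fallback].get(key, key)

cache = TranslationCache(fake_db)
labels = [cache.get("fr", "menu.file"), cache.get("fr", "menu.edit")]
print(labels, cache.queries)  # two lookups, one round-trip
```

This only illustrates the read path; the rollback, deployment, and database-down concerns in the post above are exactly the parts such a sketch does not solve.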
-
I just have to add my 'horror story' from at least as long ago: My company went to a professional translator to have the prompts for the word processor (remember when we used to call it a word processor?) translated into German. The translator was given all the strings as one list of English terms. Nothing else. No context indication. This text processor had, of course, a function for 'Replace all -old text- with -new text-', with a checkmark for 'Manual check?'. In the German translation this came out as 'Ersetze -- mit --', and a checkbox 'Handbuch kontrollieren?' ('check the handbook' - the translator read 'manual' as the printed book, not as 'by hand'). This was discovered in time for the official release, but only a few days before.
-
Couldn't agree more. My point was that simply plucking text from a database and putting it in a user interface will make it look bad to the point of being useless. Here is a horror story I've seen "in a galaxy far, far away". A programmer who knew everything tells his team: just put all the texts you need translated between some kind of delimiters. I'm going to write a nice little program that extracts all those texts, puts them in a database and passes them to the i18n team. They will just have to enter translations for those texts and Bob's your uncle, I solved all these pesky problems. Trouble came soon after, first when they realized some words had multiple meanings. In English "port" can be a harbour or the left side of the ship, but in French "port" and "bâbord" are very different words. Translators had no clue in what context a word was used; besides, they could enter only one translation for a word. Source code also became a cryptic mess where something like SetLabel("Density") became SetLabel(load_string(ID_452)). Some of the texts were too long, others too short; in brief, such a mess that most users gave up on using localized translations and stuck to English. But the programmer who knew everything remained convinced he had solved the problem. Moral of the story: humans are messy and their languages too. There is no silver bullet, and text in a database is very, very far from being one.
Mircea
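The "port" ambiguity described above is usually attacked by making the lookup key carry context instead of being the English word itself. A minimal sketch of that idea (Python; all keys and translations here are invented for illustration):

```python
# Sketch: context-qualified keys let the same English word "port"
# translate differently depending on meaning. Keys are invented.
translations = {
    "fr": {
        "net.port": "port",          # a network port
        "ship.port_side": "bâbord",  # left side of a ship
        "ship.harbour": "port",      # a harbour
    },
}

def tr(locale, key, english):
    """Look up by context key; fall back to the English source string."""
    return translations.get(locale, {}).get(key, english)

print(tr("fr", "ship.port_side", "port"))  # bâbord
print(tr("fr", "net.port", "port"))        # port
print(tr("de", "net.port", "port"))        # port (no German table yet)
```

The cost is exactly the one the post complains about: the source code now reads tr(..., "ship.port_side", ...) instead of a plain string, so the trade-off between translator context and code readability does not go away.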
Yes, but unfortunately there are very few companies, if any (even very large ones), that can afford to hire 100 people fluent in living languages to work exclusively on context translations for each software project. And keep in mind that 100 is not even close to the number of identified living languages. But it likely is close to what one might consider a viable market. So one just hopes that one can get by.
Mircea Neacsu wrote:
Trouble came soon after, first when they realized some words had multiple meanings.
I have worked for a number of companies that had no problems using services to provide translations based on provided text. And there are more difficult problems than just providing the context for a specific word.
Mircea Neacsu wrote:
gave up on using localized translations and stuck to English.
France and Quebec (province of Canada) both have laws that basically state that a company cannot require an employee to speak/read any language except French. So if you bring in that software there the company could end up with a number of employees sitting around staring at the walls all day. And the governments stipulate that the software they use must be in French. You can't get the contract without agreeing to that.
Mircea Neacsu wrote:
became SetLabel(load_string(ID_452)).
If programming was easy they wouldn't need people to do it.
-
jschell wrote:
France and Quebec (province of Canada) both have laws...
I lived in Montreal for over 30 years, so I know a bit about language laws in Quebec. Incidentally, I also know a bit about those in France. I cannot say more because I would run afoul of CP rules 🤐 :) No amount of regulation can force people to use a dysfunctional product. They will find a way to go over/under/behind those regulations. If, in your case, a database or a simple text file was good enough, more power to you :thumbsup:
Mircea
-
jschell wrote:
And keeping in mind that 100 is not even close to the number of identified living languages. But it likely is close to what one might consider a viable market.
If you cover 100 languages, you are bound to also run into a lot of cultural aspects that are not language specific or based. 20 years ago, 'everyone' wanted to collect the entire internet in their databases. Archive.org is one of the (few) survivors of that craze. I was in it, and went to an international conference. Access control to the collected information was an essential issue, and one of the speakers told us that he had been in negotiations with delegates from US native groups about how to protect information that should be available only to males, or only to females. Also, some information should be available only during the harvesting season, other information only during the seeding season. The limits of either of course depended on the kind of crop. Needless to say, the access control of the system presented by the speaker did not have sufficient provisions for these demands. He presented it as an unsolved issue. If we simply state "We can't honor such cultural restrictions - the whole world must simply adapt to our culture, accept our freedoms (and most certainly respect all our taboos)!", then we are cultural imperialists as bad as in the era of colonization. And we are.
-
Mircea Neacsu wrote:
There is no silver bullet
One could use a trick we used in the 1980's, when IT books were not translated. You learn English :thumbsup:
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
99% of my heavy criticism of computer book authors and their editors is directed towards English language textbooks. They are most certainly no better than the translated ones. I guess that part of the problem is that major parts of the English speaking world (read: in the US of A) do not read very much any more. Their critical sense to reject (sometimes very) bad books, from a language, editorial and presentation point of view, has worn out. They do not know how to distinguish a well written book from a crappy one. So the fraction of crappy books is steadily increasing. My impression is that the average IT textbook written in other languages (my experience is with Scandinavian languages, but I suspect that it holds for a lot of other languages) is written under a lot stricter editorial control, and is a lot less smudged with 'edutainment' elements, going much more directly to the point. So the number of pages is about half. Originating in the US of A has in no way been a guarantee of quality for an IT textbook. Quite the contrary. When I feel the temptation to dig out my marker and my pen to clean up the text, I often think of how I could reshape this text into something much better in a Norwegian edition, half the pages. But at the professional level I am reading new texts at today, the market for a Norwegian textbook is too small for it ever to pay the expenses. Making an abridged English version would lead to a lot of copyright issues.
-
trønderen wrote:
Originating in the US of A has not in any way been any guarantee for quality
Full stop there, as that is not just limited to books. Learning English (not American) gives you a wider range, just as learning to write in English does. To drive that point home, our little CP community is English only.
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
trønderen wrote:
I guess that part of the problem is that major parts of the English speaking world (read: in the Us of A) do not read very much any more. Their critical sense to reject (sometimes very) bad books, from a language, editorial and presentation point of view has worn out. They do not know how to distinguish a well written book from a crappy one. So the fraction of crappy books is steadily increasing.
I doubt the implied cause there. It is much easier to produce (publish) a book now than even 20 years ago. And much, much easier than 50 years ago. And it is orders of magnitude different for self publishing. 50 years ago one would need a publisher to accept the book and then an editor working for that publisher would edit it. (Not totally true but one would need much more knowledge and money to self publish then.) Now even when that path is followed the role of the editor is less. Probably due to the publisher wanting to save costs but also because there are so many more books published. I would be very surprised if the publishers were not seeking quantity rather than quality now. Much more so than in the past. I suspect all of those factors have even more of an impact for 'text books'. After all just one consideration is that there is quite a bit of difference in editing a romance novel versus editing a programming language book.
-
The reduction in quality is most certainly not limited to self-published books. I guess every English IT book I have bought(*) was published by what everybody would classify as highly respected publishing houses. These no longer need to spend resources on keeping the quality up, through editors and reviewers. The books sell anyway. One thing one could mention to explain all the talkety-talk and lack of conciseness: the entry of the PC as a writing tool. When authors were still using typewriters, editing was much more cumbersome; it required a lot more work to switch two sentences around, or move a paragraph to another chapter. The first thing that happened was that authors wrote down every thought they could think of, without filtering the way they did before. The second thing was that they forgot how to use the delete key, and how to cut and paste to clean up the structure of the text. I guess that the publishing process makes up a larger fraction of the budget today; the cost of the paper is a smaller fraction than it used to be. Publishing/printing a 600 page book is not three times as expensive as a 200 page one. (Well, it never was three times as expensive, but the cost of the materials made much more impact on the sales price 50 years ago.) (*) I have got one self-published IT book - Ted Nelson: Computer Lib/Dream Machines[^], the book introducing the concept of hypertext. It was published 49 years ago, before you had MS Word for writing your manuscript. Most of it is typewriter copy, or hand written. This is probably the first IT book I'd try to save if a fire broke out in my home.