regex to replace accents
-
I have this regex to replace accents, but it fails if the text contains "|".
string Text = "word|word2";
// Regex.
System.Text.RegularExpressions.Regex replace_a_Accents = new System.Text.RegularExpressions.Regex("[á|à|ä|â]", System.Text.RegularExpressions.RegexOptions.Compiled);
System.Text.RegularExpressions.Regex replace_e_Accents = new System.Text.RegularExpressions.Regex("[é|è|ë|ê]", System.Text.RegularExpressions.RegexOptions.Compiled);
System.Text.RegularExpressions.Regex replace_i_Accents = new System.Text.RegularExpressions.Regex("[í|ì|ï|î]", System.Text.RegularExpressions.RegexOptions.Compiled);
System.Text.RegularExpressions.Regex replace_o_Accents = new System.Text.RegularExpressions.Regex("[ó|ò|ö|ô]", System.Text.RegularExpressions.RegexOptions.Compiled);
System.Text.RegularExpressions.Regex replace_u_Accents = new System.Text.RegularExpressions.Regex("[ú|ù|ü|û]", System.Text.RegularExpressions.RegexOptions.Compiled);
// Replace.
Texto_Retorno = replace_a_Accents.Replace(Texto_Retorno, "a");
Texto_Retorno = replace_e_Accents.Replace(Texto_Retorno, "e");
Texto_Retorno = replace_i_Accents.Replace(Texto_Retorno, "i");
Texto_Retorno = replace_o_Accents.Replace(Texto_Retorno, "o");
Texto_Retorno = replace_u_Accents.Replace(Texto_Retorno, "u");
-
Why would you use a Regex for something so simple?
bool replacedAny = false;
char[] characters = Texto_Retorno.ToCharArray();
for (int index = 0; index < characters.Length; index++)
{
    switch (characters[index])
    {
        case 'á':
        case 'à':
        case 'ä':
        case 'â':
            characters[index] = 'a';
            replacedAny = true;
            break;
        case 'é':
        case 'è':
        case 'ë':
        case 'ê':
            characters[index] = 'e';
            replacedAny = true;
            break;
        case 'í':
        case 'ì':
        case 'ï':
        case 'î':
            characters[index] = 'i';
            replacedAny = true;
            break;
        case 'ó':
        case 'ò':
        case 'ö':
        case 'ô':
            characters[index] = 'o';
            replacedAny = true;
            break;
        case 'ú':
        case 'ù':
        case 'ü':
        case 'û':
            characters[index] = 'u';
            replacedAny = true;
            break;
    }
}

// Only allocate a new string if something actually changed.
if (replacedAny)
{
    Texto_Retorno = new string(characters);
}
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
-
Inside [...] (a character class) you don't want the |s. There they are literal characters rather than "or", so each of your classes also matches "|" itself. [aeiou] as a regex will match any vowel, for example.
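To make that concrete, here is a rough sketch of the original five regexes with the pipes taken out of the character classes. The class name AccentRegexDemo and the method name ReplaceAccents are made up for the example; everything else is the code from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper class, not from the original post.
public static class AccentRegexDemo
{
    // Each class lists only the accented letters. No '|' inside the brackets,
    // so a literal "|" in the input no longer matches.
    private static readonly Regex A = new Regex("[áàäâ]", RegexOptions.Compiled);
    private static readonly Regex E = new Regex("[éèëê]", RegexOptions.Compiled);
    private static readonly Regex I = new Regex("[íìïî]", RegexOptions.Compiled);
    private static readonly Regex O = new Regex("[óòöô]", RegexOptions.Compiled);
    private static readonly Regex U = new Regex("[úùüû]", RegexOptions.Compiled);

    public static string ReplaceAccents(string text)
    {
        text = A.Replace(text, "a");
        text = E.Replace(text, "e");
        text = I.Replace(text, "i");
        text = O.Replace(text, "o");
        text = U.Replace(text, "u");
        return text;
    }
}
```

With this version, AccentRegexDemo.ReplaceAccents("wórd|wörd2") gives "word|word2", whereas the classes written with pipes would also turn the "|" into an "a".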
Software rusts. Simon Stephenson, ca 1994. So does this signature. me, 2012
-
Nothing to see here. :innocent-whistle-smily: In my defence, "3" is directly above "e" on the keyboard. :laugh:
-
I was afraid of that. So Regex can't do the job, and neither can Replace?
-
case 'Á', case 'À', case 'Ó' ... to be continued.
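One possible way to cover the upper-case letters without making the switch twice as long is a lookup table. This is only a sketch: the class name AccentMap, its Replace method, and the exact set of letters are assumptions for illustration, limited to the same five vowels and four accents used in the thread.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not from the original posts.
public static class AccentMap
{
    private static readonly Dictionary<char, char> Map = BuildMap();

    private static Dictionary<char, char> BuildMap()
    {
        var map = new Dictionary<char, char>();
        Add(map, "áàäâ", 'a'); Add(map, "ÁÀÄÂ", 'A');
        Add(map, "éèëê", 'e'); Add(map, "ÉÈËÊ", 'E');
        Add(map, "íìïî", 'i'); Add(map, "ÍÌÏÎ", 'I');
        Add(map, "óòöô", 'o'); Add(map, "ÓÒÖÔ", 'O');
        Add(map, "úùüû", 'u'); Add(map, "ÚÙÜÛ", 'U');
        return map;
    }

    // Registers every character of 'accented' as mapping to 'plain'.
    private static void Add(Dictionary<char, char> map, string accented, char plain)
    {
        foreach (char c in accented)
        {
            map[c] = plain;
        }
    }

    public static string Replace(string text)
    {
        char[] chars = text.ToCharArray();
        for (int i = 0; i < chars.Length; i++)
        {
            char plain;
            if (Map.TryGetValue(chars[i], out plain))
            {
                chars[i] = plain;
            }
        }
        return new string(chars);
    }
}
```

For example, AccentMap.Replace("ÁRBOL|Ópera") gives "ARBOL|Opera"; characters that are not in the table, including "|", pass through untouched.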
-
I was afraid of that. So Regex can't do the job, and neither can Replace?
Regex can do the job. But running five+ separate regex operations on a string just to replace a few letters with their unaccented alternatives is overkill. The other option, which is even nastier and less obvious, is to use Unicode normalization:
using System.Globalization;
using System.Text;

static string RemoveDiacritics(string stIn)
{
    // Decompose each accented letter into its base letter plus combining marks.
    string stFormD = stIn.Normalize(NormalizationForm.FormD);
    StringBuilder sb = new StringBuilder();
    for (int ich = 0; ich < stFormD.Length; ich++)
    {
        UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(stFormD[ich]);
        // Keep everything except the combining marks.
        if (uc != UnicodeCategory.NonSpacingMark)
        {
            sb.Append(stFormD[ich]);
        }
    }
    return sb.ToString().Normalize(NormalizationForm.FormC);
}

string input = "Príliš žlutoucký kun úpel dábelské ódy.";
string result = RemoveDiacritics(input); // "Prilis zlutoucky kun upel dabelske ody."