Greek alphabet encoding issues

    Member_16359559 wrote:

    I'm neither a PHP coder nor a DBA, barely a sysadmin 🙂

    We run an old LAMP stack here which worked fine up until Debian 9 (PHP 5.6.30).
    Recent migrations to Debian 10 and 12 (both PHP 5.6.40 from https://deb.sury.org) came with Greek alphabet encoding issues, not seen earlier.
    These letters always looked "broken" in the database, even on Debian 9 (e.g. gamma - γ)
    They displayed correctly under html in Debian 9, but not later.

    Initially I thought it was:

    SELECT @@character_set_database, @@collation_database;
    +--------------------------+----------------------+
    | @@character_set_database | @@collation_database |
    +--------------------------+----------------------+
    | latin1                   | latin1_swedish_ci    |
    +--------------------------+----------------------+

    in Debian 9

    vs

    +--------------------------+----------------------+
    | @@character_set_database | @@collation_database |
    +--------------------------+----------------------+
    | utf8mb4                  | utf8mb4_general_ci   |
    +--------------------------+----------------------+

    in Debian 10 and 12.

    Changing it from utf8mb4 to latin1 at different stages made no difference.

    Then I thought it might be something in one of the php.ini files, but they all contain:

    default_charset = "UTF-8"

    in the older (working) and newer (broken) stacks.

    Could somebody provide a direction on how to fix it?

      jschell wrote:

      Member 16359559 wrote:

      displayed correctly under html in Debian 9, but not later.

      I would start with the assumptions in there. A stored character has a binary representation. There is no way you can see a character until it is 'translated' by a viewer. It doesn't matter what the viewer is; it still must do that translation. And sometimes there is more than one translation in the pipeline (for example, database to driver, to a 'string', then to 'HTML', then to a browser).
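
To make the 'translation' point concrete, here is a minimal Python sketch (not taken from the stack in question). It assumes UTF-8 and ISO-8859-7 as two possible character sets for gamma, and uses cp1252 to stand in for a viewer that guesses wrong. The helper name `view` is hypothetical.

```python
# The letter gamma as an abstract character (U+03B3).
gamma = "\u03b3"

# The same character has different binary representations
# depending on the character set (both are assumptions here).
utf8_bytes = gamma.encode("utf-8")        # two bytes: CE B3
greek_bytes = gamma.encode("iso8859_7")   # one byte: E3


def view(raw: bytes, charset: str) -> str:
    """Simulate one 'viewer' translating stored bytes into text."""
    return raw.decode(charset, errors="replace")


print(utf8_bytes.hex(), greek_bytes.hex())
print(view(utf8_bytes, "utf-8"))    # viewer agrees with the stored bytes
print(view(utf8_bytes, "cp1252"))   # viewer disagrees: two Western European characters
```

The stored bytes are identical in the last two calls; only the viewer's assumption changes, which is why a value can look "broken" in one tool and correct in another.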

      Member 16359559 wrote:

      Greek alphabet

      That is also an assumption. A 'language' is stored using a character set, which is a binary representation with a specific name. So you really need to determine which one is in use. Then the steps:

      First, extract the binary data from the database, probably using test data, and determine whether the form is correct. That is, does the binary value (hex or binary) match the character set? Note that if you attempt this by displaying a 'character' then you are not doing it correctly. If the binary value is not correct for the character set, then nothing you can do in the code will fix that.

      Second, determine the pipeline from binary to display. This includes the driver between the database and whatever application/code reads it. For each step you must determine whether the binary value retains its original correct value. If you find a step where it is wrong, then you have found where the problem is.

      You should also look for a different way to view the data besides HTML and a browser. Even just initially, that might be the easiest way to determine whether those are the problem.
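
The first step can be sketched in Python, assuming you already have a hex dump of one stored value (for example from MySQL's `HEX()` function) and know which character it should be. The function name `classify` and the list of candidate character sets are hypothetical, and the double-encoding check assumes the intermediate misreading was cp1252.

```python
def classify(hex_value: str, expected: str) -> str:
    """Report how the stored bytes relate to the expected text."""
    raw = bytes.fromhex(hex_value)

    # Compare against candidate character sets (assumed; extend as needed).
    for charset in ("utf-8", "iso8859_7"):
        if raw == expected.encode(charset):
            return charset

    # UTF-8 bytes that were misread as cp1252 and then encoded as UTF-8 again.
    try:
        if raw.decode("utf-8").encode("cp1252").decode("utf-8") == expected:
            return "double-encoded utf-8"
    except (UnicodeDecodeError, UnicodeEncodeError):
        pass

    return "unknown"


for sample in ("CEB3", "E3", "C38EC2B3", "3F"):
    print(sample, classify(sample, "\u03b3"))
```

The check compares bytes against bytes and never relies on how a terminal or browser renders the value. An "unknown" result, such as a stored question mark (3F), suggests the original character was already lost before it reached the database.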
