
charset problem

(7 posts)

  1. richard

    Administrator

    I've been making progress on a comprehensive approach to character set management, and have learned a lot. In a nutshell, all the layers have to match: HTML, XML, PHP, MySQL (database and connection) and, in the case of FPDF, the fonts.

    Through SynApp2 1.8.0 (beta 4), there are some discontinuities. The HTML pages and XML use utf-8, but the database connection and data manipulation assume latin-1. A conversion layer, implemented by encode_entities() and decode_entites(), compensates. This explicit conversion layer turns out to be unnecessary if all the pieces agree on encoding and character set.
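
    To make the mismatch concrete, here is a minimal sketch in Python (not SynApp2 code, and not how encode_entities() works) of what happens when bytes written as utf-8 are read by a layer that assumes latin-1. The variable names are made up for illustration.

```python
# Illustration only: the same bytes read under two different assumptions.
text = "é"                          # U+00E9
utf8_bytes = text.encode("utf-8")   # two bytes: 0xC3 0xA9

# A layer that assumes latin-1 treats each byte as its own character.
misread = utf8_bytes.decode("latin-1")

# When every layer agrees on utf-8, the round trip is lossless.
roundtrip = utf8_bytes.decode("utf-8")

print(repr(misread), repr(roundtrip))
```

    The misread value is two characters instead of one, which is the kind of damage a conversion layer has to paper over when the layers disagree.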

    See MySQL and UTF-8 Notes: http://www.phpwact.org/php/i18n/utf-8/mysql

    There's a significant difference between utf-8 and many, if not all, of the ISO character sets like latin-1, latin-2, etc.: the number of bytes per character. UTF-8 is an encoding of Unicode in which a character may take more than one byte, whereas the latin-N character sets, for example, represent every character with a single byte, sharing the ASCII range and differing in the codes above it.

    In order for a client (browser) or database to understand what character is represented by a specific single-byte code (e.g., does \xA1 represent '¡' or 'Ą'), the character set must be known. The same code represents a different character depending on the character set. This also implies that the single-byte character sets can't really be mixed. You can have latin-1 or latin-2, but not both. With utf-8 each character has its own code, so characters (languages) can be mixed.
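
    The \xA1 example can be checked directly. The following is a small Python sketch, with illustrative variable names, that decodes the same byte under both character sets and then shows that the two resulting characters fit together in utf-8 but not in latin-1.

```python
code = b"\xa1"

as_latin1 = code.decode("iso-8859-1")   # inverted exclamation mark
as_latin2 = code.decode("iso-8859-2")   # A with ogonek

# Both characters can live in one utf-8 string, each with its own byte sequence.
mixed = as_latin1 + as_latin2
mixed_utf8 = mixed.encode("utf-8")

# The same string cannot be stored as latin-1: the second character has no code there.
try:
    mixed.encode("iso-8859-1")
    fits_in_latin1 = True
except UnicodeEncodeError:
    fits_in_latin1 = False

print(as_latin1, as_latin2, mixed_utf8, fits_in_latin1)
```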

    Handling multi-byte characters correctly requires some additional care and attention to detail. Functions that handle data at the application level must be implemented appropriately. In the case of PHP particularly, string functions that count, search, or manipulate multi-byte character data must be suited to the task.
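
    As a rough illustration of why this matters, the Python sketch below compares counting and truncating by bytes with counting and truncating by characters. In PHP the analogous distinction is between strlen(), which counts bytes, and mb_strlen(), which counts characters; which functions SynApp2 actually needs to change is not stated here, and the sample word is just an example.

```python
word = "żółw"                  # four characters, three of them non-ASCII
raw = word.encode("utf-8")

char_count = len(word)         # counts characters
byte_count = len(raw)          # counts bytes

# Truncating by characters keeps whole characters.
first_two_chars = word[:2]

# Truncating by bytes can cut a character in half and leave invalid utf-8.
try:
    raw[:3].decode("utf-8")
    byte_cut_is_valid = True
except UnicodeDecodeError:
    byte_cut_is_valid = False

print(char_count, byte_count, first_two_chars, byte_cut_is_valid)
```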

    The changes needed for character set management fall into several distinct areas:

    1. configuration - designation of character set/encoding
    2. generating html markup with correct meta tag charset
    3. returning correct xml response encoding
    4. use correct data processing/manipulation functions
    5. implement utf-8/unicode multilingual character set and fonts for FPDF (with tFPDF)
    6. translate database results (as needed) to utf-8 for FPDF reports

    This should all be reasonably straightforward and contained.
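
    One way to picture items 1 through 3 and item 6 is a single configured charset that every output layer reads from. The Python sketch below is only that, a picture: CHARSET, html_meta_tag(), xml_declaration() and to_utf8() are hypothetical names, not SynApp2 functions, and the real implementation is in PHP.

```python
from html import escape

CHARSET = "utf-8"  # hypothetical single configuration value (item 1)


def html_meta_tag(charset: str = CHARSET) -> str:
    """Build the meta tag that declares the page charset (item 2)."""
    return (
        '<meta http-equiv="Content-Type" '
        f'content="text/html; charset={escape(charset)}">'
    )


def xml_declaration(charset: str = CHARSET) -> str:
    """Build the XML declaration for a response (item 3)."""
    return f'<?xml version="1.0" encoding="{escape(charset)}"?>'


def to_utf8(raw: bytes, source_charset: str) -> bytes:
    """Translate bytes from a single-byte charset to utf-8 (item 6)."""
    return raw.decode(source_charset).encode("utf-8")


print(html_meta_tag())
print(xml_declaration())
print(to_utf8(b"\xa1", "iso-8859-2"))
```

    Driving every layer from one value means a site that needs latin-2 changes one setting rather than hunting for hard-coded charset names.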

    The ball is rolling.

    -Richard

    Posted 7 years ago #

  2. richard

    Administrator

    SynApp2 version 1.8.1 has support for ISO-8859-2 (latin2) and other single-byte character sets.

    Posted 7 years ago #
