Roundcube Community Forum

SVN Releases => Issues & Bugs => Topic started by: jodaka on July 18, 2006, 12:49:15 AM

Title: character set problems
Post by: jodaka on July 18, 2006, 12:49:15 AM
I'm Russian, so I receive lots of emails in the cp1251 and KOI8-R charsets. For myself I've chosen UTF-8. The problem is that Roundcube doesn't display message subjects or bodies correctly if the charset isn't UTF-8.
I don't know how, but SquirrelMail shows everything correctly. All charsets are displayed just right.
Roundcube does not.

Any ideas?

Or should I attach a screenshot?
Title: Re: character set problems
Post by: flosoft on July 19, 2006, 09:11:48 AM
It would be nice to have a screenshot.
Title: Re: character set problems
Post by: jodaka on July 19, 2006, 11:48:42 PM
Quote from: flosoft
It would be nice to have a screenshot.
Not a problem :)

I've included a screenshot where you can see lots of emails with broken subjects (or, more accurately, subjects that are not in UTF-8). Only a few (2 or 3 emails) show their subjects correctly, because they are in UTF-8.
I've also attached two emails to show you how most emails are composed here in Russia. Most users here use The Bat! (which composes emails in KOI8-R, or sometimes cp1251, by default) and Outlook (cp1251). Very few users know about UTF-8 and use it.
Title: Re: character set problems
Post by: jodaka on July 25, 2006, 01:52:18 AM
Well, after some more digging with help from this article (http://ru.wikipedia.org/wiki/%D0%9A%D1%80%D0%BE%D0%BA%D0%BE%D0%B7%D1%8F%D0%B1%D1%80%D0%B0), I've found that RoundCube (or maybe Apache) converts all letters to ISO 8859-1. That's why I can't read messages in cp1251 and KOI8-R.

Is there any way to disable charset translation entirely, or to force it to UTF-8 instead of ISO 8859-1?
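
Editor's illustration of the symptom described above: bytes written in cp1251 but interpreted as ISO 8859-1 come out as accented Latin characters instead of Cyrillic. This is a minimal sketch in Python rather than Roundcube's PHP, and the sample word is a made-up example, not taken from the attached emails.

```python
# Hypothetical sample text; any Cyrillic string shows the same effect.
original = "Привет"

# The bytes as a cp1251 mail client (for example Outlook) would send them.
raw = original.encode("cp1251")

# Wrong: interpreting those bytes as ISO 8859-1 gives accented Latin letters.
garbled = raw.decode("iso-8859-1")

# Right: decoding with the charset the sender actually used.
restored = raw.decode("cp1251")

print(garbled)
print(restored)
```

No bytes are lost in the wrong decode, so the text can be recovered as long as the real source charset is known.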
Title: Re: character set problems
Post by: yllar on July 25, 2006, 03:23:57 AM
Quote from: jodaka
Well, after some more digging with help from this article (http://ru.wikipedia.org/wiki/%D0%9A%D1%80%D0%BE%D0%BA%D0%BE%D0%B7%D1%8F%D0%B1%D1%80%D0%B0), I've found that RoundCube (or maybe Apache) converts all letters to ISO 8859-1. That's why I can't read messages in cp1251 and KOI8-R.

Is there any way to disable charset translation entirely, or to force it to UTF-8 instead of ISO 8859-1?
Look at http://httpd.apache.org/docs/2.0/mod/core.html#adddefaultcharset
Title: Re: character set problems
Post by: jodaka on July 25, 2006, 05:26:30 AM
I've already tried setting AddDefaultCharset in .htaccess, but that doesn't help.
It looks like the problem is in RoundCube itself, because SquirrelMail works well on the same host (and all charsets are shown correctly).
Title: Re: character set problems
Post by: sadgin on July 26, 2006, 02:30:33 AM
Quote from: jodaka
I've already tried setting AddDefaultCharset in .htaccess, but that doesn't help.
It looks like the problem is in RoundCube itself, because SquirrelMail works well on the same host (and all charsets are shown correctly).
I don't think it depends on the Apache charset.
I see that Roundcube doesn't decode subjects and other headers that are sent as raw 8-bit text rather than as MIME encoded-words (Base64 or quoted-printable).
For example,
the following subject will not display correctly, even if the mail has a Content-Type header with a character set:
Subject: Уведомление об ответах на подписанную тему
But the following subject will display correctly:
Subject: =?koi8-r?B?REVMaXQ6IPDP0M/MzsXOydEgxs/Oz9TFy8kgKDI1IMnAzNEgMjAwNiDHz8TBKQ==?=
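
Editor's illustration: the second subject is an RFC 2047 encoded-word, which names its own charset (koi8-r) and encoding (B, meaning Base64), so a mail client can decode it without guessing. A minimal sketch using Python's standard `email.header` module, shown only to make the format concrete; it is not how Roundcube itself decodes headers.

```python
from email.header import decode_header

# The encoded subject quoted in the post above.
encoded = "=?koi8-r?B?REVMaXQ6IPDP0M/MzsXOydEgxs/Oz9TFy8kgKDI1IMnAzNEgMjAwNiDHz8TBKQ==?="

# decode_header returns (chunk, charset) pairs; encoded chunks come back as bytes.
parts = decode_header(encoded)

subject = "".join(
    chunk.decode(charset or "ascii") if isinstance(chunk, bytes) else chunk
    for chunk, charset in parts
)

print(subject)
```

The first subject carries no such marker, so its bytes can only be interpreted correctly by falling back to the charset declared elsewhere in the message.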
Title: Re: character set problems
Post by: jodaka on July 26, 2006, 05:42:18 AM
Quote from: sadgin
The following subject will not display correctly, even if the mail has a Content-Type header with a character set:
Subject: Уведомление об ответах на подписанную тему
But the following subject will display correctly:
Subject: =?koi8-r?B?REVMaXQ6IPDP0M/MzsXOydEgxs/Oz9TFy8kgKDI1IMnAzNEgMjAwNiDHz8TBKQ==?=

Oh... looks like you're right.
So it is a Roundcube problem after all. Should I write a feature request?
Or maybe try to look at the SquirrelMail sources and port that to Roundcube...
Title: Re: character set problems
Post by: sadgin on July 27, 2006, 02:41:41 AM
Quote from: jodaka
Oh... looks like you're right.
So it is a Roundcube problem after all. Should I write a feature request?
Or maybe try to look at the SquirrelMail sources and port that to Roundcube...
You may try the following patch against today's SVN:


diff -urN roundcubemail/program/include/rcube_imap.inc mail/program/include/rcube_imap.inc
--- roundcubemail/program/include/rcube_imap.inc    2006-07-20 16:06:39.000000000 +0700
+++ mail/program/include/rcube_imap.inc 2006-07-27 13:39:09.000000000 +0700
@@ -61,6 +61,7 @@
  var $capabilities = array();
  var $skip_deleted = FALSE;
  var $debug_level = 1;
+ var $charset = '';


  /**
@@ -84,6 +85,9 @@
   $this->__construct($db_conn);
   }

+ function scharset($chars) {
+   $this->charset = $chars;
+  }

  /**
  * Connect to an IMAP server
@@ -1737,7 +1740,7 @@
    }

   // no encoding information, defaults to what is specified in the class header
-  return rcube_charset_convert($input, 'ISO-8859-1');
+  return rcube_charset_convert($input, $this->charset);
   }


diff -urN roundcubemail/program/steps/mail/func.inc mail/program/steps/mail/func.inc
--- roundcubemail/program/steps/mail/func.inc  2006-07-27 12:51:13.000000000 +0700
+++ mail/program/steps/mail/func.inc  2006-07-27 13:37:55.000000000 +0700
@@ -434,6 +434,7 @@
   // format each col
   foreach ($a_show_cols as $col)
    {
+    $IMAP->scharset($header->charset);
    if ($col=='from' || $col=='to')
     $cont = rep_specialchars_output(rcmail_address_string($header->$col, 3, $attrib['addicon']));
    else if ($col=='subject')
@@ -1017,6 +1018,7 @@
   if (!$headers[$hkey])
    continue;

+  $IMAP->scharset($headers['charset']);
   if ($hkey=='date' && !empty($headers[$hkey]))
    $header_value = format_date(strtotime($headers[$hkey]));
   else if (in_array($hkey, array('from', 'to', 'cc', 'bcc', 'reply-to')))
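
Editor's illustration of the idea behind the patch: when a header contains no encoded-word, decode it with the charset the message declares instead of a hard-coded ISO-8859-1. This is a minimal sketch in Python, not the PHP above; the names `decode_raw_header` and `message_charset` are hypothetical, and the fallback for an empty or unknown charset is an assumption added here that the patch itself does not include.

```python
def decode_raw_header(raw: bytes, message_charset: str = "") -> str:
    """Decode an unencoded 8-bit header using the message's own charset."""
    # Assumption: use ISO-8859-1 only when the message declares no charset,
    # which matches the default that the patch replaces.
    charset = message_charset or "iso-8859-1"
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset name or bytes that do not fit it:
        # ISO-8859-1 maps every byte, so this decode cannot fail.
        return raw.decode("iso-8859-1")


# Hypothetical raw header bytes as a KOI8-R client would send them.
raw_subject = "Уведомление".encode("koi8-r")
print(decode_raw_header(raw_subject, "koi8-r"))
```

If the message declares no charset at all, the result is the same garbled ISO-8859-1 text as before, which may be worth checking when the patch appears to have no effect.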
Title: Re: character set problems
Post by: jodaka on July 27, 2006, 07:19:46 AM
Well,
I've just tried this patch on the daily SVN (282), but nothing seems to change...
Subjects and messages are still unreadable.
Title: Re: character set problems
Post by: sadgin on July 27, 2006, 07:36:27 AM
Quote from: jodaka
Well,
I've just tried this patch on the daily SVN (282), but nothing seems to change...
Subjects and messages are still unreadable.
Hm.. strange... I am using the same revision.
You may try inserting print "$this->charset";
before
return rcube_charset_convert($input, $this->charset);
to see whether a charset is present in some message.
Title: Re: character set problems
Post by: jodaka on July 27, 2006, 07:43:55 AM
I added print "[$this->charset]";
and... see the screenshot.
Title: Re: character set problems
Post by: sadgin on July 27, 2006, 08:06:42 AM
Quote from: jodaka
I added print "[$this->charset]";
and... see the screenshot.
Does this message contain a Content-Type header with a charset field?
Title: Re: character set problems
Post by: valqk on November 03, 2006, 08:47:41 AM
Just a quick overview of the current state of the Cyrillic encoding (windows-1251, cp1251) problem.

The current SVN version is up and running, translating all encodings.

I started debugging Roundcube because the conversion wasn't working, but... DAMN.

Nowhere did I see that iconv.so is required.

So, the iconv extension _IS REQUIRED_ if you want to convert from one character encoding to another.