Topic: character set problems

Offline jodaka

  • Newbie
  • *
  • Posts: 7
character set problems
« on: July 18, 2006, 12:49:15 AM »
I'm Russian, so I have lots of emails in the cp1251 and KOI8-R charsets. For my own mail I've chosen UTF-8. The problem is that RoundCube doesn't show message subjects or bodies correctly if the charset isn't UTF-8.
I don't know how, but SquirrelMail shows everything correctly. All charsets are displayed just right.
In RoundCube, no.

Any ideas?

Or should I attach a screenshot?

Offline flosoft

  • Sr. Member
  • ****
  • Posts: 349
    • http://flosoft.biz
Re: character set problems
« Reply #1 on: July 19, 2006, 09:11:48 AM »
A screenshot would be nice.

Offline jodaka

  • Newbie
  • *
  • Posts: 7
Re: character set problems
« Reply #2 on: July 19, 2006, 11:48:42 PM »
Quote from: flosoft
A screenshot would be nice.
Not a problem :)

I've attached a screenshot where you can see lots of emails with broken subjects (or, more precisely, subjects that are not in UTF-8). Only a few (2 or 3 emails) show their subjects correctly, because they are in UTF-8.
I've also attached two emails to show how most emails are composed here in Russia. Most users here use The Bat! (which composes emails in KOI8-R, or sometimes cp1251, by default) and Outlook (cp1251). Very few users know about UTF-8 and use it.

Offline jodaka

  • Newbie
  • *
  • Posts: 7
Re: character set problems
« Reply #3 on: July 25, 2006, 01:52:18 AM »
Well, after some more digging, with help from this article (http://ru.wikipedia.org/wiki/%D0%9A%D1%80%D0%BE%D0%BA%D0%BE%D0%B7%D1%8F%D0%B1%D1%80%D0%B0), I've found that RoundCube (or maybe Apache) converts all letters to ISO 8859-1. That's why I can't read messages in cp1251 and KOI8-R.

Is there any way to disable charset translation altogether, or to force it to UTF-8 instead of ISO 8859-1?
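
A small illustration of the symptom described here, written in Python rather than RoundCube's PHP. It is only a sketch: the sample word and variable names are made up for the example. It shows how KOI8-R bytes turn into unreadable text when they are interpreted as ISO 8859-1, and that the original bytes are still recoverable.

```python
# Sketch: what happens when KOI8-R bytes are read as ISO 8859-1.
# The sample text and variable names are illustrative assumptions.

original = "тема"                         # Russian for "subject/topic"
wire_bytes = original.encode("koi8-r")    # bytes as a KOI8-R mail client would send them

# Interpreting the bytes with the wrong charset gives garbage characters
wrong = wire_bytes.decode("iso-8859-1")

# Interpreting them with the right charset gives the original text
right = wire_bytes.decode("koi8-r")

# ISO 8859-1 maps every byte to one character, so the damage is reversible:
# re-encode as ISO 8859-1 to get the bytes back, then decode properly.
recovered = wrong.encode("iso-8859-1").decode("koi8-r")

print(repr(wrong), repr(right), repr(recovered))
```

The point is that nothing is lost on the wire; the text only looks broken because the wrong source charset was assumed during conversion.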

Offline yllar

  • Full Member
  • ***
  • Posts: 106
Re: character set problems
« Reply #4 on: July 25, 2006, 03:23:57 AM »
Quote from: jodaka
Well, after some more digging, with help from this article (http://ru.wikipedia.org/wiki/%D0%9A%D1%80%D0%BE%D0%BA%D0%BE%D0%B7%D1%8F%D0%B1%D1%80%D0%B0), I've found that RoundCube (or maybe Apache) converts all letters to ISO 8859-1. That's why I can't read messages in cp1251 and KOI8-R.

Is there any way to disable charset translation altogether, or to force it to UTF-8 instead of ISO 8859-1?
Look at http://httpd.apache.org/docs/2.0/mod/core.html#adddefaultcharset
irc://irc.freenode.net:6667/#roundcube

Offline jodaka

  • Newbie
  • *
  • Posts: 7
Re: character set problems
« Reply #5 on: July 25, 2006, 05:26:30 AM »
I've already tried setting AddDefaultCharset in .htaccess, but that doesn't help.
Looks like the problem is in RoundCube itself, because SquirrelMail works well on the same host (and all charsets are shown correctly).

Offline sadgin

  • Newbie
  • *
  • Posts: 4
Re: character set problems
« Reply #6 on: July 26, 2006, 02:30:33 AM »
Quote from: jodaka
I've already tried setting AddDefaultCharset in .htaccess, but that doesn't help.
Looks like the problem is in RoundCube itself, because SquirrelMail works well on the same host (and all charsets are shown correctly).
I think it does not depend on the Apache charset.
I see that RoundCube doesn't decode subjects and other headers that are sent as raw 8-bit text instead of MIME encoded words (=?charset?B?...?= or =?charset?Q?...?=).
E.g. the following subject will not show correctly, even if the mail has a Content-Type header with a character set:
Subject: Уведомление об ответах на подписанную тему
But the following subject will show correctly:
Subject: =?koi8-r?B?REVMaXQ6IPDP0M/MzsXOydEgxs/Oz9TFy8kgKDI1IMnAzNEgMjAwNiDHz8TBKQ==?=
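
To make the difference between the two subjects concrete, here is a minimal Python sketch (not RoundCube's PHP code). The function name `decode_subject` and its `fallback_charset` parameter are hypothetical. An encoded word carries its own charset label, so it can always be decoded; a raw 8-bit header carries none, so the decoder has to be told which charset to assume.

```python
from email.header import decode_header

def decode_subject(raw: bytes, fallback_charset: str = "iso-8859-1") -> str:
    """Decode a Subject value given as raw bytes (hypothetical helper)."""
    try:
        text = raw.decode("ascii")
    except UnicodeDecodeError:
        # Raw 8-bit header: no charset label inside the header itself,
        # so we can only guess or use a charset supplied by the caller.
        return raw.decode(fallback_charset, errors="replace")

    # Pure ASCII: may contain encoded words that name their own charset.
    parts = []
    for chunk, charset in decode_header(text):
        if isinstance(chunk, bytes):
            parts.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            parts.append(chunk)
    return "".join(parts)

# The encoded subject quoted in this post
ENCODED = b"=?koi8-r?B?REVMaXQ6IPDP0M/MzsXOydEgxs/Oz9TFy8kgKDI1IMnAzNEgMjAwNiDHz8TBKQ==?="
# A raw 8-bit subject, as a KOI8-R client might send it without encoding
RAW_8BIT = "Уведомление".encode("koi8-r")

print(decode_subject(ENCODED))                               # readable: charset is in the header
print(decode_subject(RAW_8BIT))                              # garbled: wrong default assumed
print(decode_subject(RAW_8BIT, fallback_charset="koi8-r"))   # readable again
```

With the ISO 8859-1 default, only the encoded-word subject comes out right, which matches the behaviour described above.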

Offline jodaka

  • Newbie
  • *
  • Posts: 7
Re: character set problems
« Reply #7 on: July 26, 2006, 05:42:18 AM »
Quote from: sadgin
E.g. the following subject will not show correctly, even if the mail has a Content-Type header with a character set:
Subject: Уведомление об ответах на подписанную тему
But the following subject will show correctly:
Subject: =?koi8-r?B?REVMaXQ6IPDP0M/MzsXOydEgxs/Oz9TFy8kgKDI1IMnAzNEgMjAwNiDHz8TBKQ==?=

Oh... looks like you're right.
So it's a RoundCube problem after all. Should I write a feature request?
Or maybe try to look at the SquirrelMail sources and port the decoding to RoundCube...

Offline sadgin

  • Newbie
  • *
  • Posts: 4
Re: character set problems
« Reply #8 on: July 27, 2006, 02:41:41 AM »
Quote from: jodaka
Oh... looks like you're right.
So it's a RoundCube problem after all. Should I write a feature request?
Or maybe try to look at the SquirrelMail sources and port the decoding to RoundCube...
You may try the following patch against today's SVN:


diff -urN roundcubemail/program/include/rcube_imap.inc mail/program/include/rcube_imap.inc
--- roundcubemail/program/include/rcube_imap.inc    2006-07-20 16:06:39.000000000 +0700
+++ mail/program/include/rcube_imap.inc 2006-07-27 13:39:09.000000000 +0700
@@ -61,6 +61,7 @@
  var $capabilities = array();
  var $skip_deleted = FALSE;
  var $debug_level = 1;
+ var $charset = '';


  /**
@@ -84,6 +85,9 @@
   $this->__construct($db_conn);
   }

+ function scharset($chars) {
+   $this->charset = $chars;
+  }

  /**
  * Connect to an IMAP server
@@ -1737,7 +1740,7 @@
    }

   // no encoding information, defaults to what is specified in the class header
-  return rcube_charset_convert($input, 'ISO-8859-1');
+  return rcube_charset_convert($input, $this->charset);
   }


diff -urN roundcubemail/program/steps/mail/func.inc mail/program/steps/mail/func.inc
--- roundcubemail/program/steps/mail/func.inc  2006-07-27 12:51:13.000000000 +0700
+++ mail/program/steps/mail/func.inc  2006-07-27 13:37:55.000000000 +0700
@@ -434,6 +434,7 @@
   // format each col
   foreach ($a_show_cols as $col)
    {
+    $IMAP->scharset($header->charset);
    if ($col=='from' || $col=='to')
     $cont = rep_specialchars_output(rcmail_address_string($header->$col, 3, $attrib['addicon']));
    else if ($col=='subject')
@@ -1017,6 +1018,7 @@
   if (!$headers[$hkey])
    continue;

+  $IMAP->scharset($headers['charset']);
   if ($hkey=='date' && !empty($headers[$hkey]))
    $header_value = format_date(strtotime($headers[$hkey]));
   else if (in_array($hkey, array('from', 'to', 'cc', 'bcc', 'reply-to')))
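
The idea of the patch, as far as it can be read from the diff: instead of always assuming ISO-8859-1 for headers without encoding information, use the charset declared in the message's own Content-Type header. Below is a minimal Python sketch of that idea, not a port of the PHP code. The helper names are hypothetical, and falling back to ISO-8859-1 when no charset is declared is an assumption of this sketch (the patch itself passes the charset through as-is, even when it is empty).

```python
from email.message import Message

DEFAULT_CHARSET = "iso-8859-1"  # the hard-coded default that the patch replaces

def charset_from_content_type(content_type: str) -> str:
    """Return the charset parameter of a Content-Type value, or '' if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or ""

def decode_raw_header(raw: bytes, message_charset: str) -> str:
    """Decode an unencoded 8-bit header using the message's declared charset."""
    # Assumption of this sketch: use the old default when nothing is declared
    charset = message_charset or DEFAULT_CHARSET
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset name or invalid bytes: fall back to the old default
        return raw.decode(DEFAULT_CHARSET)

declared = charset_from_content_type('text/plain; charset="KOI8-R"')
print(declared)
print(decode_raw_header("тема".encode("koi8-r"), declared))
```

If the message has no charset in its Content-Type header, this approach cannot help, and the header is decoded with the old default.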

Offline jodaka

  • Newbie
  • *
  • Posts: 7
Re: character set problems
« Reply #9 on: July 27, 2006, 07:19:46 AM »
Well,
I've just tried this patch on the daily SVN (282), but nothing seems to change...
Subjects and messages are still unreadable.

Offline sadgin

  • Newbie
  • *
  • Posts: 4
Re: character set problems
« Reply #10 on: July 27, 2006, 07:36:27 AM »
Quote from: jodaka
Well,
I've just tried this patch on the daily SVN (282), but nothing seems to change...
Subjects and messages are still unreadable.
Hm... strange. I am using the same revision.
You may try inserting print "$this->charset";
before
return rcube_charset_convert($input, $this->charset);
to see whether a charset is present for a given message.

Offline jodaka

  • Newbie
  • *
  • Posts: 7
Re: character set problems
« Reply #11 on: July 27, 2006, 07:43:55 AM »
Added print "[$this->charset]";
and... see the screenshot.

Offline sadgin

  • Newbie
  • *
  • Posts: 4
Re: character set problems
« Reply #12 on: July 27, 2006, 08:06:42 AM »
Quote from: jodaka
Added print "[$this->charset]";
and... see the screenshot.
Does this message contain a Content-Type header with a charset field?

Offline valqk

  • Newbie
  • *
  • Posts: 4
Re: character set problems
« Reply #13 on: November 03, 2006, 08:47:41 AM »
Just a quick overview of the current state of the Cyrillic encoding (windows-1251, cp1251) problem.

The current SVN version is alive and kicking, converting all encodings.

I started debugging RoundCube because the conversion wasn't working, but... DAMN.

Nowhere did I see that iconv.so is required.

So, the iconv extension _IS REQUIRED_ if you want to convert from one encoding to another.
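
For readers who want to see the conversion step in isolation, here is a rough Python analogue of what a charset conversion library such as iconv does (this is not RoundCube code, and the function name `to_utf8` is hypothetical). It also checks up front that a converter for the source charset is available, which is the kind of check that would have made the missing dependency obvious.

```python
import codecs

def to_utf8(raw: bytes, source_charset: str) -> bytes:
    """Convert bytes from source_charset to UTF-8 (hypothetical helper)."""
    try:
        # Fail loudly if no converter exists for this charset name
        codecs.lookup(source_charset)
    except LookupError:
        raise ValueError(f"no converter available for charset {source_charset!r}")
    # Decode from the source charset, then re-encode as UTF-8
    return raw.decode(source_charset).encode("utf-8")

cp1251_bytes = "Привет".encode("cp1251")
utf8_bytes = to_utf8(cp1251_bytes, "windows-1251")
print(utf8_bytes.decode("utf-8"))
```

Failing with a clear error when the converter is missing is a design choice of this sketch; silently returning the input unchanged would reproduce the unreadable subjects discussed in this thread.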