GB2312, ISO-2022-KR

Vladimir Gorpenko
I managed to clarify something. Perhaps it will be of interest.

If you remember, I have two servers; I will call them W and T. T
displays messages in these encodings correctly, while W shows something
that looks like Arabic characters.

1. When converting between encodings, RC first tries to use iconv.
Since PHP's iconv is an interface to the Linux iconv utility, it
doesn't work on server W, where chroot is used. On server T, iconv
converts both of these encodings correctly.

2. As a second step, RC tries to use mbstring for the conversion.
mbstring apparently doesn't understand GB2312 (GBK), but does
understand ISO-2022-KR. Nevertheless, mbstring doesn't work on server
W either. I think the reason is also somehow connected to chroot.

I suppose RC has problems with various encodings when running under
chroot, though I still hope to get iconv and mbstring working.
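
The conversion order described above (iconv first, mbstring second) can
be sketched roughly as follows. This is only an illustration under
stated assumptions, not Roundcube's actual code: the function name
convert_charset_sketch() is made up, and the real logic lives in
rcube_charset.php.

```php
<?php
// Hypothetical helper (not part of Roundcube): try iconv first, then
// fall back to mbstring, and return false if neither one succeeds.
function convert_charset_sketch($str, $from, $to = 'UTF-8')
{
    if (function_exists('iconv')) {
        // //IGNORE asks iconv to drop characters it cannot convert.
        // The @ hides the notice raised when iconv cannot open a
        // converter for the requested charset.
        $out = @iconv($from, $to . '//IGNORE', $str);
        if ($out !== false) {
            return $out;
        }
    }

    if (function_exists('mb_convert_encoding')) {
        try {
            // PHP 8 throws ValueError for an unknown encoding name;
            // older versions emit a warning and return false.
            $out = @mb_convert_encoding($str, $to, $from);
        } catch (\ValueError $e) {
            $out = false;
        }
        if ($out !== false) {
            return $out;
        }
    }

    return false;
}
```

A false return value means that both code paths failed, which would
match the situation on server W.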

--
Best regards,
    Vladimir Gorpenko
_______________________________________________
Roundcube Development discussion mailing list
[hidden email]
http://lists.roundcube.net/mailman/listinfo/dev

Re: GB2312, ISO-2022-KR

Kyle Francis

Vladimir,

Is it possible to add iconv to your chroot? Are you permitted to, given that this is a production machine? If so, that seems to be your best option. It looks as though you will need to copy the /usr/lib/gconv directory into your chroot root as well.

See https://bugs.php.net/bug.php?id=44096 towards the end of the page.

Kyle

Re: GB2312, ISO-2022-KR

Rimas Kudelis
In reply to this post by Vladimir Gorpenko
Hi,

2016-09-26 23:41, Vladimir Gorpenko wrote:
> 1. When converting between encodings, RC first tries to use iconv.
> Since PHP's iconv is an interface to the Linux iconv utility, it
> doesn't work on server W, where chroot is used. On server T, iconv
> converts both of these encodings correctly.

Small correction: I think it's actually an interface to a _library_, not
_utility_. I don't think you need /usr/bin/iconv available in your chroot.

Regards,
Rimas

Re: GB2312, ISO-2022-KR

Vladimir Gorpenko
In reply to this post by Kyle Francis

Many thanks to all.

The problem was indeed solved by copying /usr/lib64/gconv into /chroot. Why mbstring didn't work, and whether it works now, I don't know, but iconv evidently works.
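
A quick way to check whether iconv can open the converters in question is a small script like the one below, run through the same PHP setup as the web server (that is, inside the chroot). It is only a sketch: the helper name iconv_supports() is hypothetical, and the charset list is just the set of names mentioned in this thread.

```php
<?php
// Hypothetical diagnostic helper: returns true if iconv can convert
// from $charset to UTF-8 in the current environment.
function iconv_supports($charset)
{
    // A short ASCII string is enough; the point is only to find out
    // whether a converter for $charset can be opened at all.
    // The @ hides the notice raised for an unavailable charset.
    return @iconv($charset, 'UTF-8', 'test') !== false;
}

// Report the status of each charset discussed in this thread.
foreach (array('GB2312', 'GBK', 'GB18030', 'ISO-2022-KR') as $charset) {
    echo $charset, ': ', iconv_supports($charset) ? 'ok' : 'unavailable', "\n";
}
```

If a charset is reported as unavailable inside the chroot but works outside it, missing conversion modules are a likely cause.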

---
Best regards,
   Vladimir Gorpenko

 

Re: GB2312, ISO-2022-KR

Vladimir Gorpenko
In reply to this post by Rimas Kudelis
Hi!

Thanks for this note. I was indeed going to copy /usr/bin/iconv into
the chroot, and was worried about whether it could be launched under
Apache's chroot.

---
Best regards,
    Vladimir Gorpenko

Re: GB2312, ISO-2022-KR

A.L.E.C
In reply to this post by Vladimir Gorpenko
On 26.09.2016 22:41, Vladimir Gorpenko wrote:
> 2. As a second step, RC tries to use mbstring for the conversion.
> mbstring apparently doesn't understand GB2312 (GBK), but does
> understand ISO-2022-KR. Nevertheless, mbstring doesn't work on server
> W either. I think the reason is also somehow connected to chroot.

I'm curious whether this patch would fix the GB2312 issue for the
mbstring path.

--- a/program/lib/Roundcube/rcube_charset.php
+++ b/program/lib/Roundcube/rcube_charset.php
@@ -39,8 +39,8 @@ class rcube_charset
         'UNKNOWN'       => 'ISO-8859-15',
         'USERDEFINED'   => 'ISO-8859-15',
         'KSC56011987'   => 'EUC-KR',
-        'GB2312'        => 'GBK',
-        'GB231280'      => 'GBK',
+        'GB2312'        => 'GB18030',
+        'GB231280'      => 'GB18030',
         'UNICODE'       => 'UTF-8',
         'UTF7IMAP'      => 'UTF7-IMAP',
         'TIS620'        => 'WINDOWS-874',
@@ -51,7 +51,7 @@ class rcube_charset
         '128'           => 'SHIFT-JIS',
         '129'           => 'CP949',
         '130'           => 'CP1361',
-        '134'           => 'GBK',
+        '134'           => 'GB18030',
         '136'           => 'BIG5',
         '161'           => 'WINDOWS-1253',
         '162'           => 'WINDOWS-1254',


--
Aleksander 'A.L.E.C' Machniak
Kolab Groupware Developer         [http://kolab.org]
Roundcube Webmail Developer   [http://roundcube.net]
----------------------------------------------------
PGP: 19359DC1 # Blog: https://kolabian.wordpress.com

Re: GB2312, ISO-2022-KR

Vladimir Gorpenko
Yes, it works.

The tests were carried out on server T, which has no chroot.

I commented out the statements that call iconv and applied these
changes. GB2312 was converted correctly.

It seems mb_check_encoding returns false.

I then uncommented iconv and confirmed that iconv works correctly with
both GB18030 and ISO-2022-KR.

Should the following line be added as well?
    'GBK' => 'GB18030',

---
Best regards,
    Vladimir Gorpenko

Re: GB2312, ISO-2022-KR

A.L.E.C
On 27.09.2016 13:30, Vladimir Gorpenko wrote:

> Should the following line be added as well?
>    'GBK' => 'GB18030',

It might be, indeed. Could you provide samples that fail without the
patch for both GB2312 and ISO-2022-KR, so I could investigate more?

Re: GB2312, ISO-2022-KR

Vladimir Gorpenko
I can send an example of ISO-2022-KR that mbstring can't process.
I will send it to your address in a separate email.

As for GB2312, I suppose there is nothing to investigate. iconv
converts it correctly; mbstring doesn't support that encoding name,
but after the renaming it also converts it correctly.

---
Best regards,
    Vladimir Gorpenko

Re: GB2312, ISO-2022-KR

A.L.E.C
On 27.09.2016 14:04, Vladimir Gorpenko wrote:
> I can send an example of ISO-2022-KR that mbstring can't process.
> I will send it to your address in a separate email.
>
> As for GB2312, I suppose there is nothing to investigate. iconv
> converts it correctly; mbstring doesn't support that encoding name,
> but after the renaming it also converts it correctly.

I found some sources saying that mbstring supports the "GBK" name, but
we probably should not compare the encoding name against the
mb_list_encodings() result, as it looks like it does not return all
supported encodings. So I'm just looking for the most universal
solution.

Re: GB2312, ISO-2022-KR

A.L.E.C
In reply to this post by Vladimir Gorpenko
On 09/27/2016 02:04 PM, Vladimir Gorpenko wrote:
> I can send an example of ISO-2022-KR that mbstring can't process.
> I will send it to your address in a separate email.

I commented out the iconv code path and wasn't able to reproduce the
issue. I'm using PHP 7.

> As for GB2312, I suppose there is nothing to investigate. iconv
> converts it correctly; mbstring doesn't support that encoding name,
> but after the renaming it also converts it correctly.

Could you confirm that it works with
https://github.com/roundcube/roundcubemail/commit/42ddfe5ec9f0294bb3c44b6f7a9a0b205e951c45
instead of the previous patch?

Re: GB2312, ISO-2022-KR

Vladimir Gorpenko
Good. I use PHP 5.6. Evidently this error is fixed in PHP 7.

Unfortunately, I can't run the test you are asking about. I tried to
apply these changes, but in my version of RC the place you refer to as
lines 244-247 looks different:

             // return if encoding found, string matches encoding and convert succeeded
             if (in_array($mb_from, $mbstring_list) && in_array($mb_to, $mbstring_list)) {
                 if (mb_check_encoding($str, $mb_from)) {
                     // Do the same as //IGNORE with iconv
                     mb_substitute_character('none');
                     $out = mb_convert_encoding($str, $mb_to, $mb_from);
                     mb_substitute_character($mbstring_sch);

                     if ($out !== false) {
                         return $out;
                     }
                 }
             }

I hesitate to adapt your fix to my version of rcube_charset.

---
Best regards,
    Vladimir Gorpenko

Re: GB2312, ISO-2022-KR

A.L.E.C
On 09/27/2016 05:37 PM, Vladimir Gorpenko wrote:

>             // return if encoding found, string matches encoding and convert succeeded
>             if (in_array($mb_from, $mbstring_list) && in_array($mb_to, $mbstring_list)) {
>                 if (mb_check_encoding($str, $mb_from)) {
>                     // Do the same as //IGNORE with iconv
>                     mb_substitute_character('none');
>                     $out = mb_convert_encoding($str, $mb_to, $mb_from);
>                     mb_substitute_character($mbstring_sch);
>
>                     if ($out !== false) {
>                         return $out;
>                     }
>                 }
>             }
>
> I hesitate to adapt your fix to my version of rcube_charset.

In general, mb_list_encodings() and mb_check_encoding() are not used now.

I did some more tests, and indeed mb_check_encoding() does not work
with 'GBK', but mb_convert_encoding() does (at least with the sample
text I've got). So I assume the current git-master code will work for
you as well.
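
The idea of calling mb_convert_encoding() directly, without the
mb_list_encodings() and mb_check_encoding() pre-checks, could look
roughly like the sketch below. It is an illustration only, not the
code from git-master; the function name mb_convert_sketch() is
hypothetical.

```php
<?php
// Hypothetical sketch: skip mb_list_encodings()/mb_check_encoding() and
// let mb_convert_encoding() itself decide whether it knows the charset.
function mb_convert_sketch($str, $from, $to = 'UTF-8')
{
    // Remember the current substitute character so it can be restored.
    $previous = mb_substitute_character();

    // Drop unconvertible characters, similar to //IGNORE with iconv.
    mb_substitute_character('none');

    try {
        // PHP 8 throws ValueError for an unknown encoding name; PHP 5
        // and 7 emit a warning (suppressed here) and return false.
        $out = @mb_convert_encoding($str, $to, $from);
    } catch (\ValueError $e) {
        $out = false;
    }

    mb_substitute_character($previous);

    return $out;
}
```

With this shape, an encoding name that mbstring accepts for conversion
is used even if the pre-checks would have rejected it, and an unknown
name simply yields false.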

Re: GB2312, ISO-2022-KR

Vladimir Gorpenko
I probably didn't explain it well.

When I tried to apply the fix to the code I use (1.1.4), I couldn't do
it. The lines that the fix replaces look significantly different in
1.1.4. In particular, there is an additional if statement.

But I understood the idea, thanks. If I work on this issue further, I
will take your words into account.

---
Best regards,
    Vladimir Gorpenko
