Perl’s Data::Dumper and Smart::Comments are very useful during development. However, when you process non-ASCII data, these modules print each character as a hexadecimal Unicode code point escape (for example \x{3053}) instead of the character itself, even if you use the utf8 pragma and have set the encoding of STDERR. A $SIG{__WARN__} hook can work around this problem.
Problem when processing UTF-8 strings
When I process non-ASCII data (Japanese, in my case) and specify the encoding of STDERR, my own error messages in the main program are displayed correctly. But the output of Smart::Comments and Data::Dumper is still expressed as hexadecimal escapes.
Output from Data::Dumper
use strict;
use warnings;
use utf8;
use Data::Dumper;
binmode STDERR, ':encoding(utf8)'; # This works for messages written in the program itself.
warn "日本語エラーメッセージ。\n";
my @arr = qw(
こんな値や
あんな値
);
warn Dumper(\@arr);
Output
日本語エラーメッセージ。
$VAR1 = [
"\x{3053}\x{3093}\x{306a}\x{5024}\x{3084}",
"\x{3042}\x{3093}\x{306a}\x{5024}"
];
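How to convert the output of Data::Dumper to a UTF-8 string
A $SIG{__WARN__} handler can solve this. The handler receives each warning message before it is printed, so it can convert the \x{...} escapes back into characters and then emit the message again. Calling warn inside the handler does not loop, because Perl does not call __WARN__ hooks from inside one.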
use strict;
use warnings;
use utf8;
use Data::Dumper;
binmode STDERR, ':encoding(utf8)';
# Convert outputs from debugging modules to characters
local $SIG{__WARN__} = sub {
warn join("",
map {
my $str = $_;
$str =~ s/\\x\{([0-9A-Fa-f]+)\}/chr(hex($1))/eg; # \x{abcd} -> character (any number of hex digits, not only 4)
$str;
} @_
);
};
my @arr = qw(
こんな値や
あんな値
);
warn Dumper(\@arr);
Output
$VAR1 = [
"こんな値や",
"あんな値"
];
The same approach works for Smart::Comments, since that module uses Data::Dumper internally.
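The substitution inside the handler can also be pulled out into a small function, which makes it easier to check on its own and to reuse outside of warn. The following is only a minimal sketch: unescape_dump is a hypothetical helper name (it is not part of Data::Dumper or Smart::Comments), and printing to STDOUT is an assumption made for the demonstration.

```perl
use strict;
use warnings;
use utf8;
use Data::Dumper;

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper (not provided by Data::Dumper): replace every
# \x{HEX} escape in a dumped string with the character it stands for.
sub unescape_dump {
    my ($text) = @_;
    $text =~ s/\\x\{([0-9A-Fa-f]+)\}/chr(hex($1))/eg;
    return $text;
}

my @arr = qw(
    こんな値や
    あんな値
);

# Dump as usual, then convert the escapes before printing.
print unescape_dump(Dumper(\@arr));
```

Note that this is a plain text substitution, so it cannot tell an escape produced by Data::Dumper from a literal backslash followed by x{...} that was already in the data. For debugging output this is usually acceptable, but the converted text should not be treated as valid Perl to evaluate again.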