Name: perl-IO-HTML | Distribution: SUSE Linux Framework One |
Version: 1.004 | Vendor: SUSE LLC <https://www.suse.com/> |
Release: slfo.1.1.3 | Build date: Mon Aug 26 10:51:26 2024 |
Group: Development/Libraries/Perl | Build host: h04-ch1a |
Size: 46210 | Source RPM: perl-IO-HTML-1.004-slfo.1.1.3.src.rpm |
Packager: https://www.suse.com/ | |
Url: https://metacpan.org/release/IO-HTML | |
Summary: Open an HTML file with automatic charset detection |
IO::HTML provides an easy way to open a file containing HTML while automatically determining its encoding. It uses the HTML5 encoding sniffing algorithm specified in section 8.2.2.2 of the draft standard.

The algorithm as implemented here is:

1. If the file begins with a byte order mark indicating UTF-16LE, UTF-16BE, or UTF-8, then that is the encoding.

2. If the first '$bytes_to_check' bytes of the file contain a '<meta>' tag that indicates the charset, and Encode recognizes the specified charset name, then that is the encoding. (This portion of the algorithm is implemented by 'find_charset_in'.)

   The '<meta>' tag can be in one of two formats:

     <meta charset="...">
     <meta http-equiv="Content-Type" content="...charset=...">

   The search is case-insensitive, and the order of attributes within the tag is irrelevant. Any additional attributes of the tag are ignored. The first matching tag with a recognized encoding ends the search.

3. If the first '$bytes_to_check' bytes of the file are valid UTF-8 (with at least 1 non-ASCII character), then the encoding is UTF-8.

4. If all else fails, use the default character encoding. The HTML5 standard suggests the default encoding should be locale dependent, but currently it is always 'cp1252' unless you set '$IO::HTML::default_encoding' to a different value. Note: 'sniff_encoding' does not apply this step; only 'html_file' does that.
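The four steps above can be sketched in a few lines of code. The following is only an illustration in Python, not the module's Perl implementation: the names `BYTES_TO_CHECK`, `DEFAULT_ENCODING`, `find_charset_in_meta`, `sniff`, and `detect` are hypothetical stand-ins, the 1024-byte window is an assumed value, Python's codec registry plays the role of Encode, and the '<meta>' matching is a simplified regular expression rather than a real attribute parser.

```python
import codecs
import re

BYTES_TO_CHECK = 1024        # assumed value; stands in for $bytes_to_check
DEFAULT_ENCODING = "cp1252"  # stands in for $IO::HTML::default_encoding

# Step 1: byte order marks and the encodings they indicate.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Simplified patterns: any <meta ...> tag, then a charset=NAME inside it.
# This covers both <meta charset="..."> and the http-equiv/content form.
_META_TAG = re.compile(rb"<meta\b[^>]*>", re.IGNORECASE)
_CHARSET = re.compile(rb"charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)", re.IGNORECASE)


def find_charset_in_meta(head):
    """Step 2: return the first recognized charset named in a <meta> tag."""
    for tag in _META_TAG.finditer(head):
        match = _CHARSET.search(tag.group(0))
        if match is None:
            continue
        name = match.group(1).decode("ascii")
        try:
            # Only accept names the codec registry recognizes.
            return codecs.lookup(name).name
        except LookupError:
            continue  # unrecognized name: keep searching later tags
    return None


def sniff(data):
    """Steps 1-3: return an encoding name, or None if nothing was detected."""
    head = data[:BYTES_TO_CHECK]

    for bom, name in _BOMS:
        if head.startswith(bom):
            return name

    found = find_charset_in_meta(head)
    if found is not None:
        return found

    # Step 3: valid UTF-8 containing at least one non-ASCII byte.
    if any(byte >= 0x80 for byte in head):
        decoder = codecs.getincrementaldecoder("utf-8")()
        try:
            # final=False tolerates a multi-byte sequence cut off by the window.
            decoder.decode(head, final=False)
            return "utf-8"
        except UnicodeDecodeError:
            pass
    return None


def detect(data):
    """Step 4: fall back to the default encoding when sniffing finds nothing."""
    return sniff(data) or DEFAULT_ENCODING
```

Splitting `sniff` from `detect` mirrors the note in step 4: the sniffing function reports only what it actually found, and the fallback to the default encoding is applied by the caller that opens the file.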
Artistic-1.0 OR GPL-1.0-or-later
* Sun Sep 27 2020 timueller+perl@suse.de
- updated to 1.004
  see /usr/share/doc/packages/perl-IO-HTML/Changes
  1.004 2020-09-26
  - No code changes since 1.003, just documentation improvements
  - New example file: detect-encoding.pl
  1.003 2015-09-26 Trial Release
  - Do not use incomplete quoted attribute values in find_charset_in.
    If we reach the end of the string without finding the closing quote,
    terminate processing instead of using whatever we did collect
    as the attribute's value.
  - Add tests for the $bytes_to_check configuration variable (GitHub#1)
  1.002 2015-09-19 Trial Release
  - Add $bytes_to_check configuration variable (GitHub#1)
* Tue Apr 14 2015 coolo@suse.com
- updated to 1.001
  see /usr/share/doc/packages/perl-IO-HTML/Changes
* Mon Aug 05 2013 coolo@suse.com
- initial package 1.00
  * created by cpanspec 1.78.07
/usr/lib/perl5/vendor_perl/5.38.2/IO
/usr/lib/perl5/vendor_perl/5.38.2/IO/HTML.pm
/usr/share/doc/packages/perl-IO-HTML
/usr/share/doc/packages/perl-IO-HTML/Changes
/usr/share/doc/packages/perl-IO-HTML/README
/usr/share/doc/packages/perl-IO-HTML/examples
/usr/share/doc/packages/perl-IO-HTML/examples/detect-encoding.pl
/usr/share/licenses/perl-IO-HTML
/usr/share/licenses/perl-IO-HTML/LICENSE
/usr/share/man/man3/IO::HTML.3pm.gz
Generated by rpm2html 1.8.1
Fabrice Bellet, Thu Oct 31 00:01:03 2024