</head><body>
<div id="apache">
<h1>mod_proxy_html: Technical Guide</h1>
-<p><a href="./">mod_proxy_html</a> From Version 2.4 (Sept 2004).
-<span class="v3">Updates in Version 3 (Dec. 2006) are highlighted.</span></p>
+<p><a href="./">mod_proxy_html</a> Version 3.1 (April 2009).</p>
<h2>Contents</h2>
<ul id="toc">
<li><a href="#url">URL Rewriting</a>
<h3 id="html">HTML Links</h3>
<p>HTML links are those attributes defined by the HTML 4 and XHTML 1
DTDs as of type <strong>%URI</strong>. For example, the <strong>href</strong>
-attribute of the <strong>a</strong> element. For a full list, see the
-declaration of <code>linked_elts</code> in <code>pstartElement</code>.
+attribute of the <strong>a</strong> element.
Rules are applicable provided the <b>h</b> flag is not set.
-<span class="v3">From Version 3, the definition of links to use is
-delegated to the system administrator via the <code>ProxyHTMLLinks</code>
-directive.</span></p>
+From Version 3, the definition of links to use is delegated to the
+system administrator via the <code>ProxyHTMLLinks</code> directive
+(the accompanying <tt>proxy_html.conf</tt> configuration file gives
+you the standard HTML 4 and XHTML 1 links, as hardwired in earlier
+mod_proxy_html versions).</p>
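<p>As a rough illustration, link definitions of this kind look like the
excerpt below, modelled on the HTML 4 entries in <tt>proxy_html.conf</tt>.
It is a sketch rather than the complete shipped file: each line names an
element, followed by the attributes of that element to be treated as links.</p>
<pre>
# element             attributes treated as URLs
ProxyHTMLLinks a      href
ProxyHTMLLinks img    src longdesc usemap
ProxyHTMLLinks form   action
</pre>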
<p>An HTML link always contains exactly one URL. So whenever mod_proxy_html
finds a matching <code>ProxyHTMLURLMap</code> rule, it will apply the
-transformation once and stop processing the attribute. <span class="v3">This
+transformation once and stop processing the attribute. This
can be overridden by the <code>l</code> flag, which causes processing
-a URL to continue after a rewrite.</span></p>
+of a URL to continue after a rewrite.</p>
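<p>A minimal sketch of the <code>l</code> flag, assuming a hypothetical
backend at <code>http://backend.example.com</code> published under
<code>/app</code> (both names, and the paths, are illustrative only):</p>
<pre>
# Without a flag, the first matching rule would rewrite the link and
# processing of that attribute would stop there. With the l flag, the
# rules that follow are still tried against the rewritten URL.
ProxyHTMLURLMap http://backend.example.com /app l
ProxyHTMLURLMap /app/old/ /app/new/
</pre>
<p>Under these assumptions a link to
<code>http://backend.example.com/old/page.html</code> would be expected to
pass through both rules rather than only the first.</p>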
<h3 id="event">Scripting Events</h3>
<p>Scripting events are the contents of event attributes as defined in the
-HTML4 and XHTML1 DTDs; for example <code>onclick</code>. For a full list,
-see the declaration of <code>events</code> in <code>pstartElement</code>.
+HTML4 and XHTML1 DTDs; for example <code>onclick</code>.
Rules are applicable provided the <b>e</b> flag is not set.
-<span class="v3">From Version 3, the definition of events to use is
+From Version 3, the definition of events to use is
delegated to the system administrator via the <code>ProxyHTMLEvents</code>
-directive.</span></p>
+directive: see <tt>proxy_html.conf</tt>.</p>
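<p>For illustration, an event list in the style of <tt>proxy_html.conf</tt>
might read as follows. This sketch uses the intrinsic event attributes
defined by HTML 4; check the file shipped with your version for the
authoritative list.</p>
<pre>
ProxyHTMLEvents onclick ondblclick onmousedown onmouseup \
                onmouseover onmousemove onmouseout onkeypress \
                onkeydown onkeyup onfocus onblur onload \
                onunload onsubmit onreset onselect onchange
</pre>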
<p>A scripting event may contain more than one URL, and will contain other
text. So when <code>ProxyHTMLExtended</code> is On, all applicable rules
will be applied in order until and unless a rule with the <b>L</b> flag
<p>If you declare a custom DTD, you should specify whether to generate
HTML or XHTML syntax in the output. This affects empty elements:
HTML <b>&lt;br&gt;</b> vs XHTML <b>&lt;br /&gt;</b>.</p>
-<p class="v3">If you select standard HTML or XHTML, mod_proxy_html 3 will
+<p>If you select standard HTML or XHTML, mod_proxy_html 3 will
perform some additional fixups of bogus markup. If you don't want this,
you can enter a standard DTD using the nonstandard form of
<code>ProxyHTMLDTD</code>, which will then be treated as unknown
<p>The parser uses <strong>UTF-8</strong> (Unicode) internally, and
mod_proxy_html prior to version 3 <em>always</em> generates output as UTF-8.
This is supported by all general-purpose web software, and supports more
-character sets and languages than any other charset. <span class="v3">
-Version 3 supports, but does not recommend different outputs, using
-the <code>ProxyHTMLCharsetOut</code> directive</span>.</p>
+character sets and languages than any other charset.</p>
<p>The character encoding should be declared in HTTP: for example<br />
<code>Content-Type: text/html; charset=latin1</code><br />
mod_proxy_html has always supported this in its input, and ensured
this happens in output. But prior to version 2, it did not fully
support detection (sniffing) of the charset when a backend fails to
set the HTTP header.</p>
-<p>From version 2.0, mod_proxy_html will detect the encoding of its input
+<p>From version 2, mod_proxy_html will detect the encoding of its input
as follows:</p>
<ol>
<li>The HTTP headers, where available, always take precedence over other
<code>&lt;meta http-equiv="Content-Type" ...&gt;</code>, any charset declared
here is used.</li>
<li>In the absence of any of the above indications, the HTML-over-HTTP default
-encoding <b>ISO-8859-1</b> <span class="v3">or the
-<code>ProxyHTMLCharsetDefault</code> value</span> is assumed.</li>
+encoding <b>ISO-8859-1</b> is assumed.</li>
<li>The parser is set to ignore invalid characters, so a malformed input
stream will generate glitches (unexpected characters) rather than risk
aborting a parse altogether.</li>
</ol>
-<p class="v3">In version 3.0, this remains the default, but
-internationalisation support is further improved, and is no longer
-limited to the encodings supported by libxml2:</p>
-<ul class="v3">
-<li>The <code>ProxyHTMLCharsetAlias</code> directive enables server
-administrators to support additional encodings by aliasing them to
-something supported by libxml2.</li>
-<li>When a charset that is neither directly supported nor aliased is
-encountered, mod_proxy_html 3 will attempt to support it using Apache/APR's
-charset conversion support in <code>apr_xlate</code>, which on most platforms
-is a wrapper for the leading conversion utility <code>iconv</code>.
-Because of undocumented behaviour of libxml2, this may cause problems
-when charset is specified in an HTML <code>META</code> element. This
-feature is therefore only enabled when <code>ProxyHTMLMeta</code> is On.</li>
-</ul>
+<p>From <strong>Version 3.1</strong> the above is delegated to
+<a href="../mod_xml2enc/">mod_xml2enc</a>, which also expands charset support
+and enables you to:</p>
+<ol>
+<li>Handle any character set supported by iconv on your system, in addition
+to those supported by libxml2.</li>
+<li>Alias an unsupported charset to a supported one: for example nonstandard
+Windows codepages to ISO equivalents.</li>
+<li>Override the ISO-8859-1 default encoding.</li>
+<li>Convert output to your choice of charset (at an additional processing cost).</li>
+</ol>
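<p>A hedged configuration sketch of these options follows. The directive
names <code>xml2EncDefault</code> and <code>xml2EncAlias</code> are assumed
from mod_xml2enc, and <code>ProxyHTMLCharsetOut</code> from mod_proxy_html 3.
The list above does not name them, and the charset values are purely
illustrative, so check both against the documentation for your installed
versions.</p>
<pre>
# Assume this charset when neither the HTTP headers nor the document
# declare one (instead of the ISO-8859-1 default).
xml2EncDefault iso-8859-15

# Treat a nonstandard Windows codepage as its ISO near-equivalent.
xml2EncAlias iso-8859-1 windows-1252

# Convert output to a charset other than UTF-8 (extra processing cost).
ProxyHTMLCharsetOut iso-8859-1
</pre>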
<h2 id="meta">meta http-equiv support</h2>
<p>The HTML <code>meta</code> element includes a form
<code>force-response-1.0</code> environment variable in httpd.conf.
For example,<br /><code>BrowserMatch MSIE force-response-1.0</code></p>
</div>
-</body></html>
+<div id="navbar"><a class="internal" href="./" title="Up">Up</a>
+*
+<a class="internal" href="/" title="WebThing Apache Centre">Home</a>
+*
+<a class="internal" href="/contact.html" title="Contact WebThing">Contact</a>
+*
+<a class="external" href="http://www.webthing.com/" title="WebThing Ltd">WebÞing</a>
+*
+<a class="external" href="http://www.apache.org/" title="Apache Software Foundation">Apache</a></div></body></html>