</head><body>
<div id="apache">
<h1>mod_proxy_html: Configuration</h1>
-<p><a href="./">mod_proxy_html</a> Version 2.4 (Sept 2004) and upwards</p>
+<p><a href="./">mod_proxy_html</a> Version 2.4 (Sept 2004) and upwards.
+<span class="v3">Updates in Version 3 (Dec. 2006) are highlighted</span>.</p>
<h2>Configuration Directives</h2>
<p>The following can be used anywhere in an <strong>httpd.conf</strong>
or included configuration file.</p>
<dt>ProxyHTMLURLMap</dt>
<dd>
<p>Syntax:
-<code>ProxyHTMLURLMap from-pattern to-pattern flags</code></p>
+<code>ProxyHTMLURLMap from-pattern to-pattern [flags] [cond]</code></p>
<p>This is the key directive for rewriting HTML links. When parsing a document,
whenever a link target matches <samp>from-pattern</samp>, the matching
portion will be rewritten to <samp>to-pattern</samp>.</p>
and substitutions, including regular expression search and replace,
controlled by the optional third <code>flags</code> argument.
</p>
+<p class="v3">Starting at version 3.0, this also supports environment variable
+interpolation using the V and v flags, and rules may apply conditionally
+based on an environment variable. Note that interpolation takes place
+before the parse starts, so variables set during the parse (e.g.
+using SSI directives) will not apply. This flexible configuration
+is enabled by the <code>ProxyHTMLInterp</code> directive, or can
+be disabled for speed.</p>
<h3>Flags for ProxyHTMLURLMap</h3>
<p>Flags are case-sensitive.</p>
<dl class="table">
<dt>L</dt>
<dd><p>Last-match. If this rule matches, no more rules are applied
(note that this happens automatically for HTML links).</p></dd>
+<dt class="v3">l</dt>
+<dd class="v3">Opposite to L. Overrides the one-change-only default
+behaviour with HTML links.</dd>
<dt>R</dt>
<dd><p>Use Regular Expression matching-and-replace. <code>from-pattern</code>
is a regexp, and <code>to-pattern</code> a replacement string that may be
<dt>$</dt>
<dd><p>Match at end only. This applies only to string matching
(not regexps) and is irrelevant to HTML links.</p></dd>
+<dt class="v3">V</dt>
+<dd class="v3"><p>Interpolate environment variables in <code>to-pattern</code>.
+A string of the form <code>${varname|default}</code> will be replaced by the
+value of environment variable <code>varname</code>. If that is unset, it
+is replaced by <code>default</code>. The <code>|default</code> is optional.</p>
+<p>NOTE: interpolation will only be enabled if ProxyHTMLInterp is On.</p>
+</dd>
+<dt class="v3">v</dt>
+<dd class="v3"><p>Interpolate environment variables in <code>from-pattern</code>.
+Patterns supported are as above.</p>
+<p>NOTE: interpolation will only be enabled if ProxyHTMLInterp is On.</p>
+</dd>
</dl>
-<!-- <h4>Examples</h4> -->
+<h3 class="v3">Conditions for ProxyHTMLURLMap</h3>
+<p class="v3">The optional <code>cond</code> argument specifies a condition to
+test before the parse. If a condition is unsatisfied, the URLMap
+will be ignored in this parse.</p>
+<p class="v3">The condition takes the form <code>[!]var[=val]</code>, and is
+satisfied if the value of environment variable <code>var</code>
+is <code>val</code>. If the optional <code>=val</code> is omitted,
+then any value of <code>var</code> satisfies the condition, provided
+only it is set to something. If the first character is <code>!</code>,
+the condition is reversed.</p>
+<p class="v3">NOTE: conditions will only be applied if ProxyHTMLInterp is On.</p>
+</dd>
+<dt class="v3">ProxyHTMLInterp</dt>
+<dd class="v3">
+<p>Syntax: <code>ProxyHTMLInterp On|Off</code></p>
+<p>Enables new (per-request) features of ProxyHTMLURLMap.</p>
</dd>
<dt>ProxyHTMLDoctype</dt>
<dd>
<p>Starting at version 2.0, the default is changed to omitting any FPI,
on the grounds that no FPI is better than a bogus one. If your backend
generates decent HTML or XHTML, set it accordingly.</p>
+<p class="v3">From version 3, if the first form is used, mod_proxy_html
+will also clean up the HTML to the specified standard. It cannot
+fix every error, but it will strip out bogus elements and attributes.
+It will also optionally log other errors at <tt>LogLevel Debug</tt>.</p>
</dd>
<dt>ProxyHTMLFixups</dt>
<dd>
<dd><p>Syntax <code>ProxyHTMLMeta [On|Off]</code></p>
<p>Parses <code><meta http-equiv ...></code> elements to real HTTP
headers.</p>
+<p class="v3">In version 3, this is also tied in with the improved
+<a href="guide.html#charset">internationalisation support</a>, and is
+required to support some character encodings.</p>
</dd>
<dt>ProxyHTMLExtended</dt>
<dd><p>Syntax <code>ProxyHTMLExtended [On|Off]</code></p>
in increments of [nnnn] as set by this directive.</p>
<p>The default is 8192, and will work well for almost all pages. However,
if you know you're proxying a lot of pages containing stylesheets and/or
-scripts bigger than 8K, it will be more efficient to set a larger buffer
+scripts bigger than 8K (that is, for a single script or stylesheet,
+NOT in total), it will be more efficient to set a larger buffer
size and avoid the need to resize the buffer dynamically during a request.
</p>
</dd>
+<dt class="v3">ProxyHTMLEvents</dt>
+<dd class="v3">
+<p>Syntax <code>ProxyHTMLEvents attr [attr ...]</code></p>
+<p>Specifies one or more attributes to treat as scripting events and
+apply URLMaps to where appropriate. You can specify any number of
+attributes in one or more <code>ProxyHTMLEvents</code> directives.
+The <a href="/svn/apache/filters/proxy_html/">sample configuration</a>
+defines the events in standard HTML 4 and XHTML 1.</p>
+</dd>
+<dt class="v3">ProxyHTMLLinks</dt>
+<dd class="v3">
+<p>Syntax <code>ProxyHTMLLinks elt attr [attr ...]</code></p>
+<p>Specifies elements that have URL attributes that should be rewritten
+using standard URLMaps as in versions 1 and 2 of mod_proxy_html.
+You will need one <code>ProxyHTMLLinks</code> directive per element,
+but it can have any number of attributes. The <a
+href="/svn/apache/filters/proxy_html/">sample configuration</a>
+defines the HTML links for standard HTML 4 and XHTML 1.</p>
+</dd>
+<dt class="v3">ProxyHTMLCharsetAlias</dt>
+<dd class="v3">
+<p>Syntax <code>ProxyHTMLCharsetAlias charset alias [alias ...]</code></p>
+<p>This server-wide directive aliases one or more charset to another
+charset. This enables encodings not recognised by libxml2 to be handled
+internally by libxml2's charset support using the translation table for
+a recognised charset.</p>
+<p>For example, Latin 1 (<tt>ISO-8859-1</tt>) is supported by libxml2.
+Microsoft's <tt>Windows-1252</tt> is almost identical and can be supported
+by aliasing it:<br />
+<code>ProxyHTMLCharsetAlias ISO-8859-1 Windows-1252</code></p>
+</dd>
+<dt class="v3">ProxyHTMLCharsetDefault</dt>
+<dd class="v3">
+<p>Syntax <code>ProxyHTMLCharsetDefault name</code></p>
+<p>This defines the default encoding to assume when absolutely no charset
+information is available from the backend server. The default value for
+this is <code>ISO-8859-1</code>, as specified in HTTP/1.0 and assumed in
+earlier mod_proxy_html versions.</p>
+</dd>
+<dt class="v3">ProxyHTMLCharsetOut</dt>
+<dd class="v3">
+<p>Syntax <code>ProxyHTMLCharsetOut name</code></p>
+<p>This selects an encoding for mod_proxy_html output. It should not
+normally be used, as any change from the default <code>UTF-8</code>
+(Unicode - as used internally by libxml2) will impose an additional
+processing overhead. The special token <code>ProxyHTMLCharsetOut *</code>
+will generate output using the same encoding as the input.</p>
+</dd>
+<dt class="v3">ProxyHTMLStartParse</dt>
+<dd class="v3">
+<p>Syntax <code>ProxyHTMLStartParse element [elt*]</code></p>
+<p>Specify that the HTML parser should start at the first instance
+of any of the elements specified. This can be used where a broken
+backend inserts leading junk that messes up the parser (<a
+href="http://bahumbug.wordpress.com/2006/10/12/mod_proxy_html-revisited/"
+>example here</a>).</p>
+</dd>
</dl>
+<h3>Other Configuration</h3>
+<p class="v3">Normally, mod_proxy_html will refuse to run when not
+in a proxy or when the contents are not HTML. This can be overridden
+(at your own risk) by setting the environment variable
+<tt>PROXY_HTML_FORCE</tt> (e.g. with the <tt>SetEnv</tt> directive).</p>
</div>
</body></html>