Apache Variable Fun in htaccess
The Apache HTTP Server provides a mechanism for storing information in named variables, called server or environment variables. This information can be used to control various operations such as logging or access control. The variables are also used as a mechanism to communicate with external programs such as CGI scripts. This document discusses different ways to manipulate and use these variables.
Although these variables are referred to as environment variables, they are not the same as the environment variables controlled by the underlying operating system. Instead, these variables are stored and manipulated in an internal Apache structure. They only become actual operating system environment variables when they are provided to CGI scripts and Server Side Include scripts. If you wish to manipulate the operating system environment under which the server itself runs, you must use the standard environment manipulation mechanisms provided by your operating system shell.
- Using visitor dependent environment variables:
- Special Purpose Environment Variables
- SetEnvIf
- Glossary
Using visitor dependent environment variables:
Article: Additional SetEnvIf examples
    SetEnvIf User-Agent ^KnockKnock/2.0 let_me_in
    Order Deny,Allow
    Deny from all
    Allow from env=let_me_in
Special Purpose Environment Variables
Interoperability problems have led to the introduction of mechanisms to modify the way Apache behaves when talking to particular clients. To make these mechanisms as flexible as possible, they are invoked by defining environment variables, typically with BrowserMatch, though SetEnv and PassEnv could also be used, for example.
- downgrade-1.0
- This forces the request to be treated as an HTTP/1.0 request even if it was in a later dialect.
- force-gzip
- If you have the DEFLATE filter activated, this environment variable will ignore the accept-encoding setting of your browser and will send compressed output unconditionally.
- force-no-vary
- This causes any Vary fields to be removed from the response header before it is sent back to the client. Some clients don't interpret this field correctly; setting this variable can work around this problem. Setting this variable also implies force-response-1.0.
- force-response-1.0
- This forces an HTTP/1.0 response to clients making an HTTP/1.0 request. It was originally implemented as a result of a problem with AOL's proxies. Some HTTP/1.0 clients may not behave correctly when given an HTTP/1.1 response, and this can be used to interoperate with them.
- gzip-only-text/html
- When set to a value of "1", this variable disables the DEFLATE output filter provided by mod_deflate for content-types other than text/html. If you'd rather use statically compressed files, mod_negotiation evaluates the variable as well (not only for gzip, but for all encodings that differ from "identity").
- no-gzip
- When set, the DEFLATE filter of mod_deflate will be turned off and mod_negotiation will refuse to deliver encoded resources.
- nokeepalive
- This disables KeepAlive when set.
- prefer-language
- This influences mod_negotiation's behaviour. If it contains a language tag (such as en, ja or x-klingon), mod_negotiation tries to deliver a variant with that language. If there's no such variant, the normal negotiation process applies.
- redirect-carefully
- This forces the server to be more careful when sending a redirect to the client. This is typically used when a client has a known problem handling redirects. This was originally implemented as a result of a problem with Microsoft's WebFolders software which has a problem handling redirects on directory resources via DAV methods.
- suppress-error-charset
- Available in versions after 2.0.54. When Apache issues a redirect in response to a client request, the response includes some actual text to be displayed in case the client can't (or doesn't) automatically follow the redirection. Apache ordinarily labels this text according to the character set which it uses, which is ISO-8859-1. However, if the redirection is to a page that uses a different character set, some broken browser versions will try to use the character set from the redirection text rather than the actual page. This can result in Greek, for instance, being incorrectly rendered. Setting this environment variable causes Apache to omit the character set for the redirection text, and these broken browsers will then correctly use that of the destination page.
- force-proxy-request-1.0, proxy-nokeepalive, proxy-sendchunked, proxy-sendcl
- These variables alter the protocol behavior of mod_proxy. See the mod_proxy documentation for more details.
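As a rough illustration of how these variables are typically switched on, the sketch below uses BrowserMatch patterns similar to those shipped in older default Apache configurations. The User-Agent patterns are examples only and should be adjusted to the clients you actually need to work around.

```apache
# Disable keep-alive for an old browser that mishandles it
BrowserMatch "Mozilla/2" nokeepalive

# Old MSIE beta: no keep-alive, treat the request and the response as HTTP/1.0
BrowserMatch "MSIE 4\.0b2;" nokeepalive downgrade-1.0 force-response-1.0

# Clients that break when given an HTTP/1.1 response
BrowserMatch "Java/1\.0" force-response-1.0

# WebDAV client that has trouble with redirects on directory resources
BrowserMatch "Microsoft Data Access Internet Publishing Provider" redirect-carefully
```

Several variables can be listed on one BrowserMatch line; each is set when the pattern matches the User-Agent header.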
SetEnvIf
The SetEnvIf directive defines environment variables based on attributes of the request. These attributes can be the values of various HTTP request header fields (see RFC2616 for more information about these), or of other aspects of the request, including the following:
- Remote_Host
- the hostname (if available) of the client making the request
- Remote_Addr
- the IP address of the client making the request
- Request_Method
- the name of the method being used (GET, POST, et cetera)
- Request_Protocol
- the name and version of the protocol with which the request was made (e.g., "HTTP/0.9", "HTTP/1.1", etc.)
- Request_URI
- the portion of the URL following the scheme and host portion
Some of the more commonly used request header field names include Host, User-Agent, and Referer.
If the attribute name doesn't match any of the special keywords, nor any of the request's header field names, it is tested as the name of an environment variable in the list of those associated with the request. This allows SetEnvIf directives to test against the result of prior matches.
Only those environment variables defined by earlier SetEnvIf[NoCase] directives are available for testing in this manner. 'Earlier' means that they were defined at a broader scope (such as server-wide) or previously in the current directive's scope.
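A minimal sketch of testing the result of a prior match. The variable names object_is_image and XBIT_PROCESSING are illustrative, and the sketch assumes no request header happens to share the name object_is_image.

```apache
# First directive: mark requests for .xbm files
SetEnvIf Request_URI "\.xbm$" object_is_image=xbm

# Second directive: "object_is_image" is neither a special keyword nor a
# request header, so it is looked up as an environment variable that was
# set by the earlier SetEnvIf above
SetEnvIf object_is_image xbm XBIT_PROCESSING=1
```

Order matters here: swapping the two lines would leave XBIT_PROCESSING unset, because object_is_image would not yet exist when it is tested.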
SetEnvIfNoCase Example
The first three directives below set the environment variable is_image if the request was for an image file, and the fourth sets in_site_referral if the referring page was somewhere on the www.askapache.com Web site.
A further example, not included in the listing below, sets the NetscapeComment environment variable to the string found in the corresponding SSL client certificate field (if found).
The last directive sets the environment variable HAVE_TS if the request contains any headers that begin with "TS" whose values begin with any character in the set [a-z].
    SetEnvIfNoCase Request_URI "\.gif$" is_image=gif
    SetEnvIfNoCase Request_URI "\.jpg$" is_image=jpg
    SetEnvIfNoCase Request_URI "\.xbm$" is_image=xbm
    SetEnvIfNoCase Referer www\.askapache\.com in_site_referral
    SetEnvIfNoCase ^TS* ^[a-z].* HAVE_TS
SetEnvIfNoCase Example 2
This will cause the site environment variable to be set to "askapache" if the HTTP request header field Host: was included and contained askapache.com, askApache.COm, or any other combination.
SetEnvIfNoCase Host askapache.com site=askapache
    SetEnvIf Request_URI ^/manual/(de|en|es|fr|ja|ko|ru)/ prefer-language=$1
    RedirectMatch 301 ^/manual(?:/(de|en|es|fr|ja|ko|ru)){2,}(/.*)?$ /manual/$1$2
Additional and detailed info on each htaccess code snippet can be found at htaccessElite.
Basic Environment Manipulation
The most basic way to set an environment variable in Apache is using the unconditional SetEnv directive. Variables may also be passed from the environment of the shell which started the server using the PassEnv directive.
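A short sketch of both directives. The variable names and the path are placeholders, not values taken from any particular setup.

```apache
# Set a variable unconditionally for every request in this scope
SetEnv SPECIAL_PATH /foo/bin

# Pass a variable through from the environment of the shell that started httpd
PassEnv LD_LIBRARY_PATH
```

Note that variables set with SetEnv are applied fairly late in request processing, so they are generally not visible to directives that run early, such as SetEnvIf or RewriteCond tests.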
Conditional Per-Request Settings
For additional flexibility, the directives provided by mod_setenvif allow environment variables to be set on a per-request basis, conditional on characteristics of particular requests. For example, a variable could be set only when a specific browser (User-Agent) is making a request, or only when a specific Referer [sic] header is found. Even more flexibility is available through mod_rewrite's RewriteRule, which uses the [E=...] flag to set environment variables.
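A minimal sketch of the mod_rewrite approach, assuming mod_rewrite is enabled and allowed in this context. The variable name is_image is an illustrative choice.

```apache
RewriteEngine On

# "-" means no substitution: the URL is left untouched and only the
# environment variable is_image is set to 1 for matching requests
RewriteRule \.(gif|jpe?g|png)$ - [E=is_image:1]
```

The flag takes the form E=NAME:VALUE, so the same rule can set different values for different patterns.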
Caveats
- It is not possible to override or change the standard CGI variables using the environment manipulation directives.
- When suexec is used to launch CGI scripts, the environment will be cleaned down to a set of safe variables before CGI scripts are launched. The list of safe variables is defined at compile-time in suexec.c.
- For portability reasons, the names of environment variables may contain only letters, numbers, and the underscore character. In addition, the first character may not be a number. Characters which do not match this restriction will be replaced by an underscore when passed to CGI scripts and SSI pages.
CGI Scripts
One of the primary uses of environment variables is to communicate information to CGI scripts. As discussed above, the environment passed to CGI scripts includes standard meta-information about the request in addition to any variables set within the Apache configuration. For more details, see the CGI tutorial.
SSI Pages
Server-parsed (SSI) documents processed by mod_include's INCLUDES filter can print environment variables using the echo element, and can use environment variables in flow control elements to make parts of a page conditional on characteristics of a request. Apache also provides SSI pages with the standard CGI environment variables as discussed above. For more details, see the SSI tutorial.
Access Control
Access to the server can be controlled based on the value of environment variables using the allow from env= and deny from env= directives. In combination with SetEnvIf, this allows for flexible control of access to the server based on characteristics of the client. For example, you can use these directives to deny access to a particular browser (User-Agent).
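A sketch of denying one browser, written in the same Apache 2.2-style access syntax used elsewhere in this article (on Apache 2.4 these directives are provided by mod_access_compat). The User-Agent string "BadBot" and the variable name go_away are hypothetical.

```apache
# Flag requests whose User-Agent contains "BadBot"
SetEnvIf User-Agent "BadBot" go_away

# Allow everyone, then deny any request that carries the flag
Order Allow,Deny
Allow from all
Deny from env=go_away
```

With Order Allow,Deny the Deny rules are evaluated last, so a flagged request is refused even though "Allow from all" also matches it.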
Conditional Logging
Environment variables can be logged in the access log using the LogFormat option %{VARNAME}e. In addition, the decision on whether or not to log requests can be made based on the status of environment variables using the conditional form of the CustomLog directive. In combination with SetEnvIf this allows for flexible control of which requests are logged. For example, you can choose not to log requests for filenames ending in gif, or you can choose to only log requests from clients which are outside your subnet.
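A sketch of both techniques. CustomLog and LogFormat belong in the server or virtual host configuration rather than in .htaccess; the log path, the format name common_with_site, and the variable names dontlog and site are assumptions made for illustration.

```apache
# Flag requests for .gif files so they can be left out of the log
SetEnvIf Request_URI "\.gif$" dontlog

# Conditional form of CustomLog: log only when dontlog is NOT set
CustomLog logs/access_log common env=!dontlog

# Define a format that appends the value of the "site" environment variable
LogFormat "%h %l %u %t \"%r\" %>s %b %{site}e" common_with_site
```

The env=!name form inverts the test; env=name alone would log only the flagged requests.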
Conditional Response Headers
The Header directive can use the presence or absence of an environment variable to determine whether or not a certain HTTP header will be placed in the response to the client. This allows, for example, a certain response header to be sent only if a corresponding header is received in the request from the client.
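A minimal sketch, assuming mod_headers is loaded. The header names X-Debug-Request and X-Debug-Response and the variable HAVE_DEBUG_REQUEST are hypothetical.

```apache
# Set a flag when the client sent a non-empty X-Debug-Request header
SetEnvIf X-Debug-Request .+ HAVE_DEBUG_REQUEST

# Add the response header only for requests that carry the flag
Header set X-Debug-Response "seen" env=HAVE_DEBUG_REQUEST
```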
External Filter Activation
External filters configured by mod_ext_filter using the ExtFilterDefine directive can be activated conditional on an environment variable using the disableenv= and enableenv= options.
URL Rewriting
The %{ENV:...} form of TestString in the RewriteCond allows mod_rewrite's rewrite engine to make decisions conditional on environment variables. Note that the variables accessible in mod_rewrite without the ENV: prefix are not actually environment variables. Rather, they are variables special to mod_rewrite which cannot be accessed from other modules.
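A sketch of a RewriteCond that reads an environment variable, reusing the let_me_in variable from the KnockKnock example earlier in this article. It assumes mod_rewrite and mod_setenvif are both available in this context.

```apache
# SetEnvIf sets let_me_in to "1" when the User-Agent matches
SetEnvIf User-Agent ^KnockKnock/2.0 let_me_in

RewriteEngine On
# Refuse the request with 403 Forbidden unless let_me_in is exactly "1"
RewriteCond %{ENV:let_me_in} !^1$
RewriteRule ^ - [F]
```

This works because SetEnvIf runs early in request processing, before the rewrite rules in a per-directory context are evaluated.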
Getting prefetching to show up in your logs
First of all, how do we know a prefetch when we see one?
Firefox puts a header in each prefetching request, like this:
    X-moz: prefetch
So we'll need to ask our web server to trap that information and log it somewhere useful. The options are:
- Make a separate log file, just for prefetching requests.
- Add an extra field to our log file format.
- Mush something about prefetching into an existing field in our log file.
I have enough log files as it is, and I don't want to confuse my log analysis software by adding a custom field, so I'm going to squidge the X-Moz header onto the end of the User-Agent field of my current logs. (They're in "combined" format, which includes a field for the referer). Log analysis software will usually ignore crap tagged on the end of the User-Agent field, so this will tell me which hits have been prefetched without breaking anything else.
Let's tell Apache about the new log format we're inventing. We'll call the format "combined_with_prefetching_hack".
Somewhere in our apache configuration file (httpd.conf or apache2.conf) we should have a line like this:
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
Underneath that, we'll add another line like this:
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i %{X-Moz}i\"" combined_with_prefetching_hack
Then we'll find the place where we are currently telling apache to use the "combined" format for our site, and tell it to use "combined_with_prefetching_hack" instead.
Comment out a line a bit like this:
    CustomLog /var/log/apache2/access.log combined
and replace it with something more like this:
    CustomLog /var/log/apache2/access.log combined_with_prefetching_hack
then restart apache.
Now if we want our log file without the prefetched stuff:
    grep -v prefetch access.log
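For comparison, the first option listed above (a separate log file just for prefetching requests) could be sketched with SetEnvIf and conditional logging. This is not the approach taken here; the variable name is_prefetch and the prefetch.log path are assumptions, and the directives belong in the server or virtual host configuration.

```apache
# Flag requests that carry Firefox's prefetch header
SetEnvIf X-moz prefetch is_prefetch

# Prefetch hits go to their own file
CustomLog /var/log/apache2/prefetch.log combined env=is_prefetch

# Everything else stays in the normal access log
CustomLog /var/log/apache2/access.log combined env=!is_prefetch
```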