The Apache HTTP Server Reference Manual
by Apache Software Foundation
Paperback (6"x9"), 862 pages
ISBN 9781906966034
RRP £19.95 ($29.95)

4.6  An In-Depth Discussion of Virtual Host Matching

The virtual host code was completely rewritten in Apache 1.3. This section attempts to explain exactly how Apache decides which virtual host serves a request. With the help of the new NameVirtualHost directive, virtual host configuration should be a lot easier and safer than with versions prior to 1.3.

If you just want to make it work without understanding how, here are some examples (p. 1070).

4.6.1  Config File Parsing

There is a main_server which consists of all the definitions appearing outside of <VirtualHost> sections. There are virtual servers, called vhosts, which are defined by <VirtualHost> sections.

The directives ServerName and ServerPath can appear anywhere within the definition of a server. However, each appearance overrides the previous appearance (within that server).

The main_server has no default ServerPath or ServerAlias. The default ServerName is deduced from the server’s IP address.

Port numbers specified in the VirtualHost directive do not influence which port numbers Apache listens on; they only determine which VirtualHost is selected to handle a request.
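
As a minimal sketch of this distinction (the address, port numbers, hostname and path below are placeholders chosen for illustration, not values from this section): the Listen directives decide which ports Apache accepts connections on, while the port in the VirtualHost directive only selects a vhost for requests that arrive on that port.

```apache
# Listen controls which ports Apache actually accepts connections on.
Listen 80
Listen 8080

# The port here only selects this vhost for requests that arrived on port 8080.
# Without the "Listen 8080" line above, this vhost could never be reached.
<VirtualHost 111.22.33.44:8080>
ServerName www.example.com
DocumentRoot /www/example-8080
</VirtualHost>
```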

Each address appearing in the VirtualHost directive can have an optional port. If the port is unspecified, it is treated as a wildcard port. The special port * indicates a wildcard that matches any port. Collectively, the entire set of addresses (including multiple A record results from DNS lookups) is called the vhost’s address set.
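
The three forms described above might look as follows. This is an illustrative sketch only; the addresses and hostnames are placeholders.

```apache
# No port given: treated as a wildcard port.
<VirtualHost 111.22.33.44>
ServerName a.example.com
</VirtualHost>

# Explicit wildcard port: matches any port, like the form above.
<VirtualHost 111.22.33.55:*>
ServerName b.example.com
</VirtualHost>

# Two addresses, each with its own port. Both belong to this vhost's address set.
<VirtualHost 111.22.33.66:80 111.22.33.77:8080>
ServerName c.example.com
</VirtualHost>
```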

Unless a NameVirtualHost directive is used for the exact IP address and port pair in the VirtualHost directive, Apache selects the best match only on the basis of the IP address (or wildcard) and port number. If there are multiple identical best matches, the first VirtualHost appearing in the configuration file will be selected.

If you want Apache to further discriminate on the basis of the HTTP Host header supplied by the client, the NameVirtualHost directive must appear with the exact IP address (or wildcard) and port pair used in a corresponding set of VirtualHost directives.

The name-based virtual host selection occurs only after a single IP-based virtual host has been selected, and only considers the set of virtual hosts that carry an identical IP address and port pair.
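
A hedged sketch of the "exact pair" requirement, using placeholder hostnames: the first two vhosts share the IP address and port pair named in the NameVirtualHost directive, so they are told apart by the Host header. The third uses a different port with no NameVirtualHost directive for that pair, so, following the rules above, it would be selected on IP address and port alone.

```apache
NameVirtualHost 111.22.33.44:80

# Same IP:port pair as the NameVirtualHost line: chosen by Host header.
<VirtualHost 111.22.33.44:80>
ServerName www.example.com
</VirtualHost>
<VirtualHost 111.22.33.44:80>
ServerName www.example.org
</VirtualHost>

# Different pair with no NameVirtualHost directive: chosen by IP address and port only.
<VirtualHost 111.22.33.44:8080>
ServerName admin.example.com
</VirtualHost>
```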

Hostnames can be used in place of IP addresses in a virtual host definition, but they are resolved at startup, and this usage is not recommended.

Multiple NameVirtualHost directives can be used, each with a set of VirtualHost directives, but only one NameVirtualHost directive should be used for each specific IP:port pair.

The ordering of NameVirtualHost and VirtualHost directives is not important, which makes the following two examples equivalent (only the order of the VirtualHost directives for one address set matters; see below):

First variant:

NameVirtualHost 111.22.33.44
<VirtualHost 111.22.33.44>
# server A
</VirtualHost>
<VirtualHost 111.22.33.44>
# server B
</VirtualHost>
NameVirtualHost 111.22.33.55
<VirtualHost 111.22.33.55>
# server C
</VirtualHost>
<VirtualHost 111.22.33.55>
# server D
</VirtualHost>

Second variant:

<VirtualHost 111.22.33.44>
# server A
</VirtualHost>
<VirtualHost 111.22.33.55>
# server C
</VirtualHost>
<VirtualHost 111.22.33.44>
# server B
</VirtualHost>
<VirtualHost 111.22.33.55>
# server D
</VirtualHost>
NameVirtualHost 111.22.33.44
NameVirtualHost 111.22.33.55

(To aid the readability of your configuration you should prefer the first variant.)

During initialization, a list for each IP address is generated and inserted into a hash table. If the IP address is used in a NameVirtualHost directive, the list contains all name-based vhosts for the given IP address. If there are no vhosts defined for that address, the NameVirtualHost directive is ignored and an error is logged. For an IP-based vhost, the list in the hash table is empty.

Because the hashing function is fast, the overhead of hashing an IP address during a request is minimal. Additionally, the table is optimized for IP addresses that vary in the last octet.

For every vhost various default values are set. In particular:

  1. If a vhost has no ServerAdmin, Timeout, KeepAliveTimeout, KeepAlive, MaxKeepAliveRequests, ReceiveBufferSize, or SendBufferSize directive then the respective value is inherited from the main_server. (That is, inherited from whatever the final setting of that value is in the main_server.)
  2. The "lookup defaults" that define the default directory permissions for a vhost are merged with those of the main_server. This includes any per-directory configuration information for any module.
  3. The per-server configs for each module from the main_server are merged into the vhost server.

Essentially, the main_server is treated as "defaults" or a "base" on which to build each vhost. But the positioning of these main_server definitions in the config file is largely irrelevant – the entire config of the main_server has been parsed when this final merging occurs. So even if a main_server definition appears after a vhost definition it might affect the vhost definition.
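
For illustration, a minimal sketch with a placeholder hostname and mail address: the ServerAdmin line sits after the vhost in the file, yet it belongs to the main_server, so a vhost that sets no ServerAdmin of its own inherits it during the final merge.

```apache
<VirtualHost 111.22.33.44>
ServerName www.example.com
# No ServerAdmin here: the value is inherited from the main_server.
</VirtualHost>

# Main-server setting that appears later in the file. It still applies to the
# vhost above, because merging happens after the whole file has been parsed.
ServerAdmin webmaster@example.com
```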

If the main_server has no ServerName at this point, then the hostname of the machine that httpd is running on is used instead. We will call the main_server address set those IP addresses returned by a DNS lookup on the ServerName of the main_server.

For any undefined ServerName fields, a name-based vhost defaults to the address given first in the VirtualHost statement defining the vhost.

Any vhost that includes the magic _default_ wildcard is given the same ServerName as the main_server.

4.6.2  Virtual Host Matching

The server determines which vhost to use for a request as follows:

Hash table lookup

When the connection is first made by a client, the IP address to which the client connected is looked up in the internal IP hash table.

If the lookup fails (the IP address wasn’t found) the request is served from the _default_ vhost if there is such a vhost for the port to which the client sent the request. If there is no matching _default_ vhost the request is served from the main_server.

If the IP address is not found in the hash table then the match against the port number may also result in an entry corresponding to a NameVirtualHost *, which is subsequently handled like other name-based vhosts.
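
The _default_ fallback can be sketched as follows (the path is a placeholder). With this configuration, a request on port 80 to an address that matches no other vhost is served from the _default_ vhost. A request on any other port finds no matching _default_ vhost and falls through to the main_server.

```apache
# Catches requests on port 80 whose destination IP address matched no other vhost.
<VirtualHost _default_:80>
DocumentRoot /www/default80
</VirtualHost>
```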

If the lookup succeeded (a corresponding list for the IP address was found), the next step is to decide whether we have to deal with an IP-based or a name-based vhost.

IP-based vhost

If the entry we found has an empty name list, then we have found an IP-based vhost; no further action is performed and the request is served from that vhost.

Name-based vhost

If the entry corresponds to a name-based vhost the name list contains one or more vhost structures. This list contains the vhosts in the same order as the VirtualHost directives appear in the config file.

The first vhost on this list (the first vhost in the config file with the specified IP address) has the highest priority and catches any request to an unknown server name or a request without a Host: header field.

If the client provided a Host: header field, the list is searched for a matching vhost; the first hit on a ServerName or ServerAlias is taken and the request is served from that vhost. A Host: header field can contain a port number, but Apache always matches against the real port to which the client sent the request.
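
A small sketch of these rules, using placeholder hostnames and paths. A request with "Host: example.org" matches the ServerAlias of the second vhost. A request with an unknown name, or with no Host: header field at all, is served from the first vhost because it is listed first.

```apache
NameVirtualHost 111.22.33.44

# Listed first: also serves requests with an unknown or missing Host header.
<VirtualHost 111.22.33.44>
ServerName www.example.com
DocumentRoot /www/example-com
</VirtualHost>

# Matched by ServerName or by any of the ServerAlias names.
<VirtualHost 111.22.33.44>
ServerName www.example.org
ServerAlias example.org *.example.org
DocumentRoot /www/example-org
</VirtualHost>
```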

If the client submitted an HTTP/1.0 request without a Host: header field, we don’t know to what server the client tried to connect, and any existing ServerPath is matched against the URI from the request. The first matching path on the list is used and the request is served from that vhost.
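
As a hedged illustration of ServerPath matching (hostnames and paths are placeholders): a request for /org/index.html that carries no Host: header field would match the ServerPath of the second vhost and be handled there, while a URI that matches neither path falls back to the first vhost on the list.

```apache
NameVirtualHost 111.22.33.44

<VirtualHost 111.22.33.44>
ServerName www.example.com
# Clients that send no Host header can reach this vhost via URIs under /com.
ServerPath /com
DocumentRoot /www/example-com
</VirtualHost>

<VirtualHost 111.22.33.44>
ServerName www.example.org
# Clients that send no Host header can reach this vhost via URIs under /org.
ServerPath /org
DocumentRoot /www/example-org
</VirtualHost>
```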

If no matching vhost could be found, the request is served from the first vhost with a matching port number that is on the list for the IP to which the client connected (as mentioned above).

Persistent connections

The IP lookup described above is only done once for a particular TCP/IP session while the name lookup is done on every request during a KeepAlive/persistent connection. In other words a client may request pages from different name-based vhosts during a single persistent connection.

Absolute URI

If the URI from the request is an absolute URI, and its hostname and port match the main server or one of the configured virtual hosts and match the address and port to which the client sent the request, then the scheme/hostname/port prefix is stripped off and the remaining relative URI is served by the corresponding main server or virtual host. If it does not match, then the URI remains untouched and the request is taken to be a proxy request.

Observations

4.6.3  Tips

In addition to the tips on the DNS Issues (p. 1101) page, here are some further tips:
