The Apache HTTP Server Reference Manual
by Apache Software Foundation
Paperback (6"x9"), 862 pages
ISBN 9781906966034
RRP £19.95 ($29.95)


3.23  Apache Module mod_cache



Description:

Content cache keyed to URIs.

Status:

Extension

Module Identifier:

cache_module

Source File:

mod_cache.c



Summary

This module should be used with care, because it can be used to circumvent Allow and Deny directives. You should not enable caching for any content to which you wish to limit access by client host name, address or environment variable.

mod_cache implements an RFC 2616 compliant HTTP content cache that can be used to cache either local or proxied content. mod_cache requires the services of one or more storage management modules. Two storage management modules are included in the base Apache distribution:

mod_disk_cache
implements a disk based storage manager.
mod_mem_cache
implements a memory based storage manager. mod_mem_cache can be configured to operate in two modes: caching open file descriptors or caching objects in heap storage. mod_mem_cache can be used to cache locally generated content or to cache backend server content for mod_proxy when configured using ProxyPass (that is, as a reverse proxy).

Content is stored in and retrieved from the cache using URI based keys. Content with access protection is not cached.

Further details, discussion, and examples, are provided in the Caching Guide (p. 1293).

Directives:

CacheDefaultExpire

CacheDisable

CacheEnable

CacheIgnoreCacheControl

CacheIgnoreHeaders

CacheIgnoreNoLastMod

CacheIgnoreQueryString

CacheIgnoreURLSessionIdentifiers

CacheLastModifiedFactor

CacheLock

CacheLockMaxAge

CacheLockPath

CacheMaxExpire

CacheStoreNoStore

CacheStorePrivate

See also:

  • Caching Guide (p. 1293)

3.23.1  Related Modules and Directives





Related Modules:

mod_disk_cache
mod_mem_cache

Related Directives:

CacheRoot
CacheDirLevels
CacheDirLength
CacheMinFileSize
CacheMaxFileSize
MCacheSize
MCacheMaxObjectCount
MCacheMinObjectSize
MCacheMaxObjectSize
MCacheRemovalAlgorithm
MCacheMaxStreamingBuffer




3.23.2  Sample Configuration

Sample httpd.conf

#
# Sample Cache Configuration
#
LoadModule cache_module modules/mod_cache.so
<IfModule mod_cache.c>

#LoadModule disk_cache_module modules/mod_disk_cache.so
# If you want to use mod_disk_cache instead of mod_mem_cache,
# uncomment the line above and comment out the LoadModule line below.
<IfModule mod_disk_cache.c>

CacheRoot c:/cacheroot
CacheEnable disk /
CacheDirLevels 5
CacheDirLength 3

</IfModule>

LoadModule mem_cache_module modules/mod_mem_cache.so
<IfModule mod_mem_cache.c>

CacheEnable mem /
MCacheSize 4096
MCacheMaxObjectCount 100
MCacheMinObjectSize 1
MCacheMaxObjectSize 2048

</IfModule>

# When acting as a proxy, don’t cache the list of security updates
CacheDisable http://security.update.server/update-list/

</IfModule>

3.23.3  Avoiding the Thundering Herd

When a cached entry becomes stale, mod_cache will submit a conditional request to the backend, which is expected to confirm whether the cached entry is still fresh, and send an updated entity if not.

A small but finite amount of time exists between the time the cached entity becomes stale, and the time the stale entity is fully refreshed. On a busy server, a significant number of requests might arrive during this time, and cause a thundering herd of requests to strike the backend suddenly and unpredictably.

To keep the thundering herd at bay, the CacheLock directive can be used to enable locks for URLs in flight; the locks are created in the directory defined by CacheLockPath. The lock is used as a hint by other requests either to suppress an attempt to cache (someone else has gone to fetch the entity), or to indicate that a stale entry is being refreshed (stale content will be returned in the meantime).

Initial caching of an entry

When an entity is cached for the first time, a lock will be created for the entity until the response has been fully cached. During the lifetime of the lock, the cache will suppress the second and subsequent attempts to cache the same entity. While this doesn’t hold back the thundering herd, it does stop the cache from attempting to cache the same entity multiple times simultaneously.

Refreshment of a stale entry

When an entity reaches its freshness lifetime and becomes stale, a lock will be created for the entity until the response has either been confirmed as still fresh, or replaced by the backend. During the lifetime of the lock, the second and subsequent incoming requests will cause stale data to be returned, and the thundering herd is kept at bay.

Locks and Cache-Control: no-cache

Locks are used only as a hint that lets the cache be gentler on backend servers, and a lock can be overridden if necessary. If the client sends a request with a Cache-Control header forcing a reload, any lock that may be present will be ignored, the client’s request will be honoured immediately, and the cached entry will be refreshed.

As a further safety mechanism, locks have a configurable maximum age. Once this age has been reached, the lock is removed, and a new request is given the opportunity to create a new lock. This maximum age can be set using the CacheLockMaxAge directive, and defaults to 5 seconds.
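The lock rules described in this section can be summarised in a small Python model. This is only an illustrative sketch, not mod_cache's implementation: the Action names and the decide function are hypothetical, and treating a lock as expired once it is strictly older than the maximum age is an assumption.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    FETCH_AND_LOCK = "take the lock, contact the backend, update the cache"
    SERVE_STALE = "return the stale entry; another request is refreshing it"
    FETCH_UNCACHED = "contact the backend, but do not try to cache the response"


def decide(has_stale_copy: bool,
           lock_age: Optional[float],
           max_age: float = 5.0,
           forces_reload: bool = False) -> Action:
    """Model what a request does when its URL is not fresh in the cache.

    lock_age is the age of the existing lock in seconds, or None if no
    lock exists. max_age plays the role of CacheLockMaxAge.
    """
    # Assumption: a lock older than max_age is ignored.
    lock_is_live = lock_age is not None and lock_age <= max_age

    # A forced reload ignores any lock; so does an absent or expired lock.
    if forces_reload or not lock_is_live:
        return Action.FETCH_AND_LOCK

    # Someone else holds a live lock on this URL.
    if has_stale_copy:
        return Action.SERVE_STALE     # refreshment of a stale entry
    return Action.FETCH_UNCACHED      # initial caching of an entry
```

For example, decide(True, 2) models a request arriving two seconds after another request began refreshing a stale entry, and yields Action.SERVE_STALE.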

Example configuration

Enabling the cache lock

#
# Enable the cache lock
#
<IfModule mod_cache.c>

CacheLock on
CacheLockPath /tmp/mod_cache-lock
CacheLockMaxAge 5

</IfModule>

CacheDefaultExpire Directive

Description:

The default duration to cache a document when no expiry date is specified.

Syntax:

CacheDefaultExpire seconds

Default:

CacheDefaultExpire 3600 (one hour)

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

The CacheDefaultExpire directive specifies a default time, in seconds, to cache a document if neither an expiry date nor a last-modified date is provided with the document. The value specified with the CacheMaxExpire directive does not override this setting.

CacheDefaultExpire 86400

CacheDisable Directive

Description:

Disable caching of specified URLs

Syntax:

CacheDisable url-string

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

The CacheDisable directive instructs mod_cache not to cache URLs at or below url-string.

Example

CacheDisable /local_files

The no-cache environment variable can be set to disable caching on a finer grained set of resources in versions 2.2.12 and later.

See also:

  • Environment Variables in Apache (p. 1471)

CacheEnable Directive

Description:

Enable caching of specified URLs using a specified storage manager

Syntax:

CacheEnable cache_type url-string

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

The CacheEnable directive instructs mod_cache to cache URLs at or below url-string. The cache storage manager is specified with the cache_type argument. cache_type mem instructs mod_cache to use the memory based storage manager implemented by mod_mem_cache. cache_type disk instructs mod_cache to use the disk based storage manager implemented by mod_disk_cache. cache_type fd instructs mod_cache to use the file descriptor cache implemented by mod_mem_cache.

In the event that the URL space overlaps between different CacheEnable directives (as in the example below), each possible storage manager will be run until the first one that actually processes the request. The order in which the storage managers are run is determined by the order of the CacheEnable directives in the configuration file.

CacheEnable mem /manual
CacheEnable fd /images
CacheEnable disk /
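The ordering rule can be pictured with a short Python model. It is a sketch only: candidate_stores is a hypothetical helper, not part of mod_cache, and it assumes that "at or below url-string" amounts to a plain prefix comparison on the request path.

```python
from typing import List, Tuple

# (cache_type, url-string) pairs in the order the CacheEnable
# directives appear in the configuration file.
RULES: List[Tuple[str, str]] = [
    ("mem", "/manual"),
    ("fd", "/images"),
    ("disk", "/"),
]


def candidate_stores(rules: List[Tuple[str, str]], path: str) -> List[str]:
    """Return the storage managers to try for path, in configuration order.

    Assumption: a rule applies when the request path starts with its
    url-string. The first manager in the returned list that actually
    handles the request wins.
    """
    return [cache_type for cache_type, prefix in rules
            if path.startswith(prefix)]
```

With the three directives above, a request for /manual/index.html would be offered to the mem storage manager first and then to disk, while a request for /index.html would be offered only to disk.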

When acting as a forward proxy server, url-string can also be used to specify remote sites and proxy protocols for which caching should be enabled.

# Cache proxied URLs
CacheEnable disk /
# Cache FTP-proxied URLs
CacheEnable disk ftp://
# Cache content from www.apache.org
CacheEnable disk http://www.apache.org/

The no-cache environment variable can be set to disable caching on a finer grained set of resources in versions 2.2.12 and later.

See also:

  • Environment Variables in Apache (p. 1471)

CacheIgnoreCacheControl Directive

Description:

Ignore request to not serve cached content to client

Syntax:

CacheIgnoreCacheControl On|Off

Default:

CacheIgnoreCacheControl Off

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

Ordinarily, requests containing a Cache-Control: no-cache or Pragma: no-cache header value will not be served from the cache. The CacheIgnoreCacheControl directive allows this behavior to be overridden. CacheIgnoreCacheControl On tells the server to attempt to serve the resource from the cache even if the request contains no-cache header values. Resources requiring authorization will never be cached.

CacheIgnoreCacheControl On

Warning: This directive will allow serving from the cache even if the client has requested that the document not be served from the cache. This might result in stale content being served.
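The effect of the directive on a single request can be sketched in Python. This is an illustrative model rather than the module's code: may_serve_from_cache is a hypothetical function, and it looks only at the two request headers named above.

```python
from typing import Mapping


def may_serve_from_cache(request_headers: Mapping[str, str],
                         ignore_cache_control: bool = False) -> bool:
    """Model whether a request may be answered from the cache.

    ignore_cache_control plays the role of CacheIgnoreCacheControl On.
    """
    if ignore_cache_control:
        return True

    # HTTP header names are case-insensitive.
    headers = {name.lower(): value for name, value in request_headers.items()}

    tokens = set()
    for name in ("cache-control", "pragma"):
        for part in headers.get(name, "").split(","):
            # Keep only the directive name, dropping any "=value" part.
            tokens.add(part.split("=", 1)[0].strip().lower())

    return "no-cache" not in tokens
```

A request carrying Cache-Control: no-cache is refused a cached reply in this model unless ignore_cache_control is set.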

See also:

  • CacheStorePrivate (p. 439)
  • CacheStoreNoStore (p. 438)

CacheIgnoreHeaders Directive

Description:

Do not store the given HTTP header(s) in the cache.

Syntax:

CacheIgnoreHeaders header-string [header-string]

Default:

CacheIgnoreHeaders None

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

According to RFC 2616, hop-by-hop HTTP headers are not stored in the cache. The following HTTP headers are hop-by-hop headers and thus do not get stored in the cache in any case regardless of the setting of CacheIgnoreHeaders:

  • Connection
  • Keep-Alive
  • Proxy-Authenticate
  • Proxy-Authorization
  • TE
  • Trailers
  • Transfer-Encoding
  • Upgrade

CacheIgnoreHeaders specifies additional HTTP headers that should not be stored in the cache. For example, it makes sense in some cases to prevent cookies from being stored in the cache.

CacheIgnoreHeaders takes a space separated list of HTTP headers that should not be stored in the cache. If only hop-by-hop headers should not be stored in the cache (the RFC 2616 compliant behaviour), CacheIgnoreHeaders can be set to None.

Example 1

CacheIgnoreHeaders Set-Cookie

Example 2

CacheIgnoreHeaders None

Warning: If headers like Expires which are needed for proper cache management are not stored due to a CacheIgnoreHeaders setting, the behaviour of mod_cache is undefined.
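The filtering described in this section can be modelled in a few lines of Python. The sketch is illustrative only: headers_to_store is a hypothetical helper, and the ignore argument stands in for the list given to CacheIgnoreHeaders (an empty list corresponds to None).

```python
from typing import Dict, Iterable, Mapping

# Hop-by-hop headers listed above; never stored, whatever the setting.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
}


def headers_to_store(headers: Mapping[str, str],
                     ignore: Iterable[str] = ()) -> Dict[str, str]:
    """Return the headers that would be kept with the cached entity."""
    # Header names are compared case-insensitively.
    skip = HOP_BY_HOP | {name.lower() for name in ignore}
    return {name: value for name, value in headers.items()
            if name.lower() not in skip}
```

Passing ["Set-Cookie"] as ignore corresponds to Example 1: the Set-Cookie header is dropped in addition to the hop-by-hop headers.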

CacheIgnoreNoLastMod Directive

Description:

Ignore the fact that a response has no Last-Modified header.

Syntax:

CacheIgnoreNoLastMod On|Off

Default:

CacheIgnoreNoLastMod Off

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

Ordinarily, documents without a last-modified date are not cached. Under some circumstances the last-modified date is removed (during mod_include processing for example) or not provided at all. The CacheIgnoreNoLastMod directive provides a way to specify that documents without last-modified dates should still be considered for caching. If neither a last-modified date nor an expiry date is provided with the document, then the value specified by the CacheDefaultExpire directive will be used to generate an expiration date.

CacheIgnoreNoLastMod On

CacheIgnoreQueryString Directive

Description:

Ignore query string when caching

Syntax:

CacheIgnoreQueryString On|Off

Default:

CacheIgnoreQueryString Off

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

Compatibility:

Available in Apache 2.2.6 and later

Ordinarily, requests with query string parameters are cached separately for each unique query string. According to RFC 2616, section 13.9, this is done only if an expiration time is specified. The CacheIgnoreQueryString directive tells the cache to cache requests even if no expiration time is specified, and to reply with a cached reply even if the query string differs. From a caching point of view, when this directive is enabled the request is treated as if it had no query string.

CacheIgnoreQueryString On
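The effect on the cache key can be sketched in Python. This is a simplified model, not the module's key format: cache_key is a hypothetical function that considers only the path and query string, whereas a real key would also have to distinguish hosts and schemes.

```python
from urllib.parse import urlsplit


def cache_key(url: str, ignore_query_string: bool = False) -> str:
    """Build a simplified cache key from a request URL.

    ignore_query_string plays the role of CacheIgnoreQueryString On.
    """
    parts = urlsplit(url)
    key = parts.path
    if parts.query and not ignore_query_string:
        key += "?" + parts.query
    return key
```

With the directive off, /search?q=a and /search?q=b produce different keys and are cached separately; with it on, both map to /search and share one cached entry.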

CacheIgnoreURLSessionIdentifiers Directive

Description:

Ignore defined session identifiers encoded in the URL when caching

Syntax:

CacheIgnoreURLSessionIdentifiers identifier [identifier]

Default:

CacheIgnoreURLSessionIdentifiers None

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

Sometimes applications encode the session identifier into the URL as in the following examples:

  • /someapplication/image.gif;jsessionid=123456789
  • /someapplication/image.gif?PHPSESSIONID=12345678

This causes cacheable resources to be stored separately for each session, which is often not desired. CacheIgnoreURLSessionIdentifiers lets you define a list of identifiers that are removed from the key that is used to identify an entity in the cache, so that cacheable resources are not stored separately for each session.

CacheIgnoreURLSessionIdentifiers None clears the list of ignored identifiers. Otherwise, each identifier is added to the list.

Example 1

CacheIgnoreURLSessionIdentifiers jsessionid

Example 2

CacheIgnoreURLSessionIdentifiers None
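The key normalisation can be illustrated with a short Python sketch covering the two URL forms shown above. It is a model under stated assumptions, not mod_cache's parsing code: strip_session_ids is a hypothetical helper, and matching identifier names exactly (case-sensitively) is an assumption.

```python
from typing import Iterable
from urllib.parse import urlsplit


def strip_session_ids(url: str, identifiers: Iterable[str]) -> str:
    """Return a simplified cache key with the listed identifiers removed."""
    wanted = set(identifiers)
    parts = urlsplit(url)

    # Path parameters: ";name=value" pieces attached to a path segment.
    segments = []
    for segment in parts.path.split("/"):
        name, *params = segment.split(";")
        kept = [p for p in params if p.split("=", 1)[0] not in wanted]
        segments.append(";".join([name] + kept))
    key = "/".join(segments)

    # Query parameters: "name=value" pairs separated by "&".
    query = [q for q in parts.query.split("&")
             if q and q.split("=", 1)[0] not in wanted]
    if query:
        key += "?" + "&".join(query)
    return key
```

With identifiers set to ["jsessionid", "PHPSESSIONID"], both example URLs above reduce to /someapplication/image.gif, so the image is stored once rather than once per session.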

CacheLastModifiedFactor Directive

Description:

The factor used to compute an expiry date based on the Last-Modified date.

Syntax:

CacheLastModifiedFactor float

Default:

CacheLastModifiedFactor 0.1

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

In the event that a document does not provide an expiry date but does provide a last-modified date, an expiry date can be calculated based on the time since the document was last modified. The CacheLastModifiedFactor directive specifies a factor to be used in the generation of this expiry date according to the following formula:

expiry-period = time-since-last-modified-date * factor
expiry-date = current-date + expiry-period

For example, if the document was last modified 10 hours ago, and factor is 0.1, then the expiry-period will be set to 10 * 0.1 = 1 hour. If the current time was 3:00pm, then the computed expiry-date would be 3:00pm + 1 hour = 4:00pm.

If the expiry-period would be longer than that set by CacheMaxExpire, then the latter takes precedence.

CacheLastModifiedFactor 0.5
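The formula and its interaction with CacheMaxExpire and CacheDefaultExpire can be worked through in Python. This is a sketch of the arithmetic only: expiry_date is a hypothetical function, its keyword defaults mirror the directive defaults given in this section, and it covers only documents that supply no expiry date of their own.

```python
from datetime import datetime, timedelta
from typing import Optional


def expiry_date(now: datetime,
                last_modified: Optional[datetime] = None,
                factor: float = 0.1,
                max_expire: int = 86400,
                default_expire: int = 3600) -> datetime:
    """Compute an expiry date for a document without an expiry date."""
    if last_modified is None:
        # Neither expiry nor last-modified date: CacheDefaultExpire applies,
        # and CacheMaxExpire does not override it.
        return now + timedelta(seconds=default_expire)

    # expiry-period = time-since-last-modified-date * factor
    period = (now - last_modified).total_seconds() * factor
    # CacheMaxExpire takes precedence over a longer computed period.
    period = min(period, max_expire)
    # expiry-date = current-date + expiry-period
    return now + timedelta(seconds=period)
```

Calling expiry_date with now set to 3:00pm and last_modified ten hours earlier reproduces the worked example: the result is 4:00pm on the same day.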

CacheLock Directive

Description:

Enable the thundering herd lock.

Syntax:

CacheLock on|off

Default:

CacheLock off

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

Compatibility:

Available in Apache 2.2.15 and later

The CacheLock directive enables the thundering herd lock for the given URL space.

In a minimal configuration the following directive is all that is needed to enable the thundering herd lock in the default system temp directory.

# Enable cache lock
CacheLock on

CacheLockMaxAge Directive

Description:

Set the maximum possible age of a cache lock.

Syntax:

CacheLockMaxAge integer

Default:

CacheLockMaxAge 5

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

The CacheLockMaxAge directive specifies the maximum age of any cache lock.

A lock older than this value in seconds will be ignored, and the next incoming request will be given the opportunity to re-establish the lock. This mechanism prevents a slow client from taking an excessively long time to refresh an entity.

CacheLockPath Directive

Description:

Set the lock path directory.

Syntax:

CacheLockPath directory

Default:

CacheLockPath /tmp/mod_cache-lock

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

The CacheLockPath directive allows you to specify the directory in which the locks are created. By default, the system’s temporary folder is used. Locks consist of empty files that exist only for stale URLs in flight, so they are significantly less resource intensive than the traditional disk cache.

CacheMaxExpire Directive

Description:

The maximum time in seconds to cache a document

Syntax:

CacheMaxExpire seconds

Default:

CacheMaxExpire 86400 (one day)

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

The CacheMaxExpire directive specifies the maximum number of seconds for which cacheable HTTP documents will be retained without checking the origin server. Thus, documents will be out of date at most this number of seconds. This maximum value is enforced even if an expiry date was supplied with the document.

CacheMaxExpire 604800

CacheStoreNoStore Directive

Description:

Attempt to cache requests or responses that have been marked as no-store.

Syntax:

CacheStoreNoStore On|Off

Default:

CacheStoreNoStore Off

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

Ordinarily, requests or responses with Cache-Control: no-store header values will not be stored in the cache. The CacheStoreNoStore directive allows this behavior to be overridden. CacheStoreNoStore On tells the server to attempt to cache the resource even if it contains no-store header values. Resources requiring authorization will never be cached.

CacheStoreNoStore On

Warning: As described in RFC 2616, the no-store directive is intended to "prevent the inadvertent release or retention of sensitive information (for example, on backup tapes)." Enabling this option could store sensitive information in the cache. You are hereby warned.

See also:

  • CacheIgnoreCacheControl (p. 430)
  • CacheStorePrivate (p. 439)

CacheStorePrivate Directive

Description:

Attempt to cache responses that the server has marked as private

Syntax:

CacheStorePrivate On|Off

Default:

CacheStorePrivate Off

Context:

server config, virtual host

Status:

Extension

Module:

mod_cache

Ordinarily, responses with Cache-Control: private header values will not be stored in the cache. The CacheStorePrivate directive allows this behavior to be overridden. CacheStorePrivate On tells the server to attempt to cache the resource even if it contains private header values. Resources requiring authorization will never be cached.

CacheStorePrivate On

Warning: This directive will allow caching even if the upstream server has requested that the resource not be cached. This directive is only ideal for a ‘private’ cache.
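How CacheStorePrivate and CacheStoreNoStore combine can be sketched as a single Python decision function. This is an illustrative model only: may_store and its parameters are hypothetical, it inspects only the Cache-Control values passed in, and the real module applies further cacheability checks that are not shown.

```python
from typing import Set


def directives(cache_control: str) -> Set[str]:
    """Split a Cache-Control value into lower-cased directive names."""
    return {part.split("=", 1)[0].strip().lower()
            for part in cache_control.split(",") if part.strip()}


def may_store(request_cc: str = "",
              response_cc: str = "",
              store_no_store: bool = False,
              store_private: bool = False,
              requires_authorization: bool = False) -> bool:
    """Model whether a response may be written to the cache."""
    # Resources requiring authorization are never cached.
    if requires_authorization:
        return False

    request = directives(request_cc)
    response = directives(response_cc)

    # no-store on the request or the response blocks storage
    # unless CacheStoreNoStore is On.
    if "no-store" in (request | response) and not store_no_store:
        return False

    # private on the response blocks storage unless CacheStorePrivate is On.
    if "private" in response and not store_private:
        return False

    return True
```

In this model a response marked Cache-Control: private is stored only when store_private is true, and no setting makes a resource that requires authorization storable.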

See also:

  • CacheIgnoreCacheControl (p. 430)
  • CacheStoreNoStore (p. 438)
