

Opcode caches save energy and expense, improve the overall user experience of web sites, and are often among the simplest optimizations to implement. This article explains the basics of installing, configuring, and tuning an opcode cache for PHP, the Alternative PHP Cache (APC).
The Alternative PHP Cache is “a free, open, and robust framework for caching and optimizing PHP intermediate code.” Other opcode caches exist for PHP, and a list of the known ones is given at the end of this article for those interested in further research. This article also assumes a common server environment such as Apache with mod_php.
APC is an actively maintained open source project available via PECL (http://pecl.php.net). APC offers several configuration options for performance tuning and a user variable cache, and there are plans to support an optimizer that is currently under development. Opcode caches give PHP applications a large performance gain by avoiding the lexical analysis and parsing of human readable source code during the compilation phase of PHP (this will be discussed in detail later).

The above graph compares an example web site that implements both the APC file cache and user cache. The performance gains are measured by Apache Bench in page requests per second. Apache Bench is a simple load testing and performance benchmarking tool available at http://apache.org as part of the HTTP server project.
Because opcode caches offer a significant gain in performance for relatively little effort, it's worthwhile just to get a basic installation running before delving into more specific configurations and usage.
Installation can be done automatically with the command pecl install apc. APC can also be installed by hand by downloading the source from http://pecl.php.net/packages/APC, and executing the phpize, ./configure, make, and make install commands from the source directory of the package.
APC's configuration options are read from the standard PHP ini files. The most basic options should consist of:
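A minimal sketch of such a configuration, assuming the two options meant are apc.enabled and apc.shm_size. The size shown is only a placeholder, not a recommendation:

```ini
; php.ini (or a separate apc.ini in the scanned ini directory)
extension = apc.so

; On to enable, Off to disable.
apc.enabled = On

; Size of the shared memory cache. Placeholder value; older APC
; releases expect a plain number of megabytes (e.g. 64) instead.
apc.shm_size = 64M
```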
The first option, apc.enabled (On to enable, Off to disable), allows the opcode cache to be switched off. This can be useful if problems arise and APC is suspected. The second option controls the size of the opcode cache. Set this to something appropriate for your system and code base, and increase it as necessary by monitoring apc.php.
Within the APC source code is an apc.php file. It's good practice to copy this to a secured directory that can be accessed via the HTTP server. This page lets you monitor the usage of your cache, configurations, user variables, and files. A very useful tool!
After the installation, configuration, and optional monitoring are set up, a restart of your server environment will bring up the server with APC running and ready to serve requests.
It's always important to verify performance gains to ensure that a change is having the intended effect. Apache Bench (AB) is a tool available with Apache that allows simple load testing and benchmarking of pages via HTTP. After APC is installed, it is a good time to verify that the requests per second (RPS) have shown a significant improvement. If gains aren't apparent, monitoring the error log and verifying via apc.php that files are being cached are good starting points for investigating possible problems.
Part of the process of dynamic programming languages such as PHP is the job of converting human readable code into executable structures that the processor or virtual machine can execute. This involves some expensive processes like lexical analysis and parsing into a structured binary form (left diagram). Because this is an expensive process, many languages cache the resulting structures to save time on the next run.

APC works by intercepting the normal compilation phase and storing or fetching the resulting opcodes from a cache (right diagram). The results are referred to as opcodes, which are just C data structures that can be interpreted by the PHP virtual machine. The Vulcan Logic Disassembler (VLD) is a tool used by PHP core developers to debug or analyze these structures. VLD, written by Derick Rethans, is available at http://pecl.php.net/packages/VLD/.
Given a simple PHP script like the following:
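The script below is only a stand-in chosen for illustration, not the author's original listing. The function name greeting and the file name hello.php are hypothetical. The comment shows one way to ask VLD for the opcodes, assuming the VLD extension is installed:

```php
<?php
// hello.php: a deliberately tiny script (hypothetical example).
//
// With the VLD extension loaded, the compiled opcodes can be dumped
// without executing the script:
//   php -d vld.active=1 -d vld.execute=0 hello.php

// Build a greeting for the given name.
function greeting($name) {
    return 'Hello ' . $name;
}

echo greeting('World') . PHP_EOL;
```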
VLD will provide the following opcode representation of the compiled code:
Opcode structures often consume more space than the actual source code. This can impact the configuration of APC as more memory than expected may be required to store the resulting opcodes.
In addition to the file cache provided by APC, it's also capable of storing application specific data in the form of a user variable cache. This is a fast mechanism for storing and retrieving PHP variables. Because APC knows about internal PHP structures there's no need for any serialization or de-serialization of normal PHP data. The only exceptions to this are objects, which must be stored in a serialized form and are therefore slower to access. Resources also cannot be stored due to their per-process uniqueness.
User caches are typically application specific, so it's difficult to say anything that's not general about what data to store. User-specific data doesn't work well with a round-robin DNS setup. This is both because of the large amount of user data that usually exists, and because a user could hit many servers without returning to one that has their data cached. However, users that always return to the same server and have long running sessions may be able to cache more user-specific data. Other, more useful data includes configurations, statistics, and common data that resides in databases, such as lists of cities, addresses, zones, etc.
APC has a number of functions available to get statistics, insert, delete, and retrieve values. The following is only a summary; for full documentation please see the PHP manual: http://www.php.net/manual/en/book.apc.php.
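As a small illustration of the store, fetch, and delete calls, here is a sketch of a cached lookup for a list of cities. The helper names (load_cities_from_database, get_cities), the key, and the TTL are hypothetical. The shim at the top is not part of APC; it is an assumption added only so the sketch also runs where the extension is not loaded:

```php
<?php
// Shim (NOT part of APC): a per-process array that ignores the TTL,
// used only when the real apc_* functions are unavailable.
if (!isset($GLOBALS['__apc_shim'])) {
    $GLOBALS['__apc_shim'] = array();
}
if (!function_exists('apc_store')) {
    function apc_store($key, $value, $ttl = 0) {
        $GLOBALS['__apc_shim'][$key] = $value;
        return true;
    }
}
if (!function_exists('apc_fetch')) {
    function apc_fetch($key, &$success = null) {
        $success = array_key_exists($key, $GLOBALS['__apc_shim']);
        return $success ? $GLOBALS['__apc_shim'][$key] : false;
    }
}
if (!function_exists('apc_delete')) {
    function apc_delete($key) {
        $existed = array_key_exists($key, $GLOBALS['__apc_shim']);
        unset($GLOBALS['__apc_shim'][$key]);
        return $existed;
    }
}

// Hypothetical loader standing in for a slow database query.
function load_cities_from_database() {
    return array('Lisbon', 'Osaka', 'Quito');
}

// Try the user cache first and fall back to the source on a miss.
function get_cities() {
    $cities = apc_fetch('cities', $hit);
    if (!$hit) {
        $cities = load_cities_from_database();
        apc_store('cities', $cities, 300); // keep for five minutes
    }
    return $cities;
}

print_r(get_cities());  // a miss on the first call fills the cache
apc_delete('cities');   // drop the entry, e.g. after the table changes
```

Plain arrays and scalars like these are the cheap case described above; storing objects would add serialization cost.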
APC stores the cached opcodes in a shared memory segment for quick access to the data. Because this memory is shared among all the server processes, a locking mechanism must be used to ensure exclusive access while modifying the memory structures. This creates a significant performance bottleneck. Previous versions of APC used file locking, which is extremely reliable but comes at a significant performance cost. Recent versions of APC use the pthread mutex locks that are standard on most platforms. Other locking alternatives are available, such as Inter-Process Communication (IPC) semaphore locks, Linux futex locks, and most recently spin locks ported from the PostgreSQL project. The following graphs show the recommended locking mechanisms in a high contention benchmark. Time in seconds is measured along the horizontal x-axis, with total CPU on the vertical y-axis. This is broken down into system CPU in red and user CPU in blue.

Because pthread mutex locks are heavily used in other applications, they provide a more reliable solution than spin locks and are the default in recent APC versions. The spin locking mechanisms were ported directly from the PostgreSQL project and can provide a significant improvement over pthread locking where it is needed. Spin locks are currently experimental, however, and have some known deadlocking issues caused by some PHP execution signals. A solution is currently being developed that should resolve this issue and provide stable spin locking for general use.
APC is, in a sense, one giant hash table shared among all the web server processes. As with any other hash table its performance is directly related to the number of collisions of its hash functions. Stated simply, for optimal performance the size of the hash table should be proportional to the number of elements stored within it. APC allows two configuration options for this.
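Assuming the two options meant are apc.num_files_hint and apc.user_entries_hint, a configuration might look like the sketch below. The numbers are placeholders and should be replaced by rough counts taken from apc.php for the actual site:

```ini
; Approximate number of distinct PHP source files that will be cached.
apc.num_files_hint = 2000

; Approximate number of distinct entries expected in the user cache.
apc.user_entries_hint = 4096
```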
Compared to memory access, modern hard disk drives are a performance quagmire, and they should be avoided as much as possible. Despite reading opcodes completely from memory, APC accesses the disk to make sure the source files haven't been modified, as this would require updates to the cached opcodes. Without this check, source code changes won't take effect. If, however, source code changes don't happen frequently and one is willing to restart the entire server when they do, this expensive step can also be avoided. This makes the server nearly disk-less for typical PHP applications. Setting the apc.stat option to false configures APC to disable the stat calls to the filesystem. This avoids both the disk access and the system call. The performance gains are substantial for large code bases that include many libraries.
Alternatively, some code bases may require frequent file updates despite the performance cost. It's not uncommon for updates to be accomplished with rsync, cvs, svn, or some other version control system or transport. Some of these tools, however, backdate the modified times that APC uses to determine if files have been updated. If this is the case, apc.stat_ctime can be set to true to force APC to check the inode change time (ctime) rather than the modified time of the file. This often mitigates problems associated with these tools.
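The two approaches could be written as follows. This is a sketch only; which one fits depends on how code is deployed, and only one of the two would be active at a time:

```ini
; Option 1: code changes only at deploy time and the server is
; restarted afterwards, so the stat call per file can be skipped.
apc.stat = Off

; Option 2: keep the stat check, but have APC look at ctime so that
; tools which backdate the modified time do not hide an update.
;apc.stat = On
;apc.stat_ctime = On
```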
“All those moments will be lost in time, like tears in rain.” -Blade Runner
APC offers both a default file and user time to live (TTL) value. This limits the validity of a file or value to a set amount of time. This can be useful either as a quick and dirty expiry mechanism or as a safety mechanism to ensure that stale data isn't used from the cache.
An apc.gc_ttl setting also controls the garbage collection TTL. Because multiple processes could be using APC cache contents at the same time, data cannot be deleted until all processes have finished using it. If a process dies without releasing its reference to this data, APC can never know when it's safe to delete the data from the cache. The gc_ttl determines how long APC waits before it deletes an item from the cache even if it appears that a process is still using the entry. This is typically set to a large value to ensure items are never prematurely deleted.
The APC cache starts out empty after a server restart, so performance will not be ideal until the cache is fully “warmed” up with data. This can cause not only poor general performance but also a flood of locked processes waiting their turn to insert data. Priming the cache is a technique that solves this problem by inserting files and user variables that are expected to be requested on initial requests. To work properly, the cache must be primed after restart, but before any requests are received.
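Assuming the default TTLs are set through apc.ttl and apc.user_ttl, the three settings could be configured as in this sketch. All values are in seconds and are placeholders:

```ini
; Default TTL for cached files.
apc.ttl = 7200

; Default TTL for user cache entries stored without an explicit TTL.
apc.user_ttl = 7200

; How long an entry that still appears to be in use is kept before
; garbage collection may remove it anyway.
apc.gc_ttl = 3600
```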

The iptables command under Linux is useful for controlling access during this process. It's important to understand that the priming must currently be done via a server process because command line scripts do not have access to the server's APC cache. As an example, a script that uses the apc_compile_file and apc_store functions to prime the cache can be placed within an access-restricted path under the server's document root. Before the server is started, iptables or other routing restrictions can be placed on every IP except for 127.0.0.1 or another internally known address. Once the server is restarted, a single request from a command line utility such as curl can fetch this page, which in turn primes the cache. Once the request has finished, the iptables or routing restrictions can be lifted and the server will be ready to serve requests without worries of poor cache performance.
In some cases there may be files that should not be cached. This could be due to the large size of auto-generated code or the frequency with which pages are updated. It could also be a temporary fix for code that isn't working under the opcode cache (opcode caches sometimes have problems with conditional includes, classes, or functions).
With the release of PHP 5.2.0, support for RFC1867, or file upload progress, became available. This option allows APC and other extensions to report back information on the progress of a file upload. JavaScript requests from the client can then be sent back to the server, which will in turn display information back to the user about the progress of an upload. APC supports this mechanism by inserting and updating user variables with details about file uploads.
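A priming page along these lines might look like the sketch below. The function name prime_cache, the file list, and the stored variable are hypothetical. The shim is not part of APC and is an assumption added only so the sketch runs where the extension is not loaded:

```php
<?php
// prime.php: meant to be requested once after a restart, for example
//   curl http://127.0.0.1/restricted/prime.php   (hypothetical path)

// Shim (NOT part of APC) so the sketch runs without the extension.
if (!function_exists('apc_compile_file')) {
    function apc_compile_file($filename) {
        return is_file($filename);
    }
}
if (!function_exists('apc_store')) {
    function apc_store($key, $value, $ttl = 0) {
        return true;
    }
}

// Compile each file into the opcode cache and store each variable in
// the user cache. Returns how many of each were primed successfully.
function prime_cache(array $files, array $variables) {
    $primed = array('files' => 0, 'variables' => 0);
    foreach ($files as $file) {
        if (is_file($file) && apc_compile_file($file)) {
            $primed['files']++;
        }
    }
    foreach ($variables as $key => $value) {
        if (apc_store($key, $value)) {
            $primed['variables']++;
        }
    }
    return $primed;
}

// Hypothetical inputs: the files and data the first requests will need.
$result = prime_cache(
    array(__FILE__),
    array('site_config' => array('theme' => 'default'))
);
printf("primed %d files and %d variables\n",
    $result['files'], $result['variables']);
```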
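Assuming the apc.filters option is used for this, an exclusion list could be sketched as below. The patterns are hypothetical. According to the PHP manual, they are matched against the name passed to include or require rather than the absolute path, so they should be checked against how the application actually includes its files:

```ini
; Comma-separated regular expressions. A leading "-" (the default)
; keeps matching files out of the cache; a leading "+" forces caching.
apc.filters = "-generated/,-legacy_report[.]php$"
```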
The APC key containing the upload information is the apc.rfc1867_prefix value followed by the identifier of the upload. For an upload identified by “A86DCF0C” the key is upload_A86DCF0C (upload_ is the default prefix).
An interesting use of APC user variables is to control site behavior, especially on a large-scale site with many servers. A good example would be the enabling of a new feature that could potentially cause problems or bad user experiences. Normally new features might be pushed out to a live site via an update to the source code. This method often requires a relatively large amount of time and effort on the part of the engineers. In the event that the feature needs to be disabled, new code must be pushed to every server. In cases of conversions this can also lead to data inconsistencies, as one part of the site runs one version of the code while another part runs a different one.
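A status endpoint polled by the client could be sketched as follows. It assumes apc.rfc1867 is enabled in php.ini and that the upload form carries a hidden field named APC_UPLOAD_PROGRESS (the default field name) before the file input. The function name upload_progress is hypothetical, and the shim is not part of APC:

```php
<?php
// Shim (NOT part of APC) so the sketch runs without the extension.
if (!isset($GLOBALS['__apc_shim'])) {
    $GLOBALS['__apc_shim'] = array();
}
if (!function_exists('apc_store')) {
    function apc_store($key, $value, $ttl = 0) {
        $GLOBALS['__apc_shim'][$key] = $value;
        return true;
    }
}
if (!function_exists('apc_fetch')) {
    function apc_fetch($key, &$success = null) {
        $success = array_key_exists($key, $GLOBALS['__apc_shim']);
        return $success ? $GLOBALS['__apc_shim'][$key] : false;
    }
}

// Look up the progress entry APC maintains for one upload.
// Returns null when the id is unknown or the total is not yet known.
function upload_progress($uploadId, $prefix = 'upload_') {
    $status = apc_fetch($prefix . $uploadId);
    if ($status === false || empty($status['total'])) {
        return null;
    }
    return array(
        'percent' => (int) floor(100 * $status['current'] / $status['total']),
        'done'    => !empty($status['done']),
    );
}

// The id would normally come from the polling request.
var_dump(upload_progress('A86DCF0C'));
```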
APC variables help resolve this by reducing the effort and time needed to enable and disable features. Instead of pushing out new or previous code, requests can be made via HTTP to restricted pages that update APC variables. Specific code is then conditionally executed based upon these APC values. The time for rolling back features now only takes as long as the time required for one HTTP request to every server. It's also much simpler to change a single variable than to deal with pushing and reverting many code revisions.
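A feature switch of this kind can be sketched in a few lines. The function names, the key prefix, and the feature name are hypothetical. The shim is not part of APC and is included only so the sketch runs where the extension is not loaded:

```php
<?php
// Shim (NOT part of APC) so the sketch runs without the extension.
if (!isset($GLOBALS['__apc_shim'])) {
    $GLOBALS['__apc_shim'] = array();
}
if (!function_exists('apc_store')) {
    function apc_store($key, $value, $ttl = 0) {
        $GLOBALS['__apc_shim'][$key] = $value;
        return true;
    }
}
if (!function_exists('apc_fetch')) {
    function apc_fetch($key, &$success = null) {
        $success = array_key_exists($key, $GLOBALS['__apc_shim']);
        return $success ? $GLOBALS['__apc_shim'][$key] : false;
    }
}

// Called from a restricted admin page on each server.
function set_feature($name, $enabled) {
    return apc_store('feature_' . $name, (bool) $enabled);
}

// Called from application code. Unknown features count as disabled,
// so a freshly restarted server falls back to the old behavior.
function feature_enabled($name) {
    $value = apc_fetch('feature_' . $name, $found);
    return $found && $value === true;
}

set_feature('new_checkout', true);
if (feature_enabled('new_checkout')) {
    echo "running the new code path\n";
} else {
    echo "running the old code path\n";
}
set_feature('new_checkout', false); // roll back with a single call
```

Treating a missing key as disabled is a design choice: because the cache is emptied on restart, the flag has to be set again (or primed) before the new code path is taken.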
Opcode caches are a must-have for any PHP-based site serious about performance and its ability to scale. I hope this article has provided a good starting point for a basic installation as well as plenty of future work and tweaks for specific applications. The PHP documentation is the best place for more information about configuring and installing APC: http://www.php.net/manual/en/book.apc.php.
In addition to myself, APC has many contributors who have written, maintained, enhanced, fixed, or otherwise worked on APC development, including George Schlossnagle, Daniel Cowgill, Rasmus Lerdorf, Gopal Vijayaraghavan, Edin Kadribasic, Ilia Alshanetsky, Marcus Börger, and Sara Golemon. I'm grateful for their work as well as that of those submitting bugs and patches for APC, PHP, and various other projects.
Other PHP Opcode caches include:
Excellent writeup - a lot of the APC settings tend to be cryptic even after reading the docs that come with it. A question about stability - it looks like APC has trouble on Debian etch with PHP + Suhosin - segfaults are not too uncommon. Is there a known workaround?
Thanks. Yeah, oftentimes cryptic settings come from its long history and changes. I'm sure things can get cleaned up in time, you just don't want to break backwards compatibility for everyone unless you have to. I haven't done much work directly with Suhosin, but looking at the code just now it seems there are plenty of things it's doing that could mess with the functionality of APC. This often happens with other Zend extensions and patches, including some other optimizers from Zend. Unfortunately it's usually the case that you just need to determine any settings that may be conflicting, or resolve to use only one of the two extensions. Sorry that I don't have an easy solution to this one, but both extensions are trying to mess with the same internal structures, which creates compatibility difficulties. In the case of Suhosin, I'll add it to my list to look at at some point in case there's something obvious that can be done in the future....

