

If you've ever searched the internet for an article or reference on writing an IRC bot in PHP, you were likely disappointed. Most such documents are so outdated that their examples are based on PHP 4 and its socket functions. Few show more than the most trivial of bot implementations, and most cover only the tip of the iceberg where the IRC protocol is concerned. This article is intended to help remedy that situation.
The PHP Community channel on the Freenode IRC network, #phpc, had a longstanding bot called “Ai”. Like many bots at the time of her creation, she was based on PHP 4. Her source was never released and the only way to have updates made to her was to contact the original developer and wait until his availability allowed him time to work on her.
With the coming end-of-life of PHP 4 and at the encouragement of channel users, I decided to start a project to develop a new bot based on PHP 5 that would fully utilize its new object model and offer users a chance to contribute to the bot they used in their channel.
I could go into a long discussion of some of the earlier iterations of the project, but I'm not sure you would glean much from that. Suffice it to say that the bot that would later come to be known as Phergie grew organically and is still in an alpha state of development today. Instead, I'll go straight to her current design, the reasons behind it, and some additions planned for near-term releases.
To start with, I needed to get intimately familiar with the IRC protocol in order to make informed design decisions. The IRC Help web site has an excellent section on the RFCs related to the IRC protocol and extensions to it, such as the Client-To-Client Protocol or CTCP. If you don't plan on writing your own bot from scratch, this will at the very least give you a deeper understanding of the design of any existing library you might use. If you do write a bot from the ground up, this knowledge is crucial to getting it to a functional point and being able to troubleshoot issues with it.
The Phergie project came about around the same time that I discussed an idea with a friend of mine, Ben Ramsey, to wrap the libircclient library in a PECL extension. We both dabbled in that project on and off before becoming involved in other tasks, so for the moment it's on the back burner. Assuming that it would one day see a stable release, though, I wanted my new project to be capable of utilizing it when that time came.
Lastly, I wanted to make it as easy as possible to get the bot up and running with minimal configuration as well as to create new plugins for the bot to extend and enhance its base functionality. With all these thoughts in mind, I started work on the core. See Figure 1 for a diagram of its constituent components and how they interact with each other.

The bootstrap file is responsible for basic setup tasks that need to take place before the bot can run. These include things like checking the PHP version, setting the include path, reading the configuration file, and instantiating the driver and plugin classes.
It eventually calls a driver method that starts an event handling loop, which receives events and passes them on to plugins until the bot's connection is terminated. Once that method returns, the bootstrap checks its return value and, if needed, reinitializes itself in order to establish a new connection in place of a terminated one or to reload the configuration file. See below for the current revision of the bootstrap file source code.
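The restart behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual bootstrap: the SKETCH_RETURN_* constants, the sketchBootstrap function, and the two callbacks are hypothetical names standing in for the real return values, configuration loader, and driver.

```php
<?php
// Hypothetical return codes; the real driver's values are not shown here.
const SKETCH_RETURN_END = 0;       // shut down for good
const SKETCH_RETURN_RECONNECT = 1; // open a new connection with the same settings
const SKETCH_RETURN_RELOAD = 2;    // re-read the configuration, then reconnect

/**
 * Runs the event loop repeatedly until the driver asks to stop.
 *
 * $loadConfig returns the settings array. $runDriver receives the settings,
 * blocks inside the event loop, and returns one of the SKETCH_RETURN_* codes.
 * Returns the number of times the event loop was started.
 */
function sketchBootstrap($loadConfig, $runDriver)
{
    $config = call_user_func($loadConfig);
    $runs = 0;
    do {
        $runs++;
        $status = call_user_func($runDriver, $config);
        if ($status === SKETCH_RETURN_RELOAD) {
            // Only a reload request re-reads the configuration.
            $config = call_user_func($loadConfig);
        }
    } while ($status !== SKETCH_RETURN_END);

    return $runs;
}
```

Keeping the loop in the bootstrap rather than in the driver means a driver only has to report why it stopped; the decision about what happens next lives in one place.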
The Phergie project follows the example of PHP itself by using a centralized INI configuration file that holds both core settings (the server hostname, the nick and username for the bot, etc.) and settings that are specific to individual plugins.
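To give a feel for that layout, here is a small sketch that reads core and plugin settings from a single INI string. The section names, setting names, and values are all made up for illustration and are not taken from the stock configuration file; grouping plugin settings into their own section is likewise an assumption.

```php
<?php
// Hypothetical settings, for illustration only.
$ini = <<<'INI'
; Core settings
[core]
server = "irc.example.org"
port = 6667
nick = "examplebot"
username = "examplebot"

; Settings for one plugin
[dns]
timeout = 5
INI;

// The second argument groups settings by section name.
$settings = parse_ini_string($ini, true);

// Core and plugin settings come back as nested arrays.
$nick = $settings['core']['nick'];
$dnsTimeout = (int) $settings['dns']['timeout'];
```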
A feature desired early on in the project was the ability to modify configuration settings both in memory and in the file on disk from within plugins. This made it possible to persist setting modifications between executions of the bot without requiring direct access to the configuration file itself.
PHP's native parse_ini_file function provides little detail about parse errors and tends to have parsing issues when setting values contain an equal sign. There is also no equivalent function to write changes back to an INI file. Preserving comments and formatting when writing was an important related concern.
As such, a class is currently in development to handle reading and writing of the configuration file and to add a number of features such as support for constants and variables in setting values, use of arithmetic and bitwise operators, arrays of values under a single setting name, and smart caching of setting values into memory. Other classes may be developed in the future to handle other configuration file formats including PHP, XML, and JSON.
See below for the core settings section of the current revision of the stock configuration file.
At the center of the core is the driver, which is responsible for handling communications between the bot and the server. This includes prioritizing, formatting, and sending commands issued via the API to the server as well as converting data from incoming events syndicated by the server into usable data objects.
The only driver currently in active development is based on streams and uses the Socket wrapper. Other drivers can easily be created to wrap any existing IRC PECL extensions or PHP libraries you may want to use, such as PEAR::Net_SmartIRC. Each driver extends a base abstract class that dictates the API it should implement in order for it to be usable by the rest of the core.
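The idea of a base class dictating the driver API can be sketched as below. The class and method names (SketchDriver, doJoin, doPrivmsg, and so on) are hypothetical, and the in-memory driver is an invention for the example that records outgoing lines instead of opening a socket.

```php
<?php
// Hypothetical API; the real base class is not reproduced here.
abstract class SketchDriver
{
    /** Opens the connection to the server. */
    abstract public function connect($host, $port);

    /** Sends one raw protocol line to the server. */
    abstract protected function send($line);

    // Commands shared by every driver are built on top of send().
    public function doJoin($channel)
    {
        $this->send('JOIN ' . $channel);
    }

    public function doPrivmsg($target, $text)
    {
        $this->send('PRIVMSG ' . $target . ' :' . $text);
    }
}

// A stand-in driver that records lines rather than writing to a socket.
class SketchMemoryDriver extends SketchDriver
{
    public $sent = array();

    public function connect($host, $port)
    {
        // Nothing to open in this sketch.
    }

    protected function send($line)
    {
        $this->sent[] = $line;
    }
}
```

A driver wrapping a PECL extension or an existing library would implement the same abstract methods, so the rest of the core would not need to know which one is in use.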
There are two basic types of events in the IRC protocol: requests and responses. It's important to note the difference because each type has its own class and is handled differently within plugins.
Requests are initiated by users and include both actions initiated by the bot and those initiated by other users and syndicated to the bot by the server. Each request type has its own event handler in the driver. The only oddity I came across while writing the portion of the streams driver that parses incoming request events was the PRIVMSG command. When the command is used to send a message directly to a user, its first parameter is the nick of the intended recipient (in this case the bot) rather than a channel name. Since that information is not helpful in identifying the source of the event, I created a new method that returns either the channel name or the nick of the user who originated the message.
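The PRIVMSG quirk is easier to see in code. The following is a simplified sketch, not the streams driver's parser: SketchRequest, fromLine, and getSource are hypothetical names, the pattern only handles the common prefix/command/target/text shape, and only the # and & channel prefixes from RFC 1459 are recognized.

```php
<?php
// Hypothetical request class, for illustration only.
class SketchRequest
{
    public $nick;    // nick of the user who sent the message
    public $command; // e.g. PRIVMSG
    public $target;  // channel name, or the recipient's nick
    public $text;    // trailing message text

    /** Parses ":nick!user@host COMMAND target :text", or returns null. */
    public static function fromLine($line)
    {
        $pattern = '/^:([^! ]+)![^ ]+ ([A-Z]+) ([^ ]+) :(.*)$/';
        if (!preg_match($pattern, rtrim($line, "\r\n"), $m)) {
            return null;
        }
        $request = new self();
        $request->nick = $m[1];
        $request->command = $m[2];
        $request->target = $m[3];
        $request->text = $m[4];
        return $request;
    }

    /** Returns where a reply should go: the channel, or the sender's nick. */
    public function getSource()
    {
        // In a direct message the target is the bot itself, so use the sender.
        $isChannel = strpos('#&', $this->target[0]) !== false;
        return $isChannel ? $this->target : $this->nick;
    }
}
```

With a method like this, a plugin can reply to whatever getSource returns without caring whether the message arrived in a channel or in a private conversation.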
Responses are initiated by the server as a result of an action taken by the bot, where each potential response is specific to that particular action. Because there are so many potential responses, they have a single “blanket” event handler. The response type is stored in a property of the event instance that is automatically passed to the plugin by the driver. Since the formatting of the related section on the IRC Help web site was fairly consistent, I saved a local copy of the page and wrote a quick throw-away script using the DOM extension to extract the name, description, and numeric code of each response and format it for easy transplantation into the response class.
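A rough sketch of the blanket handler approach follows. The SketchResponse class and sketchOnResponse function are hypothetical; the three numeric codes are taken from the IRC RFCs, but the real response class covers far more of them and may name or store them differently.

```php
<?php
// Hypothetical response class holding a handful of numeric replies.
class SketchResponse
{
    const RPL_WELCOME = '001';
    const RPL_NAMREPLY = '353';
    const ERR_NICKNAMEINUSE = '433';

    public $code;   // three-digit numeric as a string
    public $params; // everything after the numeric, unparsed

    /** Parses ":server 433 * nick :text", or returns null. */
    public static function fromLine($line)
    {
        if (!preg_match('/^:[^ ]+ (\d{3}) (.*)$/', rtrim($line, "\r\n"), $m)) {
            return null;
        }
        $response = new self();
        $response->code = $m[1];
        $response->params = $m[2];
        return $response;
    }
}

// A single handler receives every response and branches on its code.
// Returns a raw command to send, or null when nothing needs doing.
function sketchOnResponse(SketchResponse $response, $nick)
{
    if ($response->code === SketchResponse::ERR_NICKNAMEINUSE) {
        // Assumed recovery strategy: retry with an underscore appended.
        return 'NICK ' . $nick . '_';
    }
    return null;
}
```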
On its own, the driver is not very useful. To make it so, code must be written to receive events from the server, act on them in some way, and in most cases dispatch commands to be sent back to the server in response. These units of code are referred to as plugins.
A base plugin class provides functionality that most plugins need, such as automatically determining the short name of a subclass for identification purposes, creating a local directory or database for storage, and handling configuration settings. All plugin classes extend either the base class or a subclass of it.
Some tasks that are required for all plugins are performed in the base plugin class constructor. Rather than requiring subclasses to explicitly call the parent constructor within their own, the base constructor calls another method, init, and contains a stub of this method so that subclasses can implement it only if needed to perform initialization tasks when the plugin is loaded.
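The constructor and init arrangement might look something like this. All class names are hypothetical, and deriving the short name by stripping a prefix and suffix from the class name is an assumption made to keep the example small.

```php
<?php
// Hypothetical base class, for illustration only.
abstract class SketchInitPlugin
{
    protected $name;

    // Final, so subclasses cannot skip the shared setup by accident.
    final public function __construct()
    {
        // Shared task: derive a short name such as "greeter" from the class name.
        $short = preg_replace('/^Sketch|Plugin$/', '', get_class($this));
        $this->name = strtolower($short);

        // Hand control to the subclass, if it has anything to set up.
        $this->init();
    }

    /** Stub; subclasses override this only when they need to. */
    protected function init()
    {
    }

    public function getName()
    {
        return $this->name;
    }
}

// Overrides init() to prepare its own state.
class SketchGreeterPlugin extends SketchInitPlugin
{
    public $greeting;

    protected function init()
    {
        $this->greeting = 'Hello from ' . $this->name;
    }
}

// Needs no initialization, so it defines nothing at all.
class SketchQuietPlugin extends SketchInitPlugin
{
}
```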
Before the bootstrap instantiates a plugin, it calls a static method in that plugin class responsible for checking to ensure that the environment meets that plugin's needs. This can include the PHP version, loaded PHP extensions, and other Phergie plugins. If this method returns false, the bootstrap simply skips over instantiation of that plugin and continues.
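As a sketch of that check, assuming a static method named checkDependencies (a hypothetical name) that receives the class names of the plugins loaded so far:

```php
<?php
// Hypothetical base class; by default a plugin has no requirements.
abstract class SketchCheckedPlugin
{
    public static function checkDependencies(array $loadedPlugins)
    {
        return true;
    }
}

// Requires a minimum PHP version and a loaded extension.
class SketchDnsPlugin extends SketchCheckedPlugin
{
    public static function checkDependencies(array $loadedPlugins)
    {
        return version_compare(PHP_VERSION, '5.0.0', '>=')
            && extension_loaded('pcre');
    }
}

// Requires another plugin that does not exist in this sketch.
class SketchNeedyPlugin extends SketchCheckedPlugin
{
    public static function checkDependencies(array $loadedPlugins)
    {
        return in_array('SketchMissingPlugin', $loadedPlugins, true);
    }
}

// What the bootstrap might do: check first, instantiate only on success.
function sketchLoadPlugins(array $classes)
{
    $plugins = array();
    foreach ($classes as $class) {
        $check = array($class, 'checkDependencies');
        if (!call_user_func($check, array_keys($plugins))) {
            continue; // environment not suitable; skip this plugin
        }
        $plugins[$class] = new $class();
    }
    return $plugins;
}
```

Because the method is static, the check can run before any instance exists, so a plugin whose requirements are not met never gets the chance to fail halfway through its own setup.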
When the bootstrap does instantiate a plugin, it stores a reference to the driver instance in that plugin to allow it to issue commands to the driver which are in turn sent to the server. The base plugin class makes use of the magic method __call to direct calls to nonexistent methods to the driver instance. This adds the convenience of being able to call a driver method like any other plugin method instead of referencing the driver instance property of the plugin every time a driver method is called.
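Here is a minimal sketch of that forwarding. The class names and the doPrivmsg method are hypothetical, and throwing an exception for unknown methods is my own choice for the example rather than a description of the base plugin class's actual behavior.

```php
<?php
// Stand-in driver that records what it is asked to send.
class SketchRecordingDriver
{
    public $sent = array();

    public function doPrivmsg($target, $text)
    {
        $this->sent[] = 'PRIVMSG ' . $target . ' :' . $text;
    }
}

// Hypothetical base class that forwards unknown calls to the driver.
abstract class SketchForwardingPlugin
{
    protected $driver;

    public function setDriver($driver)
    {
        $this->driver = $driver;
    }

    // Invoked by PHP whenever an inaccessible or undefined method is called.
    public function __call($method, $args)
    {
        if (!method_exists($this->driver, $method)) {
            throw new BadMethodCallException('Unknown method: ' . $method);
        }
        return call_user_func_array(array($this->driver, $method), $args);
    }
}

class SketchHelloPlugin extends SketchForwardingPlugin
{
    public function greet($channel)
    {
        // Reads like a plugin method, but is actually handled by the driver.
        $this->doPrivmsg($channel, 'Hello!');
    }
}
```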
The meat of a plugin is its event handler methods. The base plugin class contains stub declarations of methods intended to handle specific types of events, which subclasses override. When the driver receives an event, it calls a method in the base plugin class to store that event in a property defined by the base class. Within event handler methods, plugins can simply refer to that property if and when they require information contained in the event.
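Put together, the handler stubs and the stored event might be sketched like this. The handler names, the array-based event, and the sketchDispatch function are hypothetical simplifications; in particular, the plugin collects its replies in an array here instead of issuing commands through a driver.

```php
<?php
// Hypothetical base class with stub handlers, for illustration only.
abstract class SketchEventPlugin
{
    protected $event;          // the event currently being handled
    public $replies = array(); // stands in for commands sent via the driver

    public function setEvent(array $event)
    {
        $this->event = $event;
    }

    // Stubs: subclasses override only the events they care about.
    public function onPrivmsg() {}
    public function onJoin() {}
    public function onResponse() {}
}

class SketchPingPlugin extends SketchEventPlugin
{
    public function onPrivmsg()
    {
        // The handler takes no arguments; it reads the stored event instead.
        if ($this->event['text'] === '!ping') {
            $this->replies[] = 'PRIVMSG ' . $this->event['source'] . ' :pong';
        }
    }
}

// What the driver side might do for each incoming event.
function sketchDispatch(array $plugins, array $event)
{
    $handler = 'on' . ucfirst(strtolower($event['type']));
    foreach ($plugins as $plugin) {
        if (!method_exists($plugin, $handler)) {
            continue; // no handler for this event type
        }
        $plugin->setEvent($event);
        $plugin->$handler();
    }
}
```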
See below for an example of a plugin that responds to DNS and reverse DNS lookup requests.
In an article to follow, I will delve more deeply into the subject of plugins and core features that support their development.
Matthew Turland lives in Duson, LA with his wife and three children and is currently employed as Lead Programmer for surgiSYS LLC. In his spare time he contributes to open source projects, frequents the #phpc channel on the Freenode IRC network under the nick Elazar, and shares his experiences on his blog at http://ishouldbecoding.com.

