Practical Uses for the PHP Tokenizer

by Stan Vassilev (2008-08-19)
 

In this article we take a look at the PHP tokenizer and its potential for analyzing and processing PHP source code. We will build several working examples, which you can start using and extending for your own purposes.

Introduction

When PHP has to process a request, the engine goes through several passes of parsing until the code is expressed as a set of instructions that the interpreter can execute. The first such step is “lexical scanning”, which splits the code into smaller strings called “tokens”. The token is the smallest meaningful unit of your source code, and it can represent a reserved word (for, while, class, if, etc.), operator (+, -, *, /, && etc.), value literals (integers, floats, strings etc.) and other special symbols.

The same lexical scanner that PHP uses is also available to userspace PHP developers via the function token_get_all(). It is very simple to use: you pass your PHP source code as text, and it returns an array of tokens, which we will process further in the examples of this article.

Let's see what the tokenizer will output for a little code snippet:

    // double quotes outside, so the single-quoted strings inside stay intact
    $code = "<?php\necho 'string1'.'string2';\n?>";

    $tokens = token_get_all($code);

    var_export($tokens);

Output of var_export($tokens) (whitespace adjusted for clarity):

    array (
      0 => array (0 => 367, 1 => "<?php\n", 2 => 1),
      1 => array (0 => 316, 1 => "echo", 2 => 2),
      2 => array (0 => 370, 1 => " ", 2 => 2),
      3 => array (0 => 315, 1 => "'string1'", 2 => 2),
      4 => ".",
      5 => array (0 => 315, 1 => "'string2'", 2 => 2),
      6 => ";",
      7 => array (0 => 370, 1 => "\n", 2 => 2),
      8 => array (0 => 369, 1 => "?>", 2 => 3),
    );

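We see that most tokens are defined as arrays, each of which has three items. Item [0] is the token type code, item [1] is the slice of source code representing the token, and item [2] is the number of the source line on which the token starts. The numeric type codes differ between PHP versions, so the values you see may not match the ones shown here.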

You can fetch readable string names for token types with the function token_name(). The same names are also defined as constants in PHP. If you have code autocompletion in an IDE like Eclipse PDT, type T_ to see a full list of token name constants. See also the list of token types in the PHP manual: www.php.net/manual/en/tokens.php.

Notice that several tokens (index 4 and 6) are defined as a single-character string. The token_get_all() function represents all single-character operators and symbols in PHP in this manner, including curly braces, parentheses, etc. You can think of that character as both the token's string representation and its token type.
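To make the relation between type codes, names and constants concrete, here is a minimal sketch. It assumes nothing beyond token_get_all() and token_name(); the variable $names is only there for the illustration. Since the numeric codes are not stable across PHP versions, code should compare against the T_* constants rather than literal numbers.

```php
<?php
// token_name() maps a numeric type code back to its constant name.
echo token_name(T_ECHO), "\n";        // T_ECHO
echo token_name(T_WHITESPACE), "\n";  // T_WHITESPACE

$tokens = token_get_all("<?php echo 'hi';");
$names = array();

foreach ($tokens as $t) {
  // single-character tokens are plain strings and carry no type code
  $names[] = is_array($t) ? token_name($t[0]) : $t;
}

// prints: T_OPEN_TAG T_ECHO T_WHITESPACE T_CONSTANT_ENCAPSED_STRING ;
echo implode(' ', $names), "\n";
```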

Since reading the raw token array is hard, and we'll need this for debugging purposes and learning, here is a simple “reporting” function, which prints the tokens for human reading:

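    function reportTokens($source)
    {
      $out = '';
      $tokens = token_get_all($source);

      foreach ($tokens as $t) {
        if (is_array($t)) {
          // showing macros for new lines and tabs, to improve readability
          $t[1] = str_replace(
            array("\n", "\r", "\t"),
            array('\n', '\r', '\t'),
            $t[1]
          );
          // columns: line, token name, text (widths chosen to match the sample output)
          $out .= str_pad($t[2], 4) . str_pad(token_name($t[0]), 31) . $t[1] . "\n";
        } else {
          // single-character tokens have neither a line number nor a type name
          $out .= str_pad('-', 4) . str_pad('-', 31) . $t . "\n";
        }
      }
      return $out;
    }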

Output of echo reportTokens() for a slightly longer snippet, echo 'string1'.'string2'.$a?1:2;, saved with Windows (\r\n) line endings:

    1   T_OPEN_TAG                     <?php\r\n
    2   T_ECHO                         echo
    2   T_WHITESPACE                   
    2   T_CONSTANT_ENCAPSED_STRING     'string1'
    -   -                              .
    2   T_CONSTANT_ENCAPSED_STRING     'string2'
    -   -                              .
    2   T_VARIABLE                     $a
    -   -                              ?
    2   T_LNUMBER                      1
    -   -                              :
    2   T_LNUMBER                      2
    -   -                              ;
    2   T_WHITESPACE                   \r\n
    3   T_CLOSE_TAG                    ?>

This looks better. Don't forget to prepare the output for HTML if you want to run the report in a browser: echo "<pre>", htmlspecialchars(reportTokens($code)), "</pre>".

Example 1: Stripping whitespace and comments

Many of us have become used to writing elaborate in-code documentation in the form of PHPDoc comments (see http://www.phpdoc.org). Unlike regular line and block comments, PHPDoc blocks are not stripped by the PHP engine, as they are accessible at runtime via the Reflection API. Thus, software licenses, code examples and method descriptions become a permanent part of your PHP compiled files, even with opcode cache engines, such as APC or XCache.

If you have collected a big library of reusable code over time, which rarely changes, and you don't use this feature of the Reflection API, you can strip some unneeded weight from your library by removing comments and whitespace. It also speeds up parsing marginally for those of us deploying without an opcode cache.

We will need a handy “token position” cursor that we can advance, token by token, and share among all methods that handle aspects of the parsing. Normally we would have to implement this ourselves, but PHP arrays have a built-in position cursor, which is perfect for our purposes. We'll use the following PHP functions for interacting with that cursor and the array:

    // sets the position cursor to the array start
    // and returns the first item (first token in this case)
    $item = reset($array);

    // returns the token at the current position
    // or false if there are no more tokens
    $item = current($array);

    // acts similarly but also advances
    // one position forward before returning a value.
    $item = next($array);
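As a small illustration of how these three functions cooperate, the sketch below walks a plain array with the internal cursor. The array contents and variable names are made up for the example. Note that the loops in this article test the token itself (while ($t = current(...))), which is safe here because every token is either a non-empty array or a punctuation character; for arbitrary arrays that may hold falsy values, a strict comparison against false as shown is the more careful form.

```php
<?php
$items = array('a', 'b', 'c');
$seen = array();

$item = reset($items);          // cursor at the start, $item is 'a'

while ($item !== false) {
  $seen[] = $item;
  $item = next($items);         // advance first, then read; false past the end
}

echo implode(',', $seen), "\n"; // a,b,c

// The cursor is now past the last element, so current() reports false.
var_dump(current($items));      // bool(false)
```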

Let's make the skeleton for our filter class:

    class CompactCode
    {
      static protected $out;
      static protected $tokens;

      static public function compact($source)
      {
        self::$tokens = token_get_all($source);
        self::$out = '';

        reset(self::$tokens);

        while ($t = current(self::$tokens)) {
          if (is_array($t)) {
            self::$out .= $t[1];
          } else {
            self::$out .= $t;
          }

          next(self::$tokens);
        }

        return self::$out;
      }
    }

In the above code we walk through each token and append its output to self::$out with no modification. Next, we'll check for token types T_WHITESPACE, T_DOC_COMMENT and T_COMMENT and redirect the flow to a private method which will handle any sequence of those three tokens. You will see that splitting the parsing subtasks into separate methods helps code comprehension a lot, especially in real-world scenarios where the parsing task may become far more complex.

We'll add method skipWhiteAndComments(), which will advance the position cursor for as long as it keeps finding whitespace and comments. However, we don't want to drop those tokens completely, as this can produce incorrect code. Instead, we'll replace any sequence of comments and whitespace with a single space. For example, if we have a constant called “name”, we do not want echo name or echo/*comment*/name to become echoname. Curiously, the built-in PHP utility for compacting source code (php -w, php_strip_whitespace()) currently renders invalid code in the latter case.

    class CompactCode
    {
      static protected $out;
      static protected $tokens;

      static public function compact($source)
      {
        self::$tokens = token_get_all($source);
        self::$out = '';

        reset(self::$tokens);

        while ($t = current(self::$tokens)) {
          if (is_array($t)) {
            if ($t[0] == T_WHITESPACE || $t[0] == T_DOC_COMMENT || $t[0] == T_COMMENT) {
              self::skipWhiteAndComments();
              continue;
            }

            self::$out .= $t[1];
          } else {
            self::$out .= $t;
          }

          next(self::$tokens);
        }

        return self::$out;
      }


      static private function skipWhiteAndComments()
      {
        self::$out .= ' ';

        while ($t = current(self::$tokens)) {
          if (is_array($t) && ($t[0] == T_WHITESPACE || $t[0] == T_DOC_COMMENT || $t[0] == T_COMMENT)) {
            next(self::$tokens);
          } else {
            return;
          }
        }
      }
    }

Notice that after the flow returns from method skipWhiteAndComments(), I added continue, which is needed so $t gets updated to the current token position set by skipWhiteAndComments() and is processed fully.

And it really is as simple as that. Here's a sample input to test our class with:

    <?php
    /**
     * This is my class.
     */
    class MyClass
    {
      const example = 123;

      /**
       * This is my function.
       */
      function myFunction()
      {
        // example line comment
        echo self::example;

        echo/* testing edge case with block comment*/self::example;
      }
    }
    ?>

And the resulting output:

    <?php
    class MyClass { const example = 123; function myFunction() { echo self::example; echo self::example; } } ?>

In practice, if we compact code like this we may find it hard to find the origin of an error as the lines of the statements have changed. However, it is possible to easily strip comments and whitespace while preserving the exact original lines for all statements. All we need to do is filter the tokens in method skipWhiteAndComments() down to the newline characters (\r and \n) they contain, and add them to the output:

    static private function skipWhiteAndComments()
    {
      self::$out .= ' ';

      while ($t = current(self::$tokens)) {
        if (is_array($t) && ($t[0] == T_WHITESPACE || $t[0] == T_DOC_COMMENT || $t[0] == T_COMMENT)) {
          // keep only the line breaks contained in the skipped token
          self::$out .= preg_replace("/[^\n\r]/", "", $t[1]);
          next(self::$tokens);
        } else {
          return;
        }
      }
    }

Here is the resulting output:

    <?php



    class MyClass
    {
    const example = 123;




    function myFunction()
    {

    echo self::example;

    echo self::example;
    }
    }
    ?>

We see that the indentation and comments are gone, but the line numbers are preserved.
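The line-preserving idea can also be tried in isolation. The following is a minimal sketch written as a standalone function rather than as part of the CompactCode class; the function name compactKeepingLines() and the sample source are made up for this illustration. It replaces each run of whitespace and comments with one space plus the line breaks the run contained, then checks that the number of line breaks is unchanged.

```php
<?php
// Hypothetical standalone helper, not part of the CompactCode class.
function compactKeepingLines($source)
{
  $out = '';
  $inGap = false; // true while inside a run of whitespace/comment tokens

  foreach (token_get_all($source) as $t) {
    $isGap = is_array($t)
      && in_array($t[0], array(T_WHITESPACE, T_COMMENT, T_DOC_COMMENT), true);

    if ($isGap) {
      if (!$inGap) {
        $out .= ' ';   // one space per run keeps "echo name" separated
        $inGap = true;
      }
      // keep only the line breaks so later code stays on its original line
      $out .= preg_replace('/[^\r\n]/', '', $t[1]);
    } else {
      $out .= is_array($t) ? $t[1] : $t;
      $inGap = false;
    }
  }

  return $out;
}

$source = "<?php\n/** doc */\nclass A\n{\n  // note\n  const X = 1;\n}\n";
$result = compactKeepingLines($source);

// The compacted code has as many line breaks as the original.
var_dump(substr_count($result, "\n") === substr_count($source, "\n"));
```

Using a foreach loop with a flag instead of the array cursor is only a stylistic choice for a short sketch; the cursor-based approach of the class scales better once several methods need to share the position.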

Example 2: Source preprocessing

Source preprocessing is the act of using macro commands to modify your source code before compilation and execution. This may include skipping parts of our code, or adding new code, depending on a certain predefined condition. Just like our first example, source preprocessing may be too elaborate for our day-to-day application scripting tasks, but becomes useful when handling large libraries of reusable code. You can filter out debug-only code, for example logging and various development-time aids. We can also have one source file compiling to multiple platform-specific “driver” classes, for example one for each server we target (by defining symbols such as APACHE, IIS), or one for each database engine we target (such as MYSQL, PGSQL, SQLITE), etc.

In this example we will implement two preprocessing constructs using the PHP tokenizer:

  • #IF_DEFINED <Symbol> ... #END_IF
  • #IF_DEFINED_INSERT <Symbol1>:<Code1>, <Symbol2>:<Code2>, ...

The first construct (IF_DEFINED ... END_IF) will allow us to selectively filter out or include portions of code depending on certain “symbols” we pass as environment to the preprocessor. The second construct (IF_DEFINED_INSERT) will insert a different code snippet depending on the defined symbols (I will show you the need for this a bit later). The “#” symbol in PHP is the start of a Perl-style line comment, and since this symbol is also used for preprocessing in other languages, it's a natural choice for our macro syntax.

We will start with the same class skeleton as in Example 1, and redirect to method processMacro() when we detect a T_COMMENT token. Because not every comment is a macro, the processMacro() method will return true or false depending on whether the comment was recognized as a macro:

    class Preprocessor
    {
      static protected $out;
      static protected $tokens;
      static protected $symbols;

      static public function process($source, $symbols)
      {
        self::$tokens = token_get_all($source);
        self::$symbols = $symbols;
        self::$out = '';

        reset(self::$tokens);

        while ($t = current(self::$tokens)) {
          if (is_array($t)) {
            if ($t[0] == T_COMMENT) {
              if (self::processMacro()) {
                continue;
              }
            }

            self::$out .= $t[1];
          } else {
            self::$out .= $t;
          }

          next(self::$tokens);
        }

        return self::$out;
      }
    }

Now we'll add the processMacro() method. We will check with a regular expression if the comment matches one of the supported macro syntax rules, and if not just return (the comment will display normally, we don't filter regular comments in this example). But if recognized, we send each macro to its own method for further processing.

    static private function processMacro()
    {
      $t = current(self::$tokens);

      // is it a known macro? The pattern is anchored at the start, so a
      // regular comment that merely mentions a macro name is left alone
      if (preg_match('/^#(IF_DEFINED_INSERT|IF_DEFINED)\s+(.*?)\s*$/', $t[1], $macro)) {

        if ($macro[1] == 'IF_DEFINED_INSERT') {
          self::macroInsert($macro[2]);
        }

        if ($macro[1] == 'IF_DEFINED') {
          self::macroIfBlock($macro[2]);
        }

        return true;
      } else {
        return false;
      }
    }

Let's implement the IF_DEFINED_INSERT macro first. We parse the expression, and then just output the relevant code snippets for all defined symbols:

    static private function macroInsert($expression)
    {
      $subExpressions = explode(',', $expression);

      foreach ($subExpressions as $expr) {
        $expr = explode(':', $expr);

        if (in_array(trim($expr[0]), self::$symbols)) {
          self::$out .= trim($expr[1]);
        }
      }

      // move past the macro comment token
      next(self::$tokens);
    }

And for the IF_DEFINED block implementation, we should check if the symbol is defined or not, and depending on that, output or skip all tokens right up to the next END_IF macro:

    static private function macroIfBlock($expression)
    {
      $symbol = trim($expression);
      $showBlock = in_array($symbol, self::$symbols);

      // we move past the #IF_DEFINED token
      next(self::$tokens);

      while ($t = current(self::$tokens)) {
        if (is_array($t)) {
          if ($t[0] == T_COMMENT && trim($t[1]) == '#END_IF') {
            // we move past the END_IF token and return
            next(self::$tokens);
            return;
          }

          if ($showBlock) self::$out .= $t[1];
        } else {
          if ($showBlock) self::$out .= $t;
        }

        next(self::$tokens);
      }
    }

And our simple preprocessor is complete. Let's try it with an example. I have created a simple class, representing a database wrapper for MySQL, Microsoft SQL and PostgreSQL. My table/column quoting method has a different syntax for each database engine. Instead of manually maintaining three files or doing plenty of “if” checks at runtime, when many of my other methods are similar, I'll use our preprocessor to compile three classes out of a single source.

I also have an example debug block, which I can filter out at will depending on what kind of distribution I'm preparing:

    <?php
    class MyLib_Database#IF_DEFINED_INSERT MYSQL : _MySql, MSSQL : _MsSql, PGSQL: _PgSql
    {
      public function quoteIdentifier($ident)
      {
        #IF_DEFINED DEBUG
        echo 'This is a debug log message.';
        #END_IF

        #IF_DEFINED MYSQL
        return "`".$ident."`";
        #END_IF

        #IF_DEFINED MSSQL
        return '['.$ident.']';
        #END_IF

        #IF_DEFINED PGSQL
        return '"'.$ident.'"';
        #END_IF
      }
    }
    ?>

Output for echo Preprocessor::process($source, array('MYSQL')) (whitespace adjusted for clarity):

    <?php
    class MyLib_Database_MySql
    {
      public function quoteIdentifier($ident)
      {
        return "`".$ident."`";
      }
    }
    ?>

Output for echo Preprocessor::process($source, array('MSSQL')) (whitespace adjusted for clarity):

    <?php
    class MyLib_Database_MsSql
    {
      public function quoteIdentifier($ident)
      {
        return '['.$ident.']';
      }
    }
    ?>

Output for echo Preprocessor::process($source, array('PGSQL')) (whitespace adjusted for clarity):

    <?php
    class MyLib_Database_PgSql
    {
      public function quoteIdentifier($ident)
      {
        return '"'.$ident.'"';
      }
    }
    ?>

Output for echo Preprocessor::process($source, array('PGSQL', 'DEBUG')) (whitespace adjusted for clarity):

    <?php
    class MyLib_Database_PgSql
    {
      public function quoteIdentifier($ident)
      {
        echo 'This is a debug log message.';

        return '"'.$ident.'"';
      }
    }
    ?>

Two things you can do to improve the Preprocessor are to support line-preserving processing similar to Example 1, and to handle nested IF_DEFINED blocks. This is left as an exercise for the reader.
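For the nested-block part of that exercise, one possible starting point is a stack with one boolean per open block. The sketch below is a standalone function, not an extension of the Preprocessor class; the name preprocessNested(), the sample source and the DEBUG/VERBOSE symbols are assumptions made for the illustration, and IF_DEFINED_INSERT is deliberately left out. A token is emitted only when every enclosing block is enabled.

```php
<?php
// Hypothetical standalone sketch; only #IF_DEFINED / #END_IF are handled.
function preprocessNested($source, array $symbols)
{
  $out = '';
  $stack = array(); // one boolean per currently open #IF_DEFINED block

  foreach (token_get_all($source) as $t) {
    if (is_array($t) && $t[0] == T_COMMENT) {
      $text = trim($t[1]);

      if (preg_match('/^#IF_DEFINED\s+(\w+)$/', $text, $m)) {
        $stack[] = in_array($m[1], $symbols); // entering a block
        continue;
      }

      if ($text == '#END_IF') {
        array_pop($stack);                    // leaving the innermost block
        continue;
      }
    }

    // emit the token only if no enclosing block is disabled
    if (!in_array(false, $stack, true)) {
      $out .= is_array($t) ? $t[1] : $t;
    }
  }

  return $out;
}

$source = "<?php\n#IF_DEFINED DEBUG\necho 'outer';\n"
        . "#IF_DEFINED VERBOSE\necho 'inner';\n#END_IF\n#END_IF\n"
        . "echo 'always';\n";

// With only DEBUG defined, the outer block is kept and the inner one dropped.
echo preprocessNested($source, array('DEBUG'));
```

The sketch assumes well-formed input: an unmatched #END_IF is silently ignored, and an unclosed block simply runs to the end of the file.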

Example 3: Detecting class, interface and function definitions

In our last example we will parse a source file and return the names of all classes, interfaces and functions defined inside it. Such a list can then be saved and used for a flexible function/class autoloader or for reporting purposes.

Yet again we start with a skeleton class similar to our first example. This time, however, we have no $out member collecting output, but a $definitions member where we'll push definition names. I also included the same skipWhiteAndComments() method, but removed any lines that produce output; the use of this method will become clear in a bit. We will redirect all T_CLASS, T_INTERFACE and T_FUNCTION tokens (standing for the class, interface and function reserved words, respectively) to a method we'll call readDefinition().

    class DefinitionScanner
    {
      static protected $definitions;
      static protected $tokens;

      static public function scan($source)
      {
        self::$tokens = token_get_all($source);
        self::$definitions = array();

        reset(self::$tokens);

        while ($t = current(self::$tokens)) {
          if (is_array($t)) {
            if ($t[0] == T_CLASS || $t[0] == T_INTERFACE || $t[0] == T_FUNCTION) {
              self::readDefinition();
              continue;
            }
          }

          next(self::$tokens);
        }

        return self::$definitions;
      }

      static private function skipWhiteAndComments()
      {
        while ($t = current(self::$tokens)) {
          if (is_array($t) && ($t[0] == T_WHITESPACE || $t[0] == T_DOC_COMMENT || $t[0] == T_COMMENT)) {
            next(self::$tokens);
          } else {
            return;
          }
        }
      }
    }

The structure is simple: the class/interface/function token is followed by one or more comment or whitespace tokens, and the next token must be the definition name identifier (of type T_STRING). We won't check the type of that last token, since we assume the file we are scanning has no syntax errors:

    static private function readDefinition()
    {
      $t = current(self::$tokens);
      $definitionType = $t[1];

      // move past the class/interface/function token
      next(self::$tokens);

      self::skipWhiteAndComments();

      $t = current(self::$tokens);
      $definitionName = $t[1];

      self::$definitions[] = array(
        'type' => $definitionType,
        'name' => $definitionName
      );

      // move past the name identifier
      next(self::$tokens);
    }

Here's the source sample we'll test this on:

    <?php
    class ClassA
    {
      function a() {
        // pseudo code
        if ($this->test) {
          doSomething();
        } else {
          doSomethingElse();
        }
      }

      function b() {}

      function c() {}
    }

    function outerFunctionOne()
    {
      // pseudo code
      if ($this->test) {
        doSomething();
      } else {
        doSomethingElse();
      }
    }

    class ClassFoo
    {
      function bar() {}
      function baz() {}
    }

    function outerFunctionTwo() {}

    function outerFunctionThree() {}

    interface MyInterface {
      // interface methods are declared without a body
      function myMethod();
    }
    ?>

And the output of var_export(DefinitionScanner::scan($exampleSource)) (whitespace adjusted for clarity):

    array (
      0 => array('type' => 'class', 'name' => 'ClassA',),
      1 => array('type' => 'function', 'name' => 'a',),
      2 => array('type' => 'function', 'name' => 'b',),
      3 => array('type' => 'function', 'name' => 'c',),
      4 => array('type' => 'function', 'name' => 'outerFunctionOne',),
      5 => array('type' => 'class', 'name' => 'ClassFoo',),
      6 => array('type' => 'function', 'name' => 'bar',),
      7 => array('type' => 'function', 'name' => 'baz',),
      8 => array('type' => 'function', 'name' => 'outerFunctionTwo',),
      9 => array('type' => 'function', 'name' => 'outerFunctionThree',),
      10 => array('type' => 'interface', 'name' => 'MyInterface',),
      11 => array('type' => 'function', 'name' => 'myMethod',),
    )

It worked well, except for one thing: all class methods ended up detected as standalone functions. This happened since scan() was allowed to enter in a class definition and find all method definitions in there. To solve this, now we'll build a method that skips over the entire definition code block (as defined by curly brackets). Curly brackets can be nested, as they are also used for if/while/foreach blocks etc., so we will have to count the nesting level until the outermost block is closed. For simplicity, again, we assume there are no syntax errors in the code, and hence all code blocks are nested properly. Let's see how this looks in code:

    static private function skipCodeBlock()
    {
      // we go forward until we find the first "{" token
      while (($t = current(self::$tokens)) && $t !== '{') {
        next(self::$tokens);
      }

      // we're about to enter the top level block
      // which is our class/interface/function definition body
      $nestingLevel = 0;

      // we go forward keeping the $nestingLevel up-to-date
      // until we get out of the definition body block
      while ($t = current(self::$tokens)) {
        // "{$var}" and "${var}" inside strings open with an array token
        // but close with a plain "}", so they count as openings too
        if ($t === '{' || (is_array($t) && ($t[0] == T_CURLY_OPEN || $t[0] == T_DOLLAR_OPEN_CURLY_BRACES))) {
          $nestingLevel++;
        }

        if ($t === '}') {
          $nestingLevel--;
        }

        next(self::$tokens);

        if ($nestingLevel == 0) return;
      }
    }

Now let's augment readDefinition() with that functionality:

    static private function readDefinition()
    {
      $t = current(self::$tokens);
      $definitionType = $t[1];

      // move past the class/interface/function token
      next(self::$tokens);

      self::skipWhiteAndComments();

      $t = current(self::$tokens);
      $definitionName = $t[1];

      self::$definitions[] = array(
        'type' => $definitionType,
        'name' => $definitionName
      );

      // move past the name identifier
      next(self::$tokens);

      // skip the whole body so nested definitions are not reported
      self::skipCodeBlock();
    }

And here's the output for the same example source as above:

    array (
      0 => array('type' => 'class', 'name' => 'ClassA',),
      1 => array('type' => 'function', 'name' => 'outerFunctionOne',),
      2 => array('type' => 'class', 'name' => 'ClassFoo',),
      3 => array('type' => 'function', 'name' => 'outerFunctionTwo',),
      4 => array('type' => 'function', 'name' => 'outerFunctionThree',),
      5 => array('type' => 'interface', 'name' => 'MyInterface',),
    )

There are no misdetected definitions this time.
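The autoloader use case mentioned at the start of this example could look roughly like the sketch below. It assumes the scan results have been collected per file; the helper name buildClassMap(), the file path and the hand-written input array are all made up for the illustration, and the input only mimics the shape of the scanner output shown above. Class names are lowercased because PHP treats them case-insensitively.

```php
<?php
// Hypothetical helper: $scanResults maps a file path to the array
// returned by DefinitionScanner::scan() for that file.
function buildClassMap(array $scanResults)
{
  $map = array();

  foreach ($scanResults as $file => $definitions) {
    foreach ($definitions as $def) {
      $type = strtolower($def['type']);
      // functions cannot be autoloaded, so only classes and interfaces count
      if ($type == 'class' || $type == 'interface') {
        $map[strtolower($def['name'])] = $file;
      }
    }
  }

  return $map;
}

// Hand-written input in the shape of the scanner output; the path is invented.
$scanResults = array(
  'lib/example.php' => array(
    array('type' => 'class', 'name' => 'ClassA'),
    array('type' => 'function', 'name' => 'outerFunctionOne'),
    array('type' => 'interface', 'name' => 'MyInterface'),
  ),
);

$classMap = buildClassMap($scanResults);

// Load the defining file the first time an unknown class is referenced.
spl_autoload_register(function ($class) use ($classMap) {
  $key = strtolower($class);
  if (isset($classMap[$key])) {
    require $classMap[$key];
  }
});
```

In a real setup the map would be generated once and saved, rather than rebuilt by scanning every file on each request.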

Tips and Tricks

  • Notice that every time I pass code to the tokenizer, I explicitly include the opening/closing PHP tag <?php ?>. Just like when you execute a regular PHP page, the tags are required, or the tokenizer will not recognize the proper context and will parse your source as token type T_INLINE_HTML.
  • If you have long passages of inline html in your PHP document, you will notice that the tokenizer splits every ~4kbytes of content into a separate T_INLINE_HTML token. This is expected, and allows the PHP engine to process and output the page in small chunks versus all at once.
  • You can simplify your parsing code if you first convert all tokens to a common format (i.e. convert all single character tokens into three-item array format) prior to parsing. I decided not to do this in my examples, in order to avoid confusion over the native token_get_all() format.
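The last tip can be sketched as follows. This is one possible normalization, not a standard API: the helper name normalizeTokens() is made up, and using the character itself as the type of a single-character token is a choice that follows the earlier remark about such tokens. The line number of a single-character token is derived by counting the line breaks in the tokens before it.

```php
<?php
// Hypothetical helper: returns every token as array(type, text, line).
// For single-character tokens the character itself serves as the type.
function normalizeTokens($source)
{
  $normalized = array();
  $line = 1;

  foreach (token_get_all($source) as $t) {
    if (is_array($t)) {
      $line = $t[2];                 // line on which this token starts
    } else {
      $t = array($t, $t, $line);     // promote the plain string to an array
    }

    $normalized[] = $t;
    // line breaks inside this token push the following tokens down
    $line += substr_count($t[1], "\n");
  }

  return $normalized;
}

$tokens = normalizeTokens("<?php\necho 1\n;");

// Every token can now be handled the same way, without is_array() checks.
foreach ($tokens as $t) {
  echo $t[2], ': ', is_int($t[0]) ? token_name($t[0]) : $t[0], "\n";
}
```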

Stan Vassilev has been employed in the IT industry for the past 9 years as a developer, software documentation writer, and a web designer. His interests include software architecture and development, web applications, graphical user interfaces, and he has been an active contributor to the Adobe Flash community for the last several years. Since 2005, he specializes in OSS powered backend development for online applications and services, using technologies such as Apache, PHP, Python and MySQL.
