Parse Apache Log Files With PHP
Parsing the log files generated by Apache is one of those random tasks that comes up at random times in my world. Until recently, it hadn't come up often enough to warrant any sort of ready solution (and it was just fun enough that writing a custom one each time was OK). So every time it came up I would fire up Google and go on a scavenger hunt for a starter script written in PHP.
This always felt like a good idea at the time. These days, for some ungodly reason, parsing Apache logs comes up a little too frequently to keep that up. In the spirit of making my life a hell of a lot easier tomorrow, I've taken a shot at writing an Apache log parser in PHP.
One thing I decided to implement is a filtering system, so you can keep only the lines where a given field matches a regex you provide. It might not be useful to everyone, but it should be trivial to remove the functionality.
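As a rough sketch of the filtering idea on its own (not the class itself), the snippet below checks one already-parsed entry against a filter array. The function name `entry_passes` and the sample paths are made up for illustration; the filter format is the one the class documents.

```php
<?php
/**
 * Hypothetical helper: keeps an entry when no filters are set or when
 * at least one filter's regex matches the field it is keyed on.
 */
function entry_passes(array $entry, array $filters)
{
    if (!$filters) {
        return true; // nothing to filter on, keep everything
    }
    foreach ($filters as $field => $filter) {
        // A field missing from the entry can never satisfy a filter.
        if (isset($entry[$field]) && preg_match($filter['regex'], $entry[$field])) {
            return true;
        }
    }
    return false;
}

// Same shape as the class's $filters property.
$filters = array('path' => array('regex' => '/\.flv$/i'));

var_dump(entry_passes(array('path' => '/media/intro.flv'), $filters)); // bool(true)
var_dump(entry_passes(array('path' => '/index.php'), $filters));       // bool(false)
```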
Anyway, I hope someone finds this useful (even just to learn from and, of course, to use).
Here's the main class:
<?php
/**
 * Apache Log Parser
 * Parses an Apache log file and runs each line through filters to find what you're looking for.
 * @author Eric Lamb
 */
class apache_log_parser
{
    /**
     * The path to the log file
     * @var string
     */
    private $file = FALSE;

    /**
     * What filters to apply. Should be in the format of
     * array('KEY_TO_SEARCH' => array('regex' => 'YOUR_REGEX'))
     * @var array
     */
    public $filters = FALSE;

    /**
     * How many lines didn't match the expected log format
     * @var int
     */
    public $badRows = 0;

    /**
     * @param string $file
     * @throws InvalidArgumentException if the file can't be read
     */
    public function __construct($file)
    {
        // A constructor can't return FALSE, so fail loudly instead
        if(!is_readable($file))
        {
            throw new InvalidArgumentException("Unable to read log file: $file");
        }
        $this->file = $file;
    }

    /**
     * Applies the supplied filters to a parsed line
     * @param array|false $data
     * @return array|false the line if there are no filters or one matches, FALSE otherwise
     */
    private function applyFilters($data)
    {
        if(!$data || !$this->filters || !is_array($this->filters))
        {
            return $data;
        }

        foreach($this->filters AS $area => $filter)
        {
            if(isset($data[$area]) && preg_match($filter['regex'], $data[$area]))
            {
                return $data;
            }
        }
        return FALSE;
    }

    /**
     * Returns an array of all the filtered lines
     * @param int|false $limit stop after this many matching lines
     * @return array
     */
    public function getData($limit = FALSE)
    {
        $lines = array();
        $handle = fopen($this->file, 'rb');
        if ($handle)
        {
            $count = 1;
            while (!feof($handle))
            {
                $buffer = fgets($handle);
                if($buffer === FALSE)
                {
                    break; // nothing left to read
                }
                $data = $this->applyFilters($this->format_line(rtrim($buffer, "\r\n")));
                if($data)
                {
                    $lines[] = $data;
                    if($limit && $count >= $limit)
                    {
                        break;
                    }
                    $count++;
                }
            }
            fclose($handle);
        }
        return $lines;
    }

    /**
     * Splits a combined-format log line into its raw parts
     * @param string $line
     * @return array
     */
    function format_log_line($line)
    {
        preg_match("/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) (\".*?\") (\".*?\")$/", $line, $matches); // pattern to format the line
        return $matches;
    }

    /**
     * Takes the format_log_line array and makes it usable to us stupid humans
     * @param string $line
     * @return array|false
     */
    function format_line($line)
    {
        $logs = $this->format_log_line($line); // format the line
        if (isset($logs[0])) // check that it formatted OK
        {
            // keys follow the field order of the combined log format
            $formated_log = array(); // make an array to store the line info in
            $formated_log['ip'] = $logs[1];
            $formated_log['identity'] = $logs[2];
            $formated_log['user'] = $logs[3];
            $formated_log['date'] = $logs[4];
            $formated_log['time'] = $logs[5];
            $formated_log['timezone'] = $logs[6];
            $formated_log['method'] = $logs[7];
            $formated_log['path'] = $logs[8];
            $formated_log['protocol'] = $logs[9];
            $formated_log['status'] = $logs[10];
            $formated_log['bytes'] = $logs[11];
            $formated_log['referer'] = $logs[12];
            $formated_log['agent'] = $logs[13];
            return $formated_log; // return the array of info
        }
        else
        {
            $this->badRows++; // if the row is not in the right format add it to the bad rows
            return false;
        }
    }
}
?>
And here's an example of how to use it:
<?php
$data = new apache_log_parser($d->path.'/'.$entry); // Create an apache log parser
$data->filters = array(
    'path' => array('regex' => '/^.*\.(FLV|flv)$/') // pull only flv files
);
$data = $data->getData();
?>
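To see which capture group holds which field, here is a small sketch that runs a combined-log-format pattern against a single line. The sample line is invented, and the pattern is written out on the assumption that the class uses the usual combined-format regex, so treat the indexes as illustrative rather than authoritative.

```php
<?php
// Assumed pattern for the Apache "combined" log format.
$pattern = '/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] "(\S+) (.*?) (\S+)" (\S+) (\S+) (".*?") (".*?")$/';

// Made-up sample line, for illustration only.
$line = '203.0.113.7 - frank [10/Oct/2011:13:55:36 -0700] '
      . '"GET /videos/intro.flv HTTP/1.0" 200 2326 '
      . '"http://example.com/start.html" "Mozilla/4.08"';

if (preg_match($pattern, $line, $m)) {
    echo 'ip:     ', $m[1], "\n";  // 203.0.113.7
    echo 'user:   ', $m[3], "\n";  // frank
    echo 'date:   ', $m[4], "\n";  // 10/Oct/2011
    echo 'path:   ', $m[8], "\n";  // /videos/intro.flv
    echo 'status: ', $m[10], "\n"; // 200
} else {
    echo "Line did not match the expected format\n";
}
```

Index 0 is the whole match, which is why the fields start at 1 and the user name lands at index 3.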
A couple of things to note about this script though:
1. The regex and parsing were pretty much stolen from the Apache Log Parser on PHPClasses.org.
2. Without filters the script is pretty memory intensive, since every parsed line is kept in an array. My needs don't require anything client facing, but heed my advice: don't use this on a public web server.
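One way around the memory cost is to hand lines to the caller one at a time instead of collecting them. The sketch below is a minimal illustration using a generator, so it assumes PHP 7 or later; the function name `read_log_lines` and the tiny temp file are hypothetical and only there to make the example runnable.

```php
<?php
/**
 * Hypothetical helper: yields one line at a time, so only the current
 * line is held in memory regardless of the file size.
 */
function read_log_lines(string $path): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Unable to open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            yield rtrim($line, "\r\n");
        }
    } finally {
        fclose($handle); // runs even if the caller stops early
    }
}

// Usage: count .flv requests without ever building an array of lines.
$tmp = tempnam(sys_get_temp_dir(), 'log');
file_put_contents($tmp, "GET /a.flv\nGET /b.php\nGET /c.FLV\n");

$count = 0;
foreach (read_log_lines($tmp) as $line) {
    if (preg_match('/\.flv$/i', $line)) {
        $count++;
    }
}
unlink($tmp);

echo $count, "\n"; // 2
```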

Nice code!
You have a copy and paste typo on the line:
$formated_log['user'] = $logs[2];
(should be 3, not 2)
Cheers,
DM
Thank you for your work.
I was searching for a PHP Apache log parser.
Seems like there’s a bug in this version that wasn’t in the original class:
Fatal error: Call to undefined method apache_log_parser::getData() in /wwwroot/example.php on line 48
include 'apache-log-parser.php';
$log = 'Jun-2011.log';
$parser = new apache_log_parser($log); // Create an apache log parser
if($parser == false)
{
    echo "Unable to create parser for: $log";
    exit;
}
$parser->filters = array(
    'path' => array('regex' => '/^.*\.(PHP|php)$/') //pull only php files
);
$log_data = $parser->getData();
echo "
";
You sir, are a wonderful human being. Saved me my morning. Great code.
Thanks for the class
I enhanced the code with a jQuery plugin
Hi there thank you for the code, i was about to sit down and write my own parser then stumbled on your code, TYVM :D.
hi sorry about this, but just for anyone else reading: i had a problem with memory consumption. when i started parsing large log files, it would hit the memory limit, so i wrote a small mod that parses the file only 1 line at a time. if anyone needs it please feel free to email me.
can i get that mod?
I’ve also written a class to do basically the same thing. I felt it would be useful to share it here.
I’ve put it into github for general use: https://github.com/Spudley/ApacheLogIterator
I originally intended to use your class, but I have a *very* large log file to process, so I couldn’t afford to load the whole thing into memory. Therefore, in my class I’ve extended SplFileObject. This means it’s an iterator class, so I can loop through it using foreach(), but only have a single record in memory at any one time. Best of both worlds. It also means I don’t have to write any file handling code, which keeps things short.
Feel free to have a look. I hope you find it useful.
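For readers curious what the iterator approach looks like, here is a minimal sketch of a class extending SplFileObject. It is not the code from the linked repository: the class name `LogEntryIterator`, the shortened pattern, and the array keys are all assumptions made for illustration.

```php
<?php
/**
 * Hypothetical iterator: each step of a foreach() returns one parsed
 * log entry, so only a single record is in memory at a time.
 */
class LogEntryIterator extends SplFileObject
{
    // Assumed pattern covering the common-log-format part of a line.
    const PATTERN = '/^(\S+) (\S+) (\S+) \[([^\]]+)\] "(\S+) (.*?) (\S+)" (\S+) (\S+)/';

    public function __construct($path)
    {
        parent::__construct($path, 'r');
        // Strip line endings and skip blank lines while iterating.
        $this->setFlags(
            SplFileObject::DROP_NEW_LINE
            | SplFileObject::READ_AHEAD
            | SplFileObject::SKIP_EMPTY
        );
    }

    /**
     * Returns the current line as an array of fields, or false when
     * the line does not look like a log entry.
     */
    #[\ReturnTypeWillChange]
    public function current()
    {
        $line = parent::current();
        if (!is_string($line) || !preg_match(self::PATTERN, $line, $m)) {
            return false;
        }
        return array(
            'ip'     => $m[1],
            'user'   => $m[3],
            'date'   => $m[4],
            'method' => $m[5],
            'path'   => $m[6],
            'status' => $m[8],
        );
    }
}

// Usage with a throwaway two-line file (one valid entry, one junk line).
$tmp = tempnam(sys_get_temp_dir(), 'log');
file_put_contents(
    $tmp,
    '203.0.113.7 - frank [10/Oct/2011:13:55:36 -0700] "GET /videos/intro.flv HTTP/1.0" 200 2326' . "\n"
    . "not a log line\n"
);

$paths = array();
foreach (new LogEntryIterator($tmp) as $entry) {
    if ($entry !== false) {
        $paths[] = $entry['path'];
    }
}
unlink($tmp);

print_r($paths);
```

Returning false for lines that don't match leaves it to the caller to skip or count bad rows, which keeps the iterator itself free of any bookkeeping.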