Made of Everything You're Not

Personal blog of PHP programmer Eric Lamb.

Parse Apache Log Files With PHP

Parsing the log files generated by Apache is one of those random tasks that crops up at random in my world. Until recently it hadn't come up often enough to warrant a ready-made solution (and it was just fun enough that writing a custom one each time was OK). So every time it came up I'd fire up Google and go on a scavenger hunt for a starter script written in PHP.

That always felt like a good idea at the time. These days, for some ungodly reason, parsing Apache logs comes up a little too often to keep doing that. In the spirit of making my life a hell of a lot easier tomorrow, I've taken a shot at writing an Apache log parser in PHP.

One thing I decided to implement is a filtering system, so you can keep only the lines that match a regex you provide. It might not be useful to everyone, but it should be trivial to remove.

Anyway, I hope someone finds this useful, whether to learn from or, of course, to use.

Here's the main class:

<?php
/**
 * Apache Log Parser
 * Parses an Apache log file and runs the strings through filters to find what you're looking for.
 * @author Eric Lamb
 *
 */
class apache_log_parser
{
	/**
	 * The path to the log file
	 * @var string
	 */
	private $file = FALSE;
 
	/**
	 * What filters to apply. Should be in the format of array('KEY_TO_SEARCH' => array('regex' => 'YOUR_REGEX'))
	 * @var array
	 */
	public $filters = FALSE;
 
	/**
	 * How many lines didn't match the expected log format
	 * @var int
	 */
	public $badRows = 0;
 
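	/**
	 * Stores the path to the log file.
	 * @param string $file
	 * @throws InvalidArgumentException if the file can't be read
	 */
	public function __construct($file)
	{
		// A constructor's return value is ignored, so throw instead of returning FALSE
		if(!is_readable($file))
		{
			throw new InvalidArgumentException("Unable to read log file: $file");
		}
 
		$this->file = $file;
	}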
 
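	/**
	 * Runs the filters against a formatted line
	 * @param array|false $str The formatted line from format_line()
	 * @return array|false The line if it passes (or no filters are set), FALSE otherwise
	 */
	private function applyFilters($str)
	{
		if(!$str || !$this->filters || !is_array($this->filters))
		{
			return $str;
		}
 
		foreach($this->filters AS $area => $filter)
		{
			// $area is the key to search; the pattern lives under 'regex'
			if(isset($str[$area]) && preg_match($filter['regex'], $str[$area]))
			{
				return $str;
			}
		}
 
		return FALSE;
	}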
 
	/**
	 * Returns an array of all the filtered lines
	 * @param $limit
	 * @return array
	 */
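	public function getData($limit = FALSE)
	{
		$lines = array();
		$handle = fopen($this->file, 'rb');
		if ($handle) {
			$count = 1;
			while (($buffer = fgets($handle)) !== FALSE) {
				$data = $this->applyFilters($this->format_line(rtrim($buffer, "\r\n")));
				if($data)
				{
					$lines[] = $data;
					// Stop once $limit lines have been collected
					if($limit && $count >= $limit)
					{
						break;
					}
					$count++;
				}
			}
			fclose($handle);
		}
		return $lines;
	}
 
	/**
	 * Splits a raw log line into its parts with a regex
	 * @param $line
	 * @return array
	 */
	function format_log_line($line)
	{
		preg_match("/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) (\".*?\") (\".*?\")$/", $line, $matches); // pattern to format the line
		return $matches;
	}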
 
	/**
	 * Takes the format_log_line array and makes it usable to us stupid humans
	 * @param $line
	 * @return array
	 */
	function format_line($line)
	{
		$logs = $this->format_log_line($line); // format the line
 
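		if (isset($logs[0])) // check that it formatted OK
		{
			// Keys follow the 13 capture groups of the log regex, in order
			$formated_log = array(); // make an array to store the line info in
			$formated_log['ip'] = $logs[1];
			$formated_log['identity'] = $logs[2];
			$formated_log['user'] = $logs[3];
			$formated_log['date'] = $logs[4];
			$formated_log['time'] = $logs[5];
			$formated_log['timezone'] = $logs[6];
			$formated_log['method'] = $logs[7];
			$formated_log['path'] = $logs[8];
			$formated_log['protocol'] = $logs[9];
			$formated_log['status'] = $logs[10];
			$formated_log['bytes'] = $logs[11];
			$formated_log['referer'] = $logs[12];
			$formated_log['agent'] = $logs[13];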
			return $formated_log; // return the array of info
		}
		else
		{
			$this->badRows++; // if the row is not in the right format add it to the bad rows
			return false;
		}
	}
}
?>

And here's an example of how to use it:

<?php
$parser = new apache_log_parser($d->path.'/'.$entry); // Create an apache log parser ($d and $entry come from the calling code)
$parser->filters = array(
	'path' => array('regex' => '/^.*\.(FLV|flv)$/') //pull only flv files
);
 
$data = $parser->getData(); // array of the lines that passed the filter
?>
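
To make the filter format a bit more concrete, here's a rough standalone sketch of what one parsed line looks like and how a 'path' filter gets checked against it. The sample log line is made up, and the names in `$fields` are just illustrative labels for the 13 regex groups, so treat this as a sketch and not as part of the class.

```php
<?php
// Hypothetical sample line in Apache's combined log format
$line = '192.0.2.10 - frank [09/Jan/2010:00:35:12 -0800] "GET /videos/intro.flv HTTP/1.1" 200 51234 "http://example.com/" "Mozilla/5.0"';

// Combined log format pattern: 13 capture groups
$pattern = '/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] "(\S+) (.*?) (\S+)" (\S+) (\S+) (".*?") (".*?")$/';

// Illustrative names for the 13 groups, in order (assumed labels)
$fields = array('ip', 'identity', 'user', 'date', 'time', 'timezone', 'method',
	'path', 'protocol', 'status', 'bytes', 'referer', 'agent');

$parsed = array();
if (preg_match($pattern, $line, $m)) {
	// $m[0] is the whole match, so drop it before pairing names with values
	$parsed = array_combine($fields, array_slice($m, 1));
}

// Same filter shape as the example above: field name => array('regex' => ...)
$filters = array('path' => array('regex' => '/\.flv$/i'));

$keep = false;
foreach ($filters as $field => $filter) {
	if (isset($parsed[$field]) && preg_match($filter['regex'], $parsed[$field])) {
		$keep = true; // one matching filter is enough
		break;
	}
}

echo $parsed['path'], ' ', $keep ? 'kept' : 'skipped', "\n"; // prints "/videos/intro.flv kept"
```

Swapping the regex for something like `'/\.php$/i'` would make the same line come back as skipped.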

A couple things to note about this script though:

1. The regex and parsing were pretty much stolen from the Apache Log Parser on PHPClasses.org.
2. Without filters the script is pretty memory intensive, since every parsed line is kept in one array. My needs don't require anything client facing, but heed my advice: don't use this on a public web server.
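
One way around the memory problem in point 2, sketched here as an alternative and not as part of the class above, is to hand lines back one at a time with a PHP generator (this assumes PHP 5.5 or later). The function name `stream_apache_log()` and its optional `$pathRegex` argument are made up for this example, and the sample log lines are invented.

```php
<?php
/**
 * Yields one parsed log line at a time so only a single row is held in memory.
 * stream_apache_log() is a hypothetical helper, not part of apache_log_parser.
 *
 * @param string      $file      path to the log file
 * @param string|null $pathRegex optional regex tested against the request path
 * @return Generator
 */
function stream_apache_log($file, $pathRegex = null)
{
	$pattern = '/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] "(\S+) (.*?) (\S+)" (\S+) (\S+) (".*?") (".*?")$/';

	$handle = fopen($file, 'rb');
	if (!$handle) {
		return; // nothing to yield if the file can't be opened
	}

	try {
		while (($buffer = fgets($handle)) !== false) {
			// Skip lines that aren't in the combined log format
			if (!preg_match($pattern, rtrim($buffer, "\r\n"), $m)) {
				continue;
			}
			// $m[8] is the request path (eighth capture group)
			if ($pathRegex !== null && !preg_match($pathRegex, $m[8])) {
				continue;
			}
			yield $m;
		}
	} finally {
		fclose($handle);
	}
}

// Usage: write two made-up lines to a temp file, then stream only the .flv request
$tmp = tempnam(sys_get_temp_dir(), 'log');
file_put_contents($tmp,
	'192.0.2.10 - - [09/Jan/2010:00:35:12 -0800] "GET /a.flv HTTP/1.1" 200 10 "-" "curl"' . "\n" .
	'192.0.2.11 - - [09/Jan/2010:00:35:13 -0800] "GET /b.html HTTP/1.1" 200 20 "-" "curl"' . "\n");

$paths = array();
foreach (stream_apache_log($tmp, '/\.flv$/i') as $row) {
	$paths[] = $row[8]; // collect just the request path of each matching row
}
unlink($tmp);
print_r($paths);
```

The trade-off is that you can only walk the results once with foreach. If you need them all in an array again, you're back to the memory cost.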

This entry was written by Eric Lamb and posted on January 09th, 2010 at 12:35 am and is filed under IT, Programming.

10 Comments

  1. abcphp.com says:
    January 09, 2010 at 02:13 am

    Parse Apache Log Files With PHP | Made of Everything You’re Not | Eric Lamb…

    Parsing the log files generated by Apache is one of those random tasks with a random occurrence in my world. This is a task that, until recently, hadn’t come up | Eric Lamb…

  2. Daniel says:
    January 11, 2011 at 11:00 am

    Nice code!

    You have a copy and paste typo on the line:
    $formated_log['user'] = $logs[2];

    (should be 3, not 2)

    Cheers,

    DM

  3. matsu says:
    July 22, 2011 at 08:24 pm

    Thank you for your work.
    I was searching PHP apache log parser.

  4. Pete says:
    August 26, 2011 at 10:10 am

    Seems like there’s a bug in this version that wasn’t in the original class:

    Fatal error: Call to undefined method apache_log_parser::getData() in /wwwroot/example.php on line 48

    include 'apache-log-parser.php';
    $log = 'Jun-2011.log';

    $parser = new apache_log_parser($log); // Create an apache log parser
    if($parser == false)
    {
      echo "Unable to create parser for: $log";
      exit;
    }

    $parser->filters = array(
    'path' => array('regex' => '/^.*\.(PHP|php)$/') //pull only php files
    );

    $log_data = $parser->getData();
    echo "<pre>";
    print_r($log_data); // print out the array
    echo "</pre>";

  5. DJ says:
    October 17, 2011 at 07:18 am

    You sir, are a wonderful human being.  Saved me my morning.  Great code.

  6. yauri says:
    November 30, 2011 at 05:47 am

    Thanks for the class :)
    I enhance the code with jQuery Plugin

  7. Nikhil says:
    March 15, 2012 at 04:42 am

    Hi there thank you for the code, i was about to sit down and write my own parser then stumbled on your code, TYVM :D.

  8. Nikhil says:
    March 16, 2012 at 05:18 am

    Hi, sorry about this, but just for anyone else reading: I had a problem with memory consumption. When I started parsing large log files it would give the memory limit exceeded error, so I wrote a small mod where you can parse the file only 1 line at a time. If anyone needs it, please feel free to email me.

  9. frank says:
    May 04, 2012 at 09:33 am

    can i get that mod /

  10. Spudley says:
    August 17, 2012 at 05:52 am

    I’ve also written a class to do basically the same thing. I felt it would be useful to share it here.

    I’ve put it into github for general use:  https://github.com/Spudley/ApacheLogIterator

    I originally intended to use your class, but I have a *very* large log file to process, so I couldn’t afford to load the whole thing into memory. Therefore, in my class I’ve extended SplFileObject. This means it’s an iterator class, so I can loop through it using foreach(), but only have a single record in memory at any one time. Best of both worlds. It also means I don’t have to write any file handling code, which keeps things short.

    Feel free to have a look. I hope you find it useful.

Copyright © 2008 - 2013 Eric Lamb - All rights reserved