Using Apache's access.log as a basis for the web analytics system Piwik
Piwik is quite a popular and commonly used web analytics system written in PHP. It can be seen as an alternative to Google Analytics and logs all sorts of information into a MySQL database whenever somebody visits your homepage.
If you switch to Piwik as your primary web analytics software, logging starts from scratch. Piwik only begins collecting data once it is installed, so right after installation it hasn't collected enough to give you a real overview of your visitors, and you can't compare the new numbers with anything. You simply have to wait.
Every Apache webserver includes a file called access.log that logs every access to your webserver if activated. Is there a way to convert this file or import it into Piwik? access.log contains the IP address of every visitor, date and time, the HTTP request line, status code and the size of the returned object. Additionally, it even logs the referer and the user agent. Of course this doesn't include the installed plugins and the display resolution but it is still quite useful.
I've got two questions: Firstly, is it even reasonable to convert your access.log in that case or is there a really important piece of information that isn't collected by Apache but would be essential for Piwik? Secondly, is it easy to write such a converter and wouldn't it confuse Piwik when certain bits of information are missing?
The following picture shows the database schema of Piwik:
Piwik Database Schema http://dev.piwik.org/trac/browser/trunk/misc/db-schema.png?format=raw
Could all required fields in the table piwik_log_visit be filled using your access.log file so that Piwik would work and show valid information about the website visitors? What would a script look like that converts all the data into the database, and could PHP handle it at all (think of the maximum execution time)? What would a regular expression look like that prepares your access.log for conversion?
As per this ticket: http://dev.piwik.org/trac/ticket/703
The description says "I know this is already done by a Piwik user who is going to contribute his code soon.", so I would expect to see this implemented in Piwik in the future. It is possible to import logs; only a few pieces of information will be missing (resolution, plugin support, etc.).
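To give an idea of the parsing step asked about in the question, here is a minimal sketch in Python (not PHP, and not the code from the ticket). It assumes the log uses Apache's standard "combined" format and that month names are in English. The names `COMBINED_RE`, `parse_line` and `iter_hits` are made up for this example, and nothing here writes to Piwik's tables; mapping the parsed fields onto piwik_log_visit is left open.

```python
import re
from datetime import datetime

# Assumed format: Apache "combined" log, i.e.
# %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
# Limitation of this sketch: quoted fields containing escaped quotes (\")
# are not handled.
COMBINED_RE = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    # Referer and user agent are optional so "common" format lines also match.
    r'(?: "(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)")?'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED_RE.match(line)
    if match is None:
        return None
    hit = match.groupdict()
    # Example timestamp: 10/Oct/2000:13:55:36 -0700
    hit["time"] = datetime.strptime(hit["time"], "%d/%b/%Y:%H:%M:%S %z")
    hit["status"] = int(hit["status"])
    # Apache writes "-" when no body was sent.
    hit["size"] = 0 if hit["size"] == "-" else int(hit["size"])
    return hit


def iter_hits(path):
    """Yield parsed hits one at a time so the whole file never sits in memory."""
    with open(path, encoding="utf-8", errors="replace") as log_file:
        for line in log_file:
            hit = parse_line(line)
            if hit is not None:
                yield hit
```

Reading the file lazily like this keeps memory use flat, and running the import from the command line rather than through a web request would sidestep the execution time limit. Fields such as screen resolution and installed plugins cannot be recovered this way, because Apache never sees them.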