What to store in a typical access log?
I have thought of the following:
- user id if available
- user ip address
- timestamp
- action executed
Am I missing something? Are there any guidelines?
There are really different kinds of logs. The most common one records page accesses and may have the format Sir Darius describes (this is typically called an access log).
Then there's also the logging of internal actions (this is typically called an application log). Many of those entries will be at a low logging level (meaning that you normally don't see them, but can switch them on temporarily).
If you don't take precautions, you will get a log like:
- Query XYZ executed in 2ms
- Query ABC executed in 1ms
- Starting transaction
- Order sent
- Starting transaction
- Order deleted
- Query ABC executed in 1ms
When investigating a production issue, this is often not very useful. Any two adjacent lines may belong to the same user or to different users, and you can't tell which.
I found it useful to have a format like the following for every such log line:
- Time
- IP address
- Session ID
- User ID
- Thread ID/name
- Sequence ID
The thread ID or name is important so you can distinguish the situation where the same user is making multiple requests to your app at the same time.
The sequence ID is a counter that counts every request the user makes since the beginning of their session (in Java I used an AtomicInteger for this). It is easier to grep on than the thread ID when examining everything that took place during a specific request, since thread IDs are of course re-used when serving completely different requests. It's also handy when a single request is handled internally by multiple threads.
With a little effort, a log format like this allows you to extract the actions of a single user from your log and zoom in to individual requests.
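As a rough illustration of the prefix described above, here is a minimal Java sketch. The class name SessionLogContext, its methods, and the exact field order and separators are assumptions made up for this example, not something taken from a particular logging library. Only the use of AtomicInteger for the sequence ID comes from the answer itself.

```java
import java.time.Instant;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-session holder for the fields that prefix every log line. */
class SessionLogContext {
    private final String ipAddress;
    private final String sessionId;
    private final String userId;
    // Counts requests since the session began; AtomicInteger keeps the
    // counter correct when the same user issues requests concurrently.
    private final AtomicInteger requestCounter = new AtomicInteger();

    SessionLogContext(String ipAddress, String sessionId, String userId) {
        this.ipAddress = ipAddress;
        this.sessionId = sessionId;
        this.userId = userId;
    }

    /** Call once at the start of each request; returns that request's sequence ID. */
    int nextSequenceId() {
        return requestCounter.incrementAndGet();
    }

    /** Builds one line: time, IP, session ID, user ID, thread name, sequence ID, message. */
    String format(Instant time, int sequenceId, String message) {
        return String.format("%s %s %s %s [%s] #%d %s",
                time, ipAddress, sessionId, userId,
                Thread.currentThread().getName(), sequenceId, message);
    }

    public static void main(String[] args) {
        SessionLogContext ctx = new SessionLogContext("192.0.2.10", "sess-42", "frank");
        int seq = ctx.nextSequenceId(); // one ID for the whole request
        System.out.println(ctx.format(Instant.now(), seq, "Starting transaction"));
        System.out.println(ctx.format(Instant.now(), seq, "Query ABC executed in 1ms"));
    }
}
```

With lines shaped like this, grepping for the session ID isolates one user, and adding the sequence ID (here "#1") narrows the result to a single request. In a real application the same fields would more likely be attached through the logging framework's context mechanism than formatted by hand.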
There are established formats you should follow if you intend to use the access logs to gather statistics with tools like AWStats or Webalizer.
For example, there is the Combined Log Format:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"
defined in Apache as:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined
This format is commonly used around the web and is understood by most software.
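If you want to read such lines yourself rather than hand them to an analysis tool, a regular expression with one capture group per field is usually enough. The following Java sketch is only an illustration: the class CombinedLogLine and its field names are hypothetical, and the pattern is simplified in that it assumes the quoted fields contain no escaped double quotes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical value class for one line in Combined Log Format. */
class CombinedLogLine {
    // One group per field of:
    // %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"
    // Simplification: quoted fields must not contain escaped quotes.
    private static final Pattern COMBINED = Pattern.compile(
            "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\S+)"
            + " \"([^\"]*)\" \"([^\"]*)\"$");

    final String host;
    final String user;
    final String timestamp;
    final String request;
    final int status;
    final long bytes; // -1 when the log records "-" instead of a size
    final String referer;
    final String userAgent;

    private CombinedLogLine(Matcher m) {
        host = m.group(1);
        // group(2) is the identd field (%l), which is almost always "-"
        user = m.group(3);
        timestamp = m.group(4);
        request = m.group(5);
        status = Integer.parseInt(m.group(6));
        bytes = "-".equals(m.group(7)) ? -1 : Long.parseLong(m.group(7));
        referer = m.group(8);
        userAgent = m.group(9);
    }

    /** Parses one line, or throws if it does not match the expected layout. */
    static CombinedLogLine parse(String line) {
        Matcher m = COMBINED.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not in Combined Log Format: " + line);
        }
        return new CombinedLogLine(m);
    }

    public static void main(String[] args) {
        CombinedLogLine entry = parse("127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "
                + "\"GET /apache_pb.gif HTTP/1.0\" 200 2326 "
                + "\"http://www.example.com/start.html\" "
                + "\"Mozilla/4.08 [en] (Win98; I ;Nav)\"");
        System.out.println(entry.user + " " + entry.status + " " + entry.request);
    }
}
```

The timestamp is kept as a string here to keep the sketch short. Converting it to a date type is left out on purpose.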
The W3C defines another format called Extended Log File Format, which is specified here: http://www.w3.org/TR/WD-logfile.html
This format is used for example by IIS, and is understood by AWStats.