PHP/MySQL - logging user activities & huge database load

Assuming we have to log all the user activities of a community, I guess that our database will become huge in a short time, so my question is:

Is having a huge DB table an acceptable compromise in order to offer this kind of service? Or can we do this in a more efficient way?

EDIT: the kind of activity to be logged is a "classic" social-networking activity log, where people can look at what others are doing or have done and vice versa, so it will track, for example, when a user edits their profile, posts something, logs in, logs out, etc.

EDIT 2: my table is already optimized to store only IDs:

log_activity_table(
id int
user int 
ip varchar
event varchar #event-name
time varchar
callbacks text #some-info-from-the-triggered-event
)


I'm actually working on a similar system, so I'm interested in the answers you get.

For my project, having a full historical accounting was not important, so we chose to keep the table fairly lean, much like what you're doing. Our tables look something like this:

CREATE TABLE `activity_log_entry` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `event` varchar(50) NOT NULL,
  `subject` text,
  `publisher_id` bigint(20) NOT NULL,
  `created_at` datetime NOT NULL,
  `expires_at` datetime NOT NULL,
  PRIMARY KEY (`id`),
  KEY `event_log_entry_action_idx` (`event`),
  KEY `event_log_entry_publisher_id_idx` (`publisher_id`),
  CONSTRAINT `event_log_entry_publisher_id_user_id` 
    FOREIGN KEY (`publisher_id`)  
    REFERENCES `user` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8

We decided that we don't want to store history forever, so we will have a cron job that deletes history after a certain time period. We have both created_at and expires_at columns simply out of convenience. When an event is logged, the model sets these columns automatically; for expires_at we use a simple strftime('%F %T', strtotime($expr)), where $expr is a string like '+30 days' that we pull from configuration.
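
As a rough sketch of what such a cron-driven purge could look like against the table above: the index name and the batch size of 10000 are assumptions made for illustration, not part of our actual setup.

```sql
-- Hypothetical index so the purge can find expired rows without a full table scan.
ALTER TABLE `activity_log_entry`
  ADD KEY `activity_log_entry_expires_at_idx` (`expires_at`);

-- Statement the cron job would run; repeat it until it affects 0 rows.
-- Deleting in bounded batches keeps each transaction short on a large table.
DELETE FROM `activity_log_entry`
WHERE `expires_at` < NOW()
ORDER BY `expires_at`
LIMIT 10000;
```

Deleting in batches rather than in one statement is a design choice: a single huge DELETE on an InnoDB table can hold locks and grow the undo log for a long time.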

Our subject column is similar to your callbacks one. We chose not to relate the subject of the activity directly to other tables, because not every event subject will necessarily have a table. Holding that relationship is not even important, because the only thing we do with this event log is display activity feed messages. We store a serialized value object of data pertinent to the event for use in predetermined message templates. We also directly encode what the event pertained to (i.e. profile, comment, status, etc.).

Our events (aka activities) are simple strings like 'update', 'create', etc. These are used in some queries and, of course, to help determine which message to display to a user.
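
To make this concrete, here is a minimal sketch of logging one event and reading a user's feed from the table above. The JSON payload format, the user id 42 (which must already exist in the user table because of the foreign key), and the 30-day lifetime are all illustrative assumptions; any serialization format would work for subject.

```sql
-- Log a profile update; the subject payload shape is hypothetical.
INSERT INTO `activity_log_entry`
  (`event`, `subject`, `publisher_id`, `created_at`, `expires_at`)
VALUES
  ('update', '{"type":"profile","fields":["avatar"]}', 42, NOW(), NOW() + INTERVAL 30 DAY);

-- Fetch the 20 most recent activities of that user for the feed.
SELECT `event`, `subject`, `created_at`
FROM `activity_log_entry`
WHERE `publisher_id` = 42
ORDER BY `created_at` DESC
LIMIT 20;
```

The application would then pick a message template based on event and the type stored in subject.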

We are still in the early stages so this may change quite a bit (possibly based on comments and answers to this question) but given our requirements it seemed like a good approach.


Case: every user activity has its own table, e.g. like, comment, post, become a member.

Then each of these tables should have a key associating the entry with a user. Given a user, you can get recent activities by querying each table by the user_key.

Hence, if you don't have a schema yet or you are free to change it, go with separate tables for the different activities and search across them.

Case: some activities are generic and don't have an individual table.

Then have one table for generic activities and search it along with the other activity tables.
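
Searching several activity tables for one user could be sketched with a UNION ALL. The table names (comment, post, generic_activity) and the columns id, user_key, event and created_at are hypothetical; substitute whatever your schema uses.

```sql
-- Recent activity of user 42 across per-activity tables plus the generic table.
(SELECT 'comment' AS activity, `id`, `created_at`
   FROM `comment` WHERE `user_key` = 42)
UNION ALL
(SELECT 'post', `id`, `created_at`
   FROM `post` WHERE `user_key` = 42)
UNION ALL
(SELECT `event`, `id`, `created_at`
   FROM `generic_activity` WHERE `user_key` = 42)
ORDER BY `created_at` DESC
LIMIT 20;
```

An index on (user_key, created_at) in each table would probably be needed for this to stay fast as the tables grow.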


Do you need to store the specific activity of each user, or do you just want to log the kind of activity that is happening over time? If the latter, you might consider something like RRDtool (or a similar approach) and store the amount of activity over different time steps in a circular buffer whose size stays constant over time. See http://en.wikipedia.org/wiki/RRDtool.
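
If you would rather stay inside MySQL, the circular-buffer idea can be approximated with a fixed set of slots that get overwritten. This is only a sketch of the "similar approach", not how RRDtool itself stores data; the table name, the hourly resolution, and the one-week window (168 slots) are assumptions.

```sql
-- One row per (event, slot); at most 168 rows per event, however long the site runs.
CREATE TABLE `activity_counter` (
  `event` varchar(50) NOT NULL,
  `slot` tinyint unsigned NOT NULL,   -- hour number modulo 168, so 0..167
  `hour_no` int unsigned NOT NULL,    -- hours since the Unix epoch that the slot currently holds
  `hits` int unsigned NOT NULL,
  PRIMARY KEY (`event`, `slot`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Count one 'login'. If the slot still holds a previous week's hour, reset it to 1.
-- Assignments are evaluated left to right, so `hits` sees the old `hour_no`.
INSERT INTO `activity_counter` (`event`, `slot`, `hour_no`, `hits`)
VALUES ('login',
        (UNIX_TIMESTAMP() DIV 3600) MOD 168,
        UNIX_TIMESTAMP() DIV 3600,
        1)
ON DUPLICATE KEY UPDATE
  `hits` = IF(`hour_no` = VALUES(`hour_no`), `hits` + 1, 1),
  `hour_no` = VALUES(`hour_no`);
```

The trade-off is the one described above: you lose per-user detail but the table size stays constant.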
