Where to store users' visited pages?

I have a project with posts, for example. The task is this: I must show each user the posts he visited most recently. My solution: every time a user visits a topic that is new to him, I create a new record in the table visits. The table has the following structure: id, user_id, post_id, last_visit. The visits table now has ~14,000,000 records and is still growing every day. Maybe my solution isn't optimal and there is another way to store users' visits? It's important to save every visit as a standalone record, because I also have a feature that selects and uses users' visits. And I can't purge this table, because the data could be needed later, in a month or a year. How can I optimize this?


Nope, you don't really have much choice other than to store your visit data in a table with columns for (at a bare minimum) user id, post id, and timestamp if you need to track the last time that each user visited each post.

I question whether you need an id field in that table, rather than using a composite key on (user_id, post_id), but I'd expect that to have a minor effect, provided that you already have a unique index on (user_id, post_id). (If you don't have an index on that pair of fields, adding one should improve query performance considerably and making it a unique index or composite key will protect against accidentally inserting duplicate records.)
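
As a rough illustration of the composite-key idea, here is a minimal Python sketch. It uses an in-memory SQLite database as a stand-in for the real one (an assumption, chosen only so the example runs anywhere), and record_visit is a hypothetical helper name. In MySQL the same effect is usually achieved with INSERT ... ON DUPLICATE KEY UPDATE rather than SQLite's INSERT OR REPLACE.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE visits (
        user_id    INTEGER NOT NULL,
        post_id    INTEGER NOT NULL,
        last_visit TEXT    NOT NULL,
        PRIMARY KEY (user_id, post_id)  -- composite key, no surrogate id column
    )
    """
)


def record_visit(conn, user_id, post_id, visited_at):
    """Insert the visit, or overwrite the existing row for the same pair."""
    conn.execute(
        "INSERT OR REPLACE INTO visits (user_id, post_id, last_visit) "
        "VALUES (?, ?, ?)",
        (user_id, post_id, visited_at),
    )


record_visit(conn, 1, 42, "2024-01-01 10:00:00")
record_visit(conn, 1, 42, "2024-01-02 09:30:00")  # same pair: no duplicate row
rows = conn.execute("SELECT user_id, post_id, last_visit FROM visits").fetchall()
```

Because the key is (user_id, post_id), the second call replaces the first row instead of adding another one, so the table holds at most one row per user and post.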

If performance is still an issue despite proper indexing, you should be able to improve it a bit by segmenting the table into a collection of smaller tables, but segment it by user_id or post_id (rather than by date as previous answers have suggested). If you break it up by user or post id, then you will still be able to determine whether a given user has previously viewed a given post and, if so, on what date with only a single query. If you segment it by date, then that information will be spread across all tables and, in the worst-case scenario of a user who has never previously viewed a post (which I expect to be fairly common), you'll need to separately query each and every table before having a definitive answer.

As for whether to segment it by user id or by post id, that depends on whether you will more often be looking for all posts viewed by a user (segment by user_id to get them all in one query) or all users who have viewed a post (segment by post_id).
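
Segmenting by user_id could look roughly like the sketch below. Everything in it is illustrative: the segment count, the visits_N table names, and the helper functions are hypothetical, and SQLite again stands in for the real database. MySQL's native table partitioning may be a way to get a similar effect without routing in application code.

```python
import sqlite3

NUM_SEGMENTS = 4  # hypothetical number of segment tables


def segment_table(user_id):
    """Name of the segment table that holds every visit of this user."""
    return f"visits_{user_id % NUM_SEGMENTS}"


conn = sqlite3.connect(":memory:")
for n in range(NUM_SEGMENTS):
    conn.execute(
        f"CREATE TABLE visits_{n} ("
        "user_id INTEGER NOT NULL, post_id INTEGER NOT NULL, "
        "last_visit TEXT NOT NULL, PRIMARY KEY (user_id, post_id))"
    )


def record_visit(user_id, post_id, visited_at):
    """Write the visit into the user's segment, replacing an older row."""
    conn.execute(
        f"INSERT OR REPLACE INTO {segment_table(user_id)} VALUES (?, ?, ?)",
        (user_id, post_id, visited_at),
    )


def last_visit(user_id, post_id):
    """One query against one segment; None means the user never visited."""
    row = conn.execute(
        f"SELECT last_visit FROM {segment_table(user_id)} "
        "WHERE user_id = ? AND post_id = ?",
        (user_id, post_id),
    ).fetchone()
    return row[0] if row else None
```

Since all rows for a user live in a single segment, the "never visited" case is answered by one lookup rather than one lookup per table.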


If it doesn't need to be long-lasting, you could store it in the session instead. If it does, you could either split the records across tables, say one per month, or store only the last 5-10 pages visited and delete old ones as new ones come in. You could also limit it to pages visited today, this week, etc.


If you do need all 14 million records, I would create another historical table to archive the visits that are not the most relevant for the day-to-day site operation.

At the end of the month (or week, or quarter, etc.) have some scheduled logic archive records beyond a certain cutoff point to the historical table and so reduce the number of records in the "live" table. This should help increase the query speed on the "live" table, since it would hold fewer records.

If you do need to query all of the data, you can use both tables and have all of the data available to you.
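
The archiving job and the combined query might be sketched as follows. The visits_archive table name, the helper functions, and the cutoff format are assumptions, and SQLite is used only as a runnable stand-in. Timestamps are stored as ISO-style text so that string comparison matches chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
schema = (
    "(user_id INTEGER NOT NULL, post_id INTEGER NOT NULL, "
    "last_visit TEXT NOT NULL, PRIMARY KEY (user_id, post_id))"
)
conn.execute("CREATE TABLE visits " + schema)          # "live" table
conn.execute("CREATE TABLE visits_archive " + schema)  # historical table


def archive_older_than(conn, cutoff):
    """Move rows with last_visit before cutoff into visits_archive."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "INSERT OR REPLACE INTO visits_archive "
            "SELECT user_id, post_id, last_visit FROM visits "
            "WHERE last_visit < ?",
            (cutoff,),
        )
        moved = conn.execute(
            "DELETE FROM visits WHERE last_visit < ?", (cutoff,)
        ).rowcount
    return moved


def all_visits(conn, user_id):
    """Combine both tables; keep the newest visit per post, newest first."""
    return conn.execute(
        "SELECT post_id, MAX(last_visit) AS latest FROM ("
        "  SELECT post_id, last_visit FROM visits WHERE user_id = ?"
        "  UNION ALL"
        "  SELECT post_id, last_visit FROM visits_archive WHERE user_id = ?"
        ") AS v GROUP BY post_id ORDER BY latest DESC",
        (user_id, user_id),
    ).fetchall()
```

The copy and the delete run in the same transaction, so a failure partway through should not lose rows. The GROUP BY in all_visits handles a post that was archived and later visited again, which would otherwise show up twice.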


You could delete the ones you don't need. If you only want to show the last 10 visited posts, then run

DELETE FROM visits WHERE user_id = ? AND id NOT IN (SELECT id FROM (SELECT id FROM visits WHERE user_id = ? ORDER BY last_visit DESC LIMIT 10) AS keep_rows);

(I think that's the best way to do that query; can any MySQL guru tell me otherwise? The inner SELECT is wrapped in a derived table because MySQL rejects LIMIT directly inside an IN subquery and does not let the subquery of a DELETE read the table being deleted from. You can use ORDER BY in a DELETE, but its LIMIT takes only a row count, so you can't write LIMIT 10, 100 there.)

after inserting/updating each new row, or every few days if you like.


Having a structure like (id, user_id, post_id, last_visit) for your visits table makes it appear as though you are saving all posts, not just the last post per topic. Don't you need a topic ID in there somewhere, so that you can determine what their last post PER TOPIC was and know which row to replace when they post in the same topic more than once?


Store the post_ids in $_SESSION; then, using MySQL's IN with a single SELECT query, you will be able to show the user his visited posts. All those ids will be destroyed once the member closes his browser, but this is still much faster than writing every visit to the database.
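
A minimal sketch of that idea, written in Python rather than PHP so it can be run standalone: a plain dict stands in for $_SESSION, and the posts table, its title column, and both helper functions are hypothetical.

```python
import sqlite3

session = {"visited_post_ids": []}  # stand-in for PHP's $_SESSION


def remember_visit(session, post_id):
    """Keep each visited post id once, in first-visit order."""
    ids = session["visited_post_ids"]
    if post_id not in ids:
        ids.append(post_id)


def fetch_visited_posts(conn, post_ids):
    """Load all visited posts with a single SELECT ... IN query."""
    if not post_ids:
        return []  # an empty IN () list is a syntax error in MySQL
    placeholders = ", ".join("?" for _ in post_ids)
    return conn.execute(
        f"SELECT id, title FROM posts WHERE id IN ({placeholders}) ORDER BY id",
        list(post_ids),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [(1, "first"), (2, "second"), (3, "third")],
)
```

The ids are bound as parameters, one placeholder per id, so the query text never contains values taken from the session.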

edit: sorry, I didn't notice that you must store those records in the database and use them months later. In that case I have no idea how to optimize it, but with 14 million records you should definitely use indexes.
