Why are certain variables used in "hotness" calculations?
I recently read a blog post about the Reddit "hotness" formula. The formula shown below seems to be the one used. There are a couple of values in it whose choice I don't understand, though. I plan on using this formula as a reference for an app I am involved in, so I'd like to know the reasoning behind them.
1st Dec 8, 2005 - Why use this date? Also, why use an offset time at all? Why not use epoch? Was this an arbitrary date used so that it is platform independent?
2nd - 45000 - Why use 45000 as a divisor? Is this an arbitrary number or does it have a specific meaning or purpose?
t = (time of entry post) - (Dec 8, 2005)
x = upvotes - downvotes
y = {1 if x > 0, 0 if x = 0, -1 if x < 0}
z = {1 if |x| < 1, otherwise |x|}
log(z) + (y * t)/45000
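A minimal Python sketch of the formula as written above, for anyone who wants to experiment with it. Several things here are assumptions rather than facts from the blog post: the reference time is taken as midnight UTC on Dec 8, 2005, the logarithm is taken as base 10, and z is clamped to at least 1 using the absolute value of x so that the logarithm is always defined. The names `REFERENCE`, `DIVISOR` and `hotness` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from math import log10

# Assumption: the formula only says "Dec 8, 2005"; midnight UTC is used here.
REFERENCE = datetime(2005, 12, 8, tzinfo=timezone.utc)
DIVISOR = 45000  # seconds, as in the formula above


def hotness(upvotes: int, downvotes: int, posted: datetime) -> float:
    """Score a post with the formula above (base-10 log assumed)."""
    t = (posted - REFERENCE).total_seconds()
    x = upvotes - downvotes
    y = (x > 0) - (x < 0)   # sign of x: 1, 0 or -1
    z = max(abs(x), 1)      # clamped so log10 never sees zero
    return log10(z) + (y * t) / DIVISOR


# A post with 10 net upvotes, submitted 45000 seconds after the reference time.
posted = REFERENCE + timedelta(hours=12.5)
print(hotness(10, 0, posted))  # log10(10) + 45000/45000 = 2.0
```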
1st Dec 8, 2005 - Why use this date? Also, why use an offset time at all? Why not use epoch? Was this an arbitrary date used so that it is platform independent?
I suspect this was the "epoch" date for Reddit's original code. That would make it a good choice, as it keeps the t variable starting close to zero, which would keep the values small and the function more stable.
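To get a feel for the difference, here is a rough sketch comparing the time term measured from the Unix epoch with the same term measured from Dec 8, 2005. The reference time of midnight UTC, the sample post date, and the helper name `time_term` are all assumptions for illustration. The two results differ only by a constant (the span from 1970 to the reference date, divided by 45000), so the offset mainly keeps the numbers small.

```python
from datetime import datetime, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Assumption: midnight UTC, since only the date is given.
SITE_EPOCH = datetime(2005, 12, 8, tzinfo=timezone.utc)


def time_term(posted: datetime, epoch: datetime, divisor: int = 45000) -> float:
    """The t / 45000 part of the score, measured from the given epoch."""
    return (posted - epoch).total_seconds() / divisor


posted = datetime(2006, 1, 1, tzinfo=timezone.utc)  # hypothetical post time
from_unix = time_term(posted, UNIX_EPOCH)  # roughly 25246
from_site = time_term(posted, SITE_EPOCH)  # roughly 46
print(from_unix, from_site, from_unix - from_site)
```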
2nd - 45000 - Why use 45000 as a divisor? Is this an arbitrary number or does it have a specific meaning or purpose?
This is effectively a scaling factor for time. The larger this number, the less effect age has on the overall equation. I suspect 45000 was chosen after some testing because it was found to provide a reasonable decay rate given the chosen epoch.
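Put concretely, 45000 seconds is 12.5 hours. If the logarithm is base 10 (an assumption, since the formula just says log), then for two positively scored posts the older one needs ten times the net votes to tie a post submitted 12.5 hours later. The sketch below shows that trade-off; the function names and the alternative divisor are hypothetical.

```python
def hours_per_tenfold(divisor_seconds: float) -> float:
    """Hours of recency worth as much as a 10x increase in net votes."""
    return divisor_seconds / 3600


def votes_to_match(age_gap_seconds: float, divisor_seconds: float = 45000) -> float:
    """Net-vote multiplier an older post needs to tie a newer one.

    Follows from log10(z_old) = log10(z_new) + gap / divisor.
    """
    return 10 ** (age_gap_seconds / divisor_seconds)


print(hours_per_tenfold(45000))   # 12.5
print(votes_to_match(45000))      # one divisor's worth of age gap
print(hours_per_tenfold(90000))   # a larger divisor makes age matter less
```

Doubling the divisor doubles the window, which matches the point above that a larger number weakens the effect of age.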