开发者

Transaction activity running total in SQL

I am working with a transactions table. I want to generate a number 开发者_开发技巧of variables to determine the number of transactions performed by the same customer in the last 5 and 30 minutes and 1,2,3 and 4 hours and 1,2,5, and 10 days. What is an efficient way to generate such columns in the transactions table? -- The solution would be a running total until each transaction time. I have Oracle and other SQL variants at my disposal.


I don't think you want to store these statistics physically, as they would be constantly changing due to their temporal nature. The solution really depends on how you'll be using these statistics. I can think of 2 main ways:

  1. Ad-Hoc queries for a given unique customer - In this case I'd have a stored procedure taking the customer id and timeframe as parameters, and returning the relevant number of transactions per customer.

  2. 'Standardised' reporting for multiple customers across multiple windows - In this case, you probably want to have a view per time period selecting all customers and the number of transactions per customer for the appropriate time period.

Perhaps if you can give us some more information on your use-cases, we'll be able to be more specific..


For a single customer:

The best solution I can come up with isn't the use of analytics, but with subfactored queries / common table expressions. Oracle's generally clever enough to know whether or not to turn them into a temporary table, reducing the cost of going after mutiple passes over the same data.

with txns as 
       (select customer_id
               txn_id,
               txn_ts
          from transaction_table
         where customer_id = ?
           AND txn_ts >= SYSTIMESTAMP - NUMTODSINTERVAL(10, 'DAY')
       )
select customer_id,
       (select count(*) from txns 
         where event_ts >= systimestamp - numtodsinterval(5/1440, 'day')) 
         as txn_5_min,
       (select count(*) from txns 
         where event_ts >= systimestamp - numtodsinterval(30/1440, 'day')) 
         as txn_30_min,
       (select count(*) from txns 
         where event_ts >= systimestamp - numtodsinterval(1/24, 'day')) 
         as txn_1_hour,
       (select count(*) from txns 
         where event_ts >= systimestamp - numtodsinterval(2/24, 'day')) 
         as txn_2_hour,
       (select count(*) from txns 
         where event_ts >= systimestamp - numtodsinterval(3/24, 'day')) 
         as txn_3_hour,
       (select count(*) from txns 
         where event_ts >= systimestamp - numtodsinterval(4/24, 'day')) 
         as txn_4_hour,
       (select count(*) from txns 
         where event_ts >= systimestamp - numtodsinterval(1, 'day')) 
         as txn_1_day,
       (select count(*) from txns 
         where event_ts >= systimestamp - numtodsinterval(2, 'day')) 
         as txn_2_day,
       (select count(*) from txns 
         where event_ts >= systimestamp - numtodsinterval(5, 'day')) 
         as txn_5_day,
       (select count(*) from txns 
         where event_ts >= systimestamp - numtodsinterval(10, 'day')) 
         as txn_10_day
  from customer
 WHERE customer_id = ?;

You would be able to use a similar implementation for a multi-customer case, although efficiency would definitely suffer. Consider whether or not your 5-minute level data is going to be stale for all customers by the time you've finished retrieving it.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜