开发者

How Optimize sql query make it faster

I have a very simple small database, 2 of tables are:

Node (Node_ID, Node_name, Node_Date) : Node_ID is primary key

Citation (Origin_Id, Target_Id) : PRIMARY KEY (Origin_Id, Target_Id) each is FK in Node

Now I write a query that first find all citations that their Origin_Id has a specific date and then I want to know what are the target dates of these records.

I'm using sqlite in python the Node table has 3000 record and Citation has 9000 records, and my query is like this in a function:

def cited_years_list(self, date):
    c=self.cur
    try:
    c.execute("""select n.Node_Date,count(*) from Node n INNER JOIN 
              (select c.Origin_Id AS Origin_Id, c.Target_Id AS Target_Id, n.Node_Date AS 
               Date from CITATION c INNER JOIN NODE n ON c.Origin_Id=n.Node_Id where
               CAST(n.Node_Date as INT)={0}) VW ON VW.Target_Id=n.Node_Id 
               GROUP BY n.Node_Date;""".format(date))
    cited_years=c.fetchall()
    self.conn.commit()
    print('Cited Years are : \n ',str(cited_years))
    except Exception as e:
       print('C开发者_开发问答ited Years retrival failed ',e)
   return cited_years

Then I call this function for some specific years, But it's crazy slowwwwwwwww :( (around 1 min for a specific year) Although my query works fine, it is slow. would you please give me a suggestion to make it faster? I'd appreciate any idea about optimizing this query :)

I also should mention that I have indices on Origin_Id and Target_Id, so the inner join should be pretty fast, but it's not!!!


If this script runs over a period of time, you may consider loading the database into memory. Since you seem to be coding in python, there is a connection function called connection.backup that can backup an entire database into memory. Since memory is much faster than disk, this should increase speed. Of course, this doesn''t do anything to optimize the statement itself, since I don't have enough of the code to evaluate what it is you are doing with the code.


Instead of COUNT(*) use MAX(n.Node_Date)

SQLite doesn't keep a counter on number of tables like mysql does but instead it scans all your rows everytime you call COUNT meaning extremely slow.. yet you can use MAX() to fix that problem.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜