Finding unused join in a SQL query

2023-02-23 01:07 问答作者：

I am currently maintaining a significant number of SQL queries. Some of them are created by copy/paste operations, then removing unnecessary fields and sometimes forgetting to remove the tables where these fields come from.

I am l开发者_开发百科ooking for a tool (or anything apart from eyes+brain) that, given a SQL query, would analyze which of the joined tables have no field selected in the SELECT part.

Do you know of such a tool?

Thank you

Hypothetically a tool could exist but it would only be guaranteed to be correct if all the following criteria where met for said join

Its A LEFT or OUTER JOIN or an INNER JOIN where the cardinality where known to be 1-1 And...
Its Not Referenced in a SELECT, HAVING, GROUP BY or WHERE and...
It is not a JOIN to function that has a side effect...

Probably why there's no deterministic warnings in SQL parsers the way there is for let's say an unused variable in C#. But it might be worth while to create a SQL checker that looks for some of these conditions and lets the user know that there's a possibility for optimization here.

Just because no fields are referenced in the SELECT, that does not mean that the join is not important to the logic of the query and results could change if the join is removed.

Consider this simple example: Return the name of all customers who purchased an item in 2011.

SELECT DISTINCT c.CustomerName
    FROM Customer c
        INNER JOIN Sales s
            ON c.CustomerID = s.CustomerID
                AND s.SalesDate >= '2011-01-01'

No columns from the Sales table are returned in the SELECT, yet the join is critical to returning the correct result set.

Bottom line: I think you'll need a human eye/brain code review to clean things up properly.

below function replaces all select fields with count(*) and 2nd part removes unnecessary joins. This function works only with tables that has aliases and should be tested for very complex queries and wont work if there is inner queries in join condition.

function sql_query_count($sql) {
        //replace select fields with count(*)
        $a = true;
        $b = 0;
        $first_select = stripos($sql, 'select ');
        $last_from = 0;
        $i = 0;
        while($a){
            $i++;
            $b = stripos($sql, ' from ',$last_from);
            $c = strripos(substr($sql, $last_from, $b), 'select ');
            if ($c == $first_select || $c === false || $i>100) $a = false;
            $last_from = $b+6;
        }        
        if (stripos($sql, 'order by') !== false)
            $sql = substr($sql, 0, stripos($sql, 'order by'));
        $sql1 = 'select count(*) as c ' . substr($sql, $b);

        //remove unnecessary joins
        $joins = preg_split("/ join /i", $sql1);
        $join_count = count($joins);
        $join_type = '';
        if (count($joins)>1){
            for ($index = 0; $index < $join_count+2; $index++) {
                $sql_new = '';
                $where = '';
                $i = 0;
                foreach ($joins as $key => $value) { $i++;
                    $parts = preg_split("/ where /i", trim($value));
                    $value = $parts[0];
                    unset($parts[0]);
                    $where = implode(' where ', $parts);
                    $occurence_count = 0;
                    if ($i > 1) {
                        $a = explode(' on ', $value);
                        $c = preg_replace('!\s+!', ' ', trim($a[0]));
                        $c = explode(' ', $c);
                        $occurence_count = substr_count($sql1, ' '.$c[1].'.')+substr_count($sql1, '='.$c[1].'.');
                    }
                    $t = explode(' ', $value);
                    $j = '';
                    if (trim(strtolower($t[count($t) - 1])) == 'inner'){
                        $j = 'inner';
                        unset($t[count($t) - 1]);
                    } else if (trim(strtolower($t[count($t) - 2])).' '.trim(strtolower($t[count($t) - 1])) == 'left outer'){
                        $j = 'left outer';
                        unset($t[count($t) - 1]);
                        unset($t[count($t) - 1]);
                    }
                    if ($occurence_count == 0 || $occurence_count > 1) $sql_new.= ' '.$join_type.(($join_type!='')?' join ':'').implode(' ', $t);                    
                    $join_type = $j;
                }
                $sql_new .= ' where '.$where;
                $sql1 = $sql_new;
                $joins = preg_split("/ join /i", $sql1);
            }
        }
        return $sql1;
    }

As mentioned above, identifying redundant INNER JOINs will be a problem, as sometimes they have an impact on the returned data, even if no data is actually selected from those tables.

Said that, identifying redundant LEFT JOINs is possible. I'm using this automatic query optimizer to optimize SQL queries automatically. Among other things, it can identify redundant left joins.

继续阅读：analysis query-optimization sql

Finding unused join in a SQL query

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？