开发者

Using Order By NULL in a UNION

I have a query (see below) that I have a custom developed UDF that is used to calculate whether or not certain points are within a polygon (first query in UNION) or circular (second query in UNION) shape.

select  e.inquiry_match_type_id 
        , a.geo_boundary_id 
        , GeoBoundaryContains(c.tpi_geo_boundary_coverage_type_id, 29.287437, -95.055807, a.lat, a.lon, a.geo_boundary_vertex_id ) in_out 
        , e.inquiry_id 
        , e.external_id 
        , COALESCE(f.inquiry_device_id,0) inquiry_device_id 
        , b.external_info1 
        , b.external_info2 
        , b.geo_boundary_id 
        , b.geo_boundary_type_id 
from    geo_boundary_vertex a 
        join geo_boundary b on b.geo_boundary_id = a.geo_boundary_id 
        join trackpoint_index_geo_boundary_mem c on c.geo_boundary_id = b.geo_boundary_id 
        join trackpoint_index_mem d on d.trackpoint_index_id = c.trackpoint_index_id 
        join inquiry_mem e on e.inquiry_id = b.inquiry_id left 
        outer join inquiry_device_mem f on f.inquiry_id = e.inquiry_id and f.device_id = 3201 
where   d.trackpoint_index_id = 3127  
        and b.geo_boundary_type_id = 3  
        and e.expiration_date >= now() 
group by 
        a.geo_boundary_id 
UNION 
select  e.inquiry_match_type_id      
        , b.geo_boundary_id      
        , GeoBoundaryContains( c.tpi_geo_boundary_coverage_type_id, 29.287437, -95.055807, b.centroid_lat, b.centoid_lon, b.radius ) in_out      
        , e.inquiry_id      
        , e.external_id      
        , COALESCE(f.inquiry_device_id,0) inquiry_device_id      
        , b.external_info1      
        , b.external_info2      
        , b.geo_boundary_id      
        , b.geo_boundary_type_id 
from    geo_boundary b 
        join trackpoint_index_geo_boundary_mem c on c.geo_boundary_id = b.geo_boundary_id 
        join trackpoint_index_mem d on d.trackpoint_index_id = c.trackpoint_index_id 
        join inquiry_mem e on e.inquiry_id = b.inquiry_id 
        left outer join inquiry_device_mem f on f.inquiry_id = e.inquiry_id and f.device_id = 3201 
where   d.trackpoint_index_id = 3127  
        and b.geo_boundary_type_id = 2  
        and e.expiration_date >= now() 
group by
        b.geo_boundary_id 

When I run an explain for the query I get the following:

 id      select_type     table       type     possible_keys                                                                                                                                              key                                  key_len     ref                       rows     Extra                           
 ------  --------------  ----------  -------  ---------------------------------------------------------------------------------------------------------------------------------------------------------  -----------------------------------  ----------  ------------------------  -------  ------------------------------- 
 1       PRIMARY         d           const    PRIMARY                                                                                                                                                    PRIMARY                              4           const                     1        Using temporary; Using filesort 
 1       PRIMARY         c           ref      PRIMARY,fk_mtp_idx_geo_boundary_mtp_idx,fk_mtp_idx_geo_boundary_geo_boundary,fk_mtp_idx_geo_boundary_mtp_mem_idx,fk_mtp_idx_geo_boundary_geo_boundary_mem  fk_mtp_idx_geo_boundary_mtp_idx      4           const                     9                                        
 1       PRIMARY         b           eq_ref   PRIMARY,fk_geo_boundary_inquiry,fk_geo_boundary_geo_boundary_type                                                                                          PRIMARY                              4           gothim.c.geo_boundary_id  1        Using where                     
 1       PRIMARY         e           eq_ref   PRIMARY                                                                                                                                                    PRIMARY                              4           gothim.b.inquiry_id       1        Using where                     
 1       PRIMARY         f           ref      fk_inquiry_device_mem_inquiry                                                                                                                              fk_inquiry_device_mem_inquiry        4           gothim.e.inquiry_id       2                                        
 1       PRIMARY         a           ref      fk_geo_boundary_vertex_geo_boundary                                                                                                                        fk_geo_boundary_vertex_geo_boundary  4           gothim.b.geo_boundary_id  11       Using where                     
 2       UNION           d           const    PRIMARY                                                                                                                                                    PRIMARY                              4           const                     1        Using temporary; Using filesort 
 2       UNION           c           ref      PRIMARY,fk_mtp_idx_geo_boundary_mtp_idx,fk_mtp_idx_geo_boundary_geo_boundary,fk_mtp_idx_geo_boundary_mtp_mem_idx,fk_mtp_idx_geo_boundary_geo_boundary_mem  fk_mtp_idx_geo_boundary_mtp_idx      4           const                     9                                        
 2       UNION           b           eq_ref   PRIMARY,fk_geo_boundary_inquiry,fk_geo_boundary_geo_boundary_type                                                                                          PRIMARY                              4           gothim.c.geo_boundary_id  1        Using where                     
 2       UNION           e           eq_ref   PRIMARY                                                                                                                                                    PRIMARY                              4           gothim.b.inquiry_id       1        Using where                     
 2       UNION           f           ref      fk_inquiry_device_mem_inquiry                                                                                                                              fk_inquiry_device_mem_inquiry        4           gothim.e.inquiry_id       2                                        
 (null)  UNION RESULT    <union1,2>  ALL      (null)                                                                                                                                                     (null)                               (null)      (null)                    (null)   Using filesort                  

 12 record(s) selected [Fetch MetaData: 1ms] [Fetch Data: 5ms] 

Now, I can split the queries up and use the ORDER BY NULL trick to get rid of the filesort however when I attempt to add that to the end of a UNION it doesn't work.

I am considering splitting the query apart into 2 queries or possibly re-writing it completely not to use a UNION (though that is a bit more difficult of course). The other thing I have working against me is that we have this in production and I'd like to limit changes - I would have loved just to be able to add ORDER BY NULL to the end of the query and be done with it, but it doesn't work w/ the UNION.

Any help wo开发者_如何学Pythonuld be greatly appreciated.


Normally, ORDER BY can be used for the individual queries within a UNION like this:

(
SELECT  *
FROM    table1, …
GROUP BY
        id
ORDER BY 
        NULL
)
UNION ALL
(
SELECT  *
FROM    table2, …
GROUP BY
        id
ORDER BY 
        NULL
)

However, as the docs state:

However, use of ORDER BY for individual SELECT statements implies nothing about the order in which the rows appear in the final result because UNION by default produces an unordered set of rows. Therefore, the use of ORDER BY in this context is typically in conjunction with LIMIT, so that it is used to determine the subset of the selected rows to retrieve for the SELECT, even though it does not necessarily affect the order of those rows in the final UNION result. If ORDER BY appears without LIMIT in a SELECT, it is optimized away because it will have no effect anyway.

This is of course a smart move, however, not too smart, since they forgot to optimize away the ordering behavior of GROUP BY as well.

So as for now, you should add a very high LIMIT to your individual queries:

(
SELECT  *
FROM    table1, …
GROUP BY
        id
ORDER BY 
        NULL
LIMIT 100000000
)
UNION ALL
(
SELECT  *
FROM    table2, …
GROUP BY
        id
ORDER BY 
        NULL
LIMIT 100000000
)

I'll post it as a bug to MySQL, hope they'll fix it in the next release, but meanwhile you could use this solution.

Note that a similar solution (using TOP 100%) was used to force ordering of the subqueries in SQL Server 2000, however, it stopped working in 2005 (ORDER BY has no effect in subqueries with TOP 100% for the optimizer).

It is safe to use it though since it won't break your queries even if the optimizer behavior changes in the next releases, but will just make them as slow as they are now.


Maybe try something like

SELECT *
FROM
(
    [your entire query here]
) DerivedTable
ORDER BY NULL

I've never used MySQL so forgive me if I'm missing the plot :)

EDIT: What if you run each individual query separately (which, as you say, works), but insert the data into a temporary table. Then, at the end just do a select from the temp table.


Have you tried changing the UNION to UNION ALL?

A UNION tries to remove duplicate rows. In order to do that, it would have to sort the intermediate results what might explain what you are seeing in your execution plan.

From MySQL Union

By default the MySQL UNION removes all duplicate rows from the result set even if you don’t explicit using DISTINCT after the keyword UNION.

If you use UNION ALL explicitly, the duplicate rows remain in the result set. You only use this in the cases that you want to keep duplicate rows or you are sure that there is no duplicate rows in the result set.

Edit

I doubt it will make any difference (might even be worse) but could you try following "equivalent" query

select  *
from    (
          select  b.geo_boundary_id      
                  , GeoBoundaryContains( c.tpi_geo_boundary_coverage_type_id, 29.287437, -95.055807, b.centroid_lat, b.centoid_lon, b.radius ) in_out      
          from    geo_boundary b 
                  join trackpoint_index_geo_boundary_mem c on c.geo_boundary_id = b.geo_boundary_id 
          where   b.geo_boundary_type_id = 2  
          group by
                  b.geo_boundary_id 
          union all        
          select  a.geo_boundary_id 
                  , GeoBoundaryContains(c.tpi_geo_boundary_coverage_type_id, 29.287437, -95.055807, a.lat, a.lon, a.geo_boundary_vertex_id ) in_out 
          from    geo_boundary_vertex a 
                  join geo_boundary b on b.geo_boundary_id = a.geo_boundary_id 
                  join trackpoint_index_geo_boundary_mem c on c.geo_boundary_id = b.geo_boundary_id 
          where   b.geo_boundary_type_id = 3  
          group by 
                  a.geo_boundary_id 
        ) s
        inner join (                  
          select  e.inquiry_match_type_id 
                  , e.inquiry_id 
                  , e.external_id 
                  , COALESCE(f.inquiry_device_id,0) inquiry_device_id 
                  , b.external_info1 
                  , b.external_info2 
                  , b.geo_boundary_id 
                  , b.geo_boundary_type_id 
          from    geo_boundary b 
                  join trackpoint_index_geo_boundary_mem c on c.geo_boundary_id = b.geo_boundary_id 
                  join trackpoint_index_mem d on d.trackpoint_index_id = c.trackpoint_index_id 
                  join inquiry_mem e on e.inquiry_id = b.inquiry_id left 
                  outer join inquiry_device_mem f on f.inquiry_id = e.inquiry_id and f.device_id = 3201 
          where   d.trackpoint_index_id = 3127  
                  and b.geo_boundary_type_id IN (2, 3)
                  and e.expiration_date >= now() 
        ) r on r.geo_boundary_id = s.geo_boundary_id    
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜