SQL hidden techniques? [duplicate]

2023-01-02 00:30 问答作者：

This question already has answers here: Closed 12 years ago.

Possible Duplicate:
Hidden Features of SQL Server

What are those pro/subtle techniques that SQL provides and not many know about which also cut code and improve performance?

eg: I have just learned how to use CASE statements inside aggregate functions and it totally changed my approach on 开发者_运维技巧things.

Are there others?

UPDATE: Basically any vendor. But PostgreSQL if you want to focus only on one :D

OVER Clause (SQL Server) a.k.a. Window functions (PostgreSQL) or analytic functions (Oracle)

This has been very nice to know for me. You can do all sorts of handy things like counting, partitioning, ranking, etc.

SQL GROUP BY - CUBE, ROLLUP clauses

Analytic (AKA ranking, AKA windowing) functions IE:

ROW_NUMBER
RANK
DENSE_RANK
NTILE
OVER

Views: Normal and Materialized

It's difficult to say much without referencing vendor specific syntax

EXISTS. I'm amazed how many people still use COUNT(*) when checking existence or IN (SELECT...) clauses when EXISTS can do the job much quicker.

Most frequently you might see :

SELECT @MyVar = Count(*) FROM Table1 WHERE....
If @MyVar <> 0
BEGIN
   --do something
END

when

IF EXISTS(SELECT 1 FROM Table1 WHERE...)
BEGIN
    --don something
END

is always better.

SELECT... EXCEPT SELECT...

and

SELECT... INTERSECT SELECT...

can be useful (and disturbingly efficient) at pickout out differing or common rows--and that's for all columns in the row--between sets. This is extremely useful when you have lots of columns.

People don't use built in functions enough and like to reinvent the wheel, here are Ten SQL Server Functions That You Have Ignored Until Now

Using NEWSEQUENTIALID() instead of NEWID() on a clustered uniqueidentifier column will perform much better since it won't cause page splits and thus fragmentation

Using an auxilarry table of numbers so that you can quickly do some set based logic

for example

select DATEADD(m,number,'20010101')
from master..spt_values
where type = 'P'
order by 1

ANY, ALL and SOME

Two from Postgresql: DISTINCT ON (see example) and the new WITH.

Lately I have been using CROSS APPLY a lot.

Pivot

It's new in 2005 (which i know was a long time ago, but there's loads of people still using 2000). Saves doing a bunch of "case when name = 'tim' then value else 0 end" to build your aggregates this weekend.

Common Table Expressions (SQL Server 2005+)

WITH x AS (
    SELECT 1 as A, 2 as B, 3 as C
),
WITH y AS (
    SELECT 4 as A, 5 as B, 6 as C
    UNION
    SELECT 7 as A, 8 as B, 9 as C
)
SELECT A, B, C FROM x
UNION
SELECT A, B, C FROM y

They are really nice for breaking your queries into steps

under MySQL, using the keyword "STRAIGHT_JOIN". If you know your data, and the relationships of lookup tables that you are joining to, sometimes the optimizer looks at the smaller tables as a basis of a join and tries to query the "less record" count to your "bigger" table thus taking significantly more time. If your primary table is first in the "from", and its "criteria" up front, the straight join will hit that first, join to the rest of the tables and be done in no time.

I've had to do this dealing with gov't data of 10+ million records joined to about 15+ lookup tables. Without straight-join, the system choked after 20+ hours. Adding Straight-join, it was done in about 2 hrs.

In Sql Server, the HAVING clause. Particularly, HAVING(COUNT DISTINCT FOO)> @SomeNumber to quickly find rows with more than one distinct value for a given grouping.

From MSDN:

USE AdventureWorks2008R2 ;
GO
SELECT SalesOrderID, SUM(LineTotal) AS SubTotal
FROM Sales.SalesOrderDetail
GROUP BY SalesOrderID
HAVING SUM(LineTotal) > 100000.00
ORDER BY SalesOrderID ;

From PostgreSQL Docs:

Table partitioning

Partitioning refers to splitting what is logically one large table into smaller physical pieces. Partitioning can provide several benefits:

Query performance can be improved dramatically for certain kinds of queries.
Update performance can be improved too, since each piece of the table has indexes smaller than an index on the entire data set would be. When an index no longer fits easily in memory, both read and write operations on the index take progressively more disk accesses.
Bulk deletes may be accomplished by simply removing one of the partitions, if that requirement is planned into the partitioning design. DROP TABLE is far faster than a bulk DELETE, to say nothing of the ensuing VACUUM overhead.
Seldom-used data can be migrated to cheaper and slower storage media.

Derived tables to create "variables" and reduce repeated code.

Something like this but can be expanded upon. Obviously "Average Value" can be a much more complex calculation, and if you have several it helps clean up code.

Select *, case when AverageValue > 50 then 'Pass' Else 'Fail' end
From
(
 Select ColA, ColB, AverageValue = (ColA+ColB)/2
 From InnerMostTable
) AverageValues
Order By AverageValue Desc

In SQL Server using the Convert() function to get dates in the format mm/dd/yyyy instead of Cast() function

SELECT convert(datetime,  '1/1/2010', 101)

I use this all the time

继续阅读：postgresql sql sql-server

SQL hidden techniques? [duplicate]

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？