Why do I need to explicitly specify all columns in a SQL "GROUP BY" clause - why not "GROUP BY *"?

2022-12-29 21:44 问答作者：

This has always bothered me - why does the GROUP BY clause in a SQL statement require that I include all non-aggregate columns? These columns should be included by default - a kind of "GROUP BY *" - since I can't even run the query unless they're all included. Every column has to either be an aggregate or be specified in the "GROUP BY", but it seems like anything not开发者_开发技巧 aggregated should be automatically grouped.

Maybe it's part of the ANSI-SQL standard, but even so, I don't understand why. Can somebody help me understand the need for this convention?

It's hard to know exactly what the designers of the SQL language were thinking when they wrote the standard, but here's my opinion.

SQL, as a general rule, requires you to explicitly state your expectations and your intent. The language does not try to "guess what you meant", and automatically fill in the blanks. This is a good thing.

When you write a query the most important consideration is that it yields correct results. If you made a mistake, it's probably better that the SQL parser informs you, rather than making a guess about your intent and returning results that may not be correct. The declarative nature of SQL (where you state what you want to retrieve rather than the steps how to retrieve it) already makes it easy to inadvertently make mistakes. Introducing fuzziniess into the language syntax would not make this better.

In fact, every case I can think of where the language allows for shortcuts has caused problems. Take, for instance, natural joins - where you can omit the names of the columns you want to join on and allow the database to infer them based on column names. Once the column names change (as they naturally do over time) - the semantics of existing queries changes with them. This is bad ... very bad - you really don't want this kind of magic happening behind the scenes in your database code.

One consequence of this design choice, however, is that SQL is a verbose language in which you must explicitly express your intent. This can result in having to write more code than you may like, and gripe about why certain constructs are so verbose ... but at the end of the day - it is what it is.

The only logical reason I can think of to keep the GROUP BY clause as it is that you can include fields that are NOT included in your selection column in your grouping.

For example.

Select column1, SUM(column2) AS sum
 FROM table1
 GROUP BY column1, column3

Even though column3 is not represented elsewhere in the query, you can still group the results by it's value. (Of course once you have done that, you cannot tell from the result why the records were grouped as they were.)

It does seem like a simple shortcut for the overwhelmingly most common scenario (grouping by each of the non-aggregate columns) would be a simple yet effective tool for speeding up coding.

Perhaps "GROUP BY *"

Since it is already pretty common in SQL tools to allow references to the columns by result column number (ie. GROUP BY 1,2,3, etc.) It would seem simpler still to be able to allow the user to automatically include all the non-aggregate fields in one keystroke.

It's simple just like this: you asked to sql group the results by every single column in the from clause, meaning for every column in the from clause SQL, the sql engine will internally group the result sets before to present it to you. So that explains why it ask you to mention all the columns present in the from too because its not possible group it partially. If you mentioned the group by clause that is only possible to sql achieve your intent by grouping all the columns as well. It's a math restriction.

继续阅读：aggregate ansi-sql group-by sql sql-standards

Why do I need to explicitly specify all columns in a SQL "GROUP BY" clause - why not "GROUP BY *"?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？