开发者

Inconsistent COUNT with DISTINCT

I have partitioned a table by the field daynumber. During the day I write into this table logs and then calculate some stats. The table is big; each day I have ~3M new rows. The field myField is indexed.

This query

SELECT COUNT(DISTINCT myField) FROM mytable WHERE daynumber=somevalue; 

returns 0, which is a mistake.

This query

SELECT COUNT(*) FROM (SELECT DISTINCT(myField) FROM mytable WHERE daynumber=somevalue) t;

returns the correct value.

For some daynumber values, the first query works fine. I tried to repeir that partition but that had no effect. Any suggestions?


update

table scheme looks like

CREATE TABLE `mytable` (
  `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
  `daynumber` INT(10) UNSIGNED NOT NULL,
  `myField` VARCHAR(1024) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
    ... other fields
  PRIMARY KEY (`daynumber`,`id`),
  KEY `myField` (`myField`(20))
) ENGINE=MYISAM DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC
PARTITION BY LIST (daynumber)
(PARTITION day_810 VALUES IN (810) ENGINE = MyISAM,
 PARTITION d开发者_开发知识库ay_811 VALUES IN (811) ENGINE = MyISAM,
 PARTITION day_812 VALUES IN (812) ENGINE = MyISAM
....)


this is because your myfield contains NULL value

SELECT COUNT(DISTINCT coalesce(myField, '') )
FROM mytable WHERE daynumber=somevalue; 

coalesce is sort of convert NULL to ''
might not what your ask for, but will return the correct count (still)


Lets' say

  • You match 4 rows for daynumber=somevalue
  • MyField for these 3 rows are 1, 2, 2, NULL

So, with the sub query there are 2 steps.

  • SELECT DISTINCT(myField) gives 3 rows: 1, 2, NULL
  • SELECT COUNT(*) of this gives 3.

Without the subquery, one step

  • SELECT COUNT(DISTINCT(myField)) counts only 1, 2 = 2

Reason:

  • COUNT(*) includes NULLs
  • COUNT(anythingelse) doesn't count NULLs

The queries can't be inconsistent because they are different queries

The best discussion on it I can recall is on DBA.SE: What is the difference between select count(*) and select count(any_non_null_col)?

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜