MySQL regexp within subquery
I have a very particular problem and after looking at many ressources, i'm unable to find a solution to my problem.
The MySQL version i'm running is MySQL 5.0.91
Given the following tables definition :
DROP TABLE IF EXISTS `item`;
CREATE TABLE `item` (
`id` int(11) NOT NULL default '0',
`code` varchar(4096) default NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
-- ----------------------------
-- Records
-- ----------------------------
INSERT INTO `item` VALUES ('1', 'pizza|large|pepp');
INSERT INTO `item` VALUES ('3', 'pizza|medium|pepp');
INSERT INTO `item` VALUES ('2', 'pizza|small|pepp');
INSERT INTO `item` VALUES ('4', 'appetizer|fries|large');
INSERT INTO `item` VALUES ('5', 'beverage|2_liter|pepsi');
INSERT INTO `item` VALUES ('6', 'pizza|small|cheese');
DROP TABLE IF EXISTS `item_regexp`;
CREATE TABLE `item_regexp` (
`id` int(11) NOT NULL default '0',
`regexp` varchar(4096) default NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
-- ----------------------------
-- Records
-- ----------------------------
INSERT INTO `item_regexp` VALUES ('1', '((pizza)\\\\|)((large|medium)\\\\|)');
INSERT INTO `item_regexp` VALUES ('2', '((pizza)\\\\|)((.*)\\\\|)((alldressed))');
INSERT INTO `item_regexp` VALUES ('3', '((beverage)\\\\|)((2_liter)\\\\|)');
INSERT INTO `item_regexp` VALUES ('4', '((pizza)\\\\|)((.*)\\\\|)((pepp))');
In summary, table item represents items on an invoice. In my example, I have 5 items. The code is an internal representation of that specific Item.
Then, the item_regexp
table are used to specify possible product. This can be used for example to capture all possible product that satisfy a given code pattern to apply a discount, etc.
I would like to load all the item_regexp
entries with the number of item each can capture from the list of items on an invoice.
Doing subquery to count the number of entries in table item
that are captured by my regexp, gives me a proper result :
################
# QUERY #1 #
################
SELECT
(SELECT
count(*)
FROM
item
where
`item`.`code` REGEXP '((pizza)\\|)((large|medium)\\|)') as "regexp1 count"
,
(SELECT
count(*)
FROM
item
where
`item`.`code` REGEXP '((pizza)\\|)((.*)\\|)((alldressed))') as "regexp2 count"
,
(SELECT
count(*)
FROM
item
where
`item`.`code` REGEXP '((beverage)\\|)((2_liter)\\|)') as "regexp3 count"
,
(SELECT
count(*)
FROM
item
where
`item`.`code` REGEXP '((pizza)\\|)((.*)\\|)((pepp))') as "regexp4 count" ;
+---------------+---------------+---------------+---------------+
| regexp1 count | regexp2 count | regexp3 count | regexp4 count |
+---------------+---------------+---------------+---------------+
| 2 | 0 | 1 | 3 |
+---------------+---------------+---------------+---------------+
1 row in set
However, running this as a subquery within a more general query seemed to always give 0 as a count. This is as if the regexp was not working or not taken into account.
################
# QUERY #2 #
################
SELECT
`item_regexp`.`regexp`
,
(
SELECT
count(*)
FROM
item
where
`item`.`code` REGEXP `item_regexp`.`regexp`
) as "regexp_count"
FROM
item_regexp ;
+-------------------------------------+--------------+
| regexp | regexp_count |
+-----------------------------开发者_如何转开发--------+--------------+
| ((pizza)\\|)((large|medium)\\|) | 0 |
| ((pizza)\\|)((.*)\\|)((alldressed)) | 0 |
| ((beverage)\\|)((2_liter)\\|) | 0 |
| ((pizza)\\|)((.*)\\|)((pepp)) | 0 |
+-------------------------------------+--------------+
4 rows in set
Is there something I've missed it the process so that QUERY #2 yield the same count value as QUERY #1 ?
Thanks for your help.
Mike
That is because when you define a literal
((beverage)\\|)((2_liter)\\|)
REGEXP sees the double \ as a single . When you put it in a column, they are double \s, so it is equivalent to the literal
((beverage)\\\\|)((2_liter)\\\\|)
Which makes them different. You had it right with the literals, but your insert into item_regexp
is wrong. Try the below instead
delete from `item_regexp`;
INSERT INTO `item_regexp` VALUES ('1', '((pizza)\\|)((large|medium)\\|)');
INSERT INTO `item_regexp` VALUES ('2', '((pizza)\\|)((.*)\\|)((alldressed))');
INSERT INTO `item_regexp` VALUES ('3', '((beverage)\\|)((2_liter)\\|)');
INSERT INTO `item_regexp` VALUES ('4', '((pizza)\\|)((.*)\\|)((pepp))');
精彩评论