开发者

What is the equivalent of REGEXP_SUBSTR in mysql?

I want to extract a word from a string column of a table.

description
===========================
abc order_id: 2 xxxx yyy aa
mmm order_id: 3 nn kk yw

Expected result set

order_id
===========================
2
3

Table will at most have 100 rows, text length is ~256 char and column always has one order_id present. So performance is not a开发者_Go百科n issue.

In Oracle, I can use REGEXP_SUBSTR for this problem. How would I solve this in MySQL?

Edit 1

I am using LOCATE and SUBSTR to solve the problem. The code is ugly. Ten minutes after writing the code, I am cursing the guy who wrote such an ugly code.

I didn't find the REGEXP_SUBSTR function in MySQL docs. But I am hoping that it exists..

Answer to : Why cant the table be optimized? Why is the data stored in such a dumb fashion?

The example I gave just denotes the problem I am trying to solve. In real scenario, I am using a DB based 3rd party queuing software for executing asynchronous tasks. The queue serializes the Ruby object as text. I have no control over the table structure OR the data format. The tasks in the queue can be recurring. In our test setup, some of the recurring tasks are failing because of stale data. I have to delete these tasks to prevent the error. Such errors are not common, hence I don't want to maintain a normalized shadow table.


"I didn't find the REGEXP_SUBSTR function in MySQL docs. But I am hoping that it exists.."

Yes, starting from MySQL 8.0 it is supported. Regular Expressions:

REGEXP_SUBSTR(expr, pat[, pos[, occurrence[, match_type]]])

Returns the substring of the string expr that matches the regular expression specified by the pattern pat, NULL if there is no match. If expr or pat is NULL, the return value is NULL.


Like Konerak said, there is no equivalent of REGEXP_SUBSTR in MySql. You could do what you need using SUBSTRING logic, but it is ugly :

SELECT
  SUBSTRING(lastPart.end, 1, LOCATE(' ', lastPart.end) - 1) AS orderId
FROM
  (
    SELECT
      SUBSTRING(dataset.description, LOCATE('order_id: ', dataset.description) + LENGTH('order_id: ')) AS end
    FROM
      (
        SELECT 'abc order_id: 2 xxxx yyy aa' AS description
        UNION SELECT 'mmm order_id: 3 nn kk yw' AS description
        UNION SELECT 'mmm order_id: 1523 nn kk yw' AS description
      ) AS dataset
    ) AS lastPart

Edit: You could try this user defined function providing access to perl regex in MySql

SELECT 
  PREG_CAPTURE( '/.*order_id:\s(\d+).*/', dataset.description,1)
FROM
  (
    SELECT 'abc order_id: 2 xxxx yyy aa' AS description
    UNION SELECT 'mmm order_id: 3 nn kk yw' AS description
    UNION SELECT 'mmm order_id: 1523 nn kk yw' AS description
  ) AS dataset


or you can do this and save yourself the ugliness :

select SUBSTRING_INDEX(SUBSTRING_INDEX('habc order_id: 2 xxxx yyy aa',' ',3),' ',-1);


There is no MySQL equivalent. The MySQL REGEXP can be used for matching strings, but not for transforming them.

You can either try to work with stored procedures and a lot of REPLACE/SUBSTRING logic, or do it in your programming language - which should be the easiest option.

But are you sure your data format is well chosen? If you need the order_id, wouldn't it make sense to store it in a different column, so you can put indexes, use joins and the likes?

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜