MYSQL REGEXP/RLIKE Advice?
I have a table called "scholarships" that has a field named "majors" which contains comma separated major names for the majors the scholarship is related too.
Let's say the field could contain one (or more separated by commas) of the following: business, agribusiness, business administration, international business.
If someone is searching "business" as the major, how can I select "business" as a match but not the others?
The closest I've come is this but I know it could be better - my regex ability isn't so hot.
SELECT scholarship_id, scholarship_award_name, scholarship_majors
FROM scholarships
WHERE scholarship_majors rlike '[, ][[:<:]]business[[:>:]][, ]'
OR scholarship_majors rlike '^[[:<:]]business[[:>:]][, ]'
OR scholarship_majors rlike '[, ][[:<:]]business[[:>:]]$'
I'm trying to catch the field 开发者_StackOverflowif it begins with "business" or ", business" or "business ", "business, " but not "business administration" etc...
Any advice?
Any advice?
Don't store the data in comma separated lists - that's denormalized data, and besides being difficult to isolate specifics, is also prone to bad data (typos, cases sensitivity...).
Define a
MAJORS
table:- MAJOR_ID (primary key)
- MAJOR_NAME
Use a many-to-many table to join scholarships to one or more majors:
SCHOLARSHIP_MAJORS
- SCHOLARSHIP_ID (primary key, foreign key to SCHOLARSHIPS table)
- MAJOR_ID (primary key, foreign key to MAJORS table)
Use JOINs to get scholarships based on majors:
SELECT s.scholarship_id, s.scholarship_award_name, m.major_name FROM SCHOLARSHIPS s JOIN SCHOLARSHIP_MAJORS sm ON sm.scholarship_id = s.scholarship_id JOIN MAJORS m ON m.major_id = sm.major_id WHERE m.major_name IN ('a', 'b', 'c')
...if you want the majors output in a comma separated list, use the GROUP_CONCAT function:
SELECT s.scholarship_id,
s.scholarship_award_name,
GROUP_CONCAT(m.major_name) AS majors
FROM SCHOLARSHIPS s
JOIN SCHOLARSHIP_MAJORS sm ON sm.scholarship_id = s.scholarship_id
JOIN MAJORS m ON m.major_id = sm.major_id
WHERE m.major_name IN ('a', 'b', 'c')
GROUP BY s.scholarship_id, s.scholarship_award_name
I was able to improve the sql by disallowing alpha characters like:
SELECT scholarship_id, scholarship_award_name, scholarship_majors
FROM scholarships
WHERE scholarship_majors rlike '[, ][^a-z][[:<:]]business[[:>:]][^a-z][, ]'
OR scholarship_majors rlike '^[[:<:]]business[[:>:]][^a-z][, ]'
OR scholarship_majors rlike '[, ][^a-z][[:<:]]business[[:>:]]$'
This seems to capture in the way I'm hoping for!
Still looking for any advice to improve this SQL statement.
I spent some time fighting with regexp since the database I'm dealing with has a couple CSV type fields.
Benchmarking revealed that this is a good way with easier syntax:
SELECT * FROM table WHERE FIND_IN_SET('string', my_field)
The field in question HAS to be a CSV string field. Perfect solution for my problem. Yes, I acknowledge the many to many tables are a more normalized way.
精彩评论