开发者

Trimming Course Schedule data using regular expression

I need your help on using regular expression to extract time information from the following kind of data:

TTh 7-8:30AM lec AIT PANEL D                                     
WF 1-2:30PM lec BA 104                                           
S 9:00a-12:00p lec CPA 306                                       
MTh 7:00a-8:30a lec AIT PANEL D                                  
TTh 2:30-4PM lec BA 104                                          
TTh 2:30-4PM lec BA 104                                          
TTh 4-5:30PM lec CPA 301                                         
S 9AM-12PM lec TBA                                               
Sa 09:00a-12:00p lec BA 301                                      
Sa 09:00a-12:00p lec BA 301                                      
MTh 08:30a-10:00a lec BA 301                                     
TTh 1-2:30PM lec CPA 301                                         
TTh 7-8:30AM lec AIT PANEL D   

So as you can see, these are schedules of courses.This set of values is a part of a much larger dataset, containing other attributes (delimted by a tab) and stored in a textfile. I want to trim all non-time related information from these schedule values so they become the following,

for example:

7-8:30AM                                 
1-2:30PM
9:00a-12:00p
7:00a-8:30a
2:30-4PM
2:30-4PM
4-5:30PM
9AM-12PM
09:00a-12:00p

Ok, so this is what i know about the schedule value:

  1. The tokens in the schedule values are delimited by space.
  2. A time token has numbers, a dash, colon, and can contain letters AM,PM,a,p.

I have this feeling that this is probably an easy task for you guys, and unfortunately, I am not well-versed in regex.

That's why I badly need your help. Thank you so much!

PS: I am using Notepad++ in editing the file, if there's a way that I can do this using Notepad++. Or can I also d开发者_StackOverflow中文版o this in MySQL using Update command?

similar question


You could probably do this much easier at the mySQL level instead of doing this in Regex. I am not familiar with mySQL but I think there is an equivalent to TSQL's CharIndex called SUBSTRING_INDEX.

Here is quick and dirty of one line using TSQL - in no way claims to be efficient, but it works, you should adapt for iterating through your raw data and convert to mySql syntax:

declare @scheduleLine varchar(500);
Set @scheduleLine = 'TTh 7-8:30AM lec AIT PANEL D';
declare @firstSpace int, @secondSpace int;
Set @firstSpace = CHARINDEX(' ',@scheduleLine,0);
Set @secondSpace = CHARINDEX(' ',@scheduleLine, @firstSpace+1)
Declare @timeOfClass varchar(20)
Set @timeOfClass = SUBSTRING(@scheduleLine, @firstSpace, (@secondSpace-@firstSpace));
Select @timeOfClass as TimeOfClass

Will result in:

 7-8:30AM

Just use the mySQL equivalent.


A little bit late, but have a try with:

Search what: .*?(\d+(?::\d+)?(?:[ap]m?)?-\d+(?::\d+)?(?:[ap]m?)?).*
Replace with: $1


I realized that I could just split the string using space as delimiter and get the 2nd element of the returned list.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜