Regular expression to find all substrings inside double quotes in PHP

I have a big database that has fields of paragraphs that are formatted like this:

["This is the first sentence", "This is the second sentent", "This is the third sentence", "This is the fourth sentence"]

I would like to extract the sentences (using PHP) and put them in an array where each array element is a sentence. Right now, I am using this:

$trim_joined = substr($joined, 2, -2); //gets rid of the first and last bracket and double quote
$sentences = explode('", "', $trim_joined);

It seems a bit fragile because I am not 100% sure that this field follows exactly the same format in every row of the database (over 350,000 rows). I was wondering if there is a regular expression that extracts ALL the strings that are inside double quotes and puts them in an array. This way, I don't have to worry if there are entries without the brackets at the beginning and end.

Unfortunately, I know little to nothing about regex, so I am asking for help. Thanks in advance.


If the format were consistent you could just use json_decode(), as the rows are pretty much JSON lists of strings. I would definitely test that first, even if it takes a few minutes to run.
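
One way to run that test is to decode each row and flag the ones that do not come back as a plain list of strings. This is only a minimal sketch: the helper name `decodeSentences` is hypothetical, and it assumes each row is already available as a PHP string.

```php
<?php
// Hypothetical helper (not from the question): returns the sentences as an
// array, or null when the row is not a JSON array of strings.
function decodeSentences(string $row): ?array
{
    $decoded = json_decode($row, true);

    // json_decode() returns null on a syntax error; a JSON scalar is not an array either.
    if (!is_array($decoded)) {
        return null;
    }

    // Reject rows that contain anything other than strings.
    foreach ($decoded as $item) {
        if (!is_string($item)) {
            return null;
        }
    }

    return array_values($decoded);
}

$good = '["This is the first sentence", "This is the second sentent"]';
$bad  = '"This is the first sentence", "This is the second sentent"'; // brackets missing

var_dump(decodeSentences($good) !== null); // bool(true)
var_dump(decodeSentences($bad) !== null);  // bool(false)
```

Rows for which the helper returns null are the ones that need one of the fallbacks.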

Failing that, you can use a somewhat more robust CSV parser after simply trimming the square brackets (I suspect that's the best approach here):

 $strings = str_getcsv(trim(trim($row, "["), "]"));

The simplest regex solution would be the following; the text between the quotes ends up in $strings[1]:

 preg_match_all('/"([^"]*)"\K/', $row, $strings);
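
To show where the results land, here is a small sketch of the same idea on a row that lacks the surrounding brackets. The sample row and the variable names `$matches` and `$sentences` are assumptions for illustration. The sketch drops the trailing `\K`, which only changes the full-match entries in index 0, and it assumes the sentences never contain escaped double quotes.

```php
<?php
// Sample row without the surrounding brackets, to show the pattern
// does not depend on them.
$row = '"This is the first sentence", "This is the second sentent"';

// [^"]* matches everything up to the next double quote;
// capture group 1 holds the text between each pair of quotes.
preg_match_all('/"([^"]*)"/', $row, $matches);

// Index 0 holds the full matches (quotes included), index 1 the captures.
$sentences = $matches[1];
print_r($sentences);
```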


Here's a way without regex:

You could use json_decode():

<?php
$data='["This is the first sentence", "This is the second sentent", "This is the third sentence", "This is the fourth sentence"]';
$arr=json_decode($data,true);
print_r($arr);
?>