Pulling Track Info From an Audio Stream Using PHP
Is it possible to pull track info from an audio stream using PHP? I've done some digging, and the closest function I can find is stream_get_transports, but my host doesn't support HTTP transports via fsockopen(), so I'll have to do some more tinkering to see what else that function returns.
Currently, I'm trying to pull artist and track metadata from an AOL stream.
This is a SHOUTcast stream, and yes, it is possible. It has absolutely nothing to do with ID3 tags. I wrote a script a while ago to do this, but can't find it anymore. Just last week I helped another guy who had a fairly complete script to do the same thing, but I can't just post the source to it, as it isn't mine. I will, however, get you in touch with him if you e-mail me at brad@musatcha.com.
Anyway, here's how to do it yourself:
The first thing you need to do is connect to the server directly. Don't use HTTP. Well, you could probably use cURL, but it will likely be much more hassle than it's worth. You connect to it with fsockopen(). Make sure to use the correct port. Also note that many web hosts will block a lot of ports, but you can usually use port 80. Fortunately, all of the AOL-hosted SHOUTcast streams use port 80.
Now, make your request just like your client would.
GET /whatever HTTP/1.0
But before sending the terminating <CrLf><CrLf>, include this next header!
Icy-MetaData:1
That tells the server that you want metadata. Now, send your pair of <CrLf>.
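As a rough sketch, the request described above could be assembled like this. The function name buildIcyRequest and the example.com host are made up for illustration, and the Host header is my own addition (the steps above only call for the GET line and Icy-MetaData).

```php
<?php
// Hypothetical helper: builds the raw request text to write to the socket.
function buildIcyRequest(string $host, string $path = '/'): string
{
    $crlf = "\r\n";

    return "GET {$path} HTTP/1.0" . $crlf
        . "Host: {$host}" . $crlf      // assumption: not required by the steps above
        . 'Icy-MetaData: 1' . $crlf    // asks the server to interleave metadata
        . $crlf;                       // blank line ends the header section
}

// Placeholder host and path, just to show the shape of the request.
echo buildIcyRequest('example.com', '/whatever');
```

You would pass the returned string to fwrite() on the socket you opened with fsockopen().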
OK, the server will respond with a bunch of headers and then start sending you data. In those headers will be an icy-metaint:8192 or similar. That 8192 is the meta interval. This is important, and really the only value you need. It is usually 8192, but not always, so make sure to actually read this value!
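Here is one way the interval could be pulled out of the header block once you have read it into a string. This is only a sketch: parseMetaInt is a name I made up, and the sample headers are invented. It needs PHP 7.1+ because of the nullable return type.

```php
<?php
// Hypothetical helper: returns the icy-metaint value, or null if it is missing.
function parseMetaInt(string $rawHeaders): ?int
{
    foreach (preg_split('/\r\n|\n/', $rawHeaders) as $line) {
        // Split on the first colon only, since header values may contain colons.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && strtolower(trim($parts[0])) === 'icy-metaint') {
            $value = trim($parts[1]);
            // Accept only a plain run of digits.
            return preg_match('/^\d+$/', $value) ? (int) $value : null;
        }
    }

    return null;
}

// Invented sample headers for illustration.
$headers = "HTTP/1.0 200 OK\r\nicy-name:Example\r\nicy-metaint:8192\r\n";
var_dump(parseMetaInt($headers));
```

If it returns null, the server is not sending inline metadata and the rest of the steps do not apply.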
Basically it means, you will get 8192 bytes of MP3 data and then a chunk of meta, followed by 8192 bytes of MP3 data, followed by a chunk of meta.
Read 8192 bytes of data (or whatever the meta interval is, and make sure you are not including the header in this count), discard them, and then read the next byte. This byte is the first byte of metadata, and indicates how long the metadata is. Take the numeric value of this byte (use ord() to get it) and multiply it by 16. The result is the number of bytes to read for metadata. Read that many bytes into a string variable for you to work with.
Next, trim the value of this variable. Why? Because the string is padded with 0x0 at the end (to make it fit evenly into a multiple of 16 bytes), and trim() takes care of that for us.
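The read/discard/length-byte steps above might look something like this. Treat it as a sketch under a few assumptions: readExactly and readIcyMetadataBlock are names I invented, the stream is positioned just past the response headers, and I strip only the trailing NUL padding with rtrim() rather than using a plain trim(). The demo uses an in-memory stream with a tiny interval of 32 so it runs without a network connection.

```php
<?php
// Hypothetical helper: reads exactly $length bytes, or null if the stream ends first.
function readExactly($stream, int $length): ?string
{
    $data = '';
    while (strlen($data) < $length) {
        $chunk = fread($stream, $length - strlen($data));
        if ($chunk === false || $chunk === '') {
            return null; // end of stream or read error
        }
        $data .= $chunk;
    }

    return $data;
}

// Hypothetical helper: skips one interval of audio and returns the metadata after it.
// Returns '' when the server sent no metadata for this interval.
function readIcyMetadataBlock($stream, int $metaint): ?string
{
    if (readExactly($stream, $metaint) === null) { // audio bytes, discarded
        return null;
    }
    $lengthByte = readExactly($stream, 1);
    if ($lengthByte === null) {
        return null;
    }
    $length = ord($lengthByte) * 16; // length byte counts 16-byte units
    if ($length === 0) {
        return '';
    }
    $meta = readExactly($stream, $length);

    return $meta === null ? null : rtrim($meta, "\0"); // drop the NUL padding
}

// Demo on fake data: 32 bytes of "audio", a length byte, then padded metadata.
$payload = "StreamTitle='Awesome Trance Mix - DI.fm';StreamUrl='';";
$padded  = str_pad($payload, 16 * (int) ceil(strlen($payload) / 16), "\0");
$stream  = fopen('php://memory', 'w+');
fwrite($stream, str_repeat('A', 32) . chr(intdiv(strlen($padded), 16)) . $padded);
rewind($stream);
echo readIcyMetadataBlock($stream, 32), "\n";
fclose($stream);
```

With a real server you would pass the fsockopen() handle and the icy-metaint value instead. An empty string just means "nothing new this interval", so you may need to call it a few times before a title shows up.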
You will be left with something like this:
StreamTitle='Awesome Trance Mix - DI.fm';StreamUrl=''
I'll let you pick your method of choice for parsing this. Personally, I'd probably just split on ; with a limit of 2, but beware of titles that contain ;. I'm not sure what the escape character method is. A little experimentation should help you.
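If splitting on ; worries you, a regex is one alternative. The sketch below is a heuristic, not a statement of how the format escapes anything: it assumes each field looks like Key='value' and that a value ends at a quote followed by ; (or the end of the string) and then the next key. The name parseIcyMetadata is made up. A title that itself contains text like ';Foo=' would still confuse it.

```php
<?php
// Hypothetical helper: turns "Key='value';Key2='value2'" into an associative array.
function parseIcyMetadata(string $meta): array
{
    $fields = [];
    // Lazy match up to a closing quote that is followed by ";" or the end of the
    // string, and then by either another Key=' or the end of the string.
    $pattern = '/(\w+)=\'(.*?)\'(?:;|$)(?=\w+=\'|$)/s';
    preg_match_all($pattern, $meta, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $fields[$m[1]] = $m[2];
    }

    return $fields;
}

print_r(parseIcyMetadata("StreamTitle='Awesome Trance Mix - DI.fm';StreamUrl=''"));
```

Because the end of a value is tied to the quote rather than the bare semicolon, a title such as A;B comes through in one piece.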
Don't forget to disconnect from the server when you're done with it!
There are lots of SHOUTcast MetaData references out there. This is a good one: http://www.smackfu.com/stuff/programming/shoutcast.html
Check this out: https://gist.github.com/fracasula/5781710
It's a little gist with a PHP function that lets you extract MP3 metadata (StreamTitle) from a streaming URL.
Usually the streaming server puts an icy-metaint header in the response, which tells us how often the metadata is sent in the stream. The function checks for that response header and, if present, replaces the interval parameter with it.
Otherwise the function calls the streaming URL respecting your interval and, if no metadata is present, tries again through recursion starting from the offset parameter.
<?php
/**
 * Please be aware. This gist requires at least PHP 5.4 to run correctly.
 * Otherwise consider downgrading the $opts array code to the classic "array" syntax.
 */
function getMp3StreamTitle($streamingUrl, $interval, $offset = 0, $headers = true)
{
    $needle = 'StreamTitle=';
    $ua = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.110 Safari/537.36';

    $opts = [
        'http' => [
            'method' => 'GET',
            'header' => 'Icy-MetaData: 1',
            'user_agent' => $ua
        ]
    ];

    // Look up icy-metaint only on the first call; the recursive calls pass $headers = false.
    if ($headers && ($headers = get_headers($streamingUrl))) {
        foreach ($headers as $h) {
            if (strpos(strtolower($h), 'icy-metaint') !== false && ($interval = (int) trim(explode(':', $h)[1]))) {
                break;
            }
        }
    }

    $context = stream_context_create($opts);

    if ($stream = fopen($streamingUrl, 'r', false, $context)) {
        // Read one interval's worth of bytes starting at $offset.
        $buffer = stream_get_contents($stream, $interval, $offset);
        fclose($stream);

        if (strpos($buffer, $needle) !== false) {
            $title = explode($needle, $buffer)[1];
            // Strip the surrounding quotes and stop at the first ';'.
            return substr($title, 1, strpos($title, ';') - 2);
        } else {
            // Nothing in this window: try the next one.
            return getMp3StreamTitle($streamingUrl, $interval, $offset + $interval, false);
        }
    } else {
        throw new Exception("Unable to open stream [{$streamingUrl}]");
    }
}

var_dump(getMp3StreamTitle('http://str30.creacast.com/r101_thema6', 19200));
I hope this helps!
Thanks a lot for the code fra_casula. Here is a slightly simplified version running on PHP <= 5.3 (the original is targeted at 5.4). It also reuses the same connection resource.
I removed the exception because of my own needs, returning false if nothing is found instead.
// Note: this is a class method; drop "private" to use it as a standalone function.
private function getMp3StreamTitle($stream_url)
{
    $result = false;
    $icy_metaint = -1;
    $needle = 'StreamTitle=';
    $ua = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.110 Safari/537.36';

    $opts = array(
        'http' => array(
            'method' => 'GET',
            'header' => 'Icy-MetaData: 1',
            'user_agent' => $ua
        )
    );

    $default = stream_context_set_default($opts);

    $stream = fopen($stream_url, 'r');

    // The response headers are exposed through the stream's wrapper_data.
    if ($stream && ($meta_data = stream_get_meta_data($stream)) && isset($meta_data['wrapper_data'])) {
        foreach ($meta_data['wrapper_data'] as $header) {
            if (strpos(strtolower($header), 'icy-metaint') !== false) {
                $tmp = explode(":", $header);
                $icy_metaint = trim($tmp[1]);
                break;
            }
        }
    }

    if ($icy_metaint != -1) {
        // Skip the first interval of audio, then read 300 bytes that should contain the metadata.
        $buffer = stream_get_contents($stream, 300, $icy_metaint);

        if (strpos($buffer, $needle) !== false) {
            $title = explode($needle, $buffer);
            $title = trim($title[1]);
            $result = substr($title, 1, strpos($title, ';') - 2);
        }
    }

    if ($stream) {
        fclose($stream);
    }

    return $result;
}
This is the C# code for getting the metadata using HttpClient:
public async Task<string> GetMetaDataFromIceCastStream(string url)
{
    m_httpClient.DefaultRequestHeaders.Add("Icy-MetaData", "1");
    var response = await m_httpClient.GetAsync(url, HttpCompletionOption.ResponseHeadersRead);
    m_httpClient.DefaultRequestHeaders.Remove("Icy-MetaData");
    if (response.IsSuccessStatusCode)
    {
        IEnumerable<string> headerValues;
        if (response.Headers.TryGetValues("icy-metaint", out headerValues))
        {
            string metaIntString = headerValues.First();
            if (!string.IsNullOrEmpty(metaIntString))
            {
                int metadataInterval = int.Parse(metaIntString);
                byte[] buffer = new byte[metadataInterval];
                using (var stream = await response.Content.ReadAsStreamAsync())
                {
                    int numBytesRead = 0;
                    int numBytesToRead = metadataInterval;
                    do
                    {
                        // Ask only for the bytes still missing, so the read never overruns
                        // the buffer or consumes part of the metadata block.
                        int n = await stream.ReadAsync(buffer, numBytesRead, numBytesToRead);
                        if (n == 0) return null; // stream ended early
                        numBytesRead += n;
                        numBytesToRead -= n;
                    } while (numBytesToRead > 0);

                    int lengthOfMetaData = stream.ReadByte();
                    if (lengthOfMetaData < 0) return null; // stream ended early
                    int metaBytesToRead = lengthOfMetaData * 16;
                    byte[] metadataBytes = new byte[metaBytesToRead];
                    int metaBytesRead = 0;
                    // A single read may return fewer bytes than requested, so loop until done.
                    while (metaBytesRead < metaBytesToRead)
                    {
                        int n = await stream.ReadAsync(metadataBytes, metaBytesRead, metaBytesToRead - metaBytesRead);
                        if (n == 0) break;
                        metaBytesRead += n;
                    }
                    var metaDataString = System.Text.Encoding.UTF8.GetString(metadataBytes, 0, metaBytesRead);
                    return metaDataString;
                }
            }
        }
    }
    return null;
}
UPDATE: This is an update with a more appropriate solution to the question. The original post is also provided below for information.
The script in this post, after some error correction, works and extracts the stream title using PHP: PHP script to extract artist & title from Shoutcast/Icecast stream.
I had to make a couple of changes, because the echo statements at the end were throwing an error. I added two print_r() statements after the function, and $argv[1] in the call so you can pass the URL to it from the command line.
<?php
define('CRLF', "\r\n");
class streaminfo{
public $valid = false;
public $useragent = 'Winamp 2.81';
protected $headers = array();
protected $metadata = array();
public function __construct($location){
$errno = $errstr = '';
$t = parse_url($location);
// Fall back to port 80 when the URL does not specify one.
$sock = fsockopen($t['host'], isset($t['port']) ? $t['port'] : 80, $errno, $errstr, 5);
$path = isset($t['path'])?$t['path']:'/';
if ($sock){
$request = 'GET '.$path.' HTTP/1.0' . CRLF .
'Host: ' . $t['host'] . CRLF .
'Connection: Close' . CRLF .
'User-Agent: ' . $this->useragent . CRLF .
'Accept: */*' . CRLF .
'icy-metadata: 1'.CRLF.
'icy-prebuffer: 65536'.CRLF.
(isset($t['user'])?'Authorization: Basic '.base64_encode($t['user'].':'.$t['pass']).CRLF:'').
'X-TipOfTheDay: Winamp "Classic" rulez all of them.' . CRLF . CRLF;
if (fwrite($sock, $request)){
$theaders = $line = '';
while (!feof($sock)){
$line = fgets($sock, 4096);
if('' == trim($line)){
break;
}
$theaders .= $line;
}
$theaders = explode(CRLF, $theaders);
foreach ($theaders as $header){
$t = explode(':', $header);
if (isset($t[0]) && trim($t[0]) != ''){
$name = preg_replace('/[^a-z][^a-z0-9]*/i','', strtolower(trim($t[0])));
array_shift($t);
$value = trim(implode(':', $t));
if ($value != ''){
if (is_numeric($value)){
$this->headers[$name] = (int)$value;
}else{
$this->headers[$name] = $value;
}
}
}
}
if (!isset($this->headers['icymetaint'])){
$data = ''; $metainterval = 512;
while(!feof($sock)){
$data .= fgetc($sock);
if (strlen($data) >= $metainterval) break;
}
$this->print_data($data);
$matches = array();
preg_match_all('/([\x00-\xff]{2})\x0\x0([a-z]+)=/i', $data, $matches, PREG_OFFSET_CAPTURE);
preg_match_all('/([a-z]+)=([a-z0-9\(\)\[\]., ]+)/i', $data, $matches, PREG_SPLIT_NO_EMPTY);
echo '<pre>';var_dump($matches);echo '</pre>';
$title = $artist = '';
foreach ($matches[0] as $nr => $values){
$offset = $values[1];
// Square brackets: the {} string-offset syntax was removed in PHP 8.
$length = ord($values[0][0]) +
(ord($values[0][1]) * 256)+
(ord($values[0][2]) * 256*256)+
(ord($values[0][3]) * 256*256*256);
$info = substr($data, $offset + 4, $length);
$seperator = strpos($info, '=');
$this->metadata[substr($info, 0, $seperator)] = substr($info, $seperator + 1);
if (substr($info, 0, $seperator) == 'title') $title = substr($info, $seperator + 1);
if (substr($info, 0, $seperator) == 'artist') $artist = substr($info, $seperator + 1);
}
$this->metadata['streamtitle'] = $artist . ' - ' . $title;
}else{
$metainterval = $this->headers['icymetaint'];
$intervals = 0;
$metadata = '';
while(1){
$data = '';
while(!feof($sock)){
$data .= fgetc($sock);
if (strlen($data) >= $metainterval) break;
}
//$this->print_data($data);
// 'C' unpacks an unsigned byte; the signed 'c' would turn lengths above 127 negative.
$len = join(unpack('C', fgetc($sock))) * 16;
if ($len > 0){
$metadata = str_replace("\0", '', fread($sock, $len));
break;
}else{
$intervals++;
if ($intervals > 100) break;
}
}
$metarr = explode(';', $metadata);
foreach ($metarr as $meta){
$t = explode('=', $meta);
if (isset($t[0]) && trim($t[0]) != ''){
$name = preg_replace('/[^a-z][^a-z0-9]*/i','', strtolower(trim($t[0])));
array_shift($t);
$value = trim(implode('=', $t));
if (substr($value, 0, 1) == '"' || substr($value, 0, 1) == "'"){
$value = substr($value, 1);
}
if (substr($value, -1) == '"' || substr($value, -1) == "'"){
$value = substr($value, 0, -1);
}
if ($value != ''){
$this->metadata[$name] = $value;
}
}
}
}
fclose($sock);
$this->valid = true;
}else echo 'unable to write.';
}else echo 'no socket '.$errno.' - '.$errstr.'.';
print_r($theaders);
print_r($metadata);
}
public function print_data($data){
$data = str_split($data);
$c = 0;
$string = '';
echo "<pre>\n000000 ";
foreach ($data as $char){
$string .= addcslashes($char, "\n\r\0\t");
$hex = dechex(join(unpack('C', $char)));
if ($c % 4 == 0) echo ' ';
if ($c % (4*4) == 0 && $c != 0){
foreach (str_split($string) as $s){
//echo " $string\n";
if (ord($s) < 32 || ord($s) > 126){
echo '\\'.ord($s);
}else{
echo $s;
}
}
echo "\n";
$string = '';
echo str_pad($c, 6, '0', STR_PAD_LEFT).' ';
}
if (strlen($hex) < 1) $hex = '00';
if (strlen($hex) < 2) $hex = '0'.$hex;
echo $hex.' ';
$c++;
}
echo " $string\n</pre>";
}
public function __get($name){
if (isset($this->metadata[$name])){
return $this->metadata[$name];
}
if (isset($this->headers[$name])){
return $this->headers[$name];
}
return null;
}
}
$t = new streaminfo($argv[1]); // get metadata
/*
echo "Meta Interval: ".$t->icymetaint;
echo "\n";
echo 'Current Track: '.$t->streamtitle;
*/
?>
With the updated code, it prints the arrays of header and streamtitle info. If you only want the now_playing track, then comment out the two print_r() statements, and uncomment the echo statements at the end.
#Example: run this command:
php getstreamtitle.php http://162.244.80.118:3066
#and the result is...
Array
(
[0] => HTTP/1.0 200 OK
[1] => icy-notice1:<BR>This stream requires <a href="http://www.winamp.com">Winamp</a><BR>
[2] => icy-notice2:SHOUTcast DNAS/posix(linux x64) v2.6.0.750<BR>
[3] => Accept-Ranges:none
[4] => Access-Control-Allow-Origin:*
[5] => Cache-Control:no-cache,no-store,must-revalidate,max-age=0
[6] => Connection:close
[7] => icy-name:
[8] => icy-genre:Old Time Radio
[9] => icy-br:24
[10] => icy-sr:22050
[11] => icy-url:http://horror-theatre.com
[12] => icy-pub:1
[13] => content-type:audio/mpeg
[14] => icy-metaint:8192
[15] => X-Clacks-Overhead:GNU Terry Pratchett
[16] =>
)
StreamTitle='501026TooHotToLive';
Here is the original post using python and vlc
The PHP solution kept searching but never returned a response for me.
This is not PHP as requested, but may help others looking for a way to extract the 'now_playing' info from live streams.
If you only want the 'now_playing' info, you can edit the script to return that.
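The python script extracts the metadata (including the 'now_playing' track) using VLC. You need VLC and the python standard library modules sys, telnetlib, os, time and socket. Note that telnetlib was removed from the standard library in Python 3.13, so this needs an older Python.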
#!/usr/bin/python
# coding: utf-8
import sys, telnetlib, os, time, socket

HOST = "localhost"
password = "admin"
port = "4212"

def check_port():
    # True if something is already listening on the VLC telnet port.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    res = sock.connect_ex((HOST, int(port)))
    sock.close()
    return res == 0

def checkstat():
    # Start VLC with the telnet interface if it is not running yet, then wait for the port.
    if not check_port():
        os.popen('vlc --no-audio --intf telnet --telnet-password admin --quiet 2>/dev/null &')
    while not check_port():
        time.sleep(.1)

def docmd(cmd):
    tn = telnetlib.Telnet(HOST, port)
    tn.read_until(b"Password: ")
    tn.write(password.encode('utf-8') + b"\n")
    tn.read_until(b"> ")
    tn.write(cmd.encode('utf-8') + b"\n")
    ans = tn.read_until(b">")[0:-3]
    tn.close()  # close before returning; the original close() was unreachable
    # Decode to str so the string searches below work on Python 3.
    return ans.decode('utf-8', errors='replace')

def nowplaying(playing):
    npstart = playing.find('now_playing')
    mystr = playing[npstart:]
    npend = mystr.find('\n')
    return mystr[:npend]

def metadata(playing):
    fstr = '+----'
    mstart = playing.find(fstr)
    mend = playing.find(fstr, mstart + len(fstr))
    return playing[mstart:mend + len(fstr)]

checkstat()
docmd('add ' + sys.argv[1])
playing = ""
count = 0
# Poll VLC until the now_playing field appears, giving up after 10 tries.
while 'now_playing:' not in playing:
    time.sleep(.5)
    playing = docmd('info')
    count += 1
    if count > 9:
        break
if playing == "":
    print("--Timeout--")
else:
    print(metadata(playing))
docmd('shutdown')
Example, extract metadata from Crypt Theater Station:
./radiometatdata.py http://107.181.227.250:8026
Response:
+----[ Meta data ]
|
| title: *CRYPT THEATER*
| filename: 107.181.227.250:8026
| genre: Old Time Radio
| now_playing: CBS Radio Mystery Theatre - A Ghostly Game of Death
|
+----