开发者

Record streaming and saving internet radio in python

I am looking for a python snippet t开发者_如何学Pythono read an internet radio stream(.asx, .pls etc) and save it to a file.

The final project is cron'ed script that will record an hour or two of internet radio and then transfer it to my phone for playback during my commute. (3g is kind of spotty along my commute)

any snippits or pointers are welcome.


The following has worked for me using the requests library to handle the http request.

import requests

stream_url = 'http://your-stream-source.com/stream'

r = requests.get(stream_url, stream=True)

with open('stream.mp3', 'wb') as f:
    try:
        for block in r.iter_content(1024):
            f.write(block)
    except KeyboardInterrupt:
        pass

That will save a stream to the stream.mp3 file until you interrupt it with ctrl+C.


So after tinkering and playing with it Ive found Streamripper to work best. This is the command i use

streamripper http://yp.shoutcast.com/sbin/tunein-station.pls?id=1377200 -d ./streams -l 10800 -a tb$FNAME


If you find that your requests or urllib.request call in Python 3 fails to save a stream because you receive "ICY 200 OK" in return instead of an "HTTP/1.0 200 OK" header, you need to tell the underlying functions ICY 200 OK is OK!

What you can effectively do is intercept the routine that handles reading the status after opening the stream, just before processing the headers.

Simply put a routine like this above your stream opening code.

def NiceToICY(self):
    class InterceptedHTTPResponse():
        pass
    import io
    line = self.fp.readline().replace(b"ICY 200 OK\r\n", b"HTTP/1.0 200 OK\r\n")
    InterceptedSelf = InterceptedHTTPResponse()
    InterceptedSelf.fp = io.BufferedReader(io.BytesIO(line))
    InterceptedSelf.debuglevel = self.debuglevel
    InterceptedSelf._close_conn = self._close_conn
    return ORIGINAL_HTTP_CLIENT_READ_STATUS(InterceptedSelf)

Then put these lines at the start of your main routine, before you open the URL.

ORIGINAL_HTTP_CLIENT_READ_STATUS = urllib.request.http.client.HTTPResponse._read_status
urllib.request.http.client.HTTPResponse._read_status = NiceToICY

They will override the standard routine (this one time only) and run the NiceToICY function in place of the normal status check when it has opened the stream. NiceToICY replaces the unrecognised status response, then copies across the relevant bits of the original response which are needed by the 'real' _read_status function. Finally the original is called and the values from that are passed back to the caller and everything else continues as normal.

I have found this to be the simplest way to get round the problem of the status message causing an error. Hope it's useful for you, too.


I am aware this is a year old, but this is still a viable question, which I have recently been fiddling with.

Most internet radio stations will give you an option of type of download, I choose the MP3 version, then read the info from a raw socket and write it to a file. The trick is figuring out how fast your download is compared to playing the song so you can create a balance on the read/write size. This would be in your buffer def.

Now that you have the file, it is fine to simply leave it on your drive (record), but most players will delete from file the already played chunk and clear the file out off the drive and ram when streaming is stopped.

I have used some code snippets from a file archive without compression app to handle a lot of the file file handling, playing, buffering magic. It's very similar in how the process flows. If you write up some sudo-code (which I highly recommend) you can see the similarities.


I'm only familiar with how shoutcast streaming works (which would be the .pls file you mention):

You download the pls file, which is just a playlist. It's format is fairly simple as it's just a text file that points to where the real stream is.

You can connect to that stream as it's just HTTP, that streams either MP3 or AAC. For your use, just save every byte you get to a file and you'll get an MP3 or AAC file you can transfer to your mp3 player.

Shoutcast has one addition that is optional: metadata. You can find how that works here, but is not really needed.

If you want a sample application that does this, let me know and I'll make up something later.


In line with the answer from https://stackoverflow.com/users/1543257/dingles (https://stackoverflow.com/a/41338150), here's how you can achieve the same result with the asynchronous HTTP client library - aiohttp:

import functools

import aiohttp
from aiohttp.client_proto import ResponseHandler
from aiohttp.http_parser import HttpResponseParserPy


class ICYHttpResponseParser(HttpResponseParserPy):
    def parse_message(self, lines):
        if lines[0].startswith(b"ICY "):
            lines[0] = b"HTTP/1.0 " + lines[0][4:]
        return super().parse_message(lines)


class ICYResponseHandler(ResponseHandler):
    def set_response_params(
        self,
        *,
        timer = None,
        skip_payload = False,
        read_until_eof = False,
        auto_decompress = True,
        read_timeout = None,
        read_bufsize = 2 ** 16,
        timeout_ceil_threshold = 5,
    ) -> None:
        # this is a copy of the implementation from here:
        # https://github.com/aio-libs/aiohttp/blob/v3.8.1/aiohttp/client_proto.py#L137-L165
        self._skip_payload = skip_payload

        self._read_timeout = read_timeout
        self._reschedule_timeout()

        self._timeout_ceil_threshold = timeout_ceil_threshold

        self._parser = ICYHttpResponseParser(
            self,
            self._loop,
            read_bufsize,
            timer=timer,
            payload_exception=aiohttp.ClientPayloadError,
            response_with_body=not skip_payload,
            read_until_eof=read_until_eof,
            auto_decompress=auto_decompress,
        )

        if self._tail:
            data, self._tail = self._tail, b""
            self.data_received(data)


class ICYConnector(aiohttp.TCPConnector):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._factory = functools.partial(ICYResponseHandler, loop=self._loop)

This can then be used as follows:

session = aiohttp.ClientSession(connector=ICYConnector())
async with session.get("url") as resp:
    print(resp.status)

Yes, it's using a few private classes and attributes but this is the only solution to change the handling of something that's part of HTTP spec and (theoretically) should not ever need to be changed by the library's user...

All things considered, I would say this is still rather clean in comparison to monkey patching which would cause the behavior to be changed for all requests (especially true for asyncio where setting before and resetting after a request does not guarantee that something else won't make a request while request to ICY is being made). This way, you can dedicate a ClientSession object specifically for requests to servers that respond with the ICY status line.

Note that this comes with a performance penalty for requests made with ICYConnector - in order for this to work, I am using the pure Python implementation of HttpResponseParser which is going to be slower than the one that aiohttp uses by default and is written in C. This cannot really be done differently without vendoring the whole library as the behavior for parsing status line is deeply hidden in the C code.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜