How can I fetch an external page with PHP when the page relies on JavaScript?
There is a website that includes some JavaScript code. Normally, when a user opens the page, this code runs and provides a link after about 10 seconds. I am trying to capture this link. In PHP, I fetch the page with the file_get_contents function, but, as you might predict, the link is not there.
Is there any way to make an HTTP request to this page and wait until the JavaScript code has run? Or could I invoke this JS function myself, maybe using jQuery?
If the question is not clear, I can provide more details. Thanks in advance.
I'd suggest looking into the JavaScript on the page and reverse-engineering how the link is dynamically generated. Then you can use a regex to extract that information from the string returned by file_get_contents.
I can probably help you on the reverse-engineering if you can provide extra information on the page in question (or similar).
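As a rough illustration of the regex step, here is a minimal sketch in Python (the same idea ports to PHP's preg_match). The HTML snippet, the variable name `token`, the `showLink` function, and the pattern are all invented for the example; the real page will need a pattern written against its own script.

```python
import re

# Hypothetical page source, standing in for the string that
# file_get_contents (or any HTTP client) would return.
html = """
<html><body>
<script>
  var token = "abc123";
  setTimeout(function () { showLink("/get?t=" + token); }, 10000);
</script>
</body></html>
"""


def extract_token(source):
    """Return the value assigned to the (hypothetical) `token` variable, or None."""
    match = re.search(r'var\s+token\s*=\s*"([^"]+)"', source)
    return match.group(1) if match else None


token = extract_token(html)
print(token)
```

The point is that the script text is present in the fetched source even though it never runs, so any value it embeds literally can be pulled out with a pattern.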
UPDATE: After some reverse-engineering, I found that the mp3 ripper site uses two APIs: one to push a video for processing, and one to poll its current status.
First api:
http://www.youtube-mp3.org/api/pushItem/?item=http%3A//www.youtube.com/watch%3Fv%3DXXXXXXXX&xy=trve
Second api:
http://www.youtube-mp3.org/api/itemInfo/?video_id=XXXXXXXX&adloc=
XXXXXXXX is the YouTube video ID. The second API returns JSONP where the padding is a variable assignment (info = {...};). In the JSON, there is an "h" member containing a long hash that can ultimately be used to construct the mp3 download link.
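A minimal sketch of how the two URLs could be built and the "info = {...};" response unwrapped, written in Python (in PHP, urlencode, preg_match and json_decode would play the same roles). The function names are made up for illustration, the sample response and its "h" value are placeholders, and the sketch assumes the object inside the assignment is strict JSON, which the real service may not guarantee.

```python
import json
import re
from urllib.parse import quote

API_BASE = "http://www.youtube-mp3.org/api"  # taken from the URLs above


def push_item_url(video_id):
    """Build the first API URL (push a video to be processed)."""
    watch_url = "http://www.youtube.com/watch?v=" + video_id
    # quote() leaves "/" unescaped by default, matching the URL shown above.
    return API_BASE + "/pushItem/?item=" + quote(watch_url) + "&xy=trve"


def item_info_url(video_id):
    """Build the second API URL (poll the processing status)."""
    return API_BASE + "/itemInfo/?video_id=" + quote(video_id, safe="") + "&adloc="


def parse_info(body):
    """Strip the 'info = ...;' padding and decode the object inside it."""
    match = re.search(r"info\s*=\s*(\{.*\})\s*;", body, re.DOTALL)
    if match is None:
        raise ValueError("response does not look like 'info = {...};'")
    return json.loads(match.group(1))


# Placeholder response; only the "h" member is described in the text.
sample = 'info = { "h" : "0123abcd" };'
print(push_item_url("XXXXXXXX"))
print(parse_info(sample)["h"])
```

How the hash is turned into the final download link is not covered here, since the text does not spell it out.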
But to be a bit ethical, may I suggest looking into another approach (if allowed by your hosting environment)? You can use FFmpeg to convert the video on your own. There's a wrapper class here: YouTube-to-MP3 conversion class
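For the FFmpeg route, a conversion could look roughly like the sketch below, shown in Python (from PHP the same command line could be passed to exec). It assumes an ffmpeg binary on the PATH that was built with the libmp3lame encoder; the function names, the file names, and the quality setting are illustrative choices, and this is not the wrapper class mentioned above.

```python
import subprocess


def ffmpeg_mp3_command(video_path, mp3_path):
    """Return the argument list for extracting the audio track as MP3."""
    return [
        "ffmpeg",
        "-i", video_path,          # input video file
        "-vn",                     # drop the video stream
        "-codec:a", "libmp3lame",  # encode audio with LAME
        "-q:a", "2",               # VBR quality level (assumed setting)
        mp3_path,
    ]


def convert_to_mp3(video_path, mp3_path):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(ffmpeg_mp3_command(video_path, mp3_path), check=True)


# Only print the command here; call convert_to_mp3() to actually run it.
print(" ".join(ffmpeg_mp3_command("video.mp4", "audio.mp3")))
```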
JavaScript is executed on the client and needs an interpreter to run it. A PHP server making HTTP requests won't interpret any JavaScript; it will simply retrieve the HTML.
You could use some software like HtmlUnit to make the request and execute the javascript, and then see if you can extract the link after it has finished executing. This will depend on how much access you have to the server you are executing your PHP on.
Or you could study the JS files used by the website you are targeting, determine how it is requesting that link and see if you can just get it directly yourself. Bear in mind you are directly working around how the site is intended to work, so this isn't going to be a particularly elegant solution and a single change in their JS could cause your application to fail. This is fair enough as they might have this very process in place to stop people from being able to harvest the links in the manner you are describing.
file_get_contents only fetches the HTML source of the requested URL. It does not execute the JavaScript code for you, nor does it simulate the DOM with all its events.
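To make the difference concrete, here is a small Python sketch that parses a made-up page the way a plain HTTP fetch would see it. The page source, the element id, and the file path are all hypothetical. The script would set the link's href after ten seconds in a browser, but since nothing executes it, the parsed document has no href at all.

```python
from html.parser import HTMLParser

# Hypothetical source as returned by a plain HTTP fetch.
page_source = """
<html><body>
<a id="download">Please wait...</a>
<script>
  setTimeout(function () {
    document.getElementById("download").href = "/files/song.mp3";
  }, 10000);
</script>
</body></html>
"""


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag in the static markup."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


collector = HrefCollector()
collector.feed(page_source)
print(collector.hrefs)                     # no links in the static markup
print("/files/song.mp3" in page_source)    # the path exists only as script text
```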
The short answer is that this cannot be done easily. One thing you could do is parse the source and look for the link in there, as Dave suggested.