Best way to get App specific data from the Blackberry App World (API)
I'm collecting stats about mobile applications using Python and now I'm looking for the best solution to access the Blackberry App World data.
So far I have solutions for iOS (http://www.apple.com/itunes/affiliates/resources/documentation/itunes-store-web-service-search-api.html) and Android (https://github.com/liato/android-market-api-py). The iOS solution uses the API provided by Apple; the Android solution simulates a phone and retrieves structured data the same way a real phone does.
I can't seem to find a similar solution for the BlackBerry App World, so my question is: what's the best way to go? I could scrape the site, but I'd rather not, since my scraper will break if they change the site. Ideally I'd use either a provided API or simulate a BlackBerry device to access the App World data in a more structured way. Any suggestions?
I have been scraping the BlackBerry website for a while and have not had a problem with site updates so far.
Are you using absolute XPaths from the root of the document to extract data? You can make a more robust scraper by using relative XPaths:
//div[@id="priceArea"]/div[@class="contentLic"]
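Since the question is about Python, here is a minimal sketch of why the relative form tends to survive layout changes. The two page snippets are hypothetical, heavily simplified stand-ins for the real markup, and the "redesign" is imagined. The standard library's ElementTree only understands a subset of XPath and needs well-formed input, so for real pages a full HTML/XPath library would likely be the better fit; the sketch is only meant to show the idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real App World page is far larger.
OLD_PAGE = """<html><body>
<div id="priceArea"><div class="contentLic">Free</div></div>
</body></html>"""

# The same data after an imagined redesign that adds a wrapper <div>.
NEW_PAGE = """<html><body><div id="wrapper">
<div id="priceArea"><div class="contentLic">Free</div></div>
</div></body></html>"""

# Path that depends on the exact nesting from the document root.
ABSOLUTE = "./body/div/div"
# Path anchored on the id/class attributes, wherever they sit in the tree.
RELATIVE = ".//div[@id='priceArea']/div[@class='contentLic']"


def extract(page, path):
    """Return the text of the first node matching path, or None."""
    node = ET.fromstring(page).find(path)
    return node.text if node is not None else None


if __name__ == "__main__":
    for name, page in (("old", OLD_PAGE), ("new", NEW_PAGE)):
        print(name, "absolute:", extract(page, ABSOLUTE))
        print(name, "relative:", extract(page, RELATIVE))
```

On the old layout both paths find the price text; on the new layout the absolute path lands on the wrong element while the relative one still works.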
I have been scraping the BlackBerry website with Selenium WebDriver, PhantomJSDriver, and CsQuery in .NET for a while, and I have not had a problem with site updates so far.
// Namespaces used below (Selenium WebDriver, its PhantomJS driver, and CsQuery)
using System;
using System.Threading;
using CsQuery;
using OpenQA.Selenium;
using OpenQA.Selenium.PhantomJS;
// Start a headless browser; phantomDriverPath is the folder that contains the PhantomJS executable
IWebDriver driver = new PhantomJSDriver(phantomDriverPath);
//IWebDriver driver = new ChromeDriver(chromeDriverPath);
// Setting Url loads the page; appName holds the app name to search for
driver.Url = "https://appworld.blackberry.com/webstore/search/" + Uri.EscapeDataString(appName) + "/?lang=en&countrycode=IN";
// Give the page's scripts time to render the search results
Thread.Sleep(2000); // 2 seconds
if (driver.PageSource != null)
{
    // Parse the downloaded page source with CsQuery
    CQ dom = CQ.CreateDocument(driver.PageSource);
    // Select whatever elements you need by id, class name, or tag name
    string title1 = dom["#topListtopResultsAppTemplateHTML_listItem_0_title"].Text();
}
// Close the browser when finished
driver.Quit();
Before you start coding, download Selenium WebDriver and the PhantomJS driver to your PC (for example, to C:\Users\rakesh\Documents\Selenium\PhantomJSDriver) and install CsQuery in your Visual Studio project.
Install WebDriver:
Install-Package Selenium.WebDriver
Install PhantomJS:
Install-Package phantomjs.exe
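For the Python side of the question, the two non-browser steps of this approach (building the search URL and pulling an element's text out by its id) might be sketched as follows, using only the standard library. The URL pattern and the element id are taken from the answer above; `build_search_url`, `TextById`, and `extract_text_by_id` are hypothetical helper names, and the sample HTML is made up. The results page appears to be rendered by scripts, so the HTML would still have to come from a real or headless browser, and whether the site still serves this URL is not verified here.

```python
from html.parser import HTMLParser
from urllib.parse import quote, urlencode

# Search URL prefix as used in the answer above.
BASE = "https://appworld.blackberry.com/webstore/search/"
# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


def build_search_url(app_name, lang="en", countrycode="IN"):
    """Build the search URL, percent-encoding the app name."""
    query = urlencode({"lang": lang, "countrycode": countrycode})
    return f"{BASE}{quote(app_name, safe='')}/?{query}"


class TextById(HTMLParser):
    """Collect the text inside the first element whose id matches."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0       # > 0 while inside the target element
        self.done = False    # True once the target element has closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif not self.done and dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_text_by_id(html, target_id):
    """Return the stripped text of the element with the given id."""
    parser = TextById(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


if __name__ == "__main__":
    print(build_search_url("angry birds"))
    # Made-up markup standing in for the rendered page source.
    sample = ('<div><span id="topListtopResultsAppTemplateHTML_listItem_0_title">'
              'Example <b>App</b></span><span id="other">x</span></div>')
    print(extract_text_by_id(sample, "topListtopResultsAppTemplateHTML_listItem_0_title"))
```

Selecting by id keeps the lookup independent of where the element sits in the page, which is the same robustness argument made for relative XPaths earlier in the thread.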