Nokogiri XML import feed organisation?
I have built a site that relies on an XML feed, which I currently parse with Nokogiri. Everything works fine, but all of the code currently lives in my Admin controller so that I can invoke the import via a URL, i.e. /admin/import/.
I can't help but think that this doesn't belong in the controller. Is there a better way to do this, e.g. moving the code into a standalone import.rb file so that it is only accessible from the console? If so, where would I need to put this file? In the /lib/ directory?
Here is a code snippet:
class AdminController < ApplicationController
  def import
    # The block form closes the file even if parsing raises.
    @doc = File.open("#{Rails.root}/public/feed.xml") { |f| Nokogiri::XML(f) }
    ignore_list = [] # programName values to skip

    # Keep only the last product for each distinct name.
    @doc.xpath("/*/product[not(name = following-sibling::product/name)]").each do |node|
      unless ignore_list.include?(node.xpath("./programName").inner_text.strip)
        Product.create(:name => clean_field(node.xpath("./name").inner_text).downcase,
          :description => clean_field(node.xpath("./description").inner_text),
          :brand => Brand.find_or_create_by_name(clean_field_key(node.xpath("./brand").inner_text).downcase),
          :merchant => Merchant.find_or_create_by_name(clean_field_key(node.xpath("./programName").inner_text).downcase),
          :image => node.xpath("./imageUrl").inner_text.strip,
          :link => node.xpath("./productUrl").inner_text.strip,
          :category => Category.find_or_create_by_name(clean_field_key(node.xpath("./CategoryName").inner_text).downcase),
          :price => "£" + node.xpath("./price").inner_text.strip)
        print clean_field(node.xpath("./name").inner_text).downcase + "\n"
      end
    end
  end
end
Your code sounds like a good fit for a Rails runner script. Runner scripts run outside the normal Rails process that serves your site, but they load the full application environment, including ActiveRecord, so you can easily access your database.
I don't think Rails is as strict about the file's location as it is for all its other files, but I'd create a subdirectory under 'app' called 'scripts' and put it there. Keeping a tidy directory structure is a good thing for maintenance.
You don't say if you're running Rails 3 or a previous version. If you're running Rails 3, type rails runner -h at the command line of your Rails app for more info. On Rails 2 the equivalent is script/runner -h.
Some folks feel that scripts should be run using rake, which I agree with IF they're manipulating files and folders and doing general maintenance of the rails-space your app runs in. If you're performing a periodic task that is part of the housekeeping of the database, or, in your case, retrieving content used in support of your app, I think it should be a "runner" task.
You can build functionality so you could still trigger the code to run via a URL, but I think there's potential for abuse of that, especially if you could overwrite needed data or fill the database with duplicate/redundant data. I think it'd be better to make the task run periodically, via cron initiated by the OS probably, just to keep things on a nice interval, or only run manually. If you keep it available via URL access I'd recommend using a password to help avoid abuse.
Finally, as someone who's been doing this a long time, I'd recommend a bit of structure and alignment in your code:
Product.create(
  :name        => clean_field(node.xpath("./name").inner_text).downcase,
  :description => clean_field(node.xpath("./description").inner_text),
  :brand       => Brand.find_or_create_by_name(clean_field_key(node.xpath("./brand").inner_text).downcase),
  :merchant    => Merchant.find_or_create_by_name(clean_field_key(node.xpath("./programName").inner_text).downcase),
  :image       => node.xpath("./imageUrl").inner_text.strip,
  :link        => node.xpath("./productUrl").inner_text.strip,
  :category    => Category.find_or_create_by_name(clean_field_key(node.xpath("./CategoryName").inner_text).downcase),
  :price       => "£" + node.xpath("./price").inner_text.strip
)
Simple alignment can go a long way to help you maintain your code, or help keep the sanity of someone down the line who ends up maintaining it. I'd probably keep it looking like:
Product.create(
  :name        => clean_field( node.xpath( "./name"        ).inner_text ).downcase,
  :description => clean_field( node.xpath( "./description" ).inner_text ),
  :brand       => Brand.find_or_create_by_name(    clean_field_key( node.xpath( "./brand"       ).inner_text ).downcase ),
  :merchant    => Merchant.find_or_create_by_name( clean_field_key( node.xpath( "./programName" ).inner_text ).downcase ),
  :image       => node.xpath( "./imageUrl"   ).inner_text.strip,
  :link        => node.xpath( "./productUrl" ).inner_text.strip,
  :category    => Category.find_or_create_by_name( clean_field_key( node.xpath( "./CategoryName" ).inner_text ).downcase ),
  :price       => "£" + node.xpath( "./price" ).inner_text.strip
)
but that's just me. I like having more whitespace, especially when there are nested methods, and I like having some vertical alignment among common/similar functions. I find it makes it easier to scan the code and see any differences, which helps when you are debugging or looking for a particular thing. Again, that's just my preference, but it's something I've learned over many years of writing code in a lot of different languages.
I have similar functions in some of my apps. I typically put this logic in a class, e.g. Importer. This way I can both use it from the console and make an access-controlled controller action to let others use it from the web. Where you put the class isn't terribly important, so long as it is in the app's load path. I tend to put mine in app/models rather than in /lib, just so I don't have to reload the app in development when I make changes.
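To make the idea concrete, here is a minimal sketch of such a class. It is only an illustration under some assumptions: the name ProductImporter, its constructor arguments, and the FakeStore stand-in are all hypothetical, and the records are assumed to be plain hashes of raw feed strings (keyed by element name) that you have already pulled out of the Nokogiri nodes. Only two attributes are shown; the rest would follow the same pattern.

```ruby
# Hypothetical importer: plain Ruby, so it loads without Rails or Nokogiri.
class ProductImporter
  # records     - enumerable of hashes of raw feed strings, keyed by element name
  # store       - any object responding to create(attributes); Product in the app
  # ignore_list - programName values to skip
  def initialize(records, store, ignore_list = [])
    @records     = records
    @store       = store
    @ignore_list = ignore_list
  end

  # Creates one entry per accepted record and returns how many were created.
  def run
    created = 0
    @records.each do |record|
      merchant = record.fetch("programName", "").strip
      next if @ignore_list.include?(merchant)

      @store.create(
        :name     => record.fetch("name", "").strip.downcase,
        :merchant => merchant.downcase
      )
      created += 1
    end
    created
  end
end

# Stand-in for the Product model so the sketch runs outside Rails.
class FakeStore
  attr_reader :rows

  def initialize
    @rows = []
  end

  def create(attributes)
    @rows << attributes
  end
end

records = [
  { "name" => " Red Shoe ", "programName" => "Acme" },
  { "name" => "Blue Hat",   "programName" => "Blocked" }
]

store   = FakeStore.new
created = ProductImporter.new(records, store, ["Blocked"]).run
puts "Imported #{created} product(s)"
```

In the app you would pass Product as the store and build the records from the parsed feed. The same call then works from the console, from rails runner, or from an access-controlled controller action, which is the point of keeping the logic out of the controller.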
 