Cucumber reading a pdf into a temp file

I've got a Cucumber suite set up to read a static PDF file and do assertions on its content.

I recently updated all my gems, and since doing so, it doesn't work anymore.

The cucumber step is as follows:

When /^I follow PDF link "([^"]*)"$/ do |arg1|
  temp_pdf = Tempfile.new('foo')
  temp_pdf << page.body
  temp_pdf.close
  temp_txt = Tempfile.new('txt')
  temp_txt.close
  `pdftotext -q #{temp_pdf.path} #{temp_txt.path}`
  page.driver.instance_variable_set('@body', File.read(temp_txt.path))
end

This used to work just fine, but after upgrading to Lion and updating my gems, it throws the following error when executing the line temp_pdf << page.body:

encoding error: output conversion failed due to conv error, bytes 0xA3 0xC3 0x8F 0xC3
I/O error : encoder error

I tried a few different PDFs from different sources and they all seem to be failing. How can I get the PDF read into the temporary file?


The step further down works for me. I had to change temp_pdf << page.body to temp_pdf << page.source, because body has already gone through a faulty parse. I also had to set the instance variable @dom on the driver's browser, instead of @body on the driver. This is because in recent Capybara versions the rack_test driver no longer has a @body instance variable; instead, body calls '@browser.body':

https://github.com/jnicklas/capybara/blob/master/lib/capybara/rack_test/driver.rb

browser.body in turn calls 'dom.to_xml', and if you look at 'dom' you see that it initializes @dom with Nokogiri::HTML. So it makes a lot of sense that there were Nokogiri conversion errors in the first place: the PDF bytes were being parsed as HTML and then serialized again.

https://github.com/jnicklas/capybara/blob/master/lib/capybara/rack_test/browser.rb
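
As a side illustration of why binary PDF data does not survive a text conversion, here is a small plain-Ruby sketch using the four bytes quoted in the error message. It does not touch Capybara or Nokogiri, and it is only an analogy for the kind of failure libxml2 reports, not the same code path.

```ruby
# The four bytes quoted in the error message, as raw binary data.
bytes = "\xA3\xC3\x8F\xC3".b

# Reinterpreted as UTF-8 they do not form a valid string:
# 0xA3 is a stray continuation byte and the final 0xC3 is truncated.
as_utf8 = bytes.dup.force_encoding(Encoding::UTF_8)
valid = as_utf8.valid_encoding?

# Asking Ruby to transcode the binary string fails as well, because
# binary bytes above 0x7F have no defined mapping into UTF-8.
conversion_error =
  begin
    bytes.encode(Encoding::UTF_8)
    nil
  rescue Encoding::UndefinedConversionError => e
    e.class
  end
```

Any step that treats the response as text, rather than as opaque bytes, is a candidate for this sort of error.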

with_scope(selector) do
  click_link(label)
  temp_pdf = Tempfile.new('pdf')
  temp_pdf << page.source
  temp_pdf.close
  temp_txt = Tempfile.new('txt')
  temp_txt.close
  temp_txt_path = "#{temp_txt.path}.html"
  `pdftohtml -c -noframes #{temp_pdf.path} #{temp_txt_path}`
  page.driver.browser.instance_variable_set('@dom', Nokogiri::HTML(File.read(temp_txt_path)))
end
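
If the write into the temp file itself ever becomes the failing step, a possible variation is to put the Tempfile into binary mode so Ruby does no encoding or newline translation on the PDF bytes. The sketch below is only an assumption about how that could look: write_binary_tempfile and pdftotext_command are hypothetical helper names, not part of Capybara or Cucumber, and the sample bytes are made up for the example.

```ruby
require 'tempfile'

# Hypothetical helper: writes raw bytes to a Tempfile without any
# encoding conversion and returns the closed Tempfile.
def write_binary_tempfile(bytes, basename = 'pdf')
  file = Tempfile.new(basename)
  file.binmode        # disable newline and encoding translation
  file.write(bytes)
  file.close          # flush to disk; the file stays until it is unlinked
  file
end

# Hypothetical helper: builds the pdftotext invocation as an argument
# array, so the paths are never interpreted by a shell.
def pdftotext_command(pdf_path, txt_path)
  ['pdftotext', '-q', pdf_path, txt_path]
end

# Made-up sample data: a PDF-like header followed by non-UTF-8 bytes.
sample = "%PDF-1.4\n\xA3\xC3\x8F\xC3".b
pdf = write_binary_tempfile(sample)
round_trip = File.binread(pdf.path)
command = pdftotext_command(pdf.path, "#{pdf.path}.txt")
```

In a step definition the array could be passed as system(*command), which skips the shell and so avoids quoting problems with unusual temp paths. It assumes pdftotext is installed and on the PATH.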