Accelerate S3 upload with paperclip
I'm using paperclip to upload images to S3, but I've noticed that the upload is very slow. I think this is because, before the submit completes, the file has to pass through my server, be processed, and then be sent to the S3 server.
Is there a way to speed this up?
thanks
You did not post any code so I'm going to make a few assumptions here:
- In your project you have an Album and an Image model
- An Album has_many :images
- You already have paperclip and aws-sdk set up correctly with buckets and all else
- You are uploading many images at once
In order to upload many images, your form will look something like this:
<%= form_for @album, html: { multipart: true } do |f| %>
  <%= f.file_field :files, accept: 'image/png,image/jpeg,image/gif', multiple: true %>
  <%= f.submit %>
<% end %>
Your controller will look something like this:
class AlbumsController < ApplicationController
  def update
    @album = Album.find params[:id]
    @album.update album_params
    redirect_to @album, notice: 'Images saved'
  end

  private

  # Strong parameters: only the uploaded files array is permitted.
  def album_params
    params.require(:album).permit files: []
  end
end
In order to manipulate images using an album you'll need
class Album < ApplicationRecord
  has_many :images, dependent: :destroy
  accepts_nested_attributes_for :images, allow_destroy: true

  def files=(array = [])
    array.each do |f|
      images.create file: f
    end
  end
end
Your Image file will look like this:
class Image < ApplicationRecord
  belongs_to :album
  has_attached_file :file, styles: { thumbnail: '500x500#' }, default_url: '/default.jpg'
  validates_attachment_content_type :file, content_type: /\Aimage\/.*\Z/
end
This is just the important stuff. With this setup, an upload of 22 images with a total of 12MB takes the :files= method 41.1806895 seconds to execute on average on my local server. To check how long a method takes to run, use:
def files=(array = [])
  start = Time.now
  array.each do |f|
    images.create file: f
  end
  p "ELAPSED TIME: #{Time.now - start}"
end
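Ruby's standard library also ships a Benchmark module that wraps the same start/stop pattern. A minimal sketch, assuming nothing about your app: `slow_save` and `timed_save` are hypothetical names, and `slow_save` only sleeps to stand in for `images.create file: f`.

```ruby
require 'benchmark'

# Hypothetical stand-in for `images.create file: f`; it only sleeps briefly.
def slow_save(file)
  sleep 0.01
  file
end

# Run slow_save over every file and return [results, elapsed seconds].
def timed_save(files)
  saved = nil
  elapsed = Benchmark.realtime do
    saved = files.map { |f| slow_save(f) }
  end
  [saved, elapsed]
end

saved, elapsed = timed_save(%w[a.png b.png c.png])
puts "ELAPSED TIME: #{elapsed.round(4)}s for #{saved.size} files"
```

Benchmark.realtime returns the wall-clock seconds spent inside the block, so it measures the same thing as subtracting two Time.now values.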
You asked for a faster upload of many images. There are a few ways to do this. Plain background jobs won't work well here because you can't pass complex data like uploaded images to a job.
Use delayed_paperclip instead. It moves the creation of image styles (like thumbnail: '500x500#') into background jobs.
Gemfile
source 'https://rubygems.org'
ruby '2.3.0'
...
gem 'delayed_paperclip'
...
Image file
class Image < ApplicationRecord
  ...
  process_in_background :file
end
It speeds up the :files= method. The same upload as before (22 images, 12MB) with this setup took 23.13998 seconds on my machine. That's 1.77963 times faster than before.
Another way of speeding things up is by using Threads. Remove delayed_paperclip from the Gemfile and the process_in_background :file line. Update your :files= method:
def files=(array = [])
  threads = []
  array.each do |f|
    threads << Thread.new do
      images.create file: f
    end
  end
  threads.each(&:join)
end
If you try this, you might get some weird errors and see that only 4 images were saved. You must also use a Mutex. Also, you must not call :join on the threads, because if you join, the method will wait until the threads are done running.
def files=(array = [])
  semaphore = Mutex.new
  array.each do |f|
    Thread.new do
      semaphore.synchronize do
        images.create file: f
      end
    end
  end
end
With this simple change to the method and no added gems, the :files= method for the same upload as before returns in 0.017628 seconds, since the images keep saving in the background threads after it returns. That is 1,313 times faster than delayed_paperclip. It's also 2,336 times faster than the regular setup.
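To see why the method returns so quickly without :join, here is a small stand-alone sketch using only the standard library. `slow_save` and `save_in_background` are hypothetical names, and the sleep stands in for one slow `images.create file: f` call. Unlike the method above, the Mutex here only guards the shared array, so the simulated saves can overlap.

```ruby
require 'benchmark'

# Hypothetical stand-in for one slow upload (`images.create file: f`).
def slow_save(file, saved, lock)
  sleep 0.05
  lock.synchronize { saved << file } # the Mutex guards the shared array
end

# Spawn one thread per file and return the threads without joining them.
def save_in_background(files, saved, lock)
  files.map { |f| Thread.new { slow_save(f, saved, lock) } }
end

files = %w[a.png b.png c.png d.png]
saved = []
lock = Mutex.new
threads = nil

# Time spent only spawning the threads (what the timer in files= would see).
spawn_time = Benchmark.realtime { threads = save_in_background(files, saved, lock) }
# Time spent waiting for the background work to actually finish.
finish_time = Benchmark.realtime { threads.each(&:join) }

puts "method returned after #{spawn_time.round(4)}s"
puts "saves finished after a further #{finish_time.round(4)}s"
puts "saved #{saved.size} of #{files.size} files"
```

The first number only covers spawning the threads. The saves themselves still take time, they just happen after the method has returned.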
What happens if you use delayed_paperclip AND Threads?
Don't change the :files= method. Just turn delayed_paperclip back on in your Gemfile and add back the process_in_background :file line.
With this setup on my machine, the method runs in 0.001277 seconds on average. That's:
- 13.8 times faster than Threads
- 18,120.6 times faster than delayed_paperclip
- 32,248.0 times faster than regular setup
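Each of these ratios is just the slower average divided by the faster one. A quick check of the arithmetic, using only the timings quoted in this answer (the `TIMINGS` hash and `speedup` helper are names made up for this sketch):

```ruby
# Average timings reported in this answer, in seconds.
TIMINGS = {
  regular: 41.1806895,
  delayed_paperclip: 23.13998,
  threads: 0.017628,
  delayed_paperclip_and_threads: 0.001277
}.freeze

# How many times faster the `fast` setup is than the `slow` one.
def speedup(slow, fast)
  TIMINGS.fetch(slow) / TIMINGS.fetch(fast)
end

fastest = :delayed_paperclip_and_threads
%i[threads delayed_paperclip regular].each do |setup|
  puts format('%.1f times faster than %s', speedup(setup, fastest), setup)
end
```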
Remember, this is on my machine and I have not tested this in production. I am also on wifi, not ethernet. All these things can change the results but I think the numbers speak for themselves.
Upload images faster. Done.
UPDATE: Don't use delayed_paperclip. It can cause a busy database, and some images might not get saved. I've tested it. I think just using threads is fast enough. Remove the process_in_background line from the Image file. Also, here's what my files= method looks like:
def files=(array = [])
  Thread.new do
    begin
      array.each { |f| images.create file: f }
    ensure
      ActiveRecord::Base.connection_pool.release_connection
    end
  end
end
Note: Since we push the image saving to a background thread and then redirect, the page that loads will not have the images on it yet. The user has to refresh to update the page. One way around this is to use polling. Polling is when JavaScript checks for changes every 5 seconds or so and updates the page if there are any.
Another option is to use WebSockets. Now that we have Rails 5, we can use ActionCable. Every time an image gets created, we broadcast an update for the album. If the user is on the page for that album, they will see updates as soon as they happen in the database, without the user having to refresh or the browser making a request every 5 seconds in an infinite loop.
Cool stuff.
Do you want to improve the appearance of the upload being faster or actually make the upload faster?
If it's the former you can put your image handling logic into a background task using something like delayed_job. This way when a user clicks the button they'll immediately go to their next page while you process the image (you can show a "processing in progress" image placeholder until the task is finished).
If it's the latter then it's entirely down to your server and internet connection. Where are you hosting?
How about uploading direct to S3?
Not sure if paperclip does this out of the box, but you could build it yourself.
http://docs.amazonwebservices.com/AmazonS3/2006-03-01/dev/index.html?UsingHTTPPOST.html
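As a rough idea of what a direct upload involves, here is a sketch of the signed form fields for a browser-based POST, following the older policy-plus-HMAC-SHA1 scheme described in the linked documentation. Everything named here is an assumption for illustration: the `s3_post_fields` helper, the bucket name, the key prefix, the 10MB limit and the credentials are all hypothetical. Newer S3 regions require Signature Version 4 instead, and the aws-sdk gem can generate a presigned POST for you, so treat this as a sketch of the idea rather than production code.

```ruby
require 'base64'
require 'json'
require 'openssl'
require 'time'

# Build the hidden form fields for a browser-based POST upload to S3.
# Legacy signing scheme (policy + HMAC-SHA1); all arguments are hypothetical.
def s3_post_fields(bucket:, key_prefix:, access_key_id:, secret_access_key:, expires_at:)
  # The policy lists the conditions the upload form must satisfy.
  policy = {
    'expiration' => expires_at.utc.iso8601,
    'conditions' => [
      { 'bucket' => bucket },
      ['starts-with', '$key', key_prefix],
      { 'acl' => 'private' },
      ['content-length-range', 0, 10 * 1024 * 1024]
    ]
  }
  encoded_policy = Base64.strict_encode64(JSON.generate(policy))

  # The signature is the Base64 HMAC-SHA1 of the encoded policy.
  signature = Base64.strict_encode64(
    OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha1'), secret_access_key, encoded_policy)
  )

  {
    'key' => "#{key_prefix}${filename}", # S3 substitutes the uploaded file name
    'AWSAccessKeyId' => access_key_id,
    'acl' => 'private',
    'policy' => encoded_policy,
    'signature' => signature
  }
end

fields = s3_post_fields(
  bucket: 'example-bucket',          # hypothetical bucket
  key_prefix: 'uploads/',            # hypothetical key prefix
  access_key_id: 'AKIAEXAMPLE',      # placeholder credentials
  secret_access_key: 'not-a-real-secret',
  expires_at: Time.now + 3600
)
puts fields.keys.inspect
```

The form then posts the file straight to the bucket with these fields as hidden inputs, so the upload never passes through the Rails server. The secret key stays on the server; only the policy and its signature reach the browser.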
Use delayed jobs, this is a good example here
Or you can use flash upload.
If you end up going the route of uploading directly to S3 which offloads the work from your Rails server, please check out my sample projects:
Sample project using Rails 3, Flash and MooTools-based FancyUploader to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-FancyUploader
Sample project using Rails 3, Flash/Silverlight/GoogleGears/BrowserPlus and jQuery-based Plupload to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-Plupload
By the way, you can do post-processing with Paperclip using something like this blog post describes:
http://www.railstoolkit.com/posts/fancyupload-amazon-s3-uploader-with-paperclip
As cwninja recommends, we upload direct to s3 so as to get rid of this extra upload. We use a modified version of the plugin described in this blog post:
http://elctech.wpengine.com/2009/02/updates-on-rails-s3-flash-upload-plugin/
Ours is modified to handle multiple file uploads (we rewrote the Flex object).
Not sure how well this plays with paperclip, we use attachment_fu, but it wasn't so bad to get it to work with that.