
How to implement cloud storage for media assets in ZF?

I have never programmed a cloud application before, so I'm basically researching how to get started. I'm developing in Zend Framework and want to use the cloud to store media assets. The whole project should be scalable for the cloud. Thinking about this, more and more questions come to mind:

  • Which storage provider would you recommend?
  • How do I handle access rights to the assets? (The public should only be able to access an asset if the corresponding article has been released.)
  • How do I keep track of all the assets? (Naming conventions? Create a database to assign them to an article?)
  • How do I refer to them in Zend Framework? (Does it make sense to use a CDN like CloudFront? How do I create URLs?)
  • Can I keep my code generic so that I can switch from local storage (at the beginning) to cloud storage as the project grows?
  • How do I optimize my assets for different devices/screen sizes? Can I still have only one source?

What I want to do:

  1. A media asset is uploaded by a journalist
  2. The server saves the original file to the cloud (as a restricted asset)
  3. The server prepares the image for the web (scaling, quality) and saves it to the cloud
  4. The media asset is associated with a news article
  5. The news article and its associated media assets are released, or deleted (not released), by an admin
  6. I want to distribute the assets through a CDN if they get released.

I would be very thankful for hints on how to tackle this project ;-]


I would recommend Amazon S3; it is also what I have been developing on top of. I will answer your questions from an AWS S3 perspective.

How do I handle access rights to the assets? (The public should only be able to access an asset if the corresponding article has been released.)

When files are uploaded to Amazon S3, you can choose an access policy. You can also set an access policy on every file in the entire "bucket". A bucket is a unique name used to refer to your cloud-based storage repository. Each file in the bucket is accessed by a single key.

For example, you upload a file called myAwesomeImage.jpg. Now when you transfer that file to S3, you get to choose several options for that file.

  • Content Types
  • Storage Options
  • ACL Rules
  • The Key/Name

So you can choose to put your awesome image in a "fake directory" called some/path/to/file by storing the Object under the key "some/path/to/file/myAwesomeImage.jpg".

Your buckets can store billions of Objects, and you can choose how you want to store them. Using a forward slash looks like it creates a folder, but it doesn't actually create one; it is just a useful convention that you can then employ in your application to signify depth and organisation in your files.
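
In Zend Framework 1 you would typically do this with Zend_Service_Amazon_S3. Here is a minimal sketch; the credentials, bucket name and local path are placeholders:

<?php
// Minimal sketch using ZF1's Zend_Service_Amazon_S3; credentials,
// bucket name and local path are placeholders.
require_once 'Zend/Service/Amazon/S3.php';

$s3 = new Zend_Service_Amazon_S3($awsAccessKey, $awsSecretKey);
$s3->createBucket('my-media-bucket');

// The key contains slashes, but S3 stores it as one flat string; the
// path-like structure is only a convention your application imposes.
$s3->putFile(
    '/tmp/uploads/myAwesomeImage.jpg',
    'my-media-bucket/some/path/to/file/myAwesomeImage.jpg'
);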

Now, ACL

So when you upload your Object, you can pick from several default access policies, or you can create your own. For example, if you upload an Object as ACL_PUBLIC, anyone can access it.

However, if you upload it as ACL_PRIVATE, it is private and only the owner of the file can access it.

Example

  • Public : http://i3.muscache.com/pictures/700010/small.jpg
  • Private : http://i3.muscache.com/pictures/700010/tiny.jpg
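
In the ZF1 wrapper the ACL is just extra metadata on the PUT request. A sketch that matches your release workflow (the constant names below are ZF1's; the AWS SDK uses ACL_PUBLIC / ACL_PRIVATE as above, and the bucket and keys are examples):

// Sketch: the journalist's original stays restricted, the web-ready
// version is world-readable. $s3 is the instance from the earlier sketch.
$s3->putFile('/tmp/original.jpg', 'my-media-bucket/articles/123/original.jpg', array(
    Zend_Service_Amazon_S3::S3_ACL_HEADER => Zend_Service_Amazon_S3::S3_ACL_PRIVATE,
));

$s3->putFile('/tmp/web.jpg', 'my-media-bucket/articles/123/web.jpg', array(
    Zend_Service_Amazon_S3::S3_ACL_HEADER => Zend_Service_Amazon_S3::S3_ACL_PUBLIC_READ,
));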
How do I keep track of all the assets? (Naming conventions? Create a database to assign them to an article?)

So you have a few options here. You can either cache everything to store a local state of your Bucket, or you can constantly check with the Amazon S3 API to find out what files you have. You will know which you need based on your application.
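
Both options in a rough sketch (getObjectsByBucket() is ZF1's listing call; the files table and its columns are shown further down):

// Option 1: ask S3 directly - simple, but slow once the bucket grows.
// $s3 is the Zend_Service_Amazon_S3 instance from the earlier sketch.
$keys = $s3->getObjectsByBucket('my-media-bucket');

// Option 2: ask your own database, which mirrors the bucket state.
$db   = Zend_Db_Table::getDefaultAdapter();
$rows = $db->fetchAll('SELECT keyPath, s3Synced FROM files WHERE s3Synced = 1');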

Take a situation that I have: an image is uploaded to our company's file manager, three thumbnails are automatically generated, and watermarks are also applied. This means that each image could generate at least 3 derived images, and up to hundreds (depending on how many different watermarks we need to apply).

In our situation, I uploaded 20k images to S3 last week and then imported them into our File Manager. I have to store a local representation of what we have in S3 because otherwise it takes too long to search and query the repository. I am also not interested in which watermark files and thumbnails we have, but I do need to make sure that they are generated. Storing this state locally means you can do all of that.

This is the schema for my files table (I also have another table, files_dimensions, which stores all of my dimensions).

CREATE TABLE `files` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `lft` int(11) NOT NULL,
  `rgt` int(11) NOT NULL,
  `name` varchar(64) NOT NULL,
  `lastModified` date DEFAULT NULL,
  `size` int(11) DEFAULT NULL,
  `keyPath` varchar(255) DEFAULT NULL,
  `root` int(11) DEFAULT NULL,
  `type` varchar(11) DEFAULT NULL,
  `mime` varchar(64) DEFAULT NULL,
  `extension` varchar(11) DEFAULT NULL,
  `s3Synced` tinyint(1) DEFAULT NULL,
  `transferInProgress` tinyint(1) DEFAULT NULL,
  `bytesTransfered` bigint(20) DEFAULT NULL,
  `transferTotalTime` double DEFAULT NULL,
  `transferAverageSpeed` bigint(20) DEFAULT NULL,
  `amazonAcl` varchar(255) DEFAULT NULL,
  `transferFailCount` smallint(6) DEFAULT NULL,
  `transferFailMessage` varchar(255) DEFAULT NULL,
  `owningProperty` bigint(20) DEFAULT NULL,
  `bucketId` bigint(20) DEFAULT NULL,
  `ownerId` bigint(20) DEFAULT NULL,
  `md5Name` varchar(32) DEFAULT NULL,
  `transferInitiated` date DEFAULT NULL,
  `rrs` tinyint(1) DEFAULT NULL,
  `etag` varchar(66) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `files_owningProperty_idx` (`owningProperty`),
  KEY `files_bucketId_idx` (`bucketId`),
  KEY `files_ownerId_idx` (`ownerId`),
  CONSTRAINT `files_ibfk_1` FOREIGN KEY (`owningProperty`) REFERENCES `entities` (`id`) ON DELETE CASCADE,
  CONSTRAINT `files_ibfk_3` FOREIGN KEY (`ownerId`) REFERENCES `acl_users` (`id`),
  CONSTRAINT `files_ibfk_4` FOREIGN KEY (`bucketId`) REFERENCES `aws_buckets` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
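
After a successful upload you would then record the Object in that table, roughly like this (only a subset of the columns; lft/rgt belong to the nested-set tree and get placeholder values here):

// Sketch: store the uploaded Object's metadata locally.
$db = Zend_Db_Table::getDefaultAdapter();
$db->insert('files', array(
    'lft'       => 0,
    'rgt'       => 0,
    'name'      => 'myAwesomeImage.jpg',
    'keyPath'   => 'some/path/to/file/myAwesomeImage.jpg',
    'mime'      => 'image/jpeg',
    'extension' => 'jpg',
    'size'      => filesize('/tmp/uploads/myAwesomeImage.jpg'),
    'amazonAcl' => Zend_Service_Amazon_S3::S3_ACL_PRIVATE,
    's3Synced'  => 1,
));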

How do I refer to them in Zend Framework? (Does it make sense to use a CDN like CloudFront? How do I create URLs?)

You would create a View Helper and then have something like $view->createUrl( $file ), where $file contains everything needed to construct your URL, i.e. your Object's path and its key.
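
A rough sketch of such a helper (the class name and CloudFront domain are placeholders, and $file is assumed to expose the keyPath from the schema above):

// Sketch of a ZF1 view helper; class name and CDN host are placeholders.
class My_View_Helper_CreateUrl extends Zend_View_Helper_Abstract
{
    const CDN_HOST = 'http://d1234example.cloudfront.net';

    public function createUrl($file)
    {
        // $file->keyPath is assumed to hold the S3 key of the released asset.
        return self::CDN_HOST . '/' . ltrim($file->keyPath, '/');
    }
}

Register the helper path with $view->addHelperPath('My/View/Helper', 'My_View_Helper') and you can call $this->createUrl($file) from your view scripts.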

Can I keep my code generic so that I can switch from local storage (at the beginning) to cloud storage as the project grows?

Not really. Zend_Cloud is not fully developed yet. The idea behind Zend_Cloud is that it will be interchangeable with any cloud storage adapter, but it isn't ready.
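
If you want that flexibility anyway, one option is to hide storage behind your own small interface and write one adapter per backend. This is just a sketch with made-up names, not Zend_Cloud:

// Made-up interface and class names; the point is only that the rest of
// the application never talks to the filesystem or S3 directly.
interface My_Media_Storage
{
    public function put($localPath, $key);
    public function getUrl($key);
}

// Adapter used at the beginning, storing files under public/media.
class My_Media_Storage_Local implements My_Media_Storage
{
    public function put($localPath, $key)
    {
        $target = APPLICATION_PATH . '/../public/media/' . $key;
        @mkdir(dirname($target), 0775, true);
        copy($localPath, $target);
    }

    public function getUrl($key)
    {
        return '/media/' . $key;
    }
}

// Later you add a My_Media_Storage_S3 adapter that wraps
// Zend_Service_Amazon_S3 with the same two methods and swap it in.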

How do I optimize my assets for different devices/screen sizes? Can I still have only one source?

I create different sizes of all of my Objects and then append the size to the key, like /123123123/large.jpg and /123123123/medium.jpg.

http://i.stack.imgur.com/AkT0B.jpg
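
Generating those sizes can be as simple as a GD loop at upload time; a sketch with example widths and an example key prefix ($s3 is the instance from the earlier sketches):

// Sketch: create a few fixed widths with GD and upload each one under a
// size-suffixed key. Widths, JPEG quality and key prefix are examples only.
$sizes  = array('large' => 1024, 'medium' => 640, 'small' => 320);
$source = imagecreatefromjpeg('/tmp/uploads/original.jpg');
$width  = imagesx($source);
$height = imagesy($source);

foreach ($sizes as $label => $maxWidth) {
    $newWidth  = min($maxWidth, $width);
    $newHeight = (int) round($height * $newWidth / $width);

    $resized = imagecreatetruecolor($newWidth, $newHeight);
    imagecopyresampled($resized, $source, 0, 0, 0, 0, $newWidth, $newHeight, $width, $height);

    $tmp = tempnam(sys_get_temp_dir(), 'img');
    imagejpeg($resized, $tmp, 85);
    imagedestroy($resized);

    $s3->putFile($tmp, "my-media-bucket/pictures/123123123/{$label}.jpg", array(
        Zend_Service_Amazon_S3::S3_ACL_HEADER => Zend_Service_Amazon_S3::S3_ACL_PUBLIC_READ,
    ));
    unlink($tmp);
}
imagedestroy($source);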
