Large Scale Image Classifier

I have a large set of plant images labeled with their botanical names. What would be the best algorithm to train on this dataset in order to classify an unlabeled photo? The photos are processed so that 100% of the pixels contain the plant (e.g. closeups of the leaves or bark), so there are no other objects, empty space, or background that the algorithm would have to filter out.

I've already tried generating SIFT features for all the photos and feeding these (feature,label) pairs to a LibLinear SVM, but the accuracy was a miserable 6%.

I also tried feeding this same data to a few Weka classifiers. The accuracy was a little better (25% with Logistic, 18% with IBk), but Weka is not designed for scalability (it loads everything into memory). Since the SIFT feature dataset has several million rows, I could only test Weka with a random 3% slice, so the result is probably not representative.

EDIT: Some sample images:

[Two sample images, not reproduced here]


Normally, you would not train on the SIFT features directly. Cluster them (using k-means) and then train on the histogram of cluster membership identifiers (i.e., a k-dimensional vector, which counts, at position i, how many features were assigned to the i-th cluster).

This way, you obtain a single output per image (and a single, k-dimensional, feature vector).

Here's the quasi-code (using mahotas and milk in Python):

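import numpy as np
# Older mahotas releases exposed this module as mahotas.surf
from mahotas.features.surf import surf
from milk.unsupervised.kmeans import kmeans, assign_centroids
import milk

# First load your data:
images = ...
labels = ...

# SURF descriptors per image: 6 octaves, 4 scales, initial step size 2
local_features = [surf(im, 6, 4, 2) for im in images]
# Pool the descriptors of all images and cluster them into 100 visual words
allfeatures = np.concatenate(local_features)
_, centroids = kmeans(allfeatures, k=100)
# One histogram of cluster memberships per image
histograms = []
for ls in local_features:
    hist = assign_centroids(ls, centroids, histogram=True)
    histograms.append(hist)

# Cross-validate a classifier on the per-image histograms
cmatrix, _ = milk.nfoldcrossvalidation(histograms, labels)
print("Accuracy: %.1f%%" % (100.0 * cmatrix.trace() / cmatrix.sum()))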
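To make the histogram step concrete, here is a minimal library-free sketch of the counting that `assign_centroids(..., histogram=True)` is assumed to perform. The helper names `nearest_centroid` and `bow_histogram` and the toy data are hypothetical, and this is not milk's actual implementation.

```python
def nearest_centroid(descriptor, centroids):
    """Return the index of the centroid closest to `descriptor` (squared Euclidean)."""
    def sq_dist(c):
        return sum((d - x) ** 2 for d, x in zip(descriptor, c))
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i]))


def bow_histogram(descriptors, centroids):
    """Count, at position i, how many descriptors fall into cluster i."""
    hist = [0] * len(centroids)
    for d in descriptors:
        hist[nearest_centroid(d, centroids)] += 1
    return hist


# Hypothetical toy data: 3 cluster centres and the 5 two-dimensional
# descriptors of one image (real SURF/SIFT descriptors have 64 or 128 dimensions).
centroids = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
descriptors = [(1.0, 1.0), (9.0, 1.0), (0.5, 0.2), (1.0, 9.0), (8.0, -1.0)]

hist = bow_histogram(descriptors, centroids)
print(hist)  # [2, 2, 1]
```

However many descriptors an image produces, it ends up as one vector of length k, so the classifier sees one row per image rather than one row per descriptor.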


This is a fairly hard problem.

You can give the bag-of-words (BoW) model a try.

Basically, you extract SIFT features from all the images, then use k-means to cluster the features into visual words. After that, you use the per-image BoW vectors to train your classifiers.

See the Wikipedia article on the BoW model and the papers it references for more details.
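The clustering step can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the descriptors are synthetic 2-D points rather than real SIFT vectors, `fake_image` is a hypothetical helper, and the k-means here is a bare Lloyd's loop, not a scalable implementation for millions of descriptors.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns k centroids (the visual words)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k data points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster went empty
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids


def bow_vector(descriptors, centroids):
    """Normalised histogram of visual-word counts for one image."""
    counts = [0] * len(centroids)
    for d in descriptors:
        counts[nearest(d, centroids)] += 1
    total = float(sum(counts)) or 1.0
    return [c / total for c in counts]


# Hypothetical toy "images": each one is a list of 2-D descriptors.
rng = random.Random(1)


def fake_image(cx, cy, n=30):
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)) for _ in range(n)]


images = [fake_image(0, 0), fake_image(5, 5), fake_image(0, 0) + fake_image(5, 5)]
all_descriptors = [d for img in images for d in img]

vocabulary = kmeans(all_descriptors, k=2)
vectors = [bow_vector(img, vocabulary) for img in images]
for v in vectors:
    print([round(x, 2) for x in v])
```

Each entry of `vectors` is the feature vector for one image, and those vectors together with the image labels are what the classifier is trained on.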


You probably need better alignment, and probably not more features. There is no way you can get acceptable performance unless you have correspondences. You need to know what points in one leaf correspond to points on another leaf. This is one of the "holy grail" problems in computer vision.

People have used shape context for this problem. You should probably look at this link. This paper describes the basic system behind leafsnap.


You can implement the BoW model by following the tutorial "Bag-of-Features Descriptor on SIFT Features with OpenCV". It is a very good guide to implementing the BoW model in OpenCV.
