How to optimize an image processing class
I have the following class that processes a Bitmap to place a fisheye distortion on it.
I've run my app through TraceView and found that virtually all the processing time is spent looping through the bitmap.
One developer has suggested not using float, as this will slow things down where graphics are concerned. Are Math.pow() and ceil() also unnecessary? At the moment, applying the effect by looping through the entire bitmap takes around 42 seconds (yes, seconds :)). I've tried replacing the floats with ints, which reduced the time to 37 seconds, but the effect is no longer present on the bitmap. The argument k is originally a float and sets the level of distortion, e.g. 0.0002F; if I pass an int, the effect doesn't work.
Can anyone point me in the right direction on how to optimize this process? Once I've optimized it, I'd like to look into not looping through the entire bitmap, perhaps by putting a bounding box around the effect or by using the check below that determines whether a pixel is within the circle with radius 150.
class Filters{
float xscale;
float yscale;
float xshift;
float yshift;
int [] s;
private String TAG = "Filters";
long getRadXStart = 0;
long getRadXEnd = 0;
long startSample = 0;
long endSample = 0;
public Filters(){
Log.e(TAG, "***********inside filter constructor");
}
public Bitmap barrel (Bitmap input, float k){
//Log.e(TAG, "***********INSIDE BARREL METHOD ");
float centerX=input.getWidth()/2; //center of distortion
float centerY=input.getHeight()/2;
int width = input.getWidth(); //image bounds
int height = input.getHeight();
Bitmap dst = Bitmap.createBitmap(width, height,input.getConfig() ); //output pic
// Log.e(TAG, "***********dst bitmap created ");
xshift = calc_shift(0,centerX-1,centerX,k);
float newcenterX = width-centerX;
float xshift_2 = calc_shift(0,newcenterX-1,newcenterX,k);
yshift = calc_shift(0,centerY-1,centerY,k);
float newcenterY = height-centerY;
float yshift_2 = calc_shift(0,newcenterY-1,newcenterY,k);
xscale = (width-xshift-xshift_2)/width;
// Log.e(TAG, "***********xscale ="+xscale);
yscale = (height-yshift-yshift_2)/height;
// Log.e(TAG, "***********yscale ="+yscale);
// Log.e(TAG, "***********filter.barrel() about to loop through bm");
/*for(int j=0;j<dst.getHeight();j++){
for(int i=0;i<dst.getWidth();i++){
float x = getRadialX((float)i,(float)j,centerX,centerY,k);
float y = getRadialY((float)i,(float)j,centerX,centerY,k);
sampleImage(input,x,y);
int color = ((s[1]&0x0ff)<<16)|((s[2]&0x0ff)<<8)|(s[3]&0x0ff);
// System.out.print(i+" "+j+" \\");
dst.setPixel(i, j, color);
}
}*/
int origPixel;
long startLoop = System.currentTimeMillis();
for(int j=0;j<dst.getHeight();j++){
for(int i=0;i<dst.getWidth();i++){
origPixel= input.getPixel(i,j);
getRadXStart = System.currentTimeMillis();
float x = getRadialX((float)j,(float)i,centerX,centerY,k);
getRadXEnd= System.currentTimeMillis();
float y = getRadialY((float)j,(float)i,centerX,centerY,k);
sampleImage(input,x,y);
int color = ((s[1]&0x0ff)<<16)|((s[2]&0x0ff)<<8)|(s[3]&0x0ff);
// System.out.print(i+" "+j+" \\");
if( Math.sqrt( Math.pow(i - centerX, 2) + ( Math.pow(j - centerY, 2) ) ) <= 150 ){
dst.setPixel(i, j, color);
}else{
dst.setPixel(i,j,origPixel);
}
}
}
long endLoop = System.currentTimeMillis();
long loopDuration = endLoop - startLoop;
long radXDuration = getRadXEnd - getRadXStart;
long sampleDur = endSample - startSample;
Log.e(TAG, "sample method took "+sampleDur+"ms");
Log.e(TAG, "getRadialX took "+radXDuration+"ms");
Log.e(TAG, "loop took "+loopDuration+"ms");
// Log.e(TAG, "***********filter.barrel() looped through bm about to return dst bm");
return dst;
}
void sampleImage(Bitmap arr, float idx0, float idx1)
{
startSample = System.currentTimeMillis();
s = new int [4];
if(idx0<0 || idx1<0 || idx0>(arr.getHeight()-1) || idx1>(arr.getWidth()-1)){
s[0]=0;
s[1]=0;
s[2]=0;
s[3]=0;
return;
}
float idx0_fl=(float) Math.floor(idx0);
float idx0_cl=(float) Math.ceil(idx0);
float idx1_fl=(float) Math.floor(idx1);
float idx1_cl=(float) Math.ceil(idx1);
int [] s1 = getARGB(arr,(int)idx0_fl,(int)idx1_fl);
int [] s2 = getARGB(arr,(int)idx0_fl,(int)idx1_cl);
int [] s3 = getARGB(arr,(int)idx0_cl,(int)idx1_cl);
int [] s4 = getARGB(arr,(int)idx0_cl,(int)idx1_fl);
float x = idx0 - idx0_fl;
float y = idx1 - idx1_fl;
s[0]= (int) (s1[0]*(1-x)*(1-y) + s2[0]*(1-x)*y + s3[0]*x*y + s4[0]*x*(1-y));
s[1]= (int) (s1[1]*(1-x)*(1-y) + s2[1]*(1-x)*y + s3[1]*x*y + s4[1]*x*(1-y));
s[2]= (int) (s1[2]*(1-x)*(1-y) + s2[2]*(1-x)*y + s3[2]*x*y + s4[2]*x*(1-y));
s[3]= (int) (s1[3]*(1-x)*(1-y) + s2[3]*(1-x)*y + s3[3]*x*y + s4[3]*x*(1-y));
endSample = System.currentTimeMillis();
}
int [] getARGB(Bitmap buf,int x, int y){
int rgb = buf.getPixel(y, x); // Returns by default ARGB.
int [] scalar = new int[4];
scalar[0] = (rgb >>> 24) & 0xFF;
scalar[1] = (rgb >>> 16) & 0xFF;
scalar[2] = (rgb >>> 8) & 0xFF;
scalar[3] = (rgb >>> 0) & 0xFF;
return scalar;
}
float getRadialX(float x,float y,float cx,float cy,float k){
x = (x*xscale+xshift);
y = (y*yscale+yshift);
float res = x+((x-cx)*k*((x-cx)*(x-cx)+(y-cy)*(y-cy)));
return res;
}
float getRadialY(float x,float y,float cx,float cy,float k){
x = (x*xscale+xshift);
y = (y*yscale+yshift);
float res = y+((y-cy)*k*((x-cx)*(x-cx)+(y-cy)*(y-cy)));
return res;
}
float thresh = 1;
float calc_shift(float x1,float x2,float cx,float k){
float x3 = (float)(x1+(x2-x1)*0.5);
float res1 = x1+((x1-cx)*k*((x1-cx)*(x1-cx)));
float res3 = x3+((x3-cx)*k*((x3-cx)*(x3-cx)));
if(res1>-thresh && res1 < thresh)
return x1;
if(res3<0){
return calc_shift(x3,x2,cx,k);
}
else{
return calc_shift(x1,x3,cx,k);
}
}
}// end of filters class
[update] I've created the arrays as instance variables and instantiated them in the Filters() constructor. Is this what you meant? The app was running at 84 secs (mistake), but now runs at 69 secs. There seems to be no GC activity logged either.
class Filters{
private float xscale;
private float yscale;
private float xshift;
private float yshift;
private int [] s;
private int [] scalar;
private int [] s1;
private int [] s2;
private int [] s3;
private int [] s4;
private String TAG = "Filters";
long getRadXStart = 0;
long getRadXEnd = 0;
long startSample = 0;
long endSample = 0;
public Filters(){
Log.e(TAG, "***********inside filter constructor");
s = new int[4];
scalar = new int[4];
s1 = new int[4];
s2 = new int[4];
s3 = new int[4];
s4 = new int[4];
}
public Bitmap barrel (Bitmap input, float k){
//Log.e(TAG, "***********INSIDE BARREL METHOD ");
Debug.startMethodTracing("barrel");
float centerX=input.getWidth()/2; //center of distortion
float centerY=input.getHeight()/2;
int width = input.getWidth(); //image bounds
int height = input.getHeight();
Bitmap dst = Bitmap.createBitmap(width, height,input.getConfig() ); //output pic
// Log.e(TAG, "***********dst bitmap created ");
xshift = calc_shift(0,centerX-1,centerX,k);
float newcenterX = width-centerX;
float xshift_2 = calc_shift(0,newcenterX-1,newcenterX,k);
yshift = calc_shift(0,centerY-1,centerY,k);
float newcenterY = height-centerY;
float yshift_2 = calc_shift(0,newcenterY-1,newcenterY,k);
xscale = (width-xshift-xshift_2)/width;
// Log.e(TAG, "***********xscale ="+xscale);
yscale = (height-yshift-yshift_2)/height;
// Log.e(TAG, "***********yscale ="+yscale);
// Log.e(TAG, "***********filter.barrel() about to loop through bm");
/*for(int j=0;j<dst.getHeight();j++){
for(int i=0;i<dst.getWidth();i++){
float x = getRadialX((float)i,(float)j,centerX,centerY,k);
float y = getRadialY((float)i,(float)j,centerX,centerY,k);
sampleImage(input,x,y);
int color = ((s[1]&0x0ff)<<16)|((s[2]&0x0ff)<<8)|(s[3]&0x0ff);
// System.out.print(i+" "+j+" \\");
dst.setPixel(i, j, color);
}
}*/
int origPixel;
long startLoop = System.currentTimeMillis();
for(int j=0;j<dst.getHeight();j++){
for(int i=0;i<dst.getWidth();i++){
origPixel= input.getPixel(i,j);
getRadXStart = System.currentTimeMillis();
float x = getRadialX((float)j,(float)i,centerX,centerY,k);
getRadXEnd= System.currentTimeMillis();
float y = getRadialY((float)j,(float)i,centerX,centerY,k);
sampleImage(input,x,y);
int color = ((s[1]&0x0ff)<<16)|((s[2]&0x0ff)<<8)|(s[3]&0x0ff);
// System.out.print(i+" "+j+" \\");
if( Math.sqrt( Math.pow(i - centerX, 2) + ( Math.pow(j - centerY, 2) ) ) <= 150 ){
dst.setPixel(i, j, color);
}else{
dst.setPixel(i,j,origPixel);
}
}
}
long endLoop = System.currentTimeMillis();
long loopDuration = endLoop - startLoop;
long radXDuration = getRadXEnd - getRadXStart;
long sampleDur = endSample - startSample;
Log.e(TAG, "sample method took "+sampleDur+"ms");
Log.e(TAG, "getRadialX took "+radXDuration+"ms");
Log.e(TAG, "loop took "+loopDuration+"ms");
// Log.e(TAG, "***********filter.barrel() looped through bm about to return dst bm");
Debug.stopMethodTracing();
return dst;
}
void sampleImage(Bitmap arr, float idx0, float idx1)
{
startSample = System.currentTimeMillis();
// s = new int [4];
if(idx0<0 || idx1<0 || idx0>(arr.getHeight()-1) || idx1>(arr.getWidth()-1)){
s[0]=0;
s[1]=0;
s[2]=0;
s[3]=0;
return;
}
float idx0_fl=(float) Math.floor(idx0);
float idx0_cl=(float) Math.ceil(idx0);
float idx1_fl=(float) Math.floor(idx1);
float idx1_cl=(float) Math.ceil(idx1);
/* int [] s1 = getARGB(arr,(int)idx0_fl,(int)idx1_fl);
int [] s2 = getARGB(arr,(int)idx0_fl,(int)idx1_cl);
int [] s3 = getARGB(arr,(int)idx0_cl,(int)idx1_cl);
int [] s4 = getARGB(arr,(int)idx0_cl,(int)idx1_fl);*/
s1 = getARGB(arr,(int)idx0_fl,(int)idx1_fl);
s2 = getARGB(arr,(int)idx0_fl,(int)idx1_cl);
s3 = getARGB(arr,(int)idx0_cl,(int)idx1_cl);
s4 = getARGB(arr,(int)idx0_cl,(int)idx1_fl);
float x = idx0 - idx0_fl;
float y = idx1 - idx1_fl;
s[0]= (int) (s1[0]*(1-x)*(1-y) + s2[0]*(1-x)*y + s3[0]*x*y + s4[0]*x*(1-y));
s[1]= (int) (s1[1]*(1-x)*(1-y) + s2[1]*(1-x)*y + s3[1]*x*y + s4[1]*x*(1-y));
s[2]= (int) (s1[2]*(1-x)*(1-y) + s2[2]*(1-x)*y + s3[2]*x*y + s4[2]*x*(1-y));
s[3]= (int) (s1[3]*(1-x)*(1-y) + s2[3]*(1-x)*y + s3[3]*x*y + s4[3]*x*(1-y));
endSample = System.currentTimeMillis();
}
int [] getARGB(Bitmap buf,int x, int y){
int rgb = buf.getPixel(y, x); // Returns by default ARGB.
// int [] scalar = new int[4];
scalar[0] = (rgb >>> 24) & 0xFF;
scalar[1] = (rgb >>> 16) & 0xFF;
scalar[2] = (rgb >>> 8) & 0xFF;
scalar[3] = (rgb >>> 0) & 0xFF;
return scalar;
}
float getRadialX(float x,float y,float cx,float cy,float k){
x = (x*xscale+xshift);
y = (y*yscale+yshift);
float res = x+((x-cx)*k*((x-cx)*(x-cx)+(y-cy)*(y-cy)));
return res;
}
float getRadialY(float x,float y,float cx,float cy,float k){
x = (x*xscale+xshift);
y = (y*yscale+yshift);
float res = y+((y-cy)*k*((x-cx)*(x-cx)+(y-cy)*(y-cy)));
return res;
}
float thresh = 1;
float calc_shift(float x1,float x2,float cx,float k){
float x3 = (float)(x1+(x2-x1)*0.5);
float res1 = x1+((x1-cx)*k*((x1-cx)*(x1-cx)));
float res3 = x3+((x3-cx)*k*((x3-cx)*(x3-cx)));
if(res1>-thresh && res1 < thresh)
return x1;
if(res3<0){
return calc_shift(x3,x2,cx,k);
}
else{
return calc_shift(x1,x3,cx,k);
}
}
}// end of filters class
First off - measure some parts of your function and see where the bottlenecks are. Don't try and optimise by guesswork.
Having said that, I'll now attempt said task :)
Doing a sqrt() per pixel is pretty expensive - you're comparing to a constant, so instead, square the constant and compare the squared value with that:
if( ( Math.pow(i - centerX, 2) + ( Math.pow(j - centerY, 2) ) ) <= 150*150 ){
Also, using pow(x,2) to square something is probably calling the library function for pow(), converting your floats to doubles, doing a general-purpose power-raising algorithm and converting back to floats. Just use x*x instead.
if(((i-centerX)*(i-centerX) + (j-centerY)*(j-centerY)) <= 150*150){
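To make the squared-distance idea easier to reuse, here is a minimal sketch of it as a helper. The class and method names (CircleTest, insideCircle) are made up for illustration and are not part of the original Filters class.

```java
// Hypothetical helper, not from the original code: tests whether a pixel
// lies inside a circle without calling Math.sqrt() or Math.pow().
final class CircleTest {
    private CircleTest() {}

    /** Returns true when (x, y) is within `radius` pixels of the center (cx, cy). */
    static boolean insideCircle(int x, int y, float cx, float cy, float radius) {
        float dx = x - cx;
        float dy = y - cy;
        // Compare squared distance against squared radius; both sides are
        // non-negative, so the result matches the sqrt-based comparison.
        return dx * dx + dy * dy <= radius * radius;
    }
}
```

In the loop from the question, the condition would then read something like insideCircle(i, j, centerX, centerY, 150).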
From what I see, your code is doing the following:
for (every pixel in bitmap){
getPixel();
...do something to pixel...
setPixel();
}
The getPixel() and setPixel() function calls are relatively expensive. Instead of calling them over and over again in your loop, you could try to get all of the pixels into an array using getPixels(), and then access each pixel through the array. Refer to this answer.
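As a rough sketch of the array approach: the loop below works on a plain row-major int[] of packed ARGB values, which is the layout Bitmap.getPixels() fills. The class name PixelBuffer and the invert operation are placeholders for illustration; the invert simply stands in for whatever per-pixel work the real filter does.

```java
// Hypothetical example: process pixels through an int[] instead of calling
// getPixel()/setPixel() once per pixel. On Android the buffer could be filled with
//   input.getPixels(pixels, 0, width, 0, 0, width, height);
// and written back with
//   dst.setPixels(pixels, 0, width, 0, 0, width, height);
final class PixelBuffer {
    private PixelBuffer() {}

    /** Inverts the RGB channels of every pixel in place, leaving alpha untouched. */
    static void invertRgb(int[] pixels, int width, int height) {
        for (int y = 0; y < height; y++) {
            int row = y * width;              // index of the first pixel in this row
            for (int x = 0; x < width; x++) {
                int argb = pixels[row + x];   // plain array read, no method call
                pixels[row + x] = (argb & 0xFF000000) | (~argb & 0x00FFFFFF);
            }
        }
    }
}
```

The pixel at column x and row y lives at index y * width + x, so the distortion code can read its source samples from the same array rather than going back to the Bitmap.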
If that is still not sufficient, try coding the above in C++ through the NDK.
One thing you might try is avoid creating/recreating the int arrays in 'sampleImage' and 'getARGB' by instantiating them once outside the two nested loops and passing them into those methods. This wouldn't be best practice from a code maintainability standpoint. However, it would avoid the repeated object creation, array initialization and garbage collection overhead. These tend to be much costlier than the arithmetic operations in the remainder of the code.
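One way to sidestep the arrays altogether is to keep each sample as a packed int and blend channel by channel. The sketch below is only an illustration under that assumption: BilinearSampler, sample and blend are invented names, and it reads from a row-major ARGB int[] rather than from a Bitmap.

```java
// Hypothetical allocation-free bilinear sampler over a row-major ARGB int[].
final class BilinearSampler {
    private BilinearSampler() {}

    /** Blends one 8-bit channel (selected by `shift`) of the four neighbouring pixels. */
    private static int blend(int p00, int p10, int p01, int p11,
                             int shift, float fx, float fy) {
        float top = ((p00 >>> shift) & 0xFF) * (1 - fx) + ((p10 >>> shift) & 0xFF) * fx;
        float bottom = ((p01 >>> shift) & 0xFF) * (1 - fx) + ((p11 >>> shift) & 0xFF) * fx;
        return (int) (top * (1 - fy) + bottom * fy);
    }

    /** Returns the interpolated ARGB value at (x, y), or 0 when outside the image. */
    static int sample(int[] pixels, int width, int height, float x, float y) {
        if (x < 0 || y < 0 || x > width - 1 || y > height - 1) {
            return 0; // same convention as the original code: all channels zero
        }
        int x0 = (int) x;                       // floor, since x >= 0 here
        int y0 = (int) y;
        int x1 = Math.min(x0 + 1, width - 1);   // clamp the right/bottom neighbours
        int y1 = Math.min(y0 + 1, height - 1);
        float fx = x - x0;
        float fy = y - y0;
        int p00 = pixels[y0 * width + x0];
        int p10 = pixels[y0 * width + x1];
        int p01 = pixels[y1 * width + x0];
        int p11 = pixels[y1 * width + x1];
        return (blend(p00, p10, p01, p11, 24, fx, fy) << 24)
             | (blend(p00, p10, p01, p11, 16, fx, fy) << 16)
             | (blend(p00, p10, p01, p11, 8, fx, fy) << 8)
             |  blend(p00, p10, p01, p11, 0, fx, fy);
    }
}
```

A caveat if you go the reuse route instead: when a helper returns the same shared array on every call, all four samples end up pointing at one array and hold only the last pixel read, so each sample needs its own storage.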