Actionscript optimisation confusion
EDIT: Ah ha! The solution to my problem was simple afterall!
I was so focused on why it was taking so long that I was blinded by the fact that my object pool allocate()
function is lazily evaluated. i.e. It won't allocate until it's called. But my timer test was around alloc
, so obviously the first time through the timer timed the pool creation! DOH! So I've updated by object pool constructor to give you the option (I've updated it below too). Now mine is the fasest in the result table (as expected):
- pushing 1000000 array: t=599
- my object pool 1000000 array: t=157
- polygon op 1000000 array: t=212
The upside of all of this, is that it's been a deep-end jump into Actionscript. I've learned heaps about how it works under the hood, and even tried coding using Haxe (which is excellent by the way). The main takeaway for me, is that if you want to get fast array access, you need to allocate all of your array elements in a nice continuous block from 0 to count. So... after all of that, it was a simple bug, but this thread is still very interesting for the research, which is all valid still. :) Right... back to work, and thanks for the help anyway :)**
EDIT: I've had some responses informing me that arrays in Actionscript aren't as straightforward as C/C++ arrays, so I've been reading up on how they work. I'm still confused though, because what I am reading is the array tries to optimize the memory usage, particularly for sparse arrays. It seems to have two modes: Direct Access and a Hash. If array elements are created in order, one after the other, then it should be a Direct Access structure. This is confusing in my case, because I DO create all elements up front, in order, but then random access them later. Is this still slow? If so, it blows my mind. :|. Also, I can't use the 'vector' type, because I may have to target non-Flash 10 installs.
I'm going out of my mind here (sorry for the long post). I'm porting over a chunk of my C++ code to Actionscript 3 (I'm new to AS, but tonnes of C++ experience), and I've run into some strangeness. I'm using Flex 3 (but deriving Sprite, so not using any Flex classes), and I'm using the Flex IDE from Adobe.
In writing my linked list class, I decided to write an object pool class, to avoid new()ing every list node. I would normally use a freelist (in C++), but I can simplify it in AS because I can't cast memory to the object type, so I just new()
all of the classes in a free array up front, then in batches as needed. The idea is that you alloc()
from the free list, and it just takes the first object off the list, which has already been allocated, and decrements the free count.
Anyway, it all worked, but just for sanity sake, I created 1000000 objects by calling new()
then 1000000 objects calling my freelist.alloc()
, and lo and behold, mine is a LOT slower! I just can't believe it. How can copying a reference from an array be slower than new()
? Here is my object pool class:
public class lkObjectPool
{
public static const DEFAULT_REALLOC_COUNT:int=100;
public function lkObjectPool(classType:Class, batchSize:int=DEFAULT_REALLOC_COUNT, preallocate:Boolean=true)
{
if (!batchSize) {
batchSize=1;
}
_batchSize=batchSize;
_classType=classType;
if (preallocate) {
addBatch();
}
}
private function addBatch():void
{
var i:int;
for (i=0;i<_batchSize;++i) {
_freelist[_freelistCount++]=new _classType();
}
}
public function allocate():*
{
if (!_freelistCount) {
addBatch();
}
return _freelist[--_freelistCount];
}
public function free(object:*):void
{
_freelist[_freelistCount++]=object;
}
private var _batchSize:int;
private var _classType:Class;
private var _freelistCount:int;
private var _freelist:Array=[];
}
I was getting MUCH worse than simply calling new()
1000000 times. Anyway, I was going out of my mind, so I hunted around to see if anyone else had written an object pool, and came across this one:
http://lab.polygonal.de/2008/06/18/using-object-pools/
So, I tried their object pool out, and it was faster than the new()
and mine. Here are the tests:
var st:int;
var dt:int;
var i:int;
var list1:Array=new Array(1000000);
var list2:Array=new Array(1000000);
var list3:Array=new Array(1000000);
trace("-- pushing 1000000 array");
st=getTimer();
for (i=0;i<1000000;++i) {
list1[i]=new MyClass();
}
dt=getTimer()-st;
trace("t="+dt);
var objectPool:lkObjectPool=new lkObjectPool(MyClass, 1000000);
trace("-- my object pool 1000000 array");
st=getTimer();
for (i=0;i<1000000;++i) {
list2[i]=objectPool.allocate();
}
dt=getTimer()-st;
trace("t="+dt);
var pool:ObjectPool = new ObjectPool(false);
pool.allocate(1000000, MyClass);
pool.initialze("init", []);
trace("-- polygon po 1000000 array");
st=getTimer();
for (i=0;i<1000000;++i) {
list3[i]=pool.object;
}
dt=getTimer()-st;
trace("t="+dt);
Here are the results (new, mine, theirs - in milliseconds):
- pushing 1000000 array: t=455
- my object pool 1000000 array: t=552
- polygon po 1000000 array: t=210
So I checked through the code and it all comes down to how they get their next free node, and how I get my next free node (I've snipped them both down to the pure alloc
case):
Me:
public function allocate():*
{
return _freelist[--_freelistCount];
}
Them:
public function get object():*
{
var o:* = _allocNode.data;
_allocNode.data = null;
_allocNode = _allocNode.next;
_usageCount++;
return o;
}
So it looks like the only difference is 开发者_StackOverflow中文版I use an array (with my own cached index, not array.count
) and they do a link list removal.
So FINALLY my question is: How can an array index POSSIBLY be slower than a dereferencing an object, or even calling new()
explicitly.
I know it's a bit of a long question, but if anyone who could shed some light on this, it would be much appreciated!
AS3 doesn't have real arrays; the Array class actually uses an associative array (almost just a plain object) underneath.
Yes, it's horrible.
Apparently Flex 4 will feature real arrays.
- As said before accessing an Array by an index as you will do in C in not the same, it s not a direct access to the data the virtual machine will do some check before giving you the data you want. I suggest you read this post talking about implementation of Array structure within Tamarin (Adobe virtual machine used in Flash)
- To simplify , the second implementation use a Class for the Linked List which mean that the compiler (and then the
VM
) know exactly when you call a field (data
,next
, ...) where and how it can found them. It has less work to do, less overhead and so it will be faster.
If you want to use faster Array, use a typed Vector instead when available.
Avoid to use an Object to store field make a class, etc..
You can check for example Joa Ebert paper on As3
optimisation
Edit:
Let's see the ABC bytecode
produce for an Array access compare to a class property:
Compile this class:
package
{
import flash.display.Sprite;
public class TestArray extends Sprite
{
private var _array:Array=[1];
private var _ll:LinkList=new LinkList(1);
private function test():void {
var d1:*= _array[0];
var d2:*= _ll.data;
}
public function TestArray()
{
test();
}
}
}
class LinkList {
public var data:Object;
public var next:LinkList;
public function LinkList(data:Object) {
this.data = data;
}
}
and look at the bytecode
for test() function
(AVM2 overview):
1. GetLocal0
2. PushScope
3. GetLocal0
4. GetProperty QName(PrivateNamespace(""), "_array")
5. PushByte 0x0
6. GetProperty MultinameL(
NamespaceSet(
PrivateNamespace(""),
PrivateNamespace(""),
PackageNamespace(""),
PackageInternalNamespace(""),
Namespace(http://adobe.com/AS3/2006/builtin),
ProtectedNamespace("TestArray"), StaticProtectedNamespace("TestArray"),
StaticProtectedNamespace("flash.display:Sprite"),
StaticProtectedNamespace("flash.display:DisplayObjectContainer"),
StaticProtectedNamespace("flash.display:InteractiveObject"),
StaticProtectedNamespace("flash.display:DisplayObject"),
StaticProtectedNamespace("flash.events:EventDispatcher"),
StaticProtectedNamespace("Object")
))
7. CoerceAny
8. SetLocal1
9. GetLocal0
10. GetProperty QName(PrivateNamespace(""), "_ll")
11. GetProperty QName(PackageNamespace(""), data")
12. CoerceAny
13. SetLocal2
14. ReturnVoid
So accessing the indexed element from the Array result in the long lookup GetProperty
at line 6. which is not as free lookup as the single lookup line 11. for the data field access
You're trying to think for AS3. AS3 thinks for itself, especially when it comes to Arrays since AS3 arrays are not actual arrays.
One thing I see in your code that could be making a difference is that you are not typing your iterator counts. I think that AS3 will default any untyped numeric value to Number, which is way way way way slower than int or uint when it comes to array index access.
Try explicitly declaring your iterators as uint and see how the timings come out.
the As3 array is a mix between a dense vector and a Hash Table. All index between 0 and the first undefined index. I explain it all at: http://jpauclair.wordpress.com/2009/12/02/tamarin-part-i-as3-array/
Oh dear, I found the problem. See the question for my final edit explaining what happened. Still, this thread is interesting because it discusses some cool optimisation information.
精彩评论