开发者

How noticeable is the difference of performance among TList, TObjectList, and plain array, if it could be estimated?

*Summarization:

Please check the knowledgeable comments from the Delphi experts. Specifically for me, I would try to use old TList/TObjectList as David suggested, and use hard-cast and TObjectList.List property as A.Bouchez suggested. I will try TDynArray when refactoring in future.

=====================================================================

Say that I have a TAtom class as defined in the following code. There are about hundreds up to thousands of TAtom instances at run time, stored in a dynamic array for now. At run time, I need to do simple float math on TAtom.X/Y/Z of all the existing TAtom instances more than 30 times per second.

Now, I need to add the ability of adding, inserting, deleting of TAtom instances at run time. It seems that my choices are (1) request a big array; (2开发者_如何学JAVA) stick to dynamic array and manually SetLength; (3) switch to regular TList; (4) switch to regular TObjectList.

I want to avoid (1) unless it is necessary, because I then have to change quite a lot function signatures. (2) looks not good either, because TList/TObjectList seems born for this task. However, because type-casting is needed using the regular TList/TObjectList, could some one comment on the possible performance hit? I mean, it would be best if the performance burden could be estimated before I rewrites the code. If the performance will drop noticeably, is there other technics that I could use?

Furthermore, I am wondering if there is performance difference between using TList and TObjectList?

  TAtom = class
  public
    ElementZ: Integer;
    X, Y, Z: Extended;  
    other variables: other types;
  end;

  TAAtom = array of TAtom;


May I add another choice to your list?

If you don't use any inheritance feature for the data in TAtom, you could use a record instead of a class. Each class instance will require to be allocated in memory, filled with zero and initialized individually. Getmem/Freemem always cost, and memory fragmentation will increase.

A pre-allocated dynamic array of record will be faster than individual class instances for adding. And the data will fit better for CPU L1/L2 cache.

For inserting and deleting, an array of such records will be slower than TList if you have a huge number of items, because there'll be more data to delete/insert (TList/TObjectList both maintain just a list of pointers). For even faster insertion/deletion, you should better use a linked list.

There is some overhead in the TList/TObjectList mechanism because of internal notification. mechanism And the GetItem() property could be a bit slower (because of range checking) than using directly a dynamic array.

But with our TDynArray wrapper, you could stick to a dynamic array, and still have good performance, pre-allocation features, and TList-like methods. And even more methods available, like SaveToStream, Slice, Reverse, sorting with external indexes and such...

type
  TAtom = record // could be 'packed record' to save memory (but loose perf)
    ElementZ: Integer;
    X, Y, Z: Extended;  
    other variables: other types;
    // TDynArray also handle complex (e.g. string) types here
  end;
  TAtoms = array of TAtom;

var Atom: TAtom;
    AtomArray: TAtoms;
    AtomCount: integer;
    Atoms: TDynArray;
begin
  Atoms.Init(TypeInfo(TAtoms),AtomArray,@AtomCount);
  Atoms.Capacity := 10000; // pre-allocate array = same as SetLength(AtomArray,10000)
  for i := 1 to 10000 do begin
    A.ElementZ := Random(1000);
    A.X := Random;
    A.Y := Ramdom;
    A.Z := Random;
    // set other fields
    Atoms.Add(A); // fast adding of A properties
  end;
  // you have TList-like methods for your dynamic array
  Atoms.Delete(500); // delete 500th item
  A.ElementZ := 5000;
  Atoms.Insert(500,A); // insert A values at 500th index
  assert(Atoms.Count=10000);
  assert(AtomCount=10000); // same as Atoms.Count
  Atoms.Compare := SortDynArrayInteger;
  Atoms.Sort; // will sort by 1st Integer value = ElementZ
  for i := 1 to Atoms.Count-1 do // or AtomCount-1
    // you have still direct access to AtomArray[]
    // -> this is even the fastest access to the data
    assert(AtomArray[i].ElementZ >=AtomArray[i-1].ElementZ )
  Atoms.SaveToStream(aStream); // will also save any string content
  Atoms.Reverse; // reverse all items order
  Atoms.Clear;
  // faster adding will be done with direct access to the dynamic array
  Atom.Count := 10000; // allocate memory for 10000 items
  for i := 0 to 10000-1 do
  with AtomArray[i] do
  begin
    ElementZ := Random(2000);
    X := Random;
    Y := Random;
    Z := Random;
  end;
  Atoms.Sort; // TDynArray knows about the data just created
end; // no need to have any try...finally ..Free block

Works with Delphi 6 up to XE.

With newer version of Delphi supporting generics, you should better go into this direction.


If you use Generics.Collections.TObjectList<TAtom> and there's no need for casting.

Performance should be fine for the usage that you describe. Inserting is more demanding than adding to the end because you need to shift the items after the insertion point up the list.

So long as you avoid SetLength(A, Length(A)+1) and opt for a more sensible allocation strategy dynamic arrays are equivalent to all of the TList like classes.

On occasions I have had problems with performance and memory fragmentation when trying to maintain large lists as contiguous blocks of memory. Then I have resorted to a sub-allocation scheme. But since your lists contain object references which are essentially pointers, you already have implicit sub-allocation.

It's all somewhat speculative and you really need to measure – otherwise we can only guess.


However, because type-casting is needed using the regular TList/TObjectList, could some one comment on the possible performance hit?

If you type cast using the form

List[I] as TAtom

as small overhead will be added, which can really add up in your situation. However, if you hard typecast

TAtom(List[I])

as far as I know, no overhead actually occurs as the typecast is performed without checks and I also believe it is done at compile time.

As for the other aspect of your question, I think they have been all properly covered already...


Make a test project and measure the time of adding, inserting and deleting thousands of TAtom instances using the four methods. Then decide which one to use.

TList and TObjectList are probably faster than a dynamic array when adding, inserting and deleting, because the dynamic array constantly has to be reallocated. The implementation of TList and TObjectList doesn't do that.


As TObjectList is a direct descendant of TList the performances will be very close.


First question: Are we talking about Classes.TList and Contnrs.TObjectList or are we talking about Generics.Collections.TList respectively Generics.Collections.TObjectList ?

If we're talking about generics, both TList and TObjectList are implemented using dynamic arrays. If there's any performance difference between directly using a dynamic array or using the nicer interface of the generic container, it's going to be negligible.


If we're talking about the "older" TList and TObjectList, then we need to compare only TList with the dynamic array equivalent, since TObjectList is an descendant of TList so it inherits all it's performance characteristics. TList uses a block of memory allocated using ReallocMem. The dynamic array does the same thing internally, so there shouldn't be a significant difference!

Conclusion

If there's any performance difference between the two, it's probably because naive use of dynamic array uses the dreaded SetLength(A, Length(A)+1), while the better implementation in all Delphi-provided containers pre-allocate memory in larger chunks. With proper code there shouldn't be any significant difference between those alternative!


TList etc. do exactly what code working on chunks of memory or dynarrays would have to do, but their implementation is already optimized for the common case, and includes considerations about how the memory manager behaves.

One criteria could be the ratio of reads/updates to the sequence. If the sequence is updated infrequently after created, then there should be perceivable better speed with dynarrays, because access to elements with TList and the likes requires one method call plus bounds checking, plus a type check if you use the as operator.

In the end, the cost of arithmetic done on TAtom should dominate the run time, making the choice of dynarray or TListXXX irrelevant.


A dynamic arrays of records will have the lowest impact when accessing the items, if your atoms are objects, then all the lists are going to be somewhat equivalent in terms of access speed.

However, if you perform many of them, your key issue will be the inserts and deletes, for which all lists and arrays will perform poorly, but that is what profiling will tell you. If that's the case, you might then want to consider:

  • a linked list if you don't need to access atoms by index
  • a tree, should you have use for a space partition of your atoms, you might as well use that partition to hold your atoms rather than the array
  • allowing undefined/nil elements in your array/list, and maintaining a stack of undefined/nil elements, and an index if you need a sorted list (potentially the highest performance solution, but also likely the most complex to get right in terms of efficiency)
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜