Is there added overhead to looking up a column in a DataTable by name rather than by index?

2022-12-26 13:06

In a DataTable object, is there added overhead to looking up a column value by name, thisRow("ColumnA"), rather than by column index, thisRow(0)? In which scenarios might this be an issue?

I work on a team that has lots of experience writing VB6 code, and I noticed that they didn't do column lookups by name for DataTable objects or data grids. Even in .NET code, we use a set of integer constants to reference columns in these types of objects. I asked our team lead why this was so, and he mentioned that in VB6 there was a lot of overhead in looking up data by column name rather than by index. Is this still true for .NET?

Example code (in VB.NET, but same applies to C#):

Public Sub TestADOData()
    Dim dt As New DataTable

    ' Set up the columns in the DataTable
    dt.Columns.Add(New DataColumn("ID", GetType(Integer)))
    dt.Columns.Add(New DataColumn("Name", GetType(String)))
    dt.Columns.Add(New DataColumn("Description", GetType(String)))

    ' Add some data to the data table
    dt.Rows.Add(1, "Fred", "Pitcher")
    dt.Rows.Add(3, "Hank", "Center Field")

    ' Method 1: By Column Name
    For Each r As DataRow In dt.Rows
        Console.WriteLine( _
            "{0,-2} {1,-10} {2,-30}", r("ID"), r("Name"), r("Description"))
    Next

    Console.WriteLine()

    ' Method 2: By Column Index
    For Each r As DataRow In dt.Rows
        Console.WriteLine("{0,-2} {1,-10} {2,-30}", r(0), r(1), r(2))
    Next

End Sub

Is there a case where method 2 provides a performance advantage over method 1?

Yes, there should be a slight overhead connected to looking up columns by name instead of by index. I wouldn't worry about it unless you keep looking up the same column in a loop, as in your code example. In that case the slight overhead can accumulate into a measurable one, depending on the number of rows in the table.

The fastest way to access a particular column's value in a row is to look it up using the DataColumn object itself. For example:

Dim dt As DataTable = ...

Dim idColumn As DataColumn = dt.Columns("ID")
Dim nameColumn As DataColumn = dt.Columns("Name")
Dim descriptionColumn As DataColumn = dt.Columns("Description")

For Each r As DataRow In dt.Rows

    ' NB: look up through a DataColumn object, not through a name or an index:
    Dim id = r(idColumn)
    Dim name = r(nameColumn)
    Dim description = r(descriptionColumn)

    ...
Next

One last piece of advice: I would strongly advise you against using numerical indices! They make your code more fragile, and also more difficult to understand and maintain. As soon as the logical order of the columns changes, you need to adapt your code accordingly, possibly in several places (and you might easily overlook one of them, leading to bugs). If you instead use column names or the DataColumn objects themselves for the lookup, you can change the order of your columns without having to change the remaining code.
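To make the fragility concrete, here is a minimal sketch (the module name and sample data are made up for illustration). It reorders the columns with DataColumn.SetOrdinal and then reads the same row three ways: the hardcoded index now points at a different column, while the name and the DataColumn object still reach the intended one.

```vb
Imports System
Imports System.Data

Module ColumnOrderDemo
    Sub Main()
        Dim dt As New DataTable()
        dt.Columns.Add("ID", GetType(Integer))
        dt.Columns.Add("Name", GetType(String))
        dt.Rows.Add(1, "Fred")

        Dim row As DataRow = dt.Rows(0)
        Dim nameColumn As DataColumn = dt.Columns("Name")

        ' Before the reorder, index 1 refers to the "Name" column.
        Console.WriteLine(row(1))

        ' Move "Name" to the front; "ID" shifts to ordinal 1.
        nameColumn.SetOrdinal(0)

        ' The hardcoded index now reads the "ID" column instead.
        Console.WriteLine(row(1))

        ' Lookups by name or by DataColumn object are unaffected by the reorder.
        Console.WriteLine(row("Name"))
        Console.WriteLine(row(nameColumn))
    End Sub
End Module
```

In real code the reorder usually comes from an edited SELECT statement rather than an explicit SetOrdinal call, but the effect on index-based reads is the same.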

The facts are:

Getting the column by index is a direct index into an ArrayList.
If the name lookup is case sensitive, a hashtable lookup is performed.
If the name lookup is case insensitive, the list is scanned for the name and a more costly locale-sensitive string comparison is performed.
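The observable side of these rules can be checked with a short sketch (module name and column are illustrative). It only shows that a name with different casing still resolves to the same column when no other column differs by case alone; it does not expose which internal path the lookup takes.

```vb
Imports System
Imports System.Data

Module NameLookupDemo
    Sub Main()
        Dim dt As New DataTable()
        dt.Columns.Add("ID", GetType(Integer))

        Dim exact As DataColumn = dt.Columns("ID")      ' exact-case name
        Dim otherCase As DataColumn = dt.Columns("id")  ' same name, different casing
        Dim byIndex As DataColumn = dt.Columns(0)       ' positional lookup

        ' All three lookups return the same DataColumn instance.
        Console.WriteLine(exact Is otherCase)
        Console.WriteLine(exact Is byIndex)
    End Sub
End Module
```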

So retrieval by index will definitely be the most performant, but does it matter?

Doing some basic (read: naive) tests on my machine, I found that 1,000,000 accesses to a column using each of the mechanisms mentioned above take the following times:

Direct index - 13.3ms
Case insensitive lookup - 109.11ms
Case sensitive lookup - 109.24ms

So depending on your scenario you can draw your conclusions.
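For anyone who wants to repeat the measurement, a naive timing loop along these lines should do. This is a sketch, not the code behind the numbers above: the module name, the table contents, and the Report helper are all assumptions, and the results will vary with machine, runtime version, and JIT warm-up. It also times access through a DataColumn object for comparison.

```vb
Imports System
Imports System.Data
Imports System.Diagnostics

Module LookupTiming
    ' Assumed iteration count, chosen to match the 1,000,000 accesses quoted above.
    Const Iterations As Integer = 1000000

    Sub Main()
        Dim dt As New DataTable()
        dt.Columns.Add("ID", GetType(Integer))
        dt.Columns.Add("Name", GetType(String))
        dt.Rows.Add(1, "Fred")

        Dim row As DataRow = dt.Rows(0)
        Dim nameColumn As DataColumn = dt.Columns("Name")
        Dim value As Object = Nothing

        ' Access by integer index.
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To Iterations
            value = row(1)
        Next
        Report("By index", sw)

        ' Access by name with the exact casing of the column.
        sw.Restart()
        For i As Integer = 1 To Iterations
            value = row("Name")
        Next
        Report("By name (exact case)", sw)

        ' Access by name with different casing.
        sw.Restart()
        For i As Integer = 1 To Iterations
            value = row("name")
        Next
        Report("By name (different case)", sw)

        ' Access through the DataColumn object.
        sw.Restart()
        For i As Integer = 1 To Iterations
            value = row(nameColumn)
        Next
        Report("By DataColumn", sw)
    End Sub

    ' Hypothetical helper: prints the elapsed time for one measurement.
    Private Sub Report(label As String, sw As Stopwatch)
        Console.WriteLine("{0,-26} {1:F2} ms", label, sw.Elapsed.TotalMilliseconds)
    End Sub
End Module
```

Run it in a Release build and more than once; the first loop tends to absorb one-time JIT costs.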

Yes, there is overhead in looking up a column by name rather than by absolute index, because the name lookup first has to locate the column and then accesses the value through it.

That being said, worrying about it is fairly premature optimization. The DataTable will first try to locate the column in a case-sensitive manner, which is very fast. If it can't locate the column that way, it looks in a case-insensitive manner, which is only slightly slower.

The absolute fastest way to access the data is via the DataColumn object itself, as that's what both the index-based and name-based accessors use.

The best approach would be to find the index by column name once and then use that index to locate the column:

Dim table As DataTable = ...
Dim foo As Integer = table.Columns.IndexOf("Foo")

For Each row As DataRow In table.Rows
    Dim data = row(foo)
Next

You are looking up by index, yet the variable name still tells you which column you are reading. Another advantage is that if you change the order of the fields in your "select" query, you will still get the correct value. If you hardcode indexes, on the other hand, your code will break.

It depends on where you use it. During desktop application startup, small inefficiencies of this kind may accumulate into long delays. In mouse and keyboard event handlers, they most likely will not. It will most likely be more productive to spend time profiling functions (printing out execution times to dbgview) than optimizing this kind of low-level stuff.
