开发者

SQL Server: Get records that match specifc data at specific location?

I have a varchar column that has data such as 00110100001110100100010011111, and I need to get back records that have 1 in position 5 and 0 in position 11. What is the fastest way I can search for them?

开发者_开发问答

Right now I'm thinking of using substring: substring(column, 5, 1)==1 and substring (column, 11,1)==0. Is this the best way? Thanks.


LIKE '____1_____0%' is the simplest way with your current structure. It will involve a full table scan though due to the leading wildcard.

What does this string of characters represent though?

If it is a fixed set of boolean values you might consider separating them out into individual bit columns and indexing them individually.

This is more space efficient as 8 values can fit into 2 bytes (including null bitmap) as opposed to 2 values in 2 bytes for the varchar version.

You might well still end up with table scans however as these indexes will not be selective enough to be used except if the values are skewed and you are searching for the less common values but at least SQL Server will be able to maintain separate column statistics and use the indexes when this would help.

If it is an arbitrary set (e.g. an ever growing history of states) then you should probably separate out into a new table (EntityId, Position (int), Value (bit)). You can then use a relational division query to bring back all EntityIds matching the desired pattern.

SELECT  EntityId
WHERE   ( Position = 5
          AND Value = 1
        )
        OR ( Position = 11
             AND Value = 0
           )
GROUP BY EntityId
HAVING  COUNT(*) = 2


Use SUBSTRING. You can parameterise substring so if you want positions 3 and 13 you can change it or have it in a UDF etc

It depends what you want of course

If it's static positions, use Martin Smith's answer because it's cleaner

I suspect you need to refactor this column into several discrete ones though


Do positions 5 and 11 stay constant? Do you have ability to create computed columns and indexes?

If the answer to both of these questions is "yes", then you should be able to achieve good performance by implementing the following general idea:

  • Create computed column on substring(column, 5, 1).
  • Create computed column on substring(column, 11,1).
  • Create a composite index on both of these columns.

Then, in your query, just use the exact same expressions as in the definitions of your computed columns (such as: substring(column, 5, 1)==1 and substring (column, 11,1)==0, as you already proposed).

That being said, if you can, do yourself a favor and normalize your data model. Your table is not even in the 1st normal form!

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜