A dictionary object that uses ranges of values for keys
I have need of a sort of specialized dictionary. My use case is this: The user wants to specify ranges of values (the range could be a single point as well) and assign a value to a particular range. We then want to perform a lookup using a single value as a key. If this single value occurs within one of the ranges then we will return the value associated to the range.
For example:
// represents the keyed value
struct Interval
{
public int Min;
public int Max;
}
// some code elsewhere in the program
var dictionary = new Dictionary<Interval, double>();
dictionary.Add(new Interval { Min = 0, Max = 10 }, 9.0);
var result = dictionary[1];
if (result == 9.0) JumpForJoy();
This is obviously just some code to illustrate what I'm looking for. Does anyone know of an algorithm to implement such a thing? If so could they point me towards it, please?
I have already tried imple开发者_StackOverflowmenting a custom IEqualityComparer object and overloading Equals() and GetHashCode() on Interval but to no avail so far. It may be that I'm doing something wrong though.
A dictionary is not the appropriate data structure for the operations you are describing.
If the intervals are required to never overlap then you can just build a sorted list of intervals and binary search it.
If the intervals can overlap then you have a more difficult problem to solve. To solve that problem efficiently you'll want to build an interval tree:
http://en.wikipedia.org/wiki/Interval_tree
This is a well-known data structure. See "Introduction To Algorithms" or any other decent undergraduate text on data structures.
This is only going to work when the intervals don't overlap. And your main problem seems to be converting from a single (key) value to an interval.
I would write a wrapper around SortedList. The SortedList.Keys.IndexOf() would find you an index that can be used to verify if the interval is valid and then use it.
This isn't exactly what you want but I think it may be the closest you can expect.
You can of course do better than this (Was I drinking earlier?). But you have to admit it is nice and simple.
var map = new Dictionary<Func<double, bool>, double>()
{
{ d => d >= 0.0 && d <= 10.0, 9.0 }
};
var key = map.Keys.Single(test => test(1.0))
var value = map[key];
I have solved a similar problem by ensuring that the collection is contiguous where the intervals never overlap and never have gaps between them. Each interval is defined as a lower boundary and any value lies in that interval if it is equal to or greater than that boundary and less than the lower boundary of the next interval. Anything below the lowest boundary is a special case bin.
This simplifies the problem somewhat. We also then optimized key searches by implementing a binary chop. I can't share the code, unfortunately.
I would make a little Interval class, which would something like that:
public class Interval
{
public int Start {get; set;}
public int End {get; set;}
public int Step {get; set;}
public double Value {get; set;}
public WriteToDictionary(Dictionary<int, double> dict)
{
for(int i = Start; i < End; i += Step)
{
dict.Add(i, Value);
}
}
}
So you still can a normal lookup within your dictionary. Maybe you should also perform some checks before calling Add()
or implement some kind of rollback if any value is already within the dictionary.
You can find a Java flavored C# implementation of an interval tree in the Open Geospatial Library. It needs some minor tweaks to solve your problem and it could also really use some C#-ification.
It's Open Source but I don't know under what license.
i adapted some ideas for Dictionary and func, like "ChaosPandion" gave me the idea in his earlier post here above.
i still solved the coding, but if i try to refactor
i have a amazing problem/bug/lack of understanding:
Dictionary<Func<string, double, bool>, double> map = new Dictionary<Func<string, double, bool>, double>()
{
{ (a, b) => a == "2018" && b == 4, 815.72},
{ (a, b) => a == "2018" && b == 6, 715.72}
};
What is does is, that i call the map with a search like "2018"(year) and 4(month), which the result is double value 815,72. When i check the unique map entries they look like this:
map working unique keys
so thats the orginal behaviour, anything fine so far. Then i try to refactor it, to this:
Dictionary<Func<string, double, bool>, double> map =
new Dictionary<Func<string, double, bool>, double>();
WS22(map, values2018, "2018");
private void WS22(Dictionary<Func<string, double, bool>, double> map, double[] valuesByYear, string strYear)
{
int iMonth = 1;
// step by step this works:
map.Add((a, b) => (a == strYear) && (b == 1), dValue);
map.Add((a, b) => (a == strYear) && (b == 2), dValue);
// do it more elegant...
foreach (double dValue in valuesByYear)
{
//this does not work: exception after second iteration of foreach run
map.Add((a, b) => (a == strYear) && (b == iMonth), dValue );
iMonth+=1;
}
}
this works: (i use b==1 and b==2)
this does not work (map not working exception on add item on second iteration)
so i think the problem is, that the map does not have a unique key while adding to map dictionary. The thing is, i dont see my error, why b==1 is working and b==iMonth not.
Thx for any help, that open my eyes :)
Using Binary Search, I created an MSTest v2 test case that approaches the solution. It assumes that the index is the actual number you are looking for, which does not (might not?) suit the description given by the OP.
Note that the ranges do not overlap. And that the ranges are
- [negative infinity, 0)
- [0, 5]
- (5, 15]
- (15, 30]
- (30, 100]
- (100, 500]
- (500, positive infinity]
This values passed as minimumValues
are sorted, since they are constants in my domain. If these values can change, the minimumValues
list should be sorted again.
Finally, there is a test that uses if
statements to get to the same result (which is probably more flexible if you need something else than the index
).
[TestClass]
public class RangeUnitTests
{
[DataTestMethod]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, -1, 0)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 0, 1)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 1, 1)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 5, 1)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 7, 2)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 15, 2)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 16, 3)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 30, 3)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 31, 4)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 100, 4)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 101, 5)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 500, 5)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 501, 6)]
public void Use_BinarySearch_To_Determine_Range(int[] minimumValues, int inputValue, int expectedRange)
{
var list = minimumValues.ToList();
var index = list.BinarySearch(inputValue);
if (index < 0)
{
index = ~index;
}
Assert.AreEqual(expectedRange, index);
}
[DataTestMethod]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, -1, 0)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 0, 1)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 1, 1)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 5, 1)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 7, 2)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 15, 2)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 16, 3)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 30, 3)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 31, 4)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 100, 4)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 101, 5)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 500, 5)]
[DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 501, 6)]
public void Use_Ifs_To_Determine_Range(int[] _, int inputValue, int expectedRange)
{
int actualRange = 6;
if (inputValue < 0)
{
actualRange = 0;
}
else if (inputValue <= 5)
{
actualRange = 1;
}
else if (inputValue <= 15)
{
actualRange = 2;
}
else if (inputValue <= 30)
{
actualRange = 3;
}
else if (inputValue <= 100)
{
actualRange = 4;
}
else if (inputValue <= 500)
{
actualRange = 5;
}
Assert.AreEqual(expectedRange, actualRange);
}
}
I did a little perfomance testing by duplicating the initial set [DataRow]
several times (up to 260 testcases for each method). I did not see a significant difference in performance with these parameteres. Note that I ran each [DataTestMethod]
in a seperate session. Hopefully this balances out any start-up costs that the test framework might add to first test that is executed.
You could check out the powercollections here found on codeplex that has a collection that can do what you are looking for.
Hope this helps, Best regards, Tom.
精彩评论