Best way to store List<Point> in a string and parse back
What would be the fastest way to store List of type Point, in a string that produces minimum string length and with fastest parse back algorithm?
I found that framework has Convert.ToBase64String, Convert.FromBase64S开发者_运维知识库tring methods. Open to any ideas using these or even better self designed algorithms ;)
Thanks in advance
C#, vs2005 (.net 2.0)
-Edit-
I will use the code in an activeX component and i cant add another library just for this purpose.
Use hex representation of ints, it reduces size of the string:
Serialize:
List<Point> list = new List<Point>(new Point[] {new Point(1, 2), new Point(10, 20), new Point (100, 200), new Point(1000, 2000), new Point(10000, 20000)});
// 1. To.
StringBuilder sb = new StringBuilder();
foreach (Point point in list)
{
sb.Append(Convert.ToString(point.X, 16));sb.Append('.');
sb.Append(Convert.ToString(point.Y, 16));sb.Append(':');
}
string serialized = sb.ToString();
Here is the string in form: "x.y:1.2:a.14:64.c8:3e8.7d0:2710.4e20:"
Deserialize, splitting ('serialized' is the string contains chain of numbers):
string[] groups = serialized.Split(new char[] {':'}, StringSplitOptions.RemoveEmptyEntries);
foreach (string group in groups)
{
string[] coords = group.Split('.');
restored.Add(new Point(Convert.ToInt32(coords[0], 16), Convert.ToInt32(coords[1], 16)));
}
Or you could you regexp to parse groups("[0-9a-fA-F].[0-9a-fA-F]"), it's up to you. I am not sure which is quicker.
Or a simple state machine (just for fun):
List<Point> restored = new List<Point>();
string value = default(string);
int left = 0;
int x = 0, y = 0;
for (int i = 0; i < serialized.Length; i++)
{
if (serialized[i] == '.')
{
value = serialized.Substring(left, i - left);
left = i + 1;
x = Convert.ToInt32(value, 16);
}
else if (serialized[i] == ':')
{
value = serialized.Substring(left, i - left);
left = i + 1;
y = Convert.ToInt32(value, 16);
restored.Add(new Point(x, y));
}
}
IMHO.
EDITED: Or even better to pack integers to groups of hex: 1212 to 'CC' like it is used in old financial systems; it makes length of string two times less.
To String:
MYSTRING = string.Join("", list.Select(
point => point.x.toString("X").PadLeft(4, '0') +
point.y.toString("X").PadLeft(4, '0')).toArray() )
From String:
new Regex("(.{8})").Split(MYSTRING).Where(x => !string.IsNullOrEmpty(x)).
Select(x=> new Point(x.Take(4), x.Skip(4).Take(4)))
How about to use JSON?
Just use ProtoBuf.Net.
In all honesty, it is near impossible to answer this authoritatively. Much depends on how large the list of points are, whether it really needs to be a string, and what aspect you are trying to optimize for. For example, if you need raw speed you may trade RAM for processing time, but if you need throughput you need to have an algorithm that doesn't consume too many resources.
The most compact and quickest way to store a list of anything and restore the list later is to use binary serialization. Of course this leaves you with the risk that a change in the CLR can render the file unuseable. For anyone who is interested, xml serialization is neither efficient from speed nor from space conciderations but the format can be read by other CLRs with no change.
Base64 algorithms are fairly efficient, and use a code-table lookup algorithm that is very quick. Base64 encoding the binary format may yield very good results. But if you don't need to store it as a string, why go through the extra trouble?
CORBA also defines a binary to string algorithm that has to be efficient for the types of things it is trying to do. If I recall correctly it uses a 128 symbol code table (i.e. 128 bit encoding) so it is more compact than base 64.
In the end, you are going to have to run some tests. Before you start your tests, you have to know when the algorithm is good enough. Just how small does the string size have to be before it is acceptable? Just how fast does the parsing algorithm have to be before it is acceptable. And just how many of these do you need to parse simultaneously? Only you can determine that.
精彩评论