How to get a short string after compressing/decompressing a string? [closed]
How can I get a short string when compressing a long string in C#?
I want to compress a long string into a short string (with minimum length) and also be able to decompress it to get back my original string. By minimum length I mean that if the original string has length 10, the compressed string must be half that length or less.
I don't want to use any libraries other than the built-in .NET libraries.
For example: Original String: "Hello World"
Compressed String: "$n(@3" //something like this.
I have tried different methods, but they don't compress in this manner. Any help? Thanks in advance.
Arbitrary guaranteed compression is impossible (see, for example, http://matt.might.net/articles/why-infinite-or-guaranteed-file-compression-is-impossible/).
Use GZipStream, which has been part of .NET since 2.0.
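using System;
using System.IO;
using System.IO.Compression;
using System.Text;

private static string CompressLongString(string longString)
{
    using (MemoryStream outstream = new MemoryStream())
    {
        using (MemoryStream instream = new MemoryStream(Encoding.UTF8.GetBytes(longString)))
        using (GZipStream compress = new GZipStream(outstream, CompressionMode.Compress))
        {
            // Disposing the GZipStream flushes the remaining compressed data.
            instream.CopyTo(compress);
        }
        // ToArray() returns only the bytes written; GetBuffer() would also
        // return the unused capacity of the internal buffer.
        // Compressed bytes are not valid UTF-16 text, so Base64 is used
        // to get a string that can be decoded again without loss.
        return Convert.ToBase64String(outstream.ToArray());
    }
}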
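The question also asks for the way back. Here is a minimal sketch of a full round trip, assuming Base64 is acceptable as the string form of the compressed bytes. The class name GZipText and its method names are hypothetical, chosen only for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GZipText
{
    // Compresses the UTF-8 bytes of the text and wraps them in Base64.
    public static string Compress(string text)
    {
        byte[] input = Encoding.UTF8.GetBytes(text);
        using (var outstream = new MemoryStream())
        {
            using (var gzip = new GZipStream(outstream, CompressionMode.Compress))
            {
                gzip.Write(input, 0, input.Length);
            }
            // ToArray() still works after the stream has been closed.
            return Convert.ToBase64String(outstream.ToArray());
        }
    }

    // Reverses Compress: Base64 -> gzip bytes -> UTF-8 text.
    public static string Decompress(string base64)
    {
        byte[] compressed = Convert.FromBase64String(base64);
        using (var instream = new MemoryStream(compressed))
        using (var gzip = new GZipStream(instream, CompressionMode.Decompress))
        using (var outstream = new MemoryStream())
        {
            gzip.CopyTo(outstream);
            return Encoding.UTF8.GetString(outstream.ToArray());
        }
    }
}
```

Calling GZipText.Decompress(GZipText.Compress(s)) should give back s. Note that Base64 adds about a third to the size of the compressed bytes, so this only pays off for long, repetitive input.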
What is your real problem? If you want to save memory by compressing a very long string, you can convert it to a byte[] array in UTF-8. Create a MemoryStream object, then create a StreamWriter in UTF-8 on that MemoryStream and write your string to it. Then close the StreamWriter and the stream and use ToArray() to convert it to a compact array. Although this creates many temporary objects, the resulting array will often be much smaller than the original string.
Note that this is not compression, just encoding the characters as UTF-8, which for mostly ASCII text takes about half the space of the UTF-16 normally used in strings. It is done using the standard .NET library, as you requested. (But the result is not literally a string, as you wanted.)
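The UTF-8 approach described in this answer could look roughly like the sketch below. The class name Utf8Storage and its method names are hypothetical, and a BOM-free encoder is assumed so that no extra bytes are written at the start.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8Storage
{
    // Writes the string through a UTF-8 StreamWriter into a MemoryStream
    // and returns the resulting compact byte array.
    public static byte[] ToUtf8(string text)
    {
        using (var stream = new MemoryStream())
        {
            // UTF8Encoding(false) has an empty preamble, so no BOM is emitted.
            using (var writer = new StreamWriter(stream, new UTF8Encoding(false)))
            {
                writer.Write(text);
            }
            // ToArray() still works after the writer has closed the stream.
            return stream.ToArray();
        }
    }

    // Restores the original string from the UTF-8 bytes.
    public static string FromUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }
}
```

For "Hello World" this yields 11 bytes, compared with 22 bytes for the same characters in UTF-16. Encoding.UTF8.GetBytes(text) produces the same bytes in a single call, without the intermediate stream objects.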
You can use GZipStream:
http://www.codeproject.com/KB/files/GZipStream.aspx
http://msdn.microsoft.com/en-us/library/system.io.compression.gzipstream.aspx
From your comments I understand that you want to reduce database size.
Compressing strings of 10 characters does not gain you very much, and it is not guaranteed to save a certain percentage (you cannot compress an already compressed string).
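To see why short strings gain nothing, it helps to measure. The gzip container alone adds a 10-byte header and an 8-byte trailer, so very short inputs come out larger than they went in. Below is a small sketch for checking this yourself; the class name GZipOverhead and its method name are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GZipOverhead
{
    // Returns the number of bytes GZipStream produces for the UTF-8 form of the text.
    public static int CompressedSize(string text)
    {
        byte[] input = Encoding.UTF8.GetBytes(text);
        using (var outstream = new MemoryStream())
        {
            using (var gzip = new GZipStream(outstream, CompressionMode.Compress))
            {
                gzip.Write(input, 0, input.Length);
            }
            return outstream.ToArray().Length;
        }
    }
}
```

Comparing GZipOverhead.CompressedSize("Hello World") with the 11 bytes of the input shows the compressed form is larger, while a long run of repeated text, such as new string('a', 10000), shrinks to a small fraction of its size.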
If you have repeating strings, you could store every string once in a table (with the string and a numeric primary key) and reference it by just the key from your other tables. If your strings do not repeat, you might break each string into words and store the indexes of the words.
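The lookup-table idea can be sketched in memory as follows. This is only an illustration of the principle: the StringTable class and its members are hypothetical, and in a real database the dictionary would be a table with a numeric primary key. Splitting on single spaces is an assumption about what counts as a word.

```csharp
using System;
using System.Collections.Generic;

public class StringTable
{
    private readonly Dictionary<string, int> idsByValue = new Dictionary<string, int>();
    private readonly List<string> valuesById = new List<string>();

    // Returns the existing key for a value, or assigns the next free key.
    public int GetOrAdd(string value)
    {
        int id;
        if (!idsByValue.TryGetValue(value, out id))
        {
            id = valuesById.Count;
            valuesById.Add(value);
            idsByValue[value] = id;
        }
        return id;
    }

    // Looks up the stored string for a key.
    public string Lookup(int id)
    {
        return valuesById[id];
    }

    // Replaces each space-separated word by its key.
    public int[] EncodeWords(string text)
    {
        string[] words = text.Split(' ');
        int[] ids = new int[words.Length];
        for (int i = 0; i < words.Length; i++)
        {
            ids[i] = GetOrAdd(words[i]);
        }
        return ids;
    }

    // Rebuilds the text from the word keys.
    public string DecodeWords(int[] ids)
    {
        string[] words = new string[ids.Length];
        for (int i = 0; i < ids.Length; i++)
        {
            words[i] = Lookup(ids[i]);
        }
        return string.Join(" ", words);
    }
}
```

With this sketch, encoding "to be or not to be" stores only four distinct words, and the repeated words share the same keys. The saving depends entirely on how often values repeat across rows.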
I suggest magic. A string is just a series of numbers, none of which can be discarded while keeping the string the same. Therefore, to compress the string you would need to decide whether there is any part of the string you can live without and make rules to do that. I can't think of any common ways, so you would have to make your own rules.