
What data structure would be suitable for solving this problem?

I have an XML file which contains some keywords. The format of the XML is this:

<keywords>
 <keyword name="Name" />
 <keyword name="City" />
 <keyword name="Email" />
</keywords>

The number of keywords in the XML is variable and can be in the thousands.

I have a text file which has two columns. The first column contains keywords and the second column has a value for each keyword. The size of each text file is around 50 MB. Based on the keywords in the XML, I need to find the corresponding values in the text file. I can easily parse the text file and get the values.

Now here is my problem: I have 10 text files. I need to find the values for the keywords mentioned in the XML in all 10 text files and check whether the values for each keyword are the same across all 10 files. I need to display the results like this:

Name: 3 different values found in 10 text files
City: abcdef
Email: johnsmith@example.com

A keyword that has the same value in all files should display that value; otherwise, display how many different values exist for that keyword.

What is the most elegant way to solve this problem in C#? Which data structure is best suited to such problems?


The data structure part of your question is a generic Lookup (Lookup<TKey, TElement>).

The elegant part is, not surprisingly, LINQ: some combination of the Enumerable.ToLookup method and the Enumerable.GroupBy method, depending on how much work you need to do to associate the keys with the values.

Here is a treasure chest of examples for GroupBy usage
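
To make the Lookup idea concrete, here is a minimal sketch, not a definitive implementation. It assumes the keyword/value pairs from all text files have already been parsed into one sequence; KeywordReport, Summarize, and the "no value found" message are hypothetical names and wording that do not come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: names and message wording are assumptions.
public static class KeywordReport
{
    // rows:      every (keyword, value) pair parsed from all text files.
    // keywords:  the keyword names listed in the XML file.
    // fileCount: number of text files, used only in the message.
    public static List<string> Summarize(
        IEnumerable<KeyValuePair<string, string>> rows,
        IEnumerable<string> keywords,
        int fileCount)
    {
        // Group every value under its keyword. Indexing an ILookup with a
        // missing key yields an empty sequence instead of throwing.
        ILookup<string, string> lookup = rows.ToLookup(r => r.Key, r => r.Value);

        var lines = new List<string>();
        foreach (string keyword in keywords)
        {
            List<string> distinct = lookup[keyword].Distinct().ToList();
            if (distinct.Count == 0)
                lines.Add(keyword + ": no value found");
            else if (distinct.Count == 1)
                lines.Add(keyword + ": " + distinct[0]);
            else
                lines.Add(keyword + ": " + distinct.Count +
                          " different values found in " + fileCount + " text files");
        }
        return lines;
    }
}
```

Note that this sketch treats a keyword that is missing from some files but identical in the others as having a single value; if a missing entry should count as a difference, the per-file presence has to be tracked as well.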

Cheers,
Berryl


Assuming all of the data fits into memory, you can use a MultiMap, that is, a map that can hold multiple values for each unique key. There isn't a default implementation in C#, but there are plenty on the web (e.g. http://dotnetperls.com/multimap). If you need help parsing the files to build the map, you'll need to provide more details on the file format.
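
One way to get multimap behaviour without a third-party class is a Dictionary whose values are HashSets. The following is only a sketch under stated assumptions: the columns are assumed to be separated by a single tab (the question does not give the file format), and KeywordMultiMap, Create, and AddFile are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: a multimap built from Dictionary + HashSet.
public static class KeywordMultiMap
{
    // Creates one empty value set per keyword taken from the XML file.
    public static Dictionary<string, HashSet<string>> Create(IEnumerable<string> keywords)
    {
        var map = new Dictionary<string, HashSet<string>>();
        foreach (string keyword in keywords)
            map[keyword] = new HashSet<string>();
        return map;
    }

    // Adds the lines of one text file to the map. Keywords that are not
    // already keys of the map are ignored, so memory use is bounded by the
    // number of XML keywords rather than by the file size.
    public static void AddFile(
        Dictionary<string, HashSet<string>> map,
        IEnumerable<string> lines)
    {
        foreach (string line in lines)
        {
            // Assumption: "keyword<TAB>value"; adjust to the real format.
            string[] parts = line.Split(new[] { '\t' }, 2);
            if (parts.Length < 2)
                continue; // skip malformed lines

            HashSet<string> values;
            if (map.TryGetValue(parts[0], out values))
                values.Add(parts[1]); // a HashSet keeps distinct values only
        }
    }
}
```

Passing File.ReadLines(path) as the lines argument streams each 50 MB file line by line instead of loading it whole. Afterwards, a set with Count == 1 means every file that contained the keyword agreed on its value; otherwise Count is the number of different values to report.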


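using System.Collections.Generic;

// One keyword/value pair read from a text file.
class KeyWord
{
    public string Name { get; private set; }
    public string Value { get; private set; }

    public KeyWord(string name, string value)
    {
        Name = name;
        Value = value;
    }
}

// In a different file, as a field of the class that collects the pairs:

private List<KeyWord> keywords = new List<KeyWord>();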

A List is good enough here.

Alternatively, if you have an XML schema definition, you can generate the classes from it with xsd.exe:

"C:\Program Files\Microsoft Visual Studio 9\SDK\v2.0\Bin\xsd.exe" /classes /namespace:x.y.z schemaforkeywords.xsd


Try Dynamic Xml Reader if you are using C# 4.0.
