Get values from a .txt
I have a file.txt like this:
This is only a part of the .txt file
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>GeoServer Configuration</title>
<meta name="ROBOTS" content="NOINDEX, NOFOLLOW"/>
</head>
<body>
Workspaces
<ul>
<li>
<a href="http://xxxxxx:8080/geoserver/rest/workspaces/worldmap1.html">worldmap1</a>
</li>
<li>
<a href="http://xxxxxx:8080/geoserver/rest/workspaces/worldmap2.html">worldmap2</a>
</li>
</ul>
</body>
</html>
Is it possible to get these values? I'm trying to convert the .txt to an .xml file, but I have some problems because it is not well-formed XML.
Thanks in advance
First, you have to add a root element. Suppose you create an XML file named TextFile1.xml
that contains the XML below:
<Item>
<li>
<a href="http://10.80.14.188:8080/geoserver/rest/workspaces/worldmap1.html">worldmap1</a>
</li>
<li>
<a href="http://10.80.14.188:8080/geoserver/rest/workspaces/worldmap2.html">worldmap2</a>
</li>
</Item>
You can then read the href values like this:
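using System;
using System.Linq;
using System.Xml.Linq;

public static class MyClass
{
    public static void Main()
    {
        var xmldoc = XDocument.Load(@"TextFile1.xml");
        // TextFile1.xml declares no namespace, so match the plain element name "a".
        // Only a document that declares xmlns="http://www.w3.org/1999/xhtml"
        // would need that XNamespace prefixed to the element name.
        foreach (var item in xmldoc.Descendants("a"))
        {
            string href = item.Attribute("href").Value;
            // The last path segment is the page name, e.g. "worldmap1.html".
            string page = href.Split('/').Last();
            Console.WriteLine(item.Value + ": " + page);
        }
    }
}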
If this is the only input you have, you could turn it into a well-formed XML document by adding a root node:
<root>
<li><a href="http://10.80.14.188:8080/geoserver/rest/workspaces/worldmap1.html">worldmap1</a></li>
<li><a href="http://10.80.14.188:8080/geoserver/rest/workspaces/worldmap2.html">worldmap2</a></li>
</root>
(This is easy to do with some simple string concatenation)
The document is now well-formed XML, so you can use LINQ to XML or any other XML API to read the values you require.
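The string concatenation mentioned above could look roughly like the following. This is only a sketch: the class name RootWrapSketch, the method ReadLinks and the element name root are made up for illustration, and it assumes the input holds only the li elements (a DOCTYPE or XML declaration inside the fragment would make the wrapped text invalid).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RootWrapSketch
{
    // Wraps the fragment in a hypothetical <root> element so it parses as XML,
    // then returns one (link text, href) pair per <a> element.
    public static List<KeyValuePair<string, string>> ReadLinks(string fragment)
    {
        var doc = XDocument.Parse("<root>" + fragment + "</root>");
        return doc.Descendants("a")
                  .Select(a => new KeyValuePair<string, string>(
                      a.Value, (string)a.Attribute("href")))
                  .ToList();
    }

    public static void Main()
    {
        // Sample fragment copied from the answer above.
        string fragment =
            "<li><a href=\"http://10.80.14.188:8080/geoserver/rest/workspaces/worldmap1.html\">worldmap1</a></li>" +
            "<li><a href=\"http://10.80.14.188:8080/geoserver/rest/workspaces/worldmap2.html\">worldmap2</a></li>";

        foreach (var link in ReadLinks(fragment))
        {
            Console.WriteLine(link.Key + " -> " + link.Value);
        }
    }
}
```

In practice the fragment would come from something like File.ReadAllText on your .txt file rather than a hard-coded string.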
Adding a root node seems to be the solution, but if you cannot change the input, you can use regular expressions instead.
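A minimal sketch of the regular-expression approach, assuming every link has the simple form `<a href="...">text</a>` shown in the question. The class name RegexSketch and the method ExtractLinks are hypothetical, and a pattern like this is fragile against arbitrary HTML (single-quoted attributes, nested tags and so on).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexSketch
{
    // Group 1 captures the href value, group 2 the link text.
    private static readonly Regex Anchor = new Regex(
        @"<a\s+href=""([^""]*)""[^>]*>([^<]*)</a>",
        RegexOptions.IgnoreCase);

    // Returns one { href, text } array per anchor found in the input.
    public static List<string[]> ExtractLinks(string html)
    {
        var links = new List<string[]>();
        foreach (Match m in Anchor.Matches(html))
        {
            links.Add(new[] { m.Groups[1].Value, m.Groups[2].Value });
        }
        return links;
    }

    public static void Main()
    {
        // Sample input in the same shape as the file in the question.
        string html =
            "<ul>\n<li>\n<a href=\"http://xxxxxx:8080/geoserver/rest/workspaces/worldmap1.html\">worldmap1</a>\n</li>\n" +
            "<li>\n<a href=\"http://xxxxxx:8080/geoserver/rest/workspaces/worldmap2.html\">worldmap2</a>\n</li>\n</ul>";

        foreach (var link in ExtractLinks(html))
        {
            Console.WriteLine(link[1] + " -> " + link[0]);
        }
    }
}
```

Because this works on the raw text, it does not care about the DOCTYPE or whether the file is well-formed.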