How do you serialize HTML in C#?
I think I know how to use XSD.exe to create C# classes from XML that can be used with the XmlSerializer class to serialize and verify the XML document.
Is there a way to do the same sort of thing with an HTML document? I have tried but the xsd command line says that the remote name www.w3.org cannot be resolved.
At a minimum, is there a way to use C# to find out if an HTML file is valid?
The Html Agility Pack is an open source library that parses HTML for you. You can then search and manipulate the structure of the document quite easily.
It's quite forgiving of the HTML you give it, so I'm not sure it's a good way to check whether you have a strictly valid XHTML document. But it should be able to parse anything a modern browser can.
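As a rough illustration of the parsing described above, here is a minimal sketch that assumes the HtmlAgilityPack NuGet package is referenced. The class name LinkExtractor and the sample markup are made up for the example. It loads deliberately sloppy HTML, prints whatever the parser recorded in ParseErrors, and pulls out the link targets with XPath.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

// Hypothetical helper class, named only for this example.
static class LinkExtractor
{
    // Parses the markup leniently and returns the href of every <a> element.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // ParseErrors holds the problems the parser noticed while recovering.
        // It is a diagnostic aid, not a substitute for real validation.
        foreach (HtmlParseError error in doc.ParseErrors)
        {
            Console.WriteLine("Line {0}, col {1}: {2}",
                error.Line, error.LinePosition, error.Reason);
        }

        var result = new List<string>();
        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links != null)
        {
            foreach (HtmlNode link in links)
            {
                result.Add(link.GetAttributeValue("href", ""));
            }
        }
        return result;
    }
}

class Program
{
    static void Main()
    {
        // Sample input: the <p> and <b> tags are never closed.
        const string html =
            "<html><body><p>Hello <b>world<a href='http://example.com'>link</a></body></html>";

        foreach (string href in LinkExtractor.ExtractLinks(html))
        {
            Console.WriteLine(href);
        }
    }
}
```

Because the parser recovers from broken markup instead of rejecting it, an empty ParseErrors list should not be read as proof that a document is valid.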
If it's XHTML that you're trying to validate, you can do it like this:
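    using System;
    using System.Xml;
    using System.Xml.Schema;

    class XhtmlValidator
    {
        static void Validate(string filename)
        {
            XmlReaderSettings settings = new XmlReaderSettings();
            // Allow the DOCTYPE to be processed (on .NET versions before 4.0,
            // use settings.ProhibitDtd = false instead).
            settings.DtdProcessing = DtdProcessing.Parse;
            settings.ValidationType = ValidationType.DTD;
            settings.ValidationEventHandler +=
                new ValidationEventHandler(ValidationCallBack);
            // XhtmlUrlResolver is a custom XmlResolver whose code is not shown here.
            settings.XmlResolver = new XhtmlUrlResolver();

            // Create the XmlReader object and dispose of it when done.
            using (XmlReader reader = XmlReader.Create(filename, settings))
            {
                // Parse the file; validation events fire as nodes are read.
                while (reader.Read()) { }
            }
        }

        // Display any validation errors.
        private static void ValidationCallBack(object sender, ValidationEventArgs e)
        {
            Console.WriteLine("Validation Error: {0}", e.Message);
        }
    }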
It will be a bit slow because it downloads the DTD files from the W3C web site.
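One way to avoid the network round trip, assuming .NET Framework 4.0 or later, is the built-in XmlPreloadedResolver, which ships with copies of the XHTML 1.0 DTDs. The sketch below swaps it in as the resolver. The class name XhtmlCheck and the sample document are made up for the example, and this is not necessarily what the answer's own resolver does.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Resolvers;
using System.Xml.Schema;

// Hypothetical helper class, named only for this example.
static class XhtmlCheck
{
    // Validates the markup against its DTD and returns the number of problems reported.
    public static int CountValidationProblems(string xhtml)
    {
        int problems = 0;

        var settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Parse;
        settings.ValidationType = ValidationType.DTD;
        // Serve the XHTML 1.0 DTDs from copies embedded in the framework
        // instead of fetching them from www.w3.org.
        settings.XmlResolver = new XmlPreloadedResolver(XmlKnownDtds.Xhtml10);
        settings.ValidationEventHandler += (sender, e) =>
        {
            problems++;
            Console.WriteLine("Validation {0}: {1}", e.Severity, e.Message);
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xhtml), settings))
        {
            // Validation events fire as the reader walks the document.
            while (reader.Read()) { }
        }
        return problems;
    }
}

class Program
{
    static void Main()
    {
        // Sample XHTML 1.0 Strict document, invented for this example.
        const string xhtml =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" " +
            "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">" +
            "<html xmlns=\"http://www.w3.org/1999/xhtml\">" +
            "<head><title>Test</title></head>" +
            "<body><p>Hello</p></body></html>";

        Console.WriteLine("Problems found: {0}", XhtmlCheck.CountValidationProblems(xhtml));
    }
}
```

The preloaded resolver only knows the DTDs it ships with, so a document with any other DOCTYPE would need a fallback resolver or a different approach.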
To deserialize or parse HTML, I would also recommend the Html Agility Pack. To validate the HTML, you could try running HTML Tidy. For XHTML, on the other hand, you can obtain an XSD.