Handling HTML character encoding issues
-I think this is called character encoding, but please retitle if I'm wrong-
Issue: I'm trying to consume HTML with phpquery and maintain the HTML's integrity after it runs through the phpquery functions.
These are the changes to the HTML as it runs through the functions:
Original HTML:
<strong> Fast & Strong I Concrete</strong>
HTML Page Converted to PHPQueryObject:
<strong> Fast& Strong IÂ Concrete</strong>
PHPQueryObject run through Find() function:
<strong> Fast & Strong IÂ Concrete</strong>
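A minimal sketch of one way the Â in #2 and #3 could arise. It assumes (not confirmed for phpquery) that the page is UTF-8, that the gap in "I Concrete" is a non-breaking space, and that the bytes get read as ISO-8859-1 somewhere along the way. The variable names are only for illustration.

```php
<?php
// Assumption: the source is UTF-8 and contains a non-breaking space (U+00A0),
// which UTF-8 stores as the two bytes 0xC2 0xA0.
$original = "I\xC2\xA0Concrete";

// Treat every byte as one ISO-8859-1 character and re-encode it as UTF-8.
// This imitates UTF-8 input being mislabelled as ISO-8859-1.
$misread = iconv('ISO-8859-1', 'UTF-8', $original);

// 0xC2 becomes "Â" (bytes C3 82) and 0xA0 stays a non-breaking space (bytes C2 A0).
echo bin2hex($original), "\n"; // 49c2a0436f6e6372657465
echo bin2hex($misread), "\n";  // 49c382c2a0436f6e6372657465
```

If the second hex dump matches the bytes that come back from phpquery, the input is probably being parsed as ISO-8859-1 rather than UTF-8.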
I tried various combinations of htmlentities(), html_entity_decode(), and iconv() to handle the movement of the data and maintain the original structure without displaying a bunch of unnecessary characters. I think this is a limitation of phpquery's ability to consume HTML, so I need a workaround.
I've been successful removing the Â and other unneeded characters by using iconv("UTF-8", "BIG5//IGNORE"), but it is somewhat destructive to the original HTML, since that encoding is intended for Traditional Chinese characters.
Question: What are Â and the other stray characters, and how can I handle them so that the consumed HTML in #2 and #3 above displays as originally intended in #1, without sending extra characters to the browser?