php - loadHTML returning empty, HTML is fine
I'm trying to grab the href value of an element using PHP, but I'm having trouble. Here's a snippet of the code.
<?php
ini_set("log_errors", 1);
ini_set("error_log", "php-error.log");

$target_url = "http://foo.bar";
$request = $target_url;
$html = $this->scraper($request);

$dom = new DOMDocument();
$dom->loadHTML($html);

// error point - $dom empty
error_log("dom:");
error_log($dom);

$xpath = new DOMXPath($dom);
error_log("setting target url");
$target_url = $xpath->query("//*[@class='foo_bar']/href");
?>

Logging $html results in the standard, full HTML output of the page. Searching suggests the XPath should work. However, when I try to log $dom after loadHTML, I get a blank result. I've been struggling for a few hours trying to work out why, with no luck.
Does anyone have any ideas or anything to try?

Edited to add console output:

[30-Sep-2015 13:51:59 America/New_York] dom:
[30-Sep-2015 13:51:59 America/New_York] setting target url
You should check whether the HTML was loaded into the DOM. You can use a debugger, logging, or var_dump() for that.
var_dump($dom->saveXML());
If it wasn't loaded into the DOM, take a step back and validate the HTML fetched by the scraper.
var_dump($html);
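Both checks can be sketched together. This is a minimal example, assuming libxml's internal error buffering is used to see what the parser complained about. The $html string is a made-up stand-in for the scraper output. The serialized markup is passed to error_log() because a DOMDocument object cannot be converted to a string.

```php
<?php
// Hypothetical stand-in for the markup returned by the scraper.
$html = '<html><body><a class="foo_bar" href="http://example.com">example</a></body></html>';

// Buffer libxml parse errors instead of emitting them as PHP warnings.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$loaded = $dom->loadHTML($html);

// Log each parser message (there are none for clean markup).
foreach (libxml_get_errors() as $error) {
    error_log(trim($error->message) . " (line {$error->line})");
}
libxml_clear_errors();

// Serialize the document first, then log the resulting string.
$serialized = $dom->saveHTML();
error_log("loaded: " . var_export($loaded, true));
error_log("dom: " . $serialized);
```

If the log shows the serialized markup, the document was loaded and the problem is in the XPath expression rather than in loadHTML().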
If the HTML was loaded into the DOM, you still need to fix the XPath. href is an attribute node, not a child element, so the last step needs the @ prefix:
//*[@class='foo_bar']/@href
You seem to want to read it as a string value, so cast it:
string(//*[@class='foo_bar']/@href)
That works only with DOMXPath::evaluate(); DOMXPath::query() can only return node lists.
$target_url = $xpath->evaluate("string(//*[@class='foo_bar']/@href)");

A small example:

$document = new DOMDocument();
$document->loadHTML('<a href="http://example.com">example</a>');
$xpath = new DOMXPath($document);
var_dump($xpath->evaluate('string(//a[1]/@href)'));

Output:
string(18) "http://example.com"
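To make the difference between the two methods concrete, here is a short sketch that reads the same attribute both ways. The markup and the class name no_such_class are invented for illustration; only the foo_bar expression comes from the question.

```php
<?php
$document = new DOMDocument();
$document->loadHTML('<a class="foo_bar" href="http://example.com">example</a>');
$xpath = new DOMXPath($document);

// query() returns a DOMNodeList, so the value is read from the attribute node.
$nodes = $xpath->query("//*[@class='foo_bar']/@href");
$fromQuery = $nodes->length > 0 ? $nodes->item(0)->nodeValue : '';

// evaluate() can return the string result of the expression directly.
$fromEvaluate = $xpath->evaluate("string(//*[@class='foo_bar']/@href)");

// An expression that matches nothing yields an empty string, not an error.
$missing = $xpath->evaluate("string(//*[@class='no_such_class']/@href)");

var_dump($fromQuery, $fromEvaluate, $missing);
```

With query() you have to check the list length yourself before reading a node, while string() in evaluate() falls back to an empty string when nothing matches.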