PHP - loadHTML returning empty, HTML is fine


I'm trying to grab the href value of an element using PHP, and I'm having trouble. Here's a snippet of the code.

    <?php
    ini_set("log_errors", 1);
    ini_set("error_log", "php-error.log");

    $target_url = "http://foo.bar";
    $request = $target_url;
    $html = $this->scraper($request);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // error point - $dom empty
    error_log("dom:");
    error_log($dom);

    $xpath = new DOMXPath($dom);
    error_log("setting target url");
    $target_url = $xpath->query("//*[@class='foo_bar']/href");
    ?>

Logging $html results in the standard, full HTML output of the page. A search shows the XPath should work. However, when I try to log $dom after loadHTML, I get a blank result. I've been struggling for a few hours trying to work out why, with no luck.

Does anyone have any ideas or anything to try?

Edited to add console output:

    [30-Sep-2015 13:51:59 America/New_York] dom:
    [30-Sep-2015 13:51:59 America/New_York] setting target url

You should check whether the HTML was loaded into the DOM. You can use a debugger, logging, or var_dump() for that.

var_dump($dom->saveXML());

If it wasn't loaded into the DOM, take a step back and validate the HTML fetched by the scraper.

var_dump($html);
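
Both checks can be combined into one step, sketched below. This is only an illustration under assumptions: the `$html` literal stands in for the scraper output, and the `loadAndReport()` helper name is hypothetical and not part of the original code. It uses libxml's internal error buffer so that parser messages can be inspected, and it serializes the document with saveHTML() before logging, because a DOMDocument object cannot be passed to error_log() as a string.

```php
<?php
// Hypothetical helper: load HTML and return the document plus any parser messages.
function loadAndReport(string $html): array
{
    // Collect libxml parse problems instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // Guard against an empty string, which loadHTML() rejects.
    $loaded = ($html !== '') && $dom->loadHTML($html);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return ['loaded' => $loaded, 'dom' => $dom, 'messages' => $messages];
}

// Stand-in for the scraper output (assumption for illustration).
$html = '<html><body><a class="foo_bar" href="http://example.com">example</a></body></html>';
$result = loadAndReport($html);

// Serialize the document first; the object itself is not a loggable string.
$serialized = $result['dom']->saveHTML();
error_log('dom: ' . $serialized);
```

If `loaded` is false or `messages` is not empty, the problem most likely lies in the fetched HTML rather than in the XPath expression.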

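If the HTML is loaded into the DOM, you still need to fix the XPath. The step /href selects a child element named href, but href is an attribute node, which is addressed with @: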

//*[@class='foo_bar']/@href

You seem to want to read it as a string value, so cast it:

string(//*[@class='foo_bar']/@href)

That works only with DOMXPath::evaluate(); DOMXPath::query() can only return node lists.

$target_url = $xpath->evaluate("string(//*[@class='foo_bar']/@href)"); 

A small example:

    $document = new DOMDocument();
    $document->loadHTML('<a href="http://example.com">example</a>');
    $xpath = new DOMXPath($document);
    var_dump($xpath->evaluate('string(//a[1]/@href)'));

Output:

string(18) "http://example.com" 
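
For comparison, the sketch below runs the same lookup through both methods. The markup and the `foo_bar` and `no_such_class` class values are made up for illustration. It shows that query() hands back a node list from which the attribute value has to be read, while evaluate() with string() returns the value directly.

```php
<?php
$document = new DOMDocument();
$document->loadHTML('<div><a class="foo_bar" href="http://example.com">example</a></div>');
$xpath = new DOMXPath($document);

// query() returns a DOMNodeList; each item here is a DOMAttr node.
$nodes = $xpath->query("//*[@class='foo_bar']/@href");
$fromQuery = $nodes->length > 0 ? $nodes->item(0)->nodeValue : null;

// evaluate() can return a scalar when the expression is wrapped in string().
$fromEvaluate = $xpath->evaluate("string(//*[@class='foo_bar']/@href)");

// string() of an empty node set is an empty string, so a miss yields "".
$missing = $xpath->evaluate("string(//*[@class='no_such_class']/@href)");
```

Because a failed match yields an empty string rather than an error, check for `''` before using the result as a URL.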
