I want to retrieve all the content in a div element in HTML. I am using PHP and XPath to do it. Here is the query:

    $doc = new DOMDocument();
    $doc->loadHTMLFile($uri);
    $xpath = new DOMXPath($doc);
    $text_content = $xpath->query("/html/body/div[5]/div[1]/div[1]/div[1]/div[2]/div[5]/*");

I used the wildcard "*" to retrieve all elements (normal text, div, img, p, etc.) under this div (div[5]), but when I printed $text_content, I found that it only stored the div elements.
What is the correct way to do so?
Thanks in advance.
asked Jul 30, 2013 at 21:14
James Zhao
.../* will only retrieve the elements that are immediate children of that final div[5] in the XPath query, e.g.:

    ... rest of document ...
    <div>
        <p>
            <span>hello there</span>
        </p>
    </div>

In this simplified example, your query will retrieve the <p>, because it's an immediate descendant of the <div>.
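
The child-only behaviour of a trailing /* can be checked with a minimal sketch like the one below. The inline HTML string is a made-up stand-in for the real page, and it assumes PHP with the DOM extension enabled.

```php
<?php
// Hypothetical sample markup, not the asker's real page.
$html = '<html><body><div><p><span>hello there</span></p></div></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// A trailing "/*" selects only the element children of the div.
$children = $xpath->query('/html/body/div[1]/*');

// Collect the tag names of the matched nodes.
$names = [];
foreach ($children as $node) {
    $names[] = $node->nodeName;
}

echo implode(', ', $names), PHP_EOL; // prints "p"
```

Only the p is matched here; the nested span is one level too deep for the child step.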
To get all the descendants, regardless of level, you'd want

    .../div[5]//*
              ^^---note doubled slashes

// is shorthand for /descendant-or-self::node()/, so this query would also return the span.
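
As a rough illustration of the doubled slash, the sketch below runs the abbreviated query and its expanded form against a small made-up HTML string and compares the results. The helper name nodeNames is hypothetical, and the code assumes PHP 7 or later with the DOM extension.

```php
<?php
// Hypothetical sample markup, not the asker's real page.
$html = '<html><body><div><p><span>hello there</span></p></div></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Hypothetical helper: run a query and return the matched tag names.
function nodeNames(DOMXPath $xpath, string $query): array
{
    $names = [];
    foreach ($xpath->query($query) as $node) {
        $names[] = $node->nodeName;
    }
    return $names;
}

// Abbreviated form: every element below the div, at any depth.
$short = nodeNames($xpath, '/html/body/div[1]//*');

// Expanded form of the same query.
$long = nodeNames($xpath, '/html/body/div[1]/descendant-or-self::node()/*');

echo implode(', ', $short), PHP_EOL; // prints "p, span"
echo implode(', ', $long), PHP_EOL;  // prints "p, span"
```

Both queries return the p and the span in document order. Note that * matches element nodes only, so plain text inside the div would still have to be read separately, for example through each node's textContent property.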
answered Jul 30, 2013 at 21:32
Marc B