PHP Simple HTML DOM Parser
A fast, simple and reliable HTML document parser for PHP.
Created by S.C. Chen, based on HTML Parser for PHP 4 by Jose Solorzano.
Parse any HTML document
PHP Simple HTML DOM Parser handles any HTML document, even ones that are considered invalid by the HTML specification.
Select elements using CSS selectors
PHP Simple HTML DOM Parser supports CSS style selectors to navigate the DOM, similar to jQuery.
Download
- Download the latest version from SourceForge
Contributing
- Request features on the Feature Request Tracker
- Report bugs on the Bug Tracker
- Get involved with the community on the Discussions Board
License
PHP Simple HTML DOM Parser is Free Software licensed under the MIT License.
Index
- Quick Start
- How to create HTML DOM object?
- How to find HTML elements?
- How to access the HTML element's attributes?
- How to traverse the DOM tree?
- How to dump contents of DOM object?
- How to customize the parsing behavior?
- API Reference
- FAQ
Quick Start
- Get HTML elements
- Modify HTML elements
- Extract contents from HTML
- Scraping Slashdot!
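// Get HTML elements: create a DOM object from a URL or file
$html = file_get_html('http://www.google.com/');

// Find all images
foreach ($html->find('img') as $element)
    echo $element->src . '<br>';

// Find all links
foreach ($html->find('a') as $element)
    echo $element->href . '<br>';

// Modify HTML elements: create a DOM object from a string
$html = str_get_html('<div id="hello">Hello</div>');

$html->find('div[id=hello]', 0)->innertext = 'foo';

echo $html;

// Extract contents from HTML: dump the page as plain text
echo file_get_html('http://www.google.com/')->plaintext;

// Scraping Slashdot: create a DOM object from a URL
$html = file_get_html('http://slashdot.org/');

// Find all article blocks
foreach ($html->find('div.article') as $article) {
    $item['title']   = $article->find('div.title', 0)->plaintext;
    $item['intro']   = $article->find('div.intro', 0)->plaintext;
    $item['details'] = $article->find('div.details', 0)->plaintext;
    $articles[] = $item;
}

print_r($articles);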
How to create HTML DOM object?
- Quick way
- Object-oriented way
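// Quick way: create a DOM object from a string
$html = str_get_html('Hello!');
// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');
// Create a DOM object from an HTML file
$html = file_get_html('test.htm');

// Object-oriented way: create a DOM object
$html = new simple_html_dom();
// Load HTML from a string
$html->load('Hello!');
// Load HTML from a URL
$html->load_file('http://www.google.com/');
// Load HTML from an HTML file
$html->load_file('test.htm');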
How to find HTML elements?
- Basics
- Advanced
- Descendant selectors
- Nested selectors
- Attribute Filters
- Text & Comments
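// Find all anchors; returns an array of element objects
$ret = $html->find('a');
// Find the first anchor (the index is zero-based); returns an element object
$ret = $html->find('a', 0);
// Find the last anchor; returns an element object
$ret = $html->find('a', -1);
// Find all <div> elements that have an id attribute
$ret = $html->find('div[id]');
// Find all <div> elements whose id attribute is "foo"
$ret = $html->find('div[id=foo]');

// Find all elements whose id is "foo"
$ret = $html->find('#foo');
// Find all elements whose class is "foo"
$ret = $html->find('.foo');
// Find all elements that have an id attribute
$ret = $html->find('*[id]');
// Find all anchors and images
$ret = $html->find('a, img');
// Find all anchors and images that have a title attribute
$ret = $html->find('a[title], img[title]');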
Supports these operators in attribute selectors:
[attribute] | Matches elements that have the specified attribute. |
[!attribute] | Matches elements that don't have the specified attribute. |
[attribute=value] | Matches elements whose specified attribute has exactly the given value. |
[attribute!=value] | Matches elements whose specified attribute does not have the given value. |
[attribute^=value] | Matches elements whose specified attribute starts with the given value. |
[attribute$=value] | Matches elements whose specified attribute ends with the given value. |
[attribute*=value] | Matches elements whose specified attribute contains the given value. |
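The operators in the table can be tried out with a short sketch like the one below. The markup, the variable names, and the location of simple_html_dom.php are assumptions made for illustration only; they are not part of the library's documentation.

```php
<?php
// Assumption: simple_html_dom.php sits next to this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical markup, used only to exercise the attribute operators.
$html = str_get_html(
    '<a href="http://example.com/a.pdf" title="doc">A</a>' .
    '<a href="/local/b.html">B</a>' .
    '<a name="anchor">C</a>'
);

$withHref    = $html->find('a[href]');        // anchors that have an href attribute
$withoutHref = $html->find('a[!href]');       // anchors that have no href attribute
$absolute    = $html->find('a[href^=http]');  // href starts with "http"
$pdfs        = $html->find('a[href$=pdf]');   // href ends with "pdf"
$local       = $html->find('a[href*=local]'); // href contains "local"

// Print the href of every anchor matched by the "ends with" filter.
foreach ($pdfs as $a) {
    echo $a->href . "\n";
}
```

Each call returns an array of element objects, so count() on the result is a quick way to check how many elements a filter matched.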
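// Find all <li> elements inside <ul> elements
$es = $html->find('ul li');
// Find <div> elements nested three levels deep
$es = $html->find('div div div');
// Find all <td> elements inside <table> elements whose class is "hello"
$es = $html->find('table.hello td');
// Find all <td> elements with align="center" inside <table> elements
$es = $html->find('table td[align=center]');

// Find all <li> elements in each <ul>
foreach ($html->find('ul') as $ul)
{
    foreach ($ul->find('li') as $li)
    {
        // do something with $li
    }
}
// Find the first <li> in the first <ul>
$e = $html->find('ul', 0)->find('li', 0);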
How to access the HTML element's attributes?
- Get, Set and Remove attributes
- Magic attributes
- Tips
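// Get an attribute
$value = $e->href;
// Set an attribute
$e->href = 'my link';
// Remove an attribute by setting its value to null
$e->href = null;
// Determine whether an attribute exists
if (isset($e->href))
    echo 'href exists!';

// Magic attributes example
$html = str_get_html('<div>foo <b>bar</b></div>');
$e = $html->find('div', 0);
echo $e->tag;       // div
echo $e->outertext; // <div>foo <b>bar</b></div>
echo $e->innertext; // foo <b>bar</b>
echo $e->plaintext; // foo bar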
$e->tag | Read or write the tag name of the element. |
$e->outertext | Read or write the outer HTML text of the element. |
$e->innertext | Read or write the inner HTML text of the element. |
$e->plaintext | Read or write the plain text of the element. |
echo $html->plaintext;
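// Wrap an element (the markup in these tips is illustrative)
$e->outertext = '<div class="wrap">' . $e->outertext . '</div>';
// Remove an element by setting its outertext to an empty string
$e->outertext = '';
// Append an element
$e->outertext = $e->outertext . '<div>foo</div>';
// Insert an element
$e->outertext = '<div>foo</div>' . $e->outertext;
How to traverse the DOM tree?
// Example
echo $html->find('#div1', 0)->children(1)->children(1)->children(2)->id;
// The same traversal using the camel-case method names
echo $html->getElementById('div1')->childNodes(1)->childNodes(1)->childNodes(2)->getAttribute('id');
You can also call methods using camel-case naming conventions.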
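Method | Description |
mixed $e->children([int $index]) | Returns the Nth child object if index is set; otherwise returns an array of children. |
element $e->parent() | Returns the parent of the element. |
element $e->first_child() | Returns the first child of the element, or null if not found. |
element $e->last_child() | Returns the last child of the element, or null if not found. |
element $e->next_sibling() | Returns the next sibling of the element, or null if not found. |
element $e->prev_sibling() | Returns the previous sibling of the element, or null if not found. |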
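As a rough illustration of the traversal methods described above, the sketch below walks a small list. The markup, the variable names, and the include path are assumptions chosen for the example.

```php
<?php
// Assumption: simple_html_dom.php sits next to this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical markup used only for this walk-through.
$html = str_get_html('<ul id="menu"><li>one</li><li>two</li><li>three</li></ul>');

$ul     = $html->find('ul', 0);     // the list element
$first  = $ul->first_child();       // first <li>
$second = $first->next_sibling();   // <li> that follows the first one
$last   = $ul->last_child();        // last <li>

echo $first->plaintext . "\n";      // text of the first item
echo $second->plaintext . "\n";     // text of the second item
echo $last->plaintext . "\n";       // text of the last item
echo $second->parent()->id . "\n";  // id of the enclosing <ul>
echo count($ul->children()) . "\n"; // number of child elements

// The first child has no previous sibling, so null comes back.
var_dump($first->prev_sibling());
```

Calling children() without an index returns the whole array of child elements, which is why count() works on it directly.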
How to dump contents of DOM object?
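A minimal sketch of dumping a DOM object back to HTML, assuming the library's save() method and its string conversion (the same conversion that echo $html relies on elsewhere on this page). The sample markup, the include path, and the output path are hypothetical.

```php
<?php
// Assumption: simple_html_dom.php sits next to this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical markup for the demonstration.
$html = str_get_html('<p>Hello</p>');

// Dump the internal DOM tree back into a string.
$str = $html->save();

// Casting the object to a string gives the same dump.
$same = (string) $html;

// Dump the DOM tree to a file (hypothetical location in the temp directory).
$path = sys_get_temp_dir() . '/result.htm';
$html->save($path);

echo $str;
```

Passing a file path to save() writes the dump to that file and still returns it as a string, so both forms can be combined in one call.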
How to customize the parsing behavior?
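// Write a callback function that takes an $element parameter
function my_callback($element) {
    // Hide all <b> tags
    if ($element->tag == 'b')
        $element->outertext = '';
}
// Register the callback function by its name
$html->set_callback('my_callback');
// The callback is invoked while the DOM is being dumped
echo $html;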