Tuesday, July 14, 2009

EXTRACT ALL LINKS FROM A GIVEN HTML PAGE

The function below accepts an HTML page (for example, the string returned by curl_exec() when CURLOPT_RETURNTRANSFER is set) as input.
It searches the page, retrieves all the links, and returns them.
It uses XPath to find the links.
Try it.

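<?php
// Returns an array of the href values of all anchors inside <body>.
function get_link($html)
{
    $dom = new DOMDocument();
    // @ suppresses the warnings loadHTML() raises on malformed markup.
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    $hrefs = $xpath->evaluate("/html/body//a");
    $urls = array();
    for ($i = 0; $i < $hrefs->length; $i++) {
        $href = $hrefs->item($i);
        $url = $href->getAttribute('href');
        echo $url . "\n";
        $urls[] = $url;
    }
    // Return after the loop so every link is included, not just the first.
    return $urls;
}
?>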
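
To try the same XPath idea without fetching a page first, here is a minimal sketch that runs on a hard-coded HTML string. The function name get_unique_links and the $sample markup are made up for illustration and are not part of the original function. It assumes you only want anchors that actually have an href, and that duplicate links should be listed once.

```php
<?php
// Hypothetical helper (not from the original post): collects each distinct
// href from the anchors in an HTML string.
function get_unique_links($html)
{
    $dom = new DOMDocument();
    // Suppress warnings caused by malformed real-world markup.
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    // Select only <a> elements that carry an href attribute.
    $anchors = $xpath->query('//a[@href]');
    $urls = array();
    foreach ($anchors as $anchor) {
        $urls[] = $anchor->getAttribute('href');
    }
    // Drop duplicates and reindex the array from 0.
    return array_values(array_unique($urls));
}

// Sample markup standing in for the output of curl_exec().
$sample = '<html><body><a href="/about">About</a>'
        . '<a name="top">no href</a>'
        . '<a href="http://example.com/">Example</a>'
        . '<a href="/about">About again</a></body></html>';

print_r(get_unique_links($sample));
```

The anchor without an href is skipped by the [@href] predicate, and the repeated /about link is collapsed by array_unique(), so the sample should print two entries.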
