php - simple_html_dom - read html page, two arrays
This is my entire code:
// include the scraper
include('simple_html_dom.php');

// connect to the page to scrape
$html = file_get_html('http://www.niagarafallsreview.ca/news/local');

// make empty arrays
$headlines = array();
$links = array();

// for each 'h1' heading on the page
foreach ($html->find('h1') as $header) {
    $headlines[] = $header->plaintext;
}

// for each 'a' link that starts with 'http://www.niagarafallsreview.ca/2016/04/'
foreach ($html->find('a[href^="http://www.niagarafallsreview.ca/2016/04/"]') as $link) {
    $links[] = $link->href;
}

// trim the headlines, because the one on top and the one on the bottom are not needed
$output = array_slice($headlines, 1, -1);

// for each header, output a nice list of headers
foreach ($output as $headers) {
    echo "<a href='#'>$headers</a>" . "<br />";
}

// make sure the links are unique, with no doubles
$result = array_unique($links);

// for each link, output it in a nice list
foreach ($result as $linkk) {
    echo "<a href='$linkk'>$linkk</a>" . "<br />";
}
This code produces the headings in a nice list, and it also produces a nice list of the links.
My problem is that I need to combine them, so that $headers is the text of the link and $linkk is its href,
like this:
<a href='$linkk'>$headers</a>
I don't know how to do this with two foreach statements. I tried to combine them but was unsuccessful.
Any help is appreciated.
Thanks.
Try this:
// include the scraper
include('simple_html_dom.php');

// connect to the page to scrape
$html = file_get_html('http://www.niagarafallsreview.ca/news/local');

// make empty arrays
$headlines = array();
$links = array();

// for each 'h1' heading on the page
foreach ($html->find('h1') as $header) {
    $headlines[] = $header->plaintext;
}

// for each 'a' link that starts with 'http://www.niagarafallsreview.ca/2016/04/'
foreach ($html->find('a[href^="http://www.niagarafallsreview.ca/2016/04/"]') as $link) {
    $links[] = $link->href;
}

// trim the headlines, because the one on top and the one on the bottom are not needed
$output = array_slice($headlines, 1, -1);

// make sure the links are unique, with no doubles
$result = array_unique($links);

// for each link, output it with the headline at the same index as its text
foreach ($result as $i => $linkk) {
    $headline = isset($output[$i]) ? $output[$i] : '(empty)';
    echo "<a href='$linkk'>$headline</a>" . "<br />";
}
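One thing to watch when pairing the two arrays by index: array_unique() keeps the original keys of the entries it retains, while array_slice() re-indexes from 0. If the page repeats a link, the keys of the unique links will have gaps and may no longer line up with the headlines. Below is a minimal sketch of one way to handle that, assuming the n-th unique link belongs to the n-th headline (which depends on the page layout and is not guaranteed). The helper name pair_links_with_headlines() and the sample data are hypothetical, made up for illustration; they are not part of simple_html_dom.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): pairs each unique link
// with the headline at the same position.
function pair_links_with_headlines(array $headlines, array $links): array
{
    // array_unique() preserves the original keys, so re-index to 0, 1, 2, ...
    $uniqueLinks = array_values(array_unique($links));

    $pairs = array();
    foreach ($uniqueLinks as $i => $href) {
        // Fall back to a placeholder when there are more links than headlines.
        $text = isset($headlines[$i]) ? $headlines[$i] : '(empty)';
        $pairs[] = array('href' => $href, 'text' => $text);
    }
    return $pairs;
}

// Sample data standing in for the scraped arrays (assumed, not real output).
$headlines = array('First story', 'Second story');
$links = array('http://example.com/a', 'http://example.com/a', 'http://example.com/b');

foreach (pair_links_with_headlines($headlines, $links) as $pair) {
    // Escape both values before placing them into HTML.
    $href = htmlspecialchars($pair['href'], ENT_QUOTES);
    $text = htmlspecialchars($pair['text'], ENT_QUOTES);
    echo "<a href='$href'>$text</a><br />\n";
}
```

In the scraper above, the same idea would be to pass $output and $links to the helper, or simply to wrap the array_unique() call in array_values() before the final foreach.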