PHP Classes

PHP Google Crawler: Perform Google searches and get the result URLs

Version: php-google-crawler 1.0.0
License: GNU General Publi...
PHP version: 5
Categories: PHP 5, Searching, Web services
Description

This package can perform Google searches and get the result URLs.

It can send HTTP requests to the Google search Web servers to perform searches for given keywords.

The package can parse the results and extract the URLs of the search result links.
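
The URL extraction step described above could be sketched roughly as follows. This is not the package's actual implementation: the function name extractResultUrls(), the sample markup, and the assumption that result links appear either as absolute URLs or wrapped as /url?q=<target> are all illustrative.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not part of the package): pull result URLs out of
// a search result page. Assumes links are either absolute http(s) URLs or
// redirect-style links of the form /url?q=<target>&...
function extractResultUrls(string $html): array
{
    if ($html === '') {
        return []; // DOMDocument::loadHTML() rejects empty input
    }

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');

        // Unwrap redirect-style links: take the "q" query parameter.
        if (strpos($href, '/url?') === 0) {
            parse_str(substr($href, strlen('/url?')), $params);
            $href = (isset($params['q']) && is_string($params['q'])) ? $params['q'] : '';
        }

        // Keep absolute http(s) URLs only; the array key removes duplicates.
        if (preg_match('#^https?://#i', $href) === 1) {
            $urls[$href] = $href;
        }
    }

    return array_values($urls);
}

// Made-up sample markup, used only to exercise the helper.
$sample = '<html><body>'
    . '<a href="/url?q=https://example.com/contact&amp;sa=U">Contact</a>'
    . '<a href="https://example.org/">Direct</a>'
    . '<a href="/search?q=next">Next page</a>'
    . '<a href="/url?q=https://example.com/contact&amp;sa=U">Duplicate</a>'
    . '</body></html>';

print_r(extractResultUrls($sample));
```

Keying the result array by URL is the same de-duplication trick the example script uses for e-mail addresses.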

Author: Juraj Puchký (Czech Republic)

Example

<?php

require_once 'implementation/Google.php';

use App\ContactCrawler\Google;
use App\ContactCrawler\Search;

// Fetch a page and return the e-mail addresses found in its mailto: links.
function collectEmail($url) {
    $s = new Search();
    $data = $s->getData($url);

    $emails = [];
    if (!is_string($data) || $data === '') {
        // Nothing to parse; DOMDocument::loadHTML() rejects empty input.
        return $emails;
    }

    // Collect libxml parse warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($data);
    libxml_clear_errors();

    foreach ($dom->getElementsByTagName('a') as $r) {
        $href = $r->getAttribute('href');
        // Accept only hrefs that start with "mailto:" (case-insensitive).
        if (stripos($href, 'mailto:') === 0) {
            $emails[] = substr($href, strlen('mailto:'));
        }
    }
    return $emails;
}

// Fetch a page and return the href of every link on it (subpages).
function collectUrls($url) {
    $s = new Search();
    $data = $s->getData($url);

    $urls = [];
    if (!is_string($data) || $data === '') {
        return $urls;
    }

    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($data);
    libxml_clear_errors();

    foreach ($dom->getElementsByTagName('a') as $r) {
        $href = $r->getAttribute('href');
        // Key by href so each link is stored only once.
        $urls[$href] = $href;
    }
    return $urls;
}

if (isset($argv[1])) {
    $google = new Google();
    $urls = $google->search($argv[1], "cs", 10000);

    $emails = [];
    foreach ($urls as $url) {
        $pages = collectUrls($url);
        foreach ($pages as $page) {
            $ems = collectEmail($page);
            foreach ($ems as $email) {
                // Print each address as soon as it is found.
                echo $email . "\n";
                $emails[$email] = $email;
            }
        }
    }

    // Print the de-duplicated list once crawling has finished.
    foreach ($emails as $email) {
        echo "$email\n";
    }
} else {
    echo "Usage: php crawler.php <keyword>\n";
}
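
The collectEmail() function in the example keeps whatever follows "mailto:" in a link. Such links can also carry a query part (for example ?subject=...), several comma-separated recipients, or text that is not an address at all. A minimal sketch of a stricter clean-up step is shown below; the helper name emailsFromMailto() is hypothetical and not part of the package.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not part of the package): turn one href value into
// a list of syntactically valid e-mail addresses.
function emailsFromMailto(string $href): array
{
    // Only handle hrefs that start with "mailto:" (case-insensitive).
    if (stripos($href, 'mailto:') !== 0) {
        return [];
    }

    // Drop the scheme, then any "?subject=..." style query part.
    $rest = substr($href, strlen('mailto:'));
    $rest = explode('?', $rest, 2)[0];

    $result = [];
    // A mailto link may list several recipients separated by commas.
    foreach (explode(',', rawurldecode($rest)) as $candidate) {
        $candidate = trim($candidate);
        if (filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false) {
            $result[$candidate] = $candidate; // key removes duplicates
        }
    }

    return array_values($result);
}

print_r(emailsFromMailto('mailto:info@example.com?subject=Hello'));
print_r(emailsFromMailto('mailto:a@example.com,%20b@example.com'));
```

In the example script, a helper like this could stand in for the substring step inside collectEmail(), so that only validated addresses reach the final list.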


Details

contact-crawler

Simple crawler for collecting emails by keywords with Google.

Usage: php crawler.php "keyword"


Files (5)

File              Role      Description
implementation/   (2 files)
interfaces/       (1 file)
crawler.php       Example   Example script
README.md         Doc.      Documentation

Files (5) / implementation

File              Role      Description
Google.php        Class     Class source
Search.php        Class     Class source

Files (5) / interfaces

File              Role      Description
ISearch.php       Class     Class source
