Sunday, April 10, 2011

PHP cURL by Example

cURL is a computer software project providing a library and command-line tool for transferring data using various protocols. The cURL project produces two products, libcurl and cURL. It was first released in 1997.

In PHP, cURL is mostly used to fetch the contents of a web page and parse it to extract data; another common use is fetching data from a web service. This raises a question: is cURL the only way of fetching data? The simple answer is no. There are other ways to do it, for example:
  • you can use the file_get_contents() function.
  • or you can open a socket using fsockopen().
  • you can also use fopen().
  • or, if your resource is XML or XHTML, you can use simplexml_load_file().
and many more. So why cURL? Because it provides the richest set of options. Let's see it in an example:
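 // Fetch $url and return the response body as a string, or false on failure.
 function getCURL($url){
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_USERAGENT, getRandomUA()); // send a randomly chosen User-Agent
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_FAILONERROR, true);        // treat HTTP status codes >= 400 as failures
  curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);     // follow redirects
  curl_setopt($ch, CURLOPT_AUTOREFERER, true);        // set the Referer header when following a redirect
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // return the body instead of printing it
  curl_setopt($ch, CURLOPT_TIMEOUT, 10);              // give up after 10 seconds
  $html = curl_exec($ch);                             // false if the transfer failed
  return $html;
 }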
    function getRandomUA(){
        $arr=array();
        $arr[]="HTC_Touch_3G Mozilla/4.0 (compatible; MSIE 6.0; Windows CE; IEMobile 7.11)";
        $arr[]="BlackBerry9700/5.0.0.862 Profile/MIDP-2.1 Configuration/CLDC-1.1 VendorID/331 UNTRUSTED/1.0 3gpp-gba";
        $arr[]="Opera/9.80 (J2ME/MIDP; Opera Mini/9.80 (J2ME/23.377; U; en) Presto/2.5.25 Version/10.54";
        $arr[]="Mozilla/5.0 (Windows; U; Windows NT 6.1; sv-SE) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4";
        $arr[]="Mozilla/4.0 (iPhone; U; CPU iPhone OS 4_2_1 like Mac OS X; fi-fi) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8C148a Safari/6533.18.5";
        $arr[]="Mozilla/6.0 (X11; U; Linux i686; en-US; rv:1.9a3pre) Gecko/20070330";
        $arr[]="Mozilla/3.0 (X11; Linux i686; rv:2.0b12pre) Gecko/20110204 SeaMonkey/2.1b3pre";
        $arr[]="Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.127 Safari/533.4";
        $arr[]="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.4) Gecko/20060612 Firefox/1.5.0.4 Flock/0.7.0.17.1";
        $arr[]="Mozilla/5.0 (Linux; U; Android 1.1; en-gb; dream) AppleWebKit/525.10+ (KHTML, like Gecko) Version/3.0.4 Mobile Safari/523.12.2";
        $arr[]="Mozilla/3.0 (x86 [en] Windows NT 5.1; Sun)";
        $arr[]="Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.3; Creative AutoUpdate v1.40.02)";
        $k=rand(0, count($arr)-1);
        return $arr[$k];
    }
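Since the fetched page is usually parsed afterwards to extract data, here is a minimal sketch of that step using PHP's DOM extension. The helper name extractLinks() and the sample markup are assumptions made for illustration; they are not part of the two functions above.

```php
<?php
// Hypothetical helper: collect the href of every <a> element in an HTML string.
function extractLinks($html) {
    $links = array();
    if (!is_string($html) || trim($html) === '') {
        return $links; // nothing to parse, e.g. the fetch returned false
    }

    $doc = new DOMDocument();
    // Real pages are rarely valid HTML, so collect parser warnings quietly
    // instead of printing them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        if ($href !== '') {
            $links[] = $href;
        }
    }
    return $links;
}

// Sample markup standing in for a page returned by getCURL().
$sample = '<html><body><a href="/one">One</a> <a href="http://example.com/two">Two</a></body></html>';
print_r(extractLinks($sample));
```

In practice you would pass the return value of getCURL($url) instead of $sample; because getCURL() can return false, the helper treats anything that is not a non-empty string as "no links".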
There is a reason we use the second function, getRandomUA(). Some websites block requests that carry PHP's default headers, and even with a browser-like header a site may still block you if you send a large number of requests with the same one. Picking a User-Agent at random for each request helps to avoid this.
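To find out whether a request was refused by the server or simply failed, cURL can report the HTTP status code and its own error message. The sketch below is one possible way to do that. fetchWithStatus() is a hypothetical name, and it takes the User-Agent as a parameter so that the example stays self-contained (you could pass getRandomUA() there). The test URL assumes that nothing is listening on port 1 of the local machine.

```php
<?php
// Hypothetical helper: fetch a URL and report what happened,
// instead of returning only the body.
function fetchWithStatus($url, $userAgent, $timeout = 10) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);

    $body = curl_exec($ch); // false when the transfer itself failed

    return array(
        'body'   => $body,
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE), // 0 if no HTTP response arrived
        'errno'  => curl_errno($ch),                       // 0 means no cURL-level error
        'error'  => curl_error($ch),                       // readable message, '' if none
    );
}

// Assumption: port 1 on localhost is closed, so this request fails quickly.
$result = fetchWithStatus('http://127.0.0.1:1/', 'ExampleAgent/1.0', 5);

if ($result['body'] === false) {
    echo "Request failed: " . $result['error'] . "\n";
} elseif ($result['status'] == 403 || $result['status'] == 429) {
    echo "The server refused the request; it may be blocking this client.\n";
} else {
    echo "Fetched " . strlen($result['body']) . " bytes\n";
}
```

Unlike getCURL(), this sketch leaves CURLOPT_FAILONERROR off, so a 403 or 429 response still comes back with its body and status code and can be told apart from a network error.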

Tutorial Jinni Copyright © 2015 WoodMag is Modified by Originative Systems