How can user agents and proxies be used to prevent detection when scraping content from external websites in PHP?
User agents and proxies can help avoid detection when scraping content from external websites in PHP by mimicking a real user's behavior. Setting a user agent header on your HTTP requests makes them appear to come from a regular browser rather than a script. Routing requests through proxies lets you rotate IP addresses, which helps avoid blocks from websites that rate-limit or ban based on IP. (Keep in mind that you should still respect a site's terms of service and robots.txt.)
<?php
// Target URL and a desktop-browser user agent string to mimic a real browser.
$url = 'https://www.example.com';
$userAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3';
// Proxy in host:port form; the target site sees the proxy's IP, not yours.
$proxy = 'proxy.example.com:8080';

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_PROXY, $proxy);
curl_setopt($ch, CURLOPT_TIMEOUT, 30);          // avoid hanging on an unresponsive proxy

$response = curl_exec($ch);
if ($response === false) {
    echo 'Error: ' . curl_error($ch);
} else {
    echo $response;
}
curl_close($ch);
?>
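To actually rotate IP addresses as mentioned above, you can keep pools of user agents and proxies and pick one at random for each request, so successive requests present different identities. A minimal sketch, assuming you have a list of working proxies (the `proxy1.example.com` / `proxy2.example.com` entries below are placeholders, not real servers):

```php
<?php
// Pools to rotate through; the entries here are illustrative placeholders.
$userAgents = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15',
];
$proxies = [
    'proxy1.example.com:8080',
    'proxy2.example.com:8080',
];

// Pick one random user agent and one random proxy from the pools.
function randomIdentity(array $userAgents, array $proxies): array {
    return [
        $userAgents[array_rand($userAgents)],
        $proxies[array_rand($proxies)],
    ];
}

// Fetch a URL using a freshly picked identity; returns the body, or false on failure.
function fetchWithRotation(string $url, array $userAgents, array $proxies) {
    [$userAgent, $proxy] = randomIdentity($userAgents, $proxies);
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_PROXY, $proxy);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    $response = curl_exec($ch);
    curl_close($ch);
    return $response;
}
?>
```

Calling `fetchWithRotation()` once per page means each request may leave through a different proxy with a different user agent, which is the rotation behavior described above.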