How to Set Up and Use cURL with a Proxy

Routing HTTP requests through proxies is fundamental to web scraping, API testing, and privacy-sensitive applications. This guide covers the command-line basics, then the practical setup in PHP and Python — the two languages we use most in production scraping work.

cURL proxy basics (command line)

The simplest proxy usage with cURL on the command line:

# HTTP proxy
curl -x http://proxy-host:port https://example.com

# With authentication
curl -x http://user:password@proxy-host:port https://example.com

# SOCKS5 proxy
curl --socks5 proxy-host:port https://example.com

# SOCKS5 with DNS resolution through the proxy
curl --socks5-hostname proxy-host:port https://example.com

The --socks5-hostname flag is important: it resolves DNS through the proxy rather than locally, which prevents DNS leaks that reveal your real location.
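Proxy credentials with special characters (a password containing @ or :) will break the user:password@host URL format shown above unless they are percent-encoded. A minimal sketch of a builder that handles this — build_proxy_url is a hypothetical helper name, not part of any library:

```python
from urllib.parse import quote

def build_proxy_url(scheme, host, port, user=None, password=''):
    """Build a proxy URL, percent-encoding the credentials.

    Hypothetical helper: characters like '@' or ':' in a password
    would otherwise confuse URL parsing.
    """
    if user is None:
        return f'{scheme}://{host}:{port}'
    return f'{scheme}://{quote(user, safe="")}:{quote(password, safe="")}@{host}:{port}'

print(build_proxy_url('socks5h', 'proxy-host', 1080, 'user', 'p@ss:w0rd'))
# socks5h://user:p%40ss%3Aw0rd@proxy-host:1080
```

The same encoded URL works on the cURL command line and in the proxies dict of Python's requests.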

PHP: curl_setopt proxy configuration

<?php
$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, 'https://example.com');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

// HTTP proxy
curl_setopt($ch, CURLOPT_PROXY, 'http://proxy-host:port');

// Proxy authentication
curl_setopt($ch, CURLOPT_PROXYUSERPWD, 'username:password');

// For SOCKS5
curl_setopt($ch, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5_HOSTNAME);

// Timeout
curl_setopt($ch, CURLOPT_TIMEOUT, 30);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);

$response = curl_exec($ch);

if (curl_errno($ch)) {
    echo 'Proxy error: ' . curl_error($ch);
}

curl_close($ch);

Use CURLPROXY_SOCKS5_HOSTNAME instead of CURLPROXY_SOCKS5 to resolve DNS through the proxy. Reaching for plain CURLPROXY_SOCKS5 is a common mistake: DNS is then resolved locally, which leaks your real location even though the traffic itself goes through the proxy.

Python: requests library with proxies

import requests

proxies = {
    'http': 'http://user:password@proxy-host:port',
    'https': 'http://user:password@proxy-host:port',
}

# Basic request through proxy
response = requests.get('https://example.com', proxies=proxies, timeout=30)

# SOCKS5 (requires pip install requests[socks])
socks_proxies = {
    'http': 'socks5h://user:password@proxy-host:port',
    'https': 'socks5h://user:password@proxy-host:port',
}

response = requests.get('https://example.com', proxies=socks_proxies)

Note the socks5h:// scheme — the h suffix means "resolve hostnames through the proxy" (equivalent to --socks5-hostname in cURL).
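When many requests go through the same proxy, you can set the proxies once on a requests Session instead of repeating the proxies= argument — the Session's proxies dict applies to every request made through it (the credentials and host below are placeholders):

```python
import requests

# A Session applies its proxies dict to every request made through it,
# and also reuses pooled connections to the proxy.
session = requests.Session()
session.proxies.update({
    'http': 'socks5h://user:password@proxy-host:1080',
    'https': 'socks5h://user:password@proxy-host:1080',
})

# All subsequent calls now route through the proxy:
# response = session.get('https://example.com', timeout=30)
```

A per-request proxies= argument still overrides the session-level setting when you need a one-off exception.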

Rotating proxies

For scraping at scale, a single proxy IP will get rate-limited or banned. Use a rotating proxy pool:

import random

proxy_list = [
    'http://user:pass@proxy1:port',
    'http://user:pass@proxy2:port',
    'http://user:pass@proxy3:port',
]

def get_with_rotation(url):
    proxy = random.choice(proxy_list)
    proxies = {'http': proxy, 'https': proxy}
    return requests.get(url, proxies=proxies, timeout=30)

In production, we use proxy services with built-in rotation (residential and datacenter pools) rather than managing individual IPs. Proxy selection, retry logic, and per-site reputation tracking are a significant part of any production scraping system.
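The retry logic mentioned above can be sketched roughly as follows — this is an illustrative pattern, not a library API; get_with_retries is a hypothetical name, and the fetch parameter is injectable so the rotation logic can be exercised without a live proxy:

```python
import random
import requests

def get_with_retries(url, proxy_list, max_attempts=3, fetch=requests.get):
    """Try up to max_attempts distinct proxies before giving up.

    Illustrative sketch: a proxy that fails is dropped from this call's
    candidate set, so each retry goes through a different exit.
    """
    candidates = list(proxy_list)
    last_error = None
    for _ in range(min(max_attempts, len(candidates))):
        proxy = random.choice(candidates)
        candidates.remove(proxy)  # don't retry the proxy that just failed
        try:
            return fetch(url, proxies={'http': proxy, 'https': proxy}, timeout=30)
        except requests.RequestException as exc:
            last_error = exc
    raise last_error if last_error else RuntimeError('no proxies to try')
```

In a real system you would also track failures per proxy across calls and back off or retire exits that fail repeatedly.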

Debugging proxy issues

  • Connection refused — proxy is down or wrong port. Test with curl -v -x proxy:port https://httpbin.org/ip
  • 407 Proxy Authentication Required — wrong credentials or auth format
  • SSL errors through proxy — some proxies don't handle the HTTPS CONNECT tunnel properly. Try a different proxy first; only adjust CURLOPT_SSL_VERIFYPEER as a last resort, since disabling certificate verification masks the problem and weakens security
  • Slow responses — geographic distance between proxy and target matters. Match proxy location to target server location when possible
  • IP leaks — always test with https://httpbin.org/ip through the proxy to confirm the exit IP is the proxy, not yours
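The leak test in the last bullet can be automated. A sketch, assuming httpbin.org/ip's documented JSON shape of {"origin": "<ip>"} — proxy_is_masking is an illustrative helper, and fetch_ip is injectable so the comparison logic can be tested without network access:

```python
import requests

def exit_ip(proxies=None, timeout=10):
    # https://httpbin.org/ip responds with {"origin": "<your exit IP>"}
    resp = requests.get('https://httpbin.org/ip', proxies=proxies, timeout=timeout)
    return resp.json()['origin']

def proxy_is_masking(proxy_url, fetch_ip=exit_ip):
    """Return True if the proxied exit IP differs from the direct one."""
    direct = fetch_ip()
    proxied = fetch_ip({'http': proxy_url, 'https': proxy_url})
    return direct != proxied
```

Run this once when adding a new proxy to a pool; if it returns False, the proxy is transparent (or misconfigured) and exposes your real IP.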

For production scraping infrastructure, see our data extraction services.

Need production-grade scraping infrastructure?

We've been building proxy-managed extraction pipelines since 2005.

Start a conversation