海运的博客

Python Curl/Pycurl添加DNS解析支持

发布时间:January 30, 2015 // 分类:Python // No Comments

Pycurl底层使用libcurl,请先确定Libcurl是否已支持异步DNS解析c-ares,如不支持可升级libcurl支持异步DNS解析c-ares
其实Libcurl更新支持为异步DNS如果已安装pycurl不用重新安装Pycurl,见https://www.haiyun.me/archives/1070.html
通过pip安装:

export PATH=/usr/local/curl/bin/:$PATH
export LD_LIBRARY_PATH="/usr/local/curl/lib/"
export PYCURL_SSL_LIBRARY=openssl
pip install pycurl

下载pycurl源码包安装:

wget http://pycurl.sourceforge.net/download/pycurl-7.19.5.1.tar.gz
tar zxvf pycurl-7.19.5.1.tar.gz 
cd pycurl-7.19.5.1
export LD_LIBRARY_PATH="/usr/local/curl/lib/"
python setup.py install --curl-config=/usr/local/curl/bin/curl-config --with-ssl

检查pycurl是否已支持异步DNS解析c-ares:

>>> import pycurl
>>> pycurl.version
'PycURL/7.19.5.1 libcurl/7.40.0 OpenSSL/1.0.1e zlib/1.2.7 c-ares/1.10.0'

安装pycurl后使用时遇到的一些错误:

pycurl.so: undefined symbol: CRYPTO_num_locks

原因:libcurl安装时--with-ssl支持

libcurl link-time ssl backend (openssl) is different from compile-time ssl backend (none/other)

原因:libcurl和pycurl编译时ssl后端不一致,调整见上和libcurl安装

pycurl: libcurl link-time version (7.19.7) is older than compile-time version (7.4.0)

原因:编译pycurl时使用的编译的libcurl动态库,不过现在pycurl现在加载的是系统自带的版本较旧的动态库,解决将编译的libcurl动态库添加到系统动态库,见ldconfig

CentOS编译安装libcurl/curl添加异步DNS解析c-ares

发布时间:January 30, 2015 // 分类: // No Comments

在使用curl异步并发请求时如果有大量域名解析会长时间阻塞程序IO,可以编译升级libcurl以支持异步DNS解析。
Centos7自带libcurl已支持异步DNS支持,不过是--enable-threaded-resolver,可以使用curl-config --configure查看curl编译参数。
查看Libcurl是否已支持异步DNS解析,包含AsynchDNS为支持:

curl --version
curl 7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.40.0 OpenSSL/1.0.1e zlib/1.2.3
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp 
Features: AsynchDNS IPv6 Largefile NTLM SSL libz 

首先安装异步DNS解析库c-ares:

yum install c-ares-devel openssl-devel

编译libcurl库:

wget http://curl.haxx.se/download/curl-7.40.0.tar.gz
tar zxvf curl-7.40.0.tar.gz 
cd curl-7.40.0/
./configure --enable-ares --prefix=/usr/local/curl --with-ssl
make && make install

查看编译安装的curl信息,已经支持了异步DNS解析库c-ares:

/usr/local/curl/bin/curl --version
curl 7.40.0 (x86_64-unknown-linux-gnu) libcurl/7.40.0 OpenSSL/1.0.1e zlib/1.2.7 c-ares/1.10.0
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp 
Features: AsynchDNS IPv6 Largefile NTLM NTLM_WB SSL libz UnixSockets 

将libcurl动态库添加到动态链接库:

echo '/usr/local/curl/lib' > /etc/ld.so.conf.d/libcurl.conf
ldconfig

使用libcurl库看是否支持c-ares:

#include <curl/curl.h>
int main()
{
 curl_version_info_data*info=curl_version_info(CURLVERSION_NOW);
 if (info->features&CURL_VERSION_ASYNCHDNS) {
   printf( "ares enabled\n");
 } else {
   printf( "ares NOT enabled\n");
 }
 return 0;
}

又一PHP libcurl封装异步并发HTTP客户端

发布时间:January 27, 2015 // 分类:PHP // No Comments

PHP标准库内置curl扩展,不过实现不完整,如multi_socket_action接口,无意中发现pecl http库同样基于libcurl封装,支持更多的libcurl特性,更新也比较快,底层通过libevent(epoll)实现multi_socket_action接口,不过pecl http版本1和版本2 api完全不兼容,使用过程中稳定性及性能并不如PHP内置的curl,好像还有内存泄露,以下为示例代码,基于pecl_http 2.20:

<?php
   function push($client, $url) {
      $req = new http\Client\Request("GET", $url, ["User-Agent"=>"My Client/0.1"]);
      $req->setOptions(array('connecttimeout'=>1, 'timeout'=>1));
      $client->enqueue($req, function($response) use ($client, $req, $url) {
         printf("%s returned '%s' (%d)\n", $response->getTransferInfo("effective_url"), $response->getInfo(), $response->getResponseCode());
         echo $client->count().PHP_EOL;
         global $urls;
         if ($urls) {
            while ($client->count() < 20) {
               $url = array_shift($urls);
               push($client, $url);
            }
            return true; // dequeue
         }
      });
   }

   $client = new http\Client;
   $client->enablePipelining(true);
   $client->enableEvents(true);

   for ($i = 0; $i < 10000; ++$i) {
      $urls[] = "http://192.168.1.3/";
   }
   for ($i = 0; $i < 20; ++$i) {
      $url = array_shift($urls);
      push($client, $url);
   }
   /*
   try{
      var_dump($client->send());
   }
   catch(http\Exception\RuntimeException  $e)
   {
      echo 'Message: ' .$e->getMessage().PHP_EOL;
   }
   */

   while ($client->once()) {
      $client->wait();
   }

Python CURL异步并发HTTP客户端

发布时间:January 7, 2015 // 分类:Python // No Comments

Select模式,类似于php multi curl异步并发,连接数不能太多:

#!/usr/bin/python
# -*- coding: utf-8 -*-

import sys
import pycurl
import cStringIO

#最大连接数
num_conn = 20

queue = []
urls = ['https://www.haiyun.me/'] * 10000
for url in urls:
  queue.append(url)

num_urls = len(queue)
num_conn = min(num_conn, num_urls)
print ('----- Getting', num_urls, 'Max conn', num_conn,
       'connections -----')

m = pycurl.CurlMulti()
#初始化handle,可复用
m.handles = []
for i in range(num_conn):
  c = pycurl.Curl()
  c.body = cStringIO.StringIO()
  c.setopt(pycurl.FOLLOWLOCATION, 1)
  c.setopt(pycurl.MAXREDIRS, 5)
  c.setopt(pycurl.CONNECTTIMEOUT, 30)
  c.setopt(pycurl.TIMEOUT, 300)
  c.setopt(pycurl.NOSIGNAL, 1)
  m.handles.append(c)


freelist = m.handles[:]
num_processed = 0
#主循环开始
while num_processed < num_urls:

    #添加请求URL
    while queue and freelist:
      url = queue.pop()
      c = freelist.pop()
      c.setopt(pycurl.URL, url)
      c.setopt(pycurl.WRITEFUNCTION, c.body.write)
      m.add_handle(c)
      c.url = url
      #print url

    #执行请求
    while 1:
      (ret, num_handles) = m.perform()
      if ret != pycurl.E_CALL_MULTI_PERFORM:
        break

    #阻塞一会直到有连接完成
    m.select(1.0)

    #读取完成的连接
    while 1:
      (num_q, ok_list, err_list) = m.info_read()
      for c in ok_list:
        m.remove_handle(c)
        #print c.body.getvalue()
        freelist.append(c)

      for (c, errno, errmsg) in err_list:
        m.remove_handle(c)
        print ('Failed: ', c.url, errno, errmsg)
        freelist.append(c)
      num_processed = num_processed + len(ok_list) + len(err_list)
      if num_q == 0:
        break

for c in m.handles:
  c.fp = None
  c.close()
m.close()

epoll模式,php mult curl不支持此模式,tornado基于pycurl multi_socket_action封装的异步http client,每个client实例维护一个ioloop:

from tornado.httpclient import AsyncHTTPClient
from tornado.ioloop import IOLoop
count = 10000
done = 0
def handle_request(response):
  global done
  done += 1
  if (done == count):
    #结束循环
    IOLoop.instance().stop()

  if response.error:
    print "Error:", response.error
  #else:
    #print response.body
#默认client是基于ioloop实现的,配置使用Pycurl
AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient",max_clients=20)
http_client = AsyncHTTPClient()
for i in range(count):
  http_client.fetch("https://www.haiyun.me/", handle_request)
#死循环
IOLoop.instance().start()      

基于epoll的multi curl在lan环境下效果不如select,因为所有Socket都在活跃状态,所有的callback都被唤醒,会导致资源的竞争。既然都是要处理所有的Socket,直接遍历是最简单最有效的方式.
为更好的性能建议libcurl/pycurl开启异步DNS解析

PHP CURL Keepalive连接重用

发布时间:January 4, 2015 // 分类:PHP // No Comments

PHP CURL默认支持keepalive连接复用,单个CURL注意要复用handle才可以:

<?php
   $ch1 = curl_init();
   curl_setopt($ch1, CURLOPT_URL, "https://www.haiyun.me/");
   curl_setopt($ch1, CURLOPT_HTTPHEADER, array(
    'Connection: Keep-Alive',
    'Keep-Alive: 300'
   ));
   curl_setopt($ch1, CURLOPT_FORBID_REUSE, 0);
   curl_setopt($ch1, CURLOPT_RETURNTRANSFER,1);
   curl_exec($ch1);

   curl_setopt($ch1, CURLOPT_URL, "https://www.haiyun.me/");
   curl_setopt($ch1, CURLOPT_RETURNTRANSFER,1);
   curl_exec($ch1);
   curl_close($ch1);

Multi_CURL无需复用handle即默认支持keepalive连接复用,当然也可复用handle,详情见:自用完美php异步并行 multi curl类

$master = curl_multi_init();
$done = curl_multi_info_read($master)
#删除handle
curl_multi_remove_handle($master, $done['handle']);
#复用删除的handle
curl_multi_add_handle($master, $ch);
分类
最新文章
最近回复
  • 海运: 恩山有很多。
  • swsend: 大佬可以分享一下固件吗,谢谢。
  • Jimmy: 方法一 nghtp3步骤需要改成如下才能编译成功: git clone https://git...
  • 海运: 地址格式和udpxy一样,udpxy和msd_lite能用这个就能用。
  • 1: 怎么用 编译后的程序在家里路由器内任意一台设备上运行就可以吗?比如笔记本电脑 m参数是笔记本的...
  • 孤狼: ups_status_set: seems that UPS [BK650M2-CH] is ...
  • 孤狼: 擦。。。。apcupsd会失联 nut在冲到到100的时候会ONBATT进入关机状态,我想想办...
  • 海运: 网络,找到相应的url编辑重发请求,firefox有此功能,其它未知。
  • knetxp: 用浏览器F12网络拦截或监听后编辑重发请求,修改url中的set为set_super,将POS...
  • Albert: 啊啊啊啊啊啊啊啊啊 我太激动了,终于好了英文区搜索了半天,翻遍了 pve 论坛没找到好方法,博...