http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
http://www.quantcast.com/quantcast-top-million.zip
Alexa's top 1 million sites list
Posted: July 1, 2014 // Category: // No Comments
Generating tags in PHP with the Yahoo Content Analysis API
Posted: June 27, 2014 // Category: PHP // No Comments
<?php
$text = 'This domain name expired on 12/6/2014 and is pending renewal or deletion.';
$query = "select * from contentanalysis.analyze where text = '".$text."'";
$url = 'http://query.yahooapis.com/v1/public/yql';

// Build the request URL: the YQL query plus the analysis options.
$yql_query_url = $url . "?q=" . urlencode($query);
$yql_query_url .= "&format=json";
$yql_query_url .= "&enable_categorizer=true";
$yql_query_url .= "&diagnostics=false";
$yql_query_url .= "&related_entities=true";
$yql_query_url .= "&show_metadata=true";

// Return the response body as a string instead of printing it.
$ch = curl_init($yql_query_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
if ($result === false) {
    die('cURL error: ' . curl_error($ch) . "\n");
}
curl_close($ch);

$result = json_decode($result);
var_dump($result);

// Each property of entities holds either one object or an array of objects.
if (isset($result->query->results->entities)) {
    foreach ($result->query->results->entities as $word) {
        if (is_array($word)) {
            foreach ($word as $subword) {
                echo $subword->text->content."\n";
            }
        } else {
            echo $word->text->content."\n";
        }
    }
}
?>
Documentation: https://developer.yahoo.com/search/content/V2/contentAnalysis.html
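The PHP loop has to handle two shapes, because a single match comes back as one object and several matches come back as an array. As a rough sketch of the same normalization in Python (the sample payload is hypothetical and only mirrors the fields the PHP code reads; it is not a captured API response):

```python
import json


def extract_entity_texts(response):
    """Collect entity text strings from a decoded YQL-style response dict."""
    results = (response.get("query") or {}).get("results")
    if not results:
        return []
    texts = []
    for value in (results.get("entities") or {}).values():
        # One match arrives as a single object, several matches as a list.
        items = value if isinstance(value, list) else [value]
        for item in items:
            content = (item.get("text") or {}).get("content")
            if content is not None:
                texts.append(content)
    return texts


# Hypothetical sample payload, shaped like the fields the PHP code accesses.
sample = json.loads("""
{"query": {"results": {"entities": {"entity": [
    {"text": {"content": "domain name"}},
    {"text": {"content": "renewal"}}
]}}}}
""")
print(extract_entity_texts(sample))
```

Wrapping the single-object case in a list keeps the inner loop identical for both shapes, which avoids the duplicated echo branch.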
Using Selenium with PhantomJS/Chrome/Firefox in Python
Posted: June 26, 2014 // Category: Python // No Comments
Install setuptools and pip on Windows:
https://bootstrap.pypa.io/ez_setup.py
https://bootstrap.pypa.io/get-pip.py
python ez_setup.py
python get-pip.py
Install selenium:
pip install selenium
Install PhantomJS:
https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-1.9.7-linux-x86_64.tar.bz2
tar jxvf phantomjs-1.9.7-linux-x86_64.tar.bz2
cp phantomjs-1.9.7-linux-x86_64/bin/phantomjs /bin/
chmod 755 /bin/phantomjs
Usage example:
from selenium import webdriver
driver = webdriver.PhantomJS()
driver.get("http://www.baidu.com")
data = driver.title
print(data)
Via a remote Selenium Server:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

# Option 1: spell out the desired capabilities by hand.
driver = webdriver.Remote(
    command_executor='http://192.168.1.3:4444/wd/hub',
    desired_capabilities={'browserName': 'PhantomJS',
                          'version': '2',
                          'javascriptEnabled': True})
# Close this session before opening another one.
driver.quit()

# Option 2: use the predefined PhantomJS capabilities.
driver = webdriver.Remote(
    command_executor='http://192.168.1.3:4444/wd/hub',
    desired_capabilities=DesiredCapabilities.PHANTOMJS)
driver.get("http://www.baidu.com")
data = driver.title
print(data)
Speed comparison between PhantomJS and Firefox:
import unittest
from selenium import webdriver
import time


class TestThree(unittest.TestCase):
    def setUp(self):
        # Record the start time before each test.
        self.startTime = time.time()

    def test_url_fire(self):
        self.driver = webdriver.Firefox()
        self.driver.get("http://www.qq.com")
        self.driver.quit()

    def test_url_phantom(self):
        self.driver = webdriver.PhantomJS()
        self.driver.get("http://www.qq.com")
        self.driver.quit()

    def tearDown(self):
        # Each test has already quit its driver, so only report the timing.
        t = time.time() - self.startTime
        print("%s: %.3f" % (self.id(), t))


if __name__ == '__main__':
    suite = unittest.TestLoader().loadTestsFromTestCase(TestThree)
    unittest.TextTestRunner(verbosity=0).run(suite)
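The same measurement can be taken without unittest by timing plain callables. The sketch below is only an illustration: `time_call` and `compare` are hypothetical helper names, and the sleeps stand in for "start a driver, load a page, quit".

```python
import time


def time_call(label, func):
    """Run func() once and return (label, elapsed seconds)."""
    start = time.perf_counter()
    func()
    return label, time.perf_counter() - start


def compare(candidates):
    """Time each (label, callable) pair; return results sorted fastest first."""
    results = [time_call(label, func) for label, func in candidates]
    return sorted(results, key=lambda pair: pair[1])


# Placeholder workloads; a real run would wrap one webdriver session per callable.
demo = compare([
    ("slow", lambda: time.sleep(0.1)),
    ("fast", lambda: time.sleep(0.01)),
])
for label, seconds in demo:
    print("%s: %.3f" % (label, seconds))
```

A single run per browser is noisy, so repeating each callable a few times and comparing the minimum would likely give a fairer picture.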
Attach to a running Chrome instance:
google-chrome --remote-debugging-port=9222 --no-sandbox
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_experimental_option("debuggerAddress", "127.0.0.1:9222")
driver = webdriver.Chrome(options=chrome_options)
driver.get("https://91porn.com")
html = driver.page_source
print(html)
time.sleep(2000)
driver.quit()
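Attaching only works if the browser is already listening on its debugging port (9222 for the Chrome command above). A small pre-flight check, sketched here with the standard library only, can give a clearer failure than a driver timeout; `port_is_open` is a hypothetical helper name, not part of Selenium.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# 9222 is the --remote-debugging-port value used in the command above.
print(port_is_open("127.0.0.1", 9222))
```

The check only proves that something accepts TCP connections on that port, not that it is Chrome.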
Attach to a running Firefox instance:
firefox -marionette -start-debugger-server 2828
import time
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
firefox_services = Service(executable_path='/usr/bin/geckodriver', port=3000, service_args=['--marionette-port', '2828', '--connect-existing'])
driver = webdriver.Firefox(service=firefox_services)
driver.get("https://91porn.com")
pageSource = driver.page_source
print(pageSource)
driver.quit()
#import time
#from selenium.webdriver import Firefox
#from selenium import webdriver
#driver = webdriver.Firefox()
#driver.get("https://91porn.com")
#html = driver.page_source
#print(html)
#time.sleep(2000)
Automating Chrome/Firefox from PHP with Selenium
Posted: June 26, 2014 // Category: PHP // 1 Comment
Install php-webdriver via Composer:
apt install php7.4-cli php-curl php-zip
curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/bin/
php /usr/bin/composer.phar require php-webdriver/webdriver
Install a Java runtime and Selenium Server:
apt install openjdk-14-jre
wget https://selenium-release.storage.googleapis.com/3.141/selenium-server-standalone-3.141.59.jar
java -jar selenium-server-standalone-3.141.59.jar
Install the Firefox/Chrome browsers and the matching WebDriver binaries:
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
apt install ./google-chrome-stable_current_amd64.deb
wget https://chromedriver.storage.googleapis.com/88.0.4324.96/chromedriver_linux64.zip
unzip chromedriver_linux64.zip
mv chromedriver /usr/bin/
apt install firefox
wget https://github.com/mozilla/geckodriver/releases/download/v0.29.0/geckodriver-v0.29.0-linux64.tar.gz
tar zxf geckodriver-v0.29.0-linux64.tar.gz
mv geckodriver /usr/bin/
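The chromedriver build downloaded above (88.0.4324.96) has the same major version as the Chrome build named in the user agent later in this post (88.0.4324.182). ChromeDriver generally expects that match, so a quick comparison can catch a mismatch after a browser upgrade. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the sample strings only imitate what `google-chrome --version` and `chromedriver --version` might print.

```python
import re


def major_version(version_output):
    """Return the major number of the first dotted version in the string."""
    match = re.search(r"(\d+)\.\d+\.\d+(?:\.\d+)?", version_output)
    if match is None:
        raise ValueError("no version number in %r" % version_output)
    return int(match.group(1))


def drivers_match(browser_output, driver_output):
    """True when browser and driver report the same major version."""
    return major_version(browser_output) == major_version(driver_output)


# Illustrative strings, not captured command output.
print(drivers_match("Google Chrome 88.0.4324.182", "ChromeDriver 88.0.4324.96"))
```

geckodriver uses its own version numbering, so this comparison is meant for the Chrome pair only.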
Launching a browser requires an X environment; XVNC or X Window can provide one.
Scripts can be recorded with the Firefox extension Selenium IDE: PHP Formatters.
Using Selenium with Chrome:
<?php
require_once('vendor/autoload.php');
use Facebook\WebDriver\Remote\DesiredCapabilities;
use Facebook\WebDriver\Remote\RemoteWebDriver;
use Facebook\WebDriver\Chrome\ChromeOptions;
$host = 'http://localhost:4444/wd/hub';
$options = new ChromeOptions();
$options->addArguments(array(
'--no-sandbox',
'--headless',
'--start-maximized',
'--user-data-dir=/tmp/chrome-user-data-dir',
'--profile-directory=/tmp/chrome-profile-dir',
'--user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36'
));
$caps = DesiredCapabilities::chrome();
$caps->setCapability(ChromeOptions::CAPABILITY, $options);
$driver = RemoteWebDriver::create($host, $caps);
//default
//$driver = RemoteWebDriver::create($host, DesiredCapabilities::chrome());
//$driver->manage()->window()->maximize();
$driver->get('https://www.haiyun.me/');
var_dump($driver->getTitle());
$driver->quit();
Using Selenium with Firefox:
<?php
namespace Facebook\WebDriver;
require 'vendor/autoload.php';
use Facebook\WebDriver\Remote\RemoteWebDriver;
use Facebook\WebDriver\Remote\DesiredCapabilities;
use Facebook\WebDriver\Firefox\FirefoxProfile;
use Facebook\WebDriver\Firefox\FirefoxDriver;
$host = 'http://localhost:4444/wd/hub';
$profile = new FirefoxProfile();
$profile->setPreference('browser.startup.homepage', 'https://github.com/facebook/php-webdriver/');
$profile->setPreference("general.useragent.override", "Mozilla/5.0");
//$profile->addExtension('./vimperator-3.8.2-fx.xpi');
$caps = DesiredCapabilities::firefox();
$caps->setCapability(FirefoxDriver::PROFILE, $profile);
// Set moz:firefoxOptions once; a second call with the same key would replace the first.
$caps->setCapability('moz:firefoxOptions', ['args' => ['-headless', '-profile', '/tmp/firefox_profile']]);
$driver = RemoteWebDriver::create($host, $caps);
//default
//$driver = RemoteWebDriver::create($host, DesiredCapabilities::firefox());
$driver->manage()->window()->maximize();
$driver->get('https://www.haiyun.me/');
var_dump($driver->getTitle());
$driver->quit();
Documentation:
https://github.com/php-webdriver/php-webdriver/wiki
https://php-webdriver.github.io/php-webdriver/
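Setting up a VNC server with Xvfb and x11vnc on CentOS/Ubuntu
Posted: June 26, 2014 // Category: CentOS // No Comments
Running a full X Window desktop just to use Linux GUI programs remotely is too heavyweight. Instead, Xvfb can create a virtual X display, and x11vnc can start a VNC server that forwards the display created by Xvfb.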
apt install x11vnc xvfb
yum install xorg-x11-server-Xvfb
yum install x11vnc
# Create a virtual X display
Xvfb :1 -screen 0 1024x768x24 -nolisten tcp &
# Make the new virtual display the default; GUI programs use it when they start
export DISPLAY=:1
# or
DISPLAY=:1 firefox
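The per-command form (`DISPLAY=:1 firefox`) has a direct counterpart when a script launches the program: pass the child process its own environment instead of exporting DISPLAY globally. A minimal sketch, assuming the display number `:1` from above; `env_for_display` is a hypothetical helper, and the child here only echoes the variable rather than starting a real GUI program.

```python
import os
import subprocess
import sys


def env_for_display(display=":1"):
    """Return a copy of the current environment with DISPLAY overridden."""
    env = dict(os.environ)
    env["DISPLAY"] = display
    return env


# Stand-in child that prints DISPLAY; a real call would run e.g. ["firefox"].
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['DISPLAY'])"],
    env=env_for_display(":1"),
    capture_output=True,
    text=True,
    check=True,
)
print(child.stdout.strip())
```

Because only the child's environment is changed, other programs started from the same script keep whatever display they already had.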
Init script:
#!/bin/bash
#chkconfig: 345 95 50
#description: Starts xvfb on display 1
if [ -z "$1" ]; then
    echo "$(basename "$0") {start|stop}"
    exit 1
fi
case "$1" in
    start)
        Xvfb :1 -screen 0 1024x768x24 -nolisten tcp &
        export DISPLAY=:1
        # Append to ~/.bashrc only once so repeated starts do not duplicate the line.
        grep -qxF 'export DISPLAY=:1' ~/.bashrc 2>/dev/null || echo 'export DISPLAY=:1' >> ~/.bashrc
        ;;
    stop)
        killall Xvfb
        ;;
esac
Start a VNC server that forwards the specified X display:
x11vnc -listen 0.0.0.0 -rfbport 5900 -noipv6 -passwd password -display :1 -forever
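Then connect with a VNC client; the default port is 5900. On Windows, TightVNC or UltraVNC can be used.
If the launched application's window is too small, resize it: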
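Before reaching for a VNC client, it can help to confirm that x11vnc is answering on port 5900. A VNC (RFB) server opens the conversation with a 12-byte version line such as `RFB 003.008`, so reading that line makes a simple smoke test. The helper names below are hypothetical and the sketch stops after the banner; it does not authenticate.

```python
import re
import socket


def parse_rfb_banner(banner):
    """Parse a 12-byte RFB ProtocolVersion message into (major, minor)."""
    match = re.fullmatch(rb"RFB (\d{3})\.(\d{3})\n", banner)
    if match is None:
        raise ValueError("not an RFB banner: %r" % banner)
    return int(match.group(1)), int(match.group(2))


def read_rfb_version(host, port=5900, timeout=3.0):
    """Connect to a VNC server and return the version it advertises."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        data = b""
        # The banner is exactly 12 bytes; keep reading until it is complete.
        while len(data) < 12:
            chunk = conn.recv(12 - len(data))
            if not chunk:
                break
            data += chunk
    return parse_rfb_banner(data)
```

For example, `read_rfb_version("127.0.0.1")` could be run on the server itself once x11vnc has started; the exact version returned depends on the server build.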
xdotool search --name ".*Mozilla Firefox" windowsize 1440 900
If Chinese text is garbled in Firefox on Ubuntu, install a Chinese font:
apt install fonts-wqy-microhei