PHPor's Blog – Page 61

Lesser-known packet capture tools

httpry:

An open-source HTTP packet sniffing tool which captures live HTTP packets with libpcap library, and displays HTTP requests and responses in a human-readable format. It comes with a collection of parsing Perl scripts for mining various information from its standard output.

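Output:

Installation:

On Linux it can be installed with yum.
On other systems, build it from source: https://github.com/jbittel/httpry (judging from the source, although it is written in C, it seems that plugins can be written in Perl)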

mysql sniffer: https://phpor.net/blog/post/9562

Open-source Go projects

http://www.geekhub.cn/a/77.html

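Implementing an HTTP tunnel proxy with haproxy

What is an HTTP tunnel proxy? (Look it up yourself.)

haproxy's classic logic is that every request is assigned to a configured backend for processing. For a CONNECT request, in principle no backend should need to be configured, but that does not fit haproxy's rules. Testing and tracing the source code show that haproxy cannot act as an HTTP tunnel proxy at all. As for the option http-tunnel setting, it merely declares that the data following the first request should not be parsed; the CONNECT request itself is still forwarded to the backend unchanged.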
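For reference, the request a tunnel proxy has to handle itself (rather than hand to a backend) can be sketched as below. This only prints the wire format of a CONNECT request; the function name connect_request and the target example.com:443 are placeholders chosen for illustration.

```shell
#!/bin/bash
# Print the raw CONNECT request a client sends to a tunnel proxy.
# connect_request is a hypothetical helper; $1 is the target host:port.
connect_request() {
  printf 'CONNECT %s HTTP/1.1\r\nHost: %s\r\n\r\n' "$1" "$1"
}

# Show the request with the carriage returns removed for readability.
connect_request "example.com:443" | tr -d '\r'
```

A tunnel proxy is expected to open a TCP connection to the target, answer with a 2xx status, and then relay bytes in both directions. curl issues such a request when an HTTPS URL is fetched through an HTTP proxy given with -x.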

Using grep -o

Example 1: extract the URLs from the Baidu home page

curl http://www.baidu.com/ 2>/dev/null | grep -oE "https?://[^[:space:]>]+"

The grep -o option prints only the matched text, which makes it very handy for "extraction" tasks.
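The same idea can be tried without network access. The sketch below feeds a small made-up HTML snippet to grep -o; the helper name extract_urls and the sample links are assumptions for illustration. The character class here also excludes quotes and angle brackets, a slight variation on the pattern above, so that the closing quote of an href attribute is not captured.

```shell
#!/bin/bash
# extract_urls is a hypothetical helper: print every URL found on stdin,
# one per line, using grep -o to emit only the matched part.
extract_urls() {
  grep -oE "https?://[^[:space:]\"'<>]+"
}

urls=$(extract_urls <<'EOF'
<a href="http://example.com/a">first</a> and <a href="https://example.org/b?x=1">second</a>
no link on this line
EOF
)
printf '%s\n' "$urls"
```

Each match is printed on its own line, even when several matches occur on the same input line.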

haproxy health checks and DNS resolution

Configuration:

resolvers mydns
 nameserver svr1 172.16.162.194:53
backend http_backend
  mode http
  acl acl_baidu req.hdr(host) -i www.baidu.com
  acl acl_beebank req.hdr(host) -i www.beebank.com
  use-server svr_baidu if acl_baidu
  use-server svr_beebank if acl_beebank
  server svr_baidu www.baidu.com:80 check fall 100 rise 1 resolvers mydns resolve-prefer ipv4
  server svr_beebank www.beebank.com:80 check fall 100 rise 1 resolvers mydns resolve-prefer ipv4

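Where:

check: enables health checks
fall 100 rise 1: a server is taken out automatically only after 100 failed checks; a server that has been taken out is put back after a single successful check
resolvers mydns: resolve names using the specified DNS section; if health checks are not enabled (i.e. no check), this setting has no effect (what kind of logic is that?). See: http://cbonte.github.io/haproxy-dconv/configuration-1.6.html#5.2-resolvers
resolve-prefer ipv4: prefer the IPv4 address; this setting avoids resolving IPv6 unnecessarily. See: http://cbonte.github.io/haproxy-dconv/configuration-1.6.html#5.3.2

Taken literally, DNS always resolves both IPv4 and IPv6, and haproxy only decides which one to prefer.
In actual testing, once ipv4 is specified, the IPv6 address is no longer resolved at all (which is more efficient).
Perhaps a DNS library could also resolve IPv4 and IPv6 in one go.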
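The effect of fall 100 rise 1 described above can be illustrated with a small counter model. This is a simplified sketch, not haproxy's actual implementation; the function record_check and the variable names are made up, and it assumes the counts refer to consecutive check results.

```shell
#!/bin/bash
# Simplified model of fall/rise (an illustration, not haproxy code).
FALL=100   # consecutive failed checks before the server is marked down
RISE=1     # consecutive successful checks before it is marked up again

state=up; fails=0; oks=0

# record_check RESULT, where RESULT is "ok" or "fail"
record_check() {
  if [ "$1" = ok ]; then
    fails=0
    oks=$((oks + 1))
    if [ "$state" = down ] && [ "$oks" -ge "$RISE" ]; then state=up; fi
  else
    oks=0
    fails=$((fails + 1))
    if [ "$state" = up ] && [ "$fails" -ge "$FALL" ]; then state=down; fi
  fi
}

for _ in $(seq 1 99); do record_check fail; done
echo "after 99 failures: $state"
record_check fail
echo "after 100 failures: $state"
record_check ok
echo "after 1 success: $state"
```

With these values a server tolerates a long run of failed checks before being removed and comes back as soon as one check succeeds.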

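Why are environment variables set in profile not visible in crond?

Environment: CentOS 6

Analysis:

Generally speaking, the cause is that the environment was cleared at some stage:

Either the application clears its own environment for security reasons; in that case the application's configuration usually lets you list the variables to keep (for example, sudo).
Or an external script clears the environment (using the env command) when it starts the application; for example, the service command clears the environment with env when it starts a service such as crond: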

env -i PATH=/sbin:/usr/sbin:/bin:/usr/bin TERM=xterm-256color /etc/init.d/crond start

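Solution: either do not clear the environment, or add the variables you want to keep.

Following this theory, I modified /sbin/service to pass along the variable I wanted to keep: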

env -i PATH=/sbin:/usr/sbin:/bin:/usr/bin TERM=xterm-256color MY_VAR_NAME="$MY_VAR_NAME" /etc/init.d/crond start

(In fact, if crond is started directly with /etc/init.d/crond start, /sbin/service is not involved at all.)
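The behaviour of env -i is easy to confirm in isolation. The sketch below uses sh -c as a stand-in for the init script, and the value hello is an arbitrary placeholder.

```shell
#!/bin/bash
# MY_VAR_NAME is the same illustrative name used in the text; the value is arbitrary.
export MY_VAR_NAME=hello

# env -i starts the child with an empty environment, so the variable is gone.
cleared=$(env -i PATH=/sbin:/usr/sbin:/bin:/usr/bin sh -c 'echo "${MY_VAR_NAME:-unset}"')

# Passing the variable explicitly on the env command line keeps it.
kept=$(env -i PATH=/sbin:/usr/sbin:/bin:/usr/bin MY_VAR_NAME="$MY_VAR_NAME" sh -c 'echo "${MY_VAR_NAME:-unset}"')

echo "cleared: $cleared"
echo "kept: $kept"
```

The first child reports the variable as unset, the second sees the value that was passed through.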

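Testing showed that programs started by crond still could not see the variable. By the same theory, crond itself must be clearing the environment, so what can be done about that?

Solution 1: write a wrapper for every cron job and set the required environment variables inside the wrapper. wrapper.sh looks like this: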

#!/bin/bash
# Load the variables the job needs, then run the command given as arguments.
source /path/to/my/env/file
"$@"

With this in place, each cron entry only needs wrapper.sh added in front of its command:

* * * * * /tmp/wrapper.sh /bin/echo 1 >/tmp/debug.txt

The cron execution log:

Jul 20 19:21:01 bs2 CROND[42036]: (root) CMD (/tmp/wrapper.sh /bin/echo 1 >/tmp/debug.txt)

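Looks pretty good :)

However, I have more than a hundred cron entries, and changing them all is a lot of work. Adding the prefix with a script is manageable, but having to remember /tmp/wrapper.sh every time a new entry is added is tedious. What if I forget?

Solution 2:

Reportedly cron consults the SHELL environment variable, and SHELL can be set in the crontab. Presumably crond uses it to launch child processes, otherwise what would it be for? Suppose I put this in the crontab: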

SHELL=/tmp/wrapper.sh

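Wouldn't that be convenient?

Testing showed it is not that simple, for the following reason:

If SHELL is set to /tmp/wrapper.sh, crond invokes it like this:
/tmp/wrapper.sh -c "/bin/echo 1 >/tmp/debug.txt"
Clearly my wrapper.sh does not handle the -c option. Simple enough: I do not care about it, so why not just shift it away? The modified wrapper.sh: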

#!/bin/bash
source /path/to/my/env/file
shift
$*

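Testing showed that /tmp/debug.txt was never created; instead, the cron mail contained "1 >/tmp/debug.txt" (the output of the cron job). The reason is that wrapper.sh does not understand ">": the unquoted $* only splits the string into words, so ">/tmp/debug.txt" is handed to echo as an ordinary argument instead of being parsed as a redirection. No wonder, since wrapper.sh is not really a shell. Writing a shell that understands ">" just for this is not worth it, so let bash do the work.
As noted above, crond passes the -c option, and bash's -c option executes a string as shell code. Why not do the same? Like this: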

#!/bin/bash
shift
source /path/to/my/env/file
/bin/bash -c "$*"

Yes, that does it.
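The final wrapper can be exercised outside of cron by calling it the way crond would. The sketch below works in a scratch directory; the directory layout, the env file contents, and the value from_env_file are all made up for the test.

```shell
#!/bin/bash
# Try the final wrapper outside of cron, in a scratch directory.
# All paths and the value of MY_VAR_NAME are placeholders for this test.
workdir=$(mktemp -d)

printf 'export MY_VAR_NAME=from_env_file\n' > "$workdir/env"

# Unquoted heredoc: $workdir is expanded now, \$* is left for the wrapper.
cat > "$workdir/wrapper.sh" <<EOF
#!/bin/bash
shift                      # drop the -c that crond passes
source "$workdir/env"      # load the variables the jobs need
/bin/bash -c "\$*"         # let bash parse redirections such as >
EOF
chmod +x "$workdir/wrapper.sh"

# Call it the way crond would: SHELL -c "command line"
"$workdir/wrapper.sh" -c "echo \"\$MY_VAR_NAME\" > '$workdir/debug.txt'"
result=$(cat "$workdir/debug.txt")
echo "$result"
rm -rf "$workdir"
```

The redirection is now handled by the inner bash, and the job sees the variable loaded from the env file.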

Appendix: in the vixie-cron source (src/do_command.c), the function static int child_process(entry * e, char **jobenv) runs a job like this:

execle(shell, shell, "-c", e->cmd, (char *) 0, jobenv);

From man execle:

int execle(const char *path, const char *arg, ..., char * const envp[]);

The execle() function also specifies the environment of the executed process by following the
 NULL pointer that terminates the list of arguments in the argument list or the pointer to the
 argv array with an additional argument. This additional argument is an array of pointers to
 null-terminated strings and must be terminated by a NULL pointer. The other functions take the
 environment for the new process image from the external variable environ in the current process.

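The last argument is the environment for the shell being executed. The parent process's environment is not inherited (even though nothing explicitly clears the variables).

In fact, the desired environment variables can simply be defined at the top of the crontab file; variables defined there are not cleared.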
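A crontab with such definitions might look like the sample written out below. This is only a sketch: the file is created in a scratch location and not installed, and MY_VAR_NAME and some_value are placeholders.

```shell
#!/bin/bash
# Write a sample crontab to a scratch file; nothing is installed.
# MY_VAR_NAME and its value are placeholders.
crontab_file=$(mktemp)
cat > "$crontab_file" <<'EOF'
# Variables defined here are placed in the environment of every job below.
# The value is taken literally: cron does not expand $OTHER_VAR on these lines.
MY_VAR_NAME=some_value
* * * * * /bin/echo "$MY_VAR_NAME" >/tmp/debug.txt
EOF
cat "$crontab_file"
# To install it for the current user: crontab "$crontab_file"
```

Because the job line is passed to the shell with -c, the shell expands $MY_VAR_NAME from the environment that cron built from the definitions at the top.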

agetty using 100% CPU

Symptom:

Fix:

Then:

unlink /dev/tty1 && kill 141733

Some resources on web optimization

Content-Encoding in the HTTP protocol: https://imququ.com/post/content-encoding-header-in-http.html

An introduction to HTTP/2 header compression: https://imququ.com/post/header-compression-in-http2.html

Worth following: https://imququ.com/

HTTP Client Hints

In short: a mechanism that lets the client tell the server which capabilities it has, so that the server can tailor its response to that client.

Reference: https://imququ.com/post/http-client-hints.html

https://docs.imgix.com/tutorials/responsive-images-client-hints?_ga=1.267838798.110287679.1468903970

Resources on HTTP proxies

HTTP proxy principles and implementation (part 1): https://imququ.com/post/web-proxy.html
HTTP proxy principles and implementation (part 2): https://imququ.com/post/web-proxy-2.html