再谈代理与隧道

HTTP协议, TCP/IP No Responses »

6,460

4月 252014

缘起

我的IE上访问 www.google.com.hk 是正常的；但是我的chrome上访问 www.google.com.hk 却报证书错误；

查看证书：在IE上确实是正确的，在chrome上确实是错误的。

另：由于www.google.com.hk 被解析到一个内网IP 192.168.xx.xx ;

问题：同是被解析到192.168.xx.xx ，为什么IE可以，而chrome不可以？

分析

抓包看看

chrome抓包结果：

IE访问时的抓包结果：

从抓包信息不难发现：

内网IP 192.168.xx.xx 提供了两种代理模式：

1. 7层代理模式： chrome走的是这种模式

2. HTTP隧道模式： IE走的是这种模式

通过nslookup查看www.google.com.hk解析到的IP地址为： 192.168.xx.xx ，如下

说明

该dns： 10.xx.xx.xx 提供了wpad.dat 文件（wpad参考资料：http://yuelei.blog.51cto.com/202879/83841/ ）；

似乎是：

IE参考该文件，通过HTTP隧道模式完成请求

chrome没有参考该文件，直接发起了请求

为什么会这样？

原本chrome的代理等设置是和IE相同的，如此看来，有可能不同了，去chrome的设置中看看，发现如下可以地方：

有没有发现，这里无法“更改代理服务器设置”，原因是： 您的网络代理设置有扩展程序管理 正常情况为：

去扩展程序中看看，我安装并启用了如下两个插件：

Proxy SwitchySharp

Unblock Youku

这两个插件都有此功能，全部禁用后，恢复正常：

最后

那么哪些域名会走隧道呢？访问如下地址：

http://10.xx.xx.xx/wpad.dat

其中 10.xx.xx.xx 为你的dns服务器的IP地址；当然，也可能是dns告诉你的某个其他的IP地址，详情参考（ http://yuelei.blog.51cto.com/202879/83841/ ），如下：

到此为止，浪费了1个小时

原文链接：https://phpor.net/blog/post/2332

latency是什么意思

HTTP协议 No Responses »

3,483

8月 042013

1. latency不是响应时间

2. latency一般指数据在路上的时间

3. chrome的开发者工具中显示的latency指的是接收到第一个响应字节之前消耗掉的时间（latency应该是这个意思吗？）

参考资料：

http://javidjamae.com/2005/04/07/response-time-vs-latency/

http://www.linfo.org/latency.html

原文链接：https://phpor.net/blog/post/2043

Response与Transfer-Encoding:chunked、Content-Length、Content-Encoding:gzip

HTTP协议 No Responses »

5,549

8月 042013

缘起

了解HTTP 1.1和HTTP1.0的区别的同学都知道，Transfer-Encoding:chunked ， Connection:keep-alive 都是HTTP 1.1的新特性；

Connection:keep-alive 使得一次连接可以干多次HTTP请求的活儿，而HTTP1.0协议每次tcp连接只能处理一个HTTP请求；

另外，对于HTTP 1.0来讲，如果一次HTTP的响应内容很多，而且又无法提前预知内容的多少，那么就不使用content-length ,输出完成后，直接关闭连接即可，一定程度上来讲，content-length对于HTTP 1.0来讲，是可有可无的；

而对于HTTP1.1 来讲，如果 connection： keep-alive ，而且又不能提前预知内容多少的话，该怎么办呢？这就是为什么要有Transfer-Encoding:chunked 的原因了。

有了这些知识之后，我们再来看一种现象，为什么同样是HTTP1.1 ，有的请求就使用的是 content-length ，而又的请求就使用的Transfer-Encoding:chunked 呢？需要什么特殊的设置吗？

可能请求输出的内容并不多，比如就10行，但是每行内容需要5s钟的时间来生成，如果能分块儿输出，是不是用户体验会好一些？那么又如何分块儿输出呢？

从协议上来讲，如果输出中有 content-length 则显然不是分块儿输出的，如果是Transfer-Encoding:chunked 则可能是分块儿输出的。

查资料：

一般服务器采用 Transfer-Encoding:chunked 有两种情况：

1. 应用程序已经输出给webserver很多内容（就是webserver的buffer满了），但是还是没有要结束的意思（就是还没输出完），则webserver放弃输出content-length，如果是HTTP1.0 ，则直接输出内容（输出完关闭连接就ok了）；如果是HTTP1.1，则采用Transfer-Encoding:chunked 的方式输出。

2. 应用程序主动flush内容到客户端，如果是PHP，为： ob_flush(); flush(); (参看： http://php.net/flush )

所以：对于页面服务，如果要输出的内容很多，可以生成一部分就flush一部分，让浏览器尽快渲染给用户，给用户一个更好的使用体验。

Content-Encoding:gzip

gzip是内容编码，是否压缩；压缩式在传输之前进行的，所以传输的分块儿是按照压缩后的数据分块儿的（似乎有点儿废话了）

Nginx中如果启用了gzip压缩，则必然采用Transfer-Encoding:chunked 的方式输出，原因如下：

“

Nginx中Gzip模块和r->headers_out.content_length_n

r->headers_out.content_length_n ：这个在Nginx内部用于表述请求返回内容的长度。但注意这不是完全相等的，只有在 r->headers_out.content_length_n >=0的时候，才有意义。比如说，通常后端的upstream（比如说PHP），如果没有在脚本中强制header输出content-length，则默认在nginx中 r->headers_out.content_length_n = -1。

Gzip模块也是一个典型的Filter模块。这里简单介绍下，后续可以详细描述。在header filter中会直接清空 r->headers_out.content_length_n和header中输出的content_length。为什么要清空呢？主要是因为gzip要对内容模块进行压缩处理，而在header filter的时候，gzip模块不可能计算出压缩后的内容长度（原因是在nginx中，header 输出和body的输出是完全两个不同的阶段），所以最好的办法就是在清空header中的content-length。这样结合之前的介绍的chunked模块，可以看出：在nginx中，如果采用gzip，如果是keep alive，则必然是chunked模式。

”

参考资料：

http://blog.xiuwz.com/tag/content-length/

http://lokki.iteye.com/blog/1072327

http://www.cnblogs.com/foxhengxing/archive/2011/12/02/2272387.html

原文链接：https://phpor.net/blog/post/2037

关于cookie

HTTP协议 No Responses »

1,905

2月 162013

缘起

如何在不发起一个Http请求的情况下删除一个Httponly的cookie？

探索：

1. 虽然JS无法读取Httponly的cookie，那么JS是否可以设置或删除Httponly的cookie呢？

答案是否定的

2. 使用http-equiv类型的meta标签是可以设置cookie的，那么是否可以cache一个带有http-equiv类型的meta标签的页面来操作cookie呢？

http-equiv类型的meta标签只能设置、删除、修改非Httponly类型的cookie

分析：

分析1：

JS无法访问Httponly的cookie，其中包括三层含义：

无法读取
无法设置和修改
无法删除

如：

无法通过如下方式设置、修改或删除一个已存在的Httponly的cookie：

document.cookie = "test=set_by_js; expires=Thu, 01-Jan-2070 00:00:01 GMT; path=/"

1	document.cookie = "test=set_by_js; expires=Thu, 01-Jan-2070 00:00:01 GMT; path=/"

无法通过如下方式设置、修改或删除一个已存在的Httponly的cookie：

document.cookie = "test=set_by_js; expires=Thu, 01-Jan-2070 00:00:01 GMT; path=/; Httponly"

1	document.cookie = "test=set_by_js; expires=Thu, 01-Jan-2070 00:00:01 GMT; path=/; Httponly"

注意：或许你通过第一种方式在一个Httponly的cookie已存在的情况下设置了一个新的test cookie，那么基本上是因为两个test的domain或path不同；如：一个Httponly的cookie设置在a.test.com域下，而这时通过document.domain=”test.com”，将domain修改为了 test.com；这时候可以通过第一种方式设置一个testcookie，和原有的Httponly的cookie并不冲突

分析2：

HTML中有一个meta标签，其中的一个用法是http-equiv，如，可以通过如下方式设置cookie：

<meta http-equiv="Set-Cookie" content="cookiename=cookievalue; expires=Friday, 12-Jan-2021 18:18:18 GMT； path=/">

1	<meta http-equiv="Set-Cookie" content="cookiename=cookievalue; expires=Friday, 12-Jan-2021 18:18:18 GMT； path=/">

但是，测试发现，似乎无法通过JS创建meta标签的方式来设置cookie

分析3：

测试发现，无法通过meta标签来设置、修改、删除一个Httponly的cookie

原文链接：https://phpor.net/blog/post/1785

关于WebSocket

HTTP协议 No Responses »

1,804

1月 182013

1. WebSocket rfc 中文翻译： http://blog.csdn.net/stoneson/article/details/8063802

2. http://o0211oo.iteye.com/blog/1671973

原文链接：https://phpor.net/blog/post/1755

Why https？

HTTP协议 No Responses »

2,085

1月 172013

Why https？

保证用户信息的传输安全。包括用户名、密码、cookie、用户保存的服务器端信息等等
很大程度上免遭DNS劫持

Why not https？

更多的cpu资源的消耗
证书校验的时间消耗
证书校验可能因一些问题导致验证失败

原文链接：https://phpor.net/blog/post/1745

关于PHP中http请求的一个细节

HTTP协议, Linux & Unix, TCP/IP No Responses »

2,233

1月 112013

缘起：

使用php的file_get_contents(…) 函数从广州发送一个http请求到北京，该请求的请求头和响应头如下：

GET / HTTP/1.1
Host: phpor.net

HTTP/1.1 200 OK
Date: Fri, 11 Jan 2013 11:11:19 GMT
Server: Apache
Content-Length: 5
Connection: close

hello

GET / HTTP/1.1

Host: phpor.net

HTTP/1.1 200 OK

Date: Fri, 11 Jan 2013 11:11:19 GMT

Server: Apache

Content-Length: 5

Connection: close

hello

现在，从广州到北京的一次数据的往返时间（就是ping的时间）为34ms，请求的执行时间为6ms；因为请求头和相应头都非常小，分别一次tcp数据包就够，这样的话，一次http请求时间为： 34 + 34 + 6 = 74ms ，而实际却稳定在108ms左右，还有大约34ms不知去向

分析

根据这个丢失的时间长度来看，大概是多了一次数据的发送和接收；通过tcpdump抓包发现如下现象：

file_get_contents(…)中实现的http请求，首先把请求行发送出去，收到请求行数据包的ack包后继续发送后续的请求头；这样的话就凭空多出一次网络时间，其各种原因，有时间再看看PHP源码中为何如此实现吧。

测试了一下curl，没有这个问题。

原文链接：https://phpor.net/blog/post/1728

tcpdump匹配http头

HTTP协议 No Responses »

3,491

12月 042011

tcpdump -XvvennSs 0 -i eth0 tcp[20:2]=0x4745 or tcp[20:2]=0x4854

0x4745 为"GET"前两个字母"GE"

0x4854 为"HTTP"前两个字母"HT"

原文链接：https://phpor.net/blog/post/936

5 Ways to Speed Up Your Site

HTTP协议 No Responses »

3,976

3月 142011

22 Jun 2006

Throughout the blogosphere I’m always seeing these blogs, that while they look great, are horribly slow and overburdened. Over the past few months I have become somewhat of a website optimization specialist, bringing my own site from an over 250kB homepage to its current 34kB. I will help you achieve some of the same success with a few, powerful tips. Most of these are common sense, but I can’t stress their importance enough. I will concentrate on the website and not the server in this article, as there are too many things to discuss when it comes to server optimization.

1) Reduce Overall Latency by Reducing HTTP Requests

Every HTTP request, or loading each item on your website, has an average round-trip latency of 0.2 seconds. So if your site is loading 20 items, regardless of whether they are stylesheets, images or scripts, that equates to 4 seconds in latency alone (on your average broadband connection). If you have a portion on your site with many images within it, such as your footer, you can reduce the number of HTTP requests with image maps. I discussed that in more depth at the end of this article. If you are using K2, you can get rid on one extra HTTP request by using one stylesheet, style.css, and no schemes (integrate what was in your scheme in the main stylesheet).

Don’t Rely on Other Sites!

If you have several components on your site loading from other websites, they are slowing you down. A bunch of HTTP requests from the same server is bad enough, but having HTTP requests from different servers has increased latency and can be critical to your site’s loading time if their server is down. For example, when the Yahoo! ads server was acting weird one day my site seemingly hesitated to load as it waited on the Yahoo! server before loading the rest of my content. Hence, I don’t run Yahoo! ads anymore. I don’t trust anyone else’s server and neither should you. The only thing on this site served on another is the FeedBurner counter.

2) Properly Save Your Images

One huge mistake people do is save their image in Photoshop the regular way. Photoshop has a "save for web" feature for a reason, use it. But that’s not enough. You must experiment with different settings and file formats. I’ve found that my header/footers fare well as either PNGs or GIFs. One major contributor to image size is the palette or number of colors used in the image. Gradients are pure evil when it comes to image size. Just changing the way my header text was formatted and replacing the gradient with a color overlay (or just reducing the opacity of the gradient) saved a few kilobytes. However, if you must keep your gradient you can experiment with the websnap feature which removes similar colors from the palette. But if you get carried away, it can make your image look horrible. Spend sometime in Photoshop, saving image for web with different settings. Once you have honed this skill, you can shave off many kilobytes throughout your site. Also, if you use the FeedBurner counter chicklet you can save roughly 2.1kB by opting to use the non-animated, static version.

3) Compression

Along with reducing HTTP requests comes decreasing the size of each request. We covered this case when it comes to images, but what about other aspects of the site? You can save a good deal of space by compressing the CSS, JS and PHP used on your site. Ordinarily compressing PHP wouldn’t do anything since it’s a server-side scripting language, but when it’s used to structure your site or blog, as it commonly is, compressing the PHP in the form of removing all whitespace can help out. If you run WordPress, you can save 20kB or more by enabling WP Admin » Options » Reading » WordPress should compress articles (gzip) if browsers ask for them. Keep in mind, however, that if you receive mass traffic one day you might want to disable that setting if your webhost gets easily ruffled with high CPU usage.

The problem with compressing any of your files is that it makes editing them a pain. That’s why I try to keep two versions of the same file, a compressed version and an uncompressed version. As for PHP compression, I generally go through the files by hand and remove any whitespace. When it comes to CSS, I usually do the same thing but have found CSS Tweak to be helpful when dealing with larger files. But do keep in mind that if you compress your main style.css for WordPress with default CSS Tweak settings, it will remove the comments at the top that setup the theme. Be sure to add that piece back after you’ve compressed it or WordPress won’t recognize your theme. When it comes to compressing JavaScript, this site has you covered. However, use the "crunch" feature as I’ve received weird results using "compress."

Alternatively, you can check out my method of CSS compression utilizing PHP.

4) Avoid JavaScript Where Possible

In addition to adding HTTP requests and size to the site, the execution of the JavaScript (depends on what it does) can slow your site. Things like Live Search, Live Comments, Live Archives are tied to large JS files that like to keep your readers’ browsers busy. The less the better.

5) Strip Extraneous PHP/MySQL Calls

This step is probably only worth pursuing once you have completely exhausted the other tips. The K2 theme my site is vaguely based upon originally comes with support for many plugins and features, many of which I don’t use. By going through each file and removing the PHP calls for plugins I’m not using or features I don’t need, I can take some of the load off of the server. When it comes time to hit the frontpage of Digg or Slashdot, your server will more than thank you. Some aspects of this can be exemplified by hardcoding items where feasible. Things in code that don’t change in your installation such as the name of your blog or your feed or stylesheet location, can be hardcoded. In K2 these items rely on a WordPress PHP tag such as bloginfo. It’s hard to explain what sorts of things you can strip from your website’s PHP framework, but be on the lookout for things you don’t use on your site. For example, in the K2 comments file there is a PHP if else that looks to see if live comments are enabled and utilize them if so. Since I don’t use live comments, I can completely remove the if part and write it so that regular comments are always used.

Also, using too many WordPress plugins can be a bad thing, especially if those plugins are dependent on many MySQL commands which generally take much, much longer to execute than PHP and can slow a whole page down.

Miscellaneous Thoughts

Even if you don’t call on a piece of CSS that has an image, it is still loaded – so you might want to rethink using that one CSS selector that hardly gets called. When it comes to using a pre-made theme for your CMS, it’s a good idea to go through the CSS and look for things that aren’t used. For example, with K2 there was a bit of CSS defined for styling sub-pages. I don’t have any sub-pages so I removed that piece of CSS.

If your site is maintained using a CMS of some sort, you likely have several plugins, if not dozens, running behind the scenes. Going along with the theme of things, you will want to deactivate any plugins that aren’t mission critical. They use server resources and add to the PHP processing load.

from: http://paulstamatiou.com/5-ways-to-speed-up-your-site

原文链接：https://phpor.net/blog/post/800

http: revalidate

HTTP协议 No Responses »

3,812

3月 092011

在http头中可能会出现must-revalidate ; 以前没太注意，其大致意思为：
如果服务器端明确指出了资源的过期时间或者是保鲜时间，而且声明了资源的修改时间或者etag之类的标识，那么就有一个问题：在保鲜时间内，如果用到了该资源，是不是要（根据修改时间或etag）到服务器确认一下资源是否最新的，如果没有明确说明，则agent有自己的默认机制，如果服务器声明了:must-revalidate, 则每次使用该资源就都需要确认资源新鲜性了。

相关参考： http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.4

must-revalidate: Because a cache MAY be configured to ignore a server’s specified expiration time, and because a client request MAY include a max- stale directive (which has a similar effect), the protocol also includes a mechanism for the origin server to require revalidation of a cache entry on any subsequent use. When the must-revalidate directive is present in a response received by a cache, that cache MUST NOT use the entry after it becomes stale to respond to a
: subsequent request without first revalidating it with the origin server. (I.e., the cache MUST do an end-to-end revalidation every time, if, based solely on the origin server’s Expires or max-age value, the cached response is stale.)
: The must-revalidate directive is necessary to support reliable operation for certain protocol features. In all circumstances an HTTP/1.1 cache MUST obey the must-revalidate directive; in particular, if the cache cannot reach the origin server for any reason, it MUST generate a 504 (Gateway Timeout) response.
: Servers SHOULD send the must-revalidate directive if and only if failure to revalidate a request on the entity could result in incorrect operation, such as a silently unexecuted financial transaction. Recipients MUST NOT take any automated action that violates this directive, and MUST NOT automatically provide an unvalidated copy of the entity if revalidation fails.
: Although this is not recommended, user agents operating under severe connectivity constraints MAY violate this directive but, if so, MUST explicitly warn the user that an unvalidated response has been provided. The warning MUST be provided on each unvalidated access, and SHOULD require explicit user confirmation.
proxy-revalidate: The proxy-revalidate directive has the same meaning as the must- revalidate directive, except that it does not apply to non-shared user agent caches. It can be used on a response to an authenticated request to permit the user’s cache to store and later return the response without needing to revalidate it (since it has already been authenticated once by that user), while still requiring proxies that service many users to revalidate each time (in order to make sure that each user has been authenticated). Note that such authenticated responses also need the public cache control directive in order to allow them to be cached at all.

原文链接：https://phpor.net/blog/post/798

Older Entries