More on Proxies and Tunnels

HTTP protocol, TCP/IP
April 25, 2014
 

Background

On my machine, accessing www.google.com.hk in IE works fine, but the same site in Chrome reports a certificate error.

Inspecting the certificates confirms it: the one IE sees is valid, the one Chrome sees is not.

Also: www.google.com.hk resolves to an internal IP, 192.168.xx.xx.

The question: both browsers resolve the name to the same 192.168.xx.xx, so why does IE work while Chrome does not?

Analysis

Let's capture some packets.

Chrome's capture:

IE's capture:

 

The captures make one thing clear:

the internal IP 192.168.xx.xx offers two proxying modes:

1. Layer-7 proxy mode: Chrome uses this one

2. HTTP tunnel mode: IE uses this one
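The practical difference between the two modes also explains the certificate mismatch. A sketch of the first bytes each mode sends toward 192.168.xx.xx (the request bytes are illustrative, not taken from the capture):

```python
# Layer-7 proxy mode: the proxy terminates the connection itself, so
# for an HTTPS site it presumably has to present its own certificate,
# which would be exactly the error Chrome reports.
layer7_request = (
    "GET http://www.google.com.hk/ HTTP/1.1\r\n"
    "Host: www.google.com.hk\r\n"
    "\r\n"
)

# HTTP tunnel mode: the browser asks for a raw TCP tunnel with CONNECT;
# everything after the proxy's 200 response (including the TLS
# handshake) passes through unmodified, so IE sees the real
# certificate of the origin server.
tunnel_request = (
    "CONNECT www.google.com.hk:443 HTTP/1.1\r\n"
    "Host: www.google.com.hk:443\r\n"
    "\r\n"
)

print(layer7_request.splitlines()[0])
print(tunnel_request.splitlines()[0])
```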

nslookup confirms that www.google.com.hk resolves to 192.168.xx.xx, as shown below:

 

Explanation

The DNS server, 10.xx.xx.xx, serves a wpad.dat file (for background on WPAD, see http://yuelei.blog.51cto.com/202879/83841/ ).

What appears to be happening:

IE consults this file and completes the request via the HTTP tunnel mode.

Chrome does not consult the file and issues the request directly.

Why would that be?

Chrome's proxy settings are normally identical to IE's, so apparently they had diverged. A look at Chrome's settings turned up the following suspicious spot:

Notice that "Change proxy settings" cannot be used here; the reason given is: "Your network proxy settings are managed by an extension." Normally it looks like this:

 

Checking the extensions page, I had the following two extensions installed and enabled:

Proxy SwitchySharp

Unblock Youku

Both of them can manage proxy settings; after disabling both, everything returned to normal:

Finally

So which domains go through the tunnel? Visit:

http://10.xx.xx.xx/wpad.dat

where 10.xx.xx.xx is the IP of your DNS server; it may also be some other IP that DNS hands you (see http://yuelei.blog.51cto.com/202879/83841/ for details), as below:
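A wpad.dat file is just a PAC script: JavaScript defining FindProxyForURL(url, host). To see which domains are tunneled, you can read its rules directly. As a rough illustration (the PAC content below is made up, not the real file on that server), here is a sketch that pulls host patterns out of such a script:

```python
import re

# Hypothetical PAC content for illustration only -- the real rules
# live at http://10.xx.xx.xx/wpad.dat on your own network.
pac = """
function FindProxyForURL(url, host) {
    if (dnsDomainIs(host, ".google.com.hk") ||
        dnsDomainIs(host, ".youtube.com"))
        return "PROXY 192.168.1.1:8080";
    return "DIRECT";
}
"""

# Collect every domain suffix handed to dnsDomainIs().
proxied = re.findall(r'dnsDomainIs\(host,\s*"([^"]+)"\)', pac)
print(proxied)  # domains whose traffic goes through the proxy
```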

 

All told, an hour wasted on this.

 Posted at 11:32 AM

Response and Transfer-Encoding: chunked, Content-Length, Content-Encoding: gzip

HTTP protocol
August 4, 2013
 

Background

Anyone familiar with the differences between HTTP/1.1 and HTTP/1.0 knows that Transfer-Encoding: chunked and Connection: keep-alive are both HTTP/1.1 features.

Connection: keep-alive lets one connection serve multiple HTTP requests, whereas under HTTP/1.0 each TCP connection handles exactly one request.

Also, under HTTP/1.0, if a response is large and its size cannot be known in advance, the server simply omits Content-Length, writes the body, and closes the connection when done. In that sense, Content-Length is more or less optional for HTTP/1.0.

But under HTTP/1.1 with Connection: keep-alive, what does the server do when it cannot know the body size in advance? That is exactly why Transfer-Encoding: chunked exists.

With that background, consider a puzzle: both are HTTP/1.1, yet some responses carry Content-Length while others use Transfer-Encoding: chunked. Why? Does it require any special configuration?

A response might not be large at all, say only 10 lines, but each line might take 5 seconds to generate. Wouldn't the user experience be better if the output could be sent in pieces? And how would one send it in pieces?

Protocol-wise: if the response carries Content-Length, it is clearly not being streamed in pieces; if it carries Transfer-Encoding: chunked, it may be.
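The chunked framing itself is simple: each piece is sent as its length in hex, CRLF, the bytes, CRLF, and a zero-length chunk terminates the body. A minimal sketch:

```python
def chunk(piece: bytes) -> bytes:
    # one chunk: hex length, CRLF, data, CRLF
    return b"%x\r\n%s\r\n" % (len(piece), piece)

def chunked_body(pieces) -> bytes:
    # frame every piece, then append the terminating zero-length chunk
    return b"".join(chunk(p) for p in pieces) + b"0\r\n\r\n"

body = chunked_body([b"hello, ", b"world"])
print(body)
```

Because each chunk is self-delimiting, the server can emit a chunk the moment a piece of output is ready, without ever knowing the total size.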

 

From the documentation:

Servers typically switch to Transfer-Encoding: chunked in two situations:

1. The application has already written a lot of output to the web server (i.e. the server's buffer is full) but shows no sign of finishing. The server then gives up on emitting Content-Length: under HTTP/1.0 it just streams the body (and closes the connection when done); under HTTP/1.1 it outputs via Transfer-Encoding: chunked.

2. The application explicitly flushes output to the client. In PHP that is: ob_flush(); flush(); (see http://php.net/flush ).

So: for a page that produces a lot of output, generate a piece and flush a piece, letting the browser render for the user as early as possible, for a better experience.

Content-Encoding: gzip

gzip is a content encoding, i.e. whether the body is compressed. Compression happens before transfer, so the chunk boundaries are drawn over the compressed data (which is almost stating the obvious).

If gzip compression is enabled in Nginx, the response will necessarily be output via Transfer-Encoding: chunked. Here is why:

Nginx's gzip module and r->headers_out.content_length_n

r->headers_out.content_length_n: inside Nginx this field describes the length of the response body. Note that the correspondence is not exact: it is only meaningful when r->headers_out.content_length_n >= 0. For example, if the upstream (say PHP) does not force a Content-Length header in the script, Nginx defaults to r->headers_out.content_length_n = -1.

The gzip module is a typical filter module (a brief note here; details can come later). In its header filter it directly clears r->headers_out.content_length_n and drops Content-Length from the output headers. Why clear them? Because gzip compresses the body, and at header-filter time the module cannot possibly know the compressed length (in Nginx, header output and body output are two entirely separate phases), so the cleanest option is to clear Content-Length from the headers. Combined with the chunked handling described earlier, the conclusion is: in Nginx, gzip plus keep-alive necessarily means chunked mode.
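The length problem is easy to reproduce outside Nginx: until the last byte has been compressed, the compressed size simply does not exist. A sketch with Python's zlib, emulating a streaming body filter (wbits=31 selects the gzip container):

```python
import zlib

# Emulate a body filter: compress the body piece by piece, the way a
# server sees it arrive from the application.
co = zlib.compressobj(wbits=31)

pieces = [b"hello world " * 50 for _ in range(4)]
sent = []
for p in pieces:
    out = co.compress(p)
    if out:              # ship whatever the compressor has ready
        sent.append(out)
sent.append(co.flush())  # the tail only exists at the very end

body = b"".join(sent)
total = len(body)

# Only now, after the final flush, is the compressed size known. At
# header-output time it was unknowable, so Content-Length had to go
# and chunked transfer takes its place.
assert zlib.decompress(body, wbits=31) == b"".join(pieces)
print(total)
```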

 

References:

http://blog.xiuwz.com/tag/content-length/

http://lokki.iteye.com/blog/1072327

http://www.cnblogs.com/foxhengxing/archive/2011/12/02/2272387.html

 

 

 Posted at 12:39 AM

About Cookies

HTTP protocol
February 16, 2013
 

Background

How do you delete an HttpOnly cookie without issuing an HTTP request?

Exploration:

1. JS cannot read an HttpOnly cookie, but can JS set or delete one?

No, it cannot.

2. A meta tag of the http-equiv kind can set cookies; could we then cache a page carrying such a meta tag and use it to manipulate cookies?

An http-equiv meta tag can only set, delete, or modify cookies that are not HttpOnly.

Analysis:

Analysis 1:

"JS cannot access an HttpOnly cookie" covers three things:

  1. it cannot read one
  2. it cannot set or modify one
  3. it cannot delete one

For example:

none of the following approaches can set, modify, or delete an existing HttpOnly cookie:

Note: if the first approach does appear to set a new test cookie while an HttpOnly cookie already exists, that is almost certainly because the two cookies differ in domain or path. For example: the HttpOnly cookie was set on the a.test.com domain, and document.domain = "test.com" then changed the page's domain to test.com; at that point the first approach can set a test cookie that simply does not conflict with the original HttpOnly one.

 

Analysis 2:

HTML has a meta tag whose http-equiv usage can, among other things, set cookies, e.g. with something along the lines of <meta http-equiv="Set-Cookie" content="name=value">:

However, testing suggests that cookies cannot be set by creating such a meta tag from JS.

Analysis 3:

Testing shows that a meta tag cannot set, modify, or delete an HttpOnly cookie.
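So the only reliable way to remove an HttpOnly cookie remains an HTTP response that overwrites it with an already-expired copy. A sketch of building such a header with Python's standard http.cookies (the cookie name "session" is illustrative):

```python
from http.cookies import SimpleCookie

# Overwrite the cookie with an empty value that expired in the past;
# the browser drops it on receipt. Domain/path must match the
# original cookie, or a second, separate cookie is created instead.
c = SimpleCookie()
c["session"] = ""
c["session"]["expires"] = "Thu, 01 Jan 1970 00:00:00 GMT"
c["session"]["max-age"] = 0
c["session"]["path"] = "/"
c["session"]["httponly"] = True

header = c.output(header="Set-Cookie:")
print(header)
```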

 Posted at 1:36 AM

About WebSocket

HTTP protocol
January 18, 2013
 

1. Chinese translation of the WebSocket RFC: http://blog.csdn.net/stoneson/article/details/8063802

2. http://o0211oo.iteye.com/blog/1671973

 Posted at 12:37 AM

Why https?

HTTP protocol
January 17, 2013
 

Why https?

  1. It protects user data in transit: usernames, passwords, cookies, user state stored on the server, and so on.
  2. It largely protects against DNS hijacking.

Why not https?

  1. Higher CPU consumption
  2. Extra time spent on certificate validation
  3. Certificate validation can fail for various reasons
 Posted at 2:06 PM

A Detail of HTTP Requests in PHP

HTTP protocol, Linux & Unix, TCP/IP
January 11, 2013
 

Background:

Using PHP's file_get_contents(...) to send an HTTP request from Guangzhou to Beijing, with the request and response headers as follows:

The round-trip time from Guangzhou to Beijing (i.e. the ping time) is 34 ms, and the server executes the request in 6 ms. Both the request headers and the response headers are tiny, each fitting in a single TCP packet, so one HTTP request should take 34 + 34 + 6 = 74 ms. In practice it sits steadily around 108 ms; about 34 ms is unaccounted for.
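The bookkeeping behind the missing 34 ms, spelled out (numbers from the measurements above):

```python
rtt = 34          # ms, Guangzhou <-> Beijing round trip (ping time)
server = 6        # ms, server-side execution time
observed = 108    # ms, measured total for one file_get_contents() call

# Presumably: one round trip for the TCP handshake,
# one for request + response, plus server time.
expected = rtt + rtt + server
missing = observed - expected

print(expected, missing)  # 74 ms expected, 34 ms unaccounted for
assert missing == rtt     # the gap is exactly one extra round trip
```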

Analysis

Judging by the amount of time that goes missing, it looks like one extra send-and-receive of data. Capturing with tcpdump shows the following:

the HTTP request implemented by file_get_contents(...) first sends the request line alone, waits for the ACK of that packet, and only then sends the remaining request headers. That adds one full network round trip out of thin air. As for the reasons, I'll dig into the PHP source some other time to see why it is implemented this way.

I tested curl; it does not have this problem.

 Posted at 7:23 PM

Matching HTTP Headers with tcpdump

HTTP protocol
December 4, 2011
 

tcpdump -XvvennSs 0 -i eth0 tcp[20:2]=0x4745 or tcp[20:2]=0x4854

0x4745 is "GE", the first two bytes of "GET"

0x4854 is "HT", the first two bytes of "HTTP"

(Note: tcp[20:2] reads the two bytes at offset 20 into the TCP segment, which are the first two payload bytes only when the TCP header is the standard 20 bytes, i.e. carries no options.)
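The two magic numbers are just the first two ASCII bytes packed into a big-endian 16-bit value, which you can reproduce:

```python
def first_two_bytes(s: str) -> int:
    # pack two ASCII characters into one big-endian 16-bit integer,
    # the same quantity tcp[20:2] compares against
    return (ord(s[0]) << 8) | ord(s[1])

print(hex(first_two_bytes("GET")))   # 0x4745, i.e. "GE"
print(hex(first_two_bytes("HTTP")))  # 0x4854, i.e. "HT"
```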

 Posted at 10:42 AM

5 Ways to Speed Up Your Site

HTTP protocol
March 14, 2011
 

22 Jun 2006

Throughout the blogosphere I'm always seeing blogs that, while they look great, are horribly slow and overburdened. Over the past few months I have become somewhat of a website optimization specialist, bringing my own site from an over 250kB homepage to its current 34kB. I will help you achieve some of the same success with a few, powerful tips. Most of these are common sense, but I can't stress their importance enough. I will concentrate on the website and not the server in this article, as there are too many things to discuss when it comes to server optimization.

1) Reduce Overall Latency by Reducing HTTP Requests

Every HTTP request, or loading each item on your website, has an average round-trip latency of 0.2 seconds. So if your site is loading 20 items, regardless of whether they are stylesheets, images or scripts, that equates to 4 seconds in latency alone (on your average broadband connection). If you have a portion of your site with many images within it, such as your footer, you can reduce the number of HTTP requests with image maps. I discussed that in more depth at the end of this article. If you are using K2, you can get rid of one extra HTTP request by using one stylesheet, style.css, and no schemes (integrate what was in your scheme in the main stylesheet).

Don’t Rely on Other Sites!

If you have several components on your site loading from other websites, they are slowing you down. A bunch of HTTP requests from the same server is bad enough, but having HTTP requests from different servers has increased latency and can be critical to your site's loading time if their server is down. For example, when the Yahoo! ads server was acting weird one day my site seemingly hesitated to load as it waited on the Yahoo! server before loading the rest of my content. Hence, I don't run Yahoo! ads anymore. I don't trust anyone else's server and neither should you. The only thing on this site served on another is the FeedBurner counter.

2) Properly Save Your Images

One huge mistake people make is saving their images in Photoshop the regular way. Photoshop has a "save for web" feature for a reason; use it. But that's not enough. You must experiment with different settings and file formats. I've found that my header/footers fare well as either PNGs or GIFs. One major contributor to image size is the palette, or number of colors used in the image. Gradients are pure evil when it comes to image size. Just changing the way my header text was formatted and replacing the gradient with a color overlay (or just reducing the opacity of the gradient) saved a few kilobytes. However, if you must keep your gradient you can experiment with the websnap feature, which removes similar colors from the palette. But if you get carried away, it can make your image look horrible. Spend some time in Photoshop saving images for web with different settings. Once you have honed this skill, you can shave off many kilobytes throughout your site. Also, if you use the FeedBurner counter chicklet you can save roughly 2.1kB by opting to use the non-animated, static version.

3) Compression

Along with reducing HTTP requests comes decreasing the size of each request. We covered this case when it comes to images, but what about other aspects of the site? You can save a good deal of space by compressing the CSS, JS and PHP used on your site. Ordinarily compressing PHP wouldn't do anything since it's a server-side scripting language, but when it's used to structure your site or blog, as it commonly is, compressing the PHP in the form of removing all whitespace can help out. If you run WordPress, you can save 20kB or more by enabling WP Admin » Options » Reading » WordPress should compress articles (gzip) if browsers ask for them. Keep in mind, however, that if you receive mass traffic one day you might want to disable that setting if your webhost gets easily ruffled with high CPU usage.

The problem with compressing any of your files is that it makes editing them a pain. That's why I try to keep two versions of the same file, a compressed version and an uncompressed version. As for PHP compression, I generally go through the files by hand and remove any whitespace. When it comes to CSS, I usually do the same thing but have found CSS Tweak to be helpful when dealing with larger files. But do keep in mind that if you compress your main style.css for WordPress with default CSS Tweak settings, it will remove the comments at the top that set up the theme. Be sure to add that piece back after you've compressed it or WordPress won't recognize your theme. When it comes to compressing JavaScript, this site has you covered. However, use the "crunch" feature as I've received weird results using "compress."

Alternatively, you can check out my method of CSS compression utilizing PHP.
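The whitespace stripping described above can be approximated in a few regular expressions. A crude sketch (not the author's PHP method; real minifiers such as CSS Tweak also handle strings, url() values and other edge cases this ignores):

```python
import re

def strip_css(css: str) -> str:
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)  # drop comments
    css = re.sub(r"\s+", " ", css)                   # collapse whitespace
    css = re.sub(r"\s*([{};:,])\s*", r"\1", css)     # tighten punctuation
    return css.strip()

css = """
/* header styles */
h1 {
    color : red ;
    margin : 0 ;
}
"""
print(strip_css(css))  # h1{color:red;margin:0;}
```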

4) Avoid JavaScript Where Possible

In addition to adding HTTP requests and size to the site, the execution of the JavaScript (depending on what it does) can slow your site. Things like Live Search, Live Comments, Live Archives are tied to large JS files that like to keep your readers' browsers busy. The less, the better.

5) Strip Extraneous PHP/MySQL Calls

This step is probably only worth pursuing once you have completely exhausted the other tips. The K2 theme my site is vaguely based upon originally comes with support for many plugins and features, many of which I don't use. By going through each file and removing the PHP calls for plugins I'm not using or features I don't need, I can take some of the load off of the server. When it comes time to hit the frontpage of Digg or Slashdot, your server will more than thank you. Some aspects of this can be exemplified by hardcoding items where feasible. Things in code that don't change in your installation, such as the name of your blog or your feed or stylesheet location, can be hardcoded. In K2 these items rely on a WordPress PHP tag such as bloginfo. It's hard to explain what sorts of things you can strip from your website's PHP framework, but be on the lookout for things you don't use on your site. For example, in the K2 comments file there is a PHP if else that looks to see if live comments are enabled and utilizes them if so. Since I don't use live comments, I can completely remove the if part and write it so that regular comments are always used.

Also, using too many WordPress plugins can be a bad thing, especially if those plugins are dependent on many MySQL commands, which generally take much, much longer to execute than PHP and can slow a whole page down.

Miscellaneous Thoughts

Even if you don't call on a piece of CSS that has an image, it is still loaded – so you might want to rethink using that one CSS selector that hardly gets called. When it comes to using a pre-made theme for your CMS, it's a good idea to go through the CSS and look for things that aren't used. For example, with K2 there was a bit of CSS defined for styling sub-pages. I don't have any sub-pages so I removed that piece of CSS.

If your site is maintained using a CMS of some sort, you likely have several plugins, if not dozens, running behind the scenes. Going along with the theme of things, you will want to deactivate any plugins that aren't mission critical. They use server resources and add to the PHP processing load.

 

from: http://paulstamatiou.com/5-ways-to-speed-up-your-site

 Posted at 9:54 AM

HTTP: must-revalidate

HTTP protocol
March 9, 2011
 

must-revalidate can appear in HTTP response headers. I had never paid it much attention; roughly, it means:
suppose the server specifies an explicit expiration time or freshness lifetime for a resource, and also declares a modification time or an ETag-style validator. A question then arises: when the resource is used, should the client check back with the server (via the modification time or ETag) to confirm the resource is still current? Absent an explicit instruction, each agent falls back on its own default behavior; but if the server declares must-revalidate, then once the resource is stale, every use of it requires revalidating with the server first.

Reference: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.4

must-revalidate
      Because a cache MAY be configured to ignore a server's specified
      expiration time, and because a client request MAY include a
      max-stale directive (which has a similar effect), the protocol
      also includes a mechanism for the origin server to require
      revalidation of a cache entry on any subsequent use. When the
      must-revalidate directive is present in a response received by a
      cache, that cache MUST NOT use the entry after it becomes stale
      to respond to a subsequent request without first revalidating it
      with the origin server. (I.e., the cache MUST do an end-to-end
      revalidation every time, if, based solely on the origin server's
      Expires or max-age value, the cached response is stale.)

      The must-revalidate directive is necessary to support reliable
      operation for certain protocol features. In all circumstances an
      HTTP/1.1 cache MUST obey the must-revalidate directive; in
      particular, if the cache cannot reach the origin server for any
      reason, it MUST generate a 504 (Gateway Timeout) response.

      Servers SHOULD send the must-revalidate directive if and only if
      failure to revalidate a request on the entity could result in
      incorrect operation, such as a silently unexecuted financial
      transaction. Recipients MUST NOT take any automated action that
      violates this directive, and MUST NOT automatically provide an
      unvalidated copy of the entity if revalidation fails.

      Although this is not recommended, user agents operating under
      severe connectivity constraints MAY violate this directive but,
      if so, MUST explicitly warn the user that an unvalidated
      response has been provided. The warning MUST be provided on each
      unvalidated access, and SHOULD require explicit user
      confirmation.

   proxy-revalidate
      The proxy-revalidate directive has the same meaning as the
      must-revalidate directive, except that it does not apply to
      non-shared user agent caches. It can be used on a response to an
      authenticated request to permit the user's cache to store and
      later return the response without needing to revalidate it
      (since it has already been authenticated once by that user),
      while still requiring proxies that service many users to
      revalidate each time (in order to make sure that each user has
      been authenticated). Note that such authenticated responses also
      need the public cache control directive in order to allow them
      to be cached at all.
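The rule in the quoted text reduces to a small decision procedure, sketched here (the names and the permissive default are illustrative, not from any real cache implementation):

```python
def cache_allows_stale() -> bool:
    # configuration-dependent (e.g. honoring a max-stale request
    # directive); a permissive default for the sketch
    return True

def may_serve_from_cache(age: int, max_age: int,
                         must_revalidate: bool,
                         revalidated: bool) -> bool:
    """Decide whether a cache may answer from a stored entry.

    age / max_age are in seconds; revalidated means the cache has
    just confirmed freshness with the origin server.
    """
    fresh = age <= max_age
    if fresh:
        return True       # fresh entries may always be served
    if must_revalidate:
        # stale + must-revalidate: only after successful revalidation;
        # per the RFC, if the origin is unreachable the cache must
        # answer 504 (Gateway Timeout) instead
        return revalidated
    # without must-revalidate, a cache MAY be configured to serve
    # stale responses
    return revalidated or cache_allows_stale()

print(may_serve_from_cache(120, 60, True, False))  # False
```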
 Posted at 8:38 PM