The second parameter of fgets

Background

string fgets ( resource $handle [, int $length ] );

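The second parameter specifies the maximum length that can be read. Interestingly, if no newline is reached, the function returns $length - 1 bytes rather than $length bytes. The documentation says so, and that is what actually happens. So why this little quirk? Wouldn't it be better to return exactly what was asked for? Why one byte fewer?

Analysis

PHP is written in C, so this may not be a deliberate choice by PHP; perhaps C itself behaves this way. So: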

man fgets

fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by
s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A '\0' is
stored after the last character in the buffer.

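So this is related to the size of the string buffer: a C string must always end with "\0" (a zero, not the letter O), so the number of characters actually read is one less than the specified size.

If a line is 3 characters long (including the newline) and fgets is given a maximum length of 3, the newline is not read: only 2 characters come back, and the newline is returned by the next call.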
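The rule can be tried out with a small model. This is only a sketch in Python that imitates the documented behaviour; fgets_model is a made-up name, it is neither the C nor the PHP implementation, and unlike C fgets (which returns NULL at end of file) it returns an empty result.

```python
import io

def fgets_model(stream, size):
    """Imitate fgets(): return at most size - 1 bytes, stopping after a newline."""
    out = bytearray()
    while len(out) < size - 1:   # one slot is reserved for the trailing '\0' in C
        ch = stream.read(1)
        if not ch:               # end of file
            break
        out += ch
        if ch == b"\n":          # the newline is kept, then reading stops
            break
    return bytes(out)

if __name__ == "__main__":
    stream = io.BytesIO(b"ab\ncd\n")   # each line is 3 bytes including the newline
    print(fgets_model(stream, 3))      # b'ab'  (the newline does not fit)
    print(fgets_model(stream, 3))      # b'\n'  (returned by the next call)
    print(fgets_model(stream, 3))      # b'cd'
```

With a size of 4 or more, each call would return a whole line including its newline.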

A PHP version of tail -f

Background

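A new file is generated every day, and a program has to read the current file in real time, much like tail -f, except that it must also switch to the new file automatically.

Problem

Script

For details, see the implementation of fseek in main/streams/streams.c.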
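The original script is not shown here, so the following is only a rough sketch of the idea, written in Python rather than PHP. It assumes a hypothetical callable current_path that returns the name of the file that should be read right now (for example, a name built from today's date); follow and max_idle_polls are likewise names made up for this sketch.

```python
import os
import time

def follow(current_path, poll_interval=1.0, max_idle_polls=None):
    """Yield complete lines as they are appended, switching files when
    current_path() starts returning a different name.

    current_path is a hypothetical callable returning the file to read now.
    max_idle_polls bounds consecutive empty polls (None means poll forever).
    """
    path, fh, idle = current_path(), None, 0
    while max_idle_polls is None or idle < max_idle_polls:
        if fh is None and os.path.exists(path):
            fh = open(path, "rb")
        if fh is not None:
            pos = fh.tell()
            line = fh.readline()
            if line.endswith(b"\n"):
                idle = 0
                yield line
                continue
            fh.seek(pos)  # EOF or a half-written line: rewind and retry later
        new_path = current_path()
        if new_path != path:
            # The old file has been drained; move on to the new one.
            if fh is not None:
                fh.close()
            path, fh = new_path, None
            continue
        idle += 1
        time.sleep(poll_interval)
    if fh is not None:
        fh.close()
```

A caller could pass something like lambda: time.strftime("/tmp/app-%Y%m%d.log") as current_path (the path is made up). One known gap in this sketch: a line written to the old file after the switch is missed.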

 

 

How to load debug symbols with GDB

Reposted from: http://marcioandreyoliveira.blogspot.jp/2008/03/how-to-debug-striped-programs-with-gdb.html

My friend Wanderley asked me whether it is possible to tell GDB to load debugging symbols from some file and use them to help debug a program that doesn't have them.

Yes. It is.

There are two solutions to this question.

I am going to explain the first solution in this post. The other solution I will explain in the next post.

You can load debugging information from a debug-enabled version of the executable file.

In order to better explain the first solution, I will setup my sample environment as follows:

  • release.c: source code of the program we wish to debug (listing 1).
  • ~/estudo/: the source code of our program will be put here.
  • ~/local/bin: the stripped version of the binary will stay here.
  • ~/local/symbols: all files that contain debugging information are kept here.

Listing 1 – sample program source code

I have two versions of the program: with and without debugging information.

1 – You compile your program with debug information. In our sample:

gcc -Wall -g -o release release.c <ENTER>

2 – You make a copy of your program. In our sample:

cp release release.full <ENTER>

3 – You strip off the debugging information:

strip -s release <ENTER>

As you can see in Figure 1, we have two programs. release.full has debugging symbols but release doesn't have them.

Figure 1

4 – Move file release to ~/local/bin/:

mv release ~/local/bin <ENTER>

5 – Move file release.full to ~/local/symbols/:

mv release.full ~/local/symbols <ENTER>

6 – Go to directory ~/local/bin

cd ~/local/bin <ENTER>

7 – Run GDB:

gdb ./release <ENTER>

8 – Try the list command to see that the release executable file doesn't have symbols in it.

Note: if the program was already running you could get its PID then attach GDB to it.

Figure 2 shows us two windows. The first one shows that our executable file has no debug information. In the other window we can see that release is not yet running.

Figure 2 – executable file named release is loaded by GDB but it is not yet running.

9 – Tell GDB to load symbols from the executable file named release.full. This binary version of our program has all the symbols that we need for debugging.

Please notice that GDB will not replace the release executable with the release.full version of our program. It will just import symbols from release.full into the release debugging session.

But GDB needs to know in advance where it must put the symbols it will load. How can you determine the correct memory address?

It is quite simple. Issue the following maint command inside GDB:

maint info sections

Then you look for .text section. The address that is in the first column is what you want. In our sample, it is 0x08048320. See figure 3.

Figure 3 – looking for .text section address

10 – The next step is to instruct GDB to load debug symbols into .text section. To achieve it you do this:

add-symbol-file ~/local/symbols/release.full <.text section address>

In our sample it means to type:

add-symbol-file ~/local/symbols/release.full 0x08048320

From now on you can debug your program as usual.

Figure 4 shows us that the debugging symbols were imported successfully and that the list command (abbreviated as l) now shows us the program source code.

Figure 4 – now our GDB session has debugging symbols

As you can see in figure 5, I set a breakpoint at line 17 and ran the program, which stopped there. Then I printed the variable i.

In the other terminal I issued ps command. It was done just to show you that the only program running was release executable. There is no instance of release.full program.

Figure 5 – debugging session.

I hope this post will make your life easier. Next time I will teach you another way to import debugging symbols.

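The third parameter of pcntl_signal

Background

Suppose a signal handler is installed for a process. If the process is in the middle of a system call when a signal suddenly arrives, control transfers to the signal handler. Once the handler has finished, does the system call continue?

Answer

Both C and other languages make this a choice. In PHP, for example, pcntl_signal has a third parameter (default true): the unfinished system call can be resumed (the default), or the system call can be made to return with a failure (set the third parameter to false).

This sometimes matters. For example, when fpm exits, it first sends a signal to every child process. A child that receives the signal does not die immediately: it closes its input, stops accepting connections, and sets in_shutdown = 1. The signal handler then returns and the child resumes what it was doing (if it was in a system call, it goes back to that system call). After finishing the current request it sees that in_shutdown == 1 and exits.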
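The "set a flag, finish the current request, then exit" pattern can be sketched as follows. This is an illustration in Python, not fpm's code: the Worker class, its method names and the choice of SIGUSR1 are all assumptions made for the sketch. It relies on Python's own default of retrying an interrupted system call when the handler returns normally, which plays the role of the "restart" setting discussed above.

```python
import os
import signal

class Worker:
    """Hypothetical worker that shuts down gracefully on a signal."""

    def __init__(self):
        self.in_shutdown = False

    def on_signal(self, signum, frame):
        # Only record the request; the worker exits later, at a safe point.
        self.in_shutdown = True

    def serve(self, requests, handle):
        """Handle requests in order; stop after the one in progress
        once a shutdown has been requested."""
        results = []
        for request in requests:
            results.append(handle(request))
            if self.in_shutdown:
                break
        return results

if __name__ == "__main__":
    worker = Worker()
    signal.signal(signal.SIGUSR1, worker.on_signal)  # SIGUSR1 is an arbitrary choice

    def handle(request):
        if request == 2:
            # Simulate the master process signalling us in mid-request.
            os.kill(os.getpid(), signal.SIGUSR1)
        return request * 10

    print(worker.serve([1, 2, 3], handle))
```

The request that was in progress when the signal arrived is still completed; only the following ones are skipped.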

putenv/getenv/$_ENV/phpinfo(INFO_ENVIRONMENT)

putenv/getenv, $_ENV, and phpinfo(INFO_ENVIRONMENT) are three completely distinct environment stores. Doing putenv("x=y") does not affect $_ENV, and doing $_ENV["x"]="y" likewise does not affect getenv("x"). Neither affects what phpinfo() returns.

putenv() adds a setting to the server environment. The environment variable exists only for the duration of the current request; at the end of the request the environment is restored to its original state.

Assuming the USER environment variable is defined as "dave" before running the following:

 

prints:

env is: dave
(doing: putenv fred)
env is: dave
getenv is: fred
(doing: set _env barney)
getenv is: fred
env is: barney
phpinfo()

Environment

Variable => Value

USER => dave

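Response and Transfer-Encoding: chunked, Content-Length, Content-Encoding: gzip

Background

Anyone who knows the differences between HTTP 1.1 and HTTP 1.0 knows that Transfer-Encoding: chunked and persistent connections (Connection: keep-alive) both arrived as standard features with HTTP 1.1.

Connection: keep-alive lets one connection serve multiple HTTP requests, whereas under the HTTP 1.0 protocol each TCP connection handles only one HTTP request.

Also, in HTTP 1.0, if a response is large and its size cannot be known in advance, the server simply omits Content-Length and closes the connection once the output is complete. To some extent, Content-Length is optional in HTTP 1.0.

In HTTP 1.1, what should happen when the connection is keep-alive and the size of the content cannot be known in advance? That is why Transfer-Encoding: chunked exists.

With that in mind, consider the following: under the same HTTP 1.1, why do some responses use Content-Length while others use Transfer-Encoding: chunked? Is some special setting required?

A response may not contain much content, say only 10 lines, but each line may take 5 seconds to generate. Wouldn't the user experience be better if the output were sent in chunks? And how is output sent in chunks?

At the protocol level, a response that carries Content-Length is clearly not sent in chunks, while one with Transfer-Encoding: chunked may be.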
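To make the wire format concrete, here is a minimal sketch of how a body is framed under Transfer-Encoding: chunked: each chunk is its length in hexadecimal, CRLF, the data, CRLF, and a zero-length chunk ends the body. encode_chunked is a name made up for this sketch; it ignores optional trailers and chunk extensions.

```python
def encode_chunked(parts):
    """Frame each non-empty part as one HTTP/1.1 chunk and append the
    terminating zero-length chunk."""
    out = bytearray()
    for part in parts:
        if part:  # a zero-length chunk would end the body early, so skip empties
            out += b"%x\r\n" % len(part) + part + b"\r\n"
    out += b"0\r\n\r\n"
    return bytes(out)

if __name__ == "__main__":
    print(encode_chunked([b"hello", b"world!"]))
    # b'5\r\nhello\r\n6\r\nworld!\r\n0\r\n\r\n'
```

Because every chunk announces its own length, the receiver can tell where the body ends without a Content-Length header and without the connection being closed.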

 

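From the references:

Servers generally use Transfer-Encoding: chunked in two situations:

1. The application has already written a lot of content to the web server (the web server's buffer is full) but shows no sign of finishing (the output is not complete yet). The web server then gives up on sending Content-Length. Under HTTP 1.0 it just sends the content (and closes the connection when done); under HTTP 1.1 it sends the output with Transfer-Encoding: chunked.

2. The application explicitly flushes content to the client. In PHP this is: ob_flush(); flush(); (see: http://php.net/flush )

So for page serving, if there is a lot of content to output, you can flush each part as soon as it is generated so that the browser can render it for the user as early as possible, which gives a better experience.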
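The same "flush each part as it is generated" idea can be sketched with Python's standard http.server. This is an illustration and not how PHP or a production web server does it: StreamingHandler and fetch_once are names made up for the sketch, and the handler writes the chunk framing by hand.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

class StreamingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # chunked encoding needs HTTP/1.1

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Transfer-Encoding", "chunked")  # no Content-Length
        self.end_headers()
        for i in range(3):
            part = f"line {i}\n".encode()
            # One chunk per generated part: hex length, CRLF, data, CRLF.
            self.wfile.write(b"%x\r\n%s\r\n" % (len(part), part))
            self.wfile.flush()  # send it now (a no-op if wfile is unbuffered)
        self.wfile.write(b"0\r\n\r\n")  # terminating chunk

    def log_message(self, *args):
        pass  # keep the demo quiet

def fetch_once():
    """Start the server on a free local port, fetch one response, stop it."""
    server = HTTPServer(("127.0.0.1", 0), StreamingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/")
        resp = conn.getresponse()
        header = resp.getheader("Transfer-Encoding")
        body = resp.read()  # http.client removes the chunk framing
        conn.close()
        return header, body
    finally:
        server.shutdown()
        server.server_close()

if __name__ == "__main__":
    print(fetch_once())
```

In a real page, each part would be written as soon as it is ready, so the client receives the first lines while the later ones are still being generated.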

Content-Encoding:gzip

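gzip is a content encoding, that is, whether the content is compressed. Compression happens before transfer, so the chunks are chunks of the compressed data (which is perhaps stating the obvious).

If gzip compression is enabled in Nginx, the output necessarily uses Transfer-Encoding: chunked, for the following reasons:

The Gzip module and r->headers_out.content_length_n in Nginx

r->headers_out.content_length_n: Nginx uses this internally to represent the length of the response body. Note that the two are not always equal: the value is meaningful only when r->headers_out.content_length_n >= 0. For example, with a typical backend upstream (such as PHP), if the script does not explicitly send a content-length header, then r->headers_out.content_length_n = -1 by default in nginx.

The Gzip module is also a typical filter module. Here is a brief introduction; a detailed description can follow later. In its header filter it clears both r->headers_out.content_length_n and the content_length in the output headers. Why clear them? Mainly because gzip compresses the body, and at header-filter time the gzip module cannot compute the length of the compressed content (in nginx, header output and body output are two completely separate phases). So the best option is to clear content-length from the headers. Combined with the chunked module described earlier, this shows that in nginx, if gzip is used and the connection is keep-alive, the response is necessarily in chunked mode.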
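Why the compressed length is unknown when the headers go out can be seen with a streaming compressor. The sketch below uses Python's zlib, not nginx's code, and gzip_stream is a made-up name: compressed pieces come out as the body arrives, and their total size is only known after the final flush.

```python
import zlib

def gzip_stream(parts):
    """Yield gzip-compressed pieces while the body parts are still arriving."""
    comp = zlib.compressobj(wbits=31)  # wbits=31 selects the gzip container
    for part in parts:
        piece = comp.compress(part)
        if piece:            # the compressor may buffer and return nothing yet
            yield piece
    yield comp.flush()       # the tail is only available once the body is complete

if __name__ == "__main__":
    body_parts = [b"a" * 1000, b"b" * 1000]
    sizes = [len(piece) for piece in gzip_stream(body_parts)]
    print(sizes, sum(sizes))  # the sum is known only after the last piece
```

Each piece can be sent as its own chunk, which is why a streaming gzip filter pairs naturally with chunked transfer on a keep-alive connection.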

 

References:

http://blog.xiuwz.com/tag/content-length/

http://lokki.iteye.com/blog/1072327

http://www.cnblogs.com/foxhengxing/archive/2011/12/02/2272387.html

 

 

Measuring the size of the TCP read and write buffers

Test method:

1. Use nc as a server

nc -l localhost 8181

CTRL-Z

2. Use nc as a client

nc localhost 8181 </dev/zero

3. netstat -an |grep 8181

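171008 is the size of the read buffer (about 167 KB), and 1279872 is the size of the write buffer (about 1.2 MB).

The nc client is now blocked in write. Had the nc client sent less than 171008 + 1279872 bytes, it would already have quit happily, even though the server has not processed a single byte. If the server is then killed, all of that data is simply lost.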
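The same experiment can be repeated without nc. The sketch below is only an approximation in Python (fill_until_blocked is a made-up name): a local server accepts a connection and never reads from it, while a non-blocking client keeps sending until the kernel refuses more data. The total it reports depends on the operating system and its buffer settings, so it will generally differ from the numbers above.

```python
import socket

def fill_until_blocked(chunk_size=4096):
    """Send zero bytes to a peer that never reads; return how many bytes
    the kernel accepted before a send would have blocked."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))       # any free local port
    server.listen(1)
    client = socket.create_connection(server.getsockname())
    conn, _ = server.accept()           # accepted, but never read from
    client.setblocking(False)
    chunk = b"\0" * chunk_size
    sent = 0
    try:
        while True:
            sent += client.send(chunk)  # may accept only part of the chunk
    except BlockingIOError:
        pass                            # both buffers are full
    finally:
        client.close()
        conn.close()
        server.close()
    return sent

if __name__ == "__main__":
    print("bytes accepted without a single read:", fill_until_blocked())
```

Every byte counted here was reported as sent to the client even though the receiving application never saw it, which is the point made above about data being lost if the server dies.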