PHPor's Blog

Background

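strace is a staple of the ops toolbox: it traces every system call a process makes. One day, an ops engineer, Xiao Ming, noticed that a large number of image files had unexpectedly appeared in the tmp directory and that the count kept growing. He did not know which process was writing them, let alone which application. strace is of little help here, because it needs a target process. (You could strace every suspicious process, but that is tedious.)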

Recall that inotify can monitor access to files and directories, not just changes to them. Its output looks something like:

Hmm, no process information. Now what?
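To see why inotify cannot say who touched a file, it helps to look at the event record itself. The sketch below unpacks the fixed header of `struct inotify_event` as documented in inotify(7): it carries a watch descriptor, an event mask, a cookie and a name, but no pid field. The buffer is synthetic and the file name `photo.jpg` is a made-up example, so this only illustrates the layout and does not talk to the kernel.

```python
import struct

# Fixed header of struct inotify_event (see inotify(7)):
#   int wd; uint32_t mask; uint32_t cookie; uint32_t len;
# followed by `len` bytes holding the NUL-padded file name.
EVENT_HEADER = struct.Struct("iIII")
IN_CREATE = 0x00000100  # value from <sys/inotify.h>


def parse_events(buf: bytes):
    """Yield (wd, mask, cookie, name) tuples from a raw inotify read buffer."""
    offset = 0
    while offset + EVENT_HEADER.size <= len(buf):
        wd, mask, cookie, name_len = EVENT_HEADER.unpack_from(buf, offset)
        offset += EVENT_HEADER.size
        name = buf[offset:offset + name_len].rstrip(b"\0").decode()
        offset += name_len
        yield wd, mask, cookie, name


# Synthetic buffer standing in for what read() on an inotify fd returns.
# "photo.jpg" is a hypothetical file name used only for illustration.
sample = EVENT_HEADER.pack(1, IN_CREATE, 0, 16) + b"photo.jpg".ljust(16, b"\0")
events = list(parse_events(sample))
print(events)
```

Every field describes the file and the kind of event; nothing identifies the process that caused it.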

There is another powerful tool: SystemTap

Reference documentation: https://sourceware.org/systemtap/SystemTap_Beginners_Guide.pdf

The iotime.stp script from that guide can monitor file changes, and its output includes process information:

#! /usr/bin/env stap
/*
 * Copyright (C) 2006-2007 Red Hat Inc.
 *
 * This copyrighted material is made available to anyone wishing to use,
 * modify, copy, or redistribute it subject to the terms and conditions
 * of the GNU General Public License v.2.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program. If not, see <http://www.gnu.org/licenses/>.
 *
 * Print out the amount of time spent in the read and write systemcall
 * when each file opened by the process is closed. Note that the systemtap
 * script needs to be running before the open operations occur for
 * the script to record data.
 *
 * This script could be used to find out which files are slow to load
 * on a machine. e.g.
 *
 * stap iotime.stp -c 'firefox'
 *
 * Output format is:
 * timestamp pid (executable) info_type path ...
 *
 * 200283135 2573 (cupsd) access /etc/printcap read: 0 write: 7063
 * 200283143 2573 (cupsd) iotime /etc/printcap time: 69
 *
 */
global start
global time_io

function timestamp:long() { return gettimeofday_us() - start }
function proc:string() { return sprintf("%d (%s)", pid(), execname()) }

probe begin { start = gettimeofday_us() }

global filehandles, fileread, filewrite

probe syscall.open.return {
	filename = user_string($filename)
	if ($return != -1) {
		filehandles[pid(), $return] = filename
	} else {
		printf("%d %s access %s fail\n", timestamp(), proc(), filename)
	}
}

probe syscall.read.return {
	p = pid()
	fd = $fd
	bytes = $return
	time = gettimeofday_us() - @entry(gettimeofday_us())
	if (bytes > 0)
		fileread[p, fd] += bytes
	time_io[p, fd] <<< time
}

probe syscall.write.return {
	p = pid()
	fd = $fd
	bytes = $return
	time = gettimeofday_us() - @entry(gettimeofday_us())
	if (bytes > 0)
		filewrite[p, fd] += bytes
	time_io[p, fd] <<< time
}

probe syscall.close {
	if ([pid(), $fd] in filehandles) {
		printf("%d %s access %s read: %d write: %d\n",
			timestamp(), proc(), filehandles[pid(), $fd],
			fileread[pid(), $fd], filewrite[pid(), $fd])
		if (@count(time_io[pid(), $fd]))
			printf("%d %s iotime %s time: %d\n", timestamp(), proc(),
				filehandles[pid(), $fd], @sum(time_io[pid(), $fd]))
	}
	delete fileread[pid(), $fd]
	delete filewrite[pid(), $fd]
	delete filehandles[pid(), $fd]
	delete time_io[pid(), $fd]
}


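This script gets the job done. It carries more logic than the task needs, a sledgehammer to crack a nut, but at least it solves the problem.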
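Since the script reports every file on the machine, the lines of interest still have to be picked out. The following is a minimal sketch, assuming the `access` line format shown in the script's header comment, that sums the bytes written per process for paths under a given prefix. The helper name `writers_under` and the two `/tmp/` sample lines (pid 4242, `convert`) are made up for illustration; only the `cupsd` lines come from the script's own header.

```python
import re
from collections import defaultdict

# Matches the "access" lines printed by iotime.stp when a file is closed:
#   timestamp pid (executable) access path read: N write: N
ACCESS_LINE = re.compile(
    r"^(?P<ts>\d+) (?P<pid>\d+) \((?P<exe>.*?)\) access (?P<path>.+) "
    r"read: (?P<read>\d+) write: (?P<write>\d+)$"
)


def writers_under(lines, prefix):
    """Sum bytes written per (pid, executable) for paths starting with prefix."""
    totals = defaultdict(int)
    for line in lines:
        m = ACCESS_LINE.match(line.strip())
        if m is None:
            continue  # "iotime" lines, failed opens, or unrelated output
        written = int(m["write"])
        if written > 0 and m["path"].startswith(prefix):
            totals[(int(m["pid"]), m["exe"])] += written
    return dict(totals)


sample = [
    # Taken from the header comment of iotime.stp.
    "200283135 2573 (cupsd) access /etc/printcap read: 0 write: 7063",
    "200283143 2573 (cupsd) iotime /etc/printcap time: 69",
    # Hypothetical lines, invented for this example.
    "200290000 4242 (convert) access /tmp/a.jpg read: 0 write: 1024",
    "200290500 4242 (convert) access /tmp/b.jpg read: 0 write: 2048",
]
result = writers_under(sample, "/tmp/")
print(result)
```

In practice the `lines` argument would be the script's standard output read line by line. One caveat: the script records the path exactly as it was passed to open, so a process that opens files by relative path will not match an absolute prefix.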
