It's also worth looking at SystemTap or DTrace, depending on what OS you're running. While strace will allow you to look at an individual process and its children, SystemTap/DTrace will allow you to gather data on system call usage (and then some) system-wide. Some examples:
- monitor execve() calls system-wide
- monitor I/O to a specific file, from any process
- measure per-process network usage
(note that newer Linux kernels may have other ways of accomplishing some of these tasks that I'm not aware of).
I've had a lot of success using SystemTap to look at low-level filesystem performance issues in the kernel. We've run SystemTap scripts on our production filesystem servers for over a year with no problems whatsoever.
strace tip #31415927: If your program is I/O bound, sometimes you can improve performance by increasing the size of the read buffer. A bigger buffer means fewer system calls and potentially increased performance. How do you know how big the read buffer is? Sometimes it's hard to tell even if you have the source (e.g. you're using stdio). With strace you can see the number of bytes you're trying to slurp in with each read system call. If it looks like a small number, you can then go figure out how to make it a bigger number, perhaps using setvbuf or rolling your own buffered I/O.
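To make the "fewer system calls" point concrete, here is a minimal Python sketch (not from the original tip) that reads the same file with two chunk sizes and counts how many read calls each one needs. The helper names count_reads and make_sample_file and the sizes 4096 and 65536 are just illustrative choices. os.read is a thin wrapper over read(2), so the counts should line up with what strace would show.

```python
import os
import tempfile


def count_reads(path, chunk_size):
    """Read the whole file with os.read; return (read calls made, bytes read)."""
    fd = os.open(path, os.O_RDONLY)
    calls = 0
    total = 0
    try:
        while True:
            data = os.read(fd, chunk_size)  # one read(2) per iteration
            calls += 1
            if not data:  # empty result means end of file
                break
            total += len(data)
    finally:
        os.close(fd)
    return calls, total


def make_sample_file(size):
    """Create a throwaway file of `size` bytes and return its path (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"x" * size)
    return path


if __name__ == "__main__":
    path = make_sample_file(1 << 20)  # 1 MiB of filler data
    try:
        for chunk in (4096, 65536):
            calls, total = count_reads(path, chunk)
            print(f"chunk={chunk}: {calls} read calls for {total} bytes")
    finally:
        os.remove(path)
```

Running the script under strace -c would be one way to confirm the call counts from the outside.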
Yes, buffer size can have a significant effect on performance. You can quickly see buffer sizes used by "read" across your system with "dtrace -n 'syscall::read:entry{ @ = quantize(arg2); }'", which summarizes the output (in case you're doing more of these than you can reasonably see in the console) and has significantly less impact on the program you're tracing. Output for my system:
For those of you on a Mac, which doesn't have strace but does have dtrace, here are some preinstalled dtrace scripts you have at your fingertips:
dtruss - similar to strace
opensnoop - all file opens
nettop - all network access
iosnoop/iotop - all io
execsnoop - all new processes
errinfo - all system calls resulting in errors
Not exactly the same info, but I think it's much more powerful, as it is system-wide and you can always filter out what you don't need to know, or write your own scripts!
Helped me troubleshoot why sudo was taking 25+ seconds yesterday. Apparently it was timing out attempting to perform some NIS operations on a misconfigured setup.
Does anyone know of anything that will parse the output of strace or dtrace? What I'd like to do is generate a graph of my pipeline showing which program calls which other program, how long it takes to return, which files it uses, etc.
I think it would be a great visualization to go along with my documentation.
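For what it's worth, a rough sketch of the parsing side in Python might look like the following. It assumes the trace was written with something like strace -f -o trace.txt, so that every line starts with a pid. The function names parse_strace and print_tree are made up for this example, and it does not handle timing at all. It only picks up fork/clone return values (including the "<... clone resumed>" form), execve paths and opened files, and it ignores failed calls; an execve or open that was split into unfinished/resumed lines is missed.

```python
import re
import sys
from collections import defaultdict

# "PID call(args) = ret" as written by strace -f -o FILE
LINE_RE = re.compile(
    r'^(?P<pid>\d+)\s+(?P<call>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)')
# "PID <... call resumed> ...) = ret" (second half of an interrupted call)
RESUMED_RE = re.compile(
    r'^(?P<pid>\d+)\s+<\.\.\. (?P<call>\w+) resumed>.*\)\s+=\s+(?P<ret>-?\d+)')
QUOTED_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')

SPAWN_CALLS = {"fork", "vfork", "clone", "clone3"}
OPEN_CALLS = {"open", "openat"}


def parse_strace(lines):
    """Return (children, programs, files), each keyed by pid."""
    children = defaultdict(list)   # pid -> child pids, in order
    programs = defaultdict(list)   # pid -> paths passed to execve
    files = defaultdict(set)       # pid -> paths opened successfully
    for line in lines:
        m = LINE_RE.match(line) or RESUMED_RE.match(line)
        if not m:
            continue  # signals, exit markers, unfinished halves, etc.
        pid, call, ret = int(m.group("pid")), m.group("call"), int(m.group("ret"))
        if ret < 0:
            continue  # failed call
        args = m.groupdict().get("args") or ""
        first_path = QUOTED_RE.search(args)
        if call in SPAWN_CALLS and ret > 0:
            children[pid].append(ret)
        elif call == "execve" and first_path:
            programs[pid].append(first_path.group(1))
        elif call in OPEN_CALLS and first_path:
            files[pid].add(first_path.group(1))
    return children, programs, files


def print_tree(pid, children, programs, files, depth=0, seen=None):
    """Print one process and its descendants as an indented tree."""
    seen = set() if seen is None else seen
    if pid in seen:  # guard against pid reuse creating a loop
        return
    seen.add(pid)
    name = programs[pid][-1] if programs.get(pid) else "?"
    opened = len(files.get(pid, ()))
    print(f"{'  ' * depth}{pid} {name} ({opened} files opened)")
    for child in children.get(pid, []):
        print_tree(child, children, programs, files, depth + 1, seen)


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as f:
        children, programs, files = parse_strace(f)
    all_pids = set(children) | set(programs) | set(files)
    spawned = {c for kids in children.values() for c in kids}
    for root in sorted(all_pids - spawned):
        print_tree(root, children, programs, files)
```

From there, turning the three dictionaries into Graphviz input or some other graph format should be fairly mechanical.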
The comments in the code claim it will show elapsed time for each process, but that's not working for me.
I discovered the Python ptrace module while I was searching for this. I have a project for which modifying the Python module might be a nice alternative to parsing strace output.
I'm referring to dtruss which is a launcher like strace and is built on dtrace. The dtruss "interface" predates dtrace, but historically didn't need to be run as root.
This has bugged me for quite a while, a few days ago enough so that I actually grabbed the strace source code with the intent of compiling it on OSX, at which point I discovered the real reason: ptrace on OSX doesn't support the PTRACE_SYSCALL flag. (I gave up at that point.)
I cannot think of a single time when I'm debugging an errant program when I would like to have it run as root. I can however think of many things, like debugging permission issues, resource limit issues, reading and writing files, etc where it makes a big difference to run as root. I know you can su and then su back to yourself, but that's a pain and things aren't exactly the same anymore, i.e. there is a usability problem.
You can send your complaints to Apple and hope that they fix it, or use free software that doesn't do stupid things like this. IMO, you kind of forfeited your right to complain when you started using non-free software.
Usually you debug code with gdb and the like. dtrace is more for profiling the whole setup and figuring out where to optimize. Plus you can, I believe, instrument your code with dtrace probes, which is even better. If a company wants their product optimized, they'll have to provide the tools for it.