Core dump & Memory Alignment

Core dump

A core dump is a file containing a process's address space (memory) when the process terminates unexpectedly. Core dumps may be produced on demand (such as by a debugger) or automatically upon termination. Core dumps are triggered by the kernel in response to program crashes, and may be passed to a helper program (such as systemd-coredump) for further processing. A core dump is not typically used by an average user, but may be passed on to developers upon request, where it can be invaluable as a post-mortem snapshot of the program's state at the time of the crash, especially if the fault is hard to reproduce reliably.
systemd's default behavior is to collect core dumps for all processes and store them in /var/lib/systemd/coredump.
The default action of certain signals is to cause a process to terminate and produce a core dump file, a disk file containing an image of the process's memory at the time of termination. This image can be used in a debugger (e.g., gdb) to inspect the state of the program at the time that it terminated.
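As a quick illustration, here is a minimal C sketch of a program that terminates with a core-dumping signal. The binary name crashme and the debugger commands in the comments are examples of my own, and they assume core dumps are enabled on the system (for instance via "ulimit -c unlimited" or systemd-coredump).

/* Minimal sketch: terminate with a core-dumping signal.
 * Assumes core dumps are enabled, e.g. "ulimit -c unlimited" in the shell,
 * or systemd-coredump collecting them into /var/lib/systemd/coredump.
 * The resulting dump for a hypothetical binary "crashme" could then be
 * inspected post-mortem with, for example:
 *     gdb ./crashme core        (classic core file)
 *     coredumpctl gdb crashme   (systemd-coredump)
 */
#include <stdlib.h>

int main(void)
{
    abort();   /* raises SIGABRT; its default action terminates and dumps core */
    return EXIT_SUCCESS;
}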

Aligned vs Unaligned Memory Access

The problem starts in the memory chip. Memory can only be read a certain number of bytes at a time. If you try to read from an unaligned memory address, your request causes the RAM chip to perform two read operations. For instance, assuming the RAM works in units of 8 bytes, reading 8 bytes from a relative offset of 5 forces the chip to do two reads: the first returns bytes 0-7, the second bytes 8-15. As a result, an already relatively slow memory access becomes even slower.
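To make this concrete, here is a small C sketch (my own illustration, not from the original article) of an 8-byte load taken from offset 5 of a buffer; memcpy is used so the unaligned access stays well-defined C.

/* Sketch: an 8-byte load from an unaligned offset inside a buffer.
 * memcpy keeps the access well-defined C; compilers typically lower it
 * to an ordinary (possibly unaligned) load on x86. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned char buf[16] = {0};
    uint64_t value;

    memcpy(&value, buf + 5, sizeof value);   /* bytes 5..12: straddles an 8-byte unit */
    printf("value = %llu\n", (unsigned long long)value);
    return 0;
}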
Luckily, hardware designers learned to mitigate this problem a long time ago, although the solution is not absolute and the underlying issue remains. In modern computers the CPU caches the memory it reads. The cache is built of cache lines, typically 32 or 64 bytes long. Even when you read just one byte from memory, the CPU fetches a complete cache line and places it into the cache. This way, reading 8 bytes from offset 5 or from offset 0 makes no difference: the CPU reads 64 bytes either way.
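On Linux with glibc you can query the cache line size at run time. The sketch below uses the glibc-specific _SC_LEVEL1_DCACHE_LINESIZE name for sysconf(); it is an assumption that your system reports a value for it at all.

/* Sketch: query the L1 data cache line size on Linux/glibc via sysconf().
 * _SC_LEVEL1_DCACHE_LINESIZE is a glibc extension; it may return 0 or -1
 * on systems that do not report it. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    long line = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    printf("L1 data cache line size: %ld bytes\n", line);
    return 0;
}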
Things get less pretty, however, when a read crosses a cache line boundary. In that case the CPU has to fetch two complete cache lines, i.e. 128 bytes instead of just 8. This is a huge overhead, and it is the overhead I would like to demonstrate.
You may have noticed that I only talk about reads, not about writes. This is because memory reads should give us good enough results on their own. When you write data to memory, the CPU does two things: first it loads the cache line into the cache, then it modifies part of that line in the cache. Note that it does not write the cache line back to memory immediately; instead it waits for a more convenient time (for instance when it is less busy or, on SMP systems, when another processor needs the cache line).
We have already seen that, at least theoretically, the thing that hurts the performance of unaligned memory access is the time it takes the CPU to transfer memory into the cache. Compared to this, the time it takes the CPU to modify a few bytes in the cache line is negligible. As a result, it is enough to test only the reads.
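Below is a rough read-only benchmark sketch in the spirit of the article's test (my own approximation, not the original code): it sums one 8-byte load per cache line, starting either on a 64-byte boundary or 60 bytes into it so that every load straddles two lines. The buffer size, repeat count and the 64-byte line size are assumptions, and actual timings will vary with the CPU and its prefetchers.

/* Sketch: compare 8-byte loads that sit inside one 64-byte cache line
 * with loads that straddle two lines. Sizes and counts are assumptions. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define BUF_SIZE   (64 * 1024 * 1024)   /* 64 MiB, larger than typical caches */
#define LINE_SIZE  64
#define REPEAT     20

static uint64_t sum_at_offset(const unsigned char *buf, size_t offset)
{
    uint64_t sum = 0;
    for (size_t i = offset; i + sizeof(uint64_t) <= BUF_SIZE; i += LINE_SIZE) {
        uint64_t v;
        memcpy(&v, buf + i, sizeof v);   /* one 8-byte load per cache line */
        sum += v;
    }
    return sum;
}

static double run(const unsigned char *buf, size_t offset)
{
    struct timespec t0, t1;
    uint64_t sink = 0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int r = 0; r < REPEAT; r++)
        sink += sum_at_offset(buf, offset);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    fprintf(stderr, "(checksum %llu)\n", (unsigned long long)sink);  /* keep the loop alive */
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    unsigned char *buf = aligned_alloc(LINE_SIZE, BUF_SIZE);
    if (!buf)
        return 1;
    memset(buf, 1, BUF_SIZE);

    printf("aligned reads   : %.3f s\n", run(buf, 0));
    printf("straddling reads: %.3f s\n", run(buf, LINE_SIZE - 4));
    free(buf);
    return 0;
}

Compiling with something like gcc -O2 and comparing the two printed times should show the straddling reads taking noticeably longer on most x86 machines, though hardware prefetching can narrow the gap.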

http://www.alexonlinux.com/aligned-vs-unaligned-memory-access
