Why are log entries dropping under load when writing to a network file system?
#1
I just ran into a weird issue where my application's logging system seems to be dropping entries under heavy load, but only when writing to a network file system. The log files show gaps, and I can't find any error messages about write failures in either the system logs or the application logs. It's like the data just silently disappears.
#2
I've seen this when the app buffers logs and writes to an NFS mount. The write path hands each entry to a queue and assumes it landed, but under load the buffer fills and some data never makes it to disk. No obvious errors in the app logs, just gaps in the log file.
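
For what it's worth, here is a minimal sketch of how a bounded queue can count what it drops instead of losing it silently. It is not our actual logger: `CountingLogQueue`, `submit`, `drain`, and the `sink` callable are hypothetical names made up for illustration, and a real setup would drain from a background thread.

```python
import queue
import threading


class CountingLogQueue:
    """Bounded log queue that counts the entries it has to drop.

    Hypothetical helper for illustration; not taken from any real logger.
    """

    def __init__(self, sink, maxsize=1000):
        self._queue = queue.Queue(maxsize=maxsize)
        self._sink = sink  # callable that performs the real (slow) write
        self._lock = threading.Lock()
        self.accepted = 0
        self.dropped = 0

    def submit(self, line):
        """Enqueue without blocking; record a drop if the queue is full."""
        try:
            self._queue.put_nowait(line)
        except queue.Full:
            with self._lock:
                self.dropped += 1
            return False
        with self._lock:
            self.accepted += 1
        return True

    def drain(self):
        """Pass everything currently queued to the sink; return the count."""
        written = 0
        while True:
            try:
                line = self._queue.get_nowait()
            except queue.Empty:
                return written
            self._sink(line)
            written += 1
```

If `dropped` climbs during a burst, the gaps are coming from the application's own buffer, not from NFS.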
#3
We added a tiny sanity check: a separate heartbeat file updated every few seconds, plus a simple counter for writes. Under heavy bursts the write latency spiked and the main log lagged behind, so gaps showed up even though the app kept running. It helped reveal the timing, not the cause.
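
Roughly what that check looks like, as a sketch and not the exact code we ran. `WriteProbe`, `timed_write`, and `beat` are hypothetical names, and the heartbeat line format is an assumption; the caller is expected to invoke `beat()` on whatever interval suits it.

```python
import os
import time


class WriteProbe:
    """Counts log writes, tracks worst-case latency, and writes a heartbeat file.

    Hypothetical helper for illustration only.
    """

    def __init__(self, heartbeat_path):
        self.heartbeat_path = heartbeat_path
        self.writes = 0
        self.max_latency = 0.0

    def timed_write(self, fh, line):
        """Write one line to fh, flush it, and record how long that took."""
        start = time.monotonic()
        fh.write(line)
        fh.flush()
        elapsed = time.monotonic() - start
        self.writes += 1
        self.max_latency = max(self.max_latency, elapsed)
        return elapsed

    def beat(self):
        """Replace the heartbeat file with the current time and counters."""
        tmp_path = self.heartbeat_path + ".tmp"
        with open(tmp_path, "w") as fh:
            fh.write(
                f"{time.time():.3f} writes={self.writes} "
                f"max_latency={self.max_latency:.6f}\n"
            )
        # Rename over the old file so readers never see a half-written heartbeat.
        os.replace(tmp_path, self.heartbeat_path)
```

Comparing the `writes` counter in the heartbeat against the number of lines that actually reached the log file shows how far the log is lagging at each beat.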
#4
We tried turning off the async path and forcing synchronous writes for the log file. Reliability improved a bit but throughput collapsed, so we dropped that test. Still not sure whether the root cause is client-side NFS cache eviction or server-side writeback.
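
One middle ground we have not tried, so treat it as an untested sketch: write in batches and call `fsync` once per batch instead of once per line. As far as I know, on NFS a failed write may only be reported at `fsync()` or `close()`, not at `write()`, so checking the sync result is where a silent failure would finally show up. `write_batch_durably` is a hypothetical helper name.

```python
import os


def write_batch_durably(fd, lines):
    """Write a batch of lines to fd, then fsync once.

    Raises OSError if any write or the final fsync fails.
    Returns the number of bytes written.
    """
    data = "".join(lines).encode("utf-8")
    offset = 0
    while offset < len(data):
        # os.write may write fewer bytes than requested, so loop until done.
        offset += os.write(fd, data[offset:])
    # Errors deferred by the client cache are expected to surface here.
    os.fsync(fd)
    return len(data)
```

The batch size then becomes the knob between throughput and how many entries are at risk when a sync fails.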
#5
Do you think the gaps line up with rotate events or NFS cache flush windows?
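
If it helps to check, here is a small sketch for lining the two up. It assumes you can already parse the log timestamps and the rotation times into `datetime` objects; `find_gaps` and `gaps_containing` are hypothetical helper names, and the threshold is whatever counts as "too quiet" for your log.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps: list[datetime], threshold: timedelta):
    """Return (previous, next) timestamp pairs spaced further apart than threshold."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > threshold]


def gaps_containing(gaps, events: list[datetime]):
    """Return the gaps that contain at least one event time, e.g. a rotation."""
    return [(a, b) for a, b in gaps if any(a <= e <= b for e in events)]
```

If most gaps contain a rotation time, look at the rotate path first; if almost none do, the flush-window theory looks more likely.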