Hi,
Sometimes we get:
Caused by: java.lang.OutOfMemoryError: Map failed
at sun.nio.ch.FileChannelImpl.map0(Native Method)
at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:885)
... 25 more
(Sorry, we lost the full stack trace; it starts with a java.io.IOException that is caused by this OOM.)
As far as I understand, this happens because the ‘mmap’ call fails while an index file is being mapped:
OffsetIndex.scala:
val idx = raf.getChannel.map(FileChannel.MapMode.READ_WRITE, 0, len)
It is probably caused by a memory limit.
But after a restart (and after installing more memory, adjusting ulimits, etc.) we end up with a long startup time. The IOException is not handled, so the active log files are never flushed and some .index files may be corrupted. This leads to long startup times (40 minutes or more for us).
Is it possible to handle this exception, flush the active log files and indexes, and exit properly? With a large number of topics/partitions, recovering everything can take an ‘infinite’ amount of time.
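To make the request concrete, here is a minimal sketch in plain Java of the behaviour I have in mind. It is not Kafka's code: IndexMapper, mapIndex, mapOrFlush, flushAll and isMapFailure are hypothetical names, and it assumes the OutOfMemoryError arrives wrapped in an IOException, as in the trace above.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Kafka code: keeps track of mapped index buffers
// so they can be flushed if a later map() call fails.
class IndexMapper {
    private final List<MappedByteBuffer> mapped = new ArrayList<>();

    // Same map() call as in OffsetIndex.scala. The mapping stays valid
    // after the channel is closed.
    MappedByteBuffer mapIndex(File file, long len) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(len);
            MappedByteBuffer idx =
                raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, len);
            mapped.add(idx);
            return idx;
        }
    }

    // True if the cause chain contains an OutOfMemoryError, which is how
    // the "Map failed" error shows up in the trace.
    static boolean isMapFailure(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof OutOfMemoryError) {
                return true;
            }
        }
        return false;
    }

    // Forces every tracked buffer to disk; returns how many were flushed.
    int flushAll() {
        for (MappedByteBuffer b : mapped) {
            b.force();
        }
        return mapped.size();
    }

    // Maps the file. On a map failure, flushes what is already mapped and
    // returns null so the caller can shut down cleanly.
    MappedByteBuffer mapOrFlush(File file, long len) throws IOException {
        try {
            return mapIndex(file, len);
        } catch (IOException e) {
            if (!isMapFailure(e)) {
                throw e;
            }
            flushAll();
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("00000000000000000000", ".index");
        f.deleteOnExit();
        IndexMapper mapper = new IndexMapper();
        MappedByteBuffer idx = mapper.mapOrFlush(f, 4096);
        if (idx == null) {
            System.err.println("Map failed; mapped indexes flushed, exiting");
            System.exit(1);
        }
        idx.putLong(0, 42L);
        System.out.println("flushed " + mapper.flushAll() + " index buffer(s)");
    }
}
```

The active log files could presumably be flushed the same way with FileChannel.force before exiting, so that the next startup would not need a full recovery.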
Thanks,
Pavel.