Flume survival after kill -9


A customer is testing BDA and using Flume to ingest data into HDFS. He wants to test service resilience, and part of the test simulates an unexpected service breakdown.

When he issues a kill -9 to the Flume process, the Cloudera watchdog immediately restarts it, but the events stored in the file channel never reach the HDFS sink. The HDFS sink never completes the final .tmp renaming, so in summary the system loses events.

Two questions:

1) Is this the expected behaviour?

2) Any hints on how to recover from such a situation (in essence, recover/reload the lost events)?

The source type is spooling directory, with the immediate removal policy (there will be thousands/millions of files coming in daily).
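
For reference, a rough sketch of the kind of agent configuration described in this post. The agent and component names (a1, r1, c1, k1) and all paths are placeholders chosen for illustration, not the customer's actual values; only the component types and the delete policy reflect the setup described here.

```properties
# Hypothetical agent "a1": spooling directory source -> file channel -> HDFS sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Spooling directory source; files are deleted as soon as they are fully ingested
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /placeholder/spool
a1.sources.r1.deletePolicy = immediate
a1.sources.r1.channels = c1

# Durable file channel (placeholder directories)
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /placeholder/flume/checkpoint
a1.channels.c1.dataDirs = /placeholder/flume/data

# HDFS sink; in-progress files carry the .tmp suffix until they are rolled
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /placeholder/hdfs/events
a1.sinks.k1.hdfs.inUseSuffix = .tmp
a1.sinks.k1.channel = c1
```

With deletePolicy set to immediate, the original files are no longer in the spooling directory after ingestion, so they cannot simply be dropped in again to replay the lost events.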


