File descriptors leak with Wapiti models
It looks like the primary/secondary models are loaded every time a request is made to `/sourceExtractor/extractNewsML`, but they are never closed and are kept in memory. After a while, once the OS runs out of file descriptors, requests start failing with "Too many open files".
Truncated output of `lsof -a -p <pid>`:

```
COMMAND PID   USER     FD   TYPE DEVICE SIZE/OFF NODE     NAME
...
java    29717 semantic 200r REG  8,2    10691781 12598248 /source-extractor/lib/wapiti_models/model_primary_bio
java    29717 semantic 201r REG  8,2    799578   12598252 /source-extractor/lib/wapiti_models/model_secondary_bio
java    29717 semantic 202r REG  8,2    799578   12598252 /source-extractor/lib/wapiti_models/model_secondary_bio
java    29717 semantic 203r REG  8,2    10691781 12598248 /source-extractor/lib/wapiti_models/model_primary_bio
java    29717 semantic 204r REG  8,2    10691781 12598248 /source-extractor/lib/wapiti_models/model_primary_bio
java    29717 semantic 205r REG  8,2    799578   12598252 /source-extractor/lib/wapiti_models/model_secondary_bio
...
```
How to reproduce
1. Run the source extractor with the default configuration.
2. Send a POST request to `/sourceExtractor/extractNewsML`.
3. Inspect open files using `lsof`.
4. Repeat steps 2 and 3.
5. Observe that the list of open files keeps growing.
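If the request handler is opening the model files on each call, one way to stop the leak would be to load each model once and reuse the handle across requests. A minimal sketch of that caching pattern, where `ModelHandle`, `ModelCache`, and the descriptor-count simulation are all hypothetical stand-ins, not the actual source-extractor or Wapiti API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for whatever wraps a Wapiti model file.
// The counter simulates the OS-level file-descriptor count that
// lsof reports; a real wrapper would hold an actual open file.
class ModelHandle implements AutoCloseable {
    static final AtomicInteger openFds = new AtomicInteger();
    final String path;
    ModelHandle(String path) { this.path = path; openFds.incrementAndGet(); }
    @Override public void close() { openFds.decrementAndGet(); }
}

// Loads each model file at most once and hands the same instance to
// every request, instead of opening (and leaking) a descriptor per call.
class ModelCache {
    private final Map<String, ModelHandle> cache = new ConcurrentHashMap<>();
    ModelHandle get(String path) {
        return cache.computeIfAbsent(path, ModelHandle::new);
    }
}

public class FdLeakSketch {
    public static void main(String[] args) {
        ModelCache cache = new ModelCache();
        // Simulate many requests to /sourceExtractor/extractNewsML.
        for (int i = 0; i < 100; i++) {
            cache.get("lib/wapiti_models/model_primary_bio");
            cache.get("lib/wapiti_models/model_secondary_bio");
        }
        // Only two descriptors are ever opened, however many requests arrive.
        System.out.println(ModelHandle.openFds.get()); // prints 2
    }
}
```

With per-request loading, the same loop would leave 200 descriptors open, which matches the duplicated `model_primary_bio`/`model_secondary_bio` entries in the `lsof` output above.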
Please let me know if you need more details.