NLTK resources re-downloaded on every clean_pokemon_text() call #10
Description
In text-cleaner/text_cleaning_pipeline.py line 121: ensure_nltk_resources() calls nltk.download() for 6 resources on every invocation of the cleaning function.
Problem
Although nltk.download() checks whether each resource is already present, the check itself has I/O overhead (stat calls per resource).
Fix
Use a module-level flag:
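A minimal sketch of the flag approach, not the actual code from text_cleaning_pipeline.py. The resource names are placeholders, since the issue does not list the six resources, and the `download` parameter, the `_nltk_ready` flag and the lock are assumptions added so the sketch can be exercised without NLTK installed or network access.

```python
import threading

# Placeholder names: the issue does not say which six resources are used.
_NLTK_RESOURCES = ("punkt", "stopwords", "wordnet")

_nltk_ready = False             # module-level flag, set after the first successful run
_nltk_lock = threading.Lock()   # guards the flag if the cleaner runs in several threads


def ensure_nltk_resources(download=None):
    """Fetch the NLTK resources at most once per process."""
    global _nltk_ready
    if _nltk_ready:             # fast path: no I/O after the first call
        return
    with _nltk_lock:
        if _nltk_ready:         # another thread finished while we waited
            return
        if download is None:
            import nltk         # imported lazily so the module loads without NLTK
            download = nltk.download
        for name in _NLTK_RESOURCES:
            download(name, quiet=True)
        _nltk_ready = True      # only reached if every download call returned
```

With this in place, clean_pokemon_text() can keep calling ensure_nltk_resources() unchanged; every call after the first returns immediately without touching the filesystem.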
Or call it once at import/app startup time.