cate batch

Overview

Batch-process multiple files in a directory for anonymization. Supports glob patterns, parallel workers, resume capability for interrupted jobs, and configurable error handling.

Usage

python -m src.anonymization.cli batch DIRECTORY [OPTIONS]

Options

Option               Description                                                    Default
DIRECTORY            Directory containing files to process (positional, required)  --
-o, --output-dir     Output directory                                               <dir>/_anonymized
-p, --pattern        Glob pattern(s) to match files; repeatable                     *
-s, --strategy       Transformation strategy                                        placeholder
-j, --parallel       Number of parallel workers                                     1
-r, --resume         Resume a previously interrupted batch                          false
--overwrite          Overwrite existing output files                                false
--no-recursive       Don't search subdirectories                                    false
--continue-on-error  Continue if a file fails (default)                             true
--stop-on-error      Stop on first error                                            false
-c, --config         Path to CATE configuration file                                --
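When the batch command is driven from a script, it can help to assemble the argument list programmatically. The following is a minimal sketch, not part of CATE: `build_batch_command` is a hypothetical helper, and it assumes that "repeatable" means `--pattern` is passed once per glob. It uses only the flags listed in the table above.

```python
import shlex


def build_batch_command(directory, *, output_dir=None, patterns=(), strategy=None,
                        parallel=None, resume=False, overwrite=False,
                        recursive=True, stop_on_error=False, config=None):
    """Assemble the argv list for the batch subcommand (hypothetical helper)."""
    cmd = ["python", "-m", "src.anonymization.cli", "batch", str(directory)]
    if output_dir is not None:
        cmd += ["--output-dir", str(output_dir)]
    # Assumption: a repeatable option is given once per pattern.
    for pattern in patterns:
        cmd += ["--pattern", pattern]
    if strategy is not None:
        cmd += ["--strategy", strategy]
    if parallel is not None:
        cmd += ["--parallel", str(parallel)]
    if resume:
        cmd.append("--resume")
    if overwrite:
        cmd.append("--overwrite")
    if not recursive:
        cmd.append("--no-recursive")
    if stop_on_error:
        cmd.append("--stop-on-error")
    if config is not None:
        cmd += ["--config", str(config)]
    return cmd


cmd = build_batch_command("./docs", patterns=["*.txt", "*.md"],
                          parallel=4, stop_on_error=True)
# Print a shell-safe rendering of the command for inspection.
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string keeps glob patterns from being expanded by the shell before the CLI sees them.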

Prerequisites

Examples

Process entire directory

python -m src.anonymization.cli batch ./documents -o ./anonymized

Process only text files, with parallel workers

python -m src.anonymization.cli batch ./docs --pattern "*.txt" --parallel 4
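To preview which files a pattern is likely to select before starting a batch, a rough approximation of glob matching can be sketched with the standard library. This is an illustration under assumptions, not CATE's actual file discovery: `find_matching_files` is a hypothetical function, it matches patterns against file names only, and the real command may treat patterns, hidden files, or symlinks differently.

```python
from fnmatch import fnmatch
from pathlib import Path


def find_matching_files(directory, patterns=("*",), recursive=True):
    """Return sorted files under `directory` whose names match any glob pattern.

    Hypothetical approximation of --pattern and --no-recursive behaviour.
    """
    root = Path(directory)
    # rglob descends into subdirectories; glob stays at the top level.
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        path for path in candidates
        if path.is_file() and any(fnmatch(path.name, pat) for pat in patterns)
    )


if __name__ == "__main__":
    for path in find_matching_files("./docs", ["*.txt"]):
        print(path)
```

Setting `recursive=False` mirrors the intent of `--no-recursive`, and passing several patterns mirrors repeating `--pattern`.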

Resume interrupted batch

python -m src.anonymization.cli batch ./documents --resume