- class batchcreate.BatchCreator(records, max_record_size=1000000, max_batch_size=5000000, max_batch_num_records=500)
BatchCreator takes a list of records as input and splits it into suitably sized batches that can be processed further or passed to other systems. Import the BatchCreator iterator class and instantiate it. The optional parameters below define the limits on the output batches; any parameter that is not specified takes its default value.
Example:
from batchcreate import BatchCreator
batches = BatchCreator(records, max_record_size=60, max_batch_size=200, max_batch_num_records=4)
The BatchCreator object is iterable and produces suitable batches as needed during iteration, so it can be used in a regular ‘for’ loop.
Example:
for batch in batches:
    print(batch)  # batch processing here
OR
Example:
batchItr = iter(batches)
print(next(batchItr))  # batch processing here
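The two styles above differ in how the end of the data is handled: a 'for' loop stops on its own, while a bare next() call raises StopIteration once no batches remain. The sketch below illustrates this with a plain list of lists standing in for a BatchCreator object (an assumption made so the snippet runs without the library); passing a default to next() is standard Python and avoids the exception.

```python
# Stand-in for a BatchCreator object: any iterable of batches behaves the same
# way under iter()/next(). The record names are made up for illustration.
batches = [["r1", "r2"], ["r3"]]

batch_iter = iter(batches)
first = next(batch_iter)    # first batch
second = next(batch_iter)   # second batch

# A bare next(batch_iter) would now raise StopIteration.
# Supplying a default returns it instead once the iterator is exhausted.
leftover = next(batch_iter, None)

print(first, second, leftover)
```

This pattern is useful when batches are pulled one at a time, for example from a worker that asks for the next batch only when it is ready for more work.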
- Attributes:
- records : list
Input list of records to split into batches.
- max_record_size : int, default 1000000 (1 MB)
The maximum size of a single record in an output batch. Any record larger than this is skipped and does not appear in any batch.
- max_batch_size : int, default 5000000 (5 MB)
The maximum size limit for a batch.
- max_batch_num_records : int, default 500
The maximum number of records in a batch. BatchCreator places at most this many records in each batch, and fewer if the batch size limit is reached first.
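To make the interaction of the three limits concrete, here is a minimal sketch of a greedy splitter that follows the rules described above. It is not the library's implementation: the function name split_into_batches is hypothetical, and it assumes that records are strings whose size is their UTF-8 byte length, which the documentation does not state.

```python
def split_into_batches(records, max_record_size=1_000_000,
                       max_batch_size=5_000_000, max_batch_num_records=500):
    """Hypothetical greedy splitter illustrating the three limits."""
    batches, current, current_size = [], [], 0
    for record in records:
        # Assumption: a record is a str and its size is its UTF-8 byte length.
        size = len(record.encode("utf-8"))
        if size > max_record_size:
            continue  # oversized records are skipped entirely
        # Close the current batch if adding this record would exceed
        # either the batch size limit or the record count limit.
        if current and (current_size + size > max_batch_size
                        or len(current) >= max_batch_num_records):
            batches.append(current)
            current, current_size = [], 0
        current.append(record)
        current_size += size
    if current:
        batches.append(current)  # flush the final, possibly partial, batch
    return batches


# Record sizes: 50, 70 (skipped, over 60), 60, 60, 60, 10, 10, 10, 10
records = ["a" * 50, "b" * 70, "c" * 60, "d" * 60, "e" * 60,
           "f" * 10, "g" * 10, "h" * 10, "i" * 10]
result = split_into_batches(records, max_record_size=60,
                            max_batch_size=200, max_batch_num_records=4)
print([len(batch) for batch in result])  # [3, 4, 1]
```

In this run the first batch closes because a fourth 60-byte record would push it past 200 bytes, the second closes because it already holds 4 records, and the 70-byte record never appears in any batch.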
- Methods:
- batches :
Returns the list of all the batches.