Item and Batch Transforms

DataPipes that apply functions to each item or batch during iteration

ItemTransformer

 ItemTransformer (*args, **kwds)

Converts item_tfms into a Pipeline that is applied to every item yielded by source_datapipe

ItemTransformer can be used to add quick augmentations to an existing pipeline by passing simple functions as item_tfms. In the example below, given the inputs 0 through 9, we add one to each element and then multiply the result by 2:

from fastcore.test import test_eq

# Each transform receives a single item from the source.
add_one = lambda o: o+1
multiply_by_two = lambda o: o*2
pipe = ItemTransformer(range(10), [add_one, multiply_by_two])
test_eq(list(pipe),[2, 4, 6, 8, 10, 12, 14, 16, 18, 20])
list(pipe)
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
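
The composition behaviour described above can be sketched in plain Python without any DataPipe machinery. This is only an illustration of the idea, not the ItemTransformer implementation: `compose` and `item_transform` are hypothetical names introduced here, and the sketch assumes the transforms are applied left to right, as the example output suggests.

```python
from functools import reduce
from typing import Any, Callable, Iterable, Iterator, Sequence


def compose(tfms: Sequence[Callable[[Any], Any]]) -> Callable[[Any], Any]:
    """Chain functions left to right: compose([f, g])(x) == g(f(x))."""
    return lambda x: reduce(lambda acc, f: f(acc), tfms, x)


def item_transform(source: Iterable, item_tfms: Sequence[Callable]) -> Iterator:
    """Hypothetical stand-in: lazily apply the composed item_tfms to each item."""
    pipeline = compose(item_tfms)
    for item in source:
        yield pipeline(item)


add_one = lambda o: o + 1
multiply_by_two = lambda o: o * 2

# Same inputs and transforms as the ItemTransformer example.
result = list(item_transform(range(10), [add_one, multiply_by_two]))
# Swapping the order changes the result, since composition is not commutative.
swapped = list(item_transform(range(3), [multiply_by_two, add_one]))
print(result)
print(swapped)
```

Because the transforms are chained in list order, swapping them changes the result, which is worth keeping in mind when stacking several augmentations.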

BatchTransformer

 BatchTransformer (*args, **kwds)

Converts batch_tfms into a Pipeline that is applied to every batch yielded by source_datapipe

BatchTransformer works the same way as ItemTransformer, but the functions it runs are expected to operate on an entire batch of elements rather than on a single item:

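import torchdata.datapipes as dp  # assumption: dp is the torchdata.datapipes alias

# Each transform receives a whole batch (a list of items).
add_one = lambda o: [element+1 for element in o]
multiply_by_two = lambda o: [element*2 for element in o]

pipe = dp.iter.IterableWrapper(range(10))
pipe = pipe.batch(2)  # group items into batches of 2
pipe = BatchTransformer(pipe, [add_one, multiply_by_two])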
test_eq(list(pipe),[[2, 4], [6, 8], [10, 12], [14, 16], [18, 20]])
list(pipe)
[[2, 4], [6, 8], [10, 12], [14, 16], [18, 20]]
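
The difference between item-level and batch-level transforms can likewise be sketched in plain Python. Again, this is a rough illustration under stated assumptions rather than the BatchTransformer implementation: `batched` and `batch_transform` are hypothetical helpers, with `batched` standing in for the `.batch(2)` step above.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence


def batched(source: Iterable, batch_size: int) -> Iterator[List]:
    """Hypothetical stand-in for .batch(): group items into lists of batch_size."""
    it = iter(source)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def batch_transform(source: Iterable[List], batch_tfms: Sequence[Callable]) -> Iterator[List]:
    """Hypothetical stand-in: apply each transform, in order, to every whole batch."""
    for batch in source:
        for tfm in batch_tfms:
            batch = tfm(batch)
        yield batch


# Batch-level transforms take a list and return a list.
add_one = lambda o: [element + 1 for element in o]
multiply_by_two = lambda o: [element * 2 for element in o]

batches = list(batch_transform(batched(range(10), 2), [add_one, multiply_by_two]))
print(batches)
```

The only change from the item-level sketch is what each function receives: a list of items instead of a single item. The final batch may be shorter than `batch_size` when the source length is not a multiple of it.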