Data Flow
tflearn.data_flow.DataFlow (coord, num_threads=8, max_queue=32, shuffle=False, continuous=False, ensure_data_order=False, dprep_dict=None, daug_dict=None)
Base class for real-time data pre-processing and data flow control. Supports pipelining for faster computation.
Arguments
- coord: Coordinator. A TensorFlow coordinator.
- num_threads: int. Total number of simultaneous threads used to process data.
- max_queue: int. Maximum number of data items stored in a queue.
- shuffle: bool. If True, data will be shuffled.
- continuous: bool. If True, the same data is fed again once an epoch is over.
- ensure_data_order: bool. If True, data order is preserved when using 'next' to retrieve data (processing will be slower).
- dprep_dict: dict. Optional data pre-processing parameter for performing real-time data pre-processing. Keys must be placeholders and values DataPreprocessing subclass objects.
- daug_dict: dict. Optional data augmentation parameter for performing real-time data augmentation. Keys must be placeholders and values DataAugmentation subclass objects.
FeedDictFlow
tflearn.data_flow.FeedDictFlow (feed_dict, coord, batch_size=128, num_threads=8, max_queue=32, shuffle=False, continuous=False, ensure_data_order=False, dprep_dict=None, daug_dict=None, index_array=None)
Generate a stream of batches from a dataset. It uses two queues: one for
generating batches of data ids, and the other for loading data and applying
pre-processing. If continuous is True, the data flow never ends until stop
is invoked or coord interrupts the threads.
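The two-queue design described above can be sketched with the Python standard library alone. This is an illustration of the idea, not TFLearn's implementation: `run_two_queue_flow`, `produce_ids`, and `load_batches` are hypothetical names, and the "loading" step is a plain list lookup standing in for data loading and pre-processing.

```python
import queue
import threading


def run_two_queue_flow(data, batch_size, num_threads=2, max_queue=4):
    """Yield batches of `data` using an ids queue and a batch queue.

    Hypothetical stand-in for the two-queue design; not TFLearn code.
    """
    ids_queue = queue.Queue(maxsize=max_queue)    # queue 1: batches of ids
    batch_queue = queue.Queue(maxsize=max_queue)  # queue 2: loaded batches

    def produce_ids():
        # Split the dataset indices into consecutive batches of ids.
        for start in range(0, len(data), batch_size):
            stop = min(start + batch_size, len(data))
            ids_queue.put(list(range(start, stop)))
        # One sentinel per worker so every worker terminates.
        for _ in range(num_threads):
            ids_queue.put(None)

    def load_batches():
        # Turn each batch of ids into a batch of data.
        while True:
            ids = ids_queue.get()
            if ids is None:
                batch_queue.put(None)
                return
            batch_queue.put([data[i] for i in ids])

    threads = [threading.Thread(target=produce_ids, daemon=True)]
    threads += [
        threading.Thread(target=load_batches, daemon=True)
        for _ in range(num_threads)
    ]
    for thread in threads:
        thread.start()

    finished_workers = 0
    while finished_workers < num_threads:
        batch = batch_queue.get()
        if batch is None:
            finished_workers += 1
        else:
            yield batch

    for thread in threads:
        thread.join()
```

With several workers, batches may come out in a different order than their ids were generated, which is the kind of reordering that ensure_data_order is documented to prevent at the cost of speed.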
Arguments
- feed_dict: dict. A TensorFlow formatted feed dict (with placeholders as keys and data as values).
- coord: Coordinator. A TensorFlow coordinator.
- num_threads: int. Total number of simultaneous threads used to process data.
- max_queue: int. Maximum number of data items stored in a queue.
- shuffle: bool. If True, data will be shuffled.
- continuous: bool. If True, the same data is fed again once an epoch is over.
- ensure_data_order: bool. If True, data order is preserved when using 'next' to retrieve data (processing will be slower).
- dprep_dict: dict. Optional data pre-processing parameter for performing real-time data pre-processing. Keys must be placeholders and values DataPreprocessing subclass objects.
- daug_dict: dict. Optional data augmentation parameter for performing real-time data augmentation. Keys must be placeholders and values DataAugmentation subclass objects.
- index_array: list. An optional list of indices to use instead of the indices of the whole dataset (useful for a validation split).
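As an illustration of the validation-split use mentioned for index_array, the sketch below builds two disjoint index lists. The helper `split_indices` and its parameters are hypothetical and not part of TFLearn; the assumption is that each list would be passed as index_array to a separate flow over the same feed_dict.

```python
import random


def split_indices(n_samples, validation_fraction, seed=0):
    """Return (train_indices, validation_indices) as disjoint lists.

    Hypothetical helper for illustration; not part of TFLearn.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_val = round(n_samples * validation_fraction)
    return indices[n_val:], indices[:n_val]


# 100 samples, 10% held out for validation.
train_index, val_index = split_indices(100, 0.1)
```

Both flows would then share the same underlying data while iterating over non-overlapping samples.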
Methods
next (timeout=None)
Get the next feed dict.
Returns
A TensorFlow feed dict, or 'False' if it has no more data.
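The documented return contract (a feed dict, or False once the data is exhausted) suggests a consumption loop like the one sketched here. `ToyFlow` is a hypothetical stand-in that mimics only that contract, so the example runs without TensorFlow; `drain` is likewise an illustrative name.

```python
class ToyFlow:
    """Hypothetical stand-in mimicking only the documented `next` contract."""

    def __init__(self, feed_dicts):
        self._feed_dicts = iter(feed_dicts)

    def next(self, timeout=None):
        # Return the next feed dict, or False when no data is left.
        return next(self._feed_dicts, False)


def drain(flow):
    """Collect feed dicts from `flow` until it reports no more data."""
    collected = []
    while True:
        feed = flow.next()
        if feed is False:  # explicit check: an empty dict is also falsy
            break
        collected.append(feed)
    return collected
```

Comparing against False explicitly, rather than testing truthiness, avoids mistaking an empty feed dict for the end of the data.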
start (reset_status=True)
Arguments
- reset_status: bool. If True, DataStatus will be reset.
Returns
ArrayFlow
tflearn.data_flow.ArrayFlow (X, Y, multi_inputs=False, batch_size=32, shuffle=True)
Convert array samples to tensors and store them in a queue.
Arguments
- X: array. The features data array.
- Y: array. The targets data array.
- multi_inputs: bool. Set to True if X has multiple input sources (i.e. X is a list of arrays).
- batch_size: int. The batch size.
- shuffle: bool. If True, data will be shuffled.
Returns
The X and Y data tensors, or a list(X) and Y data tensors if multi_inputs is True.
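The shape of the return value can be illustrated with a pure-Python sketch that batches plain lists instead of tensors. `toy_array_batches` is a hypothetical function, not TFLearn's ArrayFlow: it only mirrors the documented arguments, and it assumes that features and targets are shuffled together so that each sample keeps its target.

```python
import random


def toy_array_batches(X, Y, multi_inputs=False, batch_size=32,
                      shuffle=True, seed=0):
    """Yield (x_batch, y_batch) pairs from list data.

    Hypothetical illustration; works on Python lists, not tensors.
    """
    sources = X if multi_inputs else [X]
    order = list(range(len(Y)))
    if shuffle:
        # One shared permutation keeps every input aligned with Y.
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        ids = order[start:start + batch_size]
        x_batches = [[source[i] for i in ids] for source in sources]
        y_batch = [Y[i] for i in ids]
        # A list of input batches if multi_inputs, else a single batch.
        yield (x_batches if multi_inputs else x_batches[0]), y_batch
```

With multi_inputs set to True, each yielded x value is a list with one batch per input source, matching the list(X) form described above.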