Data Flow
tflearn.data_flow.DataFlow (coord, num_threads=8, max_queue=32, shuffle=False, continuous=False, ensure_data_order=False, dprep_dict=None, daug_dict=None)
Base class for performing real-time data pre-processing and controlling data flow. Supports pipelining for faster computation.
Arguments
- coord: Coordinator. A TensorFlow coordinator.
- num_threads: int. Total number of simultaneous threads used to process data.
- max_queue: int. Maximum number of data items stored in the queue.
- shuffle: bool. If True, data will be shuffled.
- continuous: bool. If True, the same data will be fed again once an epoch is over.
- ensure_data_order: bool. Ensure that data order is kept when using 'next' to retrieve data (processing will be slower).
- dprep_dict: dict. Optional data pre-processing parameters for performing real-time data pre-processing. Keys must be placeholders and values DataPreprocessing subclass objects (see the sketch after this list).
- daug_dict: dict. Optional data augmentation parameters for performing real-time data augmentation. Keys must be placeholders and values DataAugmentation subclass objects.
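DataFlow is usually driven through its subclasses, but the dprep_dict and daug_dict arguments are easiest to see in a small sketch. The example below is only an illustration, assuming TensorFlow 1.x placeholders and an arbitrary 32x32 RGB input shape; ImagePreprocessing and ImageAugmentation are TFLearn's stock DataPreprocessing and DataAugmentation subclasses.

```python
import tensorflow as tf
import tflearn

# Placeholder that will receive the raw input batches (TF 1.x style).
# The 32x32x3 image shape is an arbitrary choice for this sketch.
X_ph = tf.placeholder(tf.float32, shape=[None, 32, 32, 3])

# A DataPreprocessing subclass: zero-center and normalize batches in real time.
img_prep = tflearn.ImagePreprocessing()
img_prep.add_featurewise_zero_center()
img_prep.add_featurewise_stdnorm()

# A DataAugmentation subclass: random flips and rotations applied on the fly.
img_aug = tflearn.ImageAugmentation()
img_aug.add_random_flip_leftright()
img_aug.add_random_rotation(max_angle=25.)

# As described above: keys are placeholders, values are the
# pre-processing / augmentation objects.
dprep_dict = {X_ph: img_prep}
daug_dict = {X_ph: img_aug}
```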
FeedDictFlow
tflearn.data_flow.FeedDictFlow (feed_dict, coord, batch_size=128, num_threads=8, max_queue=32, shuffle=False, continuous=False, ensure_data_order=False, dprep_dict=None, daug_dict=None, index_array=None)
Generate a stream of batches from a dataset. It uses two queues: one to generate batches of data ids, and the other to load the data and apply pre-processing. If continuous is True, the data flow never ends until stop is invoked or coord interrupts the threads.
Arguments
- feed_dict: dict. A TensorFlow-formatted feed dict (with placeholders as keys and data as values).
- coord: Coordinator. A TensorFlow coordinator.
- num_threads: int. Total number of simultaneous threads used to process data.
- max_queue: int. Maximum number of data items stored in the queue.
- shuffle: bool. If True, data will be shuffled.
- continuous: bool. If True, the same data will be fed again once an epoch is over.
- ensure_data_order: bool. Ensure that data order is kept when using 'next' to retrieve data (processing will be slower).
- dprep_dict: dict. Optional data pre-processing parameters for performing real-time data pre-processing. Keys must be placeholders and values DataPreprocessing subclass objects.
- daug_dict: dict. Optional data augmentation parameters for performing real-time data augmentation. Keys must be placeholders and values DataAugmentation subclass objects.
- index_array: list. An optional list of indexes to use instead of the whole dataset's indexes (useful for a validation split).
Methods
next (timeout=None)
Get the next feed dict.
Returns
A TensorFlow feed dict, or 'False' if there is no more data.
start (reset_status=True)
Arguments
- reset_status: bool. If True, DataStatus will be reset.
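As a usage illustration, the sketch below walks through a typical FeedDictFlow life cycle: build the flow from a feed dict and a coordinator, call start(), pull batches with next() until it returns False, then stop the coordinator. The random toy dataset, its shapes, and the shutdown call are illustrative assumptions (TensorFlow 1.x placeholders), not part of the reference above.

```python
import numpy as np
import tensorflow as tf
from tflearn.data_flow import FeedDictFlow

# Toy in-memory dataset and the placeholders it is fed through (TF 1.x style).
X = np.random.rand(1000, 10).astype(np.float32)
Y = np.random.rand(1000, 1).astype(np.float32)
X_ph = tf.placeholder(tf.float32, shape=[None, 10])
Y_ph = tf.placeholder(tf.float32, shape=[None, 1])

# The coordinator supervises the background loading threads.
coord = tf.train.Coordinator()

# Placeholders as keys, data arrays as values.
flow = FeedDictFlow({X_ph: X, Y_ph: Y}, coord, batch_size=128, shuffle=True)
flow.start()

# Pull batches until the epoch is exhausted ('next' returns False).
while True:
    batch = flow.next()
    if batch is False:
        break
    # 'batch' is a regular feed dict, ready for session.run(..., feed_dict=batch).
    print(batch[X_ph].shape, batch[Y_ph].shape)

# Ask the coordinator to shut the loading threads down.
coord.request_stop()
```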
ArrayFlow
tflearn.data_flow.ArrayFlow (X, Y, multi_inputs=False, batch_size=32, shuffle=True)
Convert array samples to tensors and store them in a queue.
Arguments
- X: array. The features data array.
- Y: array. The targets data array.
- multi_inputs: bool. Set to True if X has multiple input sources (i.e. X is a list of arrays).
- batch_size: int. The batch size.
- shuffle: bool. If True, data will be shuffled.
Returns
The X and Y data tensors, or a list of X data tensors and the Y data tensor if multi_inputs is True.
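A short construction sketch for ArrayFlow, covering both the single-input and the multi-input case. The random arrays stand in for real data; per the Returns section above, the resulting X and Y data tensors (a list of X tensors when multi_inputs is True) can then be wired into a graph in place of feed-dict placeholders.

```python
import numpy as np
from tflearn.data_flow import ArrayFlow

# Toy data: two feature sources (e.g. two input branches) plus targets.
X1 = np.random.rand(1000, 10).astype(np.float32)
X2 = np.random.rand(1000, 4).astype(np.float32)
Y = np.random.rand(1000, 1).astype(np.float32)

# Single input source: X is one array. Per the 'Returns' section above, the
# result provides the queued X and Y data tensors.
single_data = ArrayFlow(X1, Y, batch_size=32, shuffle=True)

# Multiple input sources: X is a list of arrays and multi_inputs must be True,
# yielding a list of X tensors alongside the Y tensor.
multi_data = ArrayFlow([X1, X2], Y, multi_inputs=True, batch_size=32, shuffle=True)
```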