Data Flow

tflearn.data_flow.DataFlow (coord, num_threads=8, max_queue=32, shuffle=False, continuous=False, ensure_data_order=False, dprep_dict=None, daug_dict=None)

Base class for real-time pre-processing and data flow control. Supports pipelining for faster computation.

Arguments

  • coord: Coordinator. A TensorFlow coordinator.
  • num_threads: int. Total number of simultaneous threads to process data.
  • max_queue: int. Maximum number of data items stored in a queue.
  • shuffle: bool. If True, data will be shuffled.
  • continuous: bool. If True, when an epoch is over, the same data will be fed again.
  • ensure_data_order: bool. Ensure that data order is kept when using 'next' to retrieve data (processing will be slower).
  • dprep_dict: dict. Optional data pre-processing parameter for performing real-time data pre-processing. Keys must be placeholders and values must be DataPreprocessing subclass objects.
  • daug_dict: dict. Optional data augmentation parameter for performing real-time data augmentation. Keys must be placeholders and values must be DataAugmentation subclass objects.

FeedDictFlow

tflearn.data_flow.FeedDictFlow (feed_dict, coord, batch_size=128, num_threads=8, max_queue=32, shuffle=False, continuous=False, ensure_data_order=False, dprep_dict=None, daug_dict=None, index_array=None)

Generate a stream of batches from a dataset. It uses two queues: one for generating batches of data ids, and the other for loading data and applying pre-processing. If continuous is True, the data flow never ends until stop is invoked or the coordinator interrupts the threads.
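
The two-queue design described above can be illustrated with a minimal, standard-library sketch. This is a conceptual model only, not TFLearn's implementation: the function name run_pipeline, the stop markers, and the "multiply by 10" pre-processing step are all assumptions made for illustration.

```python
import queue
import threading


def run_pipeline(data, batch_size=2, num_workers=2, max_queue=4):
    """Conceptual two-queue batch pipeline (illustrative, not TFLearn code)."""
    id_queue = queue.Queue(maxsize=max_queue)     # queue 1: batches of data ids
    batch_queue = queue.Queue(maxsize=max_queue)  # queue 2: loaded, processed batches

    def id_producer():
        # Emit one list of sample ids per batch.
        for start in range(0, len(data), batch_size):
            stop = min(start + batch_size, len(data))
            id_queue.put(list(range(start, stop)))
        # One stop marker per worker so every worker terminates.
        for _ in range(num_workers):
            id_queue.put(None)

    def worker():
        while True:
            ids = id_queue.get()
            if ids is None:
                batch_queue.put(None)  # tell the consumer this worker is done
                return
            # "Load" the samples and apply a stand-in pre-processing step.
            batch_queue.put([data[i] * 10 for i in ids])

    threads = [threading.Thread(target=id_producer)]
    threads += [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    batches, finished = [], 0
    while finished < num_workers:
        item = batch_queue.get()
        if item is None:
            finished += 1
        else:
            batches.append(item)
    for t in threads:
        t.join()
    return batches


batches = run_pipeline([1, 2, 3, 4, 5])
```

With several workers, batches may come out of the second queue in a different order than their ids went into the first, which is the kind of reordering that ensure_data_order is documented to prevent at some cost in speed.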

Arguments

  • feed_dict: dict. A TensorFlow formatted feed dict (with placeholders as keys and data as values).
  • coord: Coordinator. A TensorFlow coordinator.
  • num_threads: int. Total number of simultaneous threads to process data.
  • max_queue: int. Maximum number of data items stored in a queue.
  • shuffle: bool. If True, data will be shuffled.
  • continuous: bool. If True, when an epoch is over, the same data will be fed again.
  • ensure_data_order: bool. Ensure that data order is kept when using 'next' to retrieve data (processing will be slower).
  • dprep_dict: dict. Optional data pre-processing parameter for performing real-time data pre-processing. Keys must be placeholders and values must be DataPreprocessing subclass objects.
  • daug_dict: dict. Optional data augmentation parameter for performing real-time data augmentation. Keys must be placeholders and values must be DataAugmentation subclass objects.
  • index_array: list. An optional list of indices to use instead of the indices of the whole dataset (useful for a validation split).
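
As an illustration of the validation-split use of index_array, the sketch below builds two disjoint index lists in plain Python. The helper split_indices and its parameters are hypothetical and not part of TFLearn; the assumption is that each list would then be given as index_array to a separate flow built over the same feed_dict.

```python
import random


def split_indices(n_samples, val_fraction=0.1, seed=0):
    """Return (train_ids, val_ids), two disjoint lists of sample indices."""
    ids = list(range(n_samples))
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_val = int(n_samples * val_fraction)
    return ids[n_val:], ids[:n_val]


train_ids, val_ids = split_indices(100, val_fraction=0.1)
# Assumption: train_ids and val_ids would each be passed as index_array
# to its own FeedDictFlow over the same feed_dict.
```

Keeping the split at the index level means the underlying arrays are never copied; only the set of positions each flow is allowed to draw from differs.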

Methods

next (timeout=None)

Get the next feed dict.

Returns

A TensorFlow feed dict, or 'False' if it has no more data.
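
A consumer loop built on this contract might look like the sketch below. StubFlow is a hypothetical stand-in that only mimics the documented behaviour of next (a feed dict, or False when the data is exhausted); it is not a TFLearn class, and the string keys stand in for placeholders.

```python
class StubFlow:
    """Hypothetical stand-in that mimics the documented next() contract."""

    def __init__(self, batches):
        self._batches = list(batches)

    def next(self, timeout=None):
        # Return the next feed dict, or False once the data is exhausted.
        if not self._batches:
            return False
        return self._batches.pop(0)


def consume(flow):
    """Collect feed dicts until next() signals the end of the data."""
    seen = []
    while True:
        feed = flow.next()
        if feed is False:  # explicit identity check for the end marker
            break
        seen.append(feed)
    return seen


feeds = consume(StubFlow([{"x": [1, 2]}, {"x": [3]}]))
```

Testing with "is False" rather than plain truthiness avoids stopping early on a value that merely happens to be falsy, such as an empty dict.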

start (reset_status=True)

Arguments

  • reset_status: bool. If True, DataStatus will be reset.

Returns

ArrayFlow

tflearn.data_flow.ArrayFlow (X, Y, multi_inputs=False, batch_size=32, shuffle=True)

Convert array samples to tensors and store them in a queue.

Arguments

  • X: array. The features data array.
  • Y: array. The targets data array.
  • multi_inputs: bool. Set to True if X has multiple input sources (i.e. X is a list of arrays).
  • batch_size: int. The batch size.
  • shuffle: bool. If True, data will be shuffled.

Returns

The X and Y data tensors, or a list of X data tensors and the Y data tensor if multi_inputs is True.
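
The effect of multi_inputs on the shape of the returned X can be sketched in plain Python. This is only a conceptual model using lists: ArrayFlow itself works with tensors and a queue, and the generator iter_batches and its seed parameter are assumptions introduced for illustration.

```python
import random


def iter_batches(X, Y, multi_inputs=False, batch_size=32, shuffle=True, seed=0):
    """Yield (x_batch, y_batch); x_batch is a list of batches if multi_inputs."""
    order = list(range(len(Y)))
    if shuffle:
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        ids = order[start:start + batch_size]
        y_batch = [Y[i] for i in ids]
        if multi_inputs:
            # X is a list of input sources: slice each one with the same ids.
            x_batch = [[source[i] for i in ids] for source in X]
        else:
            x_batch = [X[i] for i in ids]
        yield x_batch, y_batch


X_a, X_b, Y = [1, 2, 3, 4], [10, 20, 30, 40], [0, 1, 0, 1]
pairs = list(iter_batches([X_a, X_b], Y, multi_inputs=True,
                          batch_size=2, shuffle=False))
```

Every input source is sliced with the same ids, so the samples in each X batch stay aligned with their targets in the Y batch.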