I am on tf_nightly-2.7.0 and used TensorFlow's “make_csv_dataset” to build a dataset from a TSV file, but the resulting TensorFlow PrefetchDataset doesn't seem to carry shape information. I could have used a Pandas DataFrame, but I would like to try TensorFlow's dataset API. Here is the code, without the imports:
!wget https://cdn.freecodecamp.org/project-data/sms/train-data.tsv

train_file_path = "train-data.tsv"
train_data = tf.data.experimental.make_csv_dataset(
    train_file_path,
    header=False,
    field_delim='\t',
    column_names=['label', 'text'],
    batch_size=5,
    label_name='label',
    num_epochs=1,
    ignore_errors=True)

examples, labels = next(iter(train_data))  # Just the first batch.
print("FEATURES: \n", examples, "\n")
print("LABELS: \n", labels)

encoder = keras.layers.TextVectorization(
    max_tokens=None,
    output_mode='int',
    output_sequence_length=160)
encoder.adapt(train_data)
Here is how the dataset looks in the print output:
FEATURES:
 OrderedDict([('text', <tf.Tensor: shape=(5,), dtype=string, numpy=
array([b'rt-king pro video club>> need help? info@ringtoneking.co.uk or call 08701237397 you must be 16+ club credits redeemable at www.ringtoneking.co.uk! enjoy!',
       b'good afternoon sunshine! how dawns that day ? are we refreshed and happy to be alive? do we breathe in the air and smile ? i think of you, my love ... as always',
       b'they have a thread on the wishlist section of the forums where ppl post nitro requests. start from the last page and collect from the bottom up.',
       b'no current and food here. i am alone also',
       b'die... i accidentally deleted e msg i suppose 2 put in e sim archive. haiz... i so sad...'],
      dtype=object)>)])

LABELS:
 tf.Tensor([b'spam' b'ham' b'ham' b'ham' b'ham'], shape=(5,), dtype=string)
Here is the error raised by the line encoder.adapt(train_data):
AttributeError: 'NoneType' object has no attribute 'ndims'
The desired outcome is that encoder.adapt runs without an error, even if that requires manipulating the TensorFlow dataset first.
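One possible direction, offered as an assumption rather than a confirmed diagnosis: the dataset yields (features dict, labels) pairs, while adapt may expect elements that are plain string tensors. The sketch below uses a small in-memory dataset with made-up rows as a stand-in for the TSV file, maps each element down to its 'text' column, and adapts on that. The names texts, labels and text_only are illustrative only.

```python
import tensorflow as tf

# Hypothetical rows standing in for train-data.tsv. The element structure
# mimics make_csv_dataset with label_name set: (dict of features, labels).
texts = ["free prize call now", "see you at lunch", "are you home yet"]
labels = ["spam", "ham", "ham"]
train_data = tf.data.Dataset.from_tensor_slices(({"text": texts}, labels)).batch(2)

# Keep only the 'text' column so each element is a single string tensor
# instead of a (features, labels) tuple.
text_only = train_data.map(lambda features, label: features["text"])

encoder = tf.keras.layers.TextVectorization(
    max_tokens=None, output_mode="int", output_sequence_length=160)
encoder.adapt(text_only)

# Each input string becomes a row of 160 integer token ids (zero-padded).
vectorized = encoder(tf.constant(["free lunch"]))
print(vectorized.shape)  # (1, 160)
```

If this is indeed the cause, the same map call should apply unchanged to the dataset returned by make_csv_dataset, since the feature key there is also 'text'.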
Thank you for the help in advance!
submitted by /u/na_haran