Blender 3.0 Release Accelerated by NVIDIA RTX GPUs, Adds USD Support for Omniverse

‘Tis the season for all content creators, especially 3D artists, this month on NVIDIA Studio. Blender, the world’s most popular open-source 3D creative application, launched its highly anticipated 3.0 release, delivering extraordinary performance gains powered by NVIDIA RTX GPUs and adding Universal Scene Description (USD) support for NVIDIA Omniverse.

Useful data summary statistics with image classification

Hello!

I am doing image classification with TensorFlow for learning purposes. I am splitting the data into 5 folds and would like to compute useful summary statistics on the validation sets. What would be useful beyond the shape of each validation set?
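
For example, per-fold class counts and pixel statistics could be computed with something like the rough sketch below (the data here is a random stand-in; all names are illustrative):

import numpy as np
from sklearn.model_selection import StratifiedKFold

# Stand-in data: N images of shape (H, W, C) with integer class labels.
X = np.random.rand(100, 32, 32, 3).astype("float32")
y = np.random.randint(0, 5, size=100)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    X_val, y_val = X[val_idx], y[val_idx]
    classes, counts = np.unique(y_val, return_counts=True)
    print(f"fold {fold}: shape={X_val.shape}, "
          f"class counts={dict(zip(classes.tolist(), counts.tolist()))}, "
          f"pixel mean={X_val.mean():.3f}, pixel std={X_val.std():.3f}")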

submitted by /u/The_Poor_Jew

Advent of Code 2021 in pure TensorFlow – day 2. The limitations of Python enums and type annotations in TensorFlow programs

submitted by /u/pgaleone

Exit code 0xC0000409 when trying to run through a TensorFlow example in PyCharm

I am trying to work through the DCGAN example on the TensorFlow website: https://www.tensorflow.org/tutorials/generative/dcgan. It seems to run fine up until the step that uses the generator, generated_image = generator(noise, training=False). At that point it exits with Process finished with exit code -1073740791 (0xC0000409).

I am running on Windows 10 using PyCharm. I have tried messing with the batch size in case this is a memory issue, but even setting it to 1 gives the same result. I have also tried running PyCharm as administrator.
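
On Windows, 0xC0000409 signals a native crash rather than a Python exception, so one isolation step is to force the forward pass onto the CPU and see whether only the GPU path dies. A rough sketch with an abbreviated stand-in generator (the tutorial's full model is not needed to reproduce the call):

import tensorflow as tf
from tensorflow.keras import layers

# Abbreviated stand-in for the tutorial's generator; the point is only to
# test whether a Conv2DTranspose forward pass crashes on this machine.
generator = tf.keras.Sequential([
    layers.Dense(7 * 7 * 64, use_bias=False, input_shape=(100,)),
    layers.Reshape((7, 7, 64)),
    layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding="same", use_bias=False),
])

noise = tf.random.normal([1, 100])
with tf.device("/CPU:0"):  # force CPU to rule out a GPU/driver fault
    print("CPU forward pass OK:", generator(noise, training=False).shape)
print("Default-device pass OK:", generator(noise, training=False).shape)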

submitted by /u/skywo1f

Using AMD Radeon with TF in Anaconda Spyder

Hello,

I understand that TensorFlow is geared toward NVIDIA’s proprietary CUDA, but is there a workaround for an AMD Radeon GPU? I’m on a MacBook Pro with an AMD Radeon 580 external GPU.
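
For what it’s worth, Apple’s tensorflow-metal PluggableDevice plugin is one non-CUDA route; per Apple’s documentation it also covers AMD GPUs on Intel Macs. A quick check after installing it (a sketch, assuming the plugin installs cleanly in the Anaconda environment that Spyder uses):

# In the Anaconda environment used by Spyder:
#   pip install tensorflow-macos tensorflow-metal
import tensorflow as tf

# If the Metal plugin registered, the Radeon eGPU should be listed here.
print(tf.config.list_physical_devices("GPU"))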

submitted by /u/ZThrock

Advent of Code 2021 in pure TensorFlow – day 1

submitted by /u/pgaleone

TensorFlow Lite Segmentation Fault

I am running TensorFlow Lite on my Raspberry Pi 3B+ with a custom object detection model. I have tested the setup with a Google COCO model and it works wonderfully, but when I run my custom-trained model it does not work, despite the model passing TFLite Model Maker evaluation. The only error message I get is “Segmentation fault”. How can I fix this?

I am not able to upload my model to Stack Overflow, but here is some info about it: it detects only one object class, it is not quantized, it is based on the efficientdet_lite1 model, and I trained it using the official TensorFlow Lite Model Maker Google Colab.

Here is the code used to interpret the model on my Pi.

https://pastebin.com/1at3ZAJd

I added a few print statements as well to troubleshoot, and it stops executing at around line 115.

Does anyone know how to fix this?
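
One check that might narrow it down: a shape or dtype mismatch between the preprocessed frame and what the custom model expects can crash natively with nothing but “Segmentation fault”, so it is worth printing the model’s expected I/O and invoking it once with a dummy input. A rough sketch (the model path is a placeholder):

import numpy as np
import tflite_runtime.interpreter as tflite  # or tf.lite.Interpreter from full TensorFlow

interpreter = tflite.Interpreter(model_path="detect.tflite")  # placeholder path
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
print("expected input shape:", inp["shape"], "dtype:", inp["dtype"])

# A dummy frame with exactly the expected shape/dtype; if this invoke()
# also segfaults, the problem is the model/runtime, not the preprocessing.
frame = np.zeros(inp["shape"], dtype=inp["dtype"])
interpreter.set_tensor(inp["index"], frame)
interpreter.invoke()
for out in interpreter.get_output_details():
    print(out["name"], interpreter.get_tensor(out["index"]).shape)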

submitted by /u/MattDlr4

Why am I not able to train 1,859 75×75 RGB images on my M1 Pro 10-core CPU, 16-core GPU 16-core Neural Engine with 16 GB RAM?

I’ve followed the steps provided by Apple (https://developer.apple.com/metal/tensorflow-plugin/) to install the TensorFlow deep learning development environment (based on the tensorflow-metal plugin) on my MacBook Pro. My model employs VGG19 through transfer learning, as its summary below shows. When I train this model on 1,859 75×75 RGB images, I get the error tensorflow.python.framework.errors_impl.InternalError: Failed copying input tensor from /job:localhost/replica:0/task:0/device:CPU:0 to /job:localhost/replica:0/task:0/device:GPU:0 in order to run _EagerConst: Dst tensor is not initialized. Isn’t this an easy task for such a powerful SoC as the M1 Pro (10-core CPU, 16-core GPU, 16-core Neural Engine, 16 GB RAM)? What is the issue here? Is this a bug, or do I need to do some configuration to overcome this situation?

Here is the stack trace:

Metal device set to: Apple M1 Pro

systemMemory: 16.00 GB
maxCacheSize: 5.33 GB

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
vgg19 (Functional)           (None, 512)               20024384
_________________________________________________________________
flatten (Flatten)            (None, 512)               0
_________________________________________________________________
dense (Dense)                (None, 1859)              953667
=================================================================
Total params: 20,978,051
Trainable params: 953,667
Non-trainable params: 20,024,384
_________________________________________________________________

Traceback (most recent call last):
  File "/Users/talhakabakus/PycharmProjects/keras-matlab-comp-metal/run.py", line 528, in <module>
    run_all_stanford_dogs()
  File "/Users/talhakabakus/PycharmProjects/keras-matlab-comp-metal/run.py", line 481, in run_all_stanford_dogs
    H = model.fit(X_train, y_train_cat, validation_split=n_val_split, epochs=n_epochs,
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/keras/engine/training.py", line 1134, in fit
    data_handler = data_adapter.get_data_handler(
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/keras/engine/data_adapter.py", line 1383, in get_data_handler
    return DataHandler(*args, **kwargs)
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/keras/engine/data_adapter.py", line 1138, in __init__
    self._adapter = adapter_cls(
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/keras/engine/data_adapter.py", line 230, in __init__
    x, y, sample_weights = _process_tensorlike((x, y, sample_weights))
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/keras/engine/data_adapter.py", line 1031, in _process_tensorlike
    inputs = tf.nest.map_structure(_convert_numpy_and_scipy, inputs)
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/tensorflow/python/util/nest.py", line 869, in map_structure
    structure[0], [func(*x) for x in entries],
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/tensorflow/python/util/nest.py", line 869, in <listcomp>
    structure[0], [func(*x) for x in entries],
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/keras/engine/data_adapter.py", line 1026, in _convert_numpy_and_scipy
    return tf.convert_to_tensor(x, dtype=dtype)
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 1430, in convert_to_tensor_v2_with_dispatch
    return convert_to_tensor_v2(
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 1436, in convert_to_tensor_v2
    return convert_to_tensor(
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/tensorflow/python/profiler/trace.py", line 163, in wrapped
    return func(*args, **kwargs)
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 1566, in convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/tensorflow/python/framework/tensor_conversion_registry.py", line 52, in _default_conversion_function
    return constant_op.constant(value, dtype, name=name)
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/tensorflow/python/framework/constant_op.py", line 271, in constant
    return _constant_impl(value, dtype, shape, name, verify_shape=False,
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/tensorflow/python/framework/constant_op.py", line 283, in _constant_impl
    return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/tensorflow/python/framework/constant_op.py", line 308, in _constant_eager_impl
    t = convert_to_eager_tensor(value, ctx, dtype)
  File "/Users/talhakabakus/miniforge3/envs/keras-matlab-comp-metal/lib/python3.9/site-packages/tensorflow/python/framework/constant_op.py", line 106, in convert_to_eager_tensor
    return ops.EagerTensor(value, ctx.device_name, dtype)
tensorflow.python.framework.errors_impl.InternalError: Failed copying input tensor from /job:localhost/replica:0/task:0/device:CPU:0 to /job:localhost/replica:0/task:0/device:GPU:0 in order to run _EagerConst: Dst tensor is not initialized.

Process finished with exit code 1
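
A workaround that is often suggested for this tensorflow-metal error is to stream the data with tf.data rather than handing the full NumPy arrays to model.fit, so only one batch at a time is copied to the GPU. A rough sketch with stand-in data shaped like the set above (names are illustrative):

import numpy as np
import tensorflow as tf

# Stand-in data shaped like the real set: 1,859 RGB images of 75x75.
X_train = np.random.rand(1859, 75, 75, 3).astype("float32")
y_train_cat = tf.keras.utils.to_categorical(np.random.randint(0, 1859, 1859), 1859)

# Stream batches so TensorFlow copies one batch at a time to the Metal GPU
# instead of materializing the whole training set as one device tensor.
n_val = int(0.1 * len(X_train))
train_ds = (tf.data.Dataset.from_tensor_slices((X_train[n_val:], y_train_cat[n_val:]))
            .shuffle(1024)
            .batch(32)
            .prefetch(tf.data.AUTOTUNE))
val_ds = tf.data.Dataset.from_tensor_slices((X_train[:n_val], y_train_cat[:n_val])).batch(32)

# Then, instead of model.fit(X_train, y_train_cat, validation_split=...):
# H = model.fit(train_ds, validation_data=val_ds, epochs=n_epochs)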

submitted by /u/talhak

The half_pixel_centers keyword seems to not exist in tf.image.resize in TF 2.0 or TF 2.7

With the robotics setup we use OpenCV for the images; however, in the CNN I use tf.io.decode_jpeg to open them. These two methods slightly alter the image, such that the same image opened with the two different methods can’t be classified consistently by the CNN.

I found the differences in this blog: https://towardsdatascience.com/image-read-and-resize-with-opencv-tensorflow-and-pil-3e0f29b992be

which states that two things need to be changed to ensure that the two files are the same:

  1. dct_method='INTEGER_ACCURATE' needs to be added to the decode
  2. half_pixel_centers=True needs to be added to the resize method, which must also be forced to bilinear.

However, the half_pixel_centers keyword is not found.

https://stackoverflow.com/questions/50591669/tf-image-resize-bilinear-vs-cv2-resize

This Stack Overflow answer states that it was added in TF 2.0, with a link to the GitHub commit showing it has indeed been added: https://github.com/tensorflow/tensorflow/commit/3ae2c6691b7c6e0986d97b150c9283e5cc52c15f

About my code: I map the dataset to a function that reads the file path:

img = tf.io.read_file(file_path)

img = tf.io.decode_jpeg(img, channels=3, dct_method='INTEGER_ACCURATE')

resized_img = tf.image.resize(img, (28, 28), method=tf.image.ResizeMethod.BILINEAR, preserve_aspect_ratio=False, antialias=False, name=None, half_pixel_centers=True)

I also tried it on another machine with TF 2.7, and it gives the same error. Could someone point out what I am doing wrong, or is there perhaps a better way in general?
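
Worth noting: in TF 2.x the half_pixel_centers flag appears only on the v1 ops, e.g. tf.compat.v1.image.resize_bilinear, while the v2 tf.image.resize is documented to use half-pixel centers by default. A rough sketch of the v1 variant (file_path is a placeholder, as in the mapping function above):

import tensorflow as tf

file_path = "image.jpg"  # placeholder
img = tf.io.read_file(file_path)
img = tf.io.decode_jpeg(img, channels=3, dct_method='INTEGER_ACCURATE')
img = tf.expand_dims(img, 0)  # resize_bilinear expects a [batch, h, w, c] tensor
resized_img = tf.compat.v1.image.resize_bilinear(img, (28, 28), half_pixel_centers=True)
resized_img = tf.squeeze(resized_img, 0)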

submitted by /u/Calond

A Fast WordPiece Tokenization System

Tokenization is a fundamental pre-processing step for most natural language processing (NLP) applications. It involves splitting text into smaller units called tokens (e.g., words or word segments) in order to turn an unstructured input string into a sequence of discrete elements that is suitable for a machine learning (ML) model. In deep learning–based models (e.g., BERT), each token is mapped to an embedding vector to be fed into the model.

Tokenization in a typical deep learning model, like BERT.

A fundamental tokenization approach is to break text into words. However, using this approach, words that are not included in the vocabulary are treated as “unknown”. Modern NLP models address this issue by tokenizing text into subword units, which often retain linguistic meaning (e.g., morphemes). So, even though a word may be unknown to the model, individual subword tokens may retain enough information for the model to infer the meaning to some extent. One such subword tokenization technique that is commonly used and can be applied to many other NLP models is called WordPiece. Given text, WordPiece first pre-tokenizes the text into words (by splitting on punctuation and whitespace) and then tokenizes each word into subword units, called wordpieces.

The WordPiece tokenization process with an example sentence.

In “Fast WordPiece Tokenization”, presented at EMNLP 2021, we developed an improved end-to-end WordPiece tokenization system that speeds up the tokenization process, reducing the overall model latency and saving computing resources. In comparison to traditional algorithms that have been used for decades, this approach reduces the complexity of the computation by an order of magnitude, resulting in significantly improved performance, up to 8x faster than standard approaches. The system has been applied successfully in a number of systems at Google and has been publicly released in TensorFlow Text.
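
The released implementation can be tried directly. A hedged usage sketch (the exact constructor arguments may differ across tensorflow-text versions), using the toy vocabulary from the examples below:

import tensorflow as tf
import tensorflow_text as tf_text  # pip install tensorflow-text

# Toy vocabulary matching the examples in this post; "[UNK]" covers unknown words.
vocab = ["a", "abcd", "##b", "##bc", "##z", "[UNK]"]
tokenizer = tf_text.FastWordpieceTokenizer(vocab=vocab, token_out_type=tf.string)

print(tokenizer.tokenize(["abcz"]))  # expected: [[b'a', b'##bc', b'##z']]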

Single-Word WordPiece Tokenization
WordPiece uses a greedy longest-match-first strategy to tokenize a single word — i.e., it iteratively picks the longest prefix of the remaining text that matches a word in the model’s vocabulary. This approach is known as maximum matching or MaxMatch, and has also been used for Chinese word segmentation since the 1980s. Yet despite its wide use in NLP for decades, it is still relatively computationally intensive, with the commonly adopted MaxMatch approaches’ computation being quadratic with respect to the input word length (n). This is because two pointers are needed to scan over the input: one to mark a start position, and the other to search for the longest substring matching a vocabulary token at that position.
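
For concreteness, here is a plain-Python sketch of this classical quadratic MaxMatch loop (simplified: no maximum word length or other special-case handling):

def max_match(word, vocab, suffix_indicator="##"):
    # Greedy longest-match-first: for each start position, scan backward
    # from the end of the word for the longest matching vocabulary piece.
    tokens, start = [], 0
    while start < len(word):
        end, cur = len(word), None
        while end > start:  # inner scan makes the whole loop O(n^2)
            piece = word[start:end]
            if start > 0:
                piece = suffix_indicator + piece
            if piece in vocab:
                cur = piece
                break
            end -= 1
        if cur is None:
            return ["[UNK]"]  # no wordpiece matches: the whole word is unknown
        tokens.append(cur)
        start = end
    return tokens

print(max_match("abcz", {"a", "abcd", "##b", "##bc", "##z"}))  # ['a', '##bc', '##z']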

We propose an alternative to the MaxMatch algorithm for WordPiece tokenization, called LinMaxMatch, which has a tokenization time that is strictly linear with respect to n. First, we organize the vocabulary tokens in a trie (also called a prefix tree), where each trie edge is labeled by a character, and a tree path from the root to some node represents a prefix of some token in the vocabulary. In the figure below, nodes are depicted as circles and tree edges are black solid arrows. Given a trie, a vocabulary token can be located to match an input text by traversing from the root and following the trie edges to match the input character by character; this process is referred to as trie matching.

The figure below shows the trie created from the vocabulary consisting of “a”, “abcd”, “##b”, “##bc”, and “##z”. An input text “abcd” can be matched to a vocabulary token by walking from the root (upper left) and following the trie edges with labels “a”, “b”, “c”, “d” one by one. (The leading “##” symbols are special characters used in WordPiece tokenization that are described in more detail below.)

Trie diagram of the vocabulary [“a”, “abcd”, “##b”, “##bc”, “##z”]. Circles and arrows represent nodes and edges along the trie, respectively.
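
A minimal sketch of such a trie in Python (illustrative only):

class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode
        self.token = None   # vocabulary token ending at this node, if any

def build_trie(vocab):
    # One path per vocabulary token, one edge per character
    # (including the leading "##" of suffix tokens).
    root = TrieNode()
    for token in vocab:
        node = root
        for ch in token:
            node = node.children.setdefault(ch, TrieNode())
        node.token = token
    return root

root = build_trie(["a", "abcd", "##b", "##bc", "##z"])
node = root
for ch in "abcd":  # trie matching: follow edges character by character
    node = node.children[ch]
print(node.token)  # abcd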

Second, inspired by the Aho-Corasick algorithm, a classical string-searching algorithm invented in 1975, we introduce a method that breaks out of a trie branch that fails to match the given input and skips directly to an alternative branch to continue matching. As in standard trie matching, during tokenization, we follow the trie edges to match the input characters one by one. When trie matching cannot match an input character for a given node, a standard algorithm would backtrack to the last character where a token was matched and then restart the trie matching procedure from there, which results in repetitive and wasteful iterations. Instead of backtracking, our method triggers a failure transition, which is done in two steps: (1) it collects the precomputed tokens stored at that node, which we call failure pops; and (2) it then follows the precomputed failure link to a new node from which the trie matching process continues.

For example, given a model with the vocabulary described above (“a”, “abcd”, “##b”, “##bc”, and “##z”), WordPiece tokenization distinguishes subword tokens matching at the start of the input word from the subword tokens starting in the middle (the latter being marked with two leading hashes “##”). Hence, for input text “abcz”, the expected tokenization output is [“a”, “##bc”, “##z”], where “a” matches at the beginning of the input while “##bc” and “##z” match in the middle. For this example, the figure below shows that, after successfully matching three characters ‘a’, ‘b’, ‘c’, trie matching cannot match the next character ‘z’ because “abcz” is not in the vocabulary. In this situation, LinMaxMatch conducts a failure transition by outputting the first recognized token (using the failure pop token “a”) and following the failure link to a new node to continue the matching process (in this case, the node with “##bc” as its failure pop token). The process then repeats from the new node.

Trie structure for the same vocabulary as shown in the example above, now illustrating the approach taken by our new Fast WordPiece Tokenizer algorithm. Failure pops are bracketed and shown in purple. Failure links between nodes are indicated with dashed red line arrows.

Since at least n operations are required to read the entire input, the LinMaxMatch algorithm is asymptotically optimal for the MaxMatch problem.
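
To make the failure transitions concrete, below is a toy matcher for the same vocabulary with the failure pops and failure links precomputed by hand (the node numbering is ours and the paper gives the general precomputation algorithm; this is a sketch, not the released implementation):

# Trie for ["a", "abcd", "##b", "##bc", "##z"]: node 0 is the root and
# node 6 is the suffix root reached via "##". fail_pops[v] are the tokens
# emitted on a failure at node v; fail_link[v] is where matching resumes.
children = {
    0: {"a": 1, "#": 5},
    1: {"b": 2}, 2: {"c": 3}, 3: {"d": 4}, 4: {},
    5: {"#": 6},
    6: {"b": 7, "z": 9},
    7: {"c": 8}, 8: {}, 9: {},
}
fail_pops = {1: ["a"], 2: ["a"], 3: ["a"], 4: ["abcd"],
             7: ["##b"], 8: ["##bc"], 9: ["##z"]}
fail_link = {1: 6, 2: 7, 3: 8, 4: 6, 7: 6, 8: 6, 9: 6}

def lin_max_match(word):
    tokens, node, i = [], 0, 0
    while i < len(word):
        if word[i] in children[node]:  # ordinary trie step: consume one character
            node = children[node][word[i]]
            i += 1
        elif node in fail_link:        # failure transition: emit pops and jump, no backtracking
            tokens.extend(fail_pops[node])
            node = fail_link[node]
        else:                          # dead end with no failure link
            return ["[UNK]"]
    while node not in (0, 6):          # end of input: flush the remaining tokens
        if node not in fail_link:
            return ["[UNK]"]
        tokens.extend(fail_pops[node])
        node = fail_link[node]
    return tokens

print(lin_max_match("abcz"))  # ['a', '##bc', '##z']

In this toy example, each character is consumed exactly once and each failure transition emits a token, so matching runs in time linear in the word length.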

End-to-End WordPiece Tokenization
Whereas existing systems pre-tokenize the input text (splitting it into words at punctuation and whitespace characters) and then call WordPiece tokenization on each resulting word, we propose an end-to-end WordPiece tokenizer that combines pre-tokenization and WordPiece into a single, linear-time pass. It uses the LinMaxMatch trie matching and failure transitions as much as possible and only checks for punctuation and whitespace characters among the relatively few input characters that are not handled by the loop. It is more efficient because it traverses the input only once, performs fewer punctuation and whitespace checks, and skips the creation of intermediate words.

End-to-End WordPiece Tokenization.

Benchmark Results
We benchmark our method against two widely adopted WordPiece tokenization implementations: HuggingFace Tokenizers, from the HuggingFace Transformers library, one of the most popular open-source NLP tools, and TensorFlow Text, the official library of text utilities for TensorFlow. We use the WordPiece vocabulary released with the BERT-Base, Multilingual Cased model.

We compared our algorithms with HuggingFace and TensorFlow Text on a large corpus (several million words) and found that the way strings are split into tokens is identical to the other implementations for both single-word and end-to-end tokenization.

To generate the test data, we sample 1,000 sentences from the multilingual Wikipedia dataset, covering 82 languages. On average, each word has four characters, and each sentence has 82 characters or 17 words. We found this dataset large enough because a much larger dataset (consisting of hundreds of thousands of sentences) generated similar results.

We compare the average runtime when tokenizing a single word or general text (end-to-end) for each system. Fast WordPiece tokenizer is 8.2x faster than HuggingFace and 5.1x faster than TensorFlow Text, on average, for general text end-to-end tokenization.

Average runtime of each system. Note that for better visualization, single-word tokenization and end-to-end tokenization are shown in different scales.

We also examine how the runtime grows with respect to the input length for single-word tokenization. Because of its linear-time complexity, the runtime of LinMaxMatch increases at most linearly with the input length, growing far more slowly than the quadratic-time approaches.

The average runtime of each system with respect to the input length for single-word tokenization.

Conclusion
We proposed LinMaxMatch for single-word WordPiece tokenization, which solves the decades-old MaxMatch problem in asymptotically optimal time with respect to the input length. LinMaxMatch extends the Aho-Corasick algorithm, and the idea can be applied to more string search and transducer challenges. We also proposed an end-to-end WordPiece algorithm that combines pre-tokenization and WordPiece tokenization into a single, linear-time pass for even higher efficiency.

Acknowledgements
We gratefully acknowledge the key contributions and useful advice from other team members and colleagues, including Abbas Bazzi, Alexander Frömmgen, Alex Salcianu, Andrew Hilton, Bradley Green, Ed Chi, Chen Chen, Dave Dopson, Eric Lehman, Fangtao Li, Gabriel Schubiner, Gang Li, Greg Billock, Hong Wang, Jacob Devlin, Jayant Madhavan, JD Chen, Jifan Zhu, Jing Li, John Blitzer, Kirill Borozdin, Kristina Toutanova, Majid Hadian-Jazi, Mark Omernick, Max Gubin, Michael Fields, Michael Kwong, Namrata Godbole, Nathan Lintz, Pandu Nayak, Pew Putthividhya, Pranav Khaitan, Robby Neale, Ryan Doherty, Sameer Panwar, Sundeep Tirumalareddy, Terry Huang, Thomas Strohmann, Tim Herrmann, Tom Small, Tomer Shani, Wenwei Yu, Xiaoxue Zang, Xin Li, Yang Guo, Yang Song, Yiming Xiao, Yuan Shen, and many more.