Categories
Misc

error: self._traceback = tf_stack.extract_stack() or self._traceback = tf_stack.extract_stack_for_node(self._c_op)

Hi,

tried to run https://github.com/JiahuiYu/generative_inpainting

Is the library version causing this error?

I tried this solution https://stackoverflow.com/questions/62466877/self-traceback-tf-stack-extract-stack but it did not work.

Here is the code:

    import neuralgym as ng

    trainer = ng.train.MultiGPUTrainer(
        num_gpus=FLAGS.num_gpus_per_job,
        optimizer=g_optimizer,
        var_list=g_vars,
        max_iters=FLAGS.max_iters,
        graph_def=multigpu_graph_def,
        grads_summary=False,
        gradient_processor=None,
        graph_def_kwargs={
            "model": model,
            "FLAGS": FLAGS,
            "image_ref_mask_data": image_ref_mask_data,
            "identity_model": identity_model,
            "loss_type": "g",
        },
        spe=FLAGS.train_spe,
        # log_dir=FLAGS.log_dir,
    )

When running on TF 1.15, with these packages:

    tensorboard               1.15.0    pyhb230dea_0
    tensorboard-data-server   0.6.0     py37h03978a9_2      conda-forge
    tensorboard-plugin-wit    1.8.1     pyhd8ed1ab_0        conda-forge
    tensorflow                1.15.0    gpu_py37hc3743a6_0
    tensorflow-base           1.15.0    gpu_py37h1afeea4_0
    tensorflow-estimator      1.15.1    pyh2649769_0
    tensorflow-gpu            1.15.0    h0d30ee6_0

I got this error:

    spe=FLAGS.train_spe,
      File "neuralgym\neuralgym\train\multigpu_trainer.py", line 25, in __init__
        super().__init__(**self.context)
      File "neuralgym\neuralgym\train\trainer.py", line 38, in __init__
        self._train_op, self._loss = self.train_ops_and_losses()
      File "neuralgym\neuralgym\train\multigpu_trainer.py", line 75, in train_ops_and_losses
        gpu_id=gpu, **graph_def_kwargs)
      ...
      File "tensorflow115\lib\site-packages\keras\engine\base_layer.py", line 489, in __call__
        output = self.call(inputs, **kwargs)
      File "tensorflow115\lib\site-packages\keras\engine\network.py", line 583, in call
        output_tensors, _, _ = self.run_internal_graph(inputs, masks)
      File "tensorflow115\lib\site-packages\keras\engine\network.py", line 740, in run_internal_graph
        layer.call(computed_tensor, **kwargs))
      File "tensorflow115\lib\site-packages\keras\layers\normalization.py", line 185, in call
        epsilon=self.epsilon)
      File "tensorflow115\lib\site-packages\keras\backend\tensorflow_backend.py", line 2315, in normalize_batch_in_training
        epsilon=epsilon)
      File "tensorflow115\lib\site-packages\keras\backend\tensorflow_backend.py", line 2288, in _fused_normalize_batch_in_training
        data_format=tf_data_format)
      File "tensorflow115\lib\site-packages\tensorflow_core\python\ops\nn_impl.py", line 1501, in fused_batch_norm
        name=name)
      File "tensorflow115\lib\site-packages\tensorflow_core\python\ops\gen_nn_ops.py", line 4620, in fused_batch_norm_v3
        name=name)
      File "tensorflow115\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
        op_def=op_def)
      File "tensorflow115\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
        return func(*args, **kwargs)
      File "tensorflow115\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
        attrs, op_def, compute_device)
      File "tensorflow115\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
        op_def=op_def)
      File "tensorflow115\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
        self._traceback = tf_stack.extract_stack()

When running on TF 2.x, I got this error:

    log_dir=FLAGS.log_dir,
      File "ng\neuralgym\train\multigpu_trainer.py", line 25, in __init__
        super().__init__(**self.context)
      File "ng\neuralgym\train\trainer.py", line 38, in __init__
        self._train_op, self._loss = self.train_ops_and_losses()
      File "ng\neuralgym\train\multigpu_trainer.py", line 75, in train_ops_and_losses
        gpu_id=gpu, **graph_def_kwargs)
      File "inpaint_model.py", line 203, in build_graph_with_losses
        surface_attention=FLAGS.surface_attention,
      File "inpaint_model.py", line 127, in build_inpaint_net
        x = gen_deconv(x, cnum, name="allconv15_upsample")
      File "inpaint_ops.py", line 85, in gen_deconv
        x = resize(x, func=tf.compat.v1.image.resize_nearest_neighbor)
      File "ng\neuralgym\ops\layers.py", line 152, in resize
        x = func(x, new_xs, align_corners=align_corners)
      File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\image_ops_impl.py", line 4659, in resize_nearest_neighbor
        name=name)
      File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_image_ops.py", line 3873, in resize_nearest_neighbor
        name=name)
      File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 742, in _apply_op_helper
        attrs=attr_protos, op_def=op_def)
      File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3784, in _create_op_internal
        op_def=op_def)
      File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2175, in __init__
        self._traceback = tf_stack.extract_stack_for_node(self._c_op)

with these packages:

    tensorboard                   2.8.0    pypi_0    pypi
    tensorboard-data-server      0.6.1    pypi_0    pypi
    tensorboard-plugin-wit       1.8.1    pypi_0    pypi
    tensorflow-gpu               2.8.0    pypi_0    pypi
    tensorflow-io-gcs-filesystem 0.25.0   pypi_0    pypi

Thanks a lot.

submitted by /u/boydbuilding

Categories
Misc

BCE vs Sparse Categorical CE loss, metrics for binary segmentation

Hello, I am training a segmentation model following this link. My problem is binary segmentation: 0 for the background class and 1 for the object of interest. I noticed the predictions are better when sparse categorical cross-entropy (CE) loss is used compared to BCE. Can anyone give a valid reason why this is the case? I am under the impression that BCE is better suited to binary segmentation tasks. Can you please also recommend the best metric instead of accuracy? I am currently using mean IoU as the evaluation metric.
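For reference, a minimal sketch of the two conventions (layer sizes, input shape, and data are illustrative, not from the linked tutorial):

    import tensorflow as tf

    # Option A: binary cross-entropy -- one sigmoid output channel,
    # masks of shape (H, W, 1) with values in {0, 1}.
    bce_model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu",
                               input_shape=(128, 128, 3)),
        tf.keras.layers.Conv2D(1, 1, activation="sigmoid"),
    ])
    bce_model.compile(optimizer="adam",
                      loss=tf.keras.losses.BinaryCrossentropy(),
                      metrics=["accuracy"])

    # Option B: sparse categorical cross-entropy -- two softmax output channels,
    # masks of shape (H, W) holding integer class ids {0, 1}.
    scce_model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu",
                               input_shape=(128, 128, 3)),
        tf.keras.layers.Conv2D(2, 1, activation="softmax"),
    ])
    scce_model.compile(optimizer="adam",
                       loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                       metrics=["accuracy"])

A common cause of a gap between the two is mixing conventions (e.g. a sigmoid output trained against labels shaped or typed for the sparse loss); for imbalanced binary masks, Dice- or IoU-based losses and metrics are often recommended over plain accuracy.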

submitted by /u/stashpot420

Categories
Misc

Creating a Bidirectional LSTM

So I’m trying to create a bidirectional LSTM, and I have this code:

model.add(Bidirectional(LSTM(5, return_sequences=True), input_shape=X.shape[1]))

I keep getting the following error:

TypeError: ‘int’ object is not iterable

It points at the input_shape parameter, but I’m confused about how to fix it.

The shape of the input is (494021, 118)

The shape of the output is (494021, 5)
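For reference, a minimal sketch of one possible fix; reshaping the 2-D input to a single timestep is only one interpretation of the (494021, 118) data, and the layer sizes are illustrative:

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras.layers import Bidirectional, Dense, LSTM

    # input_shape must be a tuple; passing X.shape[1] (an int) raises
    # "TypeError: 'int' object is not iterable". An LSTM also expects 3-D
    # input of shape (samples, timesteps, features).
    X = np.random.rand(1000, 118).astype("float32")   # stand-in for the real data
    X = X.reshape((X.shape[0], 1, X.shape[1]))        # (samples, 1, 118)

    model = tf.keras.Sequential([
        Bidirectional(LSTM(5), input_shape=(X.shape[1], X.shape[2])),  # tuple!
        Dense(5, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy")
    model.summary()

return_sequences=True is only needed when stacking recurrent layers or when a per-timestep output is required; with it, the output keeps a timesteps dimension and will not match an output of shape (494021, 5) directly.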

submitted by /u/echaney456

Categories
Misc

Tensorflow not using GPU

Hi, I’ve been having problems running TensorFlow on GPU for about a week now, so here I am, looking for ideas on how to fix this. I’m running TensorFlow 2.8.0 with CUDA 11.6 on an Ubuntu 20.04 virtual machine, using passthrough for the two GPUs I have.

Now, when I run this object detection project: https://github.com/nicknochnack/TFODCourse, everything works, but the training part doesn’t seem to use any GPU power when I look with nvidia-smi:

    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Quadro K2200        On   | 00000000:07:00.0 Off |                  N/A |
    | 42%   26C    P8     1W /  39W |   3543MiB /  4096MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    |   1  Quadro K2200        On   | 00000000:08:00.0 Off |                  N/A |
    | 42%   21C    P8     1W /  39W |   3543MiB /  4096MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                 GPU Memory |
    |        ID   ID                                                  Usage      |
    |=============================================================================|
    |    0   N/A  N/A      1313      G   /usr/lib/xorg/Xorg                 2MiB |
    |    0   N/A  N/A      2149      C   python                          3535MiB |
    |    1   N/A  N/A      1313      G   /usr/lib/xorg/Xorg                 2MiB |
    |    1   N/A  N/A      2149      C   python                          3535MiB |
    +-----------------------------------------------------------------------------+

Also, when I run tf.test.is_gpu_available(), I get True.

Finally, here’s the result from device_lib.list_local_devices():

    2022-05-01 07:45:35.910286: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2022-05-01 07:45:35.910656: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /device:GPU:0 with 124 MB memory: -> device: 0, name: Quadro K2200, pci bus id: 0000:07:00.0, compute capability: 5.0
    2022-05-01 07:45:35.910778: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2022-05-01 07:45:35.911123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /device:GPU:1 with 124 MB memory: -> device: 1, name: Quadro K2200, pci bus id: 0000:08:00.0, compute capability: 5.0
    [name: "/device:CPU:0"
     device_type: "CPU"
     memory_limit: 268435456
     locality { }
     incarnation: 11170602163960232220
     xla_global_id: -1,
     name: "/device:GPU:0"
     device_type: "GPU"
     memory_limit: 130678784
     locality { bus_id: 1 links { } }
     incarnation: 3083526702988944120
     physical_device_desc: "device: 0, name: Quadro K2200, pci bus id: 0000:07:00.0, compute capability: 5.0"
     xla_global_id: 416903419,
     name: "/device:GPU:1"
     device_type: "GPU"
     memory_limit: 130678784
     locality { bus_id: 1 links { } }
     incarnation: 4460464810151506589
     physical_device_desc: "device: 1, name: Quadro K2200, pci bus id: 0000:08:00.0, compute capability: 5.0"
     xla_global_id: 2144165316]

It seems like TensorFlow can see my GPUs; when I was missing libraries I couldn’t even get to this point, but now I’m stuck.
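For reference, a quick sanity check with TF 2.x APIs (tf.test.is_gpu_available() is deprecated) to confirm where ops are actually placed:

    import tensorflow as tf

    # List the GPUs TensorFlow can see.
    print(tf.config.list_physical_devices("GPU"))

    # Log the device each op runs on; if kernels land on the CPU, the
    # CUDA/cuDNN versions may not match the installed TF build.
    tf.debugging.set_log_device_placement(True)

    a = tf.random.uniform((1000, 1000))
    b = tf.random.uniform((1000, 1000))
    print(tf.matmul(a, b).device)  # expect something ending in "/device:GPU:0"

Note also that nvidia-smi reports an instantaneous GPU-Util sample; since the python process is already holding ~3.5 GiB on each card, TensorFlow has claimed the GPUs, so it may help to watch utilization continuously during a training step (e.g. nvidia-smi -l 1) and to check whether the input pipeline is the bottleneck.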

Oh, and here’s what I get when trying to train:

    2022-04-30 21:53:01.961464: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    [the same NUMA node message is repeated by the stream executor from 21:53:01.962055 through 21:53:04.299801]
    2022-04-30 21:53:04.300924: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3385 MB memory: -> device: 0, name: Quadro K2200, pci bus id: 0000:07:00.0, compute capability: 5.0
    2022-04-30 21:53:04.303496: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2022-04-30 21:53:04.303793: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 3385 MB memory: -> device: 1, name: Quadro K2200, pci bus id: 0000:08:00.0, compute capability: 5.0
    INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0', '/job:localhost/replica:0/task:0/device:GPU:1')
    I0430 21:53:04.485582 140329512097600 mirrored_strategy.py:374] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0', '/job:localhost/replica:0/task:0/device:GPU:1')
    INFO:tensorflow:Maybe overwriting train_steps: 2000
    I0430 21:53:04.494590 140329512097600 config_util.py:552] Maybe overwriting train_steps: 2000
    INFO:tensorflow:Maybe overwriting use_bfloat16: False
    I0430 21:53:04.494734 140329512097600 config_util.py:552] Maybe overwriting use_bfloat16: False
    WARNING:tensorflow:From /home/oscar/Tensorflow/tfod/lib/python3.8/site-packages/object_detection/model_lib_v2.py:563: StrategyBase.experimental_distribute_datasets_from_function (from tensorflow.python.distribute.distribute_lib) is deprecated and will be removed in a future version.
    Instructions for updating: rename to distribute_datasets_from_function
    W0430 21:53:04.534977 140329512097600 deprecation.py:337] From /home/oscar/Tensorflow/tfod/lib/python3.8/site-packages/object_detection/model_lib_v2.py:563: StrategyBase.experimental_distribute_datasets_from_function (from tensorflow.python.distribute.distribute_lib) is deprecated and will be removed in a future version.
    Instructions for updating: rename to distribute_datasets_from_function
    INFO:tensorflow:Reading unweighted datasets: ['Tensorflow/workspace/annotations/train.record']
    I0430 21:53:04.547105 140329512097600 dataset_builder.py:162] Reading unweighted datasets: ['Tensorflow/workspace/annotations/train.record']
    INFO:tensorflow:Reading record datasets for input file: ['Tensorflow/workspace/annotations/train.record']
    I0430 21:53:04.547340 140329512097600 dataset_builder.py:79] Reading record datasets for input file: ['Tensorflow/workspace/annotations/train.record']
    INFO:tensorflow:Number of filenames to read: 1
    I0430 21:53:04.547466 140329512097600 dataset_builder.py:80] Number of filenames to read: 1
    WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
    W0430 21:53:04.547594 140329512097600 dataset_builder.py:86] num_readers has been reduced to 1 to match input file shards.
    WARNING:tensorflow:From /home/oscar/Tensorflow/tfod/lib/python3.8/site-packages/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
    Instructions for updating: Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.deterministic`.
    W0430 21:53:04.553132 140329512097600 deprecation.py:337] From /home/oscar/Tensorflow/tfod/lib/python3.8/site-packages/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
    Instructions for updating: Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.deterministic`.
    WARNING:tensorflow:From /home/oscar/Tensorflow/tfod/lib/python3.8/site-packages/object_detection/builders/dataset_builder.py:235: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
    Instructions for updating: Use `tf.data.Dataset.map()`
    W0430 21:53:04.594555 140329512097600 deprecation.py:337] From /home/oscar/Tensorflow/tfod/lib/python3.8/site-packages/object_detection/builders/dataset_builder.py:235: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
    Instructions for updating: Use `tf.data.Dataset.map()`
    WARNING:tensorflow:From /home/oscar/Tensorflow/tfod/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:1082: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
    Instructions for updating: Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
    W0430 21:53:11.867036 140329512097600 deprecation.py:337] From /home/oscar/Tensorflow/tfod/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:1082: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
    Instructions for updating: Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
    WARNING:tensorflow:From /home/oscar/Tensorflow/tfod/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:1082: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
    Instructions for updating: `seed2` arg is deprecated. Use sample_distorted_bounding_box_v2 instead.
    W0430 21:53:15.011741 140329512097600 deprecation.py:337] From /home/oscar/Tensorflow/tfod/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:1082: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
    Instructions for updating: `seed2` arg is deprecated. Use sample_distorted_bounding_box_v2 instead.
    WARNING:tensorflow:From /home/oscar/Tensorflow/tfod/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:1082: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
    Instructions for updating: Use `tf.cast` instead.
    W0430 21:53:16.702110 140329512097600 deprecation.py:337] From /home/oscar/Tensorflow/tfod/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:1082: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
    Instructions for updating: Use `tf.cast` instead.

submitted by /u/OscarleBG

Categories
Misc

Object Detection: How do I display results as text?

Hi, I just followed a Python TensorFlow object detection tutorial and was able to successfully get back an image with labeled boxes. I was wondering how I could get the output as text. For example: the output image successfully labeled 2 cars as Toyotas and 1 car as a Honda. I want an additional output, a string that says “2 toyotas and 1 honda found”. I am not sure how to do this.

Code: https://pastebin.com/cUq4Ezm4
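For reference, a hedged sketch of turning detection results into a summary string. summarize_detections is a hypothetical helper, not part of the linked code; it assumes the un-batched numpy arrays and category_index used in typical TF Object Detection API tutorials (depending on the tutorial, detection_classes may need a label_id_offset, often +1, before indexing category_index):

    from collections import Counter

    def summarize_detections(detections, category_index, min_score=0.5):
        # Count detections above a score threshold and format them as text.
        classes = detections["detection_classes"]
        scores = detections["detection_scores"]
        names = [category_index[int(c)]["name"]
                 for c, s in zip(classes, scores) if s >= min_score]
        counts = Counter(names)
        return ", ".join(f"{n} {label}(s)" for label, n in counts.items()) + " found"

    # e.g. print(summarize_detections(detections, category_index))
    # -> "2 toyota(s), 1 honda(s) found"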

submitted by /u/irissnare

Categories
Misc

Seemingly Impossible Install Conditions For Tensorflow and Stable-Baselines OpenAI

I’m trying to use OpenAI stable-baselines, which requires Python 3.5 or higher. I set up my virtual environment etc. and installed all dependencies. No problem. Then I try running a test script and it turns out that stable-baselines does not support TensorFlow 2; I need TensorFlow 1.15. No problem. I try to install TensorFlow 1.15, but can’t, because it is not available on pip anymore for Python 3. OK, so I need Python 2.7 to use TensorFlow 1.15… but then I can’t use stable-baselines, because it needs Python 3… Help!

submitted by /u/HaikuHaiku

Categories
Misc

No module named tensorflow.keras.optimizer error

Hi,

I was recently working on a project that required the tensorflow.keras.optimizer module. I am getting a “no module named tensorflow.keras.optimizer” error even though TensorFlow and Keras are both installed and up to date, and I am using TensorFlow elsewhere in the same file with no error. tensorflow.keras.preprocessing.image also reports that no module with that name was found. Thanks
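For reference, the submodule name is plural (tensorflow.keras.optimizers, not .optimizer); a minimal sketch of imports that should resolve on a current TF 2.x install (note that some IDEs flag these lazily loaded keras submodules even when they import fine at runtime):

    # The module name is plural: tensorflow.keras.optimizers, not .optimizer.
    from tensorflow.keras.optimizers import Adam
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    opt = Adam(learning_rate=1e-3)
    datagen = ImageDataGenerator(rescale=1.0 / 255)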

submitted by /u/StarLan7

Categories
Misc

PyTorch’s model.cuda() equivalent

Dear community, I have worked with PyTorch before, and now I have started a project with TensorFlow.

In PyTorch I can define a model and then send its parameters to the GPU with the command in the title. I googled how to achieve the same with TensorFlow (and Keras) but have not found a satisfying answer.
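For reference, a minimal sketch assuming a GPU is visible: TensorFlow places variables and ops on the GPU automatically, so there is no direct model.cuda() equivalent, but placement can be made explicit with a device scope (layer sizes are illustrative):

    import tensorflow as tf

    with tf.device("/GPU:0"):
        model = tf.keras.Sequential([
            tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
            tf.keras.layers.Dense(1),
        ])

    # Verify where the weights ended up.
    print(model.weights[0].device)  # e.g. ".../device:GPU:0"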

Thanks for the help!

submitted by /u/whiteabc11

Categories
Misc

I’m a beginner having really bad issues installing TensorFlow on an M1 MacBook. Would anyone be able to hop on a call and help out?

submitted by /u/Alpay_

Categories
Offsites

Extracting Skill-Centric State Abstractions from Value Functions

Advances in reinforcement learning (RL) for robotics have enabled robotic agents to perform increasingly complex tasks in challenging environments. Recent results show that robots can learn to fold clothes, dexterously manipulate a Rubik’s Cube, sort objects by color, navigate complex environments, and walk on difficult, uneven terrain. But “short-horizon” tasks such as these, which require very little long-term planning and provide immediate failure feedback, are relatively easy to train compared to many tasks that may confront a robot in a real-world setting. Unfortunately, scaling such short-horizon skills to the abstract, long horizons of real-world tasks is difficult. For example, how would one train a robot capable of picking up objects to rearrange a room?

Hierarchical reinforcement learning (HRL), a popular way of solving this problem, has achieved some success in a variety of long-horizon RL tasks. HRL aims to solve such problems by reasoning over a bank of low-level skills, thus providing an abstraction for actions. However, the high-level planning problem can be further simplified by abstracting both states and actions. For example, consider a tabletop rearrangement task, where a robot is tasked with interacting with objects on a desk. Using recent advances in RL, imitation learning, and unsupervised skill discovery, it is possible to obtain a set of primitive manipulation skills such as opening or closing drawers, picking or placing objects, etc. However, even for the simple task of putting a block into the drawer, chaining these skills together is not straightforward. This may be attributed to a combination of (i) challenges with planning and reasoning over long horizons, and (ii) dealing with high dimensional observations while parsing the semantics and affordances of the scene, i.e., where and when the skill can be used.

In “Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning”, presented at ICLR 2022, we address the task of learning suitable state and action abstractions for long-range problems. We posit that a minimal, but complete, representation for a higher-level policy in HRL must depend on the capabilities of the skills available to it. We present a simple mechanism to obtain such a representation using skill value functions and show that such an approach improves long-horizon performance in both model-based and model-free RL and enables better zero-shot generalization.

Our method, VFS, can compose low-level primitives (left) to learn complex long-horizon behaviors (right).

Building a Value Function Space
The key insight motivating this work is that the abstract representation of actions and states is readily available from trained policies via their value functions. The notion of “value” in RL is intrinsically linked to affordances, in that the value of a state for a given skill reflects the probability of receiving a reward for successfully executing the skill. For any skill, its value function captures two key properties: 1) the preconditions and affordances of the scene, i.e., where and when the skill can be used, and 2) the outcome, which indicates whether the skill executed successfully when it was used.

Given a decision process with a finite set of k skills trained with sparse outcome rewards and their corresponding value functions, we construct an embedding space by stacking these skill value functions. This gives us an abstract representation that maps a state to a k-dimensional representation that we call the Value Function Space, or VFS for short. This representation captures functional information about the exhaustive set of interactions that the agent can have with the environment, and is thus a suitable state abstraction for downstream tasks.
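As an illustrative sketch (not the authors’ implementation), constructing the VFS representation amounts to evaluating each skill’s value function at the current state and stacking the results:

    import numpy as np

    def vfs_embedding(state, skill_value_fns):
        # skill_value_fns: k callables V_i(state) -> value in [0, 1].
        return np.array([v(state) for v in skill_value_fns])

    # Toy example with k = 3 skills; the value functions here are arbitrary
    # stand-ins, purely to show the shape of the representation.
    skills = [
        lambda s: float(s["drawer_open"]),         # e.g. "place in drawer" usable
        lambda s: 1.0 - float(s["drawer_open"]),   # e.g. "open drawer" usable
        lambda s: float(s["object_in_gripper"]),   # e.g. "place on counter" usable
    ]
    state = {"drawer_open": True, "object_in_gripper": False}
    print(vfs_embedding(state, skills))  # -> [1. 0. 0.]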

Consider a toy example of the tabletop rearrangement setup discussed earlier, with the task of placing the blue object in the drawer. There are eight elementary actions in this environment. The bar plot on the right shows the values of each skill at any given time, and the graph at the bottom shows the evolution of these values over the course of the task.

Value functions corresponding to each skill (top-right; aggregated in bottom) capture functional information about the scene (top-left) and aid decision-making.

At the beginning, the values corresponding to the “Place on Counter” skill are high since the objects are already on the counter; likewise, the values corresponding to “Close Drawer” are high. Through the trajectory, when the robot picks up the blue cube, the corresponding skill value peaks. Similarly, the values corresponding to placing the objects in the drawer increase when the drawer is open and peak when the blue cube is placed inside it. All the functional information required to affect each transition and predict its outcome (success or failure) is captured by the VFS representation, and in principle, allows a high-level agent to reason over all the skills and chain them together — resulting in an effective representation of the observations.

Additionally, since VFS learns a skill-centric representation of the scene, it is robust to exogenous factors of variation, such as background distractors and appearances of task-irrelevant components of the scene. All configurations shown below are functionally equivalent — an open drawer with the blue cube in it, a red cube on the countertop, and an empty gripper — and can be interacted with identically, despite apparent differences.

The learned VFS representation can ignore task-irrelevant factors such as arm pose, distractor objects (green cube) and background appearance (brown desk).

Robotic Manipulation with VFS
This approach enables VFS to plan out complex robotic manipulation tasks. Take, for example, a simple model-based reinforcement learning (MBRL) algorithm that uses a one-step predictive model of the transition dynamics in value function space and randomly samples candidate skill sequences to select and execute the best one, in a manner similar to model-predictive control. Given a set of primitive pushing skills of the form “move Object A near Object B” and a high-level rearrangement task, we find that VFS can use MBRL to reliably find skill sequences that solve the high-level task.
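As an illustrative sketch (not the paper’s implementation), such a planner can be written as random shooting over skill sequences, assuming a learned one-step model f(z, skill) over VFS embeddings z and a task score g(z):

    import numpy as np

    def plan_skill_sequence(z0, f, g, num_skills, horizon=4, num_samples=256, seed=0):
        # Sample random skill sequences, roll them out with f, keep the best by g.
        rng = np.random.default_rng(seed)
        best_seq, best_score = None, -np.inf
        for _ in range(num_samples):
            seq = rng.integers(num_skills, size=horizon)  # candidate skill sequence
            z = z0
            for skill in seq:
                z = f(z, int(skill))                      # predicted next embedding
            score = g(z)                                  # score the predicted outcome
            if score > best_score:
                best_seq, best_score = seq, score
        return best_seq  # execute the first skill, then replan (MPC-style)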

A rollout of VFS performing a tabletop rearrangement task using a robotic arm. VFS can reason over a sequence of low-level primitives to achieve the desired goal configuration.

To better understand the attributes of the environment captured by VFS, we sample the VFS-encoded observations from a large number of independent trajectories in the robotic manipulation task and project them onto a two-dimensional axis using the t-SNE technique, which is useful for visualizing clusters in high-dimensional data. These t-SNE embeddings reveal interesting patterns identified and modeled by VFS. Looking at some of these clusters closely, we find that VFS can successfully capture information about the contents (objects) in the scene and affordances (e.g., a sponge can be manipulated when held by the robot’s gripper), while ignoring distractors like the relative positions of the objects on the table and the pose of the robotic arm. While these factors are certainly important to solve the task, the low-level primitives available to the robot abstract them away and hence, make them functionally irrelevant to the high-level controller.
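As an illustrative sketch (the embeddings below are random stand-ins for real rollout data), such a projection can be computed with scikit-learn’s t-SNE:

    import numpy as np
    from sklearn.manifold import TSNE

    # One k-dimensional VFS embedding per sampled observation, here k = 8.
    vfs_embeddings = np.random.rand(5000, 8)
    points_2d = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(vfs_embeddings)
    print(points_2d.shape)  # (5000, 2)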

Visualizing the 2D t-SNE projections of VFS embeddings show emergent clustering of equivalent configurations of the environment while ignoring task-irrelevant factors like arm pose.

Conclusions and Connections to Future Work
Value function spaces are representations built on value functions of underlying skills, enabling long-horizon reasoning and planning over skills. VFS is a compact representation that captures the affordances of the scene and task-relevant information while robustly ignoring distractors. Empirical experiments reveal that such a representation improves planning for model-based and model-free methods and enables zero-shot generalization. Going forward, this representation has the promise to continue improving along with the field of multitask reinforcement learning. The interpretability of VFS further enables integration into fields such as safe planning and grounding language models.

Acknowledgements
We thank our co-authors Sergey Levine, Ted Xiao, Alex Toshev, Peng Xu and Yao Lu for their contributions to the paper and feedback on this blog post. We also thank Tom Small for creating the informative visualizations used in this blog post.