Categories
Misc

Looking for advice/resources

Greetings all,

I want to create a tool to detect rooftops in satellite images and create a mask of them to display over the image and/or generate a GeoJSON of the mask to use in further processing. I actually want to go a step further and try to estimate the 3D geometry of the roof, but that may be outside the scope of this question.

Does something like this already exist as open source that I haven’t found yet? Do I need to do something like Mask R-CNN? Also, if anyone has suggestions on general resources for image processing or geospatial image processing, I would greatly appreciate it!
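A rough sketch of the mask-to-GeoJSON half of this, assuming a segmentation model (a Mask R-CNN or U-Net, for example) has already produced a binary rooftop mask aligned with a georeferenced tile, and using hypothetical file names: rasterio can polygonize the mask with the tile's geotransform.

import json
import numpy as np
import rasterio
from rasterio import features

# Hypothetical inputs: a georeferenced tile and a 0/1 rooftop mask from some model.
with rasterio.open("tile.tif") as src:
    transform = src.transform
mask = np.load("rooftop_mask.npy").astype("uint8")

# Each connected rooftop region becomes one GeoJSON polygon feature.
rooftop_features = [
    {"type": "Feature", "geometry": geom, "properties": {}}
    for geom, value in features.shapes(mask, mask=mask.astype(bool), transform=transform)
    if value == 1
]

with open("rooftops.geojson", "w") as f:
    json.dump({"type": "FeatureCollection", "features": rooftop_features}, f)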

submitted by /u/ashmortar

Categories
Misc

TensorFlow incompatible shape error help

import tensorflow as tf
import pandas as pd
import numpy as np
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
from tensorflow.keras.layers.experimental.preprocessing import Normalization
from tensorflow.keras.layers.experimental.preprocessing import IntegerLookup
from sklearn.preprocessing import LabelEncoder

# Load Data
data = pd.read_csv('../datasets/labeled_data/labeled_features.csv')
data = data[["acousticness", "danceability", "energy",
             "instrumentalness", "liveness", "loudness",
             "valence", "tempo", "genre"]]

# Prepare target variable
def prepare_target(dataframe, target):
    le = LabelEncoder()
    le.fit(dataframe[target])
    dataframe[target] = le.transform(dataframe[target])
    return dataframe

data = prepare_target(data, "genre")

validation_frame = data.sample(frac=0.3, random_state=1234)
train_frame = data.drop(validation_frame.index)

def encode_numerical_feature(feature, name, dataset):
    # Create Normalization layer for our feature
    normalizer = Normalization()
    # Prepare the dataset that only yields our feature
    feature_ds = dataset.map(lambda x, y: x[name])
    feature_ds = feature_ds.map(lambda x: tf.expand_dims(x, -1))
    # Learn the statistics of the data
    normalizer.adapt(feature_ds)
    # Normalize the input feature
    encoded_feature = normalizer(feature)
    return encoded_feature

def encode_string_categorical_feature(feature, name, dataset):
    # Create a StringLookup layer which will turn strings into integer indices
    index = StringLookup()
    # Prepare a Dataset that only yields our feature
    feature_ds = dataset.map(lambda x, y: x[name])
    feature_ds = feature_ds.map(lambda x: tf.expand_dims(x, -1))
    # Learn the set of possible string values and assign them a fixed integer index
    index.adapt(feature_ds)
    # Turn the string input into integer indices
    encoded_feature = index(feature)
    # Create a CategoryEncoding for our integer indices
    encoder = IntegerLookup(output_mode="binary")
    # Prepare a dataset of indices
    feature_ds = feature_ds.map(index)
    # Learn the space of possible indices
    encoder.adapt(feature_ds)
    # Apply one-hot encoding to our indices
    encoded_feature = encoder(encoded_feature)
    return encoded_feature

def encode_integer_categorical_feature(feature, name, dataset):
    # Create a CategoryEncoding for our integer indices
    encoder = IntegerLookup(output_mode="binary")
    # Prepare a Dataset that only yields our feature
    feature_ds = dataset.map(lambda x, y: x[name])
    feature_ds = feature_ds.map(lambda x: tf.expand_dims(x, -1))
    # Learn the space of possible indices
    encoder.adapt(feature_ds)
    # Apply one-hot encoding to our indices
    encoded_feature = encoder(feature)
    return encoded_feature

def dataframe_to_dataset(dataframe):
    dataframe = dataframe.copy()
    labels = dataframe.pop("genre")
    ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
    ds = ds.shuffle(buffer_size=len(dataframe))
    return ds

train_ds = dataframe_to_dataset(train_frame)
val_ds = dataframe_to_dataset(validation_frame)

train_ds = train_ds.batch(32)
train_ds = val_ds.batch(32)

# Adaptation Steps
acousticness = keras.Input(shape=(1,), name="acousticness", dtype="float64")
danceability = keras.Input(shape=(1,), name="danceability", dtype="float64")
energy = keras.Input(shape=(1,), name="energy", dtype="float64")
instrumentalness = keras.Input(shape=(1,), name="instrumentalness", dtype="float64")
liveness = keras.Input(shape=(1,), name="liveness", dtype="float64")
loudness = keras.Input(shape=(1,), name="loudness", dtype="float64")
valence = keras.Input(shape=(1,), name="valence", dtype="float64")
tempo = keras.Input(shape=(1,), name="tempo", dtype="float64")

all_inputs = [
    acousticness,
    danceability,
    energy,
    instrumentalness,
    liveness,
    loudness,
    valence,
    tempo,
]

acousticness_encoded = encode_numerical_feature(acousticness, "acousticness", train_ds)
danceability_encoded = encode_numerical_feature(acousticness, "danceability", train_ds)
energy_encoded = encode_numerical_feature(acousticness, "energy", train_ds)
instrumentalness_encoded = encode_numerical_feature(acousticness, "instrumentalness", train_ds)
liveness_encoded = encode_numerical_feature(acousticness, "liveness", train_ds)
loudness_encoded = encode_numerical_feature(acousticness, "loudness", train_ds)
valence_encoded = encode_numerical_feature(acousticness, "valence", train_ds)
tempo_encoded = encode_numerical_feature(acousticness, "tempo", train_ds)

all_features = layers.concatenate(
    [
        acousticness_encoded,
        danceability_encoded,
        energy_encoded,
        instrumentalness_encoded,
        liveness_encoded,
        loudness_encoded,
        valence_encoded,
        tempo_encoded,
    ]
)

nn = layers.Dense(32, activation="relu")(all_features)
nn = layers.Dropout(0.5)(nn)
#nn = layers.Dense(32, activation="relu")(nn)
#nn = layers.Dropout(0.5)(nn)
output = layers.Dense(8, activation='softmax')(nn)

model = keras.Model(all_inputs, output)
model.compile("adam", "categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=50, validation_data=val_ds)

I have been trying to run this code, but simply cannot run it without receiving the following error:

 ValueError: Shapes (None, 1) and (None, 8) are incompatible 

Can someone help me through this?
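A likely cause, given the code above: LabelEncoder turns the genre column into integer class indices, so the targets reaching the loss have shape (None, 1), while categorical_crossentropy expects one-hot targets matching the 8-way softmax output, shape (None, 8). A hedged sketch of the usual fix is to keep the integer labels and switch to the sparse loss (the repeated acousticness argument in the encode_numerical_feature calls looks like a separate copy-paste slip, but it is not what triggers this error):

# Compare integer class indices directly against the 8-way softmax output.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Also note that `train_ds = val_ds.batch(32)` overwrites the training set;
# it was presumably meant to be `val_ds = val_ds.batch(32)`.
model.fit(train_ds, epochs=50, validation_data=val_ds)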

submitted by /u/Puzzleheaded_Juice12

Categories
Misc

NVIDIA Sets Conference Call for First-Quarter Financial Results

SANTA CLARA, Calif., May 05, 2021 — NVIDIA will host a conference call on Wednesday, May 26, at 2 p.m. PT (5 p.m. ET) to discuss its financial results for the first quarter of fiscal year 2022,…

Categories
Misc

New on NGC: New and Updated HPC Containers on the NGC Catalog

There are more than a hundred containers spanning HPC, deep learning and machine learning applications available in the NGC catalog, NVIDIA’s hub of GPU-optimized HPC and AI applications.

A container is a portable unit of software that combines the application and all its dependencies into a single package that is agnostic to the underlying host OS. In a high-performance computing (HPC) environment, containers remove the need for building complex environments or maintaining environment modules, making it easy for researchers and systems administrators to deploy their HPC applications. 

There are more than a hundred containers spanning HPC, deep learning and machine learning applications available in the NGC catalog, NVIDIA’s hub of GPU-optimized HPC and AI applications. The containers available in the catalog are tested for performance, reliability and scalability. They are also screened for Common Vulnerabilities and Exposures (CVEs) and malware to ensure that they are ready for deployment in a production environment. 

Below are some highlights of the new and updated containers, which can run on both x86 and ARM platforms and fully support Singularity runtimes. 

New containers added to the catalog: 

  • TinkerHP is an MPI-based, massively parallel package dedicated to long polarizable molecular dynamics simulations and to polarizable QM/MM.
  • TorchANI is a PyTorch-based deep learning implementation of the ANI neural network potentials (see the short sketch after this list).
  • LBPM (Lattice Boltzmann Porous Media) is a software package for simulating flow through porous media.
  • NGC Pre-Flight Check is a lightweight tool that verifies that the container runtime is set up correctly for GPUs and InfiniBand.
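For instance, here is a minimal sketch of the kind of workload the TorchANI container targets, computing an ANI potential energy and forces for a methane-like molecule; it assumes the upstream TorchANI Python API (torchani.models.ANI2x), not anything specific to the NGC packaging.

import torch
import torchani

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Load the pretrained ANI-2x potential; periodic_table_index lets us pass atomic numbers.
model = torchani.models.ANI2x(periodic_table_index=True).to(device)

# One methane molecule: atomic numbers and Cartesian coordinates in Angstrom.
species = torch.tensor([[6, 1, 1, 1, 1]], device=device)
coordinates = torch.tensor([[[ 0.00,  0.00,  0.00],
                             [ 0.63,  0.63,  0.63],
                             [-0.63, -0.63,  0.63],
                             [-0.63,  0.63, -0.63],
                             [ 0.63, -0.63, -0.63]]],
                           requires_grad=True, device=device)

energy = model((species, coordinates)).energies
forces = -torch.autograd.grad(energy.sum(), coordinates)[0]
print(energy.item(), forces.shape)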

You can also find updated versions of some of the key HPC applications in the NGC catalog. 

Get started today by pulling the container for your HPC needs from the list above or by visiting the NGC catalog.

Categories
Offsites

Introducing FELIX: Flexible Text Editing Through Tagging and Insertion

Sequence-to-sequence (seq2seq) models have become a favoured approach for tackling natural language generation tasks, with applications ranging from machine translation to monolingual generation tasks, such as summarization, sentence fusion, text simplification, and machine translation post-editing. However, these models appear to be a suboptimal choice for many monolingual tasks, as the desired output text often represents a minor rewrite of the input text. When accomplishing such tasks, seq2seq models are both slower, because they generate the output one word at a time (i.e., autoregressively), and wasteful, because most of the input tokens are simply copied into the output.

Instead, text-editing models have recently received a surge of interest as they propose to predict edit operations – such as word deletion, insertion, or replacement – that are applied to the input to reconstruct the output. However, previous text-editing approaches have limitations. They are either fast (being non-autoregressive), but not flexible, because they use a limited number of edit operations, or they are flexible, supporting all possible edit operations, but slow (autoregressive). In either case, they have not focused on modeling large structural (syntactic) transformations, for example switching from active voice, “They ate steak for dinner,” to passive, “Steak was eaten for dinner.” Instead, they’ve focused on local transformations, deleting or replacing short phrases. When a large structural transformation needs to occur, they either can’t produce it or insert a large amount of new text, which is slow.

In “FELIX: Flexible Text Editing Through Tagging and Insertion”, we introduce FELIX, a fast and flexible text-editing system that models large structural changes and achieves a 90x speed-up compared to seq2seq approaches whilst achieving impressive results on four monolingual generation tasks. Compared to traditional seq2seq methods, FELIX has the following three key advantages:

  • Sample efficiency: Training a high precision text generation model typically requires large amounts of high-quality supervised data. FELIX uses three techniques to minimize the amount of required data: (1) fine-tuning pre-trained checkpoints, (2) a tagging model that learns a small number of edit operations, and (3) a text insertion task that is very similar to the pre-training task.
  • Fast inference time: FELIX is fully non-autoregressive, avoiding slow inference times caused by an autoregressive decoder.
  • Flexible text editing: FELIX strikes a balance between the complexity of learned edit operations and flexibility in the transformations it models.

In short, FELIX is designed to derive the maximum benefit from self-supervised pre-training, making it efficient in low-resource settings with little training data.

Overview
To achieve the above, FELIX decomposes the text-editing task into two sub-tasks: tagging to decide on the subset of input words and their order in the output text, and insertion, where words that are not present in the input are inserted. The tagging model employs a novel pointer mechanism, which supports structural transformations, while the insertion model is based on a Masked Language Model. Both of these models are non-autoregressive, ensuring the model is fast. A diagram of FELIX can be seen below.

An example of FELIX trained on data for a text simplification task. Input words are first tagged as KEEP (K), DELETE (D) or KEEP and INSERT (I). After tagging, the input is reordered. This reordered input is then fed to a masked language model.

The Tagging Model
The first step in FELIX is the tagging model, which consists of two components. First, the tagger determines which words should be kept or deleted and where new words should be inserted. When the tagger predicts an insertion, a special MASK token is added to the output. After tagging comes a reordering step, in which the pointer reorders the input to form the output, allowing the model to reuse parts of the input instead of inserting new text. The reordering step supports arbitrary rewrites, which enables modeling large changes. The pointer network is trained such that each word in the input points to the next word as it will appear in the output, as shown below.

Realization of the pointing mechanism to transform “There are 3 layers in the walls of the heart” into “the heart MASK 3 layers”.
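As a toy illustration of this tag-and-point step (hand-written stand-ins for the model’s predictions, not the released FELIX code), the example in the caption can be reproduced by walking a pointer chain over the tagged source tokens:

MASK = "[MASK]"

def realize(tokens, tags, next_ptr, start):
    """Follow the pointer chain to build the reordered, masked output.
    tags[i] is KEEP, DELETE, or INSERT; next_ptr[i] is the index of the token
    that follows token i in the output, or None at the end of the chain."""
    out, i = [], start
    while i is not None:
        out.append(tokens[i])
        if tags[i] == "INSERT":   # an insertion is realized as a MASK placeholder
            out.append(MASK)
        i = next_ptr[i]
    return out

tokens = "There are 3 layers in the walls of the heart".split()
tags = ["DELETE", "DELETE", "KEEP", "KEEP", "DELETE",
        "DELETE", "DELETE", "DELETE", "KEEP", "INSERT"]
next_ptr = {8: 9, 9: 2, 2: 3, 3: None}   # "the" -> "heart" -> "3" -> "layers"
print(" ".join(realize(tokens, tags, next_ptr, start=8)))
# the heart [MASK] 3 layers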

The Insertion Model
The output of the tagging model is the reordered input text, with deleted words removed and MASK tokens inserted where the tagger predicted insertions. The insertion model must predict the content of the MASK tokens. Because FELIX’s insertion model is very similar to the pre-training objective of BERT, it can take direct advantage of the pre-training, which is particularly advantageous when data is limited.

Example of the insertion model, where the tagger predicts two words will be inserted and the insertion model predicts the content of the MASK tokens.
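The MASK-filling step itself behaves like standard masked language modeling. The sketch below is not FELIX’s insertion model; it uses an off-the-shelf BERT checkpoint through the Hugging Face transformers fill-mask pipeline (an assumption, not part of the FELIX release) to show how MASK placeholders in the reordered text could be filled.

# Fill the placeholder in the reordered text with a generic masked language model.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for candidate in fill("the heart [MASK] 3 layers."):
    # Each candidate is a dict with the predicted token and its score.
    print(candidate["token_str"], round(candidate["score"], 3))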

Results
We evaluated FELIX on sentence fusion, text simplification, abstractive summarization, and machine translation post-editing. These tasks vary significantly in the types of edits required and the dataset sizes under which they operate. Below are the results on the sentence fusion task (i.e., merging two sentences into one), comparing FELIX against a large pre-trained seq2seq model (BERT2BERT) and a text-editing model (LaserTagger), under a range of dataset sizes. We see that FELIX outperforms LaserTagger and can be trained on as few as a few hundred training examples. For the full dataset, the autoregressive BERT2BERT outperforms FELIX. However, during inference, this model takes significantly longer.

A comparison of different training dataset sizes on the DiscoFuse dataset. We compare FELIX (using the best performing model) against BERT2BERT and LaserTagger.
Latency in milliseconds for a batch of 32 on an NVIDIA Tesla P100.

Conclusion
We have presented FELIX, which is fully non-autoregressive, providing even faster inference times while achieving state-of-the-art results. FELIX also minimizes the amount of required training data with three techniques: fine-tuning pre-trained checkpoints, learning a small number of edit operations, and an insertion task that mimics the masked language model task from pre-training. Lastly, FELIX strikes a balance between the complexity of learned edit operations and the percentage of input-output transformations it can handle. We have open-sourced the code for FELIX and hope it will provide researchers with a faster, more efficient, and more flexible text-editing model.

Acknowledgements
This research was conducted by Jonathan Mallinson, Aliaksei Severyn (equal contribution), Eric Malmi, Guillermo Garrido. We would like to thank Aleksandr Chuklin, Daniil Mirylenka, Ryan McDonald, and Sebastian Krause for useful discussions, running early experiments and paper suggestions.

Categories
Misc

Putting the AI in Retail: Walmart’s Grant Gelvin on Prediction Analytics at Supercenter Scale

With only one U.S. state without a Walmart supercenter — and over 4,600 stores across the country — the retail giant’s prediction analytics work with data on an enormous scale. Grant Gelven, a machine learning engineer at Walmart Global Tech, joined NVIDIA AI Podcast host Noah Kravitz for the latest episode of the AI Podcast.


Categories
Misc

BMW Brings Together Art, Artificial Intelligence for Virtual Installation Using NVIDIA StyleGAN

BMW today unveiled a virtual art installation that projects AI-generated artwork onto a virtual rendition of the automaker’s 8 Series Gran Coupe.  Dubbed “The Ultimate AI Masterpiece,” the installation harnessed NVIDIA StyleGAN — a generative model for high-resolution images — to create original artwork projection-mapped onto the virtual vehicle. The project debuts in conjunction with … Continued

BMW today unveiled a virtual art installation that projects AI-generated artwork onto a virtual rendition of the automaker’s 8 Series Gran Coupe. 

Dubbed “The Ultimate AI Masterpiece,” the installation harnessed NVIDIA StyleGAN — a generative model for high-resolution images — to create original artwork projection-mapped onto the virtual vehicle. The project debuts in conjunction with the contemporary art festival Frieze New York, and marks the 50th year of cultural engagement by the BMW Group.

“For 50 years, BMW has supported arts and culture through numerous initiatives as a way to engage and interact with consumers around the world in an authentic way,” said Uwe Dreher, vice president of marketing, BMW of North America. “As we continue these efforts into 2021, and look for new and creative ways to engage audiences, we shift to a virtual setting where we are combining centuries-old art and the latest AI technology to create something completely new and exciting.”

Collaborators Gary Yeh, founder of the art media company ArtDrunk, and Nathan Shipley, director of creative technology at Goodby, Silverstein & Partners, trained NVIDIA StyleGAN on 50,000 images of art spanning nine centuries as well as 50 contemporary works from artists BMW has worked with in past years. The trained model merges what it learned from classical art with the styles of the contemporary artists. 

“AI is an emerging medium of creative expression. It’s a fascinating space where art meets algorithm,” said Shipley. “Combining the historical works with the curated modern works and projecting the evolving images onto the 8 Series Gran Coupe serves as a direct nod to BMW’s history of uniting automobiles, art, and technology.” 

The project uses the BMW car as a canvas to showcase each creator’s style — like that of South Korean charcoal artist Lee Bae. 

“In this case the AI learned from Lee Bae’s work. In a way, it sees those textures,” Shipley said. “And then on its own the AI generates this evolving stream of new textures. They’re informed by his work, but they’re also unique.”

Developed by NVIDIA Research, StyleGAN has been adopted for digital storytelling, art exhibits, manga illustrations and reimagined historical portraits.

For more AI-inspired artwork, visit the AI Art Gallery featured at the recent NVIDIA GPU Technology Conference.

Categories
Misc

Streaming Everything with NVIDIA Rivermax

NVIDIA Rivermax 1.5, the newest release of the IP-based video and data streaming library, includes key features and capabilities enabling performance boosts and quicker integrations.

In 2020, many of us adopted a work-from-home routine, and this new norm has been stressing IT networks. It shouldn’t be a surprise that the sudden boost in remote working drives the need for a more dynamic IT environment, one that can pull in resources on demand.

Over the past few years, we’ve focused on the Media & Entertainment (M&E) market, supporting the global industry as it evolves from proprietary SDI to cost-effective Ethernet/IP infrastructure solutions. NVIDIA technologies are enabling M&E to take the next transformational step toward cloud computing, while meeting compliance with the most stringent SMPTE ST-2110-21 specification requirements.

On the journey to modernize M&E network interconnect, we introduced NVIDIA Rivermax, an optimized, standard-compliant software library API for streaming data. Rivermax software runs on NVIDIA ConnectX-5 or later network adapters, enabling the use of common off-the-shelf (COTS) servers for streaming SD, HD, and up to Ultra-HD video flows. The Rivermax-ConnectX-5 adapter card combination also enables compliance with M&E specifications, such as the SMPTE 2110-21; reduces CPU utilization for video data streaming; and removes bottlenecks for the highest throughput. It can reach 82 Gbps of streamed video with a single CPU core.

As our partners have rolled out new Rivermax-based, full-IP solutions rigorously tested in their labs, we’re excited to share the fruits of these collaborative investments in Rivermax 1.5, the latest release of the streaming library. Rivermax 1.5 includes key features and capabilities enabling performance boosts and quicker integrations. One of these new features allows Rivermax-accelerated applications to stream not only video, audio, and ancillary data but other data stream formats as well, enabling Rivermax accelerations and CPU savings in many new markets and applications:

  • Compressed video
  • Healthcare imaging (DICOM-RTV)
  • Cloud gaming
  • Autonomous car sensor streaming (video/LiDAR/RADAR)
  • And more

Another good piece of news is that Rivermax 1.5 recently passed the JT-NM Tested program (March 16 – 20, 2020), allowing for integration and interoperability with multiple other market vendors.

Rivermax 1.5 release contents

The Rivermax 1.5 release contains the following updates and features:

  • Virtualized Rivermax over VMware ESXi and Linux OpenStack (currently in beta-level support)
  • Rivermax API updates:
    • Replaced the TX pause API with a flag passed to the commit API
    • Changed the structure of the in-buffer attributes
    • Changed the function signature of the in-query buffer API
  • New 802.1Q VLAN tagging support
  • New SDK code examples:
    • Media sender:
      • Real video content, interlaced formats, 59.94 and 29.97 fps
    • Media receiver:
      • GPU-CUDA support for color space conversion (from YCbCr to RGB): display or play back a video stream on screen or over X11 through SSH
      • Interlaced video formats
      • SMPTE 2022-7 Rx software sample code to get you started quickly on a software implementation of 2022-7, which will be offloaded to ConnectX-6 Dx hardware in future releases
  • Generic API (beta version): For streaming any type of data. Get all the benefits of Rivermax, such as traffic shaping (accurate packet pacing) and high bandwidth for any type of UDP-based data stream with low CPU utilization, on both Linux and Windows.
  • Introduced Rivermax for Mellanox ConnectX-6 Dx in beta-level support on Linux (with feature parity to ConnectX-5)
  • NVIDIA-Jetson platform software image (as presented at IBC2019)
    • Based on Rivermax 1.5 release
    • Demos running Rivermax on NVIDIA-Jetson platform
    • Includes sender and receiver examples
    • The GPU is integrated with the media receiver example for both color space conversion (CSC) and on-screen rendering
    • AnalyzeX (SMPTE ST 2110-20 verification software) running alongside video viewers

Want to discuss Rivermax? Comment below or reach out to your local account/support team.

Here’s to seeing you at the next M&E show!

Categories
Misc

Easiest way to get/set flattened array of trainable weights (and biases)

For example, I want to be able to do something like this:

weights = model.get_trainable_weights()
weights *= 2
model.set_trainable_weights(weights)

I’ve googled it and it seems like getting the trainable weights is pretty straightforward, but I’m not finding anything on supplying a flat array of weights for the model to set.

Right now I’m manually tracking the shapes, calculating which part of the flat array belongs to each tensor, then taking that subset and reshaping it. It seems more difficult than it needs to be, plus it’s also pretty expensive computationally, taking as long as a tenth of a second just to set the weights.
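A minimal sketch of one way to do this in TF 2.x; get_trainable_weights/set_trainable_weights above are the wished-for API, so the helpers below are hypothetical names built on model.trainable_weights.

import numpy as np
import tensorflow as tf

def get_flat_weights(model):
    # Flatten every trainable variable into one 1-D numpy array.
    return np.concatenate([w.numpy().ravel() for w in model.trainable_weights])

def set_flat_weights(model, flat):
    # Slice the flat array back into each variable's shape and assign it.
    offset = 0
    for w in model.trainable_weights:
        size = int(np.prod(w.shape.as_list()))
        w.assign(tf.reshape(flat[offset:offset + size], w.shape))
        offset += size

# Usage on a small example model:
model = tf.keras.Sequential([tf.keras.Input(shape=(3,)), tf.keras.layers.Dense(4)])
flat = get_flat_weights(model)
set_flat_weights(model, flat * 2)

This avoids tracking shapes by hand; if the per-call cost still matters, wrapping the assignment loop in a tf.function is one option to explore.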

submitted by /u/Yogi_DMT

Categories
Misc

Handy data augmentation toolkit for image classification put in a single efficient TensorFlow op

submitted by /u/lnstadrum