I built a custom generator that outputs X data with shape (100,2,2048) and Y labels for 16 classes, to be passed to a GRU model for video classification.
100 is the sequence length, 2 is for 2 simultaneous camera views, each with 2048 features, extracted earlier with a feature extractor.
I need to pass this to a GRU model, but when I set the input shape in the input layer to (100,2,2048), it throws an error:
Input 0 of layer “gru” incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None,100,2,2048)
Using just one camera view and setting the input shape to (100,2048) works.
What input shape do I need to set to accommodate the two cameras?
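
For reference, here is a minimal NumPy sketch of one possible way to get back to the 3-D (batch, timesteps, features) layout that the error message asks for. It assumes that concatenating the two cameras' feature vectors at each timestep is acceptable, which would give an input shape of (100,4096). The batch size of 4 and the random data are placeholders, not my real generator output. Inside a Keras model, the same merge could presumably be done with a `Reshape((100, 4096))` layer placed before the GRU.

```python
import numpy as np

# Placeholder batch shaped like the generator output:
# (batch, timesteps, cameras, features). The batch size of 4 is an assumption.
batch_size, timesteps, cameras, features = 4, 100, 2, 2048
rng = np.random.default_rng(0)
x = rng.standard_normal((batch_size, timesteps, cameras, features)).astype(np.float32)

# Merge the camera and feature axes so that each timestep carries a single
# vector of 2 * 2048 = 4096 values (camera 0 features followed by camera 1).
x_merged = x.reshape(batch_size, timesteps, cameras * features)

print(x.shape, x.ndim)
print(x_merged.shape, x_merged.ndim)
```

With this layout, the first 2048 values of each timestep come from the first camera and the last 2048 from the second, so no information is dropped. Whether the GRU benefits from seeing both views concatenated like this is a separate modelling question.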