Memory impact of 1DConv vs. Dense layers


I have two types of networks.

Type 1: 2-layer 1DConv + Dense
Type 2: 3-layer 1DConv + Dense

I perform hyperparameter tuning (with wandb) and find an optimal network (best parameters) for each type. Both achieve the same MSE on unseen test data. During my experiments I also track each model's memory usage: I convert the model to a quantized TFLite model (post-training quantization).

What surprises me is that the ("larger") 3-layer 1DConv + Dense needs less memory than the 2-layer 1DConv + Dense (given the same MSE, i.e. the same regression performance). If you tune only the parameters of the Dense layer, it has a large impact on memory, probably because the Dense layer is fully connected and therefore has more weights to store. Of course, comparing the two models remains difficult because so many parameters can be tuned…

What I am currently looking for is a better theoretical and fundamental understanding of the memory impact of each type of layer (1DConv, 2DConv, … and Dense) inside a network.


Well, as you were saying, each layer's memory usage is mainly driven by its number of weights, so the footprint of a 1D Conv vs. a 2D Conv vs. a Dense layer is entirely determined by the number of neurons / the size and number of filters in those layers. When you try to cross that with accuracy, the result will be very dependent on the application at hand and likely won't generalize. That said, here is an article that explains in more detail how to compute the number of parameters in CNNs: Number of Parameters and Tensor Sizes in a Convolutional Neural Network (CNN) | LearnOpenCV
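To make this concrete, here is a minimal sketch (plain Python, with hypothetical layer sizes picked for illustration) of the standard parameter-count formulas. Since a post-training-quantized model stores roughly one byte per weight (int8), parameter count is a reasonable proxy for model size. The key point it illustrates: a Conv layer's parameter count is independent of the input length, while a Dense layer applied after flattening grows with it — which is likely why the Dense layer dominates your memory usage.

```python
def conv1d_params(kernel_size, in_channels, filters):
    # Each filter has kernel_size * in_channels weights plus one bias.
    return (kernel_size * in_channels + 1) * filters

def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # Same idea, with a 2D kernel spanning both spatial dimensions.
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_features, out_features):
    # Fully connected: every input feeds every output, plus biases.
    return (in_features + 1) * out_features

# Hypothetical example: a sequence of length 128 with 32 channels.
# A Conv1D layer (64 filters, kernel size 3) is cheap and its cost
# does not depend on the sequence length at all:
print(conv1d_params(3, 32, 64))        # 6208 parameters

# Flattening the same 128 x 32 feature map and feeding a Dense(64)
# layer is ~40x more expensive, and scales with the sequence length:
print(dense_params(128 * 32, 64))      # 262208 parameters
```

This is also consistent with your observation that the deeper 3-layer conv model can be smaller: each extra conv layer typically shrinks the feature map (via striding or pooling), so the flattened input to the final Dense layer is smaller, and the savings there can outweigh the added conv weights.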