In the visual wakewords example, why is the number of samples 256 and not 512?

In the visual wakewords example, why is the number of samples only 256 when we are reading 512 (bytes?) from the PDM? Is it because two bytes are needed to make up one sample? If so, I don't see anywhere in the code where two bytes are put together to make a single sample.
I think this may be something related to the PDM?

I'm trying to recreate the visual wakewords model with an electret microphone, since I wasn't able to get a BLE Sense module (they're out of stock).

If anyone has any ideas, please let me know. For now I'm just going to use a buffer size of 256, since my ADC reads 16 bits of data?
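
To make my guess concrete, here is a minimal standalone C++ sketch of how I think the numbers work out. It is not the code from the example: `kBytesPerRead`, `combine_bytes`, and `bytes_to_samples` are names I made up, and I'm assuming 16-bit samples stored low byte first.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: one read delivers 512 bytes and each sample is 16 bits wide.
constexpr std::size_t kBytesPerRead = 512;
constexpr std::size_t kNumSamples = kBytesPerRead / sizeof(int16_t);  // 512 / 2

// Hypothetical helper: join two raw bytes (low byte first) into one signed sample.
int16_t combine_bytes(uint8_t low, uint8_t high) {
    uint16_t joined = static_cast<uint16_t>(low | (high << 8));
    return static_cast<int16_t>(joined);
}

// Hypothetical helper: convert a raw byte buffer into 16-bit samples.
// Returns the number of samples written (half the number of bytes).
std::size_t bytes_to_samples(const uint8_t* raw, std::size_t num_bytes,
                             int16_t* samples) {
    const std::size_t count = num_bytes / 2;
    for (std::size_t i = 0; i < count; ++i) {
        samples[i] = combine_bytes(raw[2 * i], raw[2 * i + 1]);
    }
    return count;
}
```

If this is right, 512 bytes is just 256 samples, and reading the bytes directly into a buffer of 16-bit values would do the combining implicitly, which might be why I can't find an explicit step for it.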