Optimize PHY_ofdm_mod CYCLIC_PREFIX in case of incidentally aligned pointers
The cyclic prefix length appears to be a multiple of 512 samples in most cases. When it is, the idft output pointer is already aligned, so the idft can write directly to the output and the extra memcpy is not needed. This saves the memcpy time in most cyclic prefix insertion cases.
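
A minimal sketch of the idea, not the actual PHY_ofdm_mod code. The names `c16_t`, `idft_fn`, `stub_idft`, `insert_cp_symbol`, `ALIGN_BYTES` and `MAX_FFT_SIZE` are hypothetical, and the 32-byte alignment requirement and 4-byte sample size are assumptions made for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ALIGN_BYTES 32    /* assumed alignment required by the idft output */
#define MAX_FFT_SIZE 4096 /* assumed upper bound, sizes the temporary buffer */

/* Hypothetical complex sample: 2 x int16_t = 4 bytes. */
typedef struct { int16_t re, im; } c16_t;

/* Hypothetical idft signature: writes n samples to an aligned output. */
typedef void (*idft_fn)(const c16_t *in, c16_t *out, size_t n);

/* Stand-in for a real idft so the sketch compiles: it only copies. */
static void stub_idft(const c16_t *in, c16_t *out, size_t n)
{
  memcpy(out, in, n * sizeof(c16_t));
}

static bool is_aligned(const void *p)
{
  return ((uintptr_t)p % ALIGN_BYTES) == 0;
}

/* Writes one symbol (cyclic prefix followed by fft_size samples) to out.
 * Returns true when the idft wrote directly to the output buffer. */
static bool insert_cp_symbol(idft_fn idft, const c16_t *in, c16_t *out,
                             size_t fft_size, size_t cp_len)
{
  assert(cp_len <= fft_size && fft_size <= MAX_FFT_SIZE);

  c16_t *body = out + cp_len; /* where the idft result must end up */
  bool direct = is_aligned(body);

  if (direct) {
    /* Fast path: pointer is already aligned, no temporary, no memcpy. */
    idft(in, body, fft_size);
  } else {
    /* Slow path: idft into an aligned temporary, then copy into place. */
    _Alignas(ALIGN_BYTES) c16_t temp[MAX_FFT_SIZE];
    idft(in, temp, fft_size);
    memcpy(body, temp, fft_size * sizeof(c16_t));
  }

  /* The cyclic prefix is the last cp_len samples of the symbol body. */
  memcpy(out, body + fft_size - cp_len, cp_len * sizeof(c16_t));
  return direct;
}
```

Under these assumptions, any prefix length that is a multiple of 8 samples (32 bytes) keeps `out + cp_len` aligned whenever `out` is aligned, so a multiple of 512 samples always takes the fast path.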