Authors - Abraham Gezehei, Thomas Hanne, Rolf Dornberger

Abstract - This study benchmarks twelve recurrent neural network (RNN) architectures for univariate macroeconomic time-series forecasting: LSTM and GRU baselines, wider and deeper variants, bidirectional encoders, an attention-like pooling variant, convolutional–recurrent hybrids, and strongly regularized models. Following the Libra benchmarking philosophy and the multi-metric evaluation advocated by Prater et al., we compare all configurations under identical protocols on 100 series from the Libra Economics collection. A bidirectional GRU yields the best RNN accuracy (sMAPE 41.0, MASE 0.0447), improving over a comparable 2-layer GRU baseline (sMAPE 41.9) at the cost of higher wall-clock runtime. Most other architectural additions and capacity increases (deeper or wider models, pooling-based attention, CNN–RNN hybrids, heavy dropout) do not improve on the simple GRU baseline. The results suggest that short input windows (sized dynamically at 10% of the series length, with a minimum of 10 steps) limit the benefits of architectural complexity in this setting. Classical statistical methods (sNaive, ETS, Theta) outperform all neural models by a wide margin while requiring substantially less computation. For these low-frequency macroeconomic series, shallow GRU variants, especially bidirectional encoders, are the strongest RNN option, but classical baselines remain the practical choice.
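The window-sizing rule and the two reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation. The function names are hypothetical, and several details are assumptions the abstract does not state: the 10% window is rounded down, sMAPE uses the common percent-scaled definition with |y| + |f| in the denominator, and MASE is scaled by the in-sample naive forecast error at a lag of `season` (default 1).

```python
import math
from typing import Sequence


def window_length(series_length: int, fraction: float = 0.10, minimum: int = 10) -> int:
    """Input window: a fraction of the series length, never below `minimum`.

    Rounding down is an assumption; the abstract gives only the 10% rule
    and the 10-step floor.
    """
    return max(minimum, math.floor(series_length * fraction))


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric MAPE in percent (assumed definition, range 0 to 200)."""
    terms = []
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        # Treat a 0/0 term as zero error rather than dividing by zero.
        terms.append(0.0 if denom == 0 else 2.0 * abs(a - f) / denom)
    return 100.0 * sum(terms) / len(terms)


def mase(
    actual: Sequence[float],
    forecast: Sequence[float],
    train: Sequence[float],
    season: int = 1,
) -> float:
    """MAE of the forecast divided by the in-sample MAE of a lag-`season` naive forecast."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive_errors = [abs(train[i] - train[i - season]) for i in range(season, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("training series is constant at the chosen lag; MASE is undefined")
    return mae / scale


if __name__ == "__main__":
    print(window_length(50))    # short series falls back to the 10-step minimum
    print(window_length(240))   # longer series uses 10% of its length
    print(smape([100.0, 200.0], [110.0, 180.0]))
    print(mase([5.0, 6.0], [4.0, 8.0], train=[1.0, 2.0, 4.0, 7.0]))
```

Under this rule, any series shorter than 110 observations receives the 10-step minimum window, which is consistent with the abstract's point that the models see only short input contexts.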