You are confusing the fact that recurrent networks are trained unrolled, i.e. with finite history, with them being evaluated with finite history, which they generally aren't: at evaluation time the hidden state is simply carried forward step by step, so the effective history is unbounded.
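To make that distinction concrete, here is a minimal sketch (my own toy cell, not anything from the discussion above): during training you would unroll this over a fixed window, but at evaluation you just keep feeding the state back in for as long as the stream lasts.

```python
import numpy as np

rng = np.random.default_rng(0)
W_h = rng.normal(scale=0.1, size=(4, 4))  # hidden-to-hidden weights
W_x = rng.normal(scale=0.1, size=(4, 1))  # input-to-hidden weights

def step(h, x):
    # One recurrent step; h summarizes the entire past, not a finite window.
    return np.tanh(W_h @ h + W_x @ x)

h = np.zeros((4, 1))
for t in range(1000):              # stream of arbitrary length
    x = np.array([[np.sin(0.1 * t)]])
    h = step(h, x)                 # state persists across the whole stream
```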
The key advantage of a Kalman filter is that its long-term properties are predictable: things like observability, controllability, and BIBO stability.
The key downside of Kalman filters is that these nice properties only hold for linear systems with linear observations.
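For reference, the whole linear filter is just a predict/update pair. A minimal sketch, assuming a 1-D constant-velocity model with position-only measurements (the model and noise values are illustrative):

```python
import numpy as np

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition (pos, vel)
H = np.array([[1.0, 0.0]])              # observe position only
Q = 0.01 * np.eye(2)                    # process noise covariance
R = np.array([[1.0]])                   # measurement noise covariance

x = np.zeros((2, 1))                    # state estimate
P = np.eye(2)                           # estimate covariance

def kf_step(x, P, z):
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(2) - K @ H) @ P
    return x, P

for t in range(50):
    z = np.array([[2.0 * t]])           # noiseless position, for illustration
    x, P = kf_step(x, P, z)
```

Because everything here is a fixed linear map, the filter's convergence and stability can be analyzed once, offline, from F, H, Q, and R alone; that is exactly the predictability an unrolled network does not give you.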
The EKF works by linearizing the process model, and it suffers on even moderately nonlinear models as a result. It also suffers from the Gaussian assumption. So people switched to unscented filters (which try to model the underlying probability distribution better) and particle filters (same idea, taken further), each giving better accuracy on most problems. Now plenty of problems are switching to full Bayesian models as the ultimate in accuracy.
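The particle-filter end of that spectrum is easy to sketch. A minimal bootstrap particle filter, assuming a made-up nonlinear process model and an identity measurement for simplicity: no linearization anywhere, and the posterior is whatever shape the weighted samples take.

```python
import numpy as np

rng = np.random.default_rng(42)

def f(x):
    # Hypothetical nonlinear process model, chosen only for illustration.
    return np.sin(x) + 0.5 * x

N = 1000
particles = rng.normal(0.0, 1.0, N)     # samples from the prior
true_x = 0.5
meas_std = 0.1

for _ in range(20):
    true_x = f(true_x)
    z = true_x + rng.normal(0.0, meas_std)              # noisy observation
    particles = f(particles) + rng.normal(0.0, 0.1, N)  # propagate samples
    w = np.exp(-0.5 * ((z - particles) / meas_std) ** 2)  # likelihood weights
    w /= w.sum()
    # Resample: concentrate particles where the likelihood mass is.
    particles = particles[rng.choice(N, N, p=w)]

estimate = particles.mean()
```

Note the tradeoff the comment describes: this handles the nonlinearity exactly where the EKF's linearization would bias the estimate, but it costs N model evaluations per step instead of a couple of matrix multiplies.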
So if you like the KF, EKF, or UKF, be sure to look into this entire chain of accuracy-vs-computation tradeoff algorithms.