Appendix A — BTVC Mechanics (Advanced / Extensions)
Advanced appendix to Chapter 17 (Dynamic Linear Models). Skippable; library-free.
This appendix collects the mechanics flagged but deferred in the Chapter 17 BTVC section, following Ng et al. (2021).
Kernel-smoothed knots. Instead of a separate state \theta_t at each of the T periods, BTVC represents the coefficient trajectory with a small number J \ll T of latent knots b \in \mathbb R^J placed across time, and reconstructs the full path by smoothing: \beta = K b, where K \in \mathbb R^{T\times J} is a fixed kernel matrix whose entry K_{tj} is the smooth (e.g. Gaussian or cubic) weight that period t places on knot j. The number of free parameters per coefficient path is therefore J, decoupled from the series length T. This is the structural contrast with the DLM, whose state count grows with T. Smoothness is imposed by the kernel rather than by a random-walk variance.
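The reconstruction \beta = K b can be sketched in a few lines of NumPy. This is an illustration only, not the construction used by Ng et al. (2021): the evenly spaced knots, the Gaussian kernel, the bandwidth choice, the row normalization, and the knot values are all assumptions made for the example, and gaussian_kernel_matrix is a hypothetical helper name.

```python
import numpy as np

def gaussian_kernel_matrix(T, J, bandwidth):
    """Return a (T, J) matrix of Gaussian weights of each period on each knot.

    Assumptions for this sketch: knots are evenly spaced over the sample
    and each row is normalized to sum to 1.
    """
    t = np.arange(T, dtype=float)[:, None]         # period positions, shape (T, 1)
    knots = np.linspace(0.0, T - 1.0, J)[None, :]  # knot positions, shape (1, J)
    K = np.exp(-0.5 * ((t - knots) / bandwidth) ** 2)
    return K / K.sum(axis=1, keepdims=True)        # row-normalize (assumed)

T, J = 104, 6                                      # e.g. two years of weekly data, six knots
K = gaussian_kernel_matrix(T, J, bandwidth=(T - 1) / (J - 1))

b = np.array([0.2, 0.5, 0.9, 0.7, 0.4, 0.3])       # illustrative knot values, not estimates
beta = K @ b                                       # full coefficient path, length T
```

Only the six entries of b are free; the 104 values of beta are determined by them. With row-normalized weights, each beta_t is a weighted average of the knots, so the path stays between the smallest and largest knot value.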
Folded-normal positivity hierarchy. Marketing coefficients should be non-negative. BTVC places a two-layer hierarchy on the knots: a global scale draws from a half-normal, and each knot is a folded normal (the absolute value of a normal) centered on a shrinkage mean. The folding makes every knot non-negative, so \beta = K b \ge 0 holds exactly whenever the kernel weights K_{tj} are themselves non-negative (as they are for a Gaussian kernel). The hierarchy shares strength across knots and shrinks small effects toward zero, giving positivity and regularization in one prior.
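A prior draw from such a hierarchy might look like the sketch below. It is one possible reading of the two layers, not the exact parameterization of Ng et al. (2021): the names global_sd and knot_sd, their values, and the stand-in weight matrix K are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

T, J = 52, 5
global_sd = 1.0   # hypothetical scale of the top-level half-normal
knot_sd = 0.3     # hypothetical spread of the knots around the shared mean

# Layer 1: a shared, non-negative shrinkage mean drawn from a half-normal.
mu = np.abs(rng.normal(0.0, global_sd))

# Layer 2: each knot is a folded normal centered on that shared mean.
b = np.abs(rng.normal(mu, knot_sd, size=J))

# Any non-negative weight matrix preserves the sign constraint.
# K here is a stand-in with random non-negative, row-normalized weights.
K = rng.random((T, J))
K /= K.sum(axis=1, keepdims=True)
beta = K @ b
```

Because every entry of b and of K is non-negative, every entry of beta is non-negative by construction; no constraint has to be enforced during fitting.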
Why variational inference. With many channels and long histories, FFBS/MCMC over the full posterior is expensive. BTVC fits with stochastic variational inference: it posits a parametric family q for the posterior and optimizes its parameters to minimize the KL divergence from q to the true posterior (equivalently, to maximize the evidence lower bound), using minibatch gradients. This trades the DLM’s exact recursions for speed and scalability, at the cost of a variational approximation gap: the posterior is approximate, and uncertainty can be understated. The trade is deliberate, since BTVC targets production deployment, where scalability and built-in positivity outweigh the loss of exactness.
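The understated-uncertainty point can be seen in a toy case with a closed-form answer. The sketch assumes a fully factorized (mean-field) Gaussian family, which is an assumption made here for illustration and not a statement about the family BTVC uses. For a Gaussian target with precision matrix \Lambda, the factorized Gaussian that minimizes KL(q \| p) has variances 1 / \Lambda_{ii}, which never exceed the true marginal variances \Sigma_{ii}.

```python
import numpy as np

rho = 0.9                                    # correlation in the toy "true" posterior
Sigma = np.array([[1.0, rho],
                  [rho, 1.0]])               # true posterior covariance (toy example)
Lambda = np.linalg.inv(Sigma)                # true posterior precision

true_var = np.diag(Sigma)                    # exact marginal variances: [1.0, 1.0]
mf_var = 1.0 / np.diag(Lambda)               # mean-field optimum: 1 - rho**2 each

print("true marginal variance:", true_var)
print("mean-field variance:   ", mf_var)     # approximately [0.19, 0.19]
```

With a correlation of 0.9 the factorized approximation reports about a fifth of the true marginal variance. The gap shrinks to zero as the correlation goes to zero, so how much uncertainty is lost depends on how strongly the posterior couples its parameters.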