Hierarchical mixture of experts (HME) is a widely adopted probabilistic divide-and-conquer regression model. We extend the variational inference algorithm for HME by using automatic relevance determination (ARD) priors. Unlike standard Gaussian priors, ARD priors allow a few model parameters to take on large values while driving the others to zero. Thus, ARD priors encourage sparse models. Sparsity is known to benefit both the generalization capability and the interpretability of models. We present the variational inference algorithm for sparse HME in detail. We then evaluate the sparse HME approach by building objective speech quality assessment algorithms, which are required to determine the quality of service in telecommunication networks.