We show that the Mallows model averaging estimator proposed by Hansen (2007) can be written as the solution to a least squares problem with a weighted $L_1$ penalty and additional constraints. By exploiting this representation, we demonstrate that the weight vector obtained by this model averaging procedure is sparse, in the sense that a subset of models receives exactly zero weight. Moreover, this representation allows us to adapt algorithms developed to efficiently solve minimization problems with many parameters and a weighted $L_1$ penalty. In particular, we develop a new coordinate-wise descent algorithm for model averaging. Simulation studies show that, when there are many models, the new algorithm computes the model averaging estimator much faster and requires less memory than conventional methods.
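
As a sketch of the representation, in notation that is assumed here rather than taken from the text above: let $y$ be the response vector, $\hat{\mu}_m$ the fitted values from the $m$-th of $M$ candidate models, $k_m$ the number of parameters in that model, $\sigma^2$ the error variance (replaced in practice by an estimate), and $w = (w_1, \dots, w_M)$ the weight vector. The Mallows criterion, minimized over the unit simplex, is commonly written as

$$
\hat{w} = \arg\min_{w} \; \Big\| y - \sum_{m=1}^{M} w_m \hat{\mu}_m \Big\|^2 + 2\sigma^2 \sum_{m=1}^{M} k_m w_m
\quad \text{subject to} \quad w_m \ge 0, \;\; \sum_{m=1}^{M} w_m = 1 .
$$

Because $w_m \ge 0$ on the feasible set, $k_m w_m = k_m |w_m|$, so the second term equals $2\sigma^2 \sum_{m=1}^{M} k_m |w_m|$: a weighted $L_1$ penalty whose weights are proportional to model dimension. Under this reading, the nonnegativity and sum-to-one conditions play the role of the additional constraints, and, as with lasso-type problems, the solution can place exactly zero weight on some models.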