The goal of this paper is to introduce a direct visual tracking method based on an image similarity measure called the sum of conditional variance (SCV). The SCV was originally proposed in the medical imaging domain for registering multi-modal images. In the context of visual tracking, the SCV is invariant to non-linear illumination variations, applicable to multi-modal images, and computationally inexpensive. Compared to information-theoretic tracking methods, it requires fewer iterations to converge and has a significantly larger convergence radius. The novelty of this paper is a generalization of the efficient second-order minimization formulation to tracking with the SCV, which allows us to combine the efficient second-order approximation of the Hessian with a similarity metric that is invariant to non-linear illumination variations. The result is a visual tracking method that copes with non-linear illumination variations without requiring the estimation of photometric correction parameters at every iteration. We demonstrate the superior performance of the proposed method through comparative studies and tracking experiments under challenging illumination conditions and rapid motions.
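To make the similarity measure concrete, the following is a minimal sketch of one common way to compute the SCV between a reference patch and a current patch. It is an illustration under stated assumptions, not the implementation used in this paper: the function name `scv`, the parameter `bins`, and the 8-bit intensity range are hypothetical choices. The sketch assumes the SCV is the sum of squared differences between the current image and its expected intensity conditioned on the quantized reference intensity.

```python
import numpy as np


def scv(reference, current, bins=32):
    """Sum of conditional variance between two equally sized patches.

    Assumes intensities lie in [0, 256); `bins` is a hypothetical
    parameter controlling how the reference intensities are quantized.
    """
    ref = np.asarray(reference, dtype=np.float64).ravel()
    cur = np.asarray(current, dtype=np.float64).ravel()
    if ref.shape != cur.shape:
        raise ValueError("patches must have the same number of pixels")

    # Quantize the reference intensities into `bins` levels.
    labels = np.clip((ref * bins / 256.0).astype(np.intp), 0, bins - 1)

    # Mean current intensity for each reference level: E[current | reference].
    counts = np.bincount(labels, minlength=bins)
    sums = np.bincount(labels, weights=cur, minlength=bins)
    means = np.divide(sums, counts, out=np.zeros(bins), where=counts > 0)

    # Expected current intensity at every pixel, given its reference level.
    expected = means[labels]

    # Sum of squared residuals around the conditional expectation.
    return float(np.sum((cur - expected) ** 2))


if __name__ == "__main__":
    ref = np.tile(np.arange(256), 2)
    gamma = 255.0 * (ref / 255.0) ** 2  # non-linear illumination change
    print(scv(ref, gamma, bins=256))
```

In this sketch, any current image that is a function of the reference intensities (for example a gamma change or a contrast inversion) yields a score of zero when the quantization is fine enough, which illustrates the invariance to non-linear illumination variations described above. With coarser quantization the score is small rather than exactly zero.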