Motivation/Purpose: This work reports the development and validation of an algorithm to automatically detect and localize vertebrae in CT images of patients undergoing spine surgery. Slice-by-slice detections from state-of-the-art 2D convolutional neural network (CNN) architectures were combined to estimate vertebral centroid locations in 3D, including a method that combined detections in sagittal and coronal slices. The solution facilitates applications in image-guided surgery and automatic computation of image analytics for surgical data science.
Methods: CNN-based object detection models in 3D (volume) and 2D (slice) images were implemented and evaluated for the task of vertebrae detection. Slice-by-slice detections from the 2D architectures were combined to estimate the 3D centroid location. One model, termed Ortho-2D, simultaneously evaluated 2D detections in orthogonal directions (i.e., sagittal and coronal slices) to improve robustness against spurious false detections. Performance was evaluated in a data set consisting of 85 patients undergoing spine surgery at our institution, including images presenting spinal instrumentation/implants, spinal deformity, and anatomical abnormalities that are realistic exemplars of pathology in the patient population. Accuracy was quantified in terms of precision, recall, F1 score, and the 3D geometric error in vertebral centroid annotation compared to ground truth (expert manual) annotation.
Results: Three CNN object detection models successfully localized vertebrae, with the Ortho-2D model that combined 2D detections in orthogonal directions achieving the best performance: precision = 0.95, recall = 0.99, and F1 score = 0.97. Overall centroid localization accuracy was 3.4 mm (median) [interquartile range (IQR) = 2.7 mm], and ∼97% of detections (154/159 lumbar cases) yielded an acceptable centroid localization error of <15 mm (considering an average vertebra size of ∼25 mm).
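The reported metrics can be illustrated with a minimal sketch. It assumes the F1 score is the usual harmonic mean of precision and recall, and that centroid error is the 3D Euclidean distance (in mm) between predicted and ground-truth centroids. The function names and the `tol_mm` parameter are hypothetical and are not taken from the authors' implementation.

```python
import math
import statistics


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def centroid_error_summary(pred_mm, truth_mm, tol_mm=15.0):
    """Summarize 3D centroid errors for matched prediction/ground-truth pairs.

    pred_mm, truth_mm: equal-length sequences of (x, y, z) points in mm.
    tol_mm: assumed acceptability threshold (15 mm, as quoted in the Results).
    """
    errors = [math.dist(p, t) for p, t in zip(pred_mm, truth_mm)]
    # Quartiles with linear interpolation between order statistics.
    q1, _, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    return {
        "median": statistics.median(errors),
        "iqr": q3 - q1,
        "within_tol": sum(e < tol_mm for e in errors) / len(errors),
    }


# Precision and recall as reported for Ortho-2D.
print(round(f1_score(0.95, 0.99), 2))
```

With precision = 0.95 and recall = 0.99, the harmonic mean rounds to 0.97, consistent with the reported F1 score.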
Conclusions: State-of-the-art CNN architectures were adapted for vertebral centroid annotation, yielding accurate and robust localization even in the presence of anatomical abnormalities, image artifacts, and dense instrumentation. The methods are employed as a basis for streamlined image guidance (automatic initialization of 3D-2D and 3D-3D registration methods in image-guided surgery) and as an automatic spine labeling tool to generate image analytics.
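The idea of combining slice-by-slice detections in orthogonal directions can be sketched as follows. This is an illustrative simplification, not the Ortho-2D implementation: it assumes each 2D detection has already been converted to a 3D point in a common coordinate frame in mm (a sagittal detection contributes its slice position as x, a coronal detection its slice position as y), and it uses a simple greedy clustering that the abstract does not describe. The names `cluster_points` and `fuse_orthogonal`, the 12 mm radius (roughly half the ∼25 mm average vertebra size), and the `min_slices` threshold are all hypothetical.

```python
import math


def mean_point(cluster):
    """Component-wise mean of a list of (x, y, z) points."""
    return tuple(sum(p[i] for p in cluster) / len(cluster) for i in range(3))


def cluster_points(points, radius_mm):
    """Greedy clustering: a point joins the first cluster whose current
    mean lies within radius_mm; otherwise it starts a new cluster."""
    clusters = []
    for p in points:
        for c in clusters:
            if math.dist(p, mean_point(c)) <= radius_mm:
                c.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def fuse_orthogonal(sagittal_pts, coronal_pts, radius_mm=12.0, min_slices=2):
    """Return centroids supported by detections in BOTH slice directions.

    Candidates seen in fewer than min_slices slices, or seen in only one
    direction, are treated as spurious and dropped.
    """
    sag = [mean_point(c) for c in cluster_points(sagittal_pts, radius_mm)
           if len(c) >= min_slices]
    cor = [mean_point(c) for c in cluster_points(coronal_pts, radius_mm)
           if len(c) >= min_slices]
    fused = []
    for s in sag:
        if not cor:
            break
        nearest = min(cor, key=lambda q: math.dist(s, q))
        if math.dist(s, nearest) <= radius_mm:
            # Average the two directional estimates.
            fused.append(tuple((a + b) / 2 for a, b in zip(s, nearest)))
    return fused


# One vertebra near the origin, plus false detections seen in one direction only.
sagittal = [(-4, 0.5, 0), (0, 0, 0.5), (4, -0.5, -0.5), (50, 50, 50), (52, 50, 50)]
coronal = [(0.5, -4, 0), (0, 0, 0), (-0.5, 4, 0), (0, 0, 80)]
print(fuse_orthogonal(sagittal, coronal))
```

In this toy input, the sagittal-only cluster near (51, 50, 50) and the single coronal detection at (0, 0, 80) are rejected, leaving one fused centroid at the origin. Requiring agreement between directions is what suppresses the spurious detections in this sketch.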