Early prediction of diseased brain conditions is critical for curing illness and preventing irreversible neuronal dysfunction and loss. If the different neuroimaging modalities are regarded as filtered, complementary views of the brain's anatomical and functional organization, multimodal data fusion can be hypothesized to enhance predictive power relative to unimodal prediction of disease progression. Recently, deep learning (DL) based methods applied to structural MRI (sMRI) data have outperformed classical machine learning approaches in several neuroimaging applications, including diagnostic classification and prediction. Similarly, functional MRI (fMRI) features estimated using a dynamic (i.e., time-varying) functional connectivity (FC) approach have been found to be more discriminative and predictive of clinical diagnosis than those based on the static FC approach. Motivated by these findings, we introduce a novel multimodal data fusion framework that combines deep residual learning of non-linear sMRI features with dynamic FC (dFC) based extraction of fMRI features to predict which individuals with mild cognitive impairment will progress to Alzheimer's disease within three years of the baseline scanning sessions. Cross-validated results from the developed multimodal (sMRI-fMRI) data fusion framework demonstrate a significant improvement in performance over unimodal prediction with the fMRI (p = 7.03 × 10^-7) and sMRI (p = 6.72 × 10^-4) modalities. These findings highlight the benefits of combining multiple neuroimaging modalities via data fusion, corroborate the predictive value of the tested DL and dFC features, and argue for exploring similar approaches to learn neuroanatomical and functional alterations from neuroimaging data.