TY - JOUR
T1 - Log-sum enhanced sparse deep neural network
AU - Qiao, Chen
AU - Shi, Yan
AU - Diao, Yu Xian
AU - Calhoun, Vince D.
AU - Wang, Yu Ping
N1 - Funding Information:
Vince D. Calhoun is founding director of the tri-institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS) and a Georgia Research Alliance eminent scholar in brain health and image analysis, where he holds appointments at Georgia State University, Georgia Institute of Technology and Emory University. He was previously the President of the Mind Research Network and Distinguished Professor of Electrical and Computer Engineering at the University of New Mexico. He is the author of more than 780 full journal articles and over 800 technical reports, abstracts and conference proceedings. His work includes the development of flexible methods to analyze functional magnetic resonance imaging data such as independent component analysis (ICA), deep learning for neuroimaging, data fusion of multimodal imaging and genetics data, neuroinformatics tools, and the identification of biomarkers for disease. His research is funded by the NIH and NSF among other funding agencies. Dr. Calhoun is a fellow of the Institute of Electrical and Electronics Engineers, the American Association for the Advancement of Science, the American Institute for Medical and Biological Engineering, the American College of Neuropsychopharmacology, and the International Society for Magnetic Resonance in Medicine. He served as chair of the Organization for Human Brain Mapping from 2018–2019 and is a past chair of the IEEE Machine Learning for Signal Processing Technical Committee. He currently serves on the IEEE BISP Technical Committee and is also a member of the IEEE Data Science Initiative Steering Committee.
Funding Information:
This work was supported by NSFC (No. 11471006 and No. 81601456), Science and Technology Innovation Plan of Xi’an (No. 2019421315KYPT004JC006), the Fundamental Research Funds for the Central Universities (No.xjj2017126) and was partly supported by NIH R01GM109068, R01MH104680 and the HPC Platform, Xi’an Jiaotong University.
Publisher Copyright:
© 2020 Elsevier B.V.
PY - 2020/9/24
Y1 - 2020/9/24
N2 - How to design deep neural networks (DNNs) for the representation and analysis of high-dimensional but small-sample-size data remains a major challenge. One solution is to construct a sparse network. Many approaches achieve sparsity in DNNs through regularization, but most are carried out only in the pre-training process because of the difficulty of deriving explicit formulae for the fine-tuning process. In this paper, a log-sum function is used as the regularization term for both the responses of hidden neurons and the network connections in the loss function of the fine-tuning process; it provides a better approximation to the L0-norm than several commonly used norms. Based on the gradient formula of this loss function, the fine-tuning process can be executed more efficiently. In particular, the gradient calculation commonly performed by deep learning platforms such as PyTorch or TensorFlow can be accelerated. Given the analytic formula for calculating gradients in any layer of the DNN, the error accumulated from successive numerical approximations in the differentiation process can be avoided. With the proposed log-sum enhanced sparse deep neural network (LSES-DNN), the sparsity of the responses and the connections can be well controlled to improve the adaptivity of DNNs. The proposed model is applied to MRI data for both the diagnosis of schizophrenia and the study of brain development. Numerical experiments demonstrate its superior performance over several classical classifiers tested.
AB - How to design deep neural networks (DNNs) for the representation and analysis of high-dimensional but small-sample-size data remains a major challenge. One solution is to construct a sparse network. Many approaches achieve sparsity in DNNs through regularization, but most are carried out only in the pre-training process because of the difficulty of deriving explicit formulae for the fine-tuning process. In this paper, a log-sum function is used as the regularization term for both the responses of hidden neurons and the network connections in the loss function of the fine-tuning process; it provides a better approximation to the L0-norm than several commonly used norms. Based on the gradient formula of this loss function, the fine-tuning process can be executed more efficiently. In particular, the gradient calculation commonly performed by deep learning platforms such as PyTorch or TensorFlow can be accelerated. Given the analytic formula for calculating gradients in any layer of the DNN, the error accumulated from successive numerical approximations in the differentiation process can be avoided. With the proposed log-sum enhanced sparse deep neural network (LSES-DNN), the sparsity of the responses and the connections can be well controlled to improve the adaptivity of DNNs. The proposed model is applied to MRI data for both the diagnosis of schizophrenia and the study of brain development. Numerical experiments demonstrate its superior performance over several classical classifiers tested.
KW - Back propagation algorithm
KW - Concise gradient formula
KW - Deep neural network
KW - Log-sum enhanced sparsity
KW - Magnetic resonance imaging
UR - http://www.scopus.com/inward/record.url?scp=85085583461&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85085583461&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2020.04.118
DO - 10.1016/j.neucom.2020.04.118
M3 - Article
AN - SCOPUS:85085583461
VL - 407
SP - 206
EP - 220
JO - Neurocomputing
JF - Neurocomputing
SN - 0925-2312
ER -