TY - JOUR
T1 - Bio-Swarm-Pipeline
T2 - A light-weight, extensible batch processing system for efficient biomedical data processing
AU - Cheng, Xi
AU - Pizarro, Ricardo
AU - Tong, Yunxia
AU - Zoltick, Brad
AU - Luo, Qian
AU - Weinberger, Daniel R.
AU - Mattay, Venkata S.
PY - 2009/10/9
Y1 - 2009/10/9
AB - A streamlined scientific workflow system that can track the details of the data processing history is critical for the efficient handling of fundamental routines used in scientific research. In the scientific workflow research community, the information that describes the details of data processing history is referred to as "provenance," which plays an important role in most existing workflow management systems. Despite its importance, however, provenance modeling and management is still a relatively new area in this community. The proper scope, representation, granularity, and implementation of a provenance model can vary from domain to domain and pose a number of challenges for efficient pipeline design. This paper provides a case study on structured provenance modeling and management problems in the neuroimaging domain by introducing the Bio-Swarm-Pipeline. This new model, which is evaluated in the paper through real-world scenarios, systematically addresses the provenance scope, representation, granularity, and implementation issues related to the neuroimaging domain. Although this model stems from applications in neuroimaging, the system can potentially be adapted to a wide range of biomedical application scenarios.
KW - Neuroimaging
KW - Neuroinformatics
KW - Provenance
KW - Scientific workflow
KW - Swarm
UR - http://www.scopus.com/inward/record.url?scp=78149240327&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78149240327&partnerID=8YFLogxK
U2 - 10.3389/neuro.11.035.2009
DO - 10.3389/neuro.11.035.2009
M3 - Article
C2 - 19847314
AN - SCOPUS:78149240327
SN - 1662-5196
VL - 3
JO - Frontiers in Neuroinformatics
JF - Frontiers in Neuroinformatics
IS - OCT
M1 - 35
ER -