(RP20) Development of Training Environment for Deep Learning With Medical Images on Supercomputer System Based on Asynchronous Parallel Bayesian Optimization
AI/Machine Learning/Deep Learning
Big Data Analytics
Time: Tuesday, June 18th, 8:30am - 10am
Description: Recently, deep learning has been widely applied in the field of medical image analysis. However, deep learning requires large amounts of computational power, and the optimization of its numerous hyper-parameters strongly affects model performance. If a framework for training deep learning models with hyper-parameter optimization can be realized on a supercomputer system, it is expected to accelerate deep learning with medical images. In this study, we describe our novel environment for training deep learning models with medical images on the Reedbush-H supercomputer system, based on asynchronous parallel Bayesian optimization (BO). Our training environment is composed of an automated hyper-parameter tuning module based on BO and a job submission module based on Xcrypt, a job-level parallel scripting language based on Perl. The training jobs, which use the hyper-parameters generated by the hyper-parameter tuning module, are executed alternately on the compute nodes. When training with our framework, the hyper-parameters are chosen by BO so as to maximize the value of the evaluation criterion in validation. We targeted automated detection of lung nodules in chest computed tomography images based on a 3D U-Net. In this case, we selected 11 types of hyper-parameters. The tuning performance with sequential BO was superior to that with asynchronous parallel BO. When the number of workers was eight or fewer, the tuning performance with asynchronous parallel BO was superior to that with random search. The constructed environment enabled efficient training of deep learning models with hyper-parameter tuning on the Reedbush-H supercomputer system.
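The asynchronous scheduling pattern described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: a thread pool stands in for Xcrypt job submission to the compute nodes, `objective` is a hypothetical stand-in for one training job's validation score, the two hyper-parameters (`lr_exp`, `dropout`) are invented for illustration, and `suggest` is a simple placeholder (random sampling plus perturbation of the best result so far) where a real system would fit a BO surrogate model and maximize an acquisition function. Only the control flow is the point: a new trial is proposed as soon as any worker becomes free, using whatever results have finished so far.

```python
import random
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

# Hypothetical search space: name -> (lower bound, upper bound).
BOUNDS = {"lr_exp": (-6.0, -1.0), "dropout": (0.0, 0.5)}


def objective(params):
    """Stand-in for a training job; returns a score to maximize (assumed)."""
    return -((params["lr_exp"] + 3.0) ** 2) - (params["dropout"] - 0.2) ** 2


def suggest(history, rng):
    """Placeholder for the BO step (NOT Bayesian optimization).

    Samples at random at first, then mostly perturbs the best point so far.
    """
    if len(history) < 4 or rng.random() < 0.3:
        return {k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}
    best_params, _ = max(history, key=lambda item: item[1])
    proposal = {}
    for name, (lo, hi) in BOUNDS.items():
        step = rng.gauss(0.0, 0.1 * (hi - lo))
        proposal[name] = min(hi, max(lo, best_params[name] + step))
    return proposal


def async_search(n_workers=4, n_trials=32, seed=0):
    """Keep n_workers trials in flight; refill a slot whenever one finishes."""
    rng = random.Random(seed)
    history = []   # completed trials as (params, score)
    pending = {}   # future -> params for trials still running
    submitted = 0
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        while submitted < n_trials or pending:
            # Fill every idle worker using only the results known so far.
            while submitted < n_trials and len(pending) < n_workers:
                params = suggest(history, rng)
                pending[pool.submit(objective, params)] = params
                submitted += 1
            # Block until at least one trial completes, not the whole batch.
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for future in done:
                history.append((pending.pop(future), future.result()))
    return history


if __name__ == "__main__":
    results = async_search()
    best_params, best_score = max(results, key=lambda item: item[1])
    print(len(results), best_params, best_score)
```

Because each proposal is made without waiting for the trials still running, it is based on less information than a sequential proposal would have. That trade-off is consistent with the reported result that sequential BO tuned better than asynchronous parallel BO.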