Presentation Abstract / Keywords |
Presentation title |
2017-07-27 17:00
A study on Network Structure and Parameter Exchange Method in large-scale Cluster for Machine Learning ○Duo Zhang, Mingxi Li (Univ. of Tsukuba), Yusuke Tanimura, Hidemoto Nakada (AIST) CPSY2017-29 |
Abstract |
(Japanese) |
(Not yet registered) |
(English) |
For modern machine learning systems, including deep learning systems, parallelization is unavoidable because they must process massive amounts of training data. One active topic in this area is data-parallel learning, in which multiple nodes cooperate by periodically exchanging parameters or gradients. In this paper, we focus on the network resource requirements of this kind of application. In addition to the 2-layered fat-tree network that we have already reported, we investigate a 3-layered Clos network and an omega network. As parameter exchange methods, we tested a direct parameter exchange method and a centralized server method. We evaluated these three types of network with SimGrid, a simulator for distributed environments, and confirmed that, with a suitable parameter exchange method, performance can be maintained at a higher oversubscription factor. |
Keywords |
(Japanese) |
/ / / / / / / |
(English) |
Machine learning / Parameter Server / Simulation / Clos network |
Bibliographic information |
IEICE Tech. Rep., vol. 117, no. 153, CPSY2017-29, pp. 145-150, July 2017. |
Paper number |
CPSY2017-29 |
Date of issue |
2017-07-19 (CPSY) |
ISSN |
Print edition: ISSN 0913-5685 Online edition: ISSN 2432-6380 |
Copyright |
The copyright of papers published in the Technical Report belongs to IEICE. (Permission No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |