Booth Talk Schedule | [GSIC] Tokyo Institute of Technology | Global Scientific Information and Computing Center


GSIC

Addr.	2-12-1 O-okayama, Meguroku, Tokyo 152-8550 JAPAN


Booth Talk Schedule

Tuesday, November 13th

12:30 - 13:00

Training ImageNet on Thousands of GPUs
Rio Yokota / Tokyo Institute of Technology

14:00 - 14:30

Accelerated Deep Learning Advances in HPC
William Tang / Princeton University

15:00 - 15:30

A fast low-order implicit unstructured finite-element solver for earthquake problems using GPU-based computers
Takuma Yamaguchi (Gordon Bell Finalist) / The University of Tokyo

16:00 - 16:30

VDI as Visualization Server for TSUBAME Supercomputer
Atsushi Okawa / Tokyo Institute of Technology

Wednesday, November 14th

15:00 - 15:30

Scientific Application Development and Early Results on Summit
Tjerk Straatsma / Oak Ridge National Laboratory

Abstracts of Booth Talk

Title	Training ImageNet on Thousands of GPUs
Speaker	Rio Yokota / Tokyo Institute of Technology
Abstract	ImageNet has become a common benchmark for large-scale distributed deep learning: teams at Facebook, UC Berkeley, and Preferred Networks have independently performed runs on thousands of GPUs. The current state of the art can train ResNet-50 on ImageNet for 90 epochs in about 15 minutes. However, data-parallel implementations of such large-scale deep learning require very large batch sizes, which have a detrimental effect on both optimization and generalization. We are currently investigating alternative optimization methods that are less sensitive to increases in batch size. Large-scale runs have been conducted on TSUBAME3.0 using 2048 GPUs.

Title	Accelerated Deep Learning Advances in HPC
Speaker	William Tang / Princeton University
Abstract	Recent HPC-relevant advances in the deployment of deep learning recurrent nets have been demonstrated in exciting scaling studies of Princeton’s new deep learning code, the FRNN (Fusion Recurrent Neural Net) code, on modern GPU systems. This is clearly a “big-data” project in that it has direct access to the huge EUROFUSION/JET disruption database of over half a petabyte to drive these studies. FRNN implements a distributed, data-parallel, synchronous stochastic gradient approach with the TensorFlow and Theano libraries at the backend and MPI for communication. This deep learning software has recently demonstrated excellent scaling up to 6000 GPUs on Titan, enabled by a 2017 Oak Ridge Leadership Computing Facility (OLCF) Director’s Discretionary Award. This has enabled stimulating progress toward the goal of establishing the practical feasibility of using leadership-class supercomputers to greatly enhance the training of neural nets, with transformational impact on key discovery science application domains such as Fusion Energy Science. Powerful systems targeted for near-future deployment of our deep learning software include: (1) Japan’s new “Tsubame 3” system with 3000 P100 GPUs; (2) Switzerland’s “Piz Daint” Cray XC50 system with its 4500 P100 GPUs; and (3) OLCF’s “Summit-Dev” system. In summary, statistical deep learning software trained on very large data sets holds exciting promise for delivering much-needed predictive tools capable of accelerating scientific knowledge discovery in HPC. The associated creative methods being developed also have significant potential for cross-cutting benefit to a number of important application areas in science and industry.
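
The synchronous data-parallel stochastic gradient approach mentioned in this abstract can be illustrated with a minimal sketch. This is not FRNN code: the workers are simulated in a single process, `allreduce_mean` is a hypothetical stand-in for an MPI allreduce, and the model is a one-parameter least-squares fit chosen only to keep the example short.

```python
def grad_mse(w, xs, ys):
    """Gradient of 0.5 * mean((w*x - y)^2) with respect to w."""
    return sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def allreduce_mean(values):
    """Hypothetical stand-in for an MPI allreduce (sum) divided by the worker count."""
    return sum(values) / len(values)


def sync_sgd_step(w, shards, lr):
    """One synchronous step: each worker computes a gradient on its own shard,
    the gradients are averaged, and every worker applies the same update."""
    local_grads = [grad_mse(w, xs, ys) for xs, ys in shards]
    return w - lr * allreduce_mean(local_grads)


# Toy data following y = 3x, split evenly across 4 simulated workers.
xs = [float(i) for i in range(1, 9)]
ys = [3.0 * x for x in xs]
shards = [(xs[i:i + 2], ys[i:i + 2]) for i in range(0, 8, 2)]

# With equal-sized shards, the averaged gradient equals the full-batch gradient.
full_grad = grad_mse(0.0, xs, ys)
avg_grad = allreduce_mean([grad_mse(0.0, sx, sy) for sx, sy in shards])

w = 0.0
for _ in range(200):
    w = sync_sgd_step(w, shards, lr=0.01)
```

Because every worker sees the same averaged gradient, adding workers at a fixed per-worker shard size enlarges the effective batch, which is why scaling this scheme to thousands of GPUs leads to very large batch sizes.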

Title	A fast low-order implicit unstructured finite-element solver for earthquake problems using GPU-based computers
Speaker	Takuma Yamaguchi (Gordon Bell Finalist) / The University of Tokyo
Abstract	To understand earthquake generation and propagation processes and to reduce damage, it is important to conduct numerical simulations that account for complex geometry and material heterogeneity. To handle the massive computational cost of large domains and high resolution, we propose a fast low-order finite-element solver accelerated by GPUs. In a finite-element solver, sparse matrix-vector multiplication is computationally expensive. Dense computation introduced by a time-parallel algorithm reduces random data access in the Element-by-Element method and, combined with the use of shared memory, attains a 2.2-times speedup per vector over the original kernel. Our proposed solver on Piz Daint is 2.79 times faster than the SC14 Gordon Bell Finalist solver. We demonstrate crustal deformation computation with a 2,403,562,056 degree-of-freedom finite-element model targeting the Eastern Mediterranean crust and mantle. The same techniques are applicable to earthquake simulation in urban environments, and this framework is the basis of the solver we propose as a 2018 Gordon Bell Prize finalist.
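
As a rough illustration of the Element-by-Element idea named in this abstract, the sketch below computes a matrix-vector product directly from element matrices, without assembling the global sparse matrix, and checks it against an assembled reference. It is a CPU-only toy under stated assumptions: a 1D chain of two-node elements with a unit stiffness matrix, and hypothetical function names. The time-parallel algorithm and the GPU shared-memory optimizations of the actual solver are not shown.

```python
def ebe_matvec(elements, element_matrices, x):
    """Compute y = K x as a sum of element contributions, never forming K."""
    y = [0.0] * len(x)
    for nodes, ke in zip(elements, element_matrices):
        xe = [x[n] for n in nodes]  # gather nodal values for this element
        for i, ni in enumerate(nodes):
            # small dense product, then scatter-add into the global vector
            y[ni] += sum(ke[i][j] * xe[j] for j in range(len(nodes)))
    return y


def assemble_dense(n, elements, element_matrices):
    """Reference only: assemble the global matrix the usual way."""
    K = [[0.0] * n for _ in range(n)]
    for nodes, ke in zip(elements, element_matrices):
        for i, ni in enumerate(nodes):
            for j, nj in enumerate(nodes):
                K[ni][nj] += ke[i][j]
    return K


# Assumed toy mesh: 5 nodes in a line, 4 two-node elements with unit stiffness.
n_nodes = 5
elements = [(i, i + 1) for i in range(n_nodes - 1)]
ke = [[1.0, -1.0], [-1.0, 1.0]]
element_matrices = [ke] * len(elements)

x = [0.0, 1.0, 4.0, 9.0, 16.0]
y_ebe = ebe_matvec(elements, element_matrices, x)

K = assemble_dense(n_nodes, elements, element_matrices)
y_ref = [sum(K[i][j] * x[j] for j in range(n_nodes)) for i in range(n_nodes)]
```

The gather and scatter-add steps are where the random data access mentioned in the abstract arises: neighbouring elements read and write shared nodes at scattered positions in `x` and `y`.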

Title	VDI as Visualization Server for TSUBAME Supercomputer
Speaker	Atsushi Okawa / Tokyo Institute of Technology
Abstract	TSUBAME VDI (Virtual Desktop Infrastructure) is an experimental system for evaluating high-performance VDI for supercomputers. Supercomputers generate large data files, such as simulation results, on their storage systems. In general, users have to download those files to their local environments before checking or visualizing the results. Using a VDI system allows low-latency access to the files in place. This talk includes a demonstration of remote access to the actual VDI system in Tokyo.

Title	Scientific Application Development and Early Results on Summit
Speaker	Tjerk Straatsma / Oak Ridge National Laboratory
Abstract	Summit, the world's fastest supercomputer, located in the Oak Ridge Leadership Computing Facility at the DOE Oak Ridge National Laboratory, will provide unprecedented computational resources for open science supported by the DOE user programs. The unique aspects of its GPU-accelerated architecture are reviewed in this presentation. The collaborative efforts to prepare scientific modeling and simulation applications, as well as data-intensive computing applications, to take advantage of the architectural features of Summit are highlighted, and early scientific results enabled by the porting and development work are presented.