Photonic Switched Optically Connected Memory: An Approach to Address Memory Challenges in Deep Learning

Deep learning has transformed fields including computer vision, natural language processing, and activity recognition. However, the growth of both datasets and model sizes is constraining system performance, and the variability of memory requirements across workloads can lead to poor resource utilization. Reconfigurable photonic interconnects offer a scalable way to use disaggregated memory resources efficiently. We propose a photonic switched optically connected memory system architecture that addresses these memory challenges and demonstrates optical switching for deep learning models. The architecture uses a “lite” (de)serialization scheme for memory transfers over optical links to avoid network overheads, and it supports dynamic allocation of remote memories to local processing systems. To test the feasibility of the proposal, we built an experimental testbed with a processing system and two remote memory nodes connected through silicon photonic switch fabrics, and evaluated the system performance. The measured optical switching time is 119 μs, and the overall end-to-end reconfiguration latency is 2.78 ms. Together with existing high-bandwidth optical I/Os, these results show the potential of integrating photonic switched optically connected memory into state-of-the-art processing systems.
