Publications

Journal Publications

2023

[J35]Liancheng Jia, Zizhang Luo, Liqiang Lu, Yun Liang. “Automatic Generation of Spatial Accelerator for Tensor Algebra,” in the IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol: 42, Issue: 8, June 2023, pp. 1898-1911.

2022

[J34]Tao Wei, Yonghong Tian, Yaowei Wang, Yun Liang, Changwen Chen, “Optimized separable convolution: Yet another efficient convolution operator,” in AI Open, October, 2022.
[J33]Size Zheng, Renze Chen, Yicheng Jin, Anjiang Wei, Bingyang Wu, Xiuhong Li, Shengen Yan, Yun Liang. “NeoFlow: A Flexible Framework for Enabling Efficient Compilation for High Performance DNN Training,” in the IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol: 33, Issue: 11, November 2022.
[J32]Liqiang Lu, Yun Liang. “Morphling: A Reconfigurable Architecture for Tensor Computation,” in the IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol: 41, Issue: 11, November 2022.
[J31]Yun Liang, Qingcheng Xiao, Liqiang Lu, Jiaming Xie. “FCNNLib: A Flexible Convolution Algorithm Library for Deep Learning on FPGAs,” in the IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol: 41, Issue: 8, August 2022.
[J30]Yun Liang, Liqiang Lu, Yicheng Jin, Jiaming Xie, Ruirui Huang, Jiansong Zhang, Wei Lin. “An Efficient Hardware Design for Accelerating Sparse CNNs with NAS-based Models,” in the IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol: 41, Issue: 3, March 2022.

2021

[J29]Tao Yang, Zhezhi He, Tengchuan Kou, Qi Han, Haibao Yu, Fangxin Liu, Yun Liang, Li Jiang. “BISWSRBS: A Winograd-based CNN Accelerator with a Fine-grained Regular Sparsity Pattern and Mixed Precision Quantization,” in the ACM Transactions on Reconfigurable Technology and Systems (TRETS), 14, 4, Article 18, September, 2021.
[J28]Yun Liang, Liqiang Lu, Jiaming Xie. “OMNI: A Framework for Integrating Hardware and Software Optimizations for Sparse CNNs,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol: 40, Issue: 8, Aug 2021, pp. 1648-1661.

2020

[J27]Jingchen Zhu, Guangyu Sun, Xian Zhang, Chao Zhang, Weiqi Zhang, Yun Liang, Tao Wang, Yiran Chen, Jia Di. “Fork Path: Batching ORAM Requests to Remove Redundant Memory Accesses,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol: 39, Issue: 10, Oct 2020, pp.2279-2292.
[J26]Liancheng Jia, Liqiang Lu, Xuechao Wei, Yun Liang. “Generating Systolic Array Accelerators with Reusable Blocks,” IEEE MICRO (Special Issue on Agile and Open-Source Hardware), Vol: 40, Issue: 4, July, 2020, pp. 85-92.
[J25]Liancheng Jia, Yun Liang, Xiuhong Li, Liqiang Lu, Shengen Yan. “Enabling Efficient Fast Convolution Algorithms on GPUs via MegaKernels,” IEEE Transactions on Computers (TC), Vol: 69, Issue: 7, July 2020, pp. 986-997.
[J24]Qingcheng Xiao, Yun Liang. “Fune: An FPGA Tuning Framework for CNN Acceleration,” IEEE Design & Test (D&T), Vol 37, Issue 1, Feb. 2020.
[J23]Yun Liang, Liqiang Lu, Qingcheng Xiao, Shengen Yan. “Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol: 39, Issue: 4, April 2020, pp. 857-870.

2019

[J22]Jieru Zhao, Liang Feng, Sharad Sinha, Wei Zhang, Yun Liang, Bingsheng He. “Performance Modeling and Directives Optimization for High Level Synthesis on FPGA,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2019.

2018

[J21]Yun Liang, Shuo Wang, Wei Zhang. “FlexCL: A Model of Performance and Power for OpenCL Workloads on FPGAs,” IEEE Transactions on Computers (TC), Vol 67, No. 12, Dec. 2018.
[J20]Zhenxin Fu, Lei Yang, Wenbin Hou, Zhuohan Li, Yifan Wu, Yihua Cheng, Xiaolin Wang, Yun Liang, “Reproducing Vectorization of the Tersoff Multi-Body Potential on the Intel Broadwell Architecture,” Parallel Computing 78 (2018) 28–32.
[J19]Yun Liang, Xiaolong Xie, Yu Wang, Guangyu Sun, Tao Wang. “Optimizing Cache Bypassing and Warp Scheduling for GPUs,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol 37, No. 8, August 2018.
[J18]Xiaolong Xie, Yun Liang, Xiuhong Li, Yudong Wu, Guangyu Sun, Tao Wang, and Dongrui Fan. “CRAT: Enabling Coordinated Register Allocation and Thread-level Parallelism Optimization for GPUs,” IEEE Transactions on Computers (TC), Vol 67, No. 6, June 2018.
[J17]Xinfeng Xie, Dayou Du, Qian Li, Yun Liang, Wai Teng Tang, Zhong Liang Ong, Mian Lu, Huynh Phung Huynh, Rick Siow Mong Goh. “Exploiting Sparsity to Accelerate Fully Connected Layers of CNN-based Applications on Mobile SoCs,” ACM Transactions on Embedded Computing Systems (TECS), Vol 17, Issue 2, February, 2018.

2017

[J16]Lei Yang, Yilong Li, Zhenxin Fu, Zhuohan Li, Wenbin Hou, Haoze Wu, Xiaolin Wang, Yun Liang, “ParConnect Reproducibility Report,” Parallel Computing, 70 (2017) 22-26.
[J15]Yun Liang, Waiteng Tang, Ruizhe Zhao, Mian Lu, Huynh Phung Huynh, Rick Siow Mong Goh. “Scale-free Sparse Matrix-Vector Multiplication on Many-Core Architectures,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol 36, Issue 12, Dec 2017.
[J14]Yun Liang, Xiuhong Li. “Efficient Kernel Management on GPUs,” ACM Transactions on Embedded Computing Systems (TECS), Vol 16, Issue 4, May 2017.

2016

[J13]Yun Liang, Muhammad T. Satria, Kyle Rupnow, Deming Chen. “An Accurate GPU Performance Model for Effective Control Flow Divergence Optimization,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 35, No. 7, July 2016.
[J12]Yao Chen, Swathi T. Gurumani, Yun Liang, Guofeng Li, Donghui Guo, Kyle Rupnow, Deming Chen. “FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow,” IEEE Transactions on Very Large Scale Integration Systems (TVLSI), Vol. 24, No. 6, pp. 2220–2233, June 2016.
[J11]Ying Chen, Tan Nguyen, Yao Chen, Swathi Gurumani, Yun Liang, Kyle Rupnow, Jason Cong, Wen-mei Hwu, Deming Chen. “FCUDA-HB: Hierarchical and Scalable Bus Architecture Generation on FPGAs With the FCUDA Flow,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 35, No. 12, April 2016.
[J10]Yun Liang, Shuo Wang. “Performance-Centric Optimization for Racetrack Memory Based Register File on GPUs,” Journal of Computer Science and Technology (JCST), Vol 31, No. 1, January 2016.

2015

[J9]Mian Lu, Yun Liang, Huynh Phung Huynh, Zhongliang Ong, Bingsheng He, Rick Siow Mong Goh. “MrPhi: An Optimized MapReduce Framework on Intel Xeon Phi Coprocessors,” IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 26, No. 11, pp. 3066-3078, November 2015.
[J8]Yun Liang, Xiaolong Xie, Guangyu Sun, Deming Chen. “An Efficient Compiler Framework for Cache Bypassing on GPUs,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 34, No. 10, pp. 1677-1690, October 2015.
[J7]Yun Liang, Tulika Mitra, Lei Ju. “Instruction Cache Locking using Temporal Reuse Profile,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 34, No. 9, pp. 1387-1400, August 2015.
[J6]Yun Liang, Huynh Phung Huynh, Kyle Rupnow, Rick Siow Mong Goh, Deming Chen. “Efficient GPU Spatial-Temporal Multitasking,” IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 26, No. 3, pp. 748-760, March 2015.

2013

[J5]Yun Liang, Tulika Mitra. “An Analytical Approach for Fast and Accurate Design Space Exploration of Instruction Caches,” ACM Transactions on Embedded Computing Systems (TECS), 13(3), Article 43, December, 2013.

2012

[J4]Yun Liang, Huping Ding, Tulika Mitra, Abhik Roychoudhury, Yan Li, Vivy Suhendra. “Timing Analysis of Concurrent Programs Running on Shared Cache Multi-cores,” Real-Time Systems Journal (RTS) 48(6), November, 2012.
[J3]Yun Liang, Kyle Rupnow, Yinan Li, Dongbo Min, Minh Do, and Deming Chen. “High Level Synthesis: Productivity, Performance and Software Constraints,” Journal of Electrical and Computer Engineering, Special Issue on ESL Design Methodology, Volume 2012 (2012), 649057, 2012.

2009

[J2]Lei Ju, Yun Liang, Samarjit Chakraborty, Tulika Mitra, Abhik Roychoudhury. “Cache-aware optimization of BAN applications,” Journal of Design Automation for Embedded System, Volume 13 (3), September, 2009.

2007

[J1]Xianfeng Li, Yun Liang, Tulika Mitra, Abhik Roychoudhury. “Chronos: A Timing Analyzer for Embedded Software,” Science of Computer Programming, Special issue on Experimental Software and Toolkit, 69(1-3), December 2007.

Conference Papers

2024

[C97]Youwei Xiao, Zizhang Luo, Kexing Zhou, Yun Liang. “Cement: Streamlining FPGA Hardware Design with Cycle-Deterministic eHDL and Synthesis”, to appear in the proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), March, 2024.
[C96]Xiaochen Hao, Hongbo Rong, Mingzhe Zhang, Ce Sun, Hong Jiang, Yun Liang. “POPA: Expressing High and Portable Performance across Spatial and Vector Architectures for Tensor Computations”, to appear in the proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), March, 2024.
[C95]Yijie Chen, Jiarui Zhang, Tao Wang, Yun Liang. “Trend-Aware Supervision: On Learning Invariance for Semi-Supervised Facial Action Unit Intensity Estimation”, to appear in the proceedings of the 38th Annual AAAI Conference on Artificial Intelligence (AAAI), February, 2024.

2023

[C94]Size Zheng, Siyuan Chen, Siyuan Gao, Liancheng Jia, Guangyu Sun, Runsheng Wang, and Yun Liang. “TileFlow: A Framework for Modeling Fusion Dataflow via Tree-based Analysis”, in the proceedings of the 56th International Symposium on Microarchitecture (MICRO), 2023.
[C93]Kexing Zhou, Yun Liang, Yibo Lin, Runsheng Wang, and Ru Huang. “Khronos: Fusing Memory Access for Improved Hardware RTL Simulation”, in the proceedings of the 56th International Symposium on Microarchitecture (MICRO), 2023.
[C92]Yifan Chen, Zaiwen Wen, Yun Liang, and Yibo Lin. “Stronger Mixed-Size Placement Backbone Considering Second-Order Information”, in the proceedings of the International Conference on Computer Aided Design (ICCAD), Oct. 2023.
[C91]Shuzhang Zhong, Meng Li, Yun Liang, Runsheng Wang and Ru Huang. “Memory-aware Scheduling for Complex Wired Networks with Iterative Graph Optimization”, in the proceedings of the International Conference on Computer Aided Design (ICCAD), Oct. 2023.
[C90]Xiuping Cui, Size Zheng, Tianyu Jia, Le Ye and Yun Liang. “ARES: A Mapping Framework of DNNs towards Diverse PIMs with General Abstractions”, in the proceedings of the International Conference on Computer Aided Design (ICCAD), Oct. 2023.
[C89]Xiaochen Hao, Zijian Ding, Jieming Yin, Yuan Wang and Yun Liang. “Monad: Towards Cost-effective Specialization for Chiplet-based Spatial Accelerators”, in the proceedings of the International Conference on Computer Aided Design (ICCAD), Oct. 2023.
[C88]Jing Mai, Jiarui Wang, Zhixiong Di, Guojie Luo, Yun Liang and Yibo Lin, “OpenPARF: An Open-Source Placement and Routing Framework for Large-Scale Heterogeneous FPGAs with Deep Learning Toolkit”, in the proceedings of the IEEE 15th International Conference on ASIC (ASICON), Oct, 2023.
[C87]Zizhang Luo, Liqiang Lu, Yicheng Jin, Liancheng Jia, Yun Liang. “Calabash: Accelerating Attention using a Systolic Array Chain on FPGAs”, in the proceedings of the International Conference on Field-Programmable Logic and Applications (FPL), Sep 2023.
[C86]Yanchi Dong, Tianyu Jia, Kaixuan Du, Yiqi Jing, Qijun Wang, Pixian Zhan, Yadong Zhang, Fengyun Yan, Yufei Ma, Yun Liang, Le Ye, Ru Huang. “A Model-Specific End-to-End Design Methodology for Resource-Constrained TinyML Hardware”, in the proceedings of the Design Automation Conference (DAC), July 2023.
[C85]Zizhang Luo, Liqiang Lu, Size Zheng, Jieming Yin, Jason Cong, Jianwei Yin, Yun Liang. “Rubick: A Synthesis Framework for Spatial Architectures via Dataflow Decomposition”, in the proceedings of the Design Automation Conference (DAC), July 2023.
[C84]Size Zheng, Siyuan Chen, Yun Liang. “Memory and Computation Coordinated Mapping of DNNs onto Complex Heterogeneous SoC”, in the proceedings of the Design Automation Conference (DAC), July 2023.
[C83]Xiaochen Hao, Mingzhe Zhang, Ce Sun, Zhuofu Tao, Hongbo Rong, Yu Zhang, Lei He, Eric Petit, Wenguang Chen, Yun Liang. “Lasa: Abstraction and Specialization for Productive and Performant Linear Algebra on FPGAs”, in the proceedings of the 31st IEEE International Symposium On Field-Programmable Custom Computing Machines (FCCM), May 2023.
[C82]Size Zheng, Siyuan Chen, Peidi Song, Renze Chen, Xiuhong Li, Shengen Yan, Dahua Lin, Jingwen Leng, Yun Liang. “Chimera: An Analytical Optimizing Framework for Effective Compute-intensive Operators Fusion”, in the proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA), March, 2023.

2022

[C81]Ruifan Xu, Youwei Xiao, Jin Luo, Yun Liang. “Hector: A Multi-level Intermediate Representation for Hardware Synthesis Methodologies”, in the proceedings of the International Conference on Computer Aided Design (ICCAD), Nov. 2022.
[C80]Yingjie Chen, Huasong Zhong, Chong Chen, Chen Shen, Jianqiang Huang, Tao Wang, Yun Liang, Qianru Sun, “On Mitigating Hard Clusters for Face Clustering”, in the proceedings of the 17th European Conference on Computer Vision (ECCV), Oct. 2022.
[C79]Yingjie Chen, Chong Chen, Xiao Luo, Jianqiang Huang, Xian-Sheng Hua, Tao Wang, Yun Liang, “Pursuing Knowledge Consistency: Supervised Hierarchical Contrastive Learning for Facial Action Unit Recognition”, in the proceedings of the 30th ACM International Conference on Multimedia (ACM MM), October, 2022.
[C78]Xiuping Cui, Xiaochen Hao, Yun Liang, Guangyu Sun, Xiaoxin Cui, Yuan Wang, Ru Huang, “A Mapping Model of SNNs to Neuromorphic Hardware”, in the proceedings of the International Conference on Artificial Intelligence Circuits and Systems (AICAS), June, 2022. (Invited paper)
[C77]Size Zheng, Renze Chen, Anjiang Wei, Yicheng Jin, Qin Han, Liqiang Lu, Bingyang Wu, Xiuhong Li, Shengen Yan, Yun Liang. “AMOS: Enabling Automatic Mapping for Tensor Computations on Spatial Accelerators with Hardware Abstraction,” in the proceedings of the International Symposium on Computer Architecture (ISCA), June 2022.
[C76]Liancheng Jia, Yuyue Wang, Jingwen Leng, Yun Liang. “EMS: Efficient Memory Subsystem Synthesis for Spatial Accelerators,” in the proceedings of the Design Automation Conference (DAC), 2022.
[C75]Yingjie Chen, Diqi Chen, Tao Wang, Yizhou Wang, Yun Liang. “Causal Intervention for Subject-deconfounded Facial Action Unit Recognition,” in the proceedings of the Association for the Advancement of Artificial Intelligence (AAAI), 2022.
[C74]Qingcheng Xiao, Yun Liang. “Towards Agile DNN Accelerator Design Using Incremental Synthesis on FPGAs,” in the proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), 2022.

2021

[C73]Liancheng Jia, Zizhang Luo, Liqiang Lu, Yun Liang. “TensorLib: A Spatial Accelerator Generation Framework for Tensor Algebra,” in the proceedings of the Design Automation Conference (DAC), 2021.
[C72]Yangjie Zhou, Mengtian Yang, Cong Guo, Jingwen Leng, Yun Liang, Quan Chen, Minyi Guo, Yuhao Zhu. “Characterizing and Demystifying the Implicit Convolution Algorithm on Commercial Matrix-Multiplication Accelerators,” in the proceedings of the IEEE International Symposium on Workload Characterization (IISWC), 2021.
[C71]Liqiang Lu, Yicheng Jin, Hangrui Bi, Zizhang Luo, Peng Li, Tao Wang, Yun Liang. “Sanger: A Co-Design Framework for Enabling Sparse Attention using Reconfigurable Architecture,” in the proceedings of the 54th International Symposium on Microarchitecture (MICRO), 2021.
[C70]Yingjie Chen, Diqi Chen, Yizhou Wang, Tao Wang, Yun Liang. “CaFGraph: Context-aware Facial Multi-graph Representation for Facial Action Unit Recognition,” in the proceedings of ACM International Conference on Multimedia (ACM MM), 2021.
[C69]Yingjie Chen, Han Wu, Tao Wang, Yizhou Wang, Yun Liang. “Cross-modal Representation Learning For Lightweight and Accurate Facial Action Unit Detection,” in the proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September, 2021.
[C68]Qingcheng Xiao, Size Zheng, Bingzhe Wu, Pengcheng Xu, Xuehai Qian, Yun Liang. “HASCO: Towards Agile Hardware and Software Co-design for Tensor Computation,” in the proceedings of the International Symposium on Computer Architecture (ISCA), 2021.
[C67]Liqiang Lu, Naiqing Guan, Yuyue Wang, Liancheng Jia, Zizhang Luo, Jieming Yin, Jason Cong, Yun Liang. “TENET: A Framework for Modeling Tensor Dataflow based on Relation-centric Notation,” in the proceedings of the International Symposium on Computer Architecture (ISCA), 2021.

2020

[C66]Yi-Hsiang Lai, Hongbo Rong, Size Zheng, Weihao Zhang, Xiuping Cui, Yunshan Jia, Jie Wang, Brendan Sullivan, Zhiru Zhang, Yun Liang, Youhui Zhang, Jason Cong, Nithin George, Jose Alvareze, Christopher Hughes, Pradeep Dubey. “SuSy: A Programming Model for Productive Construction of High-Performance Systolic Arrays on FPGAs,” in the proceedings of the International Conference on Computer Aided Design (ICCAD), November 2020.
[C65]Tao Yang, Yunkun Liao, Jianping Shi, Yun Liang, Naifeng Jing, Li Jiang. “A Winograd-based CNN Accelerator with a Fine-grained Regular Sparsity Pattern,” in the proceedings of the International Conference on Field-Programmable Logic and Applications (FPL), August 2020.
[C64]Qingcheng Xiao, Liqiang Lu, Jiaming Xie, Yun Liang. “FCNNLib: An Efficient and Flexible Convolution Algorithm Library on FPGAs,” in the proceedings of the Design Automation Conference (DAC), July 2020.
[C63]Size Zheng, Yun Liang, Shuo Wang, Renze Chen, Kaiwen Sheng. “FlexTensor: An Automatic Schedule Exploration and Optimization Framework for Tensor Computation on Heterogeneous System,” in the proceedings of the 25th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2020.

2019

[C62]Qingcheng Xiao, Yun Liang. “Zac: Towards Automatic Optimization and Deployment of Quantized Deep Neural Networks on Embedded Devices,” in the proceedings of the International Conference on Computer Aided Design (ICCAD), November 2019. (invited paper)
[C61]Xiaolong Xie, Yun Liang, Xiuhong Li, Wei Tan. “CuLDA: Solving Large-scale LDA Problems on GPUs,” in the proceedings of the 26th International Symposium on High Performance Parallel and Distributed Computing (HPDC), June 2019.
[C60]Xuechao Wei, Yun Liang, Jason Cong. “Overcoming Data Transfer Bottlenecks in DNN Accelerators via Layer-Conscious Memory Management,” in the proceedings of the Design Automation Conference (DAC), June 2019.
[C59]Runbin Shi, Junjie Liu, Shuo Wang, Yun Liang, Hayden So. “E-LSTM: Efficient Inference of Sparse LSTM on Embedded Heterogeneous System,” in the proceedings of the Design Automation Conference (DAC), June 2019.
[C58]Jiaxi Zhang, Wentai Zhang, Guojie Luo, Xuechao Wei, Yun Liang, Jason Cong, “Frequency Improvement of Systolic Array-Based CNNs on FPGAs,” in the proceedings of the 2019 IEEE International Symposium on Circuits and Systems (ISCAS 2019), Sapporo, Japan, May 26-29, 2019.
[C57]Liqiang Lu, Jiaming Xie, Ruirui Huang, Jiansong Zhang, Wei Lin, Yun Liang. “An Efficient Hardware Accelerator for Sparse Convolutional Neural Networks on FPGAs,” in the proceedings of the IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2019.
[C56]Caiwen Ding, Shuo Wang, Ning Liu, Kaidi Xu, Yanzhi Wang, Yun Liang. “REQ-YOLO: A Resource-Aware, Efficient Quantization Framework for Object Detection on FPGAs,” in the proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), February, 2019.
[C55]Xiuhong Li, Yun Liang, Shengen Yan, Liancheng Jia, Yinghan Li. “A Coordinated Tiling and Batching Framework for Efficient GEMM on GPUs,” in the proceedings of Principles and Practice of Parallel Programming (PPoPP), February 2019. Best Paper Award Nomination.
[C54]Shuo Wang, Yun Liang, Wei Zhang. “Poly: Efficient Heterogeneous System and Application Management for Interactive Applications,” in the proceedings of 21st IEEE International Symposium on High Performance Computer Architecture (HPCA), February 2019.

2018

[C53]Xuechao Wei, Yun Liang, Xiuhong Li, Cody Hao Yu, Peng Zhang and Jason Cong. “TGPA: Tile-Grained Pipeline Architecture for Low Latency CNN Inference,” in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov, 2018.
[C52]Liqiang Lu, Yun Liang. “SpWA: An Efficient Sparse Winograd Convolutional Neural Networks Accelerator on FPGAs,” in the proceedings of the Design Automation Conference (DAC), June 2018.
[C51]Xiuhong Li, Yun Liang, Wentai Zhang, Taide Liu, Haochen Li, Guojie Luo, Ming Jiang. “cuMBIR: An Efficient Framework for Low-dose X-ray CT Image Reconstruction on GPUs,” in the proceedings of the ACM International Conference on Supercomputing (ICS), June, 2018.
[C50]Liang Feng, Sharad Sinha, Wei Zhang and Yun Liang. “CAMAS: Static and Dynamic Hybrid Cache Management for CPU-FPGA Platforms,” in the proceedings of the IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2018.
[C49]Shuo Wang, Zhe Li, Caiwen Ding, Bo Yuan, Qinru Qiu, Yanzhi Wang, Yun Liang. “C-LSTM: Enabling Efficient LSTM using Structured Compression Techniques on FPGAs,” in the proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), February, 2018.

2017

[C48]Yun Liang, Xiuhong Li, Xiaolong Xie. “Exploring Cache Bypassing and Partitioning for MultiTasking on GPUs,” in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov, 2017.
[C47]Liang Feng, Sharad Sinha, Wei Zhang, Yun Liang. “A Hybrid Approach to Cache Management in Heterogeneous CPU-FPGA Platforms,” in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov, 2017. (invited paper)
[C46]Jieru Zhao, Liang Feng, Sharad Sinha, Wei Zhang, Yun Liang, Bingsheng He. “COMBA: A Comprehensive Model-Based Analysis Framework for High Level Synthesis of Real Applications,” in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov, 2017. Best Paper Award.
[C45]Xiaolong Xie, Wei Tan, Liana L. Fong and Yun Liang. “CuMF_SGD: Parallelized Stochastic Gradient Descent for Matrix Factorization on GPUs,” in the proceedings of the 26th International Symposium on High Performance Parallel and Distributed Computing (HPDC), June 2017.
[C44]Shuo Wang, Yun Liang. “A Comprehensive Framework for Synthesizing Stencil Algorithms on FPGAs using OpenCL Model,” in the proceedings of the Design Automation Conference (DAC), June 2017.
[C43]Shuo Wang, Yun Liang, Wei Zhang. “FlexCL: An Analytical Performance Model for OpenCL Workloads on Flexible FPGAs,” in the proceedings of the Design Automation Conference (DAC), June 2017.
[C42]Qingcheng Xiao, Yun Liang, Liqiang Lu, Shengen Yan, Yu-Wing Tai. “Exploring Heterogeneous Algorithms for Accelerating Deep Convolutional Neural Networks on FPGAs,” in the proceedings of the Design Automation Conference (DAC), June 2017.
[C41]Xuechao Wei, Cody Hao Yu, Peng Zhang, Youxiang Chen, Yuxin Wang, Han Hu, Yun Liang, Jason Cong. “Automated Systolic Array Architecture Synthesis for High Throughput CNN Inference on FPGAs,” in the proceedings of the Design Automation Conference (DAC), June 2017. Best Paper Award Nomination.
[C40]Liqiang Lu, Yun Liang, Qingcheng Xiao and Shengen Yan. “Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs,” in the proceedings of the IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2017.
[C39]Guanwen Zhong, Alok Prakash, Siqi Wang, Yun Liang, Tulika Mitra, Smail Niar. “Design Space Exploration of FPGA-based Accelerators with Multi-level Parallelism,” in the proceedings of the Design Automation and Test in Europe (DATE), March, 2017.
[C38]Xuechao Wei, Yun Liang, Tao Wang, Songwu Lu, Jason Cong. “Throughput Optimization for Streaming Applications on CPU-FPGA Heterogeneous Systems,” in the proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC), January, 2017.

2016

[C37]Guanwen Zhong, Alok Prakash, Yun Liang, Tulika Mitra, Smail Niar. “Lin-Analyzer: A High-level Performance Analysis Tool for FPGA-based Accelerators,” in the proceedings of the Design Automation Conference (DAC), June, 2016.
[C36]Xiuhong Li, Yun Liang. “Efficient Kernel Management on GPUs,” in the proceedings of the Design Automation and Test in Europe (DATE), March, 2016.
[C35]Shuo Wang, Yun Liang, Chao Zhang, Xiaolong Xie, Guangyu Sun, Yongpan Liu, Yu Wang, Xiuhong Li. “Performance-centric Register File Design for GPUs using Racetrack Memory,” in the proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC), January, 2016. Best Paper Award Nomination.

2015

[C34]Xiaolong Xie, Yun Liang, Xiuhong Li, Yudong Wu, Guangyu Sun, Tao Wang, and Dongrui Fan. “Enabling Coordinated Register Allocation and Thread-level Parallelism Optimization for GPUs,” in the proceedings of the 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), December, 2015.
[C33]Xian Zhang, Guangyu Sun, Chao Zhang, Weiqi Zhang, Yun Liang, Tao Wang, Yiran Chen, and Jia Di. “Fork Path: Improving Efficiency of ORAM by Removing Redundant Memory Accesses,” in the proceedings of 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), December, 2015.
[C32]Yun Liang, Shuo Wang. “Quantitative Performance and Power Analysis of LTE using High Level Synthesis,” in the proceedings of International Conference on ASIC, November 2015. (invited paper)
[C31]Chao Zhang, Guangyu Sun, Xian Zhang, Weiqi Zhang, Weisheng Zhao, Tao Wang, Yun Liang, Yongpan Liu, Yu Wang, and Jiwu Shu, “Hi-fi Playback: Tolerating Position Errors in Shift Operations of Racetrack Memory,” in the proceedings of the 42nd International Symposium on Computer Architecture (ISCA), June 2015.
[C30]Xiaolong Xie, Yun Liang, Yu Wang, Guangyu Sun, Tao Wang. “Coordinated Static and Dynamic Cache Bypassing on GPUs,” in the proceedings of 21st IEEE International Symposium on High Performance Computer Architecture (HPCA), February 2015.
[C29]Waiteng Tang, Ruizhe Zhao, Mian Lu, Yun Liang, Huynh Phung Huynh, Xibai Li, Rick Siow Mong Goh. “Optimizing and Auto-Tuning Scale-Free Sparse Matrix-Vector Multiplication on Intel Xeon Phi,” in the proceedings of the International Symposium on Code Generation and Optimization (CGO), February 2015.

2014

[C28]Guanwen Zhong, Vanchinathan Venkataramani, Yun Liang, Tulika Mitra, Smail Niar. “Design Space Exploration of Multiple Loops on FPGAs using High Level Synthesis,” in the proceedings of IEEE International Conference on Computer Design (ICCD), October 2014.
[C27]Xiaoming Chen, Yu Wang, Yun Liang, Yuan Xie, Huazhong Yang, “Run-time Techniques for Simultaneous Aging and Power Optimization in GPGPUs,” in the proceedings of the 51st Design Automation Conference (DAC), June, 2014.
[C26]Jingyu Deng, Yun Liang, Guojie Luo, Guangyu Sun. “Rapid Design Space Exploration of Two-level Unified Caches,” in the proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), June 2014.
[C25]Swathi Gurumani, Jacob Tolar, Yao Chen, Yun Liang, Kyle Rupnow, Deming Chen. “Integrated CUDA-to-FPGA Synthesis with Network-on-Chip,” in the proceedings of the IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2014.
[C24]Huping Ding, Yun Liang, Tulika Mitra. “WCET-Centric Dynamic Instruction Cache Locking,” in the proceedings of the Design Automation and Test in Europe (DATE), March, 2014.
[C23]Zhimin Wu, Yang Liu, Yun Liang, Jun Sun. “GPU Accelerated Counterexample Generation in LTL Model Checking,” in the proceedings of the International Conference on Formal Engineering Methods (ICFEM), November, 2014.

2013

[C22]Xiaolong Xie, Yun Liang, Guangyu Sun, Deming Chen. “An Efficient Compiler Framework for Cache Bypassing on GPUs,” in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov, 2013.
[C21]Mian Lu, Lei Zhang, Huynh Phung Huynh, Zhongliang Ong, Yun Liang, Bingsheng He, Rick Siow Mong Goh, Richard Huynh. “Optimizing the MapReduce Framework on Intel Xeon Phi Coprocessor,” in the proceedings of the IEEE International Conference on Big Data (BigData), Oct, 2013.
[C20]Alexandros Papakonstantinou, Deming Chen, Wen Mei Hwu, Yun Liang, Jason Cong. “Throughput-Oriented Kernel Porting onto FPGAs,” in the proceedings of the 50th Design Automation Conference (DAC), June, 2013.
[C19]Huping Ding, Yun Liang, Tulika Mitra. “Integrated Instruction Cache Analysis and Locking in Multitasking Real-time Systems,” in the proceedings of the 50th Design Automation Conference (DAC), June, 2013.
[C18]Wei Zuo, Yun Liang, Peng Li, Kyle Rupnow, Deming Chen, Jason Cong. “Improving High Level Synthesis Optimization Opportunity Through Polyhedral Transformations,” in the proceedings of the 21st ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), February, 2013.
[C17]Swathi T. Gurumani, Hisham Cholakkal, Yun Liang, Kyle Rupnow, Deming Chen. “High-Level Synthesis of Multiple Dependent CUDA Kernels on FPGA,” in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013. (invited paper)
[C16]Yun Liang, Zheng Cui, Kyle Rupnow, Deming Chen. “Register and Thread Structure Optimization for GPUs,” in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013.
[C15]Huping Ding, Yun Liang, Tulika Mitra. “Shared Cache Aware Task Mapping for WCRT Minimization,” in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013.

2012

[C14]Huping Ding, Yun Liang, Tulika Mitra. “WCET-Centric Partial Instruction Cache Locking,” in the proceedings of ACM 49th Design Automation Conference (DAC), June 2012. Best Paper Award Nomination (7 out of 741 submissions).
[C13]Zheng Cui, Yun Liang, Kyle Rupnow, Deming Chen. “An Accurate GPU Performance Model for Effective Control Flow Divergence Optimization,” in the proceedings of IEEE International Parallel & Distributed Processing Symposium (IPDPS), May, 2012.
[C12]Yun Liang, Zheng Cui, Shengkui Zhao, Kyle Rupnow, Yihao Zhang, Douglas L. Jones, Deming Chen. “Real-time Implementation and Performance Optimization of 3D Sound Localization on GPUs,” in the proceedings of Design Automation and Test in Europe (DATE), March, 2012.
[C11]Shengkui Zhao, Saima Ahmed, Yun Liang, Kyle Rupnow, Deming Chen, Douglas L Jones. “A real-time 3D sound localization system with miniature microphone array for virtual reality,” in the proceedings of 7th IEEE Conference on Industrial Electronics and Applications (ICIEA), July, 2012.
[C10]Kyle Rupnow, Yun Liang, Yinan Li, Dongbo Min, Minh Do, Deming Chen. “High Level Synthesis of Stereo Matching: Productivity, Performance, and Software Constraints,” in the proceedings of International Conference on Field Programmable Technology (FPT), December 2011. Best Paper Award Nomination (4 out of 94 submissions).
[C9]Kyle Rupnow, Yun Liang, Yinan Li, Deming Chen. “A study of high-level synthesis: Promises and challenges”, in the proceedings of IEEE 9th International Conference on ASIC (ASICON), October, 2012.

2011

[C8]Alexandros Papakonstantinou, Yun Liang, John A. Stratton, Karthik Gururaj, Deming Chen, Wen-Mei W. Hwu, Jason Cong. “Multilevel Granularity Parallelism Synthesis on FPGAs,” in the proceedings of the 19th Annual IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2011. Best Paper Award (1 out of 119 submissions).

2010

[C7]Yun Liang, Tulika Mitra. “Improved Procedure Placement for Set Associative Caches,” in the proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES), October, 2010.
[C6]Yun Liang, Tulika Mitra. “Instruction Cache Locking using Temporal Reuse Profile,” in the proceedings of the ACM 47th Design Automation Conference (DAC), June 2010.
[C5]Huynh Phung Huynh, Yun Liang, Tulika Mitra. “Efficient custom instructions generation for system-level design,” in the proceedings of the International Conference on Field-Programmable Technology (FPT), December, 2010.

2009

[C4]Yan Li, Vivy Suhendra, Yun Liang, Tulika Mitra, Abhik Roychoudhury. “Timing Analysis of Concurrent Programs Running on Shared Cache Multi-Cores,” in the proceedings of the 30th IEEE Real-Time Systems Symposium (RTSS), December, 2009.

2008

[C3]Yun Liang, Tulika Mitra. “Static Analysis for Fast and Accurate Design Space Exploration of Caches,” in the proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October, 2008.
[C2]Yun Liang, Lei Ju, Samarjit Chakraborty, Tulika Mitra, Abhik Roychoudhury. “Cache-aware Optimization of BAN Applications,” in the proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October, 2008. Best Paper Award Nomination.
[C1]Yun Liang, Tulika Mitra. “Cache Modeling in Probabilistic Execution Time Analysis,” in the proceedings of the 45th Design Automation Conference (DAC), June, 2008.

Other Workshop Papers and Poster Papers

[W9]Ruifan Xu, Jin Luo, Yun Liang. “Hermes: Enhancing Extensibility in High-Level Synthesis through Multi-Level IRs”, in the proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), March, 2024. (Poster)
[W8]Ruifan Xu, Youwei Xiao, Jin Luo, Yun Liang. “Hector: Multi-level Paradigm in Hardware Synthesis”, Workshop on Languages, Tools, and Techniques for Accelerator Design (LATTE 2023) in conjunction with ASPLOS 2023.
[W7]Ruifan Xu, Youwei Xiao, Jin Luo, Yun Liang. “A MLIR-Based Hardware Synthesis Framework”, Workshop on Open-Source EDA Technology (WOSET 2022) in conjunction with ICCAD2022.
[W6]Liancheng Jia, Zizhang Luo, Liqiang Lu, Yun Liang. “TensorLib: A Spatial Accelerator Generation Framework for Tensor Algebra”, Workshop on Open-Source EDA Technology (WOSET 2022) in conjunction with ICCAD2022. (Poster)
[W5]Haokun Li, Jing Liu, Liancheng Jia, Yun Liang, Yaowei Wang, Tingming, Tan. “Downscaling and Overflow-aware Model Compression For Efficient Vision Processors”, the third International Workshop on Efficient Artificial Intelligence For Edge Computing (EAI) in conjunction with ICDCS 2022. Best Paper Award.
[W4]Pengcheng Xu, Yun Liang. “Automatic Code Generation for Rocket Chip RoCC Accelerators,” Fourth Workshop on Computer Architecture Research with RISC-V (CARRV) 2020.
[W3]Xiaolong Xie, Yun Liang, Xiuhong Li, Wei Tan. “CuLDA_CGS: solving large-scale LDA problems on GPUs,” in the proceedings of the Symposium on Principles and Practice of Parallel Programming (PPoPP), 2019. (Poster).
[W2]Xuechao Wei, Yun Liang, Xibai Li, Tao Wang, Songwu Lu, Jason Cong. “Evaluation of Software Defined Radio on Heterogeneous Systems,” in the proceedings of International Conference on Parallel Architectures and Compilation Techniques (PACT), 2015. (Poster).
[W1]Yun Liang, Abhik Roychoudhury, Tulika Mitra. “Timing analysis of body area network application,” in the proceedings of 7th International Workshop on Worst Case Execution Time Analysis (WCET), 2007.

Papers in Chinese

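[P4]Yungang Bao, Yisong Chang, Yinhe Han, Libo Huang, Huawei Li, Yun Liang, Guojie Luo, Li Shang, Dan Tang, Ying Wang, Biwei Xie, Wenjian Yu, Ke Zhang, Ninghui Sun. “Agile Design of Processor Chips: Issues and Challenges,” Journal of Computer Research and Development, 58(6), 1131-1145, 2021.
[P3]Ruifan Xu, Youwei Xiao, Jin Luo, Yun Liang. “A Survey of High-Level Synthesis,” Micro/nano Electronics and Intelligent Manufacturing, 2021.
[P2]Liqiang Lu, Size Zheng, Qingcheng Xiao, Deming Chen, Yun Liang. “FPGA Design for Convolutional Neural Networks,” Scientia Sinica Informationis, 49, 277 (2019).
[P1]Shuo Wang, Jiaxi Zhang, Guojie Luo, Yun Liang. “Open-Source Hardware and Open-Source EDA Tools: Accelerators for Future Chip Design,” Frontier Science, 4 (2018).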