Niels Bohr Institute
  University of Copenhagen
weifeng.liu [at]

Google Scholar Citations    DBLP   

Short Bio

Weifeng Liu is currently a Postdoctoral Researcher at the Niels Bohr Institute (NBI) of the University of Copenhagen. He received his Ph.D. in 2016 from NBI under the supervision of Prof. Brian Vinter. Before moving to Copenhagen, he worked as a Senior Researcher in high performance computing technology at SINOPEC Exploration & Production Research Institute for about six years (2006-2012). He also worked briefly as a Research Associate with Prof. Iain S. Duff at STFC Rutherford Appleton Laboratory in 2016, and was a visiting Ph.D. student with Prof. Anders Logg at Chalmers University of Technology in 2015. He received his B.E. and M.E. degrees in computer science from China University of Petroleum, Beijing, in 2002 and 2006, respectively. He is a member of ACM, IEEE, CCF and SIAM.

His research interests include numerical linear algebra and parallel computing, particularly in designing scalable algorithms and data structures for sparse matrix computations on throughput-oriented processors. His algorithms run on a variety of many-core devices (e.g., NVIDIA, AMD and Intel GPUs, and Intel Xeon Phi) and single-chip CPU-GPU heterogeneous processors (e.g., NVIDIA Tegra, AMD Kaveri and Intel Broadwell).

He will join the Department of Computer Science (IDI) of the Norwegian University of Science and Technology (NTNU) as a Marie Curie Fellow in the summer of 2017. Prof. Anne C. Elster at the Heterogeneous and Parallel Computing Lab (HPC-Lab) will host this action.


Recent Publications
  • Ang Li, Weifeng Liu, Mads R. B. Kristensen, Brian Vinter, Hao Wang, Kaixi Hou, Andres Marquez, Shuaiwen Leon Song. "Exploring and Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels". 2017 International Conference for High Performance Computing, Networking, Storage and Analysis (SC '17). Nominated for best paper.
    [PDF] [Slides] [DOI] [BibTeX]
  • Ang Li, Shuaiwen Leon Song, Weifeng Liu, Xu Liu, Akash Kumar, Henk Corporaal. "Locality-Aware CTA Clustering for Modern GPUs". 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '17). Received a HiPEAC Paper Award.
    [PDF] [Slides] [DOI] [BibTeX]
  • Kaixi Hou, Weifeng Liu, Hao Wang, Wu-chun Feng. "Fast Segmented Sort on GPUs". 31st ACM International Conference on Supercomputing (ICS '17).
    [PDF] [Slides] [DOI] [BibTeX]
  • Weifeng Liu, Ang Li, Jonathan D. Hogg, Iain S. Duff, Brian Vinter. "Fast Synchronization-Free Algorithms for Parallel Sparse Triangular Solves with Multiple Right-Hand Sides". Concurrency and Computation: Practice and Experience (CCPE). (This is the extended version of the Euro-Par '16 paper.)
    [PDF] [DOI] [BibTeX] [Source code (cuda, opencl-amd)]
  • ~2016~
  • Weifeng Liu, Ang Li, Jonathan D. Hogg, Iain S. Duff, Brian Vinter. "A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves". 22nd International European Conference on Parallel and Distributed Computing (Euro-Par '16).
    [PDF] [Slides] [DOI] [BibTeX] [Source code (cuda, opencl-amd)]
  • Hao Wang, Weifeng Liu, Kaixi Hou, Wu-chun Feng. "Parallel Transposition of Sparse Data Structures". 30th ACM International Conference on Supercomputing (ICS '16).
    [PDF] [DOI] [BibTeX] [Source code (avx2, knc)]
  • ~2015~
  • Weifeng Liu, Brian Vinter. "CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication". 29th ACM International Conference on Supercomputing (ICS '15).
    [PDF] [Slides] [DOI] [BibTeX] [Source code (avx2, avx512, knc, cuda, opencl-amd, opencl-nvidia)]
    [The CSR5 format is incorporated in the MAGMA main branch from version 2.2.0.]
  • Weifeng Liu, Brian Vinter. "Speculative Segmented Sum for Sparse Matrix-Vector Multiplication on Heterogeneous Processors". Parallel Computing (PARCO). Volume 49, November 2015.
    [PDF] [DOI] [BibTeX] [Source code (cuda, opencl-amd, opencl-intel)]
  • Weifeng Liu, Brian Vinter. "A Framework for General Sparse Matrix-Matrix Multiplication on GPUs and Heterogeneous Processors". Journal of Parallel and Distributed Computing (JPDC). Volume 85, November 2015. (This is the extended version of the IPDPS '14 paper.)
    [PDF] [Slides (LA '15)] [DOI] [BibTeX] [Source code (cuda, opencl-amd)]
    [This SpGEMM framework is incorporated in the clSPARSE main branch from version Beta 2.]
  • ~2014~
  • Weifeng Liu, Brian Vinter. "An Efficient GPU General Sparse Matrix-Matrix Multiplication for Irregular Data". 28th IEEE International Parallel & Distributed Processing Symposium (IPDPS '14).
    [PDF] [Slides] [DOI] [BibTeX] [Source code (cuda, opencl-amd)]
  • Weifeng Liu, Brian Vinter. "Ad-heap: An Efficient Heap Data Structure for Asymmetric Multicore Processors". 7th Workshop on General Purpose Processing Using GPUs (held with ASPLOS '14) (GPGPU-7).
    [PDF] [Slides] [DOI] [BibTeX]
  • Weifeng Liu. "Parallel and Scalable Sparse Basic Linear Algebra Subprograms". PhD Thesis. Niels Bohr Institute, University of Copenhagen. 2015.
    [PDF] [Slides] [BibTeX]


Recent Talks
  • "Scalability Analysis of Sparse Matrix Computations on Many-core Processors". Sparse Days Meeting 2017 (Sparse Days '17). Toulouse, France. September 7, 2017. [Abstract] [Slides]
  • "An Empirical Study of Sparse BLAS on Emerging Heterogeneous Processors". 9th International Workshop on Parallel Matrix Algorithms and Applications (PMAA '16). Bordeaux, France. July 8, 2016. [Abstract]
  • "Assessing Recent Sparse Matrix Storage Formats for Parallel Sparse Matrix-Vector Multiplication". SIAM Conference on Parallel Processing for Scientific Computing (PP '16). Paris, France. April 14, 2016. [Abstract]
  • "A Framework for SpGEMM on GPUs and Heterogeneous Processors". SIAM Conference on Applied Linear Algebra (LA '15). Atlanta, Georgia, USA. October 26, 2015. [Abstract]
  • "Accelerating Sparse Basic Linear Algebra Subprograms on Many-Core Processors". Computational and Applied Mathematics Seminar (CAM Seminar). Mathematical Sciences - Chalmers University of Technology and University of Gothenburg, Gothenburg, Sweden. April 1, 2015. [Abstract]
  • "An Efficient and General Method for CSR-Format Based SpMV on GPUs". 8th International Workshop on Parallel Matrix Algorithms and Applications (PMAA '14). Lugano, Switzerland. July 3, 2014. [Abstract]