17/05/2017 11:30Meyer 861

 

Moore with Less: Specializing Cores for the Cloud

Boris Grot

University of Edinburgh

Big data is revolutionizing the way we live, work, and socialize. This revolution is powered by datacenters built with commodity hardware technologies that are now running out of steam. In this talk, I will outline the disruptive challenges facing future computing systems and will argue for specializing server hardware as the only way forward. The principal challenge for specialization lies in providing common-case efficiency benefits without losing the generality necessary to accommodate both legacy and future software needs in cloud infrastructures. As a step toward meeting this challenge, I will present our latest work on specializing server cores to meet both single-thread performance demands and maximize throughput under QoS constraints for collocated workloads.

Bio: Boris Grot is an Assistant Professor in the School of Informatics at the University of Edinburgh. His research seeks to address efficiency bottlenecks and capability shortcomings of processing platforms for big data. He is a MICRO Hall of Fame inductee and a recipient of the Google Faculty Research Award. Grot received his PhD in Computer Science from The University of Texas at Austin and spent two years as a post-doctoral fellow at the Parallel Systems Architecture Lab at EPFL.

24/05/2017 11:30Taub 301

 

cuSTINGER - A Sparse Dynamic Graph and Matrix Data Structure

Oded Green

Georgia Institute of Technology

Sparse data computations are ubiquitous in science and engineering. Two widely used applications requiring sparse data computations are graph algorithms and linear algebra operations such as Sparse Matrix-Vector Multiplication (SpMV). In contrast to their dense data counterparts, sparse-data computations have less locality and more irregularity in their execution - making them significantly more challenging to optimize. While there are several existing formats for sparse data representations, most of these formats are restricted to static data sets. This means that these representations and their implementations do not support changes to the original data. For graphs, this means that topology changes are not possible. For matrices, this means adding a perturbation to the original input is non-trivial. In today's talk, I will present cuSTINGER - a dynamic graph data structure for the GPU. This talk will partially focus on the internals of the data structure and some of its design considerations. cuSTINGER will be compared with Compressed Sparse Row (CSR) - a widely used static graph and matrix data representation for sparse inputs. This comparison will include a performance analyis of the data structure itself as well as higher level data analytics implemented using the data structure. I will show that the new data structure has a relatively low-overhead in comparison to CSR. Yet unlike CSR, our new data structure is flexible and can grow for indefinite amount of time. I will then show that the data structure can process tens of millions of updates per second - a rate that exceeds many real world applications making it extremely useful. In the second part of my talk, I will elaborate on the programming model for cuSTINGER. My research team's big goal is to enable simple implementation of high-performance, scalable, dynamic graph analytics on the GPU for all users - including none-GPU experts. To limit the amount of GPU code that users need to write by themselves, cuSTINGER offers a small set of optimized operators for traversing graph vertices and edges. An additional design goal is to ensure that static graph algorithms, such as breath-first search and connected components, will also perform well in this library. I will show how these algorithms can be implemented in less than 15 lines of code through the use of these operators. I will then compare the performance of our cuSTINGER implementations with the static graph implementations in Gunrock - a highly tuned GPU graph library. Our current implementations are usually upto 30% slower or 20% faster than Gunrock, depending on the input and algorithm. Finally, I will show some performance results for a novel triangle counting algorithm for dynamic graphs which sustains millions of updates per second.

Bio: Oded Green is a research scientist at the Georgia Institute of Technology (Georgia Tech) in Computational Sciences and Engineering, where he also received his PhD. Oded received both his MSc in electrical engineering and his BSc in computer engineering from Technion - Israel Institute of Technology. Prior to coming to Georgia Tech, Oded spent some time working as the Chief Operating Officer and a research scientist for ArrayFire (Atlanta, GA) where he was responsible for the company's product management and research directions. Oded's research primarily focuses on improving the performance and scalability of large-scale graph analytics on dynamic graph using a wide range of high-performance computational platforms and by designing efficient algorithms and data structures.

14/06/2017 11:30TBA

 

TBA

Lluis Vilanova

EE, Technion

TBA

Bio: TBA

20/06/2017 00:00TBA

 

TBA

Adi Fuchs

EE, Princeton

TBA

Bio: TBA

28/06/2017 11:30TBA

 

TBA

Michael Schapira

CS, Hebrew University of Jerusalem

TBA

Bio: TBA