Differences

This shows you the differences between two versions of the page.

Link to this comparison view

seminars:combining_duplication_and_reordering_to_accelerate_parallel_graph_processing [2019/05/06 16:23] (current)
bschwedo created
Line 1: Line 1:
 +====== Combining Duplication and Reordering to Accelerate Parallel Graph Processing ======
  
 +Tuesday May 14, 2019\\
 +Location: CIC 4th floor Panther Hollow Conference Room\\
 +Time:  4:​30PM-5:​30PM\\
 +
 +=====Abstract=====
 +
 +Performance of single-machine,​ shared memory graph processing is affected by expensive atomic updates and poor cache locality. Data duplication,​ a popular approach to eliminate atomic updates by creating thread-local copies of shared data, incurs extreme memory overheads due to the large sizes of typical input graphs. Even memory-efficient duplication strategies that exploit the power-law structure common to many graphs (by duplicating only the highly-connected "​hub"​ vertices) suffer from overheads for having to dynamically identify the hub vertices. Degree Sorting, a popular graph reordering technique that re-assigns hub vertices consecutive IDs in a bid to improve spatial locality, is effective for single-threaded graph applications but suffers from increased false sharing in parallel executions. ​
 +
 +In this talk, I will describe the main insight of our work -- the combination of data duplication and Degree Sorting eliminates the overheads of each optimization. Degree Sorting improves the efficiency of data duplication by assigning hub vertices consecutive IDs which enables easy identification of the hub vertices. Additionally,​ duplicating the hub vertex data eliminates false sharing in Degree Sorting since each thread updates its local copy of the hub vertex data. We evaluate this mutually-enabling combination of power-law-specific data duplication and Degree Sorting in a system called RADAR. RADAR improves performance by eliminating atomic updates for hub vertices and improving the cache locality of graph applications,​ providing speedups of up to 165x (1.88x on average) across different graph applications and input graphs.
 +=====Bio=====
 +
 +Vignesh Balaji is a fourth year PhD student advised by Professor Brandon Lucia. His primary research focus is on providing architecture support for accelerating graph processing on shared-memory platforms. Website: https://​users.ece.cmu.edu/​~vigneshb/​