Rajagopalan, Arun Krishnakumar (2016-05). Fast and Precise On-The-Fly Data Race Detection. Master's Thesis.

Although concurrent programming is rapidly gaining popularity, developing bug-free concurrent programs remains challenging. Developers have a wide choice of race detection tools, but we have found that most of these techniques do not scale well, and developers are often forced to trade precision for speed. Additionally, various practical issues force even precise race detectors to produce spurious warnings, defeating their purpose and burdening their users. We design and implement a novel race detection technique that is both fast and precise, even when program source information is missing. Toward this goal, we have developed two separate tools, TREE and RDIT, which respectively improve performance and precision over existing techniques.

TREE, implemented in the RoadRunner framework, acts as a filter: it forwards only those events that might add value to race detection and eliminates those deemed redundant for this purpose. Removing these redundant events does not affect race detection capability. We have evaluated TREE on a set of standard benchmarks, including two large real-world applications. We found a significant number of redundant events in all of these applications; on average, TREE saves between 15% and 25% of analysis time compared with state-of-the-art techniques.
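The filtering idea can be illustrated with a minimal Python sketch. It is not TREE's implementation and does not use the RoadRunner API. The redundancy criterion here is an assumption made for illustration: an access is treated as redundant if the same thread has already performed the same kind of access to the same variable since its last synchronization event. Only lock acquire and release are modeled as synchronization; the `Event` type and `filter_redundant` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    thread: int
    kind: str    # "read", "write", "acquire", or "release"
    target: str  # variable name for accesses, lock name for sync events


def filter_redundant(trace):
    """Return the events a downstream race detector still needs to see.

    Assumed criterion (illustrative, not taken from the thesis): a repeated
    access of the same kind to the same variable by the same thread, with no
    synchronization event by that thread in between, is redundant.
    """
    seen = {}  # thread -> {(kind, variable)} forwarded since its last sync event
    kept = []
    for ev in trace:
        if ev.kind in ("acquire", "release"):
            # A sync event starts a new region for this thread.
            seen[ev.thread] = set()
            kept.append(ev)
        else:
            accesses = seen.setdefault(ev.thread, set())
            key = (ev.kind, ev.target)
            if key not in accesses:
                accesses.add(key)
                kept.append(ev)
    return kept
```

Under this assumed criterion, a dropped access has the same happens-before relationship to every other thread's events as the earlier access that was kept, so a happens-before detector would flag the same variables with or without it.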

The second tool, RDIT, precisely detects races in programs with incomplete source information, generating no false positives. RDIT is also maximal in the sense that it detects a maximal set of true races from the observed incomplete trace. It is underpinned by a sound BarrierPair model that abstracts away the missing events by capturing the invocation data of their enclosing methods. By making the least conservative assumption that a missing method introduces synchronization only when its invocation data overlaps with that of other missing methods, and by formulating maximal thread causality as a set of logical constraints, RDIT guarantees precise race detection with maximal capability. We tested RDIT on seven large real-world concurrent systems and detected dozens of true races with zero false alarms. By comparison, existing algorithms such as Happens-Before, Causal-Precede, and Maximal-Causality, which are all known to be precise, were observed to report hundreds of false alarms due to trace incompleteness.
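The overlap rule for missing methods can be sketched in Python as follows. This is an illustration under stated assumptions, not RDIT's implementation: the abstract does not say what the invocation data contains, so it is modeled here as a set of identifiers for the receiver and argument objects. The field names and the `may_synchronize` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BarrierPair:
    thread: int
    method: str
    # Assumption: invocation data is modeled as the ids of the receiver
    # and argument objects passed to the missing method.
    invocation_data: frozenset


def may_synchronize(a, b):
    """Assume two missing methods may synchronize only if they run on
    different threads and their invocation data overlap."""
    return a.thread != b.thread and not a.invocation_data.isdisjoint(b.invocation_data)
```

In a full analysis, pairs for which `may_synchronize` returns true would presumably contribute ordering constraints to the causality model, while all other missing methods would be treated as introducing no synchronization.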
