Please use this identifier to cite or link to this item: http://hdl.handle.net/11023/1428
Title: Test-Driven Reuse: Improving the Selection of Semantically Relevant Code
Author: Nurolahzade, Mehrdad
Advisor: Maurer, Frank; Walker, Robert J.
Keywords: Computer Science
Issue Date: 23-Apr-2014
Abstract: Test-driven reuse (TDR) proposes to find reusable source code through the provision of test cases describing the functionality of interest to a developer. The vocabulary and design of the interface of the function under test are used as the basis for selecting potentially relevant candidate functions to be tested. This approach requires that the searcher know, or correctly guess, the solution’s interface vocabulary and design. However, semantic similarity neither implies nor is implied by syntactic or structural similarity. According to empirical studies, behaviourally similar code of independent origin can be syntactically very dissimilar. We believe test cases that exercise a function provide additional facts for describing its semantics. Therefore, the thesis of this dissertation is that modelling tests, in addition to function interfaces, improves the odds of finding semantically relevant source code. Additionally, to facilitate similarity comparisons and improve performance, we propose a multi-representation approach to building a reuse library. To validate these claims, we created a similarity model using lexical, structural, and data flow attributes of test cases. Four similarity heuristics use this model to independently find relevant test cases that exercise similar functionality. We developed a proof-of-concept TDR tool, called Reviver, that, given a new set of test cases, finds existing test cases exercising similar functions. Using Reviver, a developer writing tests for new functionality can reuse or learn from similar functions developed in the past. We evaluated the effectiveness of Reviver in a controlled study using tasks and their manually generated approximations. We experimented with different configurations of Reviver and found that, overall, the combination of lexical and data flow similarity heuristics is more likely to find an existing implementation of the function under test.
Our results confirm that lexical, structural, and data flow facts in the test cases exercising a function, in addition to the interface of the function under test, improve the selection of semantically relevant functions.
URI: http://hdl.handle.net/11023/1428
Appears in Collections: Electronic Theses

Files in This Item:
File: ucalgary_2014_nurolahzade_mehrdad.pdf
Size: 2.15 MB
Format: Adobe PDF


Items in The Vault are protected by copyright, with all rights reserved, unless otherwise indicated.