
An Automated Approach to Extracting Local Variables

KEYWORDS

Software Refactoring, Extract Local Variable, Reliable, Bugs

ACM Reference Format:

Xiaye Chi, Hui Liu, Guangjie Li, Weixiao Wang, Yunni Xia, Yanjie Jiang, Yuxia Zhang, and Weixing Ji. 2023. An Automated Approach to Extracting Local Variables. In Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’23), December 3–9, 2023, San Francisco, CA, USA. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3611643.3616261

1 INTRODUCTION

The term "software refactoring" was coined by Opdyke [29], referring to the object-oriented variant of restructuring [1]. In general, software refactoring could be defined as "the process of changing a [object-oriented] software system in such a way that it does not alter the external behavior of the code, yet improves its internal structure" [11]. Recently, software refactoring has been well studied as an efficient way to improve software quality [6, 20] as well as an effective way to facilitate software maintenance and evolution [21, 50]. Refactoring tools like JDeodorant [43], ReSharper [32], and built-in refactoring engines in IDEs (including Eclipse [7], IntelliJ IDEA [16], NetBeans [27], and Visual Studio [49]) have been widely used to facilitate software refactoring.

Extract local variable (or extract variable for short) is one of the most popular refactorings. Notably, dozens of software refactorings have been proposed, ranging from low-level refactorings like renaming variable to high-level refactorings like teasing apart inheritance [11]. By tracking the refactoring histories of programmers, Murphy-Hill et al. [25] found that extract variable was the second most popular software refactoring. Extract local variable is to create a local variable, initialize it with a selected expression, and replace one or more occurrences of the expression with direct access to the new variable. The benefits of the refactoring are twofold [10]. On one side, replacing complex expressions with a named variable may increase the readability of the program because variable names are often more readable than complex expressions. On the other side, replacing multiple occurrences of the same expression with simple accesses to a variable may avoid repetitive computation and thus reduce code complexity.
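As a minimal illustration of the refactoring just described, the following sketch shows a method before and after extracting a repeated expression. The class, method, and variable names are hypothetical and do not come from the paper.

```java
// Hypothetical illustration; none of these names appear in the paper.
class ExtractVariableDemo {
    // Before: the expression name.trim().length() occurs twice.
    static String before(String name) {
        if (name.trim().length() > 10) {
            return "long:" + name.trim().length();
        }
        return "short";
    }

    // After: a new local variable is initialized with the expression,
    // and both occurrences are replaced with accesses to the variable.
    static String after(String name) {
        int trimmedLength = name.trim().length();
        if (trimmedLength > 10) {
            return "long:" + trimmedLength;
        }
        return "short";
    }

    public static void main(String[] args) {
        String input = "  refactoring tools  ";
        System.out.println(before(input));
        System.out.println(after(input));
    }
}
```

Here the extraction is safe because `name` is not modified between the occurrences and `String.trim()` and `String.length()` have no side effects, so both versions return the same result for every non-null input.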

Manually extracting local variables could be tedious, time-consuming, and error-prone. It is well known that changing the source code of complex software systems could be risky [37]. The same is true for software refactoring [3]. To this end, mainstream IDEs have provided automated tool support for this refactoring. According to the survey by Golubev et al. [15], 54.7% of the surveyed developers use the IDE support to conduct Extract refactorings, e.g., extract variable. The empirical study conducted by Negara et al. [26] suggests that extract variable is frequently performed with automated tool support: developers perform over 80% of extract variable refactorings with such support. However, such refactoring tools often simply replace all expressions that are lexically identical to the selected one without in-depth analysis of the safety of the refactoring. As a result, even with such tool support, extracting local variables could be error-prone, resulting in exceptions and semantic errors. In Section 2, we explain with motivating examples why the state-of-the-practice refactoring tools may introduce semantic errors while extracting local variables.

To this end, in this paper, we propose a novel and more reliable approach, called ValExtractor, to conduct extract variable refactorings automatically. The major challenge of automated extract local variable refactorings is how to efficiently and accurately identify the side effect of the extracted expressions and the potential interaction between the extracted expressions and their contexts (i.e., statements around them) without time-consuming dynamic execution of the involved programs. To resolve this challenge, ValExtractor leverages a lightweight static source code analysis to validate the side effect of the selected expression, and to identify which occurrences of the selected expression could be extracted together without changing the semantics of the program or introducing potential new exceptions. We evaluated the proposed approach on open-source applications by applying it and the baseline approaches (Eclipse and IntelliJ IDEA) to extract expressions in such applications. Our evaluation results suggested that the state-of-the-practice baselines did result in hundreds of semantic errors while conducting extract variable refactorings. Our approach, however, successfully avoided all such errors. Besides that, we also evaluated the proposed approach and Eclipse with 253 real-world extract variable refactorings discovered from 10 open-source applications. Our evaluation results suggested that Eclipse resulted in semantic errors in 19 out of the 253 cases, and another mainstream IDE, IntelliJ IDEA, resulted in semantic errors in all of these 19 cases as well. In contrast, our approach succeeded in conducting all such refactorings without introducing any semantic errors.

The paper makes the following contributions:

• An automated and more reliable approach to extracting local variables in Java applications.

• A benchmark consisting of 253 real-world extract local variable refactorings.

• An evaluation of the proposed approach on the benchmark, whose replication package, including detailed instruction for replication, is publicly available [31].

Listing 1: Motivating Example

2 MOTIVATING EXAMPLE

In this section, we explain with a motivating example the potential risks in extract variable refactorings, and how we minimize such risks. The motivating example is presented in Listing 1. Suppose that a developer realizes that there are many instances of expression "pattern.length()" in the motivating example (as shown in colors), and would like to replace such instances with a local variable. To this end, the developer selects the expression "pattern.length()" at Line 6 within Eclipse, right-clicks it, and selects menu item "refactoring - extract local variable" as well as the checkbox "replace all occurrences of the selected expression with references to the local variable". As a response to the command, Eclipse invokes JDT [8] to conduct the extract variable refactoring. The resulting source code is presented in Listing 2. Notably, conducting the same refactoring with IntelliJ IDEA would result in the same code.

The refactorings conducted automatically by Eclipse and IDEA are questionable. By comparing the code before and after the refactoring, we notice that Eclipse and IDEA declare a new variable (length) at Line 3 in Listing 2 and initialize it with the extracted expression "pattern.length()". They also replace all four instances of the expression with the newly added variable length at Lines 4, 7, 11, and 14, respectively. However, the replacement is incorrect and it results in serious bugs that change the semantics of the enclosing software application. First, the newly added declaration at Line 3 is questionable. In case the input parameter pattern equals null, the declaration would result in a null pointer exception. In contrast, the source code before refactoring can avoid the exception because it carefully checks whether the pattern equals null (Line 3 of Listing 1) before the variable is used to access any of its properties. Second, replacing the expression "pattern.length()" with variable length at Line 14 of Listing 2 is incorrect. The variable pattern has been updated at Line 12. Consequently, at Line 14 the variable length (initialized at Line 3) is not equivalent to the original expression "pattern.length()", and the replacement may result in fewer iterations at Line 14.
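Both problems can be reproduced in a small, self-contained sketch. The code below is a hypothetical analog written for illustration only; it is not the paper's Listing 1 or Listing 2, and the method names and the way pattern is updated are assumptions.

```java
// Hypothetical analog of the two bugs; NOT the paper's Listing 1 or Listing 2.
class UnsafeExtractionDemo {
    // Original code: the null check guards every use of pattern,
    // and pattern is reassigned before the loop reads its length again.
    static int original(String pattern) {
        if (pattern == null || pattern.length() == 0) {
            return -1;
        }
        int total = pattern.length();
        pattern = pattern + "*";                      // pattern is updated here
        for (int i = 0; i < pattern.length(); i++) {
            total++;
        }
        return total;
    }

    // Result of naively replacing ALL lexically identical occurrences.
    static int naivelyRefactored(String pattern) {
        int length = pattern.length();                // bug 1: NullPointerException when pattern is null
        if (pattern == null || length == 0) {
            return -1;
        }
        int total = length;
        pattern = pattern + "*";
        for (int i = 0; i < length; i++) {            // bug 2: stale value, one iteration fewer
            total++;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(original("ab"));           // 2 + 3 iterations
        System.out.println(naivelyRefactored("ab"));  // 2 + 2 iterations
        System.out.println(original(null));           // guarded, no exception
        try {
            naivelyRefactored(null);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException introduced by the refactoring");
        }
    }
}
```

In this sketch the original method returns 5 for "ab" while the naively refactored one returns 4, and only the refactored version throws for a null argument.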

To avoid such errors, in this paper, we propose an automated approach ValExtractor to conduct extract variable refactorings. It successfully conducts the refactoring as shown in Listing 3 and avoids all bugs introduced by the state-of-the-practice IDEs (i.e., Eclipse and IDEA). ValExtractor works as follows. First, given the selected expression "pattern.length()" at Line 6 in Listing 1, ValExtractor infers that the new variable declaration should be added between Line 5 and Line 6 in Listing 1 if the selected expression alone should be extracted as a new variable. After that, ValExtractor validates that the variable and the expression (to be replaced with access to the variable) are equivalent at Line 6, and the expression itself does not have any side effect. Consequently, the selected expression (at Line 6) is extractable, and it is added as an extractable expression.

Listing 2: After Refactoring (by Eclipse or IDEA)

Listing 3: After Refactoring (by ValExtractor)

ValExtractor keeps finding more extractable expressions that could be extracted together with the selected expression. To this end, it turns to the expression "pattern.length()" at Line 10 of Listing 1 because it is the closest expression to the selected expression at Line 6 and it is literally identical to the selected expression. It repeats the inference in suggesting where the new variable should be declared as well as the validation of potential side effect as introduced in the preceding paragraph. This time, ValExtractor suggests that the new variable could be declared between Line 5 and Line 6 in Listing 1 and that replacing both of the expressions (at Lines 6 and 10) is safe. Consequently, the expression at Line 10 is also added as an extractable expression.

ValExtractor comes to the next expression "pattern.length()" at Line 13 of Listing 1. While validating the side effect of the expressions between Line 10 and Line 13, ValExtractor finds that the statement at Line 11 has side effect on the selected expression (i.e., it may change the value of the expression "pattern.length()"). As a result, executing the same expression appearing before and after Line 11 may result in different values, and thus we cannot extract the expressions at Line 6 (before Line 11) and Line 13 (after Line 11) together. To this end, ValExtractor discards the expression at Line 13 as well as other expressions beyond it.

Figure 1: Overview of ValExtractor

Finally, it reverses the searching direction, and turns to the expression "pattern.length()" at Line 3. It infers that the new variable should be declared and initialized before Line 3 in Listing 1. However, the initialization of the new variable with the expression "pattern.length()" before Line 3 may result in a null pointer exception (when pattern equals null) that may not happen before the refactoring. Consequently, ValExtractor discards this expression as well as other expressions before it (if any).

As a result of the preceding static analysis, ValExtractor extracts two extractable expressions at Lines 6 and 10 of Listing 1, avoiding all bugs introduced by Eclipse JDT.
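The walk-through above can be approximated by a very small model, sketched below under strong simplifying assumptions. Occurrences are reduced to line numbers, the lines that may change the value of the expression are given as a set, and a single hypothetical parameter firstSafeLine stands for the earliest line at which the initializer can run without risking a new exception. None of these names or data structures come from ValExtractor itself, which works on source code through static analysis.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of the search; not ValExtractor's actual implementation.
class GreedySearchSketch {
    /**
     * @param occurrences   lines containing expressions identical to the selected one, ascending
     * @param selected      line of the selected expression (must be contained in occurrences)
     * @param updateLines   lines that may change the value of the expression
     * @param firstSafeLine earliest line at which evaluating the expression cannot raise a new exception
     * @return lines of the occurrences that can be extracted together, ascending
     */
    static List<Integer> extractable(List<Integer> occurrences, int selected,
                                     Set<Integer> updateLines, int firstSafeLine) {
        int start = occurrences.indexOf(selected);
        if (start < 0) {
            throw new IllegalArgumentException("selected line is not an occurrence");
        }
        List<Integer> result = new ArrayList<>();
        result.add(selected);

        // Forward search: stop at the first occurrence separated from the selection by an update.
        for (int i = start + 1; i < occurrences.size(); i++) {
            int line = occurrences.get(i);
            if (hasUpdateBetween(updateLines, selected, line)) {
                break;
            }
            result.add(line);
        }
        // Backward search: stop when the declaration would have to move to an unsafe position.
        for (int i = start - 1; i >= 0; i--) {
            int line = occurrences.get(i);
            if (line < firstSafeLine || hasUpdateBetween(updateLines, line, selected)) {
                break;
            }
            result.add(line);
        }
        Collections.sort(result);
        return result;
    }

    // True if some update line lies strictly between the two given lines.
    static boolean hasUpdateBetween(Set<Integer> updateLines, int from, int to) {
        for (int update : updateLines) {
            if (update > from && update < to) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Line numbers taken from the walk-through: occurrences at 3, 6, 10, 13; update at 11.
        // firstSafeLine = 4 is an assumption encoding that the null check sits on Line 3.
        System.out.println(extractable(List.of(3, 6, 10, 13), 6, Set.of(11), 4));
    }
}
```

With the line numbers of the walk-through, the model selects the occurrences at Lines 6 and 10 and discards those at Lines 3 and 13, matching the outcome described in the text.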

3 APPROACH

3.1 Overview

An overview of the proposed approach (ValExtractor) is presented in Fig. 1. It takes as input a selected expression and its enclosing project. With such input, ValExtractor validates whether the selected expression has side effect and validates iteratively whether other literally identical expressions within the same method could be extracted together. Overall, ValExtractor works as follows:

• Expression validation: It adds the selected expression as a candidate expression, and validates whether the selected expression has side effect. If yes, it skips the next step.

• Retrieval of candidate expressions: It retrieves all expressions within the enclosing method that are literally identical to the selected expression, taking them as candidate expressions.

• Search for extractable expressions: It takes a greedy strategy to search for candidate expressions that could be extracted together with the selected one (called extractable expressions), and suggests where the new variable should be declared. If none of the candidate expressions could be extracted, ValExtractor terminates and no refactoring would be conducted. Otherwise, ValExtractor turns to the next step.

• Refactoring: Finally, ValExtractor conducts extract variable refactoring by declaring and initializing a new variable and replacing all of the extractable expressions with accesses to the variable.

Details of the key steps are presented in the following sections, and the full list of preconditions when a set of literally identical expressions could be extracted by an extract variable refactoring is presented as an online appendix [12].

3.2 Expression Validation

The validation of the selected expression is composed of two parts. The first part validates whether the selected expression is suitable for extraction. Not all expressions could be extracted as local variables. For example, "this.length" in assignment "this.length = 5", "ArrayList()" in statement "list = new ArrayList()", and "id.isEmpty()" in statement "st.id.isEmpty()" cannot be extracted as variables. ValExtractor terminates (i.e., refuses to conduct the refactoring) if the selected expression is one of the following expressions: parameters, left values, declarations, single null literal, expressions in annotations, incomplete expressions, void expressions, enumeration expressions in switch cases, expressions used in initializer or updater of for statements, name properties, and expressions outside methods.

In the second part, ValExtractor validates whether the selected expression has side effect. An expression has a side effect if executing the same expression (one or more times) is not semantically equivalent to a single execution of the expression. For example, the expression stack.pop() has side effect because repeating it n times may remove n additional elements from the stack. Consequently, the following code

    Print(stack.pop());
    Print(stack.pop());

is not equivalent to the following code:

    value = stack.pop();
    Print(value);
    Print(value);

If the selected expression has side effect, we cannot extract it together with other expressions that are literally identical to it. In this case, the selected expression is taken as the only candidate expression, i.e., ValExtractor will extract no more than one expression.
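The non-equivalence can be checked with a runnable version of the stack example. This is a minimal sketch that uses java.util.ArrayDeque as the stack; the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical demo of why an expression with side effect cannot be shared.
class SideEffectDemo {
    // Original code: two separate calls remove two elements.
    static int sumOfTwoPops(Deque<Integer> stack) {
        return stack.pop() + stack.pop();
    }

    // After (incorrectly) extracting stack.pop() into a variable:
    // only one element is removed and its value is used twice.
    static int sumAfterExtraction(Deque<Integer> stack) {
        int value = stack.pop();
        return value + value;
    }

    // Builds a stack whose top element is 3, followed by 2 and 1.
    static Deque<Integer> newStack() {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        return stack;
    }

    public static void main(String[] args) {
        Deque<Integer> a = newStack();
        Deque<Integer> b = newStack();
        System.out.println(sumOfTwoPops(a) + ", remaining " + a.size());
        System.out.println(sumAfterExtraction(b) + ", remaining " + b.size());
    }
}
```

The two methods differ both in the value they return and in the state they leave behind, which is why such an expression is kept as the only candidate.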

Listing 4: Expressions Updating Empty Fields Only

ValExtractor validates the side effect of the selected expression by checking whether the expression has updated states of the system, generated outputs, or consumed system inputs. An expression has updated the states of the system if and only if the expression (including methods called directly or indirectly by it) has updated any software entities whose lifetime is beyond the execution of the selected expression. ValExtractor identifies generation of outputs and consumption of system inputs by comparing the executed statements against a list of manually marked Java input/output APIs. If any of the marked APIs is executed directly or indirectly by the selected expression, it has side effect.
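The input/output part of this check amounts to a reachability question over method calls. The sketch below illustrates that idea on a toy call graph given as a map from method names to callees. The graph representation, the method names, and the single marked API are all assumptions made for illustration; the paper's actual list of marked APIs and its analysis are not reproduced here, and the check for state updates is not modeled.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: does a method reach a marked I/O API directly or indirectly?
class MarkedApiCheckSketch {
    static boolean reachesMarkedApi(String entry,
                                    Map<String, List<String>> callGraph,
                                    Set<String> markedApis) {
        Deque<String> worklist = new ArrayDeque<>();
        Set<String> visited = new HashSet<>();
        worklist.push(entry);
        while (!worklist.isEmpty()) {
            String method = worklist.pop();
            if (!visited.add(method)) {
                continue;                       // already analyzed; also guards against recursion
            }
            if (markedApis.contains(method)) {
                return true;                    // a marked input/output API is executed
            }
            for (String callee : callGraph.getOrDefault(method, List.of())) {
                worklist.push(callee);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Toy call graph; all method names are invented for the example.
        Map<String, List<String>> callGraph = Map.of(
                "Report.title", List.of("Report.format"),
                "Report.format", List.of("java.io.PrintStream.println"),
                "Point.getX", List.of());
        Set<String> markedApis = Set.of("java.io.PrintStream.println");

        System.out.println(reachesMarkedApi("Report.title", callGraph, markedApis));
        System.out.println(reachesMarkedApi("Point.getX", callGraph, markedApis));
    }
}
```

In this toy graph, "Report.title" reaches the marked API through "Report.format" and would be treated as having side effect, whereas "Point.getX" would not.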

An exception to the preceding rules is that we allow the selected expression to initialize fields that are initially null. Listing 4 presents a typical example of such initialization. The selected expression "ConverterManager.getInstance()" is to retrieve the static field manager of class ConverterManager. However, if the field equals null (i.e., it has not yet been initialized), the expression would initialize it with a brand new object (Line 8). Consequently, although the selected expression has the possibility to update the field manager, repeating the expression multiple times is semantically equivalent to a single execution of the same expression.
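The pattern behind this exception is lazy initialization. The following sketch is a hypothetical, single-threaded stand-in for the kind of code described above; it is not the paper's Listing 4, and the class name LazyManager is invented.

```java
// Hypothetical stand-in for a lazily initialized singleton; NOT the paper's Listing 4.
class LazyManager {
    private static LazyManager manager;       // initially null

    private LazyManager() {
    }

    // Updates the field only on the first call; later calls just read it.
    static LazyManager getInstance() {
        if (manager == null) {
            manager = new LazyManager();
        }
        return manager;
    }
}

class LazyInitDemo {
    // Original style: the expression is evaluated twice.
    static boolean repeatedCallsAgree() {
        return LazyManager.getInstance() == LazyManager.getInstance();
    }

    // After extraction: the expression is evaluated once and the variable is reused.
    static boolean extractedVariableAgrees() {
        LazyManager instance = LazyManager.getInstance();
        return instance == LazyManager.getInstance();
    }

    public static void main(String[] args) {
        System.out.println(repeatedCallsAgree());
        System.out.println(extractedVariableAgrees());
    }
}
```

Because every call returns the same object once the field is set, replacing repeated calls with a single variable does not change what the callers observe.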

3.3 Retrieval of Candidate Expressions

First, ValExtractor automatically infers the scope of the selected expression, noted as ScopeExp. The scope of the expression specifies where the expression is syntactically accessible. Consequently, the scope is the intersection of the scopes of all elements involved in the expression. For example, the scope of the expression "list.add(item)" is the intersection of the scope of the variable "list" and the scope of the parameter "item".
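If each scope is approximated as a contiguous range of source lines, the intersection is a simple computation, as the sketch below shows. Representing scopes as line ranges is an assumption made for illustration (the paper does not state how ScopeExp is represented), and the class names and line numbers are hypothetical.

```java
// Hypothetical sketch: scopes approximated as inclusive line ranges.
class ScopeSketch {
    static final class Range {
        final int startLine;
        final int endLine;

        Range(int startLine, int endLine) {
            this.startLine = startLine;
            this.endLine = endLine;
        }

        // The overlap of two ranges; empty when the ranges do not overlap.
        Range intersect(Range other) {
            return new Range(Math.max(startLine, other.startLine),
                             Math.min(endLine, other.endLine));
        }

        boolean isEmpty() {
            return startLine > endLine;
        }
    }

    // Scope of an expression = intersection of the scopes of all its elements.
    static Range scopeOfExpression(Range... elementScopes) {
        if (elementScopes.length == 0) {
            throw new IllegalArgumentException("at least one element scope is required");
        }
        Range result = elementScopes[0];
        for (int i = 1; i < elementScopes.length; i++) {
            result = result.intersect(elementScopes[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        Range listScope = new Range(5, 40);   // local variable "list", declared at line 5 (assumed)
        Range itemScope = new Range(3, 40);   // parameter "item", visible in the whole method body (assumed)
        Range scopeExp = scopeOfExpression(listScope, itemScope);
        System.out.println(scopeExp.startLine + ".." + scopeExp.endLine);
    }
}
```

For "list.add(item)" with these assumed ranges, the resulting scope starts where the later-declared element, "list", becomes visible.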

To retrieve candidate expressions that may be extracted together with the selected expression, ValExtractor searches for all expressions within ScopeExp that are lexically identical to the selected expression. For each of the retrieved expressions, ValExtractor also validates whether it is suitable for extraction in the same way as it validates the selected expression in Section 3.2. All expressions passing the validation are added as candidate expressions.




