From: Wim Vanhoof <wva@info.fundp.ac.be>
(apologies for multiple postings)
The University of Namur, Belgium, is seeking a young researcher
(no PhD) for a 1 year research project entitled
"Automatic program analysis for refactoring logic programs".
A short description follows. Interested candidates should send their CV
before September 15, 2008 to Wim Vanhoof (wva@info.fundp.ac.be).
Program refactoring [3] is the process of systematically changing the
structure of a
program without changing its semantics. The goal of refactoring is to
improve the design
of the code after it has been written, in order to facilitate
maintenance (including further
development) of the software. Emerged from the OO and XP communities,
program
refactoring has recently gained attention in the fields of functional
[8] and logic
programming [11]. Within the software engineering community, the process of
refactoring is considered important and has been identified as central
to software
development and maintenance [3].
At the basis of the refactoring process is a catalogue of available
source-to-source
transformations - the so-called refactorings. For each refactoring, a
set of conditions is
specified under which the transformation is correct in the sense that it
preserves the
semantics of the program. The activity of refactoring consists then in
repeatedly searching
through the source code, looking for a code fragment of which the design
could be
improved by a particular refactoring from the catalogue. The refactoring
is subsequently
applied, and the whole process is repeated. Although each transformation
can have an
impact (positive or negative) on the performance of the program, the
primary aim of each
transformation is to improve the readability and maintainability of the
code.
Even if refactoring is basically a manual process that is performed by
the programmer, the
need for automation is recognized due to the time-consuming and
error-prone nature of
the refactoring activity [3, 8, 10]. Although tools do exist that aid
the developer with
performing a particular refactoring on a selected fragment of source
code - an example
being the Refactoring Browser that was developed for Smalltalk [10] -
identifying where
to perform (a particular) refactoring in the code remains essentially a
non-trivial, creative
and therefore manual process.
Although a substantial number of different refactorings has been
identified in the
literature, the primary indication for where to perform refactoring
seems to be the
presence of duplicated code [3]. The notion of duplicated code is not
limited to literally
copied code, but refers to fragments of code that have the same
input/output behaviour.
The goal of refactoring is then to transform the source code in such a
way that the
duplication is removed. This generally requires to extract the
duplicated input/output
behaviour into a new subroutine (be it a method, function or predicate),
which may
require a generalisation of the concerned code fragments and a possible
reorganisation of
the code as a whole [3, 11].
Code duplication can be caused by a number of reasons. First of all,
unfamiliarity of the
developer with the existing code body may result in reimplementation of
routines that
already exist in the application under development. Second, the “copy
and paste”
technique is commonly used when the existing functionality has to be
slightly adapted.
Even if in this case one usually does not end up literally duplicating
input/output
behaviour, the changes introduced by adaptation are usually relatively
minor and a
generalization of the original and the adapted fragments could be often
proposed. Finally,
code duplication might result from a polyvariant program analysis.
The problem of deciding whether two code fragments implement the same
functionality is
well-known to be undecidable. Nevertheless, automatic program analysis
techniques have
been developed, notably in the context of imperative and object-oriented
languages, that
are capable to detect such equivalence under particular circumstances
and within a certain
error margin, e.g. [1, 5, 6, 7, 9]. Also related is the work on
plagiarism detection for
programs written in such languages, e.g. [4, 14, 13].
In this project, we will investigate the automatic detection of
duplicated code in
declarative programming languages, in particular logic programming
languages such as
Prolog and Mercury. Declarative languages allow the developer to program
at a much
higher level of abstraction than it is the case with most imperative
languages. In a
declarative language, one describes properties of the desired solution
rather than the
actual algorithm that should be used to find the solution.
We aim to develop an analysis that allows to find, without user
intervention, duplicated
code fragments into a logic program and we will study if and how the
results of this
analysis can be used to drive a number of refactorings to remove the
unwanted
duplication from the program. These refactorings include predicate
extraction (replacing
duplicated code fragments by a call to a newly defined predicate), the
removal of
predicates implementing the same relation as another predicate and the
generalization of
duplicated code fragments into calls to newly generated higher-order
predicate [11, 12].
The development of such a duplicated code analysis for logic programs,
and its
integration with refactoring, presents some interesting research
opportunities. Firstly, the
declarative nature of logic programs makes it not straightforward to
adapt the methods
developed in an imperative (or object-oriented) setting. Secondly, the
fact that logic
programs have a small and formally well-defined semantics and use an
explicit symbolic
data representation makes the use of advanced analyses possible.
Therefore it might well
be possible to obtain more fine-grained results than is the case for
imperative languages.
Finally, it might be worthwhile to investigate how the developed
analysis could be used in
the context of plagiarism detection for logic programs.
References
[1] B. S. Baker. On finding duplication and near-duplication in large
software systems. In Proc. Second
IEEE Working Conference on Reverse Engineering, pages 86-95, July 1995.
[2] F. Degrave and W. Vanhoof. Towards a normal form for Mercury
programs. In A. King, ed. LOPSTR
2007, volume 4915 of Lecture notes in computer science, pages 43-58.
Springer-Verlag, 2007.
[3] M. Fowler, K. Beck, J. Brant, W. Opdyke, and D. Roberts. Refactoring
: Improving the Design of
Existing Code. Object Technology Series. Addison-Wesley, 1999.
[4] S. Horwitz. Identifying the semantic and textual differences between
two versions of a program. ACM
SIGPLAN Notices, 25(6) :234-245, 1990.
[5] T. Kamiya, S. Kusumoto, and K. Inoue. Ccfinder : A multilinguistic
token-based code clone detection
system for large scale source code. IEEE Trans. Software Eng., 28(7):
654-670, 2002.
[6] R. Komondoor and S. Horwitz. Using slicing to identify duplication
in source code. In Static Analysis
Symposium, pages 40-56, 2001.
[7] K. Kontogiannis, R. de Mori, E. Merlo, M. Galler, and M. Bernstein.
Pattern matching for clone and
concept detection. Autom. Softw. Eng., 3(1/2) :77-108, 1996.
[8] H. Li, C. Reinke, and S. Thompson. Tool support for refactoring
functional programs. In J. Jeuring,
editor, ACM SIGPLAN 2003 Haskell Workshop. ACM 2003.
[9] J. Mayrand, C. Leblanc, and E. Merlo. Experiment on the automatic
detection of function clones in a
software system using metrics. In Intl. Conf. on Software Maintenance,
pages 244-253, 1996.
[10] D. Roberts, J. Brant, and R. E. Johnson. A refactoring tool for
Smalltalk. Theory and Practice of Object
Systems (TAPOS), 3(4) :253-263, 1997.
[11] A. Serebrenik, T. Schrijvers and B. Demoen. Improving Prolog
programs: Refactoring for Prolog.
Theory and practice of logic programming (Accepted) 2008.
[12] W. Vanhoof. Searching semantically equivalent code fragments in
logic programs. In S. Etalle, editor,
LOPSTR 2004, volume 3573 of Lecture notes in computer science, pages
1-18. Springer-Verlag, 2005.
[13] J. Winstead and D. Evans. Towards differential program analysis. In
Proceedings of the 2003
Workshop on Dynamic Analysis, 2003.
[14] W. Yang. Identifying syntactic differences between two programs.
Software Practice and Experience,
21(7): 739-755, 1991.
Last updated: Jan 04 2025 at 20:18 UTC