Eötvös Loránd University



Abstracts and materials




Nikolay Handzhiyski (University of Plovdiv, PhD student): Tunnel Parsing

Tunnel parsing is a new, efficient algorithm for parsing and for the construction of concrete syntax trees from context-free grammars without left recursion and without rules that recognize the empty word. By following the grammar topology, the different states that the parsing machine can be in are precomputed in static memory. Also precomputed are all groups of steps (tunnels) that the parsing machine can use to transition from one state to another. All precomputed data is read-only and can be used by more than one parsing machine at a time. The amount of dynamic memory used during parsing does not depend on the determinism of the grammar. Tunnel parsing is fully iterative, makes no in-depth use of the thread's dedicated stack, and uses the supplied grammar without changing it.
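
The flavour of the approach can be suggested with a toy sketch. The table below is a plain DFA over a regular "grammar", not the actual tunnel-parsing data structures; it only illustrates a transition table precomputed from the grammar topology, kept read-only and shareable between parser instances, driven by a fully iterative loop.

```cpp
#include <array>
#include <string_view>

// Hypothetical sketch: a precomputed, read-only transition table,
// usable concurrently by any number of parser instances.
// For brevity the "grammar" is regular: S -> 'a'+ 'b'.
enum State : int { Start, SeenA, Accept, Reject };

constexpr std::array<std::array<State, 2>, 4> kTransitions = {{
    {{SeenA, Reject}},   // Start:  on 'a' / on 'b'
    {{SeenA, Accept}},   // SeenA
    {{Reject, Reject}},  // Accept
    {{Reject, Reject}},  // Reject
}};

// Fully iterative: no recursion, constant use of the thread stack.
bool recognize(std::string_view input) {
    State s = Start;
    for (char c : input) {
        if (c != 'a' && c != 'b') return false;
        s = kTransitions[s][c == 'a' ? 0 : 1];
    }
    return s == Accept;
}
```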

Zsolt Parragi (PhD student), Zoltán Porkoláb (Eötvös Loránd University): Scale-free Properties are Preserved in Paradigm-agnostic Coupling Metrics

Every computer program uses at least some memory for storing its own code and variables, and programs often also allocate memory dynamically, in multiple smaller parts, during their run. We introduce Typegrind, an open-source instrumentation tool for C++ focusing primarily on memory analysis. It provides a modular architecture, allowing the use of different compilers, data formats, and data-processing tools, or tight integration into applications that require it. Typegrind works by automatically analysing and transforming -- refactoring -- the source code of the program during compilation with the Clang API, adding logging calls to everything that could be related to memory allocation. The entire process is customizable, allowing the developer to adapt it to the specific problem.
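
To illustrate the kind of source-level rewrite such a pass performs, here is a hand-written sketch; the wrapper name, signature, and log format are invented for this example and are not Typegrind's actual output.

```cpp
#include <cstdio>
#include <typeinfo>
#include <utility>

// Hypothetical logging wrapper that an instrumentation pass might
// substitute for a `new` expression (names invented for illustration).
template <typename T, typename... Args>
T* logged_new(const char* file, int line, Args&&... args) {
    T* p = new T(std::forward<Args>(args)...);
    std::printf("[alloc] %s:%d type=%s size=%zu\n",
                file, line, typeid(T).name(), sizeof(T));
    return p;
}

struct Point { double x, y; };

// Original source:            Point* p = new Point{1.0, 2.0};
// After the refactoring pass, roughly:
//   Point* p = logged_new<Point>(__FILE__, __LINE__, Point{1.0, 2.0});
```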

Ia Mgvdliashvili (BSc student), Zoltán Porkoláb (Eötvös Loránd University): A Fuzzing-Based Method for Test Case Generation

Dynamic analysis, or fuzzing, is a popular and effective method of finding vulnerabilities in software. In the presentation we discuss the difference between generation-based and mutation-based fuzzing and the advantages of each. Several fuzzing tools exist; we will concentrate on libFuzzer from LLVM, an in-process, coverage-guided, evolutionary fuzzing engine. We will demonstrate its capabilities: for example, the famous Heartbleed bug that took years to discover can be found in less than a minute. Test-Driven Development (TDD) is a widely accepted software development process, originally suggested by Kent Beck, that turns requirements into specific test cases and implements them in short, repeated cycles. Development methods such as extreme programming apply TDD heavily. A key component of TDD is maximizing the code coverage reached by creating various test cases, implemented as different sequences of interface function calls on the unit under test. However, specifying all the necessary scenarios is hard. Given that the space of possible sequences is effectively infinite, it is not trivial for the developer to find meaningful cases while keeping the number of unit tests minimal. Ideas from fuzzing can be applied to tackle this problem, and possible solutions can be implemented using the SanitizerCoverage library. However, there are key differences between testing a single endpoint of the interface and exploring combinations of member function calls. In this paper we discuss these differences and how to deal with them.
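
For readers unfamiliar with libFuzzer, a minimal fuzz target looks as follows. The entry point is libFuzzer's real interface; the "bug" it finds here is a toy stand-in of our own invention.

```cpp
#include <cstddef>
#include <cstdint>

// Build (assuming Clang with fuzzer support):
//   clang++ -g -fsanitize=fuzzer,address harness.cpp
static void parse(const uint8_t* data, std::size_t size) {
    // Crashes only on inputs starting with "FUZZ"; the coverage-guided
    // mutator discovers this prefix one byte at a time.
    if (size >= 4 && data[0] == 'F' && data[1] == 'U' &&
        data[2] == 'Z' && data[3] == 'Z')
        __builtin_trap();  // stand-in for a real memory-safety bug
}

// libFuzzer calls this once per generated input.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, std::size_t size) {
    parse(data, size);
    return 0;  // 0 = input processed; other values are reserved
}
```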

Simon Baars (University of Amsterdam, MSc student): Towards Automated Merging of Code Clones in Object-Oriented Programming Languages

Duplication in source code can have a major negative impact on maintainability, as it creates implicit dependencies between fragments of code, and such implicit dependencies often cause bugs. In this study, we look into opportunities to automatically refactor these duplication problems in object-oriented programming languages. We propose a method to detect clones that are suitable for refactoring. This method focuses on the context and scope of clones, ensuring that our refactoring improves the design and does not introduce side effects. Our intermediate results indicate that more than half of the duplicated fragments are related to each other through inheritance, making it easier to refactor these clones cleanly. Approximately ten percent of the duplication can be refactored through method extraction without extra considerations; other clones require different refactoring techniques or further transformations. Similar future measurements will provide further insight into the contexts in which clones occur and how these affect the automated refactoring process. Finally, we strive to construct a tool that automatically applies refactorings for a large part of the detected duplication problems.
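
As a toy illustration of the method-extraction refactoring referred to above (names and logic invented for the example), two clones of the same loop collapse into one shared helper:

```cpp
#include <vector>

// Extracted method: this loop previously appeared, duplicated, in both
// callers below; the refactoring replaces each clone with a call.
int clamped_sum(const std::vector<int>& values, int cap) {
    int total = 0;
    for (int v : values) {
        total += v;
        if (total > cap) return cap;  // formerly duplicated logic
    }
    return total;
}

int invoice_total(const std::vector<int>& items) { return clamped_sum(items, 1000); }
int cart_total(const std::vector<int>& items)    { return clamped_sum(items, 500); }
```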

Bart Zuilhof (University of Amsterdam, MSc student): Code Quality Metrics for the Functional Side of the Object-Oriented Language C#

With the evolution of object-oriented languages such as C#, new code constructs originating from the functional programming paradigm are being introduced. Our hypothesis is that a relationship exists between the usage of these constructs and error-proneness. The measures defined for this study focus on functional programming constructs in which object-oriented features are used; this often affects the purity of the code. Building on these measures, we define a metric that relates the usage of the measured constructs to error-proneness. To validate the metric, and thereby test our hypothesis, we apply the methodology presented by Briand et al. for the empirical validation of code metrics. The results of this research are expected to grant new insights into the evolution of software systems and of programming languages regarding the usage of constructs from the functional programming paradigm in object-oriented languages.

Gonçalo Lopes (University of Coimbra, MSc student): Analysis of Energy Efficiency of Matrix Transposition Algorithms

Energy consumption is becoming a serious concern in the context of software development. A recent theoretical model has shown that the energy consumption of an algorithm depends not only on its running time but also on the number of memory accesses. However, algorithms that can take advantage of this model have not been thoroughly validated experimentally. In this work, we empirically analyse several matrix transposition algorithms with different patterns of low-level cache access, and compare them in terms of energy consumption, CPU time, number of cache misses at each level, total number of memory accesses, CPU cycles, and number of instructions, for different matrix sizes. Our results suggest that running time has a significant impact on the energy consumed and that some memory access patterns affect the running time quite strongly.
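
Two of the access patterns such an experiment might compare can be sketched as follows: a naive transpose with column-strided writes, and a cache-blocked (tiled) variant. The tile size here is an assumption; a real study would tune it to the measured cache levels.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Matrix = std::vector<double>;  // row-major n x n

void transpose_naive(const Matrix& a, Matrix& b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            b[j * n + i] = a[i * n + j];  // strided writes: cache-hostile
}

void transpose_blocked(const Matrix& a, Matrix& b, std::size_t n,
                       std::size_t tile = 32) {
    for (std::size_t ii = 0; ii < n; ii += tile)
        for (std::size_t jj = 0; jj < n; jj += tile)
            for (std::size_t i = ii; i < std::min(ii + tile, n); ++i)
                for (std::size_t j = jj; j < std::min(jj + tile, n); ++j)
                    b[j * n + i] = a[i * n + j];  // stays within one tile
}
```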

Norbert Luksa (Eötvös Loránd University, MSc student): Parallelisation of Haskell Programs by Refactoring

We propose a refactoring tool for the Haskell programming language, capable of introducing parallelism into code with reduced effort from the programmer. Haskell has many ways to express concurrency and parallelism. Moreover, Eden, a dialect of Haskell, supports a wide range of features for parallel and distributed computation. After comparing a number of possibilities, we have found that the Eval Monad, the Par Monad, and the Eden language provide similar parallel performance. Our tool introduces parallelism by turning certain syntactic forms into applications of algorithmic skeletons, which are implemented with the Eval Monad, the Par Monad, and the Eden language.
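
The tool itself emits Haskell against the Eval Monad, the Par Monad, or Eden; as a language-neutral illustration of the kind of algorithmic skeleton such rewritings target, a parallel map can be sketched like this:

```cpp
#include <future>
#include <vector>

// Sketch of the parMap skeleton: apply f to every element, each
// application running as its own asynchronous task.
template <typename T, typename F>
auto par_map(const std::vector<T>& xs, F f) -> std::vector<decltype(f(xs[0]))> {
    std::vector<std::future<decltype(f(xs[0]))>> pending;
    pending.reserve(xs.size());
    for (const T& x : xs)                 // launch one task per element
        pending.push_back(std::async(std::launch::async, f, x));
    std::vector<decltype(f(xs[0]))> results;
    results.reserve(xs.size());
    for (auto& p : pending)               // collect results in order
        results.push_back(p.get());
    return results;
}
```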

Pedro Henrique Villar de Figueirêdo (Eötvös Loránd University, BSc student): Octree-based Approach for Real-time Visualization of Surfaces Defined by Signed Distance Fields

The representation and visualization of various objects are central to computer graphics and its subfields. Traditionally, real-time computer graphics uses triangle-list-based representations. Implicitly defined surfaces provide greater flexibility in modelling, but their performance falls behind the industry-standard direct approach. We present an efficient framework for the implicit visualization of originally triangle-based representations using signed distance functions, which utilizes the GPU and an optimized octree data structure. For efficient rendering, the octree is accessed in a massively parallel fashion on the GPU, allowing real-time visualization. Our framework also provides for incremental and modular iteration, contributing towards a versatile rendering engine in the future.
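
The per-ray building blocks such a renderer evaluates can be illustrated with a minimal signed distance field and one sphere-tracing step (the octree acceleration and GPU parallelism are beyond this sketch):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

float length(Vec3 v) { return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z); }

// Signed distance to a sphere centred at the origin:
// negative inside, zero on the surface, positive outside.
float sphere_sdf(Vec3 p, float radius) { return length(p) - radius; }

// One sphere-tracing step: the SDF value bounds how far the ray can
// safely advance without crossing the surface.
Vec3 march(Vec3 origin, Vec3 dir, float dist) {
    return {origin.x + dir.x * dist, origin.y + dir.y * dist,
            origin.z + dir.z * dist};
}
```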