Transcription

Comparing the Performance of String OperationsAcross Programming LanguagesUniversity of OuluFaculty of Information Technology andElectrical Engineering / InformationProcessing ScienceMaster’s ThesisNiko Pelkonen20.12.2019

2AbstractIn this thesis, the performance of string operations are compared across programminglanguages. Handling strings effectively is important especially when performance is acrucial factor and large string sizes may emerge. Common examples where large stringsizes emerge are during digitalization of a product, reading string data from a database,reading and handling large CSV-files and Excel-files, converting file format to anotherfile format (e.g. CSV to Excel and vice versa), and reading and handling a DOM-tree ofa website.There has been a lot of corresponding research where programming languages arebenchmarked, but none of them focus directly on string operations. The main goal ofthis thesis is to fill this gap in literature and try to find out which programminglanguages have the best results on string operations in terms of execution time andmemory (maximum RSS) usage.The test environment was formed by creating randomly generated string files with sizesvarying from ten thousand characters to 100 million characters. The generatedcharacters were ‘a’, ‘b’, and ‘ ‘ (whitespace character). The programming languagesselected for this thesis were Python, C, C , Java, Perl, Ruby, Go, and Swift.Go seemed to be the most effective language in execution times, although it was not thefastest in many operations. C used very little memory, but only five operations wereimplemented in it. Every operation was implemented in Python, and it used additionalmemory to loading the string file in only one operation, which was sorting a string.Swift had quite bad results, and this could be caused by the Linux version of Swift thatwas used. In regular expressions, Perl and C were overwhelmingly effective. Javaused the most memory in every operation.Keywordsprogramming languages, comparison, string operationsSupervisorsPh.D., University lecturer, Ari VesanenPh.D., University lecturer, Antti Siirtola

3ForewordI have a bad habit of taking too much things to handle at the same time, and thathappened also during this thesis, which resulted in yet another busy and stressful periodof time while writing this thesis. Nevertheless, writing this thesis has been trulyfascinating and inspiring; picking a subject of interest, creating the test environments,seeing interesting and surprising results, reporting the results thrillingly, and discussingabout the results with other people has been a privilege that I’m thankful for.I remain grateful to many people who have helped me during this journey. I’d like tothank my supervisors, University lecturers Ari Vesanen and Antti Siirtola, who gavegreat guidance throughout the process. Special thanks for my friends Marko Pulkkinen,who helped me a lot in choosing my topic, and Juho Junnila for not only helping outwith this thesis, but also for all these amusing years at the University of Oulu we’vegone through. Thank you also for my family for having had patience and understandingwith me being a lot away due to my duties.Niko PelkonenOulu, October 30, 2019

4ContentsAbstract.2Foreword.3Contents.41. Introduction.61.1 Motivation.61.2 Strings.61.3 Research questions and methods.71.4 Structure.72. Background.92.1 Execution time and memory usage comparisons across programminglanguages.92.2 Comparisons on SLOC and code quality across languages.123. Methodology.143.1 Research method.143.2 Sample strings.153.3 Programming languages.153.4 String operations.184. Results.204.1 Load string file.214.2 Concatenation.234.3 Replace.254.4 Reverse.274.5 Sort.284.6 String duplication.304.7 Find first index of substring.314.8 Uppercase.334.9 String equality.354.10 Regular expressions.374.10.1 Regex 1.384.10.2 Regex 2.394.10.3 Regex 3.404.10.4 Regex 4.414.10.5 Regex 5.425. Discussion.446. Limitations and Future Research.477. Conclusion.48References.49Appendix A. Results for all tests.53A.1 Loading string file.53A.2 Python.54A.3 C.55A.4 C .55A.5 Java.57A.6 Perl.58

5A.7 Ruby.59A.8 Go.60A.9 Swift.62Appendix B. Results for string concatenation in Ruby.64Appendix C. Execution times and memory usages as positions across languages.65Appendix D. Libraries and methods used for calculating the execution times.67Appendix E. Source codes.68

61.IntroductionThis introductory chapter is divided into four subchapters. The first subchapterdiscusses the motivation of this study. The second subchapter introduces strings andwhy they are relevant in this study. The third subchapter presents the research questionsfor this thesis and the research methods used. The last subchapter presents the structureof this thesis.1.1 MotivationThe need to handle strings of great size and the performance issues that are broughtwithin them are common matters in programming. A program should be able to handlestrings of great size, especially if performance is an important requirement. If the size ofthe strings is not considered beforehand, large strings may become a burden fordevelopers. Therefore, sometimes developers need to consider which language would bethe best option for handling large strings effectively.Programming languages have built-in string operations, but which of those methods andlanguages itself are efficient enough in terms of performance? It is also possible thatsome of the built-in methods or functions perform efficiently on small-sized strings, butas the size of the string grows, the performance drops. There hasn’t been any researchdone on the very subject; naturally, a lot of research has been done on performancecomparisons across programming languages, but string operations specifically has beenleft out so far. This research aims to fill that gap in the literature and find out whichprogramming languages tend to perform the best on varying string operations.1.2 StringsStrings in programming are typically sequences of characters. How they are declared,handled, stored in memory etc., is up to a programming language, and there arevariations across programming languages. (Busbee, 2009) There are many situations inwhich large strings may emerge, including: Reading and handling large Excel-files (there may be hundreds of thousands ofrows and tens of columns), Reading and handling large CSV-files, Converting a file format to another file format (e.g. converting Excel-file toCSV-file and vice versa), Reading string data from database, Reading and handling data from large files in order to digitize a product, and Reading and handling DOM from a website (e.g. if a website doesn’t provide anAPI (application programming interface) for getting the data it presents, and the siteincludes some data that must be used, the data must be read from the DOM of thesite).DOM (Document Object Model) is an application programming interface (API) thatrepresents the logical structure of the HTML-elements (e.g. head, body, div, tr) on a

7website as XML and specifies the website’s interface. DOM provides a standardprogramming interface which developer can use to manipulate elements and navigatethe structure. (W3C, 2000) It is presented as a tree-structure, making it easy to interpret.DOM is returned as a string (which might be massive), and string operations areperformed in it in order to get the data that is needed. For example, first thing could beto remove all other than div-elements, and from these div-elements one could search forelements with a specific id (an identifier for elements). This id could be, e.g., forelements that include a person’s name. These names of persons could be then saved into another variable, which could be, e.g., sorted alphabetically.A key point why strings need to be handled effectively is because data is often receivedas a string, and therefore it needs to be handled as a string. For example, data receivedfrom a CSV-file is a string (e.g. columns separated with a comma and rows separatedwith a semicolon), and the same goes for DOM of a website. Excel-files usually giveuser freedom to write whatever user wants, so there are usually strings in Excel-files,too. Data stored in database may also include a large string. In addition, when Excel-fileneeds to be converted into CSV-file and vice versa, the data needs to be converted and ithas to be handled as string.An example of working with a string returned from a CSV-file could be to first get datafrom each row and column. As mentioned earlier, rows and columns are often separatedwith characters like colons and semicolons. Thus, the first string operation would be tosplit the string with colons and semicolons. Then, regular expressions could be used tocheck that substring doesn’t include any illegal characters; if it does, those could beremoved or modified. Checking if substring equals a fixed value, as well asconcatenating some value to a substring, are also common string operations.1.3 Research questions and methodsThe current literature doesn’t include the exact subject of this thesis; whichprogramming languages are the most efficient on string operations. This thesis aims toanswer this question. Thus, the research questions are:1. Which programming languages have the shortest execution times on stringoperations?2. Which programming languages use the least memory on string operations?The research method used in this thesis is experimental research. There are variousdifferent types of experimental research, and benchmarking is the type of experimentalresearch used in this thesis. Benchmarking is a common approach in this kind ofresearch where different technologies are being compared, and it fits this thesis well.1.4 StructureThis thesis has seven chapters, introduction included. Chapter 2 presents the currentliterature on the differences of programming languages; there has been lots of studies inwhich programming languages are being compared, but none of them focus on stringoperations only. Chapter 3 introduces the research methodologies, the sample stringsused to compare the languages, the programming languages selected for the thesis (howwere they selected, how they handle strings, etc.), what string operations were selectedfor the thesis, and what problems emerged with measuring memory usages. Chapter 4

8presents the results of the comparisons as simple graphs implemented in R. Each stringoperation has its own subchapter in which the graphs are shown and the results arediscussed. Chapter 5 contains discussion of the results; what operations were the mostdemanding (in terms of execution time and memory usage), what programminglanguages performed the best and what performed the worst, are there any differencesbetween compiled and scripting languages, and so on. Chapter 6 is the last chapter, andit presents the conclusion of the thesis as well as limitations and suggestions for futureresearch.

92.BackgroundThe comparisons of string operations performance across programming languageshasn’t been that researched; some studies have included some string operations whenprogramming languages were being compared, but there hasn’t been any explicit studyon string operations only. However, there are plenty of papers in which the performanceof programming languages are compared in general.This chapter presents the current literature on the subject. The references mostly studythe differences on memory consumptions and execution times across programminglanguages, which is also studied in this thesis. There are some languages and benchmarkmeasurements (e.g. energy consumption) that are discussed in this chapter that are notdirectly part of the scope of this thesis. Nevertheless, they were added as viewpoints onhow programming languages may be compared.Chapter 2.1 consists of literature on execution time and memory usage comparisons,and chapter 2.2 consists of literature on lines of code and reliability of languages.2.1 Execution time and memory usage comparisons acrossprogramming languagesIn a study by Prechelt (2000), programmers implemented programs on the sameprogramming problem on seven different languages, which were C, C , Java, Perl,Python, Rexx and Tcl. The program had to load a dictionary of over 70 000 words intomemory, and then read “telephone numbers” from another file. Those numbers had tobe converted into word sequences, and the result was printed. Not surprisingly,implementing the program on scripting languages (Perl, Python, Rexx and Tcl) took theleast time (especially on Perl and Python); the research shows that implementations inPerl, Python, Rexx and Tcl took under half the time it took on C, C and Java. Javaand C programs took the most time to implement, but as for Java, it should be notedthat the developers on the study were inexperienced Java programmers. Java programsused three to four, and scripting languages about two times more memory thanprograms written in C and C . C and C were also about three to four times fasterthan Java, and five to ten times faster than scripting languages. (Prechelt, 2000) Thispaper is a great background source, because it has similarities to my study. The similarlanguages were C, C , Java, Perl, and Python, and the paper studied execution timesand memory usages, which is also studied in my thesis. Also, the paper used a certaindata set (z0, which is empty) to measure the runtime for dictionary load time only, andin my study I had to take file loading into account in memory usage tests (in executiontime tests this wasn’t an issue, because they were done using the system clock inlanguages). There are some differences between my study and Prechelt’s. In my study, Idevelop the programs myself, but in Prechelt’s study, the programmers were other thanthe author itself; for Java, C, and C programs the programmers were computerscience master students, and for Perl, Python, Rexx, and Tcl programs the programmerswere volunteers for the study. Another difference is that in Prechelt’s study only onekind of program was measured, whereas in my study I measure different programs(varying operations).

10Fourment and Gillings (2008) found out that C and C programs used the leastmemory and they also had the shortest execution times. Perl and Python were pointedout to be flexible languages that required little amount of work, and Java and C# werelike a compromise between these two groups. In this study, different kinds of programswere implemented to compare the languages. One type of program was to parse anoutput of a BLAST (Basic Local Alignment Search Tool) result, and from theseimplementations, the slowest languages were Python and C#, and the fastest were C andC . Other programs were global alignment program, and Neighbor-Joining program,and in these two, Perl and Python were clearly the slowest. The results for C and C were very much alike; overall, they were the fastest programs and required the leastmemory. Python had clearly worse results than Perl for I/O operations. In addition, C#consumed more memory than Java for holding strings in memory. (Fourment &Gillings, 2008) Compared to my thesis, this paper also studied execution times andmemory usages, there were similar languages (namely, every language except for C#,which had to be excluded from my thesis), and the languages were measured againstvarying programming tasks. One difference was that the paper compared operatingsystems (no clear evidence was found whether Windows or Linux would be fasteroperating system) whereas my study doesn’t. Also, the programs in Fourment &Gillings’ paper were considerably bigger; in my study, where the programminglanguages’ string operations are compared, an operation takes usually only one line ofcode.Aruoba and Fernandez-Villaverde (2014) studied execution times on programminglanguages commonly used in economics. These languages were C , Fortran, Java,Julia, Python, Matlab, Mathematica, and R. In these languages, the stochasticneoclassical growth model was implemented. They found out that the fastest languagesin their study were C and Fortran, C being slightly faster. Julia was quite effectivelanguage, being only 2.64 to 2.70 times slower than C . Python was found to beremarkably slow; with Pypy implementation, which is a virtual machine replacementusing just-in-time compiler, it was over 40 times slower than C , and with thetraditional CPython implementation, it was 155 to 269 times slower than C . Java wasover 2 times slower than C . (Aruoba & Fernandez-Villaverde, 2014) This studypoints out great background information for my thesis, because it studied executiontimes of varying programming languages, which is also part of this thesis. The samelanguages between this study and my thesis were C , Java, and Python. The differenceis that the authors implemented only one program on every language, whereas my thesiscompares various operations.Couto, Pereira, Ribeiro, Rua, and Saraiva (2017) carried out another study using 13benchmark problems from the Computer Language Benchmarks Game (CLBG) in orderto study the execution times and energy consumptions of programming languages. Therise of non-wired computer devices has led to energy consumption being a bottleneckon building computers and their softwares. The languages chosen for the study are C,C#, Fortran, Go, Java, JRuby, Lua, OCaml, Perl and Racket. Intel’s Running AveragePower Limit (RAPL) tool was used for measuring energy consumption. According tothe authors of the study, one of the hot questions that tends to rise on the subject hasbeen whether a program with high execution times will also mean it is energy efficient.The overall results for the 13 benchmark problems show that for both energyconsumption and execution time C was the best with Java following as the second. C#was 5th in energy consumption, consuming 2.21 times more energy than C, and 3rd inexecution time, being 2.44 times slower than C. Perl used the most energy out of the 10languages, consuming 84.89 times more energy than C. Perl was also the second mostslow as it was 68.45 times slower than C. In addition, it was noted that energy

11consumption isn’t always directly proportional to execution time, as almost every taskhad languages where energy consumption and execution time didn’t behave in the sameway. An interesting finding was also that Lua was 12% slower than Perl, but at the sametime, it was 53% greener. Go was a pretty solid performer in each test. (Couto et al.,2017) This study had also similarities with my thesis. It included benchmarking ofprogramming languages, and some of the languages studied were the same as in mythesis: C, Go, Java and Perl. In addition, execution time was studied. What wasdifferent, though, was that this paper studied energy consumption, whereas my studystudies memory consumption in addition to execution times. Also, this study used 13different CLBG programs to compare the languages, but my thesis uses strings ofvarying sizes.Energy, time and memory consumption across programming languages were studied byPereira et al. (2017), who also used CLBG in their study. The results for energy andtime consumption suggested that the best performer was C, followed by (from best toworst) Rust, C , Ada, and Java. It was noticed in the study that execution timedoesn’t necessarily affect energy consumption directly; for example, Fortran consumedsecond least memory for one CLBG problem, but at the same time, it was only 8th inexecution time. On average, the least memory was used by Pascal (66Mb), Go (69Mb),C (77Mb), Fortran (82Mb), and C (88Mb), whereas the most memory was used byJRuby (1309Mb), Dart (570Mb), Erlang (475Mb), Lua (444Mb), and Perl (437Mb).(Pereira et al., 2017)Rosetta Code is a repository which collects programming solutions to varying tasks indifferent languages. Anyone can log in to the repository and add a new task or asolution, either to seek or provide help with programming subjects. (“Rosetta Code”,n.d.) Nanz and Furia (2015) used Rosetta Code in order to find out about differences inprogramming languages. They were looking for differences in lines of code, sizes ofexecutables, runtime performances, memory usage efficiencies, and runtime failures.The study analyzed over 7,000 solutions to over 700 different tasks in 8 languages. Thelanguages selected for the study were C, Go, C#, Java, F#, Haskell, Python, and Ruby.The results of the study suggest many kinds of matters. F#, Haskell, Python and Rubyenabled writing more concise code than C, Go, C# and Java. Executables compiled intobytecode were smaller than the ones compiled into native machine code. For bothruntime and memory usage, C had the best results, and Go was the second. However,the differences on runtime reduced on moderate sized inputs. In addition, compiledstrongly-typed languages were able to catch more errors at compile time thaninterpreted or weakly-typed languages. (Nanz & Furia, 2015) I think this study isreliable because of the large sampling size (there were over 7,000 solutions analyzed).This study has also valuable background information for my thesis, because there weresimilar languages (C, Go, Java, Python, and Ruby) studied, and execution times andmemory usages were studied, too. The key difference between Nanz & Furia’s paperand my study is that they analyzed over 7,000 programs found at Rosetta Code, whereasI develop the programs myself. Another difference is that the programs analysed inNanz & Furia’s paper were much bigger programs than my programs, which are mostlyprograms consisting of one string operation (excluding, e.g., reading the string file).Cesarini, Pappalardo and Santoro (2008) studied the differences of programminglanguages by implementing an IMAP (Internet Message Access Protocol) client on eachprogramming language. The programming languages they selected for their study wereErlang, C#, Java, Python and Ruby. The metrics for comparison they used were SLOC(number of source lines of code), memory consumption (the space cost of the library),execution time, and functionality of primitives (analyses the functionalities of primitives

12provided by the IMAP libraries). SLOC was highest for Ruby (1,612) and, notsurprisingly, smallest for Python (472). For C# SLOC was 1,089, and for Java itcouldn’t be measured. The amount of memory required for Java was highest, as itrequired 14MB for code, 190MB for class library, and a total of 213MB. Pythonrequired only 4,4MB for code, 1,6MB for library, and 6,7MB in total. The total ofmemory required for Ruby was 15MB and for C# it was 19MB. As for execution times,Ruby’s performance was the worst, and the performance for C# was the second worst.Python seemed to have the best performance, but it should be noted that Python didn’tparse IMAP server responses. Overall, for the client-side IMAP protocolimplementations, Ruby had the worst results, whereas the best results were calculatedfor Python. (Cesarini et al., 2008) The study consisted of developing only one programon all of the selected languages, which is a bit lacking. Programming languages havedifferent kinds of results on different types of programs. For example, some Pythonprogram might have higher execution time than a similar Perl program, and vice versa.Therefore, this study would be more reliable if it had consisted of developing moreprograms, because now the results might not be generalizable. However, it is a niceaddition to the background information, especially because it included execution timesfor Java, Python and Ruby, which are also part of this thesis.JIT-compiled (just-in-time) dynamic languages have started to challenge compiledlanguages as they have began improving their performance. Fortran, C , Java(compiled languages) were compared to Python (with Numba-library), Matlab, andJulia (interpreted languages) by Eichhorn, Cano, McLean and Anderl (2018) in theirresearch. The languages were evaluated with four different problems relevant inastrodynamics. The research found out various things. Julia was faster than Java inevery test, and even faster than Fortran and C in one test. Matlab was the slowest inevery test. Python Numba was also a worthy challenger for Fortran, C , and Java inruntime. However, C and Fortran still seemed to be the best alternatives forastrodynamics, where high performance is needed. (Eichhorn et al., 2018)In a study by Alomari, El Halimi, Sivaprasad and Pandit (2015), C#, C , Python, Java,VB and PHP were compared by using BFS (breadth-first search), DFS (depth-firstsearch), Kruskal’s algorithm, as well as reading and writing operations on a MySQLdatabase. Tests for DFS, BFS and Kruskal’s algorithm were used to evaluate programsexecution times and memory consumptions, and database tests were used to evaluateexecution times only. For DFS, BFS and Kruskal’s algorithm, C had the best results.A rather i

programming languages tend to perform the best on varying string operations. 1.2 Strings Strings in programming are typically sequences of characters. How they are declared, handled, stored in memory etc., is up to a programming language, and there are variations across programming languages. (Busbee, 2009) There are many situations in