Lead Image © Andrey Suslov, 123RF.com

Julia: Fast as Fortran, easy as Python


Article from ADMIN 50/2019
The first stable release of the Julia language offers scientific and technical programmers a convenient and fast tool.

The Python programming language is wildly popular because it's easy to write, easy to read, and can be made to do almost anything with its huge stack of libraries. Yet, one complaint is heard over and over: It's too slow! This drawback might not matter much for web programming, where your program can spend more time waiting for responses from the network and database than actually computing anything, but data scientists, physicists, and engineers who try to use Python for real number crunching quickly hit the performance wall. A new language called Julia promises to be as easy to program as Python and other dynamic, interpreted languages, while offering the execution speed of statically typed, compiled languages such as C and Fortran.

Dynamic Languages

The popularity of interpreted, dynamically typed languages is easy to understand. You can get right to the real work in your program without spending multiple lines of code on ceremony and bookkeeping. You don't need to declare data types, manage memory, or think about how your high-level code will be translated into machine instructions. Perhaps best of all, you can type expressions into an interactive prompt, often called a REPL (read-eval-print loop), and get immediate results. The REPL allows you to experiment freely, try out ideas, and use your favorite programming language as a sophisticated calculator.

These types of dynamic languages used to be called "scripting languages," because they were used in shell scripting; however, as their applications have grown far beyond their original niche in automating system maintenance tasks, this term is falling out of use. These days Python, Ruby, R, Perl, and other languages in this class are grouped together under various and sometimes inaccurate terms. In this article, I adopt the description "dynamic languages" to reflect the dynamic nature of their data typing. (See also the "Common Confusion" box.)

Common Confusion

People often confuse dynamic typing with loose typing and mistakenly describe Python, for example, as loosely typed. In fact, Python is strongly typed and will give you an error if you try something like 1 + "1". Perl and JavaScript are loosely typed, which makes them more flexible but less predictable. Try 1 + "1" and 2 * "3" in your browser's JavaScript console – but first try to predict the results.
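The difference can be sketched in a few lines of Python. This is an illustration only; the names add_int_and_str, value, first_type, and second_type are hypothetical and do not come from the article. The same name is rebound to values of two different types (dynamic typing), yet mixing an integer and a string in an addition is rejected (strong typing).

```python
def add_int_and_str():
    """Try to add an int and a str; return the name of the exception raised."""
    try:
        return 1 + "1"          # strong typing: Python refuses to guess a conversion
    except TypeError as err:
        return type(err).__name__

# Dynamic typing: the same name may be rebound to values of different types.
value = 42
first_type = type(value).__name__     # type of the value currently bound to the name
value = "forty-two"
second_type = type(value).__name__    # the name now refers to a str

print(first_type, second_type)
print(add_int_and_str())
```

The type belongs to the value, not to the name, which is why the rebinding succeeds while the mixed addition fails.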

Static Languages

The other major class of languages is statically typed: C, Fortran, Java, C++, and others. In these programming languages, the type of every variable is fixed before compilation, either explicitly or by convention or inference, and a particular variable cannot be used for more than one major type. These languages traditionally offer no REPL: Your program needs to be compiled as a whole and linked with any external resources it references each time you change it. Therefore, they are not normally used interactively, and exploration or program development involves many cycles of edit-compile-run.

Some people extol static typing as a way to avoid many classes of programming errors or to expose them during compilation before running the program. Others consider it annoying ceremony that gets in the way of the real business of expressing an algorithm in code. It's partly a matter of taste and partly a matter of experience. One objective advantage of static typing is that it allows the compiler to produce more optimized, and therefore faster, code, which is one of the reasons languages such as Fortran and C are still the workhorses [1] of large-scale computing and simulation in demanding disciplines such as weather forecasting, physics research, and economic modeling.

Dynamic Problems

Nevertheless, the convenience of dynamic, interactive languages is tempting enough that scientists and engineers have tried to bend them to their purposes. Python, for example, has become popular with data scientists in recent years. How do these computational scientists overcome the intrinsically poor performance of these languages? Several techniques can speed up dynamic languages, but most of them come down to rewriting the time-consuming parts of the code in one of the fast static languages or using pre-built libraries (e.g., NumPy [2] for Python) that call on Fortran or C routines to do common tasks in array manipulation, linear algebra, and the like.
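As a minimal sketch of the library approach, the following Python computes a sum of squares twice: once with an explicit loop executed by the interpreter and once by handing the whole array to NumPy, which performs the work in compiled code. The function names and the test data are hypothetical, chosen for illustration only, and the example assumes NumPy is installed.

```python
import numpy as np

def sum_of_squares_loop(values):
    """Pure Python: the interpreter executes every multiply and add itself."""
    total = 0.0
    for v in values:
        total += v * v
    return total

def sum_of_squares_numpy(values):
    """NumPy: convert once, then let a compiled routine do the arithmetic."""
    arr = np.asarray(values, dtype=np.float64)
    return float(np.dot(arr, arr))

# Hypothetical test data: the floats 0.0, 1.0, ..., 999.0
data = [float(i) for i in range(1000)]
print(sum_of_squares_loop(data), sum_of_squares_numpy(data))
```

Both functions return the same value; the difference lies in where the loop runs. On large arrays, the NumPy version is typically much faster, although the actual gain depends on the machine and the array size.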

This setup leads to two-language problems [3]: Your program is now written in a mix of languages with incompatible paradigms; you need to manage the interface between languages; on large projects, you may wind up with different teams specializing in the different languages; and the overhead of calling foreign functions may eat into the gains in execution speed, depending on the efficiency of your language's foreign function interface. Moreover, your code becomes more difficult to read, develop, and maintain.

Even if you are fortunate enough that big parts of your problem can be passed off to NumPy or a similar library, other parts of your calculation will still need to be expressed in the host language and will remain slow. Also, a library written in a foreign language will typically bind you to a particular implementation of your dynamic language, limiting your choices in the future.
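One kind of calculation that tends to stay in the host language is a recurrence in which every step depends on the result of the previous one. The sketch below uses the logistic map as a stand-in; the choice of example and the function name iterate_logistic are assumptions made for illustration, not taken from the article. Because each new value needs the one before it, the loop is not easily replaced by a single array operation and runs at interpreter speed.

```python
def iterate_logistic(r, x0, steps):
    """Apply x -> r * x * (1 - x) repeatedly and return the final value.

    Each iteration needs the previous result, so the loop body
    must run step by step in the interpreter.
    """
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x

# x = 0.5 is a fixed point of the map for r = 2.0, so it never moves.
print(iterate_logistic(2.0, 0.5, 10))
```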
