
REVISITED: Julia vs Python Speed Comparison: Bootstrapping the OLS MLE

I originally switched to Julia because it estimated a complicated MLE about 100 times faster than Python. Yesterday, I demonstrated how to bootstrap the OLS MLE in parallel using Julia and reported the time required on my laptop to bootstrap 1,000 times: about 21.3 seconds on a single processor and 8.7 seconds on four processors.

For comparison, I translated this code into Python, using only NumPy and SciPy for the calculations and the multiprocessing module for the parallelization. The Python script is available here. For this relatively simple script, Python requires 110.9 seconds on a single processor and 66.0 seconds on four processors.
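For readers who want to see the shape of the exercise, here is a minimal sketch of a parallel bootstrap in Python. It is not the script linked above: to stay within the standard library, it uses a single regressor and the closed-form OLS solution (which coincides with the MLE of the coefficients under normal errors) rather than numerical likelihood maximization with NumPy and SciPy. The names simulate_data, ols_fit, one_replicate, and bootstrap are hypothetical, chosen only for illustration.

```python
import random
from multiprocessing import Pool


def simulate_data(n, beta0, beta1, sigma, seed):
    """Draw n observations from y = beta0 + beta1*x + e, with e ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


def ols_fit(xs, ys):
    """Closed-form OLS for an intercept and one regressor.

    Under normal errors these coefficients equal the MLE.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def one_replicate(task):
    """Resample rows with replacement and refit; task is (xs, ys, seed)."""
    xs, ys, seed = task
    rng = random.Random(seed)
    n = len(xs)
    idx = [rng.randrange(n) for _ in range(n)]
    return ols_fit([xs[i] for i in idx], [ys[i] for i in idx])


def bootstrap(xs, ys, replications, processes=1):
    """Return one (intercept, slope) pair per bootstrap sample."""
    # Each task carries its own seed so results do not depend on scheduling.
    tasks = [(xs, ys, seed) for seed in range(replications)]
    if processes == 1:
        return [one_replicate(t) for t in tasks]
    with Pool(processes) as pool:
        return pool.map(one_replicate, tasks)
```

Calling bootstrap(xs, ys, 1000, processes=4) from a script guarded by if __name__ == "__main__": would spread the replications over four worker processes. Timings from this sketch are not comparable to the numbers reported in this post, since it solves a much simpler problem.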

Thus, Julia ran more than 5 times faster than Python on a single processor, and about 7.6 times faster on four processors.

I also increased the number of bootstrap samples from 1,000 to 10,000. Julia requires 211 seconds on a single processor and 90 seconds on four processors. Python requires 1,135 seconds on a single processor and 598 seconds on four processors. Thus, even as the task grew, Julia remained more than 5 times faster on one processor and about 6.6 times faster on four processors.

In this simple case, Julia is between roughly 5 and 7.6 times faster than Python, depending on the number of processors and bootstrap samples.
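The ratios follow directly from the timings reported in this post. The short calculation below is only a check on that arithmetic; the dictionary layout and the name speedup are illustrative.

```python
# Wall-clock seconds reported in this post:
# (bootstrap samples, processors) -> (Julia seconds, Python seconds)
timings = {
    (1000, 1): (21.3, 110.9),
    (1000, 4): (8.7, 66.0),
    (10000, 1): (211.0, 1135.0),
    (10000, 4): (90.0, 598.0),
}


def speedup(julia_seconds, python_seconds):
    """How many times faster Julia ran than Python."""
    return python_seconds / julia_seconds


ratios = {config: speedup(*seconds) for config, seconds in timings.items()}

for (samples, processors), ratio in ratios.items():
    print(f"{samples:>6} samples, {processors} processor(s): {ratio:.1f}x")
```

The four ratios come out to about 5.2, 7.6, 5.4, and 6.6.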


Bradley J. Setzler


UPDATE: A Comparison of Programming Languages in Economics

Here, you can find the latest version of the paper and all of the code used in A Comparison of Programming Languages in Economics, by S. Boragan Aruoba and Jesús Fernández-Villaverde.

This is the paper that convinced me to give Julia a try.


Bradley J. Setzler

Getting Started: Installing Julia, Julia Studio, and Packages used in Economics

UPDATE: Julia Studio is no longer supported. Please see my more recent installation guide here.


In this post, I explain how to install Julia, Julia Studio, and 3 packages commonly used in economics on your personal computer in about 5 minutes.


Installing Julia

Unlike Python, Julia and its packages are very easy to install. Simply download Julia Studio, which is the most popular IDE for Julia, and click install. This will also install the current version of the Julia language. Now, open Julia Studio. In the console, type:

 
julia> 2+2 

and press Enter. If it returns the number 4, you have successfully installed Julia.


Installing Packages in Julia

Next, you need to install a few packages used frequently in economics. The following command will install the Distributions package, which allows you to simulate random variables and evaluate probability distribution functions. In the console, type:

julia> Pkg.add("Distributions")

Like R but unlike Python, Julia installs packages from within Julia. In the same way, install the DataFrames package, which is used for working with data, and the Optim package, which contains numerical optimizers.

That’s it: after about 5 minutes of installation, you should be ready to work in Julia!


Bradley J. Setzler

Why I Switched to Julia

The following story explains why I began programming in Julia. Since then, I have found that Julia improves the performance of my other econometric estimators. However, Julia has a major disadvantage: it lacks informative documentation and tutorials, and it has even less accumulated discussion on sites like Stack Overflow. This blog is meant to record the skills I am learning in Julia over time, and to serve as a tutorial for economists and others learning the Julia programming language.


Is Julia the Future of Computational Economics?

I am currently estimating a structural econometric model of game-theoretic parent-child interaction. Using the standard implementation of Python (the code is written entirely in NumPy and SciPy, with data prepared by pandas), the optimizer ran for 24 hours, then terminated at the 5,000-iteration limit. It was converging smoothly, but never quite arrived. While waiting for the estimates last night (and growing increasingly impatient), I installed Julia and its packages, learned how to program in Julia, rewrote the estimation in Julia, and this morning successfully optimized the likelihood in Julia.

The contrast is staggering: the optimization that didn’t converge after 24 hours in Python converged after only 15 minutes in Julia, running on the same processor while Python was still going. Julia was already achieving a greater likelihood than Python after only 5 minutes, even though Python had a 20-hour head start. Both use the same optimization algorithm (including the numerical tolerance), and the structure of the code is identical. Julia evaluates the likelihood in 0.5 seconds, while Python requires 21 seconds per evaluation, so Julia is about 40 times faster in the function evaluation. In the optimizer, it is about 100 times faster (15 minutes versus at least 24 hours; I’m giving Python the benefit of the doubt, since it never converged).

The final iteration of Python was approaching the Julia optimal likelihood and getting closer; the only difference was that Julia arrived much, much more quickly. Since my next step is to bootstrap the estimator, speed is extremely important. Some practical arithmetic: on my four-core laptop, Python would take about two-thirds of a year to bootstrap this estimator 1,000 times, whereas Julia could do it in fewer than three days (though I’m planning to run the bootstrap in batch on the server).
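The practical arithmetic can be reproduced in a few lines. This sketch assumes, as the estimate above appears to, that the bootstrap replications divide evenly across four cores with no overhead, and it treats Python's 24 hours as the time per estimate even though that run never converged. The function name bootstrap_days is illustrative.

```python
def bootstrap_days(hours_per_estimate, replications, cores):
    """Wall-clock days, assuming perfect scaling across cores (an assumption)."""
    total_hours = hours_per_estimate * replications / cores
    return total_hours / 24.0


python_days = bootstrap_days(24.0, 1000, 4)    # 24 hours per estimate
julia_days = bootstrap_days(15 / 60, 1000, 4)  # 15 minutes per estimate

print(f"Python: {python_days:.0f} days ({python_days / 365:.2f} years)")
print(f"Julia:  {julia_days:.1f} days")
```

Under these assumptions Python needs 250 days, or about 0.68 of a year, and Julia needs about 2.6 days.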

I am agnostic on programming languages; I use whatever gets the answer fastest and can be reproduced most clearly, and I often use multiple languages on the same project to get the best features of each. My only claim is that Julia has taken the Python code, with minimal syntax changes, and executed the code 100 times faster for someone who had no prior experience with Julia. This was not a contrived, time-testing code; this is the estimator motivated by economic theory. The 100-fold speed increase of Julia relative to Python has been found elsewhere in computational economics.

So, is Julia the programming language of the future in structural econometrics? I’m not sure, but it seems to dominate Python and R at the moment.


Bradley J. Setzler