tharwan.de

Some Words About Physics, Python And The World In General

MATLAB is not for Science - follow up

I wrote in detail why I think MATLAB is a bad choice for scientific research here. Imagine my surprise when I saw a letter about the "new MATLAB licensing model" from my university in my inbox. Just some quotes translated into English:

Further notes for MATLAB users at the TU Ilmenau:

1. Close MATLAB if you do not use it for a longer period of time (longer than 30 minutes), to free the license for other users. Refrain from blocking an academic license by running pseudo calculations.

4. Examine the use of comparable commercial or free software in your department like Maple, Mathematica, R, Octave or Scilab.

Sadly they forgot to mention Python/NumPy. Maybe I should offer an introductory course…

MATLAB is not for Science

Currently I am working a lot with MATLAB®. Actually, knowing how to use MATLAB got me a job in the first place. Nevertheless I could not really overcome my dislike for it, even though I am becoming more and more accustomed to its quirks. It is not so much that the idea of MATLAB as a tool for easy and fast numerical prototyping is a bad one in itself. It is not even the case that it is a bad tool in itself. It is only that some parts of the execution are just poor or a bad fit.

I gave a talk about MATLAB versus my go-to language Python for use in our physics department (slides in German), which was very well received but sadly has not led to any changes yet. The worst thing about MATLAB in scientific use is that it is closed source and therefore not freely available to everyone. And its open source clone (Octave) is simply not a full replacement.

Open Source? Aren't you an Apple fanboy?

The bad thing about the usage of MATLAB, and all other closed source tools for science, is that they destroy what makes computer science and science on computers such a great thing: that almost everyone has the tools to do it. Trapping a single atom in a quantum well and finding out how its absorption spectrum changes is an expensive experiment to set up. Simulating it on a computer is almost a trivial task nowadays.

But if the knowledge you base your own simulations on was built with tools you do not have access to, you will have a very hard time setting up your "virtual experiment" too. Or you have to swallow the pill and buy and use the same closed source tools all over again.

I think scientists should commit to using open source for all their publications wherever it is possible in any way, because it is the nature of science to be reproducible. It is sad that we cannot do high energy particle physics in our backyard or in our lecture halls. But we have a way to make the simulations usable for everyone and we should use it.

Where MATLAB excels

And here I am, telling you that Python might be a better alternative, yet earning money programming in MATLAB. Actually I did try to convince my employer to let me use Python and was given some good reasons to use MATLAB. It is more or less the same reason why large corporations use Windows and Office: it comes with a promise of service and completeness and it seems to be a carefree package. It is some kind of outsourcing. You have someone who is responsible and who is not you. You have some kind of stability. And the biggest point of all: it is the go-to standard in the industry (and in science). Sadly.

The story goes like this: your company wants to develop some kind of image processing software. You need something to prototype your algorithms in that just works. MATLAB provides you with almost all the things you need. You install it on your machine and you are ready to go. Having used MATLAB for exactly this case, I must admit it is very nice to have almost anything you can wish for already implemented, very well documented and bundled in an almost carefree package to play around with.

Money is the solution

For a company it is more or less no problem to pay the price for a MATLAB installation (some k€), even if it is only used for a single project. For a private user the price would be nuts. There is a student version that is priced much lower (~150€ without any toolboxes) but, based on my observations, even that is not justifiable for most students. This might be because of the availability of pirated versions (almost everyone I know has one) or of the versions that are accessible on the university-owned PCs.

I think the price for MATLAB is justified if you look at its commercial use: it is a useful tool with a small target audience that has the financial resources to pay for it. I have no idea what a campus license costs a university, but whatever it is, the price for educational users is ridiculous. Especially if you consider that a lot of the useful and important functions are bundled in so-called toolboxes, which milk some extra money out of you. Though you can argue that this lowers the price of the basic version.

To make the situation a little more vivid: for a lot of engineering courses you have to use MATLAB. And while it is usually provided by the university, you can only use it in the computer labs, which are naturally too small for the whole university, not accessible at all times, and so on. (The issue can partly be solved via remote desktop access.) But MathWorks does not plan to provide an easy way for a university to give MATLAB to its students, since the university has already paid for its license; instead it wants you as a student to pay again.

Making the educational version of MATLAB free would solve the problem of the software's availability as a scientific tool for everyone, while giving MathWorks the advantage that it would be used even more broadly in publications, cementing its position as the go-to tool. But as it already is the go-to tool, there is very little incentive for MathWorks to do so other than to be nice. The only way to push them in this direction would be to build on open source languages and toolboxes to challenge this status.

Basically I have no problem with MathWorks earning money with their software. I think the situation is not their fault (even if they could solve it). I also would not argue that everything should be free for education and science. But in the case of scientific software we have an alternative, which we do not have for hardware, that would strengthen the foundation of science and therefore we should use it.

MATLAB – The Language

MATLAB is not only a tool, it also denotes a programming language. One that outgrew itself years ago. MATLAB was designed as a MATrix LABoratory, but the language is now used for general-purpose programming. There are a lot of little and some bigger things where, when you program with MATLAB, you have the feeling that you are doing something it was not made for. Some examples follow; more will come in later posts.

Variable Unpacking:

A function can return multiple matrices, so you would guess that you can do some kind of variable unpacking. You actually can, but I have no idea why they bothered to implement it at all, as limited as it ended up being.
We assume our function f(x) returns three variables a, b and c:

function [a,b,c] = f(x)
    % some clever math (placeholder values, for illustration only)
    a = x;
    b = 2*x;
    c = 3*x;
end

Now when you do:

X = f(x);

All you get is a. To get a,b and c you have to do:

[a,b,c] = f(x);

Let's assume you are only interested in one or two of the return values. Then you can do:

[~,b,c] = f(x) % works

Since you can substitute the commas in MATLAB with spaces this also works:

[a b ~] = f(x) % works

But not this:

[~ b c] = f(x) % does not work

You cannot do something like:

l = [1 2 3];
[a,b,c] = l; % ERROR

Cell arrays do not help here either:

l = {1 2 3};
[a,b,c] = l; % ERROR
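
For comparison, here is a minimal sketch of the same cases in Python. The function body is placeholder arithmetic standing in for "some clever math", and the names `values` and `rest` are only illustrative:

```python
def f(x):
    # Placeholder arithmetic standing in for "some clever math"
    return x, 2 * x, 3 * x


a, b, c = f(1)    # unpack all three return values
_, b, c = f(1)    # ignore the first value (the name _ is only a convention)
a, *rest = f(1)   # keep the first value, collect the others in a list

values = [1, 2, 3]
a, b, c = values  # a list unpacks exactly like a function result
```

Any iterable of the right length can stand on the right-hand side, so there is no difference between unpacking a function result and unpacking a list.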

Optional Arguments

This issue will hit you very often while you are still developing your algorithm and the structure of your problem is not yet entirely clear. You can argue that this is my personal problem and I should think more before I start to program. Be assured I think a lot when I program, but sometimes it works really well for me to just start writing down some code. But even if your program is completely laid out before you start to write your code, almost certainly there will come a time when you or someone else has to add something, and here we are again.

Assume you have a function f(x,y) and later you discover there may be some circumstances where you need another argument z. You could refactor all your code and pass in three arguments for every call of f and pass in NaN for z where it was not needed before. Or you can use varargin:

function out = f(x,y,varargin)
    out = x + y;
    % varargin is a cell array holding any additional arguments
    if length(varargin) > 0
        out = out * varargin{1};
    end
end

Honestly, even the idea that there is a special argument you have to name varargin is kind of funny. It gets much prettier if you have more than one optional argument and need some logic to figure out how many arguments you got, in what order, and whether they are all of the right type.

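function out = f(x,y,varargin)
    out = x + y;
    if length(varargin) == 1
        % one extra argument: use it as a factor
        out = out * varargin{1};
    elseif length(varargin) == 2
        % two extra arguments: add the second, then scale by the first
        out = (out + varargin{2}) * varargin{1};
    end
end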

MATLAB helps you with inputParser, which does what the language should do by default. It gives you a way to test the arguments for type and to set default values. It also gives you the opportunity to write much more code.

The way I would like to see this solved is the way it is done in Python:

def f(x,y,z=1,v=0):
    return (x+y+v)*z

This even allows you to do:

f(1,2,v=1)
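
To make the correspondence with the varargin branches explicit, here is a short sketch of the possible calls. The input values are arbitrary and only chosen for illustration:

```python
def f(x, y, z=1, v=0):
    # z and v are optional; with their defaults the result is just x + y
    return (x + y + v) * z


print(f(1, 2))        # no optional argument:     (1 + 2 + 0) * 1
print(f(1, 2, 3))     # only z, by position:      (1 + 2 + 0) * 3
print(f(1, 2, 3, 4))  # z and v, by position:     (1 + 2 + 4) * 3
print(f(1, 2, v=1))   # only v, by name, skips z: (1 + 2 + 1) * 1
```

The last call has no counterpart in the varargin version, where the second optional argument can only be reached by also passing the first.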

The only thing you may have to add is a check that all variables are of the right type, which is quite hard, since you probably do not care about float or integer but about the availability of the + and * operators for the objects. No matter how elegantly you code your varargin check, I doubt it will ever be as clear as Python's syntax.
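
One possible way to approach such a check, sketched here as an assumption rather than a recommendation, is to not test types at all and simply let the operators fail. The error message and the use of `Fraction` as a non-float example are illustrative choices:

```python
from fractions import Fraction


def f(x, y, z=1, v=0):
    try:
        return (x + y + v) * z
    except TypeError as exc:
        # Raised when the arguments do not support + or * with each other
        raise TypeError("f() needs arguments that support + and *") from exc


f(1.5, 2)             # int and float mix without any extra check
f(Fraction(1, 2), 1)  # so does any other type that defines + and *

try:
    f(1, "a")         # int + str is not defined
except TypeError as err:
    print(err)
```

The function never asks what type it got; it only requires that + and * are available, which is the property that actually matters here.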

to be continued…
