Feb 10, 2012

Write-only code

Plenty of modern, mature languages advertise speed of coding as one of their major advantages:

Python lets you write the code you need, quickly. 
Perl -- Because life's too short to code without punctuation.

Phrases like these always puzzle me. It sounds nice to be able to write code quickly, but what about reading it? I mean, the code you write is not only for a compiler to read. Code is to be read by humans. By you! And by those unfortunates who will have to support your code after you leave the building. These poor guys will be glad to know that the fancy language allowed you to write that shit of a code 15 minutes faster. This warm thought will stay with them through all the hours they spend staring at cryptic lines, trying to fix that little mistake you made 6 months ago..




I would not be so emotional about this if I hadn't been in the shoes of one of those unfortunates recently. Right now I am spending my free time trying to do something reasonable with some "not-so-great" Python code.. And so I have one question: how do you ever read Python??

"by Guido van Rossum, from a presentation to Hewlett Packard, on the principle that has guided the development of Python syntax and language features. The emphasis is on clear and readable code."[IronPython in Action]
ARE YOU KIDDING ME??


OK, I'm not talking about the indentation stuff. While it's inconvenient and clumsy after C++/Java/C#, it's still bearable. Still, speaking of indentation, one thing puzzles me. As far as I understand, using indentation to denote blocks in Python was intended to ensure code readability ("Uses an elegant syntax, making the programs you write easier to read." [python.org]). Yet there are many more things in Python that make code hard to read, so all that indentation stuff does not matter much.


So, getting back to my question of how to read Python. The first thing is: how do you read a function signature without parameter types specified? Here's an example:
def checkIntersection(self, value, signal, direction, signalU=None, signalD=None, intersections=None, n=None):
So how should I know what kind of stuff to pass as the value of the intersections param? From the documentation? Don't make me laugh! I have to sit and read the guts of the function just to understand how I should use it. In a statically typed language I would have much more information from the function declaration itself. The fact that the author of the function did not spend 30 more seconds writing type specifications helps me greatly to read it now..
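
For comparison, here is a rough sketch of how much more the same signature could say if it carried type annotations. Everything beyond the parameter names is an assumption made for illustration: the Signal and Direction types, the Checker class, the chosen parameter and return types, and the placeholder body are all hypothetical and not taken from the real code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Direction(Enum):
    """Hypothetical type: which way the value crosses the signal."""
    UP = 1
    DOWN = -1


@dataclass
class Signal:
    """Hypothetical type: a named threshold level."""
    name: str
    level: float


class Checker:
    def checkIntersection(
        self,
        value: float,
        signal: Signal,
        direction: Direction,
        signalU: Optional[Signal] = None,
        signalD: Optional[Signal] = None,
        intersections: Optional[list] = None,
        n: Optional[int] = None,
    ) -> bool:
        # Placeholder body, invented for the sketch: report whether the value
        # is past the signal level in the given direction.
        if direction is Direction.UP:
            return value > signal.level
        return value < signal.level
```

Even with guessed types, a reader can tell at a glance that direction is one of a fixed set of values and that signalU and signalD are the same kind of thing as signal.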

And the second thing - how do you understand the class state without field declarations? Really, in Python you don't have to declare class fields. Instead, just assigning to self.<field_name> anywhere inside a method declares that self has <field_name>. Convenient, isn't it? Well, it isn't. It saves you time when you write code. But when you read it, you have to go exploring all of the class's methods to see what makes up its state. The PyDev plugin for Eclipse is of help here - it provides an outline:


And in the outline, for every function it lists the fields that are assigned in that function! So you may get an idea of the class state a bit faster. For "some reason" there's no need for this in "less readable" languages.
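
To make the problem concrete, here is a minimal sketch. Both classes and their field names are made up for illustration. The first scatters its state across methods; the second does the same work but lists every field once in __init__, which is a convention and not something the language enforces.

```python
class ScatteredState:
    """State appears wherever a method happens to assign it."""

    def load(self, values):
        self.values = list(values)      # field "declared" here

    def analyse(self):
        self.peak = max(self.values)    # and another one here


class DeclaredState:
    """Same behaviour, but every field is listed once in __init__."""

    def __init__(self):
        self.values = []    # raw samples, filled by load()
        self.peak = None    # largest sample, set by analyse()

    def load(self, values):
        self.values = list(values)

    def analyse(self):
        self.peak = max(self.values)


scattered = ScatteredState()
declared = DeclaredState()
print(hasattr(scattered, "peak"))   # False: the field does not exist yet
print(hasattr(declared, "peak"))    # True: declared up front, still None
```

With the first class, the only way to learn that peak exists is to read analyse(); with the second, the constructor doubles as the field list.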

Now reading "The Zen of Python", I get mixed feelings..


Well, on the bright side of things, I've recently found an emerging language - Ceylon. The goals statement for the language starts with "Readability":


We spend much more time reading other people's code than writing our own, so readability is the most important thing in a language meant for teams. A programming language is partly for communication between humans. Verbosity can sometimes contribute to readability, and it can sometimes harm it. Neither verbosity nor brevity is a goal in and of itself. What matters is striking a balance that makes difficult code understandable. The value of readability increases as the size of a team increases.

The authors say that "Ceylon keeps the best bits of Java but improves things that in our experience are annoying, tedious, frustrating, difficult to understand, or bugprone." An admirable aim. While the language is only at milestone 1 right now, I hope I'll have time to try it soon. After all, it's a new experience for me to watch a new programming language grow from the very start.



7 comments:

  1. Sorry that my comment is in Russian =)

    I want to start with the fact that Python is a dynamically typed
    language. Yes, static typing is indeed useful, since it lets you rule
    out a wide class of errors at compile time. Programs in "dynamic"
    languages have to be tested more thoroughly.

    But I would not treat static types as _documentation_. Documentation
    is needed in any case, whether the program is written in Python or
    C++.

    Again, I don't dispute that knowing the types of variables is useful
    knowledge. But in most cases the type of a variable is clear from the
    semantics of the function. The post gives an example of a function (I
    mean checkIntersection) with questionable design, and that function
    is taken out of context.

    Dynamically typed languages have their advantages too. For example,
    in such languages it is easier to write generic functions, that is,
    functions that can work with variables of different types in a
    uniform way. And that undoubtedly adds flexibility to the language,
    which means it is easier to express your thoughts in it.

    A simple example: a function that folds a list.

    def reduce(start, f, list):
        r = start
        for x in list:
            r = f(r, x)
        return r

    And here is an example of using this function:

    def add(x, y):
        return x + y

    >>> print reduce(0, add, [1,2,3,4,5])
    15
    >>> print reduce("0", add, ["a", "b", "c", "d"])
    0abcd

    Besides, nobody forbids checking the types of variables at run
    time. Normal practice:

    def f(x):
        assert type(x) == expected_type
        # function body goes here

    And writing documentation for functions is most certainly recommended:

    def avg2(a, b):
        """
        Finds the arithmetic mean of two numbers.

        @param a: a number of any numeric type;
        @param b: a number of any numeric type.

        A numeric type here means a type with the operations +, - and /
        defined on it.
        """
        if cmp(a, 0) == cmp(b, 0):
            return a + (b - a) / 2
        else:
            return (a + b) / 2

    Now you can find out what avg2 is about like this:

    >>> from module import avg2
    >>> help(avg2)

    Thank you.

    Replies
    1. Thank you for the comment, Alexey. I have written my reply below as a separate comment.

  2. Anonymous, 13/2/12 15:37

    "These poor guys will be glad to know that the fancy language allowed you to write that shit of a code 15 minutes faster..."

    I believe it's a typo. It should be written "15 times faster" :) Well, I agree with 10 times.

    "OK, I'm not talking about the indentation stuff. While it's inconvenient and clumsy after C++/Java/C#, it's still bearable..."

    You may be surprised, but there are a lot of people who think Finnish is a very convenient and readable language.
    Don't you ever think it's inconvenient and clumsy? Don't you think only people who speak Python fluently should read it?

    "def checkIntersection(self, value, signal, direction, signalU=None, signalD=None, intersections=None, n=None):..."


    public static int checkIntersection(int value, int signal, boolean direction, int signalU, int signalD, String[] intersections) {}

    What do you know about int value besides that it's an integer?
    How much does it help to read a program without comments and/or documentation?

    "And the second thing - how do you understand the class state without field declarations?..."

    Personally, I prefer to know which method of the object I should call to change its state.
    In other words, I don't care exactly which properties are specified. I'm interested in how to work with that object.

    Replies
    1. "public static int checkIntersection(int value, int signal, boolean direction, int signalU, int signalD, String[] intersections) {}

      What do you know about int value besides that it's an integer?"

      I partially agree with you that "int value" doesn't add much information. But, on the other hand, I know that "value" is not an instance of MySuperClassThatErasesTheEntireHardDiskOnAnyMethodCall.

      "Personally, I prefer to know which method of the object I should call to change its state.
      In other words, I don't care exactly which properties are specified. I'm interested in how to work with that object."

      That's OK when you use a class. But what about debugging? You have to dive into the class' guts to figure out why it behaves wrong.

    2. "You have to dive into the class' guts to figure out why it behaves wrong."

      Yep, that's exactly what I'm talking about.

  3. Alexey, Anonymous, thank you for your comments! Sorry, I was not able to answer earlier.


    "But I would not treat static types as _documentation_. Documentation is needed in any case, whether the program is written in Python or C++."

    "And writing documentation for functions is most certainly recommended:"

    "How much does it help to read a program without comments and/or documentation?"

    I do agree with you. Certainly. Still, there are a few things that matter.

    First, how often do you see good documentation? My previous post was about the poor documentation of code published by really huge companies, industry leaders. I believe it's a real problem.

    The next thing is, while you saved some time by writing code faster in a dynamically typed language, you have to spend more time writing detailed documentation for it. In your example with avg2, instead of _declaring_ the parameter types you had to _describe_ them in the documentation. So you spent some time doing that, and the compiler still cannot check the constraints for you.

    Now, the third thing is that there's an (unreachable) ideal of "self-documenting code". The idea is that the code should be clear enough to require just a few comments. For example, that 'checkIntersection' function could be declared like this:

    public bool checkIntersection(int value, Signal signal, Direction direction, Signal signalU, Signal signalD)

    This way you probably won't have to document the values for the 'signal' and 'direction' parameters in detail, as they could be obvious from the type declarations. Instead you'll be able to focus on what the function does and how to use it.


    "Dynamically typed languages have their advantages too. For example, in such languages it is easier to write generic functions, that is, functions that can work with variables of different types in a uniform way. And that undoubtedly adds flexibility to the language, which means it is easier to express your thoughts in it."

    I also agree here. Dynamically typed languages help a lot in writing generic functions faster. Still, the question is: do you really care about writing faster, and how do you pay for it?

    Statically typed languages are not too bad at generic programming either. Just as a mental exercise, here's what I imagine 'reduce' could look like in C#:

    static T reduce<T>(T start, Func<T, T, T> f, IEnumerable<T> list)
    {
        T r = start;
        foreach (T x in list)
        {
            r = f(r, x);
        }
        return r;
    }

    int result = reduce(0, (int r, int x) => { return r + x; }, new int[] { 1, 2, 3, 4, 5 });

    string resultS = reduce("0", (string r, string x) => { return r + x; }, new string[] {"a", "b", "c", "d"});


    OK, yes, I spent a bit more time writing this. (And in Java I would have to spend even more because there is no support for anonymous functions, but that's another story.) The funny thing is that I am probably not able to implement a generic 'add' function in C#. Also, I would not want to use this function to concatenate strings, for performance reasons..

    So overall, yes, I agree that the code you showed is generally prettier. Still, I wonder: is it worth not being able to reason about types? For example, imagine you are not used to the idea of 'reduce'. What would you get from the declaration 'reduce(start, f, list)'? I agree that my declaration is also not too clear.. (Though it can be improved by giving the parameters more descriptive names...)

  4. Sorry, I had to split the comment in two, too many letters :)

    "Personally, I prefer to know which method of the object I should call to change its state.
    In other words, I don't care exactly which properties are specified. I'm interested in how to work with that object."

    Sure, but that's when you have to use someone else's code. What about when you have to _fix_ it?


    "You may be surprised, but there are a lot of people who think Finnish is a very convenient and readable language. Don't you ever think it's inconvenient and clumsy? Don't you think only people who speak Python fluently should read it?"

    Well, I already wrote that I'm OK with indentation, right? And yes, I'm not fluent in Python. So you may very well just ignore me. Still, I think there's a point to my post: I believe that trading readability for speed of writing code is wrong, and I believe that dynamically typed languages do exactly that.


    "I believe it's a typo. It should be written "15 times faster" :) Well, I agree with 10 times."

    Actually I'm generous enough to agree on a 15 times speedup :)
    The thing is that you generally spend more time reading code than writing it. (E.g. Joel says "Think back to the last project you worked on. Chances are, debugging took from 100% - 200% of the time it took to write the code in the first place." http://www.joelonsoftware.com/articles/fog0000000245.html)
    So my point is: it does not matter if you write code 5 or 15 times faster; if it makes you spend just twice as much time reading, it's still not worth it.


    "Yes, static typing is indeed useful, since it lets you rule out a wide class of errors at compile time. Programs in "dynamic" languages have to be tested more thoroughly."
    "Besides, nobody forbids checking the types of variables at run time."

    Well, I can't emphasize enough how important it is to do as much as possible at compile time. When you are developing a complex system, you have so many things to think about that you'll want your tools to do as much of the work as possible..
