alex gaynor's blago-blog

Posts tagged with response

The compiler rarely knows best

Posted July 12th, 2012. Tagged with pypy, python, response.

This is a response to http://pwang.wordpress.com/2012/07/11/does-the-compiler-know-best/ (if you haven't read it yet, start there).

For lack of any other way to say it, I disagree with nearly every premise presented and conclusion derived in Peter's blog post. The post itself doesn't appear to have any coherent theme, besides that PyPy is not the future of Python, so I'll attempt to reply to Peter's statements more or less in order.

First, and perhaps most erroneously, he claims that "PyPy is an even more drastic change to the Python language than Python3". This is wrong. Completely and utterly. PyPy is in fact not a change to Python at all; PyPy faithfully implements the Python language as described by the Python language reference, and as verified by the test suite. Moreover, this is a statement that would apply equally to Jython and IronPython. It is pure, unadulterated FUD. Peter is trying to extend the definition of the Python language to things that it simply doesn't cover, such as the C-API and what he thinks the interpreter should look like (to be discussed more).

Second, he writes, "What is the core precept of PyPy? It's that 'the compiler knows best'." This, too, is wrong. First, PyPy's central thesis is, "any task repeatedly performed manually will be done incorrectly". This is why we have things like automatic insertion of the garbage collector, in preference to CPython's "reference counting everywhere", and automatic generation of the just-in-time compiler from the interpreter, in preference to Unladen Swallow's (and almost every other language's) manual construction of it. Second, the PyPy developers would never argue that the compiler knows best, as I alluded to in this post's title. That doesn't mean you should quit trying to write intelligent compilers: 1) the compiler often knows better than the user, just like with C. While it's possible to write better x86 assembler than GCC for specific functions, over the course of a large project GCC will always win. 2) They aren't mutually exclusive. Having an intelligent compiler does not prohibit giving the user more control; in fact, it's a necessity! There are no pure-Python hints that you can give to CPython to improve performance, but these can easily be added with PyPy's JIT.

He follows this by saying that, in contrast to PyPy's (nonexistent) principle of "compiler knows best", CPython's strength is that it can communicate with other platforms because its inner workings are simple. These three things have nothing to do with each other. CPython's interoperability with other platforms is a function of its C-API. You can build an API like this on top of something monstrously complicated too; look at JNI for the JVM. (I don't accept that PyPy is so complex, but that's another post for another time.) In any event, the PyPy developers are deeply committed to interoperability with other platforms, which is why Armin and Maciej have been working on cffi: http://cffi.readthedocs.org/en/latest/index.html

The next paragraph is one of the most bizarre things I've ever read. He suggests that if you do want the free performance gains PyPy promises you should just build a Python to JS compiler and use Node.js. I have to assume this paragraph is a joke not meant for publication, because it's nonsense. First, I've been told by the scientific Python community (of which Peter is a member) that any solution that isn't backwards compatible with a mountain of older platforms will never be adopted. So naturally his proposed solution is to throw away all existing work. Next, he implies that Google, Mozilla, Apple, and Microsoft are all collaborating on a single JavaScript runtime, which is untrue; in fact they each have their own VM. And V8, the one runtime specifically alluded to via Node.js, is not, as he writes, designed to be concurrent; Evan Phoenix, lead developer of Rubinius, comments, "It's probably the least concurrent runtime I've seen."

He then moves on to discussing the transparency of the levels involved in a runtime. Here I think he's 100% correct. Being able to understand how a VM is operating, what it's doing, what it's optimizing, and how it's executing is enormously important. That's why I'm confused that he's positioning this as an argument against PyPy, as we've made transparency of our system a priority. We have the jitviewer, a tool which exposes the exact internal operations and machine code generated for everything PyPy compiles, which can be correlated to an individual line of Python code. We also have a set of hooks into the JIT to be able to programmatically inspect what's happening, including writing your own, pure Python, optimization passes! See http://pypy.readthedocs.org/en/latest/jit-hooks.html

That's all I have. Hope you enjoyed.

You can find the rest here. There are view comments.

Things College Taught me that the "Real World" Didn't

Posted November 21st, 2009. Tagged with pypy, parse, python, unladen-swallow, compile, django, ply, programming-languages, c++, response, lex, compiler, yacc, college.

A while ago Eric Holscher blogged about things he didn't learn in college. I'm going to take a different spin on it, looking at both things that I did learn in school that I wouldn't have learned elsewhere (henceforth defined as my job, or open source programming), as well as things I learned elsewhere instead of at college.

Things I learned in college:

  • Big O notation, and algorithm analysis. This is the biggest one. I've had little cause to consider this in my open source or professional work; stuff is either fast or slow and that's usually enough. Rigorous algorithm analysis doesn't come up all the time, but every once in a while it pops up, and it's handy.
  • C++. I imagine that I eventually would have learned it myself, but my impetus to learn it was that it was used for my CS2 class, so I started learning with the class and then dove in head first. Left to my own devices I may very well have stayed in Python/JavaScript land.
  • Finite automata and pushdown automata. I actually did lexing and parsing before I ever started looking at these in class (see my blog posts from a year ago) using PLY; however, this semester I've actually been learning about the implementation of these things (although sadly for class projects we've been using Lex/Yacc).

Things I learned in the real world:

  • Compilers. I've learned everything I know about compilers from reading papers out of my own interest and hanging around communities like Unladen Swallow and PyPy (and even contributing a little).
  • Scalability. Interestingly, this is a concept related to algorithm analysis/big O; however, this is something I've really learned from talking about this stuff with guys like Mike Malone and Joe Stump.
  • APIs, Documentation. These are the core of software development (in my opinion), and I've definitely learned these skills in the open source world. You don't know what a good API or documentation is until it's been used by someone you've never met and it just works for them, and they can understand it perfectly. One of the few required, advanced courses at my school is titled, "Software Design and Documentation" and I'm deathly afraid it's going to waste my time with stuff like UML, instead of focusing on how to write APIs that people want to use and documentation that people want to read.

So these are my short lists. I've tried to highlight items that cross the boundaries between what people traditionally expect are topics for school and topics for the real world. I'd be curious to hear what other people's experiences with topics like these are.


Syntax Matters

Posted November 13th, 2009. Tagged with go, response, programming-languages.

Yesterday I wrote about why I wasn't very interested in Go. Two of my three major complaints were about the syntax of Go, and based on the comments I got here and on Hacker News, a lot of people didn't seem to mind the syntax, or at least didn't think it was worth talking about. For me, however, the opposite is true: the syntax is among the most important things about a programming language.

I'd estimate that I spend about 60% of my day thinking about and reading code and 40% actually writing code. This means that code needs to be easy to read, and that means no stray punctuation or anything else that distracts me from what I want to see in my code: what does it do when I run it? This means any code I'm looking at had better be properly indented. It also means that I find braces and semicolons to be noise, stuff that just distracts me from what I'm reading the code to do. Therefore, code ought to use the existing, nonintrusive structure, instead of obligating me to add more noise.

"Programs must be written for people to read, and only incidentally for machines to execute." This is a quote from Structure and Interpretation of Computer Programs, by Harold Abelson and Gerald Sussman. It has always struck me as odd that the people who wrote that chose to use Scheme for their textbook. In my view Lisp and Scheme are the height of writing for a machine to execute. I think David Heinemeier Hansson got it right when he said, "code should be beautiful". I spend 5+ hours a day reading it; I damned well better want to look at it.


When Django Fails? (A response)

Posted November 11th, 2009. Tagged with django, python, response, rails.

I saw an article on reddit (or was it Hacker News?) that asked the question: what happens when newbies make typos following the Rails tutorial, and how good a job does Rails do at giving useful error messages? I decided it would be interesting to apply this same question to Django, and see what the results are. I didn't have the time to review the entire Django tutorial, so instead I'm going to make the same mistakes the author of that article did and see what the results are. I've only done the first few, where the analogs in Django were clear.

Mistake #1: Point a URL at a non-existent view

I pointed a URL at the view "django_fails.views.homme" when it should have been "home". Let's see what the error is:

ViewDoesNotExist at /
Tried homme in module django_fails.views. Error was: 'module' object has no attribute 'homme'

So the exception name is definitely a good start; combined with the error text, I think it's pretty clear that the view doesn't exist.

Mistake #2: Misspell url in the mapping file

Instead of doing url("^$" ...) I did urll:

NameError at /
name 'urll' is not defined

The error is a normal Python exception, which for a Python programmer is probably decently helpful. The icing on the cake is that if you look at the traceback it points to the exact line, in user code, that has the typo, which is exactly what you need.
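To illustrate how a plain NameError traceback leads back to the offending line, here is a minimal sketch that does not use Django at all. The `SOURCE` string is a hypothetical stand-in for a URL mapping file, `url` is a dummy helper rather than Django's function, and the filename `urls_sketch.py` is made up for the example.

```python
import traceback

# Hypothetical stand-in for a URL mapping file containing the typo.
# `url` below is a dummy helper for illustration, not Django's url().
SOURCE = '''
def url(pattern, view):
    return (pattern, view)

urlpatterns = [urll("^$", "django_fails.views.home")]
'''


def find_error_line(source, filename="urls_sketch.py"):
    """Run the source and report where a NameError was raised, if any."""
    code = compile(source, filename, "exec")
    try:
        exec(code, {})
    except NameError as exc:
        # The last traceback entry is the frame where the error occurred,
        # which here is the "user code" rather than this helper.
        frame = traceback.extract_tb(exc.__traceback__)[-1]
        return str(exc), frame.filename, frame.lineno
    return None


message, filename, lineno = find_error_line(SOURCE)
print(message, filename, lineno)
```

The reported line number is the line inside the stand-in source that contains `urll`, which is the same information Django's debug page surfaces from the traceback.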

Mistake #3: Linking to non-existent pages

I created a template and tried to use the {% url %} tag on a nonexistent view.

TemplateSyntaxError at /
Caught an exception while rendering: Reverse for 'homme' with arguments '()' and keyword arguments '{}' not found.

It points me at the exact line of the template that's giving me the error and it says that the reverse wasn't found. It seems pretty clear to me, but it's been a while since I was new, so perhaps a new user's perspective on an error like this would be important.

It seems clear to me that Django does a pretty good job of providing useful exceptions; in particular, the tracebacks on template-specific exceptions can show you where in your templates the errors are. One issue I'll note that I've experienced in my own work is that when you have an exception from within a templatetag it's hard to get the Python-level traceback, which is important when you are debugging your own templatetags. However, there's a ticket that's been filed for that in Django's trac.


A response to "Python sucks"

Posted June 4th, 2009. Tagged with python, response.

I recently saw an article on Programming Reddit, titled, "Python sucks: Why Python is not my favourite programming language". I personally like Python quite a lot (see my blog title), but I figured I might read an interesting critique, probably from a very different point of view from mine. Unfortunately that is the opposite of what I found. The post was, at best, a horribly misinformed, inaccurate critique, and at worst an intentionally dishonest, misleading farce. The post can be found here. I felt the need to respond to it precisely because it is so lacking in facts, and reading it one can get impressions that are completely incorrect, and I am hoping I can correct some of these.

The post's initial statements about iterating over a file are accurate. However, he then goes on to say Python supports closures (which is true), and follows this with a piece of code that has absolutely nothing to do with closures; it is actually a callable object (or as C++ calls them, functors). The author seems to take issue with these (though he doesn't explain why), ignoring the fact that Python has complete support for actual closures, not just callable objects.
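For readers unsure of the distinction, here is a small sketch of the two side by side. The names `Counter` and `make_counter` are made up for illustration and are not taken from the post being discussed. It also assumes Python 3, since it uses `nonlocal` to rebind the enclosed variable.

```python
class Counter:
    """A callable object (a "functor"): state lives on the instance."""

    def __init__(self):
        self.count = 0

    def __call__(self):
        self.count += 1
        return self.count


def make_counter():
    """A closure: state lives in a variable of the enclosing function."""
    count = 0

    def counter():
        nonlocal count  # rebind the variable captured from make_counter
        count += 1
        return count

    return counter


callable_object = Counter()
closure = make_counter()

print(callable_object(), callable_object())  # each call bumps self.count
print(closure(), closure())                  # each call bumps the captured count
print(closure.__closure__ is not None)       # the function carries its captured cells
```

Both behave the same from the caller's side. The difference is only in where the state is kept.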

The author then claims that Python has many other such arbitrary rules, using as an example the "yield" keyword. The author appears to be claiming the behavior of the yield keyword is arbitrary and poorly defined; however, it's very unclear what his point actually is, or what the source of his complaints is. My only response can be to say that the "yield" keyword always turns the function it's used in into a generator: calling it returns an iterator that lazily evaluates the function body, pausing each time it reaches a yield statement and handing back the yielded value.
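The generator behavior described above can be sketched in a few lines. The function name `countdown` and the `log` list are illustrative choices, added only to make it visible when the function body actually runs.

```python
log = []  # records when the function body actually runs


def countdown(n):
    log.append("started")
    while n > 0:
        yield n  # pause here and hand n to the caller
        n -= 1
    log.append("finished")


gen = countdown(3)           # returns a generator; the body has not run yet
ran_before_next = list(log)  # snapshot: still empty at this point
first = next(gen)            # runs the body up to the first yield
rest = list(gen)             # resumes repeatedly until the function returns

print(ran_before_next, first, rest, log)
```

Nothing in the body executes until the first `next()` call, and each later step resumes exactly where the previous `yield` left off.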

The author claims that many of the arbitrary decisions in Python are a result of Guido's insistence on a specific programming style, using as an example crippled lambdas. It is generally accepted that in Python lambda is just syntactic sugar for defining a function within any context (which the author completely ignores in his discussion of closures). To say that lambdas are crippled is to ignore the fact that absolutely nothing is rendered impossible by this, except for unreadable one-liners.
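As a rough illustration of the "syntactic sugar" point, the sketch below defines the same function twice, once with lambda and once with def, and then uses a nested def where a multi-statement body is wanted. All names here are invented for the example.

```python
# The same function written two ways; both produce ordinary function objects.
square_lambda = lambda x: x * x


def square_def(x):
    return x * x


def make_clamped_scaler(factor, limit):
    """Anything too long for a lambda can be a nested def in the same spot."""

    def scale(x):
        result = x * factor
        if result > limit:
            result = limit
        return result

    return scale


scale = make_clamped_scaler(10, 25)
print(square_lambda(4), square_def(4))
print(scale(2), scale(3))
print(type(square_lambda) is type(square_def))
```

The only practical difference is that a lambda body is limited to a single expression, and the resulting function is named `<lambda>`.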

The author's final complaint is directed at Python's C-API. This is possibly his least accurate critique. The author compares what is necessary to use a C library from within various programming languages. He shows that in Python all you have to do is import the library like you would for normal Python code. However, he goes on to say that for this to work you need to write lots of C boilerplate, and says that in other programming languages (showing examples from Haskell and PLT Scheme) this boilerplate is unnecessary. This is a completely disingenuous comparison, because what he is showing for Haskell and Scheme is their foreign function interface, not any actual language-level integration. To do what he shows in Python is perfectly possible using the included ctypes library. I'm not familiar with the C-API of either Haskell or PLT Scheme; however, I imagine that in order to work seamlessly and have the APIs appear the same as code in those languages, it is still necessary to write boilerplate so that the interpreter can recognize them.
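As a hedged sketch of the ctypes approach, the following calls the C standard library's `strlen` directly from Python with no C boilerplate. It assumes a POSIX-like system where the C library can be located or is already loaded into the process. The wrapper name `c_strlen` is made up for the example.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX-like platform. find_library may return None on some
# systems, in which case CDLL(None) falls back to symbols already loaded
# into the current process, which includes the C library on POSIX.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data):
    """Return the length of a bytes string as computed by C's strlen."""
    return libc.strlen(data)


print(c_strlen(b"hello"))
```

Declaring `argtypes` and `restype` is the only setup needed, and it plays the same role as the foreign function declarations in the Haskell and Scheme examples.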

In conclusion, that blog post was a critique completely devoid of value, not worth the bytes that are used to store it. This is not to say there aren't any valid criticisms of Python. There are many, as evidenced by any number of recent blog posts discussing "5 things they hate about technology X", where technology X is something the author likes, because no technology is perfect. However, no such honest critique was present here.
