Tuesday, November 30, 2010

Zen of NumPy

While I was on-site working for a client, one of the developers I worked with would begin each day with a brief discussion of one of the tenets from the "Zen of Python." For those who are not familiar with this little pearl of Python goodness, you can find the "Zen of Python" as an Easter egg inside a Python distribution:

>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

The Zen of Python is often quoted from one Python user to another to communicate something of the essence of what makes programming in Python different. While we were discussing one of the points, one of my co-workers suggested that there should be a "Zen of NumPy". This isn't the first time I've heard that suggestion. Actually, David Morrill (author of Traits) was the first person who suggested there should be a book about the "Zen of NumPy." I totally agree with him. The only problem is that everybody involved with NumPy has apparently been too busy to write one :-)

With this idea in mind, when it came time to give a talk on NumPy at the New York Python Meetup group in Manhattan, I decided to create a first draft of the Zen of NumPy. The phrases are included on one slide in the deck shared here.

I'm interested in feedback on these before proposing them for placement as

Here is my attempt at a "Zen of NumPy":

Strided is better than scattered
Contiguous is better than strided
Descriptive is better than imperative (use data-types)
Array-oriented is often better than object-oriented
Broadcasting is a great idea -- use where possible
Vectorized is better than an explicit loop
Unless it’s complicated --- then use numexpr, weave, or Cython
Think in higher dimensions
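
Several of these phrases are easier to see in code. The following is a minimal sketch, not taken from the talk or the slide deck; the array names (`a`, `view`, `pts`, `table`) and the `point` data-type are made up for illustration, and only standard NumPy calls are used.

```python
import numpy as np

# "Contiguous is better than strided": a C-ordered array and a view
# that skips every other column. The view shares memory with `a`,
# but its elements are no longer adjacent in memory.
a = np.arange(12, dtype=np.int64).reshape(3, 4)
view = a[:, ::2]

a_is_contiguous = a.flags['C_CONTIGUOUS']
view_is_contiguous = view.flags['C_CONTIGUOUS']
a_strides = a.strides        # bytes to step along each axis
view_strides = view.strides  # column step is doubled in the view

# "Descriptive is better than imperative (use data-types)":
# describe a record layout once instead of juggling parallel arrays.
point = np.dtype([('x', np.float64), ('y', np.float64)])
pts = np.zeros(3, dtype=point)
pts['x'] = [1.0, 2.0, 3.0]

# "Broadcasting is a great idea": a (3, 4) array minus a (4,) array
# subtracts the column means from every row without a loop.
col_means = a.mean(axis=0)
centered = a - col_means

# "Vectorized is better than an explicit loop": same result either way,
# but the second form pushes the loop into compiled code.
squares_loop = np.array([v * v for v in a.ravel()])
squares_vec = a.ravel() ** 2

# "Think in higher dimensions": a multiplication table built by
# broadcasting a column vector against a row vector.
x = np.arange(3)
table = x[:, np.newaxis] * x[np.newaxis, :]
```

Inspecting `a.strides` and `view.strides` side by side is a quick way to check whether an operation handed back a contiguous block or a strided view of one.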

I think there are useful edits as well as more statements that could be added to this list. Your feedback is welcome.

Friday, November 19, 2010

A New Blog

Lately I have been finding a need to have a voice --- an authentic voice, one that I occasionally expressed in the days when I had the time to be more active on open source mailing lists (SciPy, NumPy, and even Python itself). When I was younger, I didn't have as many endearing entanglements to the future that depend on my present. As a result, I could spend much time pursuing efforts that gave me a tremendous sense of accomplishment.

For as long as I can remember, I have been driven by discovery. Much to their annoyance, I would constantly ask my parents and 9 siblings "Why?" I used to be quite proud of myself as they would relate these stories of my inquisitive childhood at family gatherings. The particular combination of infused biochemistry that led to my knowledge addiction certainly drove most pursuits during my formative years, and this has had a strong impact on my life.

During my nearly 40 years, however, I have encountered an impressive cadre of awe-inspiring people, each uniquely different. This has led me to conclude that what matters is not the particular current physical emergence that I find myself in. Rather, it is the particular use I am making of it. Do I pursue an agenda that barely extends beyond my internal neurobiology, or do I use my combination of skills and knowledge to seek a wider consistency that can harmonize with a beautifully complex society?

Earlier tonight, I listened to technology leaders and entrepreneurs tell their view of what society would be like if their respective companies were wildly successful. I listened to this message in a stunning lecture hall in Peterhouse at Cambridge University. While they each brought a distinct perspective, their unifying message was the power of technology to change the world.

Search for "Silicon Valley comes to Cambridge" in a few days to get a summary and possibly even video of the talks. Megan Smith from Google (www.google.org) spoke of the power of big data to solve social injustices such as the sexual exploitation of children. Reid Hoffman, co-founder of LinkedIn, spoke of the power of inter-connectedness to solve big problems by bringing the right people together quickly.

Other people spoke and gave interesting perspectives, including Mike Lynch, founder of Autonomy, who gave a wonderful talk on the importance of meaningful interaction with data so that our lives are enhanced and not enslaved by the explosion of data and technology. He also gave a tribute to Thomas Bayes. Looking at his site, I noticed that he gives similar props to Claude Shannon. I'm already impressed. These are two thinkers who were able to present important concepts that remain under-appreciated.

I do think it's important what people think. The ideas we carry in our heads are critical. It is these ideas which drive our necessarily individual pursuits and can lead to disharmony. I like to pass along useful information, colored of course by my own experiences and perspective, in the simple and perhaps naive hope that sustainable, lasting solutions can be discovered.

Most of my posts will be technical, as I am hoping to use this forum as a way to write about the thoughts I am having in my own attempt to hone and pare them. In particular, most of these posts will be about technology that I am involved with or have some exposure to. Upcoming posts include "The Zen of NumPy", "7 Heresies of Technical Computing", and "What I've learned from SciPy and Open Source".

If you happen to come across these musings, your feedback is welcome. I would love to hear about your experiences with any thoughts that are covered in my posts.