alex gaynor's blago-blog

Posts tagged with community

Python for Ada

Posted September 23rd, 2014. Tagged with django, python, diversity, community, programming.

Last year I wrote about why I think it's important to support diversity within our communities, and about some of the work the Ada Initiative does to support this. The reasons I talked about are good, and (sadly) as relevant today as they were then.

I'd like to add a few more reasons I care about these issues:

  • I'm tired of wondering if I should recommend a local meetup to a friend: what if a known harasser shows up?
  • I'm tired of having people come up to me at conferences and tell me "this whole feminism thing might be going too far".
  • I'm tired of having thousands of people show up to leave angry comments because I merged a pull request.

I'm very tired of being tired. And yet, I can't even begin to imagine how tired I would be if I was a recipient of the constant stream of harassment that many women who speak up receive.

For all these reasons (and one more that I'll get to), I'm calling on members of the Python community to join me in showing their support for working to fix these issues, and foster a diverse community, by donating to support the Ada Initiative.

For the next 5 days, Jacob Kaplan-Moss, Carl Meyer, and I will be matching donations, up to $7,500:

Donate now

I encourage you to donate to show your support.

I mentioned there was one additional reason this is important to me. A major theme, for myself, over the last year has been thinking about my ethical obligations as a programmer (and more broadly, the obligations all programmers have). I've been particularly influenced by this blog post by Glyph, and this talk by Mike Monteiro. If you haven't already, take a moment to read/watch them.

Whoever came up with the term "User agent" to describe a browser uncovered a very powerful idea. Computer programs are expected to faithfully execute and represent the agency of their human operator.

Trying to understand the intent and desire of our users can be a challenging thing under the best of circumstances. As an industry, we compound this problem many times over by the underrepresentation of many groups among our workforce. This issue shows up again and again in how services such as Twitter and Facebook handle harassment and privacy issues. We tend to build products for ourselves, and when we all look the same, we don't build products that serve all of our users well.

The best hope we have for building programs that are respectful of the agency of our users is for the people who use them to be represented by the people who build them. To get there, we need to create an industry where harassment and abuse are simply unacceptable.

It's a long road, but the Ada Initiative does fantastic work to pursue these goals (particularly in the open source community, which is near and dear to me). Please, join us in supporting the ongoing work of building the community I know we all want to see, and which we can be proud of.

Donate now

You can find the rest here. There are view comments.

Quo Vadimus?

Posted May 26th, 2014. Tagged with open-source, community, python.

I've spent just about every single day for the last 6 months doing something with Python 3. Some days it was helping port a library, other days it was helping projects put together their porting strategies, and on others I've written prose on the subject. At this point, I am very very bored of talking about porting, and about the health of our ecosystem.

Most of all, I'm exhausted, particularly from arguing about whether or not the process is going well. So here's what I would like:

I would like to know what the success condition for Python 3 is. If we were writing a test case for this, when would it pass?

And let's do this with objective measures. Here are some ideas I have:

  • Percentage of package downloads from PyPI performed with Python 3 clients
  • Percentage of packages on PyPI which support Python 3
  • Percentage of Python builds on Travis CI which featured a Python 3 builder

I'd like a measurement, and I'd like a schedule: "At present x% of PyPI downloads use Python 3, in 3 months we'd like it to be at y%, in 12 months we'd like it to be at z%". Then we can have some way of judging whether we're on a successful path. And if we miss our goal, we'll know it's time to reevaluate this effort.
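As a rough illustration of what such a test case might look like, here is a minimal sketch. Everything in it is an assumption made for the example: `Milestone`, `percentage`, and `on_track` are hypothetical names, and any numbers you feed it would have to come from real measurements, which this sketch does not provide.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Milestone:
    """A dated target for an adoption metric (hypothetical structure)."""
    due: date
    target_pct: float


def percentage(py3_count, total_count):
    """Share of events (downloads, packages, builds) attributed to Python 3."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return 100.0 * py3_count / total_count


def on_track(measured_pct, today, milestones):
    """Return True if every milestone that is already due has been met."""
    return all(
        measured_pct >= m.target_pct
        for m in milestones
        if m.due <= today
    )
```

The point of the design is that the schedule is written down before the measurement is taken, so a missed milestone is an unambiguous failure rather than a matter of opinion.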

Quo vadimus?

Service

Posted May 19th, 2014. Tagged with django, python, open-source, community.

If you've been around an Open Source community for any length of time, you've probably heard someone say, "We're all volunteers here". Often this is given as an explanation for why some feature hasn't been implemented, why a release has been delayed, and in general, why something hasn't happened.

I think when we say these things (and I've said them as much as anyone), often we're being dishonest. Almost always it's not a question of an absolute availability of resources, but rather how we prioritize among the many tasks we could complete. It can explain why we didn't have time to do things, but not why we did them poorly.

Volunteerism does not place us above criticism, nor should it absolve us when we err.

Beyond this however, many Open Source projects (including entirely volunteer driven ones) don't just make their codebases available to others, they actively solicit users, and make the claim that people can depend on this software.

That dependency can take many forms. It usually means an assumption that the software will still exist (and be maintained) tomorrow, that it will handle catastrophic bugs in a reasonable way, that it will be a stable base to build a platform or a business on, and that the software won't act unethically (such as by flagrantly violating expectations about privacy or integrity).

And yet, across a variety of these policy areas, such as security and backwards compatibility we often fail to properly consider the effects of our actions on our users, particularly in a context of "they have bet their businesses on this". Instead we continue to treat these projects as our hobby projects, as things we casually do on the side for fun.

Working on PyCA Cryptography, and security in general, has greatly influenced my thinking on these issues. The nature of cryptography means that when we make mistakes, we put our users' businesses, and potentially their customers' personal information, at risk. This responsibility weighs heavily on me. It means we try to have policies that emphasize review, it means we utilize aggressive automated testing, it means we try to design APIs that prevent inadvertent mistakes which affect security, it means we try to write excellent documentation, and it means, should we have a security issue, we'll do everything in our power to protect our users. (I've previously written about what I think Open Source projects' security policies should look like).

Open Source projects of a certain size, scope, and importance need to take seriously the fact that we have an obligation to our users. Whether we are volunteers, or paid, we have a solemn responsibility to consider the impact of our decisions on our users. And too often in the past, we have failed, and acted negligently and recklessly with their trust.

Often folks in the Open Source community (again, myself included!) have asked why large corporations, who use our software, don't give back more. Why don't they employ developers to work on these projects? Why don't they donate money? Why don't they donate other resources (e.g. build servers)?

In truth, my salary is paid by every single user of Python and Django (though Rackspace graciously foots the bill). The software I write for these projects would be worth nothing if it weren't for the community around them, of which a large part is the companies which use them. This community enables me to have a job, to travel the world, and to meet so many people. So while companies, such as Google, don't pay a dime of my salary, I still gain a lot from their usage of Python.

Without our users, we would be nothing, and it's time we started acknowledging a simple truth: our projects exist in service of our users, and not the other way around.

Best of PyCon 2014

Posted April 17th, 2014. Tagged with python, community.

This year was my 7th PyCon, I've been to every one since 2008. The most consistent trend in my attendance has been that over the years, I've gone to fewer and fewer talks, and spent more and more time volunteering. As a result, I can't tell you what the best talks to watch are (though I recommend watching absolutely anything that sounds interesting online). Nonetheless, I wanted to write down the two defining events at PyCon for me.

The first is the swag bag stuffing. This event occurs every year on the Thursday before the conference. Dozens of companies provide swag for PyCon to distribute to our attendees, and we need to get it into over 2,000 bags. This is one of the things that defines the Python community for me. By all rights, this should be terribly boring and monotonous work, but PyCon has turned it into an incredibly fun, and social event. Starting at 11AM, half a dozen of us unpacked box after box from our sponsors, and set the area up. At 3PM, over one hundred volunteers showed up to help us operate the human assembly line, and in less than two and a half hours, we'd filled the bags.

The second event I wanted to highlight was an open space session, on Composition. For over two hours, a few dozen people discussed the problems with inheritance, the need for explicit interface definition, what the most idiomatic ways to use decorators are, and other big picture software engineering topics. We talked about design mistakes we'd all made in our past, and discussed refactoring strategies to improve code.

These events are what make PyCon special for me: community, and technical excellence, in one place.

PS: You should totally watch my two talks. One is about pickle and the other is about performance.

Gender neutral language - An FAQ

Posted November 30th, 2013. Tagged with ethics, diversity, open-source, community.

I'd like to refer to a hypothetical person in my documentation

Try something like this:

When a user visits the website, they will be assigned a session ID, and it will be transmitted to them in the HTTP response and stored in their browser.

But not like this!

When a user visits the website, he will be assigned a session ID, and it will be transmitted to him in the HTTP response and stored in his browser.

Why?

Using gendered pronouns signals to the audience your assumptions about who they are, and very often lets them know that they don't belong. Since that's not your intent, better to just be gender neutral.

And if you don't believe me, some folks did some science (other studies have consistently reproduced this result).

Can I just go 50/50 on male and female pronouns?

It's a nice idea, unfortunately it doesn't work. Your users don't read your documentation cover to cover, so they won't be able to see your good intentions. Instead they'll be linked somewhere in the middle, see your gendered language, and feel excluded.

In addition, not everyone identifies by male or female pronouns. Play it safe, just be gender neutral.

Using the plural pronouns isn't grammatically correct!

I've been assured by people far more knowledgeable than I that it's ok, even Shakespeare did it. Personally, I'm comforted by the knowledge that even if I'm wrong about the grammar, I won't have made anyone feel excluded.

Someone sent a pull request to my project changing the language!

So merge it! If you've got some process that a contributor needs to go through (such as a CLA), let them know. They're just trying to make your community better and bigger!

They said I was being hostile!

I'm sorry, but you were. Your choice of language has an impact on people.

I wasn't trying to be!

That's ok, hostility isn't about intent, your words had an impact whether you meant it or not.

Maybe you didn't know, you're not a native English speaker, your 11th grade English teacher beat you over the head with some bad advice. That's ok, it only takes a moment to fix it, and then you're letting everyone know it's easy to fix!

Aren't there bigger issues we should be dealing with?

There are so many giant issues we face. This one takes 15 seconds to fix, has no downsides, and we can all be a part of making it better. If we can't do this, how could we ever tackle the other challenges?

Has anyone ever asked these questions?

You have no idea.

Some of these aren't questions!

That's ok.

Affirmative action

Posted November 27th, 2013. Tagged with community, ethics, diversity.

Whenever the topic of affirmative action comes up, you can be sure someone will ask the question: "How would you feel if you found out that you got your job, or got into college, because of your race?"

It's funny, no one ever asks: "How would you feel if you got your job, or got into college, because you were systemically advantaged from the moment you were born?"

Interesting.

Security process for Open Source Projects

Posted October 19th, 2013. Tagged with django, python, open-source, community.

This post is intended to describe how open source projects should handle security vulnerabilities. This process is largely inspired by my involvement in the Django project, whose process is in turn largely drawn from the PostgreSQL project's process. For every recommendation I make I'll try to explain why I've made it, and how it serves to protect you and your users. This is largely tailored to large, high impact, projects, but you should be able to apply it to any of your projects.

Why do you care?

Security vulnerabilities put your users, and often, in turn, their users at risk. As an author and distributor of software, you have a responsibility to your users to handle security releases in a way most likely to help them avoid being exploited.

Finding out you have a vulnerability

The first thing you need to do is make sure people can report security issues to you in a responsible way. This starts with having a page in your documentation (or on your website) which clearly lists an email address people can report security issues to. It should also include a PGP key fingerprint which reporters can use to encrypt their reports (this ensures that if the email goes to the wrong recipient, they will be unable to read it).

You also need to describe what happens when someone emails that address. It should look something like this:

  1. You will respond promptly to any reports to that address, this means within 48 hours. This response should confirm that you received the issue, and ideally whether you've been able to verify the issue or more information is needed.
  2. Assuming you're able to reproduce the issue, now you need to figure out the fix. This is the part with a computer and programming.
  3. You should keep in regular contact with the reporter to update them on the status of the issue if it's taking time to resolve for any reason.
  4. Now you need to inform the reporter of your fix and the timeline (more on this later).

Timeline of events

From the moment you get the initial report, you're on the clock. Your goal is to have a new release issued within 2 weeks of getting the report email. Absolutely nothing that occurs until the final step is public. Here are the things that need to happen:

  1. Develop the fix and let the reporter know.
  2. You need to obtain a CVE (Common Vulnerabilities and Exposures) number. This is a standardized number which identifies vulnerabilities in packages. There's a section below on how this works.
  3. If you have downstream packagers (such as Linux distributions) you need to reach out to their security contact and let them know about the issue, all the major distros have contact processes for this. (Usually you want to give them a week of lead time).
  4. If you have large, high visibility, users you probably want a process for pre-notifying them. I'm not going to go into this, but you can read about how Django handles this in our documentation.
  5. You issue a release, and publicize the heck out of it.
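To make the clock concrete, the deadlines mentioned in this post can be derived from the time the report arrived. This is only a sketch: the function and key names are hypothetical, and the windows (48 hours to respond, 2 weeks to release, roughly a week of lead time for downstream packagers) are simply the numbers suggested in this post, not a standard.

```python
from datetime import datetime, timedelta

ACK_WINDOW = timedelta(hours=48)      # respond to the reporter
RELEASE_WINDOW = timedelta(weeks=2)   # goal for issuing the release
DOWNSTREAM_LEAD = timedelta(weeks=1)  # usual lead time for packagers


def response_deadlines(reported_at):
    """Return the key deadlines implied by a report received at `reported_at`."""
    release_by = reported_at + RELEASE_WINDOW
    return {
        "acknowledge_by": reported_at + ACK_WINDOW,
        # Downstream needs its lead time *before* the public release.
        "notify_downstream_by": release_by - DOWNSTREAM_LEAD,
        "release_by": release_by,
    }
```

Note that giving packagers a week of lead time inside a two week window leaves only about one week to develop and review the fix.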

Obtaining a CVE

In short, follow these instructions from Red Hat.

What goes in the release announcement

Your release announcement needs to have several things:

  1. A precise and complete description of the issue.
  2. The CVE number
  3. Actual releases using whatever channel is appropriate for your project (e.g. PyPI, RubyGems, CPAN, etc.)
  4. Raw patches against all supported releases (these are in addition to the release, some of your users will have modified the software, and they need to be able to apply the patches easily too).
  5. Credit to the reporter who discovered the issue.
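If you script any part of your release process, the list above is easy to turn into a pre-publication check. The following is a minimal sketch, assuming the announcement is represented as a plain dictionary; the key names are hypothetical and chosen only to mirror the five items.

```python
# Hypothetical keys, one per item in the checklist above.
REQUIRED_ITEMS = (
    "description",      # precise and complete description of the issue
    "cve_id",           # the CVE number
    "release_links",    # actual releases (e.g. on PyPI)
    "patches",          # raw patches against all supported releases
    "reporter_credit",  # credit to the reporter
)


def missing_items(announcement):
    """Return the checklist items that are absent or empty in `announcement`."""
    return [item for item in REQUIRED_ITEMS if not announcement.get(item)]
```

An empty list means every item is present; anything else is a reason not to hit publish yet.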

Why complete disclosure?

I've recommended that you completely disclose what the issue was. Why is that? A lot of people's first instinct is to want to keep that information secret, to give your users time to upgrade before the bad guys figure it out and start exploiting it.

Unfortunately it doesn't work like that in the real world. In practice, not disclosing gives more power to attackers and hurts your users. Dedicated attackers will look at your release and the diff and figure out what the exploit is, but your average users won't be able to. Even embedding the fix into a larger release with many other things doesn't mask this information.

In the case of yesterday's Node.js release, which did not practice complete disclosure and did put the fix in a larger patch, this did not prevent interested individuals from working out the attack. It took me about five minutes to do so, and any serious individual could have done it much faster.

The first step for users in responding to a security release in something they use is to assess exposure and impact. Exposure means "Am I affected and how?", impact means "What is the result of being affected?". Denying users a complete description of the issue strips them of the ability to answer these questions.

What happens if there's a zero-day?

A zero-day is when an exploit is publicly available before a project has any chance to reply to it. Sometimes this happens maliciously (e.g. a black-hat starts using the exploit against your users) and sometimes it happens accidentally (e.g. a user reports a security issue to your mailing list, instead of the security contact). Either way, when this happens, everything goes to hell in a handbasket.

When a zero-day happens basically everything happens in 16x fast-forward. You need to immediately begin preparing a patch and issuing a release. You should be aiming to issue a release on the same day as the issue is made public.

Unfortunately there's no secret to managing zero-days. They're quite simply a race between people who might exploit the issue, and you to issue a release and inform your users.

Conclusion

Your responsibility as a package author or maintainer is to protect your users. The name of the game is keeping your users informed and able to judge their own security, and making sure they have that information before the bad guys do.

Meritocracy

Posted October 12th, 2013. Tagged with politics, community, ethics, django, open-source.

Let's start with a definition: a meritocracy is a group where leadership or authority is derived from merit (merit being skills or ability), and particularly objective merit. I think adding the word objective is important, but it's not often explicitly stated.

A lot of people like to say open source is a meritocracy, the people who are the top of projects are there because they have the most merit. I'd like to examine this idea. What if I told you the United States Congress was a meritocracy? You might say "gee, how could that be, they're really terrible at their jobs, the government isn't even operational!?!". To which I might respond "that's evidence that they aren't good at their jobs, it doesn't prove that they aren't the best of the available candidates". You'd probably tell me that "surely someone, somewhere, is better qualified to do their jobs", and I'd say "we have an open, democratic process, if there was someone better, they'd run for office and get elected".

Did you see what I did there? It was subtle, a lot of people miss it. I begged the question. Begging the question is the act of responding to a hypothesis with a conclusion that's premised on exactly the question the hypothesis asks.

So what if you told me that Open Source was meritocracy? Projects gain recognition because they're the best, people become maintainers of libraries because they're the best.

And those of us involved in open source love this explanation, why wouldn't we? This explanation says that the reason I'm a core developer of Django and PyPy is because I'm so gosh-darned awesome. And who doesn't like to think they're awesome? And if I can have a philosophy that leads to myself being awesome, all the better!

Unfortunately, it's not a valid conclusion. The problem with stating that a group is meritocratic is that it's not a falsifiable hypothesis.

We don't have a definition of objective merit. As a result, there's no piece of evidence that I can show you to prove that a group isn't in fact meritocratic. And a central tenet of any sort of rigorous inquisitive process is that we need to be able to construct a formal opposing argument. I can test whether a society is democratic: do the people vote, is the result of the vote respected? I can't test if a society is meritocratic.

It's unhealthy when we consider our groups, or cultures, or our societies as being meritocratic. It makes us ignore questions about who our leaders are, how they got there, and who isn't represented. The best we can say is that maybe our organizations are (perceptions of subjective merit)-ocracies, which is profoundly different from what we mean when we say meritocracy.

I'd like to encourage groups that self-identify as being meritocratic (such as The Gnome Foundation, The Apache Software Foundation, Mozilla, The Document Foundation, and The Django Software Foundation) to reconsider this. Aspiring to meritocracy is reasonable; it makes sense to want the people who are most capable of doing so to lead us, but it's not something we can ever say we've achieved.

Effective Code Review

Posted September 26th, 2013. Tagged with openstack, python, community, django, open-source.

Maybe you practice code review, either as a part of your open source project or as a part of your team at work, maybe you don't yet. But if you're working on a software project with more than one person it is, in my view, a necessary piece of a healthy workflow. The purpose of this piece is to try to convince you it's valuable, and show you how to do it effectively.

This is based on my experience doing code review both as a part of my job at several different companies, as well as in various open source projects.

What

It only seems fair that before I try to convince you to make code review an integral part of your workflow, I precisely define what it is.

Code review is the process of having another human being read over a diff. It's exactly like what you might do to review someone's blog post or essay, except it's applied to code. It's important to note that code review is about code. Code review doesn't mean an architecture review, a system design review, or anything like that.

Why

Why should you do code review? It's got a few benefits:

  • It raises the bus factor. By forcing someone else to have the familiarity to review a piece of code you guarantee that at least two people understand it.
  • It ensures readability. By getting someone else to provide feedback based on reading, rather than writing, the code you verify that the code is readable, and give an opportunity for someone with fresh eyes to suggest improvements.
  • It catches bugs. By getting more eyes on a piece of code, you increase the chances that someone will notice a bug before it manifests itself in production. This is in keeping with Eric Raymond's maxim that, "given enough eyeballs, all bugs are shallow".
  • It encourages a healthy engineering culture. Feedback is important for engineers to grow in their jobs. By having a culture of "everyone's code gets reviewed" you promote a culture of positive, constructive feedback. In teams without review processes, or where reviews are infrequent, code review tends to be a tool for criticism, rather than learning and growth.

How

So now that I've, hopefully, convinced you to make code review a part of your workflow how do you put it into practice?

First, a few ground rules:

  • Don't use humans to check for things a machine can. This means that code review isn't a process of running your tests, or looking for style guide violations. Get a CI server to check for those, and have it run automatically. This is for two reasons: first, if a human has to do it, they'll do it wrong (this is true of everything), second, people respond to certain types of reviews better when they come from a machine. If I leave the review "this line is longer than our style guide suggests" I'm nitpicking and being a pain in the ass, if a computer leaves that review, it's just doing its job.
  • Everybody gets code reviewed. Code review isn't something senior engineers do to junior engineers, it's something everyone participates in. Code review can be a great equalizer, senior engineers shouldn't have special privileges, and their code certainly isn't above the review of others.
  • Do pre-commit code review. Some teams do post-commit code review, where a change is reviewed after it's already pushed to master. This is a bad idea. Reviewing a commit after it's already been landed promotes a feeling of inevitability or fait accompli, reviewers tend to focus less on small details (even when they're important!) because they don't want to be seen as causing problems after a change is landed.
  • All patches get code reviewed. Code review applies to all changes for the same reasons as you run your tests for all changes. People are really bad at guessing the implications of "small patches" (there's a near 100% rate of me breaking the build on changes that are "so small, I don't need to run the tests"). It also encourages you to have a system that makes code review easy, you're going to be using it a lot! Finally, having a strict "everything gets code reviewed" policy helps you avoid arguments about just how small is a small patch.
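To illustrate the first ground rule, here is the kind of check that belongs on a CI server rather than in a human review. It's a toy sketch: in practice you'd run an existing linter, and both the `long_lines` name and the default limit of 79 characters (borrowed from PEP 8) are assumptions for the example, not something your style guide necessarily says.

```python
def long_lines(source, limit=79):
    """Return (line_number, length) for each line longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]
```

A CI job would run this over every changed file and fail the build if the result is non-empty, so no human ever has to leave that comment.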

So how do you start? First, get yourself a system. Phabricator, GitHub's pull requests, and Gerrit are the three systems I've used; any of them will work fine. The major benefit of having a tool (over just mailing patches around) is that it'll keep track of the history of reviews, and will let you easily comment on a line-by-line basis.

You can either have patch authors land their changes once they're approved, or you can have the reviewer merge a change once it's approved. Either system works fine.

As a patch author

Patch authors only have a few responsibilities (besides writing the patch itself!).

First, they need to express what the patch does, and why, clearly.

Second, they need to keep their changes small. Studies have shown that beyond 200-400 lines of diff, patch review efficacy trails off [1]. You want to keep your patches small so they can be effectively reviewed.
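If you want a rough sense of where a patch falls relative to that range, you can count the added and removed lines in a unified diff. This is a sketch under stated assumptions: `changed_lines` and `is_too_large` are hypothetical helpers, the 400 line default is just the upper end of the range cited above, and the parsing is deliberately naive.

```python
def changed_lines(diff_text):
    """Count added and removed lines in a unified diff.

    Naive on purpose: file headers ("+++"/"---") are skipped by prefix, so a
    changed line whose own content starts with "++" or "--" is missed. That
    is acceptable for a rough size check, not for exact accounting.
    """
    count = 0
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file header, not a change
        if line.startswith(("+", "-")):
            count += 1
    return count


def is_too_large(diff_text, limit=400):
    """True if the diff has more changed lines than `limit`."""
    return changed_lines(diff_text) > limit
```

A warning from a check like this is a prompt to split the patch, not a hard rule; some changes (generated files, large renames) are big without being hard to review.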

It's also important to remember that code review is a collaborative feedback process. If you disagree with a review note, you should start a conversation about it; don't just ignore it, or implement it even though you disagree.

As a reviewer

As a patch reviewer, you're going to be looking for a few things, I recommend reviewing for these attributes in this order:

  • Intent - What change is the patch author trying to make, is the bug they're fixing really a bug? Is the feature they're adding one we want?
  • Architecture - Are they making the change in the right place? Did they change the HTML when really the CSS was busted?
  • Implementation - Does the patch do what it says? Is it possibly introducing new bugs? Does it have documentation and tests? This is the nitty-gritty of code review.
  • Grammar - The little things. Does this variable need a better name? Should that be a keyword argument?

You're going to want to start at intent and work your way down. The reason for this is that if you start giving feedback on variable names, and other small details (which are the easiest to notice), you're going to be less likely to notice that the entire patch is in the wrong place! Or that you didn't want the patch in the first place!

Doing reviews on concepts and architecture is harder than reviewing individual lines of code, that's why it's important to force yourself to start there.

There are three different types of review elements:

  • TODOs: These are things which must be addressed before the patch can be landed; for example a bug in the code, or a regression.
  • Questions: These are things which must be addressed, but don't necessarily require any changes; for example, "Doesn't this class already exist in the stdlib?"
  • Suggestions for follow up: Sometimes you'll want to suggest a change, but it's big, or not strictly related to the current patch, and can be done separately. You should still mention these as a part of a review in case the author wants to adjust anything as a result.

It's important to note which type of feedback each comment you leave is (if it's not already obvious).
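One way to picture the three types is as labels on review comments, where only some labels block landing. The sketch below is an illustration, not a description of how any particular review tool works; `Kind`, `Comment`, and `can_land` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    TODO = "todo"            # must be fixed before landing
    QUESTION = "question"    # must be answered, may need no change
    FOLLOW_UP = "follow-up"  # can be done separately, never blocks


@dataclass
class Comment:
    kind: Kind
    text: str
    addressed: bool = False


def can_land(comments):
    """A patch can land once every TODO and question has been addressed."""
    blocking = (Kind.TODO, Kind.QUESTION)
    return all(c.addressed for c in comments if c.kind in blocking)
```

Labelling each comment up front saves the author from guessing whether a note is a blocker or just a thought for later.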

Conclusion

Code review is an important part of a healthy engineering culture and workflow. Hopefully, this post has given you an idea of either how to implement it for your team, or how to improve your existing workflow.

[1] http://www.ibm.com/developerworks/rational/library/11-proven-practices-for-peer-review/

Being negative

Posted September 22nd, 2013. Tagged with thinking, community.

From time to time I joke that Bob Knight stole the title of my autobiography with his, which is titled "The Power of Negativity". I've never read the book, but it's very easy for me to imagine how it could apply to me. Many people who know me would immediately identify me as a negative person. They're not wrong, and it's a constant source of struggle for me.

To be clear: I'm sarcastic, I'm critical, I'm a perfectionist and impossible to impress, and I have a capacious ego. As a result of which I almost universally have a problem with any technology I come across, I have a critique to offer of nearly everything, both social and technical.

Some of this is probably my "personality" [1], but a lot of it is intentional. I'm deliberately negative about many things. There's a few reasons for this. First, I'm good at it, I seem to have an ability to identify and articulate problems with things. I also think it's important, when things are not perfect (and they so rarely are), we have a responsibility to speak honestly about them, and to discuss their flaws with the same prominence we discuss their features. Finally, articulating problems with things is one of the ways I learn best. Much of my philosophy about software, and the world, has been formed by identifying problems with the things that exist today.

The conflict about this negativity for me comes from two places. First, the effect it has on other people. For many people, when they see this negativity it has a demoralizing effect on them, they lose interest in something as a result. In particular I'm concerned that my attitudes could be discouraging to people getting into software development; James Coglan wrote a thing about this, and I certainly don't want to be part of the problem, particularly given how much I've invested in trying to make the tech community more, not less, welcoming. The second conflict comes from the fact that I am, at heart, a boundlessly optimistic person. A strong complement to my negativity is an unyielding belief that we must and can fix all of these things.

Where does this leave me? Uncertain. It is truly important to me that I continue to cast a critical eye on everything, including playing the devil's advocate; it's part of how I learn, and learning is very much something I want to continue to do. But I don't want to ever be why someone is afraid to get involved in programming, in open source, in speaking, or in anything else, because they're afraid I'll do nothing but critique their work. I don't know how to resolve this tension. For the past few months I've been trying to be less negative and angry on Twitter; I don't know how successful I'm being. I hope you'll try to help by letting me know when I've gone over the line.

[1] This isn't to say it's intrinsic, or immutable, but simply that it's not a conscious thing.


You guys know who Philo Farnsworth was?

Posted September 15th, 2013. Tagged with django, python, open-source, community.

Friends of mine will know I'm a very big fan of the TV show Sports Night (really any of Aaron Sorkin's writing, but Sports Night in particular). Before you read anything I have to say, take a couple of minutes and watch this clip:

I doubt Sorkin knew it when he scripted this (I doubt he knows it now either), but this piece is about how Open Source happens (to be honest, I doubt he knows what Open Source Software is).

This short clip actually makes two profound observations about open source.

First, most contributions are not big things. They're not adding huge new features, they're not rearchitecting the whole system to address some limitation, they're not even fixing a super annoying bug that affects every single user. Nope, most of them are adding a missing sentence to the docs, fixing a bug in a wacky edge case, or adding a tiny hook so the software is a bit more flexible. And this is fantastic.

The common wisdom says that the thing open source is really bad at is polish. My experience has been the opposite: no one is better at finding increasingly obscure edge case bugs than open source users. And no one is better at fixing edge case bugs than open source contributors (who overlap very nicely with open source users).

The second lesson in that clip is about how to be an effective contributor. Specifically that one of the keys to getting involved effectively is for other people to recognize that you know how to do things (this is an empirical observation, not a claim of how things ought to be). How can you do that?

  • Write good bug reports. Don't just say "it doesn't work". If you've been a programmer for any length of time, you know this isn't a useful bug report. What doesn't work? Show us the traceback, or the otherwise unexpected behavior, and include a test case or instructions for reproduction.
  • Don't skimp on the details. When you're writing a patch, make sure you include docs and tests and follow the style guide; don't just throw up the laziest work possible. Attention to detail (or lack thereof) communicates very clearly to someone reviewing your work.
  • Start a dialogue. Before you send that 2,000 line patch with that big new feature, check in on the mailing list. Make sure you're working in a way that's compatible with where the project is headed, and give people a chance to give you some feedback on the new APIs you're introducing.

This all works in reverse too: projects need to treat contributors with respect, and show them that the project is worth their time:

  • Follow community standards. In Python this means things like PEP8, having a working setup.py, and using Sphinx for documentation.
  • Have passing tests. Nothing throws me for a loop worse than when I check out a project to contribute and the tests don't pass.
  • Automate things. Things like running your tests, linters, even state changes in the ticket tracker should all be automated. The alternative is making human beings manually do a bunch of "machine work", which will often be forgotten, leading to a sub-par experience for everyone.
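
As a rough illustration of the "automate things" point, here is a minimal sketch of a script that runs a project's checks and reports which ones failed. The `run_checks` helper and the `CHECKS` list are hypothetical names made up for this example, and the choice of pytest and flake8 is only an assumption; substitute whatever your project actually uses.

```python
import subprocess
import sys


def run_checks(checks):
    """Run each (name, argv) pair and return the names of the checks that failed."""
    failed = []
    for name, argv in checks:
        # A non-zero exit status is treated as a failed check.
        result = subprocess.run(argv)
        if result.returncode != 0:
            failed.append(name)
    return failed


# Hypothetical check list: assumes pytest and flake8 are installed for this project.
CHECKS = [
    ("tests", [sys.executable, "-m", "pytest"]),
    ("lint", [sys.executable, "-m", "flake8"]),
]
```

Calling `run_checks(CHECKS)` from a CI job or a pre-push hook, and exiting non-zero if the returned list isn't empty, means nobody has to remember to run these by hand.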

Remember, Soylent Green Open Source is people

That's it, the blog post's over.


Your project doesn't mean your playground

Posted September 8th, 2013. Tagged with community, django, python.

Having your own open source project is awesome. You get to build a thing you like, obviously. But you also get to have your own little playground, a chance to use your favorite tools: your favorite VCS, your favorite test framework, your favorite issue tracker, and so on.

And if the point of your project is to share with the world a thing you're having fun with, that's great, and that's probably all there is to the story (you may stop reading here). But if you're interested in growing a legion of contributors to build your small side project into an amazing thing, you need to forget about all of that and remember these words: Your contributors are more important than you.

Your preferences aren't that important: This means you probably shouldn't use bzr if everyone else is using git. You shouldn't use your own homegrown documentation system when everyone is using Sphinx. Your playground is a tiny thing in the giant playground that is the Python (or whatever) community. And every unfamiliar thing a person needs to familiarize themselves with to contribute to your project is another barrier to entry, and another N% of potential contributors who won't actually materialize.

I'm extremely critical of the growing culture of "GitHub is open source". I think it's ignorant, shortsighted, and runs counter to innovation. But if your primary interest is "having more contributors", you'd be foolish to ignore the benefits of having your project on GitHub. It's where people are. It has tools that are better than almost anything else you'll potentially use. And most importantly it implies a workflow and toolset with which a huge number of people are familiar.

A successful open source project outgrows the preferences of its creators. It's important to prepare for that by remembering that (if you want contributors) your workflow preferences must always be subservient to those of your community.


Why I support diversity

Posted August 28th, 2013. Tagged with python, diversity, community, programming, django.

I get asked from time to time why I care about diversity in the communities I'm a part of, particularly the Django, Python, and broader software development and open source communities.

There are a lot of good answers. The simplest one, and the one I imagine just about everyone can get behind: diverse groups perform better at creative tasks. A group composed of people from different backgrounds will do better work than a homogeneous group.

But that's not the main reason I care. I care because anyone who knows how to read some statistics knows that it's ridiculous that I'm where I am today. I have a very comfortable job and life, many great friends, and the opportunity to travel and to spend my time on the things I care about. And that's obscenely anomalous for a high school dropout like me.

All of that opportunity is because when I showed up to some open source communities no one cared that I was a high school dropout; they just cared about the fact that I seemed to be interested, wanted to help, and wanted to learn. I particularly benefited from the stereotype of white dropouts, which is considerably more charitable than (for example) the stereotype of African American dropouts.

Unfortunately, our communities aren't universally welcoming, aren't universally nice, and aren't universally thoughtful and caring. Not everyone has the same first experience I did. In particular, people who don't look like me, who aren't white males, disproportionately don't have this positive experience. But everyone ought to. (This is to say nothing of the fact that I had more access to computers at a younger age than most people.)

That's why I care. Because I benefited from so much, and many aren't able to.

This is why I support the Ada Initiative. I've had the opportunity to see their work up close twice. Once, as a participant in Ada Camp San Francisco's Allies Track. And a second time in getting their advice in writing the Code of Conduct for the Django community. They're doing fantastic work to support more diversity, and more welcoming communities.

Right now they're raising funds to support their operations for the next year. If you can afford to, I hope you'll donate: http://supportada.org
