7 posts about code

Trying to Quit

Dec 3 2008

I have been trying to quit programming for about six months.

I'm down to about 2 times a day, on average, sometimes more. It's a problem.

Don't get me wrong, programming is one of the best disciplines I ever stuck my nose into. Programming teaches you fundamentally how to constantly improve your craft. But it's exhausting, distracting, and inseparable from the cult of efficiency. It takes you away from humans, mashes your nose into an LCD all night, and leaves you with a delusional sense of total control over your environment. I wish that I could take back every second of my life that I spent learning to build stacks and refactor my own crappy code.

The problem is that code is so damn useful. It beckons with APIs and gives you entire weekends of mindless flow; hack, hack, hack, make it pretty and fast.

So anyway, I'm working on it.

Visualizing Human Rights with the Google Charts API

Jun 20 2008

Smartly presented information is a nonprofit's best friend. If you can't communicate the problem, no one is going to give a damn. Hash's blog just pointed me to some powerful charts from the Sokwanele mapping project, which I've mentioned previously. These charts are extremely important data to have in the public domain, and it's great that they appear well-executed and polished, with a high resolution of visual information.

pretty charts, not so pretty data.

Google Charts from Sokwanele and Mobile Researcher

The charts at Mobile Researcher also caught my eye recently (also pictured, at right and bottom). Turns out they were both made with Google Charts. I hadn't used it before, but I recognized their densely set labels and had a vague idea that there was a Ruby wrapper. And I just thought that the Mobile Researcher charts were really beautiful. Turns out it's super easy. There are plenty of libraries for managing the requests to the API, and in some simple cases you can even code it by hand, since all of the chart parameters are simply passed through the query string of an image URL.

What it does do is provide fodder for organizations inside and out to make an even stronger case against this repressive regime. - Erik Hersman @ whiteafrican.com

But in most cases the data is encoded somehow with JavaScript, since URLs can only be so long. Since I was working in a Rails app, I used a great Ruby library and was up and running very quickly. I wholeheartedly recommend Google Charts anytime you need simple, sharp graphics to illustrate your research. If you have a web app with dynamic charts, it's a no-brainer, since generating images yourself is relatively intensive memory-wise.

All this is to say that something like this:

  <img src="http://chart.apis.google.com/chart?chd=t:6,2,6,4,2,4,7,3,4,2,3,4&chco=0077CC&chs=120x40&cht=ls" />

Turns into this:


Or you can abstract the access with a library and wrap it in a helper method like this:

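    # Returns an <img> tag for a 3D pie chart of the given values.
    # Gchart is provided by the Ruby charting library the app already loads.
    def google_pie(*args)
      chart_url = Gchart.pie_3d(:background => 'E9E7DD', :size => '300x80', :data => args)
      "<img src=\"#{chart_url}\" />"
    end

    # Usage (the 'valN' arguments stand in for your numbers):
    google_pie('val1', 'val2', 'val3')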

But the API of course is language neutral, and I am sure it would be just as easy in most any language. Oh, and it's free! (Though capped at 10k charts per day ... which would be a good problem to have, I suppose.)
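To make the "code it by hand" option concrete, here is a rough sketch that assembles the same sparkline URL as the image tag above. The parameter names (chd, chco, chs, cht) and the endpoint are taken from that example; the method name sparkline_url and its defaults are made up for illustration, and the values are assumed to be URL-safe already, so nothing is escaped.

```ruby
# Build a Google Charts sparkline URL from a list of numbers.
# sparkline_url is a hypothetical helper, not part of any library.
def sparkline_url(values, color: '0077CC', size: '120x40')
  endpoint = 'http://chart.apis.google.com/chart'
  params = {
    'chd'  => "t:#{values.join(',')}", # data, plain text encoding
    'chco' => color,                   # line color as a hex string
    'chs'  => size,                    # width x height in pixels
    'cht'  => 'ls'                     # chart type used in the example above
  }
  query = params.map { |key, value| "#{key}=#{value}" }.join('&')
  "#{endpoint}?#{query}"
end

puts sparkline_url([6, 2, 6, 4, 2, 4, 7, 3, 4, 2, 3, 4])
```

Drop the resulting string into the src attribute of an img tag and you are back at the hand-written version.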

The Three Simplest and Most Effective Anti-Spam Hacks I Have Ever Seen

May 28 2008

Hack zero: Switch to Gmail

This is not a joke: Gmail is a fantastic and nearly spam-free platform. Notably, you can hook it up with a custom domain name so no one knows you are part of the Goog machine like everyone else.

Hack one: Greylisting with Postfix on Ubuntu

A mail transfer agent using greylisting will "temporarily reject" any email from a sender it does not recognize. If the mail is legitimate, the originating server will most likely try again to send it later, at which time the destination will accept it. Wikipedia: Greylisting

Assuming that you have your own email server, greylisting is genius. Diabolically elegant, really. If you run an email server (or any server that can receive email) you are probably running the Postfix MTA, in which case there is a main configuration file appropriately named main.cf. A couple of edits to this file and you are on your way.

Here's how this setup looks (not my graph but I have definitely seen this happen on production mailservers):

The really brilliant thing about greylisting is that it deals with spam way before it ever reaches your inbox, which is the only way to go. (I don't use any spam filtering on my mailbox. That's too late, especially from a sysadmin perspective: think of the children cycles!)
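If the "temporarily reject, then accept the retry" idea feels abstract, this toy model may help. It is only a sketch of the concept, not how postgrey is written: the Greylist class, the 300-second delay and the addresses are all assumptions for illustration. It remembers when each (client IP, sender, recipient) triplet was first seen and accepts only once enough time has passed.

```ruby
# Toy greylister: defer unknown triplets, accept them on a later retry.
class Greylist
  def initialize(delay_seconds)
    @delay = delay_seconds
    @first_seen = {}
  end

  # Returns :defer (a temporary rejection) or :accept.
  def check(client_ip, sender, recipient, now = Time.now)
    triplet = [client_ip, sender, recipient]
    first = (@first_seen[triplet] ||= now) # remember the first attempt
    now - first >= @delay ? :accept : :defer
  end
end

greylist = Greylist.new(300) # assumed delay, not a recommendation
start = Time.at(0)
p greylist.check('192.0.2.1', 'a@example.org', 'b@example.com', start)       # first attempt
p greylist.check('192.0.2.1', 'a@example.org', 'b@example.com', start + 600) # retry ten minutes later
```

A real mail server retries on its own, so legitimate mail gets through after a short wait; most spam software never comes back.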

Step one: install postgrey.

sudo apt-get install postgrey

Two: edit your main.cf file.

sudo vi /etc/postfix/main.cf

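Three: Then open it up and look for your smtpd restrictions (typically smtpd_recipient_restrictions); add the following line, followed by the address and port where postgrey is listening:

check_policy_service inet: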

Four: Reload Postfix

/etc/init.d/postfix reload

Hack two: DNS Blocklists

This one is even easier, requiring only an extra line (for each blocklist). Just put the lines right there in that same block in main.cf. I typically use four of them. (Each has a slightly different purpose and tolerance. Check out the sites to get a flavor for why they exist.) This one is actually my favorite: it was created by the geek premier Paul Vixie and uses a DNS lookup for an extraordinarily light overhead.

Step One:

Open your main.cf file again and add these lines:

  reject_rbl_client list.dsbl.org,
  reject_rbl_client sbl.spamhaus.org,
  reject_rbl_client cbl.abuseat.org,
  reject_rbl_client dul.dnsbl.sorbs.net

Then reload Postfix:

/etc/init.d/postfix reload

As with the example above, you will also want to watch your mail log to make sure nothing's gone wrong.

sudo tail -f /var/log/mail.log
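For the curious, the DNS lookup behind a blocklist is tiny: the connecting IP's octets are reversed and stuck in front of the list's zone, and the mail server asks DNS for that name. A listed address resolves; an unlisted one does not. The sketch below only builds the name (no network access); dnsbl_query_name is a made-up helper and 192.0.2.99 is a documentation address, not a real spammer.

```ruby
# Build the hostname that gets looked up to check an IPv4 address
# against a DNS blocklist: octets reversed, then the list's zone.
def dnsbl_query_name(ipv4, zone)
  octets = ipv4.split('.')
  raise ArgumentError, "not an IPv4 address: #{ipv4}" unless octets.length == 4
  "#{octets.reverse.join('.')}.#{zone}"
end

puts dnsbl_query_name('192.0.2.99', 'sbl.spamhaus.org')
# prints 99.2.0.192.sbl.spamhaus.org
```

That single lookup is why the overhead is so light compared with filtering message content.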

Hack three: Keep Spammers out of Your Forms

This is really the ideal place to stop spam: before it happens. There are a bazillion ways to prove that someone is a human (CAPTCHAs ... sigh), but I think it is instead better to put the burden on the bots.

Step one: Add a hidden field to your form.

<textarea name="comment" class="hidden"></textarea>

Step two:

In your handler, ignore anybody that filled out that field (as robots will do). Here's a fragment in PHP (it assumes that the presence of an errors array will prevent submissions):

if (!empty($_REQUEST['comment'])) { $errors[] = "No Spam please."; }
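The same check is easy to port. Here is a rough Ruby equivalent, assuming the submitted fields arrive as a plain hash keyed by field name (as they roughly do in Rails or Sinatra); spam_errors is a hypothetical helper name, and 'comment' is the hidden field from step one.

```ruby
# Return validation errors for a submitted form.
# Humans never see the hidden 'comment' field, so any content in it
# suggests a bot filled in the form.
def spam_errors(params)
  errors = []
  honeypot = params['comment'].to_s.strip # nil becomes an empty string
  errors << 'No Spam please.' unless honeypot.empty?
  errors
end

p spam_errors('comment' => 'cheap pills here') # flagged
p spam_errors('comment' => '')                 # passes
```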

Those are my favorites; let me know if you have any others!

Kestrel update.

Feb 20 2008

Things are looking great so far with this harebrained project of ours.

Fabulous, actually: Bolt | Peters is super interested in the project and wants me to work on it for some percent of my total time at work. Which is fan-freaking-tastic! Thanks BP!

If you have no idea what I am talking about, check out my last post about it.

But, in short, these are the three things that I love about Kestrel already:

  1. Kestrel is a web application for farmers.
  2. Kestrel is a participatory design project.
  3. Kestrel is an open source project.
  4. Kestrel is a user-centered project. (Deeply so; as in, we won't build it if it doesn't solve real-people problems.)

Ok that was four. Anyway, I've gotten so much great feedback already by email, phone and comment, and I am now setting up interview dates. Let me know if you would like to talk on the phone for about an hour. We'll be gathering feedback about the initial concept and looking at some first drafts of first drafts. Basically, we're gabbing on the phone for a bit and I'm taking some notes.

You can participate in a number of ways:

Please leave a comment or email me at unthinkingly at gmail if you want to participate. You know you wanna. Research is fun!

We're conducting real live conversations, not just email exchanges, though email is also a great way to give feedback. Also, note that, as much as possible, we'll be recording interviews; part of the point of this project is that the methodology will be completely documented. We record stuff partly as just part of the public nature of participatory design, partly because we want to get as much informed criticism as possible, but also because we want to teach other communities of practice to create a web app!

Dammit, if we (as a very small, active team) can build something that works really well for 50 farmers, then we probably have created something that will work really well for 50,000 farmers.

Thanks to tes, Andrea, Beck, Mary, Chris R., Rick, Ben, Mike, Nate and Anne for commenting already on the previous post; we already have strong support in San Francisco, Portland and Central NC.

Understanding a community tag: the history of nptech

Jan 11 2007

Recently there has been a lot of discussion among the nonprofit technology geeks about the use (and usefulness) of the tag "nptech".

When the nptech tag started one of the ideas was to gather enough data to look and see what words people were using to describe, say, open source (open source, floss, foss, open source software) and then use those words to inform a taxonomy. It's taken a long time but I bet there's enough data in the nptech tag on a combination of bookmarking systems to do a little crunching and get at some of those commonly used terms. Sort of an emergent taxonomy... - Marnie Webb, nptech proto-tagger

The nptech tag (on del.icio.us) dates back to December of 2004 and was created by a group of nonprofit technologists who were exploring the potential for social tagging in the community. While I have a "curmudgeonly" eye for Web2.0 gizmos, in addition to a deep distrust of technophilic "progress" ... I think that the development of this tag is arguably the single largest reason for the current (thriving, I think) state of what is commonly called the "nptech community." Which means a lot to me.

(A great summary of the current conversation is at Beth Kanter's blog.)

Opinions abound. Most of us seem to be worked up about the efficiency of the tag. On this note there has been a lot of interesting reaction to a post by Gavin Clabaugh, which was critical of folksonomies. Laura Quinn of Idealware largely agrees with Gavin.

In this context, the general consensus seems to be that (1) taxonomies are harder to create than folksonomies, but they are better in many contexts, and (2) we need more data about how to make the nptech tag more useful as an "emergent taxonomy".

So, in the spirit of improving the tag and promoting the nptech community, here's some data:

  1. A plain text listing of every word that has been used on del.icio.us in association with nptech. fulltext.xml
  2. A sorted and ranked list of these tags. nptech-tagged.txt
  3. All of the tags presented as a scrollable tag-timeline.
  4. The script that I wrote to gather the data from delicious (in perl): community-tag-robot.txt. (The code is also displayed below with syntax highlighting.)


The script that I wrote crawls the pages of del.icio.us and pulls out all of the tags that were used to describe the same stuff tagged "nptech". This gives us an idea of how the tag has been used — effectively describing the tagged links, if we assume taggers are using "synonym clouds". Del.icio.us has a "related tags" feature but it is lame (only 10 are listed), and judging from my initial review of the data it is pretty random. (Not really sure if I broke some terms of use or not with my script, but it's our data, right? And besides, the script is very polite.)
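The counting half of that idea is simple enough to sketch. This is not the original Perl script and it does no crawling; it assumes you already have each bookmark's tags as an array, and the sample data and the rank_cotags name are invented for illustration. It tallies every tag that appears alongside the community tag and ranks the results.

```ruby
# Count the tags used alongside a community tag and rank them.
def rank_cotags(bookmarks, community_tag)
  counts = Hash.new(0)
  bookmarks.each do |tags|
    normalized = tags.map(&:downcase).uniq
    next unless normalized.include?(community_tag)
    normalized.each { |tag| counts[tag] += 1 unless tag == community_tag }
  end
  # Highest count first; ties broken alphabetically for stable output.
  counts.sort_by { |tag, count| [-count, tag] }
end

# Made-up bookmarks, one array of tags per bookmark.
sample = [
  %w[nptech opensource foss],
  %w[nptech OpenSource blogging],
  %w[cooking recipes]
]
rank_cotags(sample, 'nptech').each { |tag, count| puts "#{count}\t#{tag}" }
```

Lowercasing before counting folds variants like "OpenSource" and "opensource" together, which matters when the goal is to see which words people reach for.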

There are a lot of delicious mashupy-type things that show you tagging patterns, but these approaches seem somehow very passive, and not community-oriented. I mean, in general delicious is used very passively: people want to be able to consume more efficiently, not create some community in which greater action can be taken. Or it is just used for explicitly personal purposes, as a web-based bookmark service.

What I like so much about the nptech tag is that it was intentionally created to support and reflect a community (unlike, say, the tag "nintendo," which may very well support a community, but it is not active in a self-critical, dialogic way.) And certainly there is a beauty, I think, in using these hyper-technological tools (which have the ability to be very atomizing and consumerist) for the sake of doing things that are explicitly not-for-profit and mission-driven.

And personally I tend to agree with Michelle Murrain that we need to be wary of an "expert" approach to developing our tags and community taxonomies. That line of thinking is what made me want to do this in the first place. (Likewise I need to point out how much I have been thinking lately about stuff that I have been reading at Ulises Ali Mejias' blog, like this.)

Anyway, further experimentation (graphs/charts from Excel would be easy using the text files, for instance) would be nice; please let me know if you are doing something interesting with the data. I'm hoping that this will help us, as a community, determine what we want to do with this tag now that we have been using it for more than two years. What patterns do you see in the data? What does the nptech tag mean for our community? I am not going to try to start doing any analysis here, now, but I would really like to hear what people's reactions to the tag timeline are.

There are still a lot of holes in this data that I could fill with a bit more programming (e.g., who has been using the tag?). Suggestions for extending the script are welcome. What do we want to know?

Low-Bandwidth user experience

Jan 5 2007

I make websites, and I manage a few web servers. Making sure that pages load quickly is a pretty fundamental part of my job.

Lately I have been thinking a lot about how much more important this concern is for people who are in low-bandwidth environments (my house in rural NC), and especially in what you might call "ultra-low bandwidth" places, where issues like cost or a lack of reliable power compound the issue by orders of magnitude.

I am thinking about how web developers can become more invested in the ultra-low-bandwidth user experience.

When it comes to bandwidth, international bloggers are talking about something totally different compared to the "optimization" issues that web developers are fussing over.

For example, the Yahoo User Experience blog has just posted a really interesting analysis of page load time optimization techniques. But it occurs to me that the audience that they are writing for is largely the elite of the internet — they are trying to save a few tenths of a second. They are tuning a product, not really thinking about ultra-low-bandwidth situations. Which is fine, that's why they exist I guess.

I mean, it's web development 101 that you need to make your pages small. Load time is the number one element in usability. Even with that fancy DSL line, you appreciate pages that respond in tenths of a second instead of 2 or 3 seconds. I'll take it. And I am glad that the Yahoo folks are doing this great research.

For example, a November post made me think about my techniques a bit differently. They pointed out that having a low-weight page is nice, but the speed experience is much more greatly affected by the number of elements on the page. An understanding of latency vs. bandwidth makes this point even clearer, once you consider it.
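A back-of-the-envelope model shows why the request count matters so much. The sketch below assumes the crudest possible picture: requests happen one after another, each pays one full round trip, and all bytes share one link. It ignores parallel connections, DNS and everything else, and the numbers are invented to resemble a slow, high-latency connection.

```ruby
# Rough page load estimate: one round trip per request, plus transfer time.
def load_time_seconds(requests:, total_kilobytes:, rtt_seconds:, kilobytes_per_second:)
  requests * rtt_seconds + total_kilobytes.to_f / kilobytes_per_second
end

# Same 100 KB page over the same assumed link (0.5 s round trip, 5 KB/s),
# split into many small elements versus a few larger ones.
many_elements = load_time_seconds(requests: 40, total_kilobytes: 100,
                                  rtt_seconds: 0.5, kilobytes_per_second: 5)
few_elements  = load_time_seconds(requests: 4, total_kilobytes: 100,
                                  rtt_seconds: 0.5, kilobytes_per_second: 5)
puts many_elements # prints 40.0
puts few_elements  # prints 22.0
```

The page weight is identical in both cases; under these assumptions the extra 18 seconds come purely from waiting on round trips.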

Today they published an even more interesting look at the browser, focusing on the fact that browser caching appears to only be utilized in about 20% of browsers! As a developer there is no way to force a user's browser to cache data, even though caching could easily reduce page load times by about 3x, depending on the content being served, mostly because of the point above: you save the most time by reducing the number of HTTP requests altogether.

This immediately made me think that this is a technique that should be built into any browser that gets used in a low-bandwidth environment ... but I bet that there are innumerable browsers installed that do not take advantage of this feature, even where connections are extremely slow. Hopefully the browsers that ship with the $100 laptop (and its kind) will be caching as much as possible. ...

Relatively Simple RSS Aggregation

Mar 20 2006

I recently posted about my need for a simpler RSS aggregator.

I worked through the massive list of existing RSS aggregators at Wikipedia and couldn't really find something that did exactly what I wanted and worked.

I needed something that was simple, cached hourly, displayed various encodings well and worked with RSS and Atom formats. Most importantly, I wanted something I could install on my own server, and it needed to be community oriented (not designed for a single reader).
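The "cached hourly" requirement is the easiest of these to picture in code. The sketch below is only an illustration of the idea, not what Magpie or Feed2JS actually do: HourlyCache is an invented class, the feed URL is a placeholder, and a counter stands in for the real download so that nothing touches the network.

```ruby
# Minimal time-based cache: the block runs only when the stored copy
# for a key is missing or older than ttl_seconds.
class HourlyCache
  def initialize(ttl_seconds = 3600)
    @ttl = ttl_seconds
    @store = {}
  end

  def fetch(key, now = Time.now)
    entry = @store[key]
    return entry[:value] if entry && now - entry[:at] < @ttl
    value = yield # cache miss or stale copy: do the expensive work
    @store[key] = { value: value, at: now }
    value
  end
end

cache = HourlyCache.new
downloads = 0
fake_download = -> { downloads += 1; "<rss>copy #{downloads}</rss>" }

start = Time.at(0)
feed = 'http://example.org/feed.xml' # placeholder URL
cache.fetch(feed, start) { fake_download.call }
cache.fetch(feed, start + 1800) { fake_download.call } # served from the cache
cache.fetch(feed, start + 4000) { fake_download.call } # stale, fetched again
puts downloads # prints 2
```

With a community page that many people visit, this is what keeps each visit from re-fetching every feed.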

Most of the existing solutions on the web are hosted by a third party, and are limited in the number of feeds that they can aggregate.

A robust, simple option for a third-party hosted aggregator is Feed2JS, but if you want a bunch of feeds on your site you'll have to use a bit of JavaScript for each one. Multiple feeds get ugly fast, and if their server goes down, your scripts go down.

The best hosted solution I found was Gregarius, which has a lovely community. I think that Gregarius will take the cake sometime this year as aggregating catches on. For now, however, the plugins are pretty limited: you are locked into a personal reader with "Read/Unread" tagging. (Which makes Gregarius a great replacement for a newsreader like Bloglines, but it's not good for a community aggregator where lots of people visit.)

If I were a better person I'd just write a Gregarius plugin, but instead I've just reworked some of Feed2JS's code (which is in turn dependent on the Magpie RSS parser).

You can download it here.

It's an index.php file with an "admin" folder. Drop it on your server in an appropriate directory and you should have a page that displays 15 biofuels blogs. NOTE: if you don't know what PHP is or don't have FTP access to your server, just use Feed2JS and insert their JavaScript into your page.

Otherwise edit the PHP file with the RSS or Atom feeds you want to include. It will probably need a little CSS to spice things up. You can send comments or questions about the script to chris(at)nonprofitdesign.org. I'll release another version sometime soon with better documentation and a prettier front end.

An example of the styled script in action is here (at the Piedmont Biofuels website).

UPDATE: That page now uses simplepie, which kicks ass.