8 posts about tagging

'Slashtags' for citizen editors

Nov 9 2009

Updated Nov 16, 2009: @chrismessina created a wiki for the Twitter syntax http://microsyntax.pbworks.com/Slashtags

The NYT reported today on how the #fthood hashtag has failed:

Until lately, the main way to make sense of an urgent outpouring of tweets on a particular subject was to use text searches: look for the phrase 'Fort Hood,' for example, or maybe an agreed-upon label, '#fthood,' within tweets. Yet during events like the shootings on Thursday at Fort Hood that left 13 people dead, this method is useless. Hundreds of 'relevant' tweets pop up every minute, most repeating the same news reports over and over again or expressing concern from far away.
"Refining the Twitter Explosion" on nyt.com

I believe that there is an enormous potential to do citizen journalism better on the web, and that we need the leadership of people who are willing to help clean up the mess. Unlike some people, I do not think that the poor citizen journalism around #fthood is an indictment of citizen journalism — rather I would say it points to the absence of citizen editors.

In the Vote Report and Swift parlance, these are "Sweepers," the custodians working to clean the stream, validate claims, and generally insert some professionalism.

Taken to their logical next step, you can see the emergence of volunteer "citizen editors," who appreciate journalistic rigor and take time to bring signal to the noise in dozens of different ways.

Recently around Meedan we have been talking a lot about using Delicious and Twitter tagging to more effectively manage our content across our many networks, and to bring more meaningful conversations to our users.

This is the power of tags: they are impossible to contain in a single network.

By relying on Delicious and other social bookmarking systems, we've been able to build our editorial backchannel into numerous social platforms. Rather than being stuck with the limitations of some CMS and having to copy everything out to our social network, we can use the social network and then bring it in to our own domains.

That's always a smart approach for nonprofits, because it builds your conversation in a meaningful and searchable way. Metadata value (real usable value!) accumulates like interest in your bank account. And citizen editors are the people who are trying to make this system provide even more of a return, because fundamentally we want more people to care, understand and take action.

Twitter Lists Taken Seriously

So we've been looking into some of the existing pseudo-standards like the #hashtag, and looking for ways to improve our journalistic rigor. George recently posted about using the new Twitter Lists feature to curate groups of sources for our Iran Twitter feed:

Rather than treating our Twitter list as a gizmo, with shoddy maintenance and dubious output, what if we put some rigor into it by beginning with Journalism 101?

George, our lead editor, knows this stuff all too well:

What is the reported location of the Twitter Stream? Is the Twitter Stream using Farsi or a local language? How long has the Twitter Stream account been up and running?

(And oh yes there are many more criteria.)

I think these are good, basic questions that may not be answered by some organizations, and their lists are thus quantifiably worse, in the sense that they are less reliable, less meaningful, and probably noisier. So we can see that by following basic journalistic standards, your attention data becomes more valuable. Garbage in, garbage out, or, more positively, the system can be improved.

For nonprofits, which typically do not have a microgram of energy to spare, these kinds of tricks can be really helpful.

#hashtags and /slashtags

A great example of this type of "attention data enhancement" is the #hashtag, which clarifies the context of a short statement on Twitter with a globally recognizable tagging syntax. (I'll spare us the debate around hashtags, but suffice it to say, they can be done better.)

Chris Messina, one of the biggest advocates of #hashtags and other microsyntax, has just described a few extra bits of attribution using the "slasher." (I think we could just call it a "slashtag.")

'Pointers' are short words with different intentions. A group of pointers should typically be prefixed by ONE slasher character. You can daisy-chain multiple pointer phrases together, padded on both sides with one whitespace character. There should be NO space following the slasher. Hashtags should be appended to the very end of a tweet, except when they are part of the content of the message itself and indicate some proper name or abbreviation. Normal words that would be part of the content of a tweet anyway SHOULD NOT be hashed.
"New microsyntax for Twitter: three pointers and the slasher"

Particularly I think using /by is a great idea to reference an article or direct quote.

Using /by gives a very specific meaning to the username that follows it. It's intuitive enough that I don't think it even needs to be explained, you can just read it:


Not beautiful, but very clear.

This is useful for when you need to be more precise — say, if you wanted to use your attention data in another application.
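To make "use your attention data in another application" a bit more concrete, here is a rough Python sketch of how a program might pull slasher pointers out of a tweet. Everything in it is an assumption for illustration: the function name parse_slashtags, the set of pointer words (only /by and /for are discussed in this post), and the example tweet and usernames, which are made up. It reads the quoted rules loosely: one slasher starts the group, pointer words are daisy-chained after it, and hashtags sit at the end.

```python
# Pointer words recognised after the slasher. "by" and "for" are discussed
# in this post; "via" and "cc" are added here as assumed extras.
POINTERS = {"via", "by", "cc", "for"}


def parse_slashtags(tweet):
    """Return (message, pointers, hashtags) for a tweet.

    pointers maps each pointer word to the usernames that follow it.
    """
    tokens = tweet.split()
    # The slasher group starts at the first token like "/by" (no space
    # after the slash). URLs do not match because they do not start with "/".
    start = next(
        (i for i, tok in enumerate(tokens)
         if tok.startswith("/") and tok[1:].lower() in POINTERS),
        None,
    )
    if start is None:
        return tweet.strip(), {}, []

    message = " ".join(tokens[:start])
    pointers = {}
    hashtags = []
    current = None
    for tok in tokens[start:]:
        word = tok.lstrip("/").lower()
        if word in POINTERS:
            # A new pointer word: later usernames attach to it.
            current = word
            pointers.setdefault(current, [])
        elif tok.startswith("@") and current is not None:
            pointers[current].append(tok[1:].rstrip(",.;:"))
        elif tok.startswith("#"):
            hashtags.append(tok[1:])
    return message, pointers, hashtags


# Hypothetical tweet, not a real one.
example = "Great read http://example.com/a /by @some_author via @some_friend #fthood"
message, pointers, hashtags = parse_slashtags(example)
```

With that example, pointers["by"] holds the author and pointers["via"] holds the person who passed the link along, which is exactly the kind of distinction a plain hashtag cannot carry.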

For us at Meedan, this is the direction we are headed, fast. We are working on developing a clear and simple standard for using tags on the delicious network. This standard will be something that our editorial team (and anyone who cares to participate) can use to route information to our hand-curated database. You don't have to leave the comfort of your own twitter client, or use any fancy tools — just the simple, clear standards that we are figuring out.

We are already making great use of social bookmarks at Meedan as an editorial backchannel. For example, you can see all of Meedan's Iraq sources on delicious, from our lead editor:


And everything that the Meedan user unthinkingly (me) has tagged as being generically "for meedan" (using an informal tag "for_meedan").


Because George also uses this tag, we can get a nice community of practice working together. This page shows the shared pool:


So, as you can see, we are using underscores, which is a common tagging convention because it looks like a space. We're not so happy with this: it's simply not expressive enough.

(Even though you can do a lot with a single little shared tag like #nptech.)

A more robust tagging system, which I believe would be very compelling if it were well designed, would extend some of this syntax. The question is: how to extend the syntax without making it overwhelming?

Setting some goals

I think that any tag needs to follow a standard that meets several criteria:

1.) it should read naturally when spoken out loud (no dots, equals signs, or weird abbreviations)
2.) it should be as cross-network as possible (for now the syntax should not break on Twitter or Delicious) [1]
3.) it should rely on aliases instead of strict taxonomies (tag first, fix it later)

So what I'm talking about is extending the tag that George used to curate Iraqi newspapers, iraq_newspaper to something like this:


which I think has several advantages.

  1. of the tag in ways that make the taxonomy immediately clearer. Iraq is nested "inside" a type of source.
  2. It works on Twitter
  3. It works on Delicious
  4. It is still very short (adds only one character over the underscore)

On delicious, spaces are not allowed, so I have started using two slashes. So where previously I might have tagged the article with a kind of meaningless tag:


but now I can tag it


Which is still a pretty meaningless tag, but is at least prefixed meaningfully to mean "this content is by this person" as per chris' helpful article above.

Also I can improve the previous technique of using the for_meedan shorthand





Which has the benefit of being equally readable, while obeying a more general rule of syntax.
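As a rough sketch of that more general rule, here is how the old underscore shorthand could be rewritten mechanically, and how the Delicious form and the Twitter form described in footnote [2] could be converted into each other. The function names are hypothetical, and the code assumes the simple two-part shape of /for/meedan; it is not part of any existing tool.

```python
def underscore_to_slashtag(tag):
    """Turn an underscore tag such as "for_meedan" into "/for/meedan"."""
    return "/" + "/".join(part for part in tag.split("_") if part)


def delicious_to_twitter(tag):
    """Turn the Delicious form "/for/meedan" into the Twitter form "/for @meedan"."""
    parts = [part for part in tag.split("/") if part]
    if len(parts) != 2:
        raise ValueError(f"expected a two-part slashtag, got {tag!r}")
    pointer, name = parts
    return f"/{pointer} @{name}"


def twitter_to_delicious(text):
    """Turn the Twitter form "/for @meedan" into the Delicious form "/for/meedan"."""
    pointer, name = text.split()
    return f"{pointer}/{name.lstrip('@')}"


delicious_tag = underscore_to_slashtag("for_meedan")   # "/for/meedan"
twitter_tag = delicious_to_twitter(delicious_tag)      # "/for @meedan"
```

The round trip is lossless for this simple case, which suggests the difference between the two networks could be handled by tools rather than by people.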

Machine tags are not what we want, we are not machines

By far the most complete standard that is being used to solve these problems is the machine tag. This tag uses a colon and an equals sign to indicate a much more specific (though not necessarily accurate) structure. The history is from the geo community, mostly for this:


These namespaced key value pairs are admirably used as the output of some web apps, but are quite intimidating for human input.

Common opinion seems to be that they are too "dorky" to be usable at this point, considering especially that any good taxonomy is constantly in slight flux. (Though Flickr has made great use of them to kick off custom actions in their UI.)

Similarly, what might be called a "double tag" is an interesting simplification down to a context-less key value pair:


In fact this is what comprises almost all of the tags in OSM, one of the most ambitious tagging innovations on the web. (I have said before that tagging is the secret sauce that makes a crazy project like OSM work.)

Finding a balance

Replace the equals sign in that last example, and you have slashtags, which I think are much better at communicating that "color" is a parent of the "red" value:


In this way, this "slashtag" or "slasher" approach, extended with a tiny bit of folksonomic convention, could really strike the right balance between editorial simplicity and powerful machine-readability.
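Because the mapping is so mechanical, software can translate between the human-friendly and machine-friendly forms. The sketch below is only an illustration: the function names are made up, it assumes machine tags of the shape namespace:key=value and double tags of the shape key=value, and the sample values are placeholders.

```python
def to_slashtag(tag):
    """Rewrite "key=value" or "namespace:key=value" as a slash-separated tag."""
    path, sep, value = tag.partition("=")
    if not sep or not path or not value:
        raise ValueError(f"not a key=value tag: {tag!r}")
    # A colon, if present, separates the namespace from the key.
    return "/".join(path.split(":") + [value])


def from_slashtag(tag):
    """Rewrite a two- or three-part slashtag back into the equals-sign form."""
    parts = [part for part in tag.split("/") if part]
    if len(parts) == 2:
        key, value = parts
        return f"{key}={value}"
    if len(parts) == 3:
        namespace, key, value = parts
        return f"{namespace}:{key}={value}"
    raise ValueError(f"expected two or three parts, got {tag!r}")


double_tag = to_slashtag("color=red")        # "color/red"
machine_tag = to_slashtag("geo:lat=37.87")   # "geo/lat/37.87"
```

People could type the slash form while applications keep consuming the stricter form behind the scenes.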

Finding better editorial tools for realtime crises

I think that a better-defined tagging approach could really help make sense of critical, breaking news.

A wiki about hurricane Ida, for example, is probably not the right way to manage news about a critical event:

[http://farm3.static.flickr.com/2712/4089097932_350f83174c.jpg Ida]

Mediawiki makes me groan just looking at it. I'd much rather help update that information by tagging links into delicious, and knowing that someone is listening on the other end. This would motivate me to learn the emergent standards, follow a loose taxonomy, and generally try to be more articulate.

If we could react in realtime to create a more sophisticated picture of the news by expressing ourselves more clearly in the tagging interaction, I think we could ultimately make great strides in improving citizen journalism (even if all the idiots keep on tweeting, which, naturally, they will.)

This is why the usability of a citizen editor tagging scheme is so critical: it needs to be flexible enough (to handle hurricanes) but maintain a low barrier to participation (to cultivate citizen editors). The tagging approach has already proven itself in many trivial domains; now we need to step it up using our journalistic standards, and our shared interest in making sense of the news, particularly crises.

We are early in this strange distributed crisis data management effort, but I think that some of the ideas proposed by Chris Messina, and the experiments of the OSM community, go a really long way in this regard. Particularly the nestability and readability seem like great virtues of this tagging system. Overall the "slash" is a widely understood metaphor, used by all major operating systems to indicate traversing "down" or "up" a taxonomy.

I'm going to transition some of my tagging habits accordingly, and see where it ends up!

I would love to know what you think. Stop by the contact page or find @unthinkingly on Twitter.

[1] Notice how it breaks on gnolia.com and breaks on flickr.com, although it appears that Flickr preserves the slashes in the background and just doesn't display them on output.

[2] On Twitter there is a bit of a variation required if we are to follow existing patterns: 1.) I can omit the space, so I will, and 2.) you need to prefix a user's name with the @ sign, like /for @meedan. I think this is still quite readable, but the difference between networks might need to be cleared up. We could in fact collapse the Twitter tags to /for/meedan (i.e., identical to the Delicious tag), but this would probably break some automation in Twitter clients that are expecting the @ prefix.


Sep 29 2009

I have been keeping track of things that are interesting on Delicious for some time now.

Here're my top tags from September 2009 in case you are interested in checking out my Delicious feed.

  1. design 425
  2. development 174
  3. ux 172
  4. programming 166
  5. opensource 149
  6. ui 148
  7. for_meedan 118
  8. usability 114
  9. journalism 96
  10. mobile 95

I am working on several projects that use tags in slightly different ways, involving other specialized "social bookmarking" applications. I will post a note here when those feeds are ready! My current development activities (as a frontend coder and interaction designer) will be released over the course of 2008 at http://meedan.net and http://swiftapp.org.


May 7 2009

For the last 5 months I've been working with friends at Ushahidi and Meedan on a project nicknamed "Swift."

Our goal with Swift is to provide a crowdsourcing platform for "data triage." Imagine something like Mechanical Turk used only for tagging news, photos, microblogging and videos. There's no business model or anything like that — it's strictly Open Source Nonprofity Goodness(tm). Meedan and Ushahidi are partners in hacking it out.

As a user of Swift you can sit down at an "assembly line" of news and tag it. Swift gives you a straightforward aggregator for news (say, news about earthquakes in California) then asks you to tag all of the people, places and organizations in that firehose of data. With a little bit of effort (collecting a few RSS feeds and marking up all the content) it becomes possible to put a very bright light on an emerging part of the web. You can, for example, tag violations of electoral code in an election, as we are doing with Vote Report India, which uses Ushahidi and WordPress as a platform for grassroots reporting in the month-long Indian election.

I'm especially interested in knowing how much we can actually do with the public data that emerges in realtime during a crisis. From a journalistic perspective, it seems like there is an opportunity to understand more concretely what the hell is going on.

For Ushahidi, Swift is an extension of their existing SMS reporting cycle. By "listening" to the "outside" web in a more structured way, the hope is that we can provide more relevant alerts to people on the ground in a crisis.

For Meedan, Swift is a tool for a team of editors who need to produce interesting content for their digital newsroom. Because it is an aggregator, Swift serves naturally as a listening post as well as a tagging workbench. Rope in a few feeds (such as a Twitter search results feed for "election") and then do location extraction for the Middle East with Calais on that feed, and you have a pretty cool stream of entities.

Today we had a great meeting at InSTEDD, with a crazy good crowd of people — everybody was in town for the conference at Berkeley. Thanks to everyone for their ideas and support!

Here's my presentation from today:



All of the photos and links can be found on my Flickr page.

Swift seeks to publish all of the entities that concerned communities publish about crisis, both hot flash and slow burn events. The core use case is for the period immediately following a disaster or crisis, during the hours and days of confusion.

One thing that is always interesting about Swift is that it is a very unusual use case. The tragedy of a crisis creates a temporary period of great social empathy during which many "rules" of interaction design break down. This is a design opportunity. Many people are willing to match their #have to someone else's #need, but they don't have a medium for volunteering, or a network of supporters who can contextualize and respect their work. We just watch CNN and feel powerless; we would love a way to help, as an individual, from across the world. An improved marketplace of volunteerism is possible if we can design the appropriate interactions.

Experimenting with IBM's "Many Eyes"

Jan 28 2007

IBM's new Many Eyes rocks. I experimented with the nptech data last weekend and built this in about 10 minutes. It's a very rough bubble map of the users of the nptech tag. Interesting how it shows the distribution of the tagging activity. Related: Swivel and Data360.

Number of times "nptech" was tagged, by del.icio.us username

many eyes nptech data

My Many Eyes account is here. (You can get an RSS feed of my visualizations.)

EDIT: Just to be clear, the usernames in the plot above do not reflect the actual number of contributions (the top posters are not getting credit for more than 100 posts each), because of a bug in del.icio.us, which I have discussed previously.

The Culture of Open Networks, or: Watch What You Tag

Jan 20 2007

I'm getting into an excellent free pdf called "In the Shade of the Commons," a publication from the Waag Society, which bills itself as a small group of enthusiastic idealists ... with a mission "to make new media available for groups of people that have little access to computers and internet, thus increasing their quality of living."

They sound like a nice little bunch of information hippies in the Netherlands.

"We value information as a human resource of cultural expression rather than a commodity to be sold to consumers. ... We realize that intangible information resources raise the issue of a digital ecology, the need to understand ecosystems constituted by information flows through various media."
The Vienna Document, In the Shade of the Commons.

They have quite nicely put together a range of material about the fragility of openness in up-and-coming information societies and the need for "intellectual commons." My favorite part of it so far is the "Vienna Document," (quoted here) which summarizes a number of thoughtful progressive info-principles.

The lesson I'm taking away is not just that "information should be free" (ZZZzzzzzz.... ), but there is also need for a kind of "humane" network design that leverages openness in ways that are beneficial to more than just a select minority.

For those of us who design software (which is now 99% defined by networked computing), I think this has pretty hip implications.

I think it is brilliant to conceptualize information, as they do, as a product of "intellectual labor." In this light it becomes clearer how the information that we produce (in the context of, say, social tagging) can be evaluated as a product that can be shaped by the conditions in which it is produced, controlled, consumed and potentially misused.

Really, what is the most profitable thing to do with a massive database of human generated metadata? Exactly how often should we expect The Most Profitable Thing to line up with the most useful thing for real-live human beings?

Anyway, this line of thinking seems especially relevant to me now that I am so frustrated to discover that more than 1000 nptech tags are apparently not shown in some views of del.icio.us. I can't really blame del.icio.us for whatever is causing this, but it is a reminder that we are trusting our attention data to the databases and algorithms of a corporation with no vested interest in the integrity or proper use of our data. It's enough to make me want to start googling for an open source alternative ... but then there goes my intellectual labor being photographed again ...

A question for del.icio.us

Jan 15 2007

I am still working on developing a tool for analyzing community tags in del.icio.us, but I have run into a problem that messes up the data pretty significantly. I would be interested to know if anyone has any ideas what is going on.

The problem is this: del.icio.us says that there are about 5160 items tagged with nptech in its database. I think this number is correct. But you can see for yourself that, if you put the pagination on 100, you will get to the last page (the first time the tag was used) when you hit the 41st page.

That's only 4100-ish. Are there 1000 of our entries missing?
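Here is the back-of-the-envelope arithmetic as a small Python sketch, using only the numbers above. The variable names are mine, and it assumes every page except possibly the last one is full.

```python
import math

reported_items = 5160  # items del.icio.us says are tagged nptech
page_size = 100        # items per page at the largest pagination setting
last_page = 41         # page on which the first use of the tag shows up

# Pages we would expect if every reported item were listed.
expected_pages = math.ceil(reported_items / page_size)

# The last page may be partly empty, so 41 pages hold between 4001 and 4100 items.
max_visible = last_page * page_size
min_visible = (last_page - 1) * page_size + 1

# Range of items that are counted but never listed.
min_missing = reported_items - max_visible
max_missing = reported_items - min_visible

print(f"expected {expected_pages} pages, found {last_page}")
print(f"missing between {min_missing} and {max_missing} items")
```

So the listing stops about eleven pages short of where the reported count says it should end.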

People Tagging with Tagalag

Nov 12 2005

If you like the idea of tagging your web pages and your books, how about tagging your people? Sounds a bit like info-tainment to me, but I felt obligated to post it because of the compelling freakiness of it all.

Tagalag is a service that lets you tag people, via their email address. It’s not a ‘tribute’ site like 43 people, because only people who know a person’s email address can add tags for that person.

If you create a profile you can add personal and geographical information about yourself.

I don’t know if Tagalag is onto a viable business model, but I like the idea of tagging people. This could become interesting as it evolves.

Read it: People Tagging with Tagalag

See also the recent surge of interest in Facebook, which about 80 percent of college kids are using. (No, really, 80 percent — I've seen other numbers that confirm this phenomenon.): Facebook Users sure are Passionate

(Found on: TechCrunch.)

Using Del.icio.us

Jan 13 2005

The Linc project (based in NY) has an interesting article on their use of del.icio.us (a web service that maintains lists of links for you.) I really appreciated their technical description of everything they're doing with it, but it isn't a beach read.

LINC Support Grab Bag Archive: Are you del.icio.us?

Here's a bit:

Are you maintaining lists of links someplace and looking for a way to keep them up-to-date? You might consider taking a look at del.icio.us, a free service that will allow you to aggregate links on their system, tag them with category names of your own choosing and then display them in other forums.