February 6 2007

Who’s linking to our website? New tools.


6/25/07 UPDATE: I am obligated to point out that this little script has graduated from interesting to useless — thanks to the new Google Analytics, which is hands down the best tool for understanding web traffic. And it’s bloody free. You probably knew this already. But, just in case, here’s a great new tutorial. That is all.

There is a pretty basic trick for getting an idea of who is linking to your site. Just google:

link:http://mysite.com

But that is an extremely rudimentary technique for several reasons.

“You asked, and we listened: We’ve extended our support for querying links to your site to much beyond the link: operator you might have used in the past. Now you can use webmaster tools to view a much larger sample of links to pages on your site that we found on the web. Unlike the link: operator, this data is much more comprehensive and can be classified, filtered, and downloaded. All you need to do is verify site ownership to see this information.”
Peeyush, Google Webmaster Central Blog

So yesterday I was super happy to discover via the trusty Google Webmaster Central Blog that there is a new “links view” in the Webmaster Toolkit.

The Webmaster Toolkit is a service from Google that you really should be using. It takes just a few minutes to get started, and then you get lots of data, including the new link data. If you haven’t already (and, uh, you run a website), check it out and you will be happy to pick up a bunch of free statistics about your site. Notably, you can also create an XML sitemap (not a graphical HTML sitemap, though!) of your site to make sure Google is indexing the whole thing. And you can test your robots.txt file (important for keeping those pictures of the last drunken staff party out of images.google.com).
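
If you want a rough idea of what a robots.txt test does, here is a minimal Python sketch using the standard library's urllib.robotparser. The rules, the /photos/ path, and the is_allowed helper are all made up for illustration; this is not how Google's tester works, just a local sanity check under those assumptions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep Google's image crawler out of /photos/,
# and let every other crawler fetch everything.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /photos/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    party_photo = "http://mysite.com/photos/party.jpg"
    for agent in ("Googlebot-Image", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, party_photo))
```

With these made-up rules the image crawler is blocked from the photo while the regular crawler is not, which is the behavior you would want to confirm before trusting the file.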

I did have a couple of problems with the data, though: there is still no way to get a good ranking of your referrers, or a ranking of your most popular pages. Luckily, you can download the entire file and do whatever you want with it. (Hooray for openness!)

Since we have a bunch of clients I wanted to send this new data to, I took the time to write a simple Perl script. And I figured a few other people could use it.

It’s here:

[syntax,unique_addresses.pl.txt,perl]

Instructions for unique_addresses.pl

Prerequisites: Using this script requires that you know how to execute a file from the command line (and that you have Perl installed). This will only work for Mac/Linux folks (it requires Perl and the *nix commands for sorting). … If you are a progressive blogger or organization and can’t get this to work, email me your stats and I will process them for you.

  1. Download your entire external links file from the webmaster toolkit.
  2. Use Excel or something to pull out that column of external links, and save this as something like “referrers.txt”.
  3. Repeat the above for your “pages” column, but name it something like “pages.txt”.
  4. Download the script.
  5. Make it executable, in the same directory as your “pages.txt” and “referrers.txt” files.
  6. Run “./unique_addresses.pl” and it will prompt you through the rest.
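
For anyone without Perl handy, the ranking idea behind the steps above can be sketched in a few lines of Python. This is not the unique_addresses.pl script, just a rough stand-in that assumes each file holds one address per line; the rank_addresses and rank_file names are hypothetical.

```python
from collections import Counter


def rank_addresses(lines):
    """Return (address, count) pairs, most frequent address first."""
    # Strip whitespace and newlines, and skip blank lines.
    cleaned = (line.strip() for line in lines)
    counts = Counter(address for address in cleaned if address)
    # Sort by descending count, then alphabetically so ties come out stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def rank_file(path):
    """Rank the addresses in a text file with one address per line."""
    with open(path, encoding="utf-8") as handle:
        return rank_addresses(handle)
```

Calling rank_file("referrers.txt") would give you the sites that link to you most often, and rank_file("pages.txt") your most linked-to pages, assuming you saved the columns under those names.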

Again, if you are working for a good cause but run into trouble, just email chris at blast dot com or leave a comment.

January 20 2007

The Panopticon. Now With An Improved Menu!

“When I go to a restaurant, and look at leftovers on my plate, I don’t see food, I see information. If the restaurant were Google, they wouldn’t just take that plate and scrape it off into the trash. There would be a camera in the kitchen, photographing every plate coming back, with analysis of what people liked and disliked, and what portions were too big, helping to optimize future servings.”
Jon Orwant, on O’Reilly Radar

A recent post on O’Reilly Radar describes a “pervasive culture of measurement” which is touted as an example of how “smart” web companies these days are maximizing their use of data from their consumers’ “leftovers.”

Hmmm.

Waitasecond. Photographing my leftovers? You’re totally creeping me out. I mean, I get the point, but is that really the direction that savvy Web-2.0-aware businesses take these days? The overtone of pervasive surveillance makes me feel a bit ill. Minus points for O’Reilly implying that this will lead to Web 2.0 apps that are constantly improving themselves based on user activity. Of course the corporate world has always wanted to know as much about me as possible. But what do they usually do with it? Banner Ads.

Banksy.