Googlebot crawl rate control and more

I admit I’m a little too hopeful that people will do good things, but it looks like Vanessa Fox and the rest of the Google Webmaster team really are working hard to bring valuable tools to webmasters. The [latest offering](http://googlewebmastercentral.blogspot.com/2006/10/learn-more-about-googlebots-crawl-of.html) is the ability to choose from three different crawl speeds. Unfortunately, the whole “this change will last for 90 days” thing, where you have to keep coming back to reselect your preference, is pretty lame. _At least_ send out a notification email, if requested, to remind webmasters that their crawl rate is about to change when it’s set to anything other than “Normal.”

Other new features include the ability to opt your site’s images into the [Google Image Labeler](http://images.google.com/imagelabeler/) project. Good that you can increase the quality of your images’ metadata (which hopefully means increased traffic from Google Image Search); not good that you can’t see what metadata they end up producing. It’d be a nice feature if Google “donated” back to the image owners the keywords that the Google Image Labeler generates for their images. I’m _not_ suggesting that Google publish the metadata for all the images in Image Search, only that they tell the image **owners** what metadata was discovered.

The charting looks cool but really only has value when you are experiencing problems — otherwise it’s just for the “ooh and aah” factor.

Altogether a nice clutch of new functionality. Thanks Google Webmasters Team!

Find out TO whom a site links

An exciting [new search operator on MSN](http://blogs.msdn.com/livesearch/archive/2006/10/16/search-macros-linkfromdomain.aspx) went up today. (Thanks [Oilman](http://www.oilman.ca/msn/msn-live-search-adds-linkfromdomain-operator/) for the tip!)

Unfortunately, when I went to try it out at MSN the request timed out. I imagine this is related to the spotty latency problems I’ve been experiencing all day and does not reflect on MSN.

This tool will be quite helpful in understanding the relationships between sites, and it will make it easier for folks who don’t have their own crawlers to identify sites that [exhibit “hub” or “authority” linking behavior](http://www.cs.cornell.edu/home/kleinber/auth.pdf) (PDF).

It probably won’t replace the benefit of crawling specific sites and networks of interest (especially those with which MSN has poor coverage), but that takes time and resources that aren’t always available.
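For anyone curious what “hub” and “authority” scoring looks like in practice, here is a minimal sketch that loosely follows the iteration described in the linked paper. The link graph and every site name in it are made up for illustration; in real life the outbound links would come from something like a linkfromdomain query or your own crawl, and this is not a claim about how MSN or anyone else actually scores sites.

```javascript
// Toy link graph (hypothetical data): each key links TO the sites in its array.
const links = {
  'directory.example': ['a.example', 'b.example', 'c.example'],
  'blog.example': ['a.example', 'b.example'],
  'a.example': ['b.example'],
  'b.example': [],
  'c.example': [],
};

// Scale scores so their squared values sum to 1 (keeps numbers from growing).
function normalize(scores) {
  const norm =
    Math.sqrt(Object.values(scores).reduce((sum, v) => sum + v * v, 0)) || 1;
  const out = {};
  for (const key of Object.keys(scores)) out[key] = scores[key] / norm;
  return out;
}

// Repeatedly update authority and hub scores for every site in the graph.
function hubsAndAuthorities(graph, iterations = 20) {
  const nodes = Object.keys(graph);
  let hub = {};
  let auth = {};
  for (const n of nodes) {
    hub[n] = 1;
    auth[n] = 1;
  }

  for (let i = 0; i < iterations; i++) {
    // Authority score: sum of the hub scores of the sites linking to it.
    const newAuth = {};
    for (const n of nodes) newAuth[n] = 0;
    for (const from of nodes) {
      for (const to of graph[from]) {
        if (to in newAuth) newAuth[to] += hub[from];
      }
    }

    // Hub score: sum of the authority scores of the sites it links to.
    const newHub = {};
    for (const from of nodes) {
      newHub[from] = 0;
      for (const to of graph[from]) {
        if (to in newAuth) newHub[from] += newAuth[to];
      }
    }

    auth = normalize(newAuth);
    hub = normalize(newHub);
  }
  return { hub, auth };
}

// Return the site with the highest score.
function topSite(scores) {
  return Object.keys(scores).reduce((best, k) =>
    scores[k] > scores[best] ? k : best
  );
}

const result = hubsAndAuthorities(links);
console.log('Top hub:', topSite(result.hub));
console.log('Top authority:', topSite(result.auth));
```

In this toy graph the site that links out to the most well-linked sites ends up as the top hub, and the site that collects links from the best hubs ends up as the top authority.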

*Added:* The MSDN blog has a really good graphic to explain the difference between linkdomain and linkfromdomain if you don’t grok the difference. :)

del.icio.us isn’t it?

Because I toyed with the idea of using the very chic [Japanese Cherry Blossom](http://krisandapril.us/2006/06/11/japanese-cherry-blossom/) template, I was introduced to the [del.icio.us bookmarking site](http://del.icio.us/) yet again. The difference between this time and the last dozen or so times I’ve run across it? This time I signed up and converted.

The idea of moving my bookmarks online has been nagging at me for almost a year so it’s good that I had an excuse. :)

### Bad News

The bookmarklet provided during the sign up process (v3) and the one I downloaded from the help section (v4) both failed to work in Safari. Each threw a syntax error.

[Bad Bookmarklet v3](https://secure.del.icio.us/register?step2):

javascript:location.href='http://del.icio.us/post?v=3&url='+
encodeURIComponent(location.href)+
'&title='+encodeURIComponent(document.title.replace(/^\s*|i\s*g,''))

See the problem? The regular expression in the title.replace bit is missing the closing “/”. Plus, even with the slash restored, it has the critical flaw that it removes all the whitespace from the entire title, not just the whitespace at the beginning and the end.

[Bad Bookmarklet v4](http://del.icio.us/help/buttons):

javascript:location.href='http://del.icio.us/post?v=4;url='+
encodeURIComponent(location.href)+';title='+
encodeURIComponent(document.title.replace(/^\s*|\s*g,''))

Same problem here. The only visible difference between the versions is that this one separates the parameters with semicolons instead of ampersands; both still pass the URL and title in the query string. Probably to avoid having to escape ampersands when the bookmarklet is embedded in an HTML link.

Corrected Code (Just for fun, I’ll call it v5):

javascript:location.href='http://del.icio.us/post?v=5;url='+
encodeURIComponent(location.href)+';title='+
encodeURIComponent(document.title.replace(/^\s*|\s*$/g,''))

Here we go! We’ve added the closing “/” and dropped a dollar sign into the second alternative to mean “only remove extra spaces from the end of the title.”
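If you want to see the difference the dollar sign makes without installing anything, here is a small sketch you can paste into a JavaScript console. The function names are my own invention for illustration (they are not part of del.icio.us), and the “greedy” pattern assumes the broken regex simply had its closing slash restored.

```javascript
// Hypothetical helper: the broken pattern with only the closing "/" restored.
// Without "$", the second alternative matches every run of whitespace.
function stripAllWhitespace(title) {
  return title.replace(/^\s*|\s*/g, '');
}

// Hypothetical helper: the corrected pattern.
// "^\s*" matches leading whitespace, "\s*$" matches trailing whitespace.
function trimTitle(title) {
  return title.replace(/^\s*|\s*$/g, '');
}

// Builds the same URL the corrected bookmarklet navigates to, but takes the
// page URL and title as arguments so it can run outside a browser.
function buildPostUrl(pageUrl, pageTitle) {
  return 'http://del.icio.us/post?v=5;url=' +
    encodeURIComponent(pageUrl) + ';title=' +
    encodeURIComponent(trimTitle(pageTitle));
}

const sample = '  My Page Title  ';
console.log(JSON.stringify(stripAllWhitespace(sample)));
console.log(JSON.stringify(trimTitle(sample)));
console.log(buildPostUrl('http://example.com/', sample));
```

The first call squashes the title into one long word, while the second only trims the ends, which is what you want in a bookmark title.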

### Oh right, the link

[Gregg’s Bookmarks](http://del.icio.us/whoisgregg)

WordPress Plugin Choices

This is one of those posts that you don’t care about, but will be very handy for me later. :)

### Comment Spam

* [Akismet](http://akismet.com/) Why? Because it’s bundled with WordPress, silly.
* [Challenge](http://lordchaos.dominatus.net/wordpress-plugin-challenge) I loathe the idea of using a [CAPTCHA](http://en.wikipedia.org/wiki/Captcha) because of accessibility concerns. I chose Challenge over other plugins because it allows me to provide PHP eval’d questions of my own. Once the bots get smarter at reading the standard question, I can generate my own variations.

### Formatting

* [PHP Markdown](http://www.michelf.com/projects/php-markdown/) Of course, _PHP_ Markdown is only possible because of [John Gruber’s](http://daringfireball.net/) work building [Markdown](http://daringfireball.net/projects/markdown/).

### Fancy Shmancy Stuff

* [IP to Country](http://priyadi.net/archives/2005/02/25/wordpress-ip-to-country-plugin/) I still get excited when I realize that the people I am conversing with online (through forums, blogs, email, whatever) are from _all over the world_. :)