Google Patent Search

“Now you can search for U.S. Patents” says the [official Google blog](http://googleblog.blogspot.com/2006/12/now-you-can-search-for-us-patents.html). This is exciting news because the [official patent search](http://www.uspto.gov/patft/index.html) feature at USPTO.gov is lacking. The navigational interface is clunky, practically to the point of being unusable. The image display interface is not cross-browser compatible and should really highlight the pages that actually contain images. Instead, the user has to page through the text of the application, even though they have typically just read the text of the patent.

Thankfully, [Google Patent Search](http://www.google.com/patents) fixes the most glaring flaws in the official search. The interface for a “casual patent searcher” is a breath of fresh air, although [Bill Slawski’s analysis of the patent search](http://searchengineland.com/061213-200005.php) indicates that Google has a long way to go before capturing the hearts and minds of professional patent searchers, such as patent attorneys.

A Lesson In Reputation Management Damage Control

Clearly, the end of this past week was much busier in one corner of the internet than many realized. Michael Arrington at TechCrunch reported on a [Jotspot/Google merger blog post by “Kevin”](http://www.techcrunch.com/2006/11/30/anti-jotspotgoogle-post-deleted-under-pressure/) disappearing under suspicious circumstances. Thanks to search engine caching, he republished the original text of the post, which was basically an indictment of how little attention Google gave to Jotspot’s partners.

I won’t add to the discourse about either Kevin’s blog or the tin-foil hat conspiracy mob behavior. The most fascinating thing about this story is how effectively the players are fixing the problem. Any time negative news about a company is released and then suppressed, it ignites an intense need in people to know what happened. This story could have been a nightmare for Jotspot and Google, but it is going in a very different direction.

It’s a little hard to follow with all the random Kevin/Michael/Jotspot/Google bashing and conspiracy theories intermixed, so I’ve put together a timeline of the key events in the conversation that followed the initial post. Timestamps are linked to the comment anchor, so you can read the actual comments in context.

## Anti-Jotspot/Google Post Deletion Comment Timeline

* [Nov 30 5:49 PM](http://www.techcrunch.com/2006/11/30/anti-jotspotgoogle-post-deleted-under-pressure/#comment-465966): Kevin is identified as Kevin Hague of Knowesys.
* [Dec 1 12:16 AM](http://www.techcrunch.com/2006/11/30/anti-jotspotgoogle-post-deleted-under-pressure/#comment-467177): Joe Kraus, co-founder and former CEO of Jotspot responds:
“Let me start out at the personal level. Simply put, it sucks to hear partners or customers say bad things about you. And, it’s not because I expect everyone to say nice things. It’s because I take very personally the fact that people, like Kevin, who invested in us early, feel this way.”
“we’ve joined a company (google) that has a policy of not announcing anything about future product direction. … I know it can be frustrating and I know this comes as a turnaround from the very-open-about-future-plans nature that JotSpot’s partners and customers were used to.”
“I want to assure folks that a) we are continuing to offer the JotSpot service to customers that had signed up before we were acquired and b) we will continue to do so until the time that we can (and will) migrate users to a new service.”
“Last but not least, I can assure you that nodoby on our end asked Kevin to remove his post.”
* [Dec 1 8:51 AM](http://www.techcrunch.com/2006/11/30/anti-jotspotgoogle-post-deleted-under-pressure/#comment-469251): Kevin responds, clarifying that neither Jotspot nor Google requested he remove the post. He doesn’t backpedal on his position though (which is good):
“I can imagine the synergies Google can provide and can’t wait to see what’s in store for the Jot platform and I’m sorry you are unable give more answers about this future. It’s a real bummer for us.”
* [Dec 1 9:26 AM](http://www.techcrunch.com/2006/11/30/anti-jotspotgoogle-post-deleted-under-pressure/#comment-469334): Kevin responds to an anonymous poster who reminds him he should “re-read your posts before you publish.” Kevin’s response? “Anon, you’re right.”
* [Dec 1 10:52 AM](http://www.techcrunch.com/2006/11/30/anti-jotspotgoogle-post-deleted-under-pressure/#comment-469657): Kathleen Romano, co-founder with Kevin at Knowesys, explains the circumstances by which Kevin chose to delete the original post:
“Kevin posted his personal opinions; I mentioned I didn’t fully agree with him and wanted it clear that these were his views and not mine; he mistook my comment as a request to remove the post. That was not my intention. He has every right to voice his opinion and should’ve left the post up.”
* [Dec 1 3:55 PM](http://www.techcrunch.com/2006/11/30/anti-jotspotgoogle-post-deleted-under-pressure/#comment-470645): Bob Haugen, with Rising Technologies (another Jotspot partner), makes it clear that he considers the controversy overblown:
“We expected Jot to be purchased by somebody, and are happy that it is Google. … We wish Google would tell us more, too, but we are not surprised that they aren’t.”

I can think of plenty of ways this could have been handled badly. Could it have been handled any better?

P.S. Thanks to Andy Beal for pointing out this article in his [weekly internet news roundup](http://www.marketingpilgrim.com/2006/12/the-best-internet-marketing-news-this-friday.html).

301 Redirects With A Custom 404 Page

For anyone who cares about search engine rankings and has had to move a page or domain from one location to another, the handling of redirects is very important. Use the wrong type of redirect, and any links pointing to the old location won’t count toward the content’s new location.

Most of what you’ll read on “[How To Set Up a 301 Redirect](http://www.webmasterworld.com/forum92/82.htm)” involves delving into Apache .htaccess files or [waving a dead rat over a .NET server](http://www.webmasterworld.com/forum47/3287.htm) while clucking like a chicken. In those odd shared hosting cases where you have no access to modify .htaccess files, you may be further screwed into learning about meta redirects and determining just [how Yahoo! is going to interpret a 2 second delay versus a 3 second delay](http://help.yahoo.com/help/us/ysearch/slurp/slurp-11.html).

Here’s a method for handling redirects that is simple to manage, requires only PHP and the ability to define a custom error page, and allows you to prepare your redirects *before* you move the page, so the redirect is seamless once you have. Even with the ability to control every aspect of my Apache server, I still prefer this method.

### How it works

When you move a page to a new location, the most important thing is to ensure a user or search engine spider does not receive a “File Not Found” error (hereafter referred to as a “404”). It’s important to understand how a request for a nonexistent page is handled by the server. Here’s a diagram I put together:

[Diagram: How Apache Handles a Request for a Nonexistent Page]

If you didn’t realize already, it’s a simplified diagram of the process. Where in this process do you intercede to prevent a 404 error? Because everything a server must do takes up processing power, it is logical to conserve server resources and intercede as soon as possible. For example, if you could go out and fix all your visitors’ bookmarks and search engine listings to point to the new page, you should. Then no one would even have to worry about 404 pages or dealing with redirects! :)

So, logically, to conserve server resources, the next step in which we can intercede is the checking of the .htaccess file. (For simplicity’s sake, I won’t discuss editing the httpd.conf to place the redirects in the VirtualHost section.) For example, we could put something like this in our .htaccess file:

    RedirectMatch 301 /olddirectory/([^.]+)\.html$ http://www.example.com/newdirectory/$1.html

And that code would work, just as expected! But once you _and your boss and clients_ learn about the wonder of the 301 redirect, some of the arguments against changing page locations willy-nilly will go away, and you’ll find yourself with .htaccess files that are dozens of lines long. That matters because *Apache parses all the directives it finds* for *every page request* it receives in *every directory in which that page may be found*. Suddenly our performance savings from using Apache to handle redirects are being whittled away by an excess number of directives.

### The solution lies downstream

Scroll back up a bit and take a look at that diagram. When a page isn’t found, the web server will serve up an error page. It will even serve a custom error page if configured to do so. How do you configure it to do so? It’s easy, and I’m going to skip ahead a bit and reveal that our custom error page will be a PHP script, so I can provide all the .htaccess code you’ll need in this one line:

    ErrorDocument 404 /404.php

Adding that line to the main .htaccess file for your entire site is the first and last time you’ll need to edit .htaccess to deal with your 301 redirects. All the work occurs in that PHP script.

### PHP headers overrule Apache headers

Even though Apache will normally serve a 404 header along with your custom 404 page, you can easily overrule that with a [PHP header](http://php.net/header). So, why not send a 301 header? Once we make this cognitive leap, it’s dead simple to put all your redirect code into your custom 404 page and check the request against your URL patterns.
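Before bringing a database into it, here is a minimal sketch of that idea using a plain PHP array. The `find_redirect()` helper, the `$rules` array, and the example paths are all made up for illustration; only `header()`, `parse_url()`, and the `$_SERVER` variables are PHP's own.

```php
<?php
// Hypothetical helper: look up a requested path in an in-memory rule list and
// return the two header lines to send, or null when no rule applies.
function find_redirect(string $path, array $rules, string $host, bool $https = false): ?array
{
    if (!isset($rules[$path])) {
        return null; // no rule: fall through to the normal 404 page
    }
    $rule = $rules[$path];
    $status = ($rule['type'] === 302)
        ? 'HTTP/1.1 302 Found'
        : 'HTTP/1.1 301 Moved Permanently';
    $scheme = $https ? 'https' : 'http';
    return [$status, 'Location: ' . $scheme . '://' . $host . $rule['to']];
}

// Illustrative rules only; a real site would load these from storage.
$rules = [
    '/olddirectory/page.html' => ['to' => '/newdirectory/page.html', 'type' => 301],
    '/sale.html'              => ['to' => '/current-sale.html', 'type' => 302],
];

// Inside the custom 404 page, REQUEST_URI still holds the URL the visitor asked for.
$path = parse_url($_SERVER['REQUEST_URI'] ?? '/', PHP_URL_PATH);
$hit = is_string($path)
    ? find_redirect($path, $rules, $_SERVER['HTTP_HOST'] ?? 'www.example.com')
    : null;

if ($hit !== null) {
    header($hit[0]); // replaces the 404 status Apache would otherwise send
    header($hit[1]);
    exit;
}
// Otherwise, fall through to the friendly 404 HTML.
```

Keeping the lookup in a function that only returns strings makes it easy to check the rules without actually sending headers.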

Using a database structure like so:

    CREATE TABLE `redirects` (
      `primary` bigint(20) unsigned NOT NULL auto_increment,
      `old_uri` varchar(254) NOT NULL default '',
      `new_uri` varchar(254) default NULL,
      `action_type` mediumint(9) default NULL,
      `domain_match` varchar(254) NOT NULL default '',
      PRIMARY KEY (`primary`),
      KEY `old_uri` (`old_uri`,`new_uri`,`action_type`,`domain_match`),
      FULLTEXT KEY `old_uri_2` (`old_uri`)
    ) ENGINE=MyISAM;

You can use the following PHP code at the beginning of your custom 404 page:

    <?php

    // Include your own DB connection info here (the database name is a placeholder).
    // The old mysql_* functions were removed in PHP 7, so this uses mysqli.
    $link = mysqli_connect('localhost', 'mysql_user', 'mysql_password', 'mysql_database');

    // Build a complete URL from server variables
    $s = (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off') ? 's' : '';
    $complete_request = 'http'.$s.'://'.$_SERVER['HTTP_HOST'].$_SERVER['REQUEST_URI'];

    // Parse out the URI
    $parsed_request = parse_url($complete_request);

    // Defaults, so the checks below never read an undefined variable
    $redirects_found = array();
    $headers_sent = false;

    // Especially badly formed URLs make parse_url() return false; we only continue if that is not the case
    if ($parsed_request !== false && isset($parsed_request['path'])) {

        // Check the DB for a specific entry
        $query = sprintf("SELECT * FROM `redirects` WHERE `domain_match` LIKE '%s' AND `old_uri` LIKE '%s' LIMIT 1",
            mysqli_real_escape_string($link, $_SERVER['SERVER_NAME']),
            mysqli_real_escape_string($link, $parsed_request['path']).'%'
        );
        // print($query); // for debugging

        $result = mysqli_query($link, $query);
        while ($result && ($row = mysqli_fetch_assoc($result))) {
            $redirects_found[] = $row;
        }
        // print_r($redirects_found); // for debugging

        // If there is a match, redirect the user
        if (count($redirects_found) == 1) {

            // Send the right type of redirect
            if ($redirects_found[0]['action_type'] == '302') {
                header('HTTP/1.1 302 Moved Temporarily');
            } else {
                header('HTTP/1.1 301 Moved Permanently');
            }
            // You can add additional behaviors for other header codes here

            header('Location: http'.$s.'://'.$_SERVER['HTTP_HOST'].$redirects_found[0]['new_uri']);
            $headers_sent = true;

        } // if (count($redirects_found) == 1)

    } // if ($parsed_request !== false ...)

    // Exit if we have sent headers
    if ($headers_sent) {
        die;
    }

    // Otherwise, include friendly 404 HTML below :)
    ?>

With this code in place, you can add entries to the `redirects` table ahead of removing/moving pages. As soon as your new pages are in place, just rename or delete the old files and all incoming requests (from both users and spiders) will automagically be redirected to the new location.
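If you would rather keep a handful of pattern-based rules, like the `RedirectMatch` example earlier, the same 404 script can check regular expressions too. The sketch below is only an illustration under that assumption: `$pattern_rules` and `match_pattern_redirect()` are hypothetical names, and the one rule mirrors the `/olddirectory/` to `/newdirectory/` example.

```php
<?php
// Hypothetical pattern rules: regular expression => replacement path.
// "$1" refers to the first captured group, as in the RedirectMatch example.
$pattern_rules = [
    '#^/olddirectory/([^.]+)\.html$#' => '/newdirectory/$1.html',
];

// Return the new path for the first pattern that matches, or null if none do.
function match_pattern_redirect(string $path, array $pattern_rules): ?string
{
    foreach ($pattern_rules as $pattern => $replacement) {
        if (preg_match($pattern, $path) === 1) {
            return preg_replace($pattern, $replacement, $path);
        }
    }
    return null;
}

// Example lookup; a real 404 page would pass in the requested path instead.
$new_path = match_pattern_redirect('/olddirectory/about.html', $pattern_rules);
```

The returned path can then be fed into the same `header()` calls as a database match. Because this code only runs for requests that would have been 404s anyway, a long list of patterns costs nothing on pages that exist.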

Testing MSN Live Search

Dave is doing a bit of [testing on an issue with MSN Live Search](http://www.davidnaylor.co.uk/archives/2006/11/24/search-engine-test/). It seems only appropriate to link to [Matt’s post about url canonicalization tips](http://www.mattcutts.co.uk/blog/seo-advice-url-canonicalization/). :)