I have spoken in the past about how Google’s best spam weapon was its indexing intelligence. However, I do believe that Google misses out on good-quality content because of its overzealous “this might be spam, I’m not going to index this” filter.

Whilst browsing around the Google webmaster help forums, I found this rather interesting post about some guy who couldn’t get his site indexed properly. Curiosity kicked in and I felt compelled to look into his site problems and maybe give him some impartial advice as to why Google thinks only a very small percentage of his site is tasty enough to gobble up.

Strangely, only some of the pages linked directly from his homepage are being indexed. Take this for example. It has not been indexed, even though it has two direct links from the homepage. Nor have the majority of the second-level pages been indexed. However, a large number of pages from far deeper in the site have been indexed.

I wish Google would be a bit more open about how its magic indexing formula works, as this creates problems for almost every webmaster I know. It would be in Google’s interest to have the high-quality pages indexed rather than the ones that don’t matter too much. Maybe something like an indexing priority setting, similar to indexing speed in the webmaster tools, would help.

