Natural Search Blog


Google Scholar – a new search engine for us eggheads

Google has just launched a new search engine called Google Scholar. It’s an engine built specifically for scholarly content, such as articles in academic journals. It’s still in beta, so don’t be too hard on Google if it’s not perfect. Danny Sullivan has written an article in SearchDay about the new service. Good on ya, Google!

Google’s index hits 8 billion pages. Yes folks, size does matter.

On Wednesday, the day before Microsoft unveiled the beta of Microsoft Search, Google announced that their index was now over eight billion pages strong. Impeccable timing from the Googleplex. Had Google waited just a couple of days, Microsoft could have proudly touted a bigger web page index than Google’s. Still, Microsoft’s 5 billion documents is an impressive feat, particularly for a new search engine just out of the blocks. Google continues to show their market dominance, however, with a database of a whopping 8,058,044,651 web pages. Poor Microsoft, trumped by Google at the last minute!

Why the big deal about index size? From the user’s perspective, a search engine that covers the Web comprehensively is going to be more useful than one whose index is patchy. Which is why I think the Overture Site Match paid inclusion program from Yahoo! is a really bad idea. Sites shouldn’t pay the search engine to be indexed. Rather, the search engine should strive to index as much of the Web as possible because that makes for a better search engine.

Indeed, I see Google’s announcement as a landmark in the evolution of search engines. Search engine spiders have historically had major problems with “spider traps” — dynamic database-driven websites that serve up identical or nearly identical content at varying URLs (e.g. when there is a session ID in the URL). Alas, search engines couldn’t find their way through this quagmire without severe duplication clogging up their indices. The solution for the search engines was to avoid dynamic sites, to a large degree — or at least to approach them with caution. Over time, however, the sophistication of the spidering and indexing algorithms has improved to the point that search engines (most notably, Google) have been able to successfully index a plethora of previously un-indexed content and minimize the amount of duplication. And thus, the “Invisible Web” begins to shrink. Keep it up, Google and Microsoft!
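
To make the session ID problem concrete, here is a minimal sketch of how a crawler might collapse such duplicate URLs before indexing. It is an illustration only, not how Google or any other engine actually does it. The parameter names in `SESSION_PARAMS` and the `canonicalize` and `dedupe` helpers are hypothetical, and session IDs embedded in the URL path are not handled.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of query parameters that carry session state, not content.
SESSION_PARAMS = {"sid", "sessionid", "session_id", "phpsessid"}


def canonicalize(url: str) -> str:
    """Return the URL with session parameters removed and the rest normalized."""
    parts = urlsplit(url)
    # Keep every query parameter except the ones that look like session IDs.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    # Sort so that parameter order does not create spurious duplicates.
    kept.sort()
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def dedupe(urls):
    """Return canonical URLs in first-seen order, dropping duplicates."""
    seen = set()
    unique = []
    for url in urls:
        key = canonicalize(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

With this in place, `http://example.com/product?id=7&sid=abc` and `http://example.com/product?sid=xyz&id=7` collapse to a single entry. A real spider would also need to compare page content, since near-identical pages can live at URLs that differ in more than a session parameter.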

Top sites by PageRank score

For a very long time I was one of the elite few who knew how to get a list of the top 1000 web pages on the Internet sorted in order of Google’s PageRank importance score. Since this top secret little trick no longer works, I feel I can share it with you all now. 😉

The trick is this: doing a search for http in Google with your Google Preferences set to return 100 results per page used to supply you with the top 1000, 100 at a time. Boy, that was handy!


Inconsistencies in the Google user interface

While we all love Google, I’ve found a couple mildly irritating usability issues that have, surprisingly, been overlooked.

First off, from the Google home page, if you type your query into the search box and then hit one of the tabs, say “Images”, you won’t actually end up with search results for that search term within Google Images. You will simply end up on the Google Images home page with that search query already keyed in for you. It would make a lot more sense if you were taken directly to Google Images search results. Lo and behold, that is exactly what happens if you operate the interface in the same way — BUT from a Google SEARCH RESULTS page. What’s up with that? Is this inconsistency in functionality done on purpose or inadvertently?

My second usability gripe is specific to the Google Directory. For some bizarre reason, users are not allowed to search the full contents of the Google Directory from category pages, only from search results pages. So, if you head to the Google Directory home page, type in a search query, then click on a category name listed in the search results, your ability to conduct another search of the Directory goes away! You’re only allowed from that point on to search within that particular category, or to search the entire Web via Google.com. If you want to do another Google Directory search, you have to use your BACK BUTTON. Yuch! Doesn’t it seem kind of silly that you can’t search the whole Google Directory from any of its umpteen category pages?

Attention, Google employees: I humbly request that you get these minor annoyances fixed. Other than that, kudos on the fantastic search engine!


Google using the largest database of clustering in the world

Peter Norvig, Google’s Director of Search Quality, was quoted as saying at the Web 2.0 conference last week that Google is using the largest database of clustering in the world. Norvig also went on to say that:

the problem with web search is that an entered keyword could be associated with different meanings, but the results displayed may not be the meaning you want. This is why Google is working on the largest bayesian database of clusters to determine the most likely meaning for any given search request.

Read Andy Beal’s account of Norvig’s exclusive demonstration of Google’s clustering technology at Web 2.0.
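
For a feel of what "determining the most likely meaning" could look like, here is a toy sketch that applies Bayes' rule to pick a meaning cluster for a query. It is purely illustrative and says nothing about Google's actual system. The clusters, the probabilities in `PRIORS` and `LIKELIHOODS`, the `UNSEEN` smoothing value, and the `most_likely_meaning` function are all made-up assumptions.

```python
import math

# Hypothetical toy data: prior probability of each meaning cluster and
# per-cluster word likelihoods. All numbers are invented for illustration.
PRIORS = {"animal": 0.5, "car": 0.5}
LIKELIHOODS = {
    "animal": {"jaguar": 0.3, "habitat": 0.2, "dealer": 0.01},
    "car": {"jaguar": 0.3, "habitat": 0.01, "dealer": 0.2},
}
UNSEEN = 0.001  # assumed smoothing probability for words a cluster has not seen


def most_likely_meaning(query: str) -> str:
    """Return the cluster with the highest posterior score for the query."""
    scores = {}
    for cluster, prior in PRIORS.items():
        # Work in log space: log P(cluster) + sum of log P(word | cluster).
        log_score = math.log(prior)
        for word in query.lower().split():
            log_score += math.log(LIKELIHOODS[cluster].get(word, UNSEEN))
        scores[cluster] = log_score
    return max(scores, key=scores.get)
```

Here the ambiguous keyword "jaguar" is equally likely in both clusters, so the other words in the query tip the balance: `most_likely_meaning("jaguar dealer")` favors the car cluster, while `most_likely_meaning("jaguar habitat")` favors the animal cluster.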


Google Store makeover still not wooing the spiders

You may recall my observation a few months ago that the Google Store is not all that friendly to search engine spiders, including Googlebot. Now that the site has had a makeover, and the session IDs have been eliminated from the URLs, the many tens of thousands of duplicate pages have dropped to a mere 144. This is a good thing, since there’s only a small number of products for sale on the site. Unfortunately, a big chunk of those hundred-and-some search results lead to error pages. So even after a site rebuild, Google’s own store STILL isn’t spider friendly. And if you’re curious what the old site looked like, don’t bother checking the Wayback Machine for it. Unfortunately, the Wayback Machine’s bot has choked on the site since 2002, so all you’ll find for the past several years are “redirect errors”.


What Google searchers are looking for

Google exec David Scacco (Director, Vertical Markets Group) had some interesting things to say about Google usage this week at the ChannelAdvisor Strategy Summit 2004:

Kudos to Andy Beal of Search Engine Lowdown for documenting David’s comments to the ChannelAdvisor audience.


A Google web browser? That would be cool.

According to The Register, Google is headhunting staff to build a web browser. Apparently, staff have already been poached from Microsoft and Sun. I, like many others, welcome a Google web browser to displace Microsoft’s Internet Explorer as the dominant browser. Just imagine… a second “browser war,” but this one could actually have a happy ending for us users!


Multiple ad blocks per page on Google AdSense

Google has recently changed their AdSense policy on the number of ad blocks you can display per page if you are part of Google’s AdSense program. That’s the program whereby websites can get paid for displaying Google ads on their web pages. Now you can display up to three ad units per page, according to Google.


Google Grants is taking applications again

Non-profits rejoice! The Google Grants program is taking applications again. If you’re unfamiliar with Google Grants, it’s an allowance of free Google AdWords advertising for worthy non-profits. Here’s a short description of the program from Google:

“The Google Grants program is designed to help nonprofit organizations like yours further their goals and objectives through targeted, online advertising on Google.com. Past Google Grant recipients have used their grants to publicize services and awareness, recruit staff and volunteers, promote special events, sell merchandise related to their organization or cause, and much more.”

