Majestic-12

From P2P Foundation

Majestic-12 is a P2P-based search engine.

URL = http://www.majestic12.co.uk/

"Majestic-12 is working towards creation of a World Wide Web search engine based on concepts of distributing workload in a similar fashion achieved by successful projects such as SETI@home and distributed.net."


Background

From the Guardian, http://technology.guardian.co.uk/weekly/story/0,,1736761,00.html


"Majestic-12's volunteers - 60 so far - are crawling about 50m pages a day using unlimited broadband connections and software that runs in the background. Over the past few months, 7bn pages have been crawled although, at 1bn pages, the completed index lags behind for now. This is stored centrally to enable the Majestic-12 distributed search engine (via majestic12.co.uk) to return fast, relevant results.

"Ideally, I'd like to distribute the search index," says Chudnovsky. This is a challenging proposition that would see duplicate chunks of a huge index distributed between broadband-connected PCs. There are also parallels with peer-to-peer systems such as Gnutella, which share music, films and software. A small-scale experiment with one country, perhaps Finland, may happen later this year.

Professor Jon Crowcroft of Cambridge University says this type of collaborative web crawling and indexing is very reasonable. "Many search engines do this to reduce the traffic load returning to a single central site - distributing the index itself is OK, so long as you have an efficient mechanism to search the index."

These efforts also interest Professor Levene. "I hope the project succeeds. People finding novel ways of doing crawling or search is good for the competition," he says. Should Google, Yahoo, and MSN be worried? "It would be hard to push Google out of the way - they're just going to buy you out."

Chudnovsky's aspirations are more community-minded, helping to develop a search engine that users control. Nevertheless, his innovative code might revitalise searches on corporate websites or, more controversially, assist with search engine optimisation. But as video, images and music are added to burgeoning search engine indices, crawling and search tasks will need to become more distributed." (http://technology.guardian.co.uk/weekly/story/0,,1736761,00.html)