User blog comment:Dopp/New Global Search, plus Updates to Local Search/@comment-755865-20120513155524/@comment-188432-20120514190638

With respect to user:Bethel23's comments, it's important to be objective. Yes, you'd think that a wiki about books should be ranked higher than a wiki about Doctor Who.

But that's only until you start to look at the objective truth of the situation. The Children's Books Wiki has only recently started and has far fewer than 1,000 pages. Meanwhile, Doctor Who is far and away the most literary media franchise, with far more books than any other single franchise. Books are so much a part of the Doctor Who experience that we've had to customise our usage of Special:BookSources at w:c:tardis, as can be seen in this typical ISBN search. Indeed, Doctor Who writers have been involved in producing content for all sorts of age groups. A big push of the current regime in charge of the franchise is to use the popularity of the television series to help under-educated adults discover the joy of reading through the tardis:Quick Reads range.

You may well be right that the search engine is placing too much weight upon the word "book" being in titles. The Jungle Book thing is pretty indicative of that. And if you search for "novel", w:c:tardis is number 1, probably in part because we insert that word into every novel title. However, the other point is that we actually do have that much content. We have hundreds and hundreds of pages dedicated to novels/books/short stories — prose. There are far more Doctor Who, Torchwood and Sarah Jane Adventures stories told in print than in any other medium. And really, it's simply more than your 700-odd pages.

The thing you gotta remember, too, is that yours is a "one-sided" wiki. You're basically just housing pages where you're talking about books from an out-of-universe perspective, kinda like Wikipedia. Well, we have that, but we (and by "we", I include Harry Potter, Star Trek, Star Wars, all the "media" wikis) also have an in-universe dimension to our wikis. And we cover non-fiction or reference books about our franchises, whereas you're just listing novels. So we have Great Expectations (a book mentioned in an episode), The Fifth Doctor Handbook (a reference book about the Fifth Doctor's era on television) and Castrovalva (a novelisation of a televised episode). Not to mention the several ranges of original fiction. You've got a much harder task at creating content for the search engine, because you're limited to one sort of page about one kind of book. Now, obviously, if your wiki fully matured, you'd be able to have way more articles about books than any media franchise wiki. We all have limits to the number of articles that we can create. The difference is that we've already created them.

So is there a relevance problem with the algorithms? Despite the fact that the numbers are in our favor, I certainly think there is a problem. A search for "star trek" bafflingly puts w:c:memory-gamma (with 1,370 pages) above w:c:memory-beta (with 38,379 pages). But, even if the code were tweaked, your wiki would not be ranked higher than w:c:tardis on a search for "novel", or w:c:doctorwho-collectors on a search for "book". If it were (and your article count remained at 700-800), I would say something was definitely wrong, particularly if the tweak started taking better account of the text of the articles.

I'd say that the search engine, with whatever flaws it has, is working, to the extent that it's revealing an unexpected truth.