NowPublic.com uses Sphinx for search, and using Sphinx with Drupal is really easy. It is also fast, extremely fast. There is no Sphinx search module because I could not find a module's worth of code to release. You just need to set up Sphinx according to its documentation; only the queries are Drupal-specific. JOIN your tables together USING(vid), and don't forget to start the SELECT with nid:
```
sql_query_range = SELECT MIN(nid), MAX(nid) FROM node
sql_range_step  = 1000
sql_query       = \
    SELECT n.nid, n.title, body, changed \
    FROM node n \
    INNER JOIN node_revisions USING(vid) \
    WHERE n.nid BETWEEN $start AND $end AND status = 1
```
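With the source and index defined, the index is built and served by Sphinx's two binaries. A minimal sketch, assuming the config lives at /etc/sphinx/sphinx.conf (the path is a placeholder for wherever yours is):

```shell
# Build (or rebuild) every index declared in sphinx.conf.
indexer --config /etc/sphinx/sphinx.conf --all

# Start the search daemon; the PHP client talks to it (port 3312 here).
searchd --config /etc/sphinx/sphinx.conf

# To reindex while searchd is running, rotate the indexes seamlessly:
indexer --config /etc/sphinx/sphinx.conf --rotate --all
```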
Here is how you add taxonomy terms, as a multi-value attribute:

```
sql_attr_multi = uint tid from query; SELECT nid, tid FROM term_node
```

Once Sphinx is set up, you need something like the following on the Drupal side:
```
<?php
require_once './sphinxapi.php'; // bundled with the Sphinx distribution

$start_from     = isset($_GET['page']) ? (int) $_GET['page'] : 0;
$index          = ...; // your index name from sphinx.conf
$page_increment = 10;  // number of nodes shown per page

$sphinx = new SphinxClient();
$sphinx->SetServer('localhost', 3312);
$sphinx->SetWeights(array(100, 1));
$sphinx->SetMatchMode(SPH_MATCH_ALL);
$sphinx->SetLimits($start_from * $page_increment, $page_increment);
$sphinx->SetSortMode(SPH_SORT_TIME_SEGMENTS, 'changed');
// You can add filters with $sphinx->SetFilter() etc.

// $keys is the array of search terms supplied by the caller.
$result = $sphinx->Query(implode(' ', $keys), $index);
if ($result !== FALSE && !empty($result['matches'])) {
  $GLOBALS['pager_page_array'][] = $start_from;
  $GLOBALS['pager_total'][] = intval($result['total'] / $page_increment) + 1;
  $GLOBALS['pager_total_items'][] = $result['total'];
  foreach (array_keys($result['matches']) as $nid) {
    $node = node_load($nid);
    // ...theme and output $node here...
  }
}
```
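The SetFilter() call mentioned in the code comment pairs nicely with the tid multi-value attribute from the config: it gives you cheap taxonomy filtering, and BuildExcerpts() handles highlighting. A hedged sketch; the term ID 12 and the $docs/$words variables are illustrative placeholders, not something from this setup:

```php
<?php
// Assumes $sphinx and $index are the configured client and index name
// from the snippet above.

// Only return nodes carrying taxonomy term 12 (hypothetical tid),
// matched against the 'tid' MVA declared via sql_attr_multi.
$sphinx->SetFilter('tid', array(12));

// Optional highlighting: BuildExcerpts() returns the given texts with
// the matched keywords wrapped in the chosen markers.
$excerpts = $sphinx->BuildExcerpts(
  $docs,   // array of node body strings
  $index,  // same index name as used for Query()
  $words,  // the raw search string
  array('before_match' => '<strong>', 'after_match' => '</strong>')
);
```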
While I am unaware of any documentation for the PHP API, the Perl docs work equally well, and there is a sample PHP script in the package.
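Once the node count grows, rebuilding the whole index on every change becomes wasteful. Sphinx supports configuration inheritance, so a "main plus delta" scheme only has to spell out the parts that differ. A hedged sketch, loosely following the pattern in the Sphinx documentation; the sph_counter bookkeeping table and the paths are assumptions, not part of this setup:

```
# sph_counter stores the highest nid the 'main' index has seen.
source main
{
    # ...the sql_* settings shown above, plus:
    sql_query_pre = REPLACE INTO sph_counter SELECT 1, MAX(nid) FROM node
}

source delta : main
{
    # Reset the inherited pre-query, then index only the new nodes.
    sql_query_pre =
    sql_query = \
        SELECT n.nid, n.title, body, changed \
        FROM node n INNER JOIN node_revisions USING(vid) \
        WHERE n.nid > (SELECT max_id FROM sph_counter WHERE id = 1) \
        AND status = 1
}

index delta : main
{
    source = delta
    path   = /var/data/sphinx/delta
}
```

Search both indexes by passing "main delta" as the index argument to Query(); reindex only delta from cron, and merge or rebuild main at longer intervals.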
At information.dk we're using Sphinx as well. Both search and indexing of more than 150k nodes are VERY fast.
But when we make changes to our content types, we have to update our sphinx.conf, which gets somewhat complex (see: http://drupalbin.com/1078). If you or anyone you know makes an effort to integrate Sphinx with Drupal in a module or the like, we're very interested in helping out.
Best
/Johs. (http://drupal.org/user/58666)
Writing code that can figure out a query which loads the node information for every possible site... that's not possible. I don't think so.
Thanks chx for pointing out Sphinx; I remember playing with it when it originally came out, but the latest version is really easy to install and powerful. It's so fast I even implemented its BuildExcerpts (for highlighting) and it's still faster than search.module.
I use it on a site I'm developing that currently has 50,000 nodes, growing by about 500 every day. Indexing takes 5 seconds for the whole collection, so I'm not even sure I'm going to bother with a main + delta schema.
Thanks for the great 'tutorial', it saved me time.