# WARNING: Heavily experimental API. Likely to change without notice.
# FullTextSearch module
An attempt to add stable support for fulltext search engines like Sphinx and Solr to SilverStripe CMS.
## Maintainer Contact
* Hamish Friedlander < hamish ( at ) silverstripe ( dot ) com >
## Requirements
* SilverStripe 3.0
## Introduction
This is a module aimed at adding support for standalone fulltext search engines to SilverStripe.
It contains several layers:
* A fulltext API, ignoring the actual provision of fulltext searching
* A connector API, providing common code to allow connecting a fulltext searching engine to the fulltext API, and
* Some connectors for common fulltext searching engines.
## Reasoning
There are several fulltext search engines that work in a similar manner. They build indexes of denormalized data that
is then searched through using some custom query syntax.
Traditionally, fulltext search connectors for SilverStripe have attempted to hide this design, instead presenting
fulltext searching as an extension of the object model. However, the disconnect between the fulltext search engine's
design and the object model meant that searching was inefficient. The abstraction would also often break, and it was
then hard to figure out what was going on.
This module instead provides the ability to define those indexes and queries in PHP. The indexes are defined as a mapping
between the SilverStripe object model and the connector-specific fulltext engine index model. This module then interrogates model metadata
to build the specific index definition.
It also hooks into Sapphire in order to update the indexes when the models change. Connectors then convert those index
and query definitions into fulltext-engine-specific code.
The intent of this module is not to make changing fulltext search engines seamless. Where possible this module provides
common interfaces to fulltext engine functionality, abstracting out common behaviour. However, each connector also
offers its own extensions, and there is some behaviour (such as getting the fulltext search engines installed, configured
and running) that each connector deals with itself, in a way best suited to that search engine's design.
## Basic usage
Basic usage is a four-step process:
1). Define an index in SilverStripe (Note: the specific connector index class you extend is what defines which engine gets used)

    <?php
    // File: mysite/code/MyIndex.php
    class MyIndex extends SolrIndex {
        function init() {
            $this->addClass('Page');
            $this->addFulltextField('DocumentContents');
        }
    }

2). Add something to the index (Note: You can also just update an existing document in the CMS, but adding _existing_ objects to the index is connector-specific)

    $page = new Page(array('Contents' => 'Help me. My house is on fire. This is less than optimal.'));
    $page->write();

Note: There's usually a connector-specific "reindex" task for this.
3). Build a query

    $query = new SearchQuery();
    $query->search('My house is on fire');

4). Apply that query to an index

    $results = singleton('MyIndex')->search($query);

Note that for most connectors, changes won't be searchable until _after_ the request that triggered the change.
## Connectors
### Solr
See Solr.md
### Sphinx
Not written yet
## FAQ
### How do I exclude draft pages from the index?
By default, the `SearchUpdater` class indexes all available "variant states",
so in the case of the `Versioned` extension, both "draft" and "live".
For most cases, you'll want to exclude draft content from your search results.
You can either prevent the draft content from being indexed in the first place,
by adding the following to your `SearchIndex->init()` method:

    $this->excludeVariantState(array('SearchVariantVersioned' => 'Stage'));

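Combined with the index from the basic usage section, a draft-excluding index might look like the sketch below. It only puts together calls already shown in this document (`addClass`, `addFulltextField`, `excludeVariantState`). The class name `MyIndex`, the file path and the field name are carried over from the earlier example as illustrations, not required names.

```php
<?php
// File: mysite/code/MyIndex.php (illustrative path, as in the basic usage example)
class MyIndex extends SolrIndex {
    function init() {
        // Index Page objects and make this field fulltext-searchable
        $this->addClass('Page');
        $this->addFulltextField('DocumentContents');

        // Skip the draft ("Stage") variant so that only live content is indexed
        $this->excludeVariantState(array('SearchVariantVersioned' => 'Stage'));
    }
}
```

Content that was indexed before this change will probably need the connector-specific reindex task before the exclusion takes effect.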
Alternatively, you can index draft content, but simply exclude it from searches.
This can be handy for previewing search results on unpublished content when a CMS author is logged in.
Before constructing your `SearchQuery`, conditionally switch to the "live" stage:

    if(!Permission::check('CMS_ACCESS_CMSMain')) Versioned::reading_stage('Live');
    $query = new SearchQuery();
    // ...