diff --git a/README.md b/README.md index c519ee8..0d5af98 100644 --- a/README.md +++ b/README.md @@ -19,7 +19,9 @@ Adds support for fulltext search engines like Sphinx and Solr to SilverStripe CM ## Documentation -See [the docs](/docs/en/00_index.md), or for the quick version see [the quick start guide](/docs/en/01_getting_started.md#quick-start). +For pure Solr docs, check out [the Solr 4.10.4 guide](https://archive.apache.org/dist/lucene/solr/ref-guide/apache-solr-ref-guide-4.10.pdf). + +See [the docs](/docs/en/00_index.md) for configuration and setup, or for the quick version see [the quick start guide](/docs/en/01_getting_started.md#quick-start). For details of updates, bugfixes, and features, please see the [changelog](CHANGELOG.md). @@ -48,8 +50,4 @@ maybe 'Content->Summary' to allow calling a specific method on the field object - Allow user logic to cause triggering reindex of documents when field is user generated -* Add sphinx connector - * Add generic APIs for spell correction, file text extraction and snippet generation - -* Better docs diff --git a/docs/en/00_index.md b/docs/en/00_index.md index f5f8da8..d71e99e 100644 --- a/docs/en/00_index.md +++ b/docs/en/00_index.md @@ -11,20 +11,24 @@ - [Solr server parameters](03_configuration.md#solr-server-parameters) - [Creating an index](03_configuration.md#creating-an-index) - [Adding data to an index](03_configuration.md#adding-data-to-an-index) - - [Querying an index](03_configuration.md#querying-the-index) - [Running the dev/tasks](03_configuration.md#dev-tasks) - [File-based configuration](03_configuration.md#file-based-configuration) - [Handling results](03_configuration.md#handling-results) +- Querying + - [Building a SearchQuery](04_querying.md#building-a-`searchquery`) + - [Searching value ranges](04_querying.md#searching-value-ranges) + - [Empty or existing values](04_querying.md#empty-or-existing-values) + - [Executing your query](04_querying.md#executing-your-query) - Advanced configuration - - 
[Facets](04_advanced_configuration.md#facets) - - [Using multiple indexes](04_advanced_configuration.md#multiple-indexes) - - [Analyzers, tokens and token filters](04_advanced_configuration.md#analyzers,-tokenizers-and-token-filters) - - [Spellcheck](04_advanced_configuration.md#spell-check-("did-you-mean...")) - - [Highlighting](04_advanced_configuration.md#highlighting) - - [Boosting](04_advanced_configuration.md#boosting) - - [Indexing related objects](04_advanced_configuration.md#indexing-related-objects) - - [Subsites](04_advanced_configuration.md#subsites) - - [Custom field types](04_advanced_configuration.md#custom-field-types) - = [Text extraction](04_advanced_configuration.md#text-extraction) + - [Facets](05_advanced_configuration.md#facets) + - [Using multiple indexes](05_advanced_configuration.md#multiple-indexes) + - [Analyzers, tokenizers and token filters](05_advanced_configuration.md#analyzers,-tokenizers-and-token-filters) + - [Spellcheck](05_advanced_configuration.md#spell-check-("did-you-mean...")) + - [Highlighting](05_advanced_configuration.md#highlighting) + - [Boosting](05_advanced_configuration.md#boosting) + - [Indexing related objects](05_advanced_configuration.md#indexing-related-objects) + - [Subsites](05_advanced_configuration.md#subsites) + - [Custom field types](05_advanced_configuration.md#custom-field-types) + - [Text extraction](05_advanced_configuration.md#text-extraction) - Troubleshooting - - [Gotchas](05_troubleshooting.md#common-gotchas) + - [Gotchas](06_troubleshooting.md#common-gotchas) diff --git a/docs/en/03_configuration.md b/docs/en/03_configuration.md index 247a949..b11d65e 100644 --- a/docs/en/03_configuration.md +++ b/docs/en/03_configuration.md @@ -106,7 +106,7 @@ $page = Page::create(['Content' => 'Help me. My house is on fire.
This is less t $page->write(); ``` -Depending on the size of the index and how much content needs to be processed, it could take a while for your search results to be updated, so your newly-updated page may not be available in your search results immediately. +Depending on the size of the index and how much content needs to be processed, it could take a while for your search results to be updated, so your newly-updated page may not be available in your search results immediately. This approach is typically not recommended. ### Queued jobs @@ -136,138 +136,41 @@ class MyIndex extends SolrIndex } ``` -Alternatively, you can index draft content, but simply exclude it from searches. This can be handy to preview search results on unpublished content, in case a CMS author is logged in. Before constructing your `SearchQuery`, conditionally switch to the "live" stage: +Alternatively, you can index draft content, but simply exclude it from searches. This can be handy to preview search results on unpublished content, in case a CMS author is logged in. Before constructing your `SearchQuery`, conditionally switch to the "live" stage. + +### Adding DataObjects + +If you create a class that extends `DataObject` (and not `Page`) then it won't be automatically added to the search +index. You'll have to make some changes to add it in. The `DataObject` class will require the following minimum code +to render properly in the search results: + +* `Link()` needs to return the URL to follow from the search results to actually view the object. +* `Name` (as a DB field) will be used as the result title. +* `Abstract` (as a DB field) will show under the search result title. +* `getShowInSearch()` is required to get the record to show in search, since all results are filtered by `ShowInSearch`. 
+ +So with that, you can add your class to your index: ```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use SilverStripe\Security\Permission; -use SilverStripe\Versioned\Versioned; +use My\Namespace\Model\SearchableDataObject; +use SilverStripe\FullTextSearch\Solr\SolrIndex; +use Page; -if (!Permission::check('CMS_ACCESS_CMSMain')) { - Versioned::set_stage(Versioned::LIVE); +class MySolrSearchIndex extends SolrIndex { + + public function init() + { + $this->addClass(SearchableDataObject::class); + $this->addClass(Page::class); + $this->addAllFulltextFields(); + } } -$query = SearchQuery::create(); -// ... ``` -## Querying an index - -This is where the magic happens. You will construct the search terms and other parameters required to form a `SearchQuery` object, and pass that into a `SearchIndex` to get results. - -### Building a `SearchQuery` - -First, you'll need to construct a new `SearchQuery` object: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$query = SearchQuery::create(); -``` - -You can then alter the `SearchQuery` with a number of methods: - -#### `addSearchTerm()` - -The simplest - pass through a string to search your index for. - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$query = SearchQuery::create() - ->addSearchTerm('fire'); -``` - -You can also limit this to specific fields by passing an array as the second argument, specified in the form of `{table}_{field}`: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use Page; - -$query = SearchQuery::create() - ->addSearchTerm('on fire', [Page::class . '_Title']); -``` - -#### `addFuzzySearchTerm()` - -Pass through a string to search your index for, with "fuzzier" matching - this means that a term like "fishing" would also likely find results containing "fish" or "fisher". Otherwise behaves the same as `addSearchTerm()`. 
- -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$query = SearchQuery::create() - ->addFuzzySearchTerm('fire'); -``` - -#### `addClassFilter()` - -Only query a specific class in the index, optionally including subclasses. - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use My\Namespace\PageType\SpecialPage; - -$query = SearchQuery::create() - ->addClassFilter(SpecialPage::class, false); // only return results from SpecialPages, not subclasses -``` - -#### Searching value ranges - -Most values can be expressed as ranges, most commonly dates or numbers. To search for a range of values rather than an exact match, -use the `SearchQuery_Range` class. The range can include bounds on both sides, or stay open-ended by simply leaving the argument blank. -It takes arguments in the form of `SearchQuery_Range::create($start, $end))`: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery_Range; -use My\Namespace\Index\MyIndex; -use Page; - -$query = SearchQuery::create() - ->addSearchTerm('fire') - // Only include documents edited in 2011 or earlier - ->addFilter(Page::class . '_LastEdited', SearchQuery_Range::create(null, '2011-12-31T23:59:59Z')); -$results = MyIndex::singleton()->search($query); -``` - -Note: At the moment, the date format is specific to the search implementation. - -#### Searching for empty or existing values - -Since there's a type conversion between the SilverStripe database, object properties -and the search index persistence, it's often not clear which condition is searched for. -Should it equal an empty string, or only match if the field wasn't indexed at all? 
-The `SearchQuery` API has the concept of a "missing" and "present" field value for this: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use My\Namespace\Index\MyIndex; -use Page; - -$query = SearchQuery::create() - ->addSearchTerm('fire'); - // Needs a value, although it can be false - ->addFilter(Page::class . '_ShowInMenus', SearchQuery::$present); -$results = MyIndex::singleton()->search($query); -``` - -### Querying an index - -Once you have your query constructed, you need to run it against your index. - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use My\Namespace\Index\MyIndex; - -$query = SearchQuery::create()->addSearchTerm('fire'); -$results = MyIndex::singleton()->search($query); -``` - -The return value of a `search()` call is an object which contains a few properties: - - * `Matches`: `ArrayList` of the current "page" of search results. - * `Suggestion`: (optional) Any suggested spelling corrections in the original query notation - * `SuggestionNice`: (optional) Any suggested spelling corrections for display (without query notation) - * `SuggestionQueryString` (optional) Link to repeat the search with suggested spelling corrections +Once you've created the above classes and run the [solr dev tasks](#solr-dev-tasks) to tell Solr about the new index +you've just created, this will add `SearchableDataObject` and the text fields it has to the index. Now when you search +on the site using `MySolrSearchIndex->search()`, the `SearchableDataObject` results will show alongside normal `Page` +results. ## Solr dev tasks @@ -306,7 +209,7 @@ The Solr indexes will be stored as binary files inside your SilverStripe project ## File-based configuration -Many aspects of Solr are configured outside of the `schema.xml` file which SilverStripe generates based on the `SolrIndex` subclass that is defined. 
For example, stopwords are placed in their own `stopwords.txt` file, and advanced [spellchecking](04_advanced_configuration.md#spell-check-("did-you-mean...")) can be configured in `solrconfig.xml`. +Many aspects of Solr are configured outside of the `schema.xml` file which SilverStripe generates based on the `SolrIndex` subclass that is defined. For example, stopwords are placed in their own `stopwords.txt` file, and advanced [spellchecking](05_advanced_configuration.md#spell-check-("did-you-mean...")) can be configured in `solrconfig.xml`. By default, these files are copied from the `fulltextsearch/conf/extras/` directory over to the new index location. In order to use your own files, copy these files into a location of your choosing (for example `mysite/data/solr/`), and tell Solr to use this folder with the `extraspath` [configuration setting](#solr-server-parameters). Run a [`Solr_Configure](#solr-configure) to apply these changes. @@ -340,7 +243,7 @@ class PageController extends ContentController In your template (e.g. `Page_results.ss`) you can access the results and loop through them. They're stored in the `$Matches` property of the search return object. -```ss +```silverstripe <% if $SearchResult.Matches %>

Results for "{$Query}"

Displaying Page $SearchResult.Matches.CurrentPage of $SearchResult.Matches.TotalPages

diff --git a/docs/en/04_querying.md b/docs/en/04_querying.md new file mode 100644 index 0000000..4c7b818 --- /dev/null +++ b/docs/en/04_querying.md @@ -0,0 +1,130 @@ +# Querying + +This is where the magic happens. You will construct the search terms and other parameters required to form a `SearchQuery` object, and pass that into a `SearchIndex` to get results. + +## Building a `SearchQuery` + +First, you'll need to construct a new `SearchQuery` object: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$query = SearchQuery::create(); +``` + +You can then alter the `SearchQuery` with a number of methods: + +### `addSearchTerm()` + +The simplest - pass through a string to search your index for. + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$query = SearchQuery::create() + ->addSearchTerm('fire'); +``` + +You can also limit this to specific fields by passing an array as the second argument, specified in the form of `{table}_{field}`: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use Page; + +$query = SearchQuery::create() + ->addSearchTerm('on fire', [Page::class . '_Title']); +``` + +### `addFuzzySearchTerm()` + +Pass through a string to search your index for, with "fuzzier" matching - this means that a term like "fishing" would also likely find results containing "fish" or "fisher". Otherwise behaves the same as `addSearchTerm()`. + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$query = SearchQuery::create() + ->addFuzzySearchTerm('fire'); +``` + +### `addClassFilter()` + +Only query a specific class in the index, optionally including subclasses. 
+ +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use My\Namespace\PageType\SpecialPage; + +$query = SearchQuery::create() + ->addClassFilter(SpecialPage::class, false); // only return results from SpecialPages, not subclasses +``` + +## Searching value ranges + +Most values can be expressed as ranges, most commonly dates or numbers. To search for a range of values rather than an exact match, +use the `SearchQuery_Range` class. The range can include bounds on both sides, or stay open-ended by simply leaving the argument blank. +It takes arguments in the form of `SearchQuery_Range::create($start, $end)`: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery_Range; +use My\Namespace\Index\MyIndex; +use Page; + +$query = SearchQuery::create() + ->addSearchTerm('fire') + // Only include documents edited in 2011 or earlier + ->addFilter(Page::class . '_LastEdited', SearchQuery_Range::create(null, '2011-12-31T23:59:59Z')); +$results = MyIndex::singleton()->search($query); +``` + +### How do I use date ranges where dates might not be defined? + +The Solr index updater only includes dates with values, so the field might not exist in all your index entries. A simple bounded range query (`:[* TO ]`) will fail in this case. In order to query the field, reverse the search conditions and exclude the ranges you don't want: + +```php +// Wrong: Filter will ignore all empty field values +$query->addFilter('fieldname', SearchQuery_Range::create('*', 'somedate')); + +// Right: Exclude the opposite range +$query->addExclude('fieldname', SearchQuery_Range::create('somedate', '*')); +``` + +Note: At the moment, the date format is specific to the search implementation. + +## Empty or existing values + +Since there's a type conversion between the SilverStripe database, object properties +and the search index persistence, it's often not clear which condition is searched for.
+Should it equal an empty string, or only match if the field wasn't indexed at all? +The `SearchQuery` API has the concept of a "missing" and "present" field value for this: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use My\Namespace\Index\MyIndex; +use Page; + +$query = SearchQuery::create() + ->addSearchTerm('fire') + // Needs a value, although it can be false + ->addFilter(Page::class . '_ShowInMenus', SearchQuery::$present); +$results = MyIndex::singleton()->search($query); +``` + +## Executing your query + +Once you have your query constructed, you need to run it against your index. + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use My\Namespace\Index\MyIndex; + +$query = SearchQuery::create()->addSearchTerm('fire'); +$results = MyIndex::singleton()->search($query); +``` + +The return value of a `search()` call is an object which contains a few properties: + + * `Matches`: `ArrayList` of the current "page" of search results. + * `Suggestion`: (optional) Any suggested spelling corrections in the original query notation + * `SuggestionNice`: (optional) Any suggested spelling corrections for display (without query notation) + * `SuggestionQueryString`: (optional) Link to repeat the search with suggested spelling corrections diff --git a/docs/en/04_advanced_configuration.md b/docs/en/05_advanced_configuration.md similarity index 94% rename from docs/en/04_advanced_configuration.md rename to docs/en/05_advanced_configuration.md index f31c7d5..942af4c 100644 --- a/docs/en/04_advanced_configuration.md +++ b/docs/en/05_advanced_configuration.md @@ -357,8 +357,31 @@ class SolrSearchIndex extends SolrIndex ## Indexing related objects +To add a related object to your index. + ## Subsites +When you are utilising the [subsites module](https://github.com/silverstripe/silverstripe-subsites) you +may want to add [boosting](#boosting/weighting) to results from the current subsite.
To do so, you'll +need to use [eDisMax](https://lucene.apache.org/solr/guide/6_6/the-extended-dismax-query-parser.html) +and the supporting parameters `bq` and `bf`. You should add the following to your `SolrIndex` +extension: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use SilverStripe\Subsites\Model\Subsite; + +public function search(SearchQuery $query, $offset = -1, $limit = -1, $params = []) { + $params = array_merge($params, [ + 'defType' => 'edismax', // turn on eDisMax + 'bq' => '_subsite:'.Subsite::currentSubsiteID(), // boost-query on current subsite ID + 'bf' => '_subsite^2' // double the score of any document with that subsite ID + ]); + + return parent::search($query, $offset, $limit, $params); +} +``` + ## Custom field types Solr supports custom field type definitions which are written to its XML schema. Many standard ones are already included diff --git a/docs/en/05_troubleshooting.md b/docs/en/05_troubleshooting.md deleted file mode 100644 index 6972d3d..0000000 --- a/docs/en/05_troubleshooting.md +++ /dev/null @@ -1,3 +0,0 @@ -# Troubleshooting - -## Common gotchas diff --git a/docs/en/06_troubleshooting.md b/docs/en/06_troubleshooting.md new file mode 100644 index 0000000..18e66e3 --- /dev/null +++ b/docs/en/06_troubleshooting.md @@ -0,0 +1,14 @@ +# Troubleshooting + +## Common gotchas + +* By default number-letter boundaries are treated as a word boundary. For example, `A1` is two words - `a` and `1` - when Solr parses the search term.
+* Special characters and operators are not correctly escaped +* Multi-word synonym issues +* When Solr indexes are reconfigured and reindexed, their content is trashed and rebuilt + +### CWP-specific + +* `solrconfig.xml` customisations fail silently +* Developers aren’t able to test raw queries or see output via the +[Solr admin interface](02_setup.md#solr-admin) diff --git a/docs/en/Solr.md b/docs/en/Solr.md deleted file mode 100644 index 64627f9..0000000 --- a/docs/en/Solr.md +++ /dev/null @@ -1,99 +0,0 @@ -## Adding DataObject classes to Solr search - -If you create a class that extends `DataObject` (and not `Page`) then it won't be automatically added to the search -index. You'll have to make some changes to add it in. - -So, let's take an example of `StaffMember`: - -```php -use SilverStripe\Control\Controller; -use SilverStripe\ORM\DataObject; - -class StaffMember extends DataObject -{ - private static $db = [ - 'Name' => 'Varchar(255)', - 'Abstract' => 'Text', - 'PhoneNumber' => 'Varchar(50)', - ]; - - public function Link($action = 'show') - { - return Controller::join_links('my-controller', $action, $this->ID); - } - - public function getShowInSearch() - { - return 1; - } -} -``` - -This `DataObject` class has the minimum code necessary to allow it to be viewed in the site search. - -`Link()` will return a URL for where a user goes to view the data in more detail in the search results. -`Name` will be used as the result title, and `Abstract` the summary of the staff member which will show under the -search result title. -`getShowInSearch` is required to get the record to show in search, since all results are filtered by `ShowInSearch`.
- -So with that, let's create a new class called `MySolrSearchIndex`: - -```php -use StaffMember; -use SilverStripe\CMS\Model\SiteTree; -use SilverStripe\FullTextSearch\Solr\SolrIndex; - -class MySolrSearchIndex extends SolrIndex { - - public function init() - { - $this->addClass(SiteTree::class); - $this->addClass(StaffMember::class); - - $this->addAllFulltextFields(); - $this->addFilterField('ShowInSearch'); - } -} -``` - -This is a copy/paste of the existing configuration but with the addition of `StaffMember`. - -Once you've created the above classes and run `flush=1`, access `dev/tasks/Solr_Configure` and `dev/tasks/Solr_Reindex` -to tell Solr about the new index you've just created. This will add `StaffMember` and the text fields it has to the -index. Now when you search on the site using `MySolrSearchIndex->search()`, -the `StaffMember` results will show alongside normal `Page` results. - - -## Debugging - -### Using the web admin interface - -You can visit `http://localhost:8983/solr`, which will show you a list -to the admin interfaces of all available indices. -There you can search the contents of the index via the native SOLR web interface. - -It is possible to manually replicate the data automatically sent -to Solr when saving/publishing in SilverStripe, -which is useful when debugging front-end queries, -see `thirdparty/fulltextsearch/server/silverstripe-solr-test.xml`. - -``` -java -Durl=http://localhost:8983/solr/MyIndex/update/ -Dtype=text/xml -jar post.jar silverstripe-solr-test.xml -``` - -## FAQ - -### How do I use date ranges where dates might not be defined? - -The Solr index updater only includes dates with values, -so the field might not exist in all your index entries. -A simple bounded range query (`:[* TO ]`) will fail in this case. 
-In order to query the field, reverse the search conditions and exclude the ranges you don't want: - -```php -// Wrong: Filter will ignore all empty field values -$myQuery->addFilter('fieldname', new SearchQuery_Range('*', 'somedate')); - -// Better: Exclude the opposite range -$myQuery->addExclude('fieldname', new SearchQuery_Range('somedate', '*')); -```