From 26a6581f9bafa8b953b8ef0ce939dd877d884930 Mon Sep 17 00:00:00 2001 From: Andrew Aitken-Fincham Date: Tue, 22 May 2018 14:07:38 +0100 Subject: [PATCH 1/8] lay out docs skeleton adding an index adding data to an index solr dev tasks handling results querying an index --- README.md | 2 +- docs/en/00_index.md | 28 +++++++ .../10_module_scope.md} | 36 --------- docs/en/01_getting_started/11_quick_start.md | 0 docs/en/02_setup/20_requirements.md | 0 docs/en/02_setup/21_installing_solr.md | 0 docs/en/02_setup/22_installing_the_module.md | 0 docs/en/02_setup/23_solr_admin.md | 0 .../03_configuration/30_creating_an_index.md | 23 ++++++ .../31_adding_data_to_an_index.md | 30 +++++++ .../03_configuration/32_querying_the_index.md | 78 +++++++++++++++++++ docs/en/03_configuration/33_dev_tasks.md | 19 +++++ .../34_file_based_configuration.md | 0 .../03_configuration/35_handling_results.md | 47 +++++++++++ .../en/04_advanced_configuration/40_facets.md | 0 .../41_multiple_indexes.md | 0 .../04_advanced_configuration/42_synonyms.md | 0 .../43_spell_check.md | 0 .../04_advanced_configuration/44_boosting.md | 0 .../45_indexing_related_objects.md | 0 .../04_advanced_configuration/46_subsites.md | 0 .../47_adding_new_fields.md | 0 .../05_troubleshooting/50_common_gotchas.md | 0 23 files changed, 226 insertions(+), 37 deletions(-) create mode 100644 docs/en/00_index.md rename docs/en/{index.md => 01_getting_started/10_module_scope.md} (92%) create mode 100644 docs/en/01_getting_started/11_quick_start.md create mode 100644 docs/en/02_setup/20_requirements.md create mode 100644 docs/en/02_setup/21_installing_solr.md create mode 100644 docs/en/02_setup/22_installing_the_module.md create mode 100644 docs/en/02_setup/23_solr_admin.md create mode 100644 docs/en/03_configuration/30_creating_an_index.md create mode 100644 docs/en/03_configuration/31_adding_data_to_an_index.md create mode 100644 docs/en/03_configuration/32_querying_the_index.md create mode 100644 
docs/en/03_configuration/33_dev_tasks.md create mode 100644 docs/en/03_configuration/34_file_based_configuration.md create mode 100644 docs/en/03_configuration/35_handling_results.md create mode 100644 docs/en/04_advanced_configuration/40_facets.md create mode 100644 docs/en/04_advanced_configuration/41_multiple_indexes.md create mode 100644 docs/en/04_advanced_configuration/42_synonyms.md create mode 100644 docs/en/04_advanced_configuration/43_spell_check.md create mode 100644 docs/en/04_advanced_configuration/44_boosting.md create mode 100644 docs/en/04_advanced_configuration/45_indexing_related_objects.md create mode 100644 docs/en/04_advanced_configuration/46_subsites.md create mode 100644 docs/en/04_advanced_configuration/47_adding_new_fields.md create mode 100644 docs/en/05_troubleshooting/50_common_gotchas.md diff --git a/README.md b/README.md index 88001a9..a51696f 100644 --- a/README.md +++ b/README.md @@ -19,7 +19,7 @@ Adds support for fulltext search engines like Sphinx and Solr to SilverStripe CM ## Documentation -See docs/en/index.md +See [the docs](/docs/en/00_index.md), or for the quick version see [the quick start guide](/docs/en/01_getting_started/11_quick_start.md). For details of updates, bugfixes, and features, please see the [changelog](CHANGELOG.md). 
diff --git a/docs/en/00_index.md b/docs/en/00_index.md new file mode 100644 index 0000000..7e2f981 --- /dev/null +++ b/docs/en/00_index.md @@ -0,0 +1,28 @@ +# Fulltext search documentation index + +- Getting started + - [Module scope](01_getting_started/10_module_scope.md) + - [Quick start guide](01_getting_started/11_quick_start.md) +- Setup + - [Requirements](02_setup/20_requirements.md) + - [Installing Solr](02_setup/21_installing_solr.md) + - [Installing this module](02_setup/22_installing_the_module.md) + - [Solr admin](02_setup/23_solr_admin.md) +- Configuration + - [Creating an index](03_configuration/30_creating_an_index.md) + - [Adding data to an index](03_configuration/31_adding_data_to_an_index.md) + - [Querying an index](03_configuration/32_querying_the_index.md) + - [Running the dev/tasks](03_configuration/33_dev_tasks.md) + - [File-based configuration](03_configuration/34_file_based_configuration.md) + - [Handling results](03_configuration/35_handling_results.md) +- Advanced configuration + - [Facets](04_advanced_configuration/40_facets.md) + - [Using multiple indexes](04_advanced_configuration/41_multiple_indexes.md) + - [Synonyms](04_advanced_configuration/42_synonyms.md) + - [Spellcheck](04_advanced_configuration/43_spell_check.md) + - [Boosting](04_advanced_configuration/44_boosting.md) + - [Indexing related objects](04_advanced_configuration/45_indexing_related_objects.md) + - [Subsites](04_advanced_configuration/46_subsites.md) + - [Adding new fields](04_advanced_configuration/47_adding_new_fields.md) +- Troubleshooting + - [Gotchas](05_troubleshooting/50_common_gotchas.md) diff --git a/docs/en/index.md b/docs/en/01_getting_started/10_module_scope.md similarity index 92% rename from docs/en/index.md rename to docs/en/01_getting_started/10_module_scope.md index 825f41e..402af13 100644 --- a/docs/en/index.md +++ b/docs/en/01_getting_started/10_module_scope.md @@ -73,45 +73,9 @@ You can override the default template with a new one at 
`templates/Layout/Page_r ### "Slow" start Otherwise, basic usage is a four step process: -1). Define an index in SilverStripe (Note: The specific connector index instance - that's what defines which engine gets used) - -```php -// File: mysite/code/MyIndex.php: - -use Page; -use SilverStripe\FullTextSearch\Solr\SolrIndex; - -class MyIndex extends SolrIndex -{ - public function init() - { - $this->addClass(Page::class); - $this->addFulltextField('Title'); - $this->addFulltextField('Content'); - } -} -``` - -You can also skip listing all searchable fields, and have the index -figure it out automatically via `addAllFulltextFields()`. - -2). Add something to the index (Note: You can also just update an existing document in the CMS. but adding _existing_ objects to the index is connector specific) - -```php -$page = Page::create(['Content' => 'Help me. My house is on fire. This is less than optimal.']); -$page->write(); -``` - -Note: There's usually a connector-specific "reindex" task for this. - 3). Build a query -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -$query = new SearchQuery(); -$query->addSearchTerm('My house is on fire'); -``` 4). 
Apply that query to an index diff --git a/docs/en/01_getting_started/11_quick_start.md b/docs/en/01_getting_started/11_quick_start.md new file mode 100644 index 0000000..e69de29 diff --git a/docs/en/02_setup/20_requirements.md b/docs/en/02_setup/20_requirements.md new file mode 100644 index 0000000..e69de29 diff --git a/docs/en/02_setup/21_installing_solr.md b/docs/en/02_setup/21_installing_solr.md new file mode 100644 index 0000000..e69de29 diff --git a/docs/en/02_setup/22_installing_the_module.md b/docs/en/02_setup/22_installing_the_module.md new file mode 100644 index 0000000..e69de29 diff --git a/docs/en/02_setup/23_solr_admin.md b/docs/en/02_setup/23_solr_admin.md new file mode 100644 index 0000000..e69de29 diff --git a/docs/en/03_configuration/30_creating_an_index.md b/docs/en/03_configuration/30_creating_an_index.md new file mode 100644 index 0000000..70fa45a --- /dev/null +++ b/docs/en/03_configuration/30_creating_an_index.md @@ -0,0 +1,23 @@ +# Creating an index + +An index can essentially be considered a database that contains all of your searchable content. By default, it will store everything in a field called `Content`, which is queried to find your search results. To create an index that you can query, you can define it like so: + +```php +use Page; +use SilverStripe\FullTextSearch\Solr\SolrIndex; + +class MyIndex extends SolrIndex +{ + public function init() + { + $this->addClass(Page::class); + $this->addFulltextField('Title'); + } +} +``` + +This will create a new `SolrIndex` called `MyIndex`, and it will store the `Title` field on all `Pages` for searching. + +You can also skip listing all searchable fields, and have the index figure it out automatically via `addAllFulltextFields()`. This will add any database fields that are `instanceof DBString` to the index. + +Once you've added this file, make sure you run a [Solr configure](./33_dev_tasks.md) to set up your new index. 
diff --git a/docs/en/03_configuration/31_adding_data_to_an_index.md b/docs/en/03_configuration/31_adding_data_to_an_index.md new file mode 100644 index 0000000..fea4728 --- /dev/null +++ b/docs/en/03_configuration/31_adding_data_to_an_index.md @@ -0,0 +1,30 @@ +# Adding data to an index + +Once you have [created your index](./30_creating_an_index.md), you can add data to it in a number of ways. + +## Reindex the site + +Running the [Solr reindex task](./33_dev_tasks.md) will crawl your site for classes that match those defined on your index, and add the defined fields to the index for searching. This is the most common method used to build the index the first time, or to perform a full rebuild of the index. + +## Publish a page in the CMS + +Every change, addition or removal of an indexed class instance triggers an index update through a "processor" object. The update is transparently handled through inspecting every executed database query and checking which database tables are involved in it. + +A reindex event will trigger when you make a change in the CMS, via `SearchUpdater::handle_manipulation()`, or `ProxyDBExtension::updateProxy()`. This tracks changes to the database, so any alterations will trigger a reindex. In order to minimise delays to those users, the index update is deferred until after the actual request returns to the user, through PHP's `register_shutdown_function()` functionality. + +## Manually + +If you get desperate, you can create a new page in a build task or something like that: + +```php +use Page; + +$page = Page::create(['Content' => 'Help me. My house is on fire. This is less than optimal.']); +$page->write(); +``` + +Depending on the size of the index and how much content needs to be updated, it could take a while for your search results to be updated, so try not to panic if your newly-updated page isn't available immediately. 
+ +## Queued jobs + +If the queuedjobs module is installed, updates are queued up instead of executed in the same request. Queued jobs are usually processed every minute. Large index updates will be batched into multiple queued jobs to ensure a job can run to completion within common constraints, such as memory and execution time limits. You can check the status of jobs in an administrative interface under `admin/queuedjobs/`. diff --git a/docs/en/03_configuration/32_querying_the_index.md b/docs/en/03_configuration/32_querying_the_index.md new file mode 100644 index 0000000..5ef0bd3 --- /dev/null +++ b/docs/en/03_configuration/32_querying_the_index.md @@ -0,0 +1,78 @@ +# Querying an index + +This is where the magic happens. You will construct the search terms and other parameters required to form a `SearchQuery` object, and pass that into a `SearchIndex` to get results. + +## Building a `SearchQuery` + +First, you'll need to construct a new `SearchQuery` object: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$query = SearchQuery::create(); +``` + +You can then alter the `SearchQuery` with a number of methods: + +### `addSearchTerm()` + +The simplest - pass through a string to search your index for. + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$query = SearchQuery::create() + ->addSearchTerm('fire'); +``` + +You can also limit this to specific fields by passing an array as the second argument: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use Page; + +$query = SearchQuery::create() + ->addSearchTerm('on fire', [Page::class . '_Title']); +``` + +### `addFuzzySearchTerm()` + +Pass through a string to search your index for, with "fuzzier" matching - this means that a term like "fishing" would also likely find results containing "fish" or "fisher". Otherwise behaves the same as `addSearchTerm()`. 
+ +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$query = SearchQuery::create() + ->addFuzzySearchTerm('fire'); +``` + +### `addClassFilter()` + +Only query a specific class in the index, optionally including subclasses. + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use My\Namespace\PageType\SpecialPage; + +$query = SearchQuery::create() + ->addClassFilter(SpecialPage::class, false); // only return results from SpecialPages, not subclasses +``` + +## Querying an index + +Once you have your query constructed, you need to run it against your index. + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use My\Namespace\Index\MyIndex; + +$query = SearchQuery::create()->addSearchTerm('fire'); +$results = singleton(MyIndex::class)->search($query); +``` + +The return value of a `search()` call is an object which contains a few properties: + + * `Matches`: `ArrayList` of the current "page" of search results. + * `Suggestion`: (optional) Any suggested spelling corrections in the original query notation + * `SuggestionNice`: (optional) Any suggested spelling corrections for display (without query notation) + * `SuggestionQueryString` (optional) Link to repeat the search with suggested spelling corrections diff --git a/docs/en/03_configuration/33_dev_tasks.md b/docs/en/03_configuration/33_dev_tasks.md new file mode 100644 index 0000000..d99062e --- /dev/null +++ b/docs/en/03_configuration/33_dev_tasks.md @@ -0,0 +1,19 @@ +# Solr dev tasks + +There are two dev/tasks that are central to the operation of the module - `Solr_Configure` and `Solr_Reindex`. You can access these through the web, or via CLI. It is often a good idea to run a configure, followed by a reindex, after a code change - for example, after a deployment. + +## Solr configure + +`dev/tasks/Solr_Configure` + +This task will upload configuration to the Solr core, reloading it or creating it as necessary. 
This should be run after every code change to your indexes, or configuration changes. + +## Solr reindex + +`dev/tasks/Solr_Reindex` + +This task performs a reindex, which adds all the data specified in the index definition into the index store. + +If you have the [Queued Jobs module](https://github.com/symbiote/silverstripe-queuedjobs/) installed, then this task will create multiple reindex jobs that are processed asynchronously. Otherwise, it will run in one process. Often, if you are running it via the web, the request will time out. Usually this means the actually process is still running in the background, but it can be alarming to the user. + +If instead you run the task via the command line, you will see verbose output as the reindexing progresses. diff --git a/docs/en/03_configuration/34_file_based_configuration.md b/docs/en/03_configuration/34_file_based_configuration.md new file mode 100644 index 0000000..e69de29 diff --git a/docs/en/03_configuration/35_handling_results.md b/docs/en/03_configuration/35_handling_results.md new file mode 100644 index 0000000..cd04194 --- /dev/null +++ b/docs/en/03_configuration/35_handling_results.md @@ -0,0 +1,47 @@ +# Handling results + +In order to render search results, you need to return them from a controller. You can also drive this through a form response through standard SilverStripe forms. In this case we simply assume there's a GET parameter named `q` with a search term present. 
+ +```php +use SilverStripe\CMS\Controllers\ContentController; +use SilverStripe\Control\HTTPRequest; +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use My\Namespace\Index\MyIndex; + +class PageController extends ContentController +{ + private static $allowed_actions = [ + 'search', + ]; + + public function search(HTTPRequest $request) + { + $query = SearchQuery::create()->addSearchTerm($request->getVar('q')); + return $this->renderWith([ + 'SearchResult' => singleton(MyIndex::class)->search($query) + ]); + } +} +``` + +In your template (e.g. `Page_results.ss`) you can access the results and loop through them. They're stored in the `$Matches` property of the search return object. + +```ss +<% if $SearchResult.Matches %> +

    <h2>Results for "{$Query}"</h2>
+    <p>Displaying Page $SearchResult.Matches.CurrentPage of $SearchResult.Matches.TotalPages</p>
+    <ol>
+        <% loop $SearchResult.Matches %>
+            <li>
+                <h3><a href="$Link">$Title</a></h3>
+                <p><% if $Abstract %>$Abstract.XML<% else %>$Content.ContextSummary<% end_if %></p>
+            </li>
+        <% end_loop %>
+    </ol>
+<% else %>
+    <p>Sorry, your search query did not return any results.</p>
+<% end_if %> +``` + +Please check the [pagination guide](https://docs.silverstripe.org/en/4/developer_guides/templates/how_tos/pagination/) +in the main SilverStripe documentation to learn how to paginate through search results. diff --git a/docs/en/04_advanced_configuration/40_facets.md b/docs/en/04_advanced_configuration/40_facets.md new file mode 100644 index 0000000..e69de29 diff --git a/docs/en/04_advanced_configuration/41_multiple_indexes.md b/docs/en/04_advanced_configuration/41_multiple_indexes.md new file mode 100644 index 0000000..e69de29 diff --git a/docs/en/04_advanced_configuration/42_synonyms.md b/docs/en/04_advanced_configuration/42_synonyms.md new file mode 100644 index 0000000..e69de29 diff --git a/docs/en/04_advanced_configuration/43_spell_check.md b/docs/en/04_advanced_configuration/43_spell_check.md new file mode 100644 index 0000000..e69de29 diff --git a/docs/en/04_advanced_configuration/44_boosting.md b/docs/en/04_advanced_configuration/44_boosting.md new file mode 100644 index 0000000..e69de29 diff --git a/docs/en/04_advanced_configuration/45_indexing_related_objects.md b/docs/en/04_advanced_configuration/45_indexing_related_objects.md new file mode 100644 index 0000000..e69de29 diff --git a/docs/en/04_advanced_configuration/46_subsites.md b/docs/en/04_advanced_configuration/46_subsites.md new file mode 100644 index 0000000..e69de29 diff --git a/docs/en/04_advanced_configuration/47_adding_new_fields.md b/docs/en/04_advanced_configuration/47_adding_new_fields.md new file mode 100644 index 0000000..e69de29 diff --git a/docs/en/05_troubleshooting/50_common_gotchas.md b/docs/en/05_troubleshooting/50_common_gotchas.md new file mode 100644 index 0000000..e69de29 From 8b5a3dd2e7bf605ae51fcf4f7a94ca1bc753bce9 Mon Sep 17 00:00:00 2001 From: Andrew Aitken-Fincham Date: Thu, 24 May 2018 14:30:32 +0100 Subject: [PATCH 2/8] PR feedback --- docs/en/03_configuration/30_creating_an_index.md | 2 +- docs/en/03_configuration/31_adding_data_to_an_index.md | 6 
+++--- docs/en/03_configuration/33_dev_tasks.md | 8 ++++---- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/en/03_configuration/30_creating_an_index.md b/docs/en/03_configuration/30_creating_an_index.md index 70fa45a..97bb493 100644 --- a/docs/en/03_configuration/30_creating_an_index.md +++ b/docs/en/03_configuration/30_creating_an_index.md @@ -18,6 +18,6 @@ class MyIndex extends SolrIndex This will create a new `SolrIndex` called `MyIndex`, and it will store the `Title` field on all `Pages` for searching. -You can also skip listing all searchable fields, and have the index figure it out automatically via `addAllFulltextFields()`. This will add any database fields that are `instanceof DBString` to the index. +You can also skip listing all searchable fields, and have the index figure it out automatically via `addAllFulltextFields()`. This will add any database fields that are `instanceof DBString` to the index. Use this with caution, however, as you may inadvertently return sensitive information - it is often safer to declare your fields explicitly. Once you've added this file, make sure you run a [Solr configure](./33_dev_tasks.md) to set up your new index. diff --git a/docs/en/03_configuration/31_adding_data_to_an_index.md b/docs/en/03_configuration/31_adding_data_to_an_index.md index fea4728..08fa3d2 100644 --- a/docs/en/03_configuration/31_adding_data_to_an_index.md +++ b/docs/en/03_configuration/31_adding_data_to_an_index.md @@ -14,7 +14,7 @@ A reindex event will trigger when you make a change in the CMS, via `SearchUpdat ## Manually -If you get desperate, you can create a new page in a build task or something like that: +If the situation calls for it, you can add an object to the index directly: ```php use Page; @@ -23,8 +23,8 @@ $page = Page::create(['Content' => 'Help me. My house is on fire. 
This is less t $page->write(); ``` -Depending on the size of the index and how much content needs to be updated, it could take a while for your search results to be updated, so try not to panic if your newly-updated page isn't available immediately. +Depending on the size of the index and how much content needs to be processed, it could take a while for your search results to be updated, so your newly-updated page may not be available in your search results immediately. ## Queued jobs -If the queuedjobs module is installed, updates are queued up instead of executed in the same request. Queued jobs are usually processed every minute. Large index updates will be batched into multiple queued jobs to ensure a job can run to completion within common constraints, such as memory and execution time limits. You can check the status of jobs in an administrative interface under `admin/queuedjobs/`. +If the [Queued Jobs module](https://github.com/symbiote/silverstripe-queuedjobs/) is installed, updates are queued up instead of executed in the same request. Queued jobs are usually processed every minute. Large index updates will be batched into multiple queued jobs to ensure a job can run to completion within common constraints, such as memory and execution time limits. You can check the status of jobs in an administrative interface under `admin/queuedjobs/`. diff --git a/docs/en/03_configuration/33_dev_tasks.md b/docs/en/03_configuration/33_dev_tasks.md index d99062e..824bacb 100644 --- a/docs/en/03_configuration/33_dev_tasks.md +++ b/docs/en/03_configuration/33_dev_tasks.md @@ -1,6 +1,8 @@ # Solr dev tasks -There are two dev/tasks that are central to the operation of the module - `Solr_Configure` and `Solr_Reindex`. You can access these through the web, or via CLI. It is often a good idea to run a configure, followed by a reindex, after a code change - for example, after a deployment. 
+There are two dev/tasks that are central to the operation of the module - `Solr_Configure` and `Solr_Reindex`. You can access these through the web, or via CLI. Running via the web will return "quiet" output by default, but you can increase verbosity by adding `?verbose=1` to the `dev/tasks` URL; CLI will return verbose output by default. + +It is often a good idea to run a configure, followed by a reindex, after a code change - for example, after a deployment. ## Solr configure @@ -14,6 +16,4 @@ This task will upload configuration to the Solr core, reloading it or creating i This task performs a reindex, which adds all the data specified in the index definition into the index store. -If you have the [Queued Jobs module](https://github.com/symbiote/silverstripe-queuedjobs/) installed, then this task will create multiple reindex jobs that are processed asynchronously. Otherwise, it will run in one process. Often, if you are running it via the web, the request will time out. Usually this means the actually process is still running in the background, but it can be alarming to the user. - -If instead you run the task via the command line, you will see verbose output as the reindexing progresses. +If you have the [Queued Jobs module](https://github.com/symbiote/silverstripe-queuedjobs/) installed, then this task will create multiple reindex jobs that are processed asynchronously; unless you are in `dev` mode, in which case the index will be processed immediately (see [processor.yml](/_config/processor.yml)). Otherwise, it will run in one process. Often, if you are running it via the web, the request will time out. Usually this means the actually process is still running in the background, but it can be alarming to the user, so bear that in mind. 
From 51656d94b93f044d2825fc6a9bdcda3657dd5f9c Mon Sep 17 00:00:00 2001 From: Andrew Aitken-Fincham Date: Thu, 24 May 2018 15:24:37 +0100 Subject: [PATCH 3/8] prune module scope, add boosting docs --- docs/en/00_index.md | 1 + docs/en/01_getting_started/10_module_scope.md | 329 +----------------- docs/en/02_setup/22_installing_the_module.md | 14 + .../03_configuration/30_creating_an_index.md | 21 +- .../31_adding_data_to_an_index.md | 38 ++ .../03_configuration/32_querying_the_index.md | 40 +++ .../41_multiple_indexes.md | 32 ++ .../04_advanced_configuration/44_boosting.md | 28 ++ docs/en/05_troubleshooting.md | 112 ++++++ 9 files changed, 290 insertions(+), 325 deletions(-) create mode 100644 docs/en/05_troubleshooting.md diff --git a/docs/en/00_index.md b/docs/en/00_index.md index 7e2f981..4e73f4c 100644 --- a/docs/en/00_index.md +++ b/docs/en/00_index.md @@ -26,3 +26,4 @@ - [Adding new fields](04_advanced_configuration/47_adding_new_fields.md) - Troubleshooting - [Gotchas](05_troubleshooting/50_common_gotchas.md) + - [Test Anchor](05_troubleshooting.md#test-anchor) diff --git a/docs/en/01_getting_started/10_module_scope.md b/docs/en/01_getting_started/10_module_scope.md index 402af13..b957684 100644 --- a/docs/en/01_getting_started/10_module_scope.md +++ b/docs/en/01_getting_started/10_module_scope.md @@ -1,20 +1,20 @@ -## Introduction +# Introduction This is a module aimed at adding support for standalone fulltext search engines to SilverStripe. It contains several layers: * A fulltext API, ignoring the actual provision of fulltext searching - * A connector API, providing common code to allow connecting a fulltext searching engine to the fulltext API, and - * Some connectors for common fulltext searching engines. 
+ * A connector API, providing common code to allow connecting a fulltext searching engine to the fulltext API + * Some connectors for common fulltext searching engines (currently only [Apache Solr](http://lucene.apache.org/solr/)) ## Reasoning There are several fulltext search engines that work in a similar manner. They build indexes of denormalized data that -is then searched through using some custom query syntax. +are then searched through using some custom query syntax. Traditionally, fulltext search connectors for SilverStripe have attempted to hide this design, instead presenting -fulltext searching as an extension of the object model. However the disconnect between the fulltext search engine's +fulltext searching as an extension of the object model. However, the disconnect between the fulltext search engine's design and the object model meant that searching was inefficient. The abstraction would also often break and it was hard to then figure out what was going on. @@ -29,322 +29,3 @@ The intent of this module is not to make changing fulltext search engines seamle common interfaces to fulltext engine functionality, abstracting out common behaviour. However, each connector also offers its own extensions, and there is some behaviour (such as getting the fulltext search engines installed, configured and running) that each connector deals with itself, in a way best suited to that search engine's design. 
- -## Disabling automatic configuration - -If you have this module installed but do not have a Solr server running, you can disable the database manipulation -hooks that trigger automatic index updates: - -```yaml -# File: mysite/_config/search.yml ---- -Name: mysitesearch ---- -SilverStripe\FullTextSearch\Search\Updaters\SearchUpdater: - enabled: false -``` - -## Basic usage - -### Quick start - -If you are running on a Linux-based system, you can get up and running quickly with the quickstart script, like so: - -```bash -composer require silverstripe/fulltextsearch && vendor/bin/fulltextsearch_quickstart -``` - -This will: - -- Install the required Java SDK (using `apt-get` or `yum`) -- Install Solr 4 -- Set up a daemon to run Solr on startup -- Start Solr -- Configure Solr in your `_config.php` (and create one if you don't have one) -- Create a DefaultIndex -- Run a [Solr Configure](03_configuration.md#solr-configure) and a [Solr Reindex](03_configuration.md#solr-reindex) - -If you have the [CMS module](https://github.com/silverstripe/silverstripe-cms) installed, you will be able to simply add `$SearchForm` to your template to add a Solr search form. Default configuration is added via the [`ContentControllerExtension`](/src/Solr/Control/ContentControllerExtension.php) and alternative [`SearchForm`](/src/Solr/Forms/SearchForm.php). - -Ensure that you _don't_ have `SilverStripe\ORM\Search\FulltextSearchable::enable()` set in `_config.php`, as the `SearchForm` action provided by that class will conflict. - -You can override the default template with a new one at `templates/Layout/Page_results_solr.ss`. - -### "Slow" start -Otherwise, basic usage is a four step process: - -3). Build a query - - - -4). Apply that query to an index - -```php -$results = singleton(MyIndex::class)->search($query); -``` - -Note that for most connectors, changes won't be searchable until _after_ the request that triggered the change. 
- -The return value of a `search()` call is an object which contains a few properties: - - * `Matches`: ArrayList of the current "page" of search results. - * `Suggestion`: (optional) Any suggested spelling corrections in the original query notation - * `SuggestionNice`: (optional) Any suggested spelling corrections for display (without query notation) - * `SuggestionQueryString` (optional) Link to repeat the search with suggested spelling corrections - -## Controllers and Templates - -In order to render search results, you need to return them from a controller. -You can also drive this through a form response through standard SilverStripe forms. -In this case we simply assume there's a GET parameter named `q` with a search term present. - -```php -use SilverStripe\CMS\Controllers\ContentController; -use SilverStripe\Control\HTTPRequest; -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -class PageController extends ContentController -{ - private static $allowed_actions = [ - 'search', - ]; - - public function search(HTTPRequest $request) - { - $query = new SearchQuery(); - $query->addSearchTerm($request->getVar('q')); - return $this->renderWith([ - 'SearchResult' => singleton(MyIndex::class)->search($query) - ]); - } -} -``` - -In your template (e.g. `Page_results.ss`) you can access the results and loop through them. -They're stored in the `$Matches` property of the search return object. - -```ss -<% if $SearchResult.Matches %> -

    <h2>Results for "{$Query}"</h2>
-    <p>Displaying Page $SearchResult.Matches.CurrentPage of $SearchResult.Matches.TotalPages</p>
-    <ol>
-        <% loop $SearchResult.Matches %>
-            <li>
-                <h3><a href="$Link">$Title</a></h3>
-                <p><% if $Abstract %>$Abstract.XML<% else %>$Content.ContextSummary<% end_if %></p>
-            </li>
-        <% end_loop %>
-    </ol>
-<% else %>
-    <p>Sorry, your search query did not return any results.</p>
-<% end_if %> -``` - -Please check the [pagination guide](https://docs.silverstripe.org/en/4/developer_guides/templates/how_tos/pagination/) -in the main SilverStripe documentation to learn how to paginate through search results. - -## Automatic Index Updates - -Every change, addition or removal of an indexed class instance triggers an index update through a -"processor" object. The update is transparently handled through inspecting every executed database query -and checking which database tables are involved in it. - -Index updates usually are executed in the same request which caused the index to become "dirty". -For example, a CMS author might have edited a page, or a user has left a new comment. -In order to minimise delays to those users, the index update is deferred until after -the actual request returns to the user, through PHP's `register_shutdown_function()` functionality. - -If the [queuedjobs](https://github.com/symbiote/silverstripe-queuedjobs) module is installed, -updates are queued up instead of executed in the same request. Queue jobs are usually processed every minute. -Large index updates will be batched into multiple queue jobs to ensure a job can run to completion within -common execution constraints (memory and time limits). You can check the status of jobs in -an administrative interface under `admin/queuedjobs/`. - -## Manual Index Updates - -Manual updates are connector specific, please check the connector docs for details. - -## Searching Specific Fields - -By default, the index searches through all indexed fields. -This can be limited by arguments to the `addSearchTerm()` call. - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$query = new SearchQuery(); -$query->addSearchTerm('My house is on fire', [Page::class . 
'_Title']); -// No results, since we're searching in title rather than page content -$results = singleton(MyIndex::class)->search($query); -``` - -## Searching Value Ranges - -Most values can be expressed as ranges, most commonly dates or numbers. -To search for a range of values rather than an exact match, -use the `SearchQuery_Range` class. The range can include bounds on both sides, -or stay open ended by simply leaving the argument blank. - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery_Range; - -$query = new SearchQuery(); -$query->addSearchTerm('My house is on fire'); -// Only include documents edited in 2011 or earlier -$query->addFilter(Page::class . '_LastEdited', new SearchQuery_Range(null, '2011-12-31T23:59:59Z')); -$results = singleton(MyIndex::class)->search($query); -``` - -Note: At the moment, the date format is specific to the search implementation. - -## Searching Empty or Existing Values - -Since there's a type conversion between the SilverStripe database, object properties -and the search index persistence, its often not clear which condition is searched for. -Should it equal an empty string, or only match if the field wasn't indexed at all? -The `SearchQuery` API has the concept of a "missing" and "present" field value for this: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$query = new SearchQuery(); -$query->addSearchTerm('My house is on fire'); -// Needs a value, although it can be false -$query->addFilter(Page::class . '_ShowInMenus', SearchQuery::$present); -$results = singleton(MyIndex::class)->search($query); -``` - -## Indexing Multiple Classes - -An index is a denormalized view of your data, so can hold data from more than one model. -As you can only search one index at a time, all searchable classes need to be included. 
- -```php -// File: mysite/code/MyIndex.php -use SilverStripe\FullTextSearch\Solr\SolrIndex; -use SilverStripe\Security\Member; - -class MyIndex extends SolrIndex -{ - public function init() - { - $this->addClass(Page::class); - $this->addClass(Member::class); - $this->addFulltextField('Content'); // only applies to Page class - $this->addFulltextField('FirstName'); // only applies to Member class - } -} -``` - -## Using Multiple Indexes - -Multiple indexes can be created and searched independently, but if you wish to override an existing -index with another, you can use the `$hide_ancestor` config. - -```php -use SilverStripe\Assets\File; - -class MyReplacementIndex extends MyIndex -{ - private static $hide_ancestor = 'MyIndex'; - - public function init() - { - parent::init(); - - $this->addClass(File::class); - $this->addFulltextField('Title'); - } -} -``` - -You can also filter all indexes globally to a set of pre-defined classes if you wish to -prevent any unknown indexes from being automatically included. - -```yaml -SilverStripe\FullTextSearch\Search\FullTextSearch: - indexes: - - MyReplacementIndex - - CoreSearchIndex -``` - -## Indexing Relationships - -TODO - -## Weighting/Boosting Fields - -Results aren't all created equal. Matches in some fields are more important -than others, for example terms in a page title rather than its content -might be considered more relevant to the user. - -To account for this, a "weighting" (or "boosting") factor can be applied to each -searched field. The default is 1.0, anything below that will decrease the relevance, -anthing above increases it. - -Example: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$query = new SearchQuery(); -$query->addSearchTerm( - 'My house is on fire', - null, - [ - Page::class . '_Title' => 1.5, - Page::class . 
'_Content' => 1.0, - ] -); -$results = singleton(MyIndex::class)->search($query); -``` - -## Filtering - -## Connectors - -### Solr - -See Solr.md - -### Sphinx - -Not written yet - -## FAQ - -### How do I exclude draft pages from the index? - -By default, the `SearchUpdater` class indexes all available "variant states", -so in the case of the `Versioned` extension, both "draft" and "live". -For most cases, you'll want to exclude draft content from your search results. - -You can either prevent the draft content from being indexed in the first place, -by adding the following to your `SearchIndex->init()` method: - -```php -use SilverStripe\FullTextSearch\Search\Variants\SearchVariantVersioned; - -$this->excludeVariantState([SearchVariantVersioned::class => 'Stage']); -``` - -Alternatively, you can index draft content, but simply exclude it from searches. -This can be handy to preview search results on unpublished content, in case a CMS author is logged in. -Before constructing your `SearchQuery`, conditionally switch to the "live" stage: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use SilverStripe\Security\Permission; -use SilverStripe\Versioned\Versioned; - -if (!Permission::check('CMS_ACCESS_CMSMain')) { - Versioned::set_stage(Versioned::LIVE); -} -$query = new SearchQuery(); -// ... -``` - -### How do I write nested/complex filters? 
- -TODO diff --git a/docs/en/02_setup/22_installing_the_module.md b/docs/en/02_setup/22_installing_the_module.md index e69de29..989711b 100644 --- a/docs/en/02_setup/22_installing_the_module.md +++ b/docs/en/02_setup/22_installing_the_module.md @@ -0,0 +1,14 @@ +# Installing the module + +## Disabling automatic configuration + +If you have this module installed but do not have a Solr server running, you can disable the database manipulation +hooks that trigger automatic index updates: + +```yaml +--- +Name: mysitesearch +--- +SilverStripe\FullTextSearch\Search\Updaters\SearchUpdater: + enabled: false +``` diff --git a/docs/en/03_configuration/30_creating_an_index.md b/docs/en/03_configuration/30_creating_an_index.md index 97bb493..68e20d8 100644 --- a/docs/en/03_configuration/30_creating_an_index.md +++ b/docs/en/03_configuration/30_creating_an_index.md @@ -16,7 +16,26 @@ class MyIndex extends SolrIndex } ``` -This will create a new `SolrIndex` called `MyIndex`, and it will store the `Title` field on all `Pages` for searching. +This will create a new `SolrIndex` called `MyIndex`, and it will store the `Title` field on all `Pages` for searching. To index more than one class, +you simply call `addClass()` multiple times. Fields that you add don't have to be present on all classes in the index, they will only apply to a class +if it is present. + +```php +use Page; +use SilverStripe\Security\Member; +use SilverStripe\FullTextSearch\Solr\SolrIndex; + +class MyIndex extends SolrIndex +{ + public function init() + { + $this->addClass(Page::class); + $this->addClass(Member::class); + $this->addFulltextField('Content'); // only applies to Page class + $this->addFulltextField('FirstName'); // only applies to Member class + } +} +``` You can also skip listing all searchable fields, and have the index figure it out automatically via `addAllFulltextFields()`. This will add any database fields that are `instanceof DBString` to the index. 
Use this with caution, however, as you may inadvertently return sensitive information - it is often safer to declare your fields explicitly. diff --git a/docs/en/03_configuration/31_adding_data_to_an_index.md b/docs/en/03_configuration/31_adding_data_to_an_index.md index 08fa3d2..aa977d8 100644 --- a/docs/en/03_configuration/31_adding_data_to_an_index.md +++ b/docs/en/03_configuration/31_adding_data_to_an_index.md @@ -28,3 +28,41 @@ Depending on the size of the index and how much content needs to be processed, i ## Queued jobs If the [Queued Jobs module](https://github.com/symbiote/silverstripe-queuedjobs/) is installed, updates are queued up instead of executed in the same request. Queued jobs are usually processed every minute. Large index updates will be batched into multiple queued jobs to ensure a job can run to completion within common constraints, such as memory and execution time limits. You can check the status of jobs in an administrative interface under `admin/queuedjobs/`. + +### Excluding draft content + +By default, the `SearchUpdater` class indexes all available "variant states", so in the case of the `Versioned` extension, both "draft" and "live". +For most cases, you'll want to exclude draft content from your search results. + +You can either prevent the draft content from being indexed in the first place, by adding the following to your `SearchIndex::init()` method: + +```php +use Page; +use SilverStripe\FullTextSearch\Search\Variants\SearchVariantVersioned; +use SilverStripe\FullTextSearch\Solr\SolrIndex; +use SilverStripe\Versioned\Versioned; + +class MyIndex extends SolrIndex +{ + public function init() + { + $this->addClass(Page::class); + $this->addFulltextField('Title'); + $this->excludeVariantState([SearchVariantVersioned::class => Versioned::DRAFT]); + } +} +``` + +Alternatively, you can index draft content, but simply exclude it from searches. 
This can be handy to preview search results on unpublished content, in case a CMS author is logged in. Before constructing your `SearchQuery`, conditionally switch to the "live" stage: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use SilverStripe\Security\Permission; +use SilverStripe\Versioned\Versioned; + +if (!Permission::check('CMS_ACCESS_CMSMain')) { + Versioned::set_stage(Versioned::LIVE); +} +$query = SearchQuery::create(); +// ... +``` diff --git a/docs/en/03_configuration/32_querying_the_index.md b/docs/en/03_configuration/32_querying_the_index.md index 5ef0bd3..63dc7cc 100644 --- a/docs/en/03_configuration/32_querying_the_index.md +++ b/docs/en/03_configuration/32_querying_the_index.md @@ -58,6 +58,46 @@ $query = SearchQuery::create() ->addClassFilter(SpecialPage::class, false); // only return results from SpecialPages, not subclasses ``` +### Searching value ranges + +Most values can be expressed as ranges, most commonly dates or numbers. To search for a range of values rather than an exact match, +use the `SearchQuery_Range` class. The range can include bounds on both sides, or stay open-ended by simply leaving the argument blank. +It takes arguments in the form of `SearchQuery_Range::create($start, $end)`: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery_Range; +use My\Namespace\Index\MyIndex; +use Page; + +$query = SearchQuery::create() + ->addSearchTerm('fire') + // Only include documents edited in 2011 or earlier + ->addFilter(Page::class . '_LastEdited', SearchQuery_Range::create(null, '2011-12-31T23:59:59Z')); +$results = singleton(MyIndex::class)->search($query); +``` + +Note: At the moment, the date format is specific to the search implementation. 
+ +### Searching for empty or existing values + +Since there's a type conversion between the SilverStripe database, object properties +and the search index persistence, it's often not clear which condition is searched for. +Should it equal an empty string, or only match if the field wasn't indexed at all? +The `SearchQuery` API has the concept of a "missing" and "present" field value for this: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use My\Namespace\Index\MyIndex; +use Page; + +$query = SearchQuery::create() + ->addSearchTerm('fire') + // Needs a value, although it can be false + ->addFilter(Page::class . '_ShowInMenus', SearchQuery::$present); +$results = singleton(MyIndex::class)->search($query); +``` + ## Querying an index Once you have your query constructed, you need to run it against your index. diff --git a/docs/en/04_advanced_configuration/41_multiple_indexes.md b/docs/en/04_advanced_configuration/41_multiple_indexes.md index e69de29..da0c1bd 100644 --- a/docs/en/04_advanced_configuration/41_multiple_indexes.md +++ b/docs/en/04_advanced_configuration/41_multiple_indexes.md @@ -0,0 +1,32 @@ +# Using multiple indexes + +Multiple indexes can be created and searched independently, but if you wish to override an existing +index with another, you can use the `$hide_ancestor` config. + +```php +use SilverStripe\Assets\File; +use My\Namespace\Index\MyIndex; + +class MyReplacementIndex extends MyIndex +{ + private static $hide_ancestor = MyIndex::class; + + public function init() + { + parent::init(); + + $this->addClass(File::class); + $this->addFulltextField('Title'); + } +} +``` + +You can also filter all indexes globally to a set of pre-defined classes if you wish to +prevent any unknown indexes from being automatically included. 
+ +```yaml +SilverStripe\FullTextSearch\Search\FullTextSearch: + indexes: + - MyReplacementIndex + - CoreSearchIndex +``` diff --git a/docs/en/04_advanced_configuration/44_boosting.md b/docs/en/04_advanced_configuration/44_boosting.md index e69de29..6fb6935 100644 --- a/docs/en/04_advanced_configuration/44_boosting.md +++ b/docs/en/04_advanced_configuration/44_boosting.md @@ -0,0 +1,28 @@ +# Boosting/Weighting + +Results aren't all created equal. Matches in some fields are more important +than others; for example, a page `Title` might be considered more relevant to the user than terms in the `Content` field. + +To account for this, a "weighting" (or "boosting") factor can be applied to each searched field. The default value is `1.0`, anything below that will decrease the relevance, anything above increases it. + +To adjust the relative values, pass them in as the third argument to your `addSearchTerm()` call: + +```php +use My\Namespace\Index\MyIndex; +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use Page; + +$query = SearchQuery::create() + ->addSearchTerm( + 'fire', + null, // don't limit the classes to search + [ + Page::class . '_Title' => 1.5, + Page::class . '_Content' => 1.0, + Page::class . '_SecretParagraph' => 0.1, + ] + ); +$results = singleton(MyIndex::class)->search($query); +``` + +This will ensure that `Title` is given higher priority for matches than `Content`, which is well above `SecretParagraph`. 
diff --git a/docs/en/05_troubleshooting.md b/docs/en/05_troubleshooting.md new file mode 100644 index 0000000..9edff12 --- /dev/null +++ b/docs/en/05_troubleshooting.md @@ -0,0 +1,112 @@ +# Troubleshooting + +## Common Gotchas + +Oooh I gotcha + + +Here's some whitespace + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +## Test anchor + +I'm on a boat From 718b1d4e30dc7179337abdcc25578ba9526d4ff6 Mon Sep 17 00:00:00 2001 From: Andrew Aitken-Fincham Date: Fri, 25 May 2018 16:14:56 +0100 Subject: [PATCH 4/8] restructure to fewer files with anchors --- README.md | 2 +- docs/en/00_index.md | 44 +-- ..._module_scope.md => 01_getting_started.md} | 10 +- docs/en/01_getting_started/11_quick_start.md | 0 docs/en/02_setup.md | 60 ++++ docs/en/02_setup/20_requirements.md | 0 docs/en/02_setup/21_installing_solr.md | 0 docs/en/02_setup/22_installing_the_module.md | 14 - docs/en/02_setup/23_solr_admin.md | 0 docs/en/03_configuration.md | 340 ++++++++++++++++++ .../03_configuration/30_creating_an_index.md | 42 --- .../31_adding_data_to_an_index.md | 68 ---- .../03_configuration/32_querying_the_index.md | 118 ------ docs/en/03_configuration/33_dev_tasks.md | 19 - .../34_file_based_configuration.md | 0 .../03_configuration/35_handling_results.md | 47 --- docs/en/04_advanced_configuration.md | 75 ++++ .../en/04_advanced_configuration/40_facets.md | 0 .../41_multiple_indexes.md | 32 -- .../04_advanced_configuration/42_synonyms.md | 0 .../43_spell_check.md | 0 .../04_advanced_configuration/44_boosting.md | 28 -- .../45_indexing_related_objects.md | 0 .../04_advanced_configuration/46_subsites.md | 0 .../47_adding_new_fields.md | 0 docs/en/05_troubleshooting.md | 111 +----- .../05_troubleshooting/50_common_gotchas.md | 0 docs/en/Solr.md | 81 ----- 28 files changed, 507 insertions(+), 584 deletions(-) rename 
docs/en/{01_getting_started/10_module_scope.md => 01_getting_started.md} (95%) delete mode 100644 docs/en/01_getting_started/11_quick_start.md create mode 100644 docs/en/02_setup.md delete mode 100644 docs/en/02_setup/20_requirements.md delete mode 100644 docs/en/02_setup/21_installing_solr.md delete mode 100644 docs/en/02_setup/22_installing_the_module.md delete mode 100644 docs/en/02_setup/23_solr_admin.md create mode 100644 docs/en/03_configuration.md delete mode 100644 docs/en/03_configuration/30_creating_an_index.md delete mode 100644 docs/en/03_configuration/31_adding_data_to_an_index.md delete mode 100644 docs/en/03_configuration/32_querying_the_index.md delete mode 100644 docs/en/03_configuration/33_dev_tasks.md delete mode 100644 docs/en/03_configuration/34_file_based_configuration.md delete mode 100644 docs/en/03_configuration/35_handling_results.md create mode 100644 docs/en/04_advanced_configuration.md delete mode 100644 docs/en/04_advanced_configuration/40_facets.md delete mode 100644 docs/en/04_advanced_configuration/41_multiple_indexes.md delete mode 100644 docs/en/04_advanced_configuration/42_synonyms.md delete mode 100644 docs/en/04_advanced_configuration/43_spell_check.md delete mode 100644 docs/en/04_advanced_configuration/44_boosting.md delete mode 100644 docs/en/04_advanced_configuration/45_indexing_related_objects.md delete mode 100644 docs/en/04_advanced_configuration/46_subsites.md delete mode 100644 docs/en/04_advanced_configuration/47_adding_new_fields.md delete mode 100644 docs/en/05_troubleshooting/50_common_gotchas.md diff --git a/README.md b/README.md index a51696f..c519ee8 100644 --- a/README.md +++ b/README.md @@ -19,7 +19,7 @@ Adds support for fulltext search engines like Sphinx and Solr to SilverStripe CM ## Documentation -See [the docs](/docs/en/00_index.md), or for the quick version see [the quick start guide](/docs/en/01_getting_started/11_quick_start.md). 
+See [the docs](/docs/en/00_index.md), or for the quick version see [the quick start guide](/docs/en/01_getting_started.md#quick-start). For details of updates, bugfixes, and features, please see the [changelog](CHANGELOG.md). diff --git a/docs/en/00_index.md b/docs/en/00_index.md index 4e73f4c..ed9d14b 100644 --- a/docs/en/00_index.md +++ b/docs/en/00_index.md @@ -1,29 +1,29 @@ # Fulltext search documentation index - Getting started - - [Module scope](01_getting_started/10_module_scope.md) - - [Quick start guide](01_getting_started/11_quick_start.md) + - [Module scope](01_getting_started.md#module-scope) + - [Quick start guide](01_getting_started.md#quick-start) - Setup - - [Requirements](02_setup/20_requirements.md) - - [Installing Solr](02_setup/21_installing_solr.md) - - [Installing this module](02_setup/22_installing_the_module.md) - - [Solr admin](02_setup/23_solr_admin.md) + - [Requirements](02_setup.md#requirements) + - [Installing Solr](02_setup.md#installing-solr) + - [Installing this module](02_setup.md#installing-the-module) + - [Solr admin](02_setup.md#solr-admin) - Configuration - - [Creating an index](03_configuration/30_creating_an_index.md) - - [Adding data to an index](03_configuration/31_adding_data_to_an_index.md) - - [Querying an index](03_configuration/32_querying_the_index.md) - - [Running the dev/tasks](03_configuration/33_dev_tasks.md) - - [File-based configuration](03_configuration/34_file_based_configuration.md) - - [Handling results](03_configuration/35_handling_results.md) + - [Solr server parameters](03_configuration.md#solr-server-parameters) + - [Creating an index](03_configuration.md#creating-an-index) + - [Adding data to an index](03_configuration.md#adding-data-to-an-index) + - [Querying an index](03_configuration.md#querying-the-index) + - [Running the dev/tasks](03_configuration.md#dev-tasks) + - [File-based configuration](03_configuration.md#file-based-configuration) + - [Handling results](03_configuration.md#handling-results) 
- Advanced configuration - - [Facets](04_advanced_configuration/40_facets.md) - - [Using multiple indexes](04_advanced_configuration/41_multiple_indexes.md) - - [Synonyms](04_advanced_configuration/42_synonyms.md) - - [Spellcheck](04_advanced_configuration/43_spell_check.md) - - [Boosting](04_advanced_configuration/44_boosting.md) - - [Indexing related objects](04_advanced_configuration/45_indexing_related_objects.md) - - [Subsites](04_advanced_configuration/46_subsites.md) - - [Adding new fields](04_advanced_configuration/47_adding_new_fields.md) + - [Facets](04_advanced_configuration.md#facets) + - [Using multiple indexes](04_advanced_configuration.md#multiple-indexes) + - [Synonyms](04_advanced_configuration.md#synonyms) + - [Spellcheck](04_advanced_configuration.md#spell-check) + - [Boosting](04_advanced_configuration.md#boosting) + - [Indexing related objects](04_advanced_configuration.md#indexing-related-objects) + - [Subsites](04_advanced_configuration.md#subsites) + - [Adding new fields](04_advanced_configuration.md#adding-new-fields) - Troubleshooting - - [Gotchas](05_troubleshooting/50_common_gotchas.md) - - [Test Anchor](05_troubleshooting.md#test-anchor) + - [Gotchas](05_troubleshooting.md#common-gotchas) diff --git a/docs/en/01_getting_started/10_module_scope.md b/docs/en/01_getting_started.md similarity index 95% rename from docs/en/01_getting_started/10_module_scope.md rename to docs/en/01_getting_started.md index b957684..0ae68ee 100644 --- a/docs/en/01_getting_started/10_module_scope.md +++ b/docs/en/01_getting_started.md @@ -1,4 +1,8 @@ -# Introduction +# Getting started + +## Module scope + +### Introduction This is a module aimed at adding support for standalone fulltext search engines to SilverStripe. 
@@ -8,7 +12,7 @@ It contains several layers: * A connector API, providing common code to allow connecting a fulltext searching engine to the fulltext API * Some connectors for common fulltext searching engines (currently only [Apache Solr](http://lucene.apache.org/solr/)) -## Reasoning +### Reasoning There are several fulltext search engines that work in a similar manner. They build indexes of denormalized data that are then searched through using some custom query syntax. @@ -29,3 +33,5 @@ The intent of this module is not to make changing fulltext search engines seamle common interfaces to fulltext engine functionality, abstracting out common behaviour. However, each connector also offers its own extensions, and there is some behaviour (such as getting the fulltext search engines installed, configured and running) that each connector deals with itself, in a way best suited to that search engine's design. + +## Quick start diff --git a/docs/en/01_getting_started/11_quick_start.md b/docs/en/01_getting_started/11_quick_start.md deleted file mode 100644 index e69de29..0000000 diff --git a/docs/en/02_setup.md b/docs/en/02_setup.md new file mode 100644 index 0000000..8b16bdf --- /dev/null +++ b/docs/en/02_setup.md @@ -0,0 +1,60 @@ +# Setup + +The fulltextsearch module includes support for connecting to Solr. + +It works with Solr in multi-core mode. It needs to be able to update Solr configuration files, and has modes for doing this by direct file access (when Solr shares a server with SilverStripe) and by WebDAV (when it's on a different server). + +See the helpful [Solr Tutorial](http://lucene.apache.org/solr/4_5_1/tutorial.html), for more on cores +and querying. + +## Requirements + +Since Solr is Java based, it requires Java 1.5 or greater installed. + +When you're installing it yourself, it also requires a servlet container such as Tomcat, Jetty, or Resin. 
For +development testing there is a standalone version that comes bundled with Jetty (see [Installing Solr](#installing-solr) below). + +See the official [Solr installation docs](http://wiki.apache.org/solr/SolrInstall) for more information. + +Note that these requirements are for the Solr server environment, which doesn't have to be the same physical machine as the SilverStripe webhost. + +## Installing Solr + +### Local installation + +If you'll be running Solr on the same machine as your SilverStripe installation, you can use the [silverstripe/fulltextsearch-localsolr module](https://github.com/silverstripe-archive/silverstripe-fulltextsearch-localsolr). This can also be useful as a development dependency. You can bring it in via composer (use `require-dev` if you plan to install Solr remotely in Production): + +```bash +composer require silverstripe/fulltextsearch-localsolr +``` + +Once installed, start the server via CLI: + +```bash +cd fulltextsearch-localsolr/server +java -jar start.jar +``` + +Then configure Solr to use `file` mode with the following configuration in your `app/_config.php`, making sure that the `path` directory is writeable by the user that started the server (above): + +```php +use SilverStripe\FullTextSearch\Solr\Solr; + +Solr::configure_server([ + 'host' => 'localhost', + 'indexstore' => [ + 'mode' => 'file', + 'path' => BASE_PATH . 
'/.solr' + ] +]); +``` + +### Remote installation + + + +## Installing the module + + + +## Solr admin diff --git a/docs/en/02_setup/20_requirements.md b/docs/en/02_setup/20_requirements.md deleted file mode 100644 index e69de29..0000000 diff --git a/docs/en/02_setup/21_installing_solr.md b/docs/en/02_setup/21_installing_solr.md deleted file mode 100644 index e69de29..0000000 diff --git a/docs/en/02_setup/22_installing_the_module.md b/docs/en/02_setup/22_installing_the_module.md deleted file mode 100644 index 989711b..0000000 --- a/docs/en/02_setup/22_installing_the_module.md +++ /dev/null @@ -1,14 +0,0 @@ -# Installing the module - -## Disabling automatic configuration - -If you have this module installed but do not have a Solr server running, you can disable the database manipulation -hooks that trigger automatic index updates: - -```yaml ---- -Name: mysitesearch ---- -SilverStripe\FullTextSearch\Search\Updaters\SearchUpdater: - enabled: false -``` diff --git a/docs/en/02_setup/23_solr_admin.md b/docs/en/02_setup/23_solr_admin.md deleted file mode 100644 index e69de29..0000000 diff --git a/docs/en/03_configuration.md b/docs/en/03_configuration.md new file mode 100644 index 0000000..0330b0e --- /dev/null +++ b/docs/en/03_configuration.md @@ -0,0 +1,340 @@ +# Configuration + +## Solr server parameters + +Set these values inside your `app/_config.php` - the defaults are shown below: + +```php +use SilverStripe\FullTextSearch\Solr\Solr; + +Solr::configure_server([ + 'host' => 'localhost', // The host or IP address that Solr is listening on + 'port' => '8983', // The port Solr is listening on + 'path' => '/solr', // The suburl the Solr service is available on + 'version' => '4', // Solr server version - currently only 3 and 4 supported + 'service' => 'Solr4Service', // The class that provides actual communication to the Solr server + 'extraspath' => BASE_PATH .'/fulltextsearch/conf/solr/4/extras/', // Absolute path to the folder containing templates used for generating 
the schema and field definitions + 'templates' => BASE_PATH . '/fulltextsearch/conf/solr/4/templates/', // Absolute path to the configuration default files, e.g. solrconfig.xml + 'indexstore' => [ + 'mode' => NULL, // [REQUIRED] a classname which implements SolrConfigStore, or 'file' or 'webdav' + 'path' => NULL, // [REQUIRED] The (locally accessible) path to write the index configurations to OR The suburl on the Solr host that is set up to accept index configurations via webdav (e.g. BASE_PATH . '/.solr') + 'remotepath' => same as 'path' when using 'file' mode, // The path that the Solr server will read the index configurations from + 'auth' => NULL, // Webdav only - A username:password pair string to use to auth against the webdav server (e.g. solr:solr) + 'port' => '8983' // The port for WebDAV if different from the Solr port + ] +]); +``` + +Note: We recommend putting the `indexstore['path']` directory outside of the webroot. If you place it inside of the webroot (as shown in the example), please ensure its contents are not accessible through the webserver. +This can be achieved by server configuration, or (in most configurations) also by marking the folder as hidden via a "dot" prefix. + +### Disabling automatic configuration + +If you have this module installed but do not have a Solr server running, you can disable the database manipulation +hooks that trigger automatic index updates: + +```yaml +SilverStripe\FullTextSearch\Search\Updaters\SearchUpdater: + enabled: false +``` + +## Creating an index + +An index can essentially be considered a database that contains all of your searchable content. By default, it will store everything in a field called `Content`, which is queried to find your search results. 
To create an index that you can query, you can define it like so: + +```php +use Page; +use SilverStripe\FullTextSearch\Solr\SolrIndex; + +class MyIndex extends SolrIndex +{ + public function init() + { + $this->addClass(Page::class); + $this->addFulltextField('Title'); + } +} +``` + +This will create a new `SolrIndex` called `MyIndex`, and it will store the `Title` field on all `Pages` for searching. To index more than one class, +you simply call `addClass()` multiple times. Fields that you add don't have to be present on all classes in the index, they will only apply to a class +if it is present. + +```php +use Page; +use SilverStripe\Security\Member; +use SilverStripe\FullTextSearch\Solr\SolrIndex; + +class MyIndex extends SolrIndex +{ + public function init() + { + $this->addClass(Page::class); + $this->addClass(Member::class); + $this->addFulltextField('Content'); // only applies to Page class + $this->addFulltextField('FirstName'); // only applies to Member class + } +} +``` + +You can also skip listing all searchable fields, and have the index figure it out automatically via `addAllFulltextFields()`. This will add any database fields that are `instanceof DBString` to the index. Use this with caution, however, as you may inadvertently return sensitive information - it is often safer to declare your fields explicitly. + +Once you've added this file, make sure you run a [Solr configure](#dev-tasks) to set up your new index. + +## Adding data to an index + +Once you have [created your index](#creating-an-index), you can add data to it in a number of ways. + +### Reindex the site + +Running the [Solr reindex task](#dev-tasks) will crawl your site for classes that match those defined on your index, and add the defined fields to the index for searching. This is the most common method used to build the index the first time, or to perform a full rebuild of the index. 
+ +### Publish a page in the CMS + +Every change, addition or removal of an indexed class instance triggers an index update through a "processor" object. The update is transparently handled through inspecting every executed database query and checking which database tables are involved in it. + +A reindex event will trigger when you make a change in the CMS, via `SearchUpdater::handle_manipulation()`, or `ProxyDBExtension::updateProxy()`. This tracks changes to the database, so any alterations will trigger a reindex. In order to minimise delays to those users, the index update is deferred until after the actual request returns to the user, through PHP's `register_shutdown_function()` functionality. + +### Manually + +If the situation calls for it, you can add an object to the index directly: + +```php +use Page; + +$page = Page::create(['Content' => 'Help me. My house is on fire. This is less than optimal.']); +$page->write(); +``` + +Depending on the size of the index and how much content needs to be processed, it could take a while for your search results to be updated, so your newly-updated page may not be available in your search results immediately. + +### Queued jobs + +If the [Queued Jobs module](https://github.com/symbiote/silverstripe-queuedjobs/) is installed, updates are queued up instead of executed in the same request. Queued jobs are usually processed every minute. Large index updates will be batched into multiple queued jobs to ensure a job can run to completion within common constraints, such as memory and execution time limits. You can check the status of jobs in an administrative interface under `admin/queuedjobs/`. + +### Excluding draft content + +By default, the `SearchUpdater` class indexes all available "variant states", so in the case of the `Versioned` extension, both "draft" and "live". +For most cases, you'll want to exclude draft content from your search results. 
+ +You can either prevent the draft content from being indexed in the first place, by adding the following to your `SearchIndex::init()` method: + +```php +use Page; +use SilverStripe\FullTextSearch\Search\Variants\SearchVariantVersioned; +use SilverStripe\FullTextSearch\Solr\SolrIndex; +use SilverStripe\Versioned\Versioned; + +class MyIndex extends SolrIndex +{ + public function init() + { + $this->addClass(Page::class); + $this->addFulltextField('Title'); + $this->excludeVariantState([SearchVariantVersioned::class => Versioned::DRAFT]); + } +} +``` + +Alternatively, you can index draft content, but simply exclude it from searches. This can be handy to preview search results on unpublished content, in case a CMS author is logged in. Before constructing your `SearchQuery`, conditionally switch to the "live" stage: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use SilverStripe\Security\Permission; +use SilverStripe\Versioned\Versioned; + +if (!Permission::check('CMS_ACCESS_CMSMain')) { + Versioned::set_stage(Versioned::LIVE); +} +$query = SearchQuery::create(); +// ... +``` + +## Querying an index + +This is where the magic happens. You will construct the search terms and other parameters required to form a `SearchQuery` object, and pass that into a `SearchIndex` to get results. + +### Building a `SearchQuery` + +First, you'll need to construct a new `SearchQuery` object: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$query = SearchQuery::create(); +``` + +You can then alter the `SearchQuery` with a number of methods: + +#### `addSearchTerm()` + +The simplest - pass through a string to search your index for. 
+ +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$query = SearchQuery::create() + ->addSearchTerm('fire'); +``` + +You can also limit this to specific fields by passing an array as the second argument: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use Page; + +$query = SearchQuery::create() + ->addSearchTerm('on fire', [Page::class . '_Title']); +``` + +#### `addFuzzySearchTerm()` + +Pass through a string to search your index for, with "fuzzier" matching - this means that a term like "fishing" would also likely find results containing "fish" or "fisher". Otherwise behaves the same as `addSearchTerm()`. + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$query = SearchQuery::create() + ->addFuzzySearchTerm('fire'); +``` + +#### `addClassFilter()` + +Only query a specific class in the index, optionally including subclasses. + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use My\Namespace\PageType\SpecialPage; + +$query = SearchQuery::create() + ->addClassFilter(SpecialPage::class, false); // only return results from SpecialPages, not subclasses +``` + +#### Searching value ranges + +Most values can be expressed as ranges, most commonly dates or numbers. To search for a range of values rather than an exact match, +use the `SearchQuery_Range` class. The range can include bounds on both sides, or stay open-ended by simply leaving the argument blank. +It takes arguments in the form of `SearchQuery_Range::create($start, $end)`: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery_Range; +use My\Namespace\Index\MyIndex; +use Page; + +$query = SearchQuery::create() + ->addSearchTerm('fire') + // Only include documents edited in 2011 or earlier + ->addFilter(Page::class . 
'_LastEdited', SearchQuery_Range::create(null, '2011-12-31T23:59:59Z')); +$results = singleton(MyIndex::class)->search($query); +``` + +Note: At the moment, the date format is specific to the search implementation. + +#### Searching for empty or existing values + +Since there's a type conversion between the SilverStripe database, object properties +and the search index persistence, it's often not clear which condition is searched for. +Should it equal an empty string, or only match if the field wasn't indexed at all? +The `SearchQuery` API has the concept of a "missing" and "present" field value for this: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use My\Namespace\Index\MyIndex; +use Page; + +$query = SearchQuery::create() + ->addSearchTerm('fire') + // Needs a value, although it can be false + ->addFilter(Page::class . '_ShowInMenus', SearchQuery::$present); +$results = singleton(MyIndex::class)->search($query); +``` + +### Querying an index + +Once you have your query constructed, you need to run it against your index. + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use My\Namespace\Index\MyIndex; + +$query = SearchQuery::create()->addSearchTerm('fire'); +$results = singleton(MyIndex::class)->search($query); +``` + +The return value of a `search()` call is an object which contains a few properties: + + * `Matches`: `ArrayList` of the current "page" of search results. + * `Suggestion`: (optional) Any suggested spelling corrections in the original query notation + * `SuggestionNice`: (optional) Any suggested spelling corrections for display (without query notation) + * `SuggestionQueryString`: (optional) Link to repeat the search with suggested spelling corrections + +## Solr dev tasks + +There are two dev/tasks that are central to the operation of the module - `Solr_Configure` and `Solr_Reindex`. You can access these through the web, or via CLI. 
Running via the web will return "quiet" output by default, but you can increase verbosity by adding `?verbose=1` to the `dev/tasks` URL; CLI will return verbose output by default. + +It is often a good idea to run a configure, followed by a reindex, after a code change - for example, after a deployment. + +### Solr configure + +`dev/tasks/Solr_Configure` + +This task will upload configuration to the Solr core, reloading it or creating it as necessary. This should be run after every change to your index code or configuration. + +### Solr reindex + +`dev/tasks/Solr_Reindex` + +This task performs a reindex, which adds all the data specified in the index definition into the index store. + +If you have the [Queued Jobs module](https://github.com/symbiote/silverstripe-queuedjobs/) installed, then this task will create multiple reindex jobs that are processed asynchronously; unless you are in `dev` mode, in which case the index will be processed immediately (see [processor.yml](/_config/processor.yml)). Otherwise, it will run in one process. Often, if you are running it via the web, the request will time out. Usually this means the actual process is still running in the background, but it can be alarming to the user, so bear that in mind. + +## File-based configuration + +## Handling results + +In order to render search results, you need to return them from a controller. You can also drive this through a form response through standard SilverStripe forms. In this case we simply assume there's a GET parameter named `q` with a search term present. 
+ +```php +use SilverStripe\CMS\Controllers\ContentController; +use SilverStripe\Control\HTTPRequest; +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use My\Namespace\Index\MyIndex; + +class PageController extends ContentController +{ + private static $allowed_actions = [ + 'search', + ]; + + public function search(HTTPRequest $request) + { + $query = SearchQuery::create()->addSearchTerm($request->getVar('q')); + return $this->renderWith([ + 'SearchResult' => singleton(MyIndex::class)->search($query) + ]); + } +} +``` + +In your template (e.g. `Page_results.ss`) you can access the results and loop through them. They're stored in the `$Matches` property of the search return object. + +```ss +<% if $SearchResult.Matches %> +

+    <h2>Results for "{$Query}"</h2>
+    <p>Displaying Page $SearchResult.Matches.CurrentPage of $SearchResult.Matches.TotalPages</p>
+    <ol>
+        <% loop $SearchResult.Matches %>
+            <li>
+                <h3>$Title</h3>
+                <p><% if $Abstract %>$Abstract.XML<% else %>$Content.ContextSummary<% end_if %></p>
+            </li>
+        <% end_loop %>
+    </ol>
+<% else %>
+    <p>Sorry, your search query did not return any results.</p>
+<% end_if %> +``` + +Please check the [pagination guide](https://docs.silverstripe.org/en/4/developer_guides/templates/how_tos/pagination/) +in the main SilverStripe documentation to learn how to paginate through search results. diff --git a/docs/en/03_configuration/30_creating_an_index.md b/docs/en/03_configuration/30_creating_an_index.md deleted file mode 100644 index 68e20d8..0000000 --- a/docs/en/03_configuration/30_creating_an_index.md +++ /dev/null @@ -1,42 +0,0 @@ -# Creating an index - -An index can essentially be considered a database that contains all of your searchable content. By default, it will store everything in a field called `Content`, which is queried to find your search results. To create an index that you can query, you can define it like so: - -```php -use Page; -use SilverStripe\FullTextSearch\Solr\SolrIndex; - -class MyIndex extends SolrIndex -{ - public function init() - { - $this->addClass(Page::class); - $this->addFulltextField('Title'); - } -} -``` - -This will create a new `SolrIndex` called `MyIndex`, and it will store the `Title` field on all `Pages` for searching. To index more than one class, -you simply call `addClass()` multiple times. Fields that you add don't have to be present on all classes in the index, they will only apply to a class -if it is present. - -```php -use Page; -use SilverStripe\Security\Member; -use SilverStripe\FullTextSearch\Solr\SolrIndex; - -class MyIndex extends SolrIndex -{ - public function init() - { - $this->addClass(Page::class); - $this->addClass(Member::class); - $this->addFulltextField('Content'); // only applies to Page class - $this->addFulltextField('FirstName'); // only applies to Member class - } -} -``` - -You can also skip listing all searchable fields, and have the index figure it out automatically via `addAllFulltextFields()`. This will add any database fields that are `instanceof DBString` to the index. 
Use this with caution, however, as you may inadvertently return sensitive information - it is often safer to declare your fields explicitly. - -Once you've added this file, make sure you run a [Solr configure](./33_dev_tasks.md) to set up your new index. diff --git a/docs/en/03_configuration/31_adding_data_to_an_index.md b/docs/en/03_configuration/31_adding_data_to_an_index.md deleted file mode 100644 index aa977d8..0000000 --- a/docs/en/03_configuration/31_adding_data_to_an_index.md +++ /dev/null @@ -1,68 +0,0 @@ -# Adding data to an index - -Once you have [created your index](./30_creating_an_index.md), you can add data to it in a number of ways. - -## Reindex the site - -Running the [Solr reindex task](./33_dev_tasks.md) will crawl your site for classes that match those defined on your index, and add the defined fields to the index for searching. This is the most common method used to build the index the first time, or to perform a full rebuild of the index. - -## Publish a page in the CMS - -Every change, addition or removal of an indexed class instance triggers an index update through a "processor" object. The update is transparently handled through inspecting every executed database query and checking which database tables are involved in it. - -A reindex event will trigger when you make a change in the CMS, via `SearchUpdater::handle_manipulation()`, or `ProxyDBExtension::updateProxy()`. This tracks changes to the database, so any alterations will trigger a reindex. In order to minimise delays to those users, the index update is deferred until after the actual request returns to the user, through PHP's `register_shutdown_function()` functionality. - -## Manually - -If the situation calls for it, you can add an object to the index directly: - -```php -use Page; - -$page = Page::create(['Content' => 'Help me. My house is on fire. 
This is less than optimal.']); -$page->write(); -``` - -Depending on the size of the index and how much content needs to be processed, it could take a while for your search results to be updated, so your newly-updated page may not be available in your search results immediately. - -## Queued jobs - -If the [Queued Jobs module](https://github.com/symbiote/silverstripe-queuedjobs/) is installed, updates are queued up instead of executed in the same request. Queued jobs are usually processed every minute. Large index updates will be batched into multiple queued jobs to ensure a job can run to completion within common constraints, such as memory and execution time limits. You can check the status of jobs in an administrative interface under `admin/queuedjobs/`. - -### Excluding draft content - -By default, the `SearchUpdater` class indexes all available "variant states", so in the case of the `Versioned` extension, both "draft" and "live". -For most cases, you'll want to exclude draft content from your search results. - -You can either prevent the draft content from being indexed in the first place, by adding the following to your `SearchIndex::init()` method: - -```php -use Page; -use SilverStripe\FullTextSearch\Search\Variants\SearchVariantVersioned; -use SilverStripe\FullTextSearch\Solr\SolrIndex; -use SilverStripe\Versioned\Versioned; - -class MyIndex extends SolrIndex -{ - public function init() - { - $this->addClass(Page::class); - $this->addFulltextField('Title'); - $this->excludeVariantState([SearchVariantVersioned::class => Versioned::DRAFT]); - } -} -``` - -Alternatively, you can index draft content, but simply exclude it from searches. This can be handy to preview search results on unpublished content, in case a CMS author is logged in. 
Before constructing your `SearchQuery`, conditionally switch to the "live" stage: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use SilverStripe\Security\Permission; -use SilverStripe\Versioned\Versioned; - -if (!Permission::check('CMS_ACCESS_CMSMain')) { - Versioned::set_stage(Versioned::LIVE); -} -$query = SearchQuery::create(); -// ... -``` diff --git a/docs/en/03_configuration/32_querying_the_index.md b/docs/en/03_configuration/32_querying_the_index.md deleted file mode 100644 index 63dc7cc..0000000 --- a/docs/en/03_configuration/32_querying_the_index.md +++ /dev/null @@ -1,118 +0,0 @@ -# Querying an index - -This is where the magic happens. You will construct the search terms and other parameters required to form a `SearchQuery` object, and pass that into a `SearchIndex` to get results. - -## Building a `SearchQuery` - -First, you'll need to construct a new `SearchQuery` object: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$query = SearchQuery::create(); -``` - -You can then alter the `SearchQuery` with a number of methods: - -### `addSearchTerm()` - -The simplest - pass through a string to search your index for. - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$query = SearchQuery::create() - ->addSearchTerm('fire'); -``` - -You can also limit this to specific fields by passing an array as the second argument: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use Page; - -$query = SearchQuery::create() - ->addSearchTerm('on fire', [Page::class . '_Title']); -``` - -### `addFuzzySearchTerm()` - -Pass through a string to search your index for, with "fuzzier" matching - this means that a term like "fishing" would also likely find results containing "fish" or "fisher". Otherwise behaves the same as `addSearchTerm()`. 
- -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$query = SearchQuery::create() - ->addFuzzySearchTerm('fire'); -``` - -### `addClassFilter()` - -Only query a specific class in the index, optionally including subclasses. - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use My\Namespace\PageType\SpecialPage; - -$query = SearchQuery::create() - ->addClassFilter(SpecialPage::class, false); // only return results from SpecialPages, not subclasses -``` - -### Searching value ranges - -Most values can be expressed as ranges, most commonly dates or numbers. To search for a range of values rather than an exact match, -use the `SearchQuery_Range` class. The range can include bounds on both sides, or stay open-ended by simply leaving the argument blank. -It takes arguments in the form of `SearchQuery_Range::create($start, $end))`: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery_Range; -use My\Namespace\Index\MyIndex; -use Page; - -$query = SearchQuery::create() - ->addSearchTerm('fire') - // Only include documents edited in 2011 or earlier - ->addFilter(Page::class . '_LastEdited', SearchQuery_Range::create(null, '2011-12-31T23:59:59Z')); -$results = singleton(MyIndex::class)->search($query); -``` - -Note: At the moment, the date format is specific to the search implementation. - -### Searching for empty or existing values - -Since there's a type conversion between the SilverStripe database, object properties -and the search index persistence, it's often not clear which condition is searched for. -Should it equal an empty string, or only match if the field wasn't indexed at all? 
-The `SearchQuery` API has the concept of a "missing" and "present" field value for this: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use My\Namespace\Index\MyIndex; -use Page; - -$query = SearchQuery::create() - ->addSearchTerm('fire'); - // Needs a value, although it can be false - ->addFilter(Page::class . '_ShowInMenus', SearchQuery::$present); -$results = singleton(MyIndex::class)->search($query); -``` - -## Querying an index - -Once you have your query constructed, you need to run it against your index. - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use My\Namespace\Index\MyIndex; - -$query = SearchQuery::create()->addSearchTerm('fire'); -$results = singleton(MyIndex::class)->search($query); -``` - -The return value of a `search()` call is an object which contains a few properties: - - * `Matches`: `ArrayList` of the current "page" of search results. - * `Suggestion`: (optional) Any suggested spelling corrections in the original query notation - * `SuggestionNice`: (optional) Any suggested spelling corrections for display (without query notation) - * `SuggestionQueryString` (optional) Link to repeat the search with suggested spelling corrections diff --git a/docs/en/03_configuration/33_dev_tasks.md b/docs/en/03_configuration/33_dev_tasks.md deleted file mode 100644 index 824bacb..0000000 --- a/docs/en/03_configuration/33_dev_tasks.md +++ /dev/null @@ -1,19 +0,0 @@ -# Solr dev tasks - -There are two dev/tasks that are central to the operation of the module - `Solr_Configure` and `Solr_Reindex`. You can access these through the web, or via CLI. Running via the web will return "quiet" output by default, but you can increase verbosity by adding `?verbose=1` to the `dev/tasks` URL; CLI will return verbose output by default. - -It is often a good idea to run a configure, followed by a reindex, after a code change - for example, after a deployment. 
- -## Solr configure - -`dev/tasks/Solr_Configure` - -This task will upload configuration to the Solr core, reloading it or creating it as necessary. This should be run after every code change to your indexes, or configuration changes. - -## Solr reindex - -`dev/tasks/Solr_Reindex` - -This task performs a reindex, which adds all the data specified in the index definition into the index store. - -If you have the [Queued Jobs module](https://github.com/symbiote/silverstripe-queuedjobs/) installed, then this task will create multiple reindex jobs that are processed asynchronously; unless you are in `dev` mode, in which case the index will be processed immediately (see [processor.yml](/_config/processor.yml)). Otherwise, it will run in one process. Often, if you are running it via the web, the request will time out. Usually this means the actually process is still running in the background, but it can be alarming to the user, so bear that in mind. diff --git a/docs/en/03_configuration/34_file_based_configuration.md b/docs/en/03_configuration/34_file_based_configuration.md deleted file mode 100644 index e69de29..0000000 diff --git a/docs/en/03_configuration/35_handling_results.md b/docs/en/03_configuration/35_handling_results.md deleted file mode 100644 index cd04194..0000000 --- a/docs/en/03_configuration/35_handling_results.md +++ /dev/null @@ -1,47 +0,0 @@ -# Handling results - -In order to render search results, you need to return them from a controller. You can also drive this through a form response through standard SilverStripe forms. In this case we simply assume there's a GET parameter named `q` with a search term present. 
- -```php -use SilverStripe\CMS\Controllers\ContentController; -use SilverStripe\Control\HTTPRequest; -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use My\Namespace\Index\MyIndex; - -class PageController extends ContentController -{ - private static $allowed_actions = [ - 'search', - ]; - - public function search(HTTPRequest $request) - { - $query = SearchQuery::create()->addSearchTerm($request->getVar('q')); - return $this->renderWith([ - 'SearchResult' => singleton(MyIndex::class)->search($query) - ]); - } -} -``` - -In your template (e.g. `Page_results.ss`) you can access the results and loop through them. They're stored in the `$Matches` property of the search return object. - -```ss -<% if $SearchResult.Matches %> -

-    <h2>Results for "{$Query}"</h2>
-    <p>Displaying Page $SearchResult.Matches.CurrentPage of $SearchResult.Matches.TotalPages</p>
-    <ol>
-        <% loop $SearchResult.Matches %>
-            <li>
-                <h3>$Title</h3>
-                <p><% if $Abstract %>$Abstract.XML<% else %>$Content.ContextSummary<% end_if %></p>
-            </li>
-        <% end_loop %>
-    </ol>
-<% else %>
-    <p>Sorry, your search query did not return any results.</p>
-<% end_if %> -``` - -Please check the [pagination guide](https://docs.silverstripe.org/en/4/developer_guides/templates/how_tos/pagination/) -in the main SilverStripe documentation to learn how to paginate through search results. diff --git a/docs/en/04_advanced_configuration.md b/docs/en/04_advanced_configuration.md new file mode 100644 index 0000000..eb74e51 --- /dev/null +++ b/docs/en/04_advanced_configuration.md @@ -0,0 +1,75 @@ +# Advanced configuration + +## Facets + +## Multiple indexes + +Multiple indexes can be created and searched independently, but if you wish to override an existing +index with another, you can use the `$hide_ancestor` config. + +```php +use SilverStripe\Assets\File; +use My\Namespace\Index\MyIndex; + +class MyReplacementIndex extends MyIndex +{ + private static $hide_ancestor = MyIndex::class; + + public function init() + { + parent::init(); + + $this->addClass(File::class); + $this->addFulltextField('Title'); + } +} +``` + +You can also filter all indexes globally to a set of pre-defined classes if you wish to +prevent any unknown indexes from being automatically included. + +```yaml +SilverStripe\FullTextSearch\Search\FullTextSearch: + indexes: + - MyReplacementIndex + - CoreSearchIndex +``` + +## Synonyms + +## Spell check + +## Boosting/Weighting + + Results aren't all created equal. Matches in some fields are more important + than others; for example, a page `Title` might be considered more relevant to the user than terms in the `Content` field. + + To account for this, a "weighting" (or "boosting") factor can be applied to each searched field. The default value is `1.0`, anything below that will decrease the relevance, anything above increases it. 
+ + To adjust the relative values, pass them in as the third argument to your `addSearchTerm()` call: + + ```php + use My\Namespace\Index\MyIndex; + use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + use Page; + + $query = SearchQuery::create() + ->addSearchTerm( + 'fire', + null, // don't limit the classes to search + [ + Page::class . '_Title' => 1.5, + Page::class . '_Content' => 1.0, + Page::class . '_SecretParagraph' => 0.1, + ] + ); + $results = singleton(MyIndex::class)->search($query); + ``` + + This will ensure that `Title` is given higher priority for matches than `Content`, which is well above `SecretParagraph`. + +## Indexing related objects + +## Subsites + +## Adding new fields diff --git a/docs/en/04_advanced_configuration/40_facets.md b/docs/en/04_advanced_configuration/40_facets.md deleted file mode 100644 index e69de29..0000000 diff --git a/docs/en/04_advanced_configuration/41_multiple_indexes.md b/docs/en/04_advanced_configuration/41_multiple_indexes.md deleted file mode 100644 index da0c1bd..0000000 --- a/docs/en/04_advanced_configuration/41_multiple_indexes.md +++ /dev/null @@ -1,32 +0,0 @@ -# Using multiple indexes - -Multiple indexes can be created and searched independently, but if you wish to override an existing -index with another, you can use the `$hide_ancestor` config. - -```php -use SilverStripe\Assets\File; -use My\Namespace\Index\MyIndex; - -class MyReplacementIndex extends MyIndex -{ - private static $hide_ancestor = MyIndex::class; - - public function init() - { - parent::init(); - - $this->addClass(File::class); - $this->addFulltextField('Title'); - } -} -``` - -You can also filter all indexes globally to a set of pre-defined classes if you wish to -prevent any unknown indexes from being automatically included. 
- -```yaml -SilverStripe\FullTextSearch\Search\FullTextSearch: - indexes: - - MyReplacementIndex - - CoreSearchIndex -``` diff --git a/docs/en/04_advanced_configuration/42_synonyms.md b/docs/en/04_advanced_configuration/42_synonyms.md deleted file mode 100644 index e69de29..0000000 diff --git a/docs/en/04_advanced_configuration/43_spell_check.md b/docs/en/04_advanced_configuration/43_spell_check.md deleted file mode 100644 index e69de29..0000000 diff --git a/docs/en/04_advanced_configuration/44_boosting.md b/docs/en/04_advanced_configuration/44_boosting.md deleted file mode 100644 index 6fb6935..0000000 --- a/docs/en/04_advanced_configuration/44_boosting.md +++ /dev/null @@ -1,28 +0,0 @@ -# Boosting/Weighting - -Results aren't all created equal. Matches in some fields are more important -than others; for example, a page `Title` might be considered more relevant to the user than terms in the `Content` field. - -To account for this, a "weighting" (or "boosting") factor can be applied to each searched field. The default value is `1.0`, anything below that will decrease the relevance, anything above increases it. - -To adjust the relative values, pass them in as the third argument to your `addSearchTerm()` call: - -```php -use My\Namespace\Index\MyIndex; -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use Page; - -$query = SearchQuery::create() - ->addSearchTerm( - 'fire', - null, // don't limit the classes to search - [ - Page::class . '_Title' => 1.5, - Page::class . '_Content' => 1.0, - Page::class . '_SecretParagraph' => 0.1, - ] - ); -$results = singleton(MyIndex::class)->search($query); -``` - -This will ensure that `Title` is given higher priority for matches than `Content`, which is well above `SecretParagraph`. 
diff --git a/docs/en/04_advanced_configuration/45_indexing_related_objects.md b/docs/en/04_advanced_configuration/45_indexing_related_objects.md deleted file mode 100644 index e69de29..0000000 diff --git a/docs/en/04_advanced_configuration/46_subsites.md b/docs/en/04_advanced_configuration/46_subsites.md deleted file mode 100644 index e69de29..0000000 diff --git a/docs/en/04_advanced_configuration/47_adding_new_fields.md b/docs/en/04_advanced_configuration/47_adding_new_fields.md deleted file mode 100644 index e69de29..0000000 diff --git a/docs/en/05_troubleshooting.md b/docs/en/05_troubleshooting.md index 9edff12..6972d3d 100644 --- a/docs/en/05_troubleshooting.md +++ b/docs/en/05_troubleshooting.md @@ -1,112 +1,3 @@ # Troubleshooting -## Common Gotchas - -Oooh I gotcha - - -Here's some whitespace - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -## Test anchor - -I'm on a boat +## Common gotchas diff --git a/docs/en/05_troubleshooting/50_common_gotchas.md b/docs/en/05_troubleshooting/50_common_gotchas.md deleted file mode 100644 index e69de29..0000000 diff --git a/docs/en/Solr.md b/docs/en/Solr.md index e57a917..dd73f0d 100644 --- a/docs/en/Solr.md +++ b/docs/en/Solr.md @@ -1,90 +1,9 @@ # Solr connector for SilverStripe fulltextsearch module -## Introduction - -The fulltextsearch module includes support for connecting to Solr. - -It works with Solr in multi-core mode. It needs to be able to update Solr configuration files, and has modes for -doing this by direct file access (when Solr shares a server with SilverStripe) and by WebDAV (when it's on a different -server). - -See the helpful [Solr Tutorial](http://lucene.apache.org/solr/4_5_1/tutorial.html), for more on cores -and querying. - -## Requirements - -Since Solr is Java based, it requires Java 1.5 or greater installed. 
- -When you're installing it yourself, it also requires a servlet container such as Tomcat, Jetty, or Resin. For -development testing there is a standalone version that comes bundled with Jetty (see below). - -See the official [Solr installation docs](http://wiki.apache.org/solr/SolrInstall) for more information. - -Note that these requirements are for the Solr server environment, which doesn't have to be the same physical machine -as the SilverStripe webhost. - -## Installation (Local) - -### Get the Solr server - -``` -composer require silverstripe/fulltextsearch-localsolr -``` - -### Start the server (via CLI, in a separate terminal window or background process) - -``` -cd fulltextsearch-localsolr/server/ -java -jar start.jar -``` - -### Configure the fulltextsearch Solr component to use the local server - -Configure Solr in file mode. The 'path' directory has to be writeable -by the user the Solr search server is started with (see below). - -```php -// File: mysite/_config.php: -use SilverStripe\FullTextSearch\Solr\Solr; - -Solr::configure_server([ - 'host' => 'localhost', - 'indexstore' => [ - 'mode' => 'file', - 'path' => BASE_PATH . 
'/.solr' - ] -]); -``` All possible parameters incl optional ones with example values: -```php -// File: mysite/_config.php: -use SilverStripe\FullTextSearch\Solr\Solr; -Solr::configure_server([ - 'host' => 'localhost', // default: localhost | The host or IP Solr is listening on - 'port' => '8983', // default: 8983 | The port Solr is listening on - 'path' => '/solr', // default: /solr | The suburl the solr service is available on - 'version' => '4', // default: 4 | Solr server version - currently only 3 and 4 supported - 'service' => 'Solr4Service', // default: depends on version, Solr3Service for 3, Solr4Service for 4 | the class that provides actual communcation to the Solr server - 'extraspath' => BASE_PATH .'/fulltextsearch/conf/solr/4/extras/', // default: /fulltextsearch/conf/solr/{version}/extras/ | Absolute path to the folder containing templates which are used for generating the schema and field definitions. - 'templates' => BASE_PATH . '/fulltextsearch/conf/solr/4/templates/', // default: /fulltextsearch/conf/solr/{version}/templates/ | Absolute path to the configuration default files, e.g. solrconfig.xml - 'indexstore' => [ - 'mode' => 'file', // a classname which implements SolrConfigStore, or 'file' or 'webdav' - 'path' => BASE_PATH . '/.solr', // The (locally accessible) path to write the index configurations to OR The suburl on the solr host that is set up to accept index configurations via webdav - 'remotepath' => '/opt/solr/config', // default (file mode only): same as 'path' above | The path that the Solr server will read the index configurations from - 'auth' => 'solr:solr', // default: none | Webdav only - A username:password pair string to use to auth against the webdav server - 'port' => '80' // default: same as solr port | The port for WebDAV if different from the Solr port - ] -]); -``` - -Note: We recommend to put the `indexstore.path` directory outside of the webroot. 
-If you place it inside of the webroot (as shown in the example), -please ensure its contents are not accessible through the webserver. -This can be achieved by server configuration, or (in most configurations) -also by marking the folder as hidden via a "dot" prefix. ## Configuration From fa6a412d72c0b22e6f46252f10cc06d06c57b3c2 Mon Sep 17 00:00:00 2001 From: Andrew Aitken-Fincham Date: Tue, 29 May 2018 16:51:47 +0100 Subject: [PATCH 5/8] add quickstart script enable FTS automatically in quickstart add a little idempotency to quickstart script find _config.php better configure and reindex as well add default index to quickstart add configure_server to quickstart --- bin/fts_quickstart | 88 +++++++++++++++++++++++++++++++++++ docs/en/01_getting_started.md | 16 +++++++ 2 files changed, 104 insertions(+) create mode 100644 bin/fts_quickstart diff --git a/bin/fts_quickstart b/bin/fts_quickstart new file mode 100644 index 0000000..376cc77 --- /dev/null +++ b/bin/fts_quickstart @@ -0,0 +1,88 @@ +#!/usr/bin/env bash + +echo "Installing Java SDK..." +# Find our package manager +if VERB="$( command -v apt-get )" 2> /dev/null; then + echo "Debian-based OS detected" + sudo apt-get install -y openjdk-8-jdk 2> /dev/null +elif VERB="$( command -v yum )" 2> /dev/null; then + echo "Modern Red Hat-based OS detected" + sudo yum install -y java-1.8.0-openjdk.x86_64 2> /dev/null +else + echo "No valid package manager detected; try one of apt-get, yum." + exit 1 +fi + +if [ ! -d "/opt/solr" ]; then + printf "Installing Solr 4" + # Acquire and unzip solr4 + wget http://archive.apache.org/dist/lucene/solr/4.10.4/solr-4.10.4.tgz 2> /dev/null && printf "." + tar -xf solr-4.10.4.tgz 2> /dev/null && printf "." + rm solr-4.10.4.tgz 2> /dev/null && printf "." + + # Set the defaults in /opt/solr + sudo mv solr-4.10.4 /opt/solr 2> /dev/null && printf "." + mv /opt/solr/example /opt/solr/core 2> /dev/null && echo "." +fi + +if [ ! -f "/etc/init.d/solr" ]; then + echo "Installing Solr daemon..." 
+ # Set up the daemon so that solr will run on startup + sudo cp vendor/silverstripe/fulltextsearch/docs/examples/daemon_script /etc/init.d/solr 2> /dev/null + sudo chmod +x /etc/init.d/solr 2> /dev/null + sudo chkconfig --add solr 2> /dev/null +fi + +# Get solr running +sudo /etc/init.d/solr start 2> /dev/null + +# Determine application dir +if [ -d app ]; then + APPDIR="app" +elif [ -d mysite ]; then + APPDIR="mysite" +else + echo "Can't detect application dir - skipping default index creating" + exit 1 +fi + +# Check to see if it has been enabled in _config.php +grep -i "FulltextSearchable::enable(" "$APPDIR/_config.php" 2> /dev/null +if [ "$?" != 0 ]; then + echo "Enabling FulltextSearchable in _config.php..." + if [ ! -f "$APPDIR/_config.php" ]; then + echo "<?php" > "$APPDIR/_config.php" + echo "" >> "$APPDIR/_config.php" + fi + echo "" >> "$APPDIR/_config.php" + echo "# Enable Fulltextsearch" >> "$APPDIR/_config.php" + echo "\\SilverStripe\\ORM\\Search\\FulltextSearchable::enable();" >> "$APPDIR/_config.php" >> "$APPDIR/_config.php" + echo "\\SilverStripe\\FullTextSearch\\Solr\\Solr::configure_server([" >> "$APPDIR/_config.php" + echo " 'indexstore' => [" >> "$APPDIR/_config.php" + echo " 'mode' => 'file'," >> "$APPDIR/_config.php" + echo " 'path' => BASE_PATH . '/.solr'" >> "$APPDIR/_config.php" + echo " ]" >> "$APPDIR/_config.php" + echo "]);" >> "$APPDIR/_config.php" +fi + +# Determine code dir +if [ -d "$APPDIR/src" ]; then + CODEDIR="$APPDIR/src" +elif [ -d "$APPDIR/code" ]; then + CODEDIR="$APPDIR/code" +else + echo "Can't detect code dir - skipping default index creating" + exit 1 +fi + +# Create a default index +if [ ! -f "$CODEDIR/FulltextSearch/DefaultIndex.php" ]; then + echo "Creating default index..." 
+ mkdir -p "$CODEDIR/FulltextSearch" + cp vendor/silverstripe/fulltextsearch/docs/examples/default_index.php.example "$CODEDIR/FulltextSearch/DefaultIndex.php" +fi + +vendor/bin/sake dev/tasks/Solr_Configure +vendor/bin/sake dev/tasks/Solr_Reindex + +echo "Quickstart complete!" diff --git a/docs/en/01_getting_started.md b/docs/en/01_getting_started.md index 0ae68ee..26e6fe6 100644 --- a/docs/en/01_getting_started.md +++ b/docs/en/01_getting_started.md @@ -35,3 +35,19 @@ offers its own extensions, and there is some behaviour (such as getting the full and running) that each connector deals with itself, in a way best suited to that search engine's design. ## Quick start + +If you are running on a Linux-based system, you can get up and running quickly with the quickstart script, like so: + +```bash +composer require silverstripe/fulltextsearch && vendor/bin/fts_quickstart +``` + +This will: + +- Install the required Java SDK (using `apt-get` or `yum`) +- Install Solr 4 +- Set up a daemon to run Solr on startup +- Start Solr +- Enable `FulltextSearchable` in your `_config.php` (and create one if you don't have one) + +The simply adding `$SearchForm` to a template and flushing the template cache should add a search text box to your site. 
From e08731d1f1b783f8202f283d85bdb4d5d4d8b720 Mon Sep 17 00:00:00 2001 From: Andrew Aitken-Fincham Date: Mon, 4 Jun 2018 16:55:47 +0100 Subject: [PATCH 6/8] solr admin url fix dead links and update quickstart solr admin expansion spellcheck docs custom types custom field types highlighting facets --- _config/config.yml | 3 - bin/fts_quickstart | 7 +- docs/en/00_index.md | 6 +- docs/en/01_getting_started.md | 22 +- docs/en/02_setup.md | 50 ++- docs/en/03_configuration.md | 39 ++- docs/en/04_advanced_configuration.md | 281 ++++++++++++++++- docs/en/Solr.md | 455 --------------------------- 8 files changed, 364 insertions(+), 499 deletions(-) diff --git a/_config/config.yml b/_config/config.yml index 11eea88..50ea2b3 100644 --- a/_config/config.yml +++ b/_config/config.yml @@ -4,6 +4,3 @@ Name: fulltextsearchconfig SilverStripe\ORM\DataObject: extensions: - SilverStripe\FullTextSearch\Search\Extensions\SearchUpdater_ObjectHandler -SilverStripe\CMS\Controllers\ContentController: - extensions: - - SilverStripe\FullTextSearch\Solr\Control\ContentControllerExtension diff --git a/bin/fts_quickstart b/bin/fts_quickstart index 376cc77..b179d2d 100644 --- a/bin/fts_quickstart +++ b/bin/fts_quickstart @@ -46,17 +46,16 @@ else exit 1 fi -# Check to see if it has been enabled in _config.php -grep -i "FulltextSearchable::enable(" "$APPDIR/_config.php" 2> /dev/null +# Check to see if it has been configured in _config.php +grep -i "Solr::configure_server(" "$APPDIR/_config.php" 2> /dev/null if [ "$?" != 0 ]; then - echo "Enabling FulltextSearchable in _config.php..." + echo "Configuring Solr in _config.php..." if [ ! 
-f "$APPDIR/_config.php" ]; then echo "<?php" > "$APPDIR/_config.php" echo "" >> "$APPDIR/_config.php" fi echo "" >> "$APPDIR/_config.php" echo "# Enable Fulltextsearch" >> "$APPDIR/_config.php" - echo "\\SilverStripe\\ORM\\Search\\FulltextSearchable::enable();" >> "$APPDIR/_config.php" >> "$APPDIR/_config.php" echo "\\SilverStripe\\FullTextSearch\\Solr\\Solr::configure_server([" >> "$APPDIR/_config.php" echo " 'indexstore' => [" >> "$APPDIR/_config.php" echo " 'mode' => 'file'," >> "$APPDIR/_config.php" diff --git a/docs/en/00_index.md b/docs/en/00_index.md index ed9d14b..c81efc8 100644 --- a/docs/en/00_index.md +++ b/docs/en/00_index.md @@ -6,7 +6,6 @@ - Setup - [Requirements](02_setup.md#requirements) - [Installing Solr](02_setup.md#installing-solr) - - [Installing this module](02_setup.md#installing-the-module) - [Solr admin](02_setup.md#solr-admin) - Configuration - [Solr server parameters](03_configuration.md#solr-server-parameters) @@ -20,10 +19,11 @@ - [Facets](04_advanced_configuration.md#facets) - [Using multiple indexes](04_advanced_configuration.md#multiple-indexes) - [Synonyms](04_advanced_configuration.md#synonyms) - - [Spellcheck](04_advanced_configuration.md#spell-check) + - [Spellcheck](04_advanced_configuration.md#spell-check-("did-you-mean...")) + - [Highlighting](04_advanced_configuration.md#highlighting) - [Boosting](04_advanced_configuration.md#boosting) - [Indexing related objects](04_advanced_configuration.md#indexing-related-objects) - [Subsites](04_advanced_configuration.md#subsites) - - [Adding new fields](04_advanced_configuration.md#adding-new-fields) + - [Custom field types](04_advanced_configuration.md#custom-field-types) - Troubleshooting - [Gotchas](05_troubleshooting.md#common-gotchas) diff --git a/docs/en/01_getting_started.md b/docs/en/01_getting_started.md index 26e6fe6..eb05674 100644 --- a/docs/en/01_getting_started.md +++ b/docs/en/01_getting_started.md @@ -22,17 +22,17 @@ fulltext searching as an extension of the object model. 
However, the disconnect design and the object model meant that searching was inefficient. The abstraction would also often break and it was hard to then figure out what was going on. -This module instead provides the ability to define those indexes and queries in PHP. The indexes are defined as a mapping -between the SilverStripe object model and the connector-specific fulltext engine index model. This module then interrogates model metadata -to build the specific index definition. +This module instead provides the ability to define those indexes and queries in PHP. The indexes are defined as a +mapping between the SilverStripe object model and the connector-specific fulltext engine index model. This module then +interrogates model metadata to build the specific index definition. -It also hooks into SilverStripe framework in order to update the indexes when the models change and connectors then convert those index and query definitions -into fulltext engine specific code. +It also hooks into SilverStripe framework in order to update the indexes when the models change and connectors then +convert those index and query definitions into fulltext engine specific code. The intent of this module is not to make changing fulltext search engines seamless. Where possible this module provides common interfaces to fulltext engine functionality, abstracting out common behaviour. However, each connector also -offers its own extensions, and there is some behaviour (such as getting the fulltext search engines installed, configured -and running) that each connector deals with itself, in a way best suited to that search engine's design. +offers its own extensions, and there is some behaviour (such as getting the fulltext search engines installed, +configured and running) that each connector deals with itself, in a way best suited to that search engine's design. 
## Quick start @@ -48,6 +48,10 @@ This will: - Install Solr 4 - Set up a daemon to run Solr on startup - Start Solr -- Enable `FulltextSearchable` in your `_config.php` (and create one if you don't have one) +- Configure Solr in your `_config.php` (and create one if you don't have one) +- Create a DefaultIndex +- Run a [Solr Configure](03_configuration.md#solr-configure) and a [Solr Reindex](03_configuration.md#solr-reindex) -The simply adding `$SearchForm` to a template and flushing the template cache should add a search text box to your site. +You'll then need to build a search form and results display that suits the functionality of your site. + +// TODO update me when https://github.com/silverstripe/silverstripe-fulltextsearch/pull/216 is merged diff --git a/docs/en/02_setup.md b/docs/en/02_setup.md index 8b16bdf..49222d0 100644 --- a/docs/en/02_setup.md +++ b/docs/en/02_setup.md @@ -1,28 +1,34 @@ # Setup -The fulltextsearch module includes support for connecting to Solr. +The FulltextSearch module includes support for connecting to Solr. -It works with Solr in multi-core mode. It needs to be able to update Solr configuration files, and has modes for doing this by direct file access (when Solr shares a server with SilverStripe) and by WebDAV (when it's on a different server). +It works with Solr in multi-core mode. It needs to be able to update Solr configuration files, and has modes for doing +so by direct file access (when Solr shares a server with SilverStripe) and by WebDAV (when it's on a different server). -See the helpful [Solr Tutorial](http://lucene.apache.org/solr/4_5_1/tutorial.html), for more on cores -and querying. +See the helpful [Solr Tutorial](http://lucene.apache.org/solr/4_5_1/tutorial.html), for more on cores and querying. ## Requirements Since Solr is Java based, it requires Java 1.5 or greater installed. When you're installing it yourself, it also requires a servlet container such as Tomcat, Jetty, or Resin. 
For -development testing there is a standalone version that comes bundled with Jetty (see [Installing Solr](#installing-solr) below). +development testing there is a standalone version that comes bundled with Jetty (see [Installing Solr](#installing-solr) + below). See the official [Solr installation docs](http://wiki.apache.org/solr/SolrInstall) for more information. -Note that these requirements are for the Solr server environment, which doesn't have to be the same physical machine as the SilverStripe webhost. +Note that these requirements are for the Solr server environment, which doesn't have to be the same physical machine as +the SilverStripe webhost. ## Installing Solr ### Local installation -If you'll be running Solr on the same machine as your SilverStripe installation, you can use the [silverstripe/fulltextsearch-localsolr module](https://github.com/silverstripe-archive/silverstripe-fulltextsearch-localsolr). This can also be useful as a development dependency. You can bring it in via composer (use `require-dev` if you plan to use install Solr remotely in Production): +If you'll be running Solr on the same machine as your SilverStripe installation, and the +[quick start script](01_getting_started.md#quick-start) doesn't suit your needs, you can use the +[fulltextsearch-localsolr module](https://github.com/silverstripe-archive/silverstripe-fulltextsearch-localsolr). This +can also be useful as a development dependency. 
You can bring it in via composer (use `require-dev` if you plan to +install Solr remotely in Production): ```bash composer require silverstripe/fulltextsearch-localsolr @@ -35,7 +41,8 @@ cd fulltextsearch-localsolr/server java -jar start.jar ``` -Then configure Solr to use `file` more with the following configuration in your `app/_config.php`, making sure that the `path` directory is writeable by the user that started the server (above): +Then configure the module to use `file` mode with the following configuration in your `app/_config.php`, making sure +that the `path` directory is writeable by the user that started the server (above): ```php use SilverStripe\FullTextSearch\Solr\Solr; @@ -51,10 +58,35 @@ Solr::configure_server([ ### Remote installation +Alternatively, it can be beneficial to keep the Solr service contained on its own infrastructure, for performance and +security reasons. The [Common Web Platform (CWP)](https://www.cwp.govt.nz) uses Solr in this manner. To do so, you should +install the dependencies on the remote server, and then configure the module to use the `webdav` mode like so: +```php +use SilverStripe\FullTextSearch\Solr\Solr; -## Installing the module +Solr::configure_server([ + 'host' => 'remotesolrserver.com', // IP address or hostname + 'indexstore' => [ + 'mode' => 'webdav', + 'path' => BASE_PATH . '/webdav', + ] +]); +``` +Check all the available [configuration options](03_configuration.md#solr-server-parameters) to fine-tune the module to +work with your desired setup. +This will mean that all configuration files, and the indexes themselves, are stored remotely. ## Solr admin + +Solr provides an administration interface with a GUI to allow you to get at the finer details of your cores and +configuration. You can access it at `example.com:<SOLR_PORT>/<SOLR_PATH>/#/` on a local installation +(usually example.com:8983/solr/#/). + +There you can access logging, run raw queries against your stored indexes, and get some basic performance metrics. 
+Additionally, you can perform more drastic changes, such as dropping and reloading cores. + +For a comprehensive look at the Solr admin interface, read the +[user guide for Solr 4.10](http://archive.apache.org/dist/lucene/solr/ref-guide/apache-solr-ref-guide-4.10.pdf#page=17) diff --git a/docs/en/03_configuration.md b/docs/en/03_configuration.md index 0330b0e..718274f 100644 --- a/docs/en/03_configuration.md +++ b/docs/en/03_configuration.md @@ -79,15 +79,15 @@ class MyIndex extends SolrIndex You can also skip listing all searchable fields, and have the index figure it out automatically via `addAllFulltextFields()`. This will add any database fields that are `instanceof DBString` to the index. Use this with caution, however, as you may inadvertently return sensitive information - it is often safer to declare your fields explicitly. -Once you've added this file, make sure you run a [Solr configure](#dev_tasks) to set up your new index. +Once you've added this file, make sure you run a [Solr configure](#solr-configure) to set up your new index. ## Adding data to an index -Once you have [created your index](./30_creating_an_index.md), you can add data to it in a number of ways. +Once you have [created your index](#creating-an-index), you can add data to it in a number of ways. ### Reindex the site -Running the [Solr reindex task](./33_dev_tasks.md) will crawl your site for classes that match those defined on your index, and add the defined fields to the index for searching. This is the most common method used to build the index the first time, or to perform a full rebuild of the index. +Running the [Solr reindex task](#solr-reindex) will crawl your site for classes that match those defined on your index, and add the defined fields to the index for searching. This is the most common method used to build the index the first time, or to perform a full rebuild of the index. 
### Publish a page in the CMS @@ -177,7 +177,7 @@ $query = SearchQuery::create() ->addSearchTerm('fire'); ``` -You can also limit this to specific fields by passing an array as the second argument: +You can also limit this to specific fields by passing an array as the second argument, specified in the form of `{table}_{field}`: ```php use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; @@ -226,7 +226,7 @@ $query = SearchQuery::create() ->addSearchTerm('fire') // Only include documents edited in 2011 or earlier ->addFilter(Page::class . '_LastEdited', SearchQuery_Range::create(null, '2011-12-31T23:59:59Z')); -$results = singleton(MyIndex::class)->search($query); +$results = MyIndex::singleton()->search($query); ``` Note: At the moment, the date format is specific to the search implementation. @@ -247,7 +247,7 @@ $query = SearchQuery::create() ->addSearchTerm('fire'); // Needs a value, although it can be false ->addFilter(Page::class . '_ShowInMenus', SearchQuery::$present); -$results = singleton(MyIndex::class)->search($query); +$results = MyIndex::singleton()->search($query); ``` ### Querying an index @@ -259,7 +259,7 @@ use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; use My\Namespace\Index\MyIndex; $query = SearchQuery::create()->addSearchTerm('fire'); -$results = singleton(MyIndex::class)->search($query); +$results = MyIndex::singleton()->search($query); ``` The return value of a `search()` call is an object which contains a few properties: @@ -279,7 +279,13 @@ It is often a good idea to run a configure, followed by a reindex, after a code `dev/tasks/Solr_Configure` -This task will upload configuration to the Solr core, reloading it or creating it as necessary. This should be run after every code change to your indexes, or configuration changes. +This task will upload configuration to the Solr core, reloading it or creating it as necessary, and generate the schema. 
This should be run after every code change to your indexes, or after any configuration changes. This will convert the PHP-based abstraction layer into actual Solr XML. Assuming default configuration and the use of the `DefaultIndex`, it will: + +- create the directory `BASE_PATH/.solr/DefaultIndex/` if it doesn't already exist +- copy configuration files from `vendor/silverstripe/fulltextsearch/conf/extras` to `BASE_PATH/.solr/DefaultIndex/conf/` +- generate a `schema.xml` in `BASE_PATH/.solr/DefaultIndex/conf/` + +This task will overwrite these files every time it is run. ### Solr reindex @@ -289,8 +295,23 @@ This task performs a reindex, which adds all the data specified in the index def If you have the [Queued Jobs module](https://github.com/symbiote/silverstripe-queuedjobs/) installed, then this task will create multiple reindex jobs that are processed asynchronously; unless you are in `dev` mode, in which case the index will be processed immediately (see [processor.yml](/_config/processor.yml)). Otherwise, it will run in one process. Often, if you are running it via the web, the request will time out. Usually this means the actually process is still running in the background, but it can be alarming to the user, so bear that in mind. +Internally, records are processed in groups of 200. You can configure this group size with the `Solr_Reindex.recordsPerRequest` config: + +```yaml +SilverStripe\FullTextSearch\Solr\Tasks\Solr_Reindex: + recordsPerRequest: 150 +``` + +The Solr indexes will be stored as binary files inside your SilverStripe project. You can also copy the `thirdparty/` Solr directory somewhere else; just set the `path` value in `mysite/_config.php` to point to the new location. + ## File-based configuration +Many aspects of Solr are configured outside of the `schema.xml` file which SilverStripe generates based on the `SolrIndex` subclass that is defined. 
For example, stopwords are placed in their own `stopwords.txt` file, and advanced [spellchecking](04_advanced_configuration.md#spell-check-("did-you-mean...")) can be configured in `solrconfig.xml`. + +By default, these files are copied from the `fulltextsearch/conf/extras/` directory over to the new index location. In order to use your own files, copy these files into a location of your choosing (for example `mysite/data/solr/`), and tell Solr to use this folder with the `extraspath` [configuration setting](#solr-server-parameters). Run a [`Solr_Configure`](#solr-configure) to apply these changes. + +You can also define these on an index-by-index basis by defining `SolrIndex->getExtrasPath()`. + ## Handling results In order to render search results, you need to return them from a controller. You can also drive this through a form response through standard SilverStripe forms. In this case we simply assume there's a GET parameter named `q` with a search term present. @@ -311,7 +332,7 @@ class PageController extends ContentController { $query = SearchQuery::create()->addSearchTerm($request->getVar('q')); return $this->renderWith([ - 'SearchResult' => singleton(MyIndex::class)->search($query) + 'SearchResult' => MyIndex::singleton()->search($query) ]); } } diff --git a/docs/en/04_advanced_configuration.md b/docs/en/04_advanced_configuration.md index eb74e51..a0cd754 100644 --- a/docs/en/04_advanced_configuration.md +++ b/docs/en/04_advanced_configuration.md @@ -2,6 +2,85 @@ ## Facets +Inside the `SolrIndex->search()` function, the third-party library solr-php-client is used to send data to Solr and parse the response. Additional information can be pulled from this response and added to your results object for use in templates using the `updateSearchResults()` extension hook. 
+ +```php +use My\Namespace\Index\MyIndex; +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$index = MyIndex::singleton(); +$query = SearchQuery::create() + ->addSearchTerm('My Term'); +$params = [ + 'facet' => 'true', + 'facet.field' => 'SiteTree_ClassName', +]; +$results = $index->search($query, -1, -1, $params); +``` + +By adding facet fields into the query parameters, our response object from Solr now contains some additional information that we can add into the results sent to the page. + +```php +namespace My\Namespace\Extension; + +use SilverStripe\Core\Extension; +use SilverStripe\View\ArrayData; +use SilverStripe\ORM\ArrayList; + +class FacetedResultsExtension extends Extension +{ + /** + * Adds extra information from the solr-php-client response + * into our search results. + * @param ArrayData $results The ArrayData that will be used to generate search + * results pages. + * @param stdClass $response The solr-php-client response object. + */ + public function updateSearchResults($results, $response) + { + if (!isset($response->facet_counts) || !isset($response->facet_counts->facet_fields)) { + return; + } + $facetCounts = ArrayList::create([]); + foreach($response->facet_counts->facet_fields as $name => $facets) { + $facetDetails = ArrayData::create([ + 'Name' => $name, + 'Facets' => ArrayList::create([]), + ]); + + foreach($facets as $facetName => $facetCount) { + $facetDetails->Facets->push(ArrayData::create([ + 'Name' => $facetName, + 'Count' => $facetCount, + ])); + } + $facetCounts->push($facetDetails); + } + $results->setField('FacetCounts', $facetCounts); + } +} +``` + +And then apply the extension to your index via `yaml`: + +```yaml +My\Namespace\Index\MyIndex: + extensions: + - My\Namespace\Extension\FacetedResultsExtension +``` + +We can now access the facet information inside our templates like so: + +```silverstripe +<% if $Results.FacetCounts %> + <% loop $Results.FacetCounts.Facets %> + <% loop $Facets %>

$Name: $Count

+ <% end_loop %> + <% end_loop %> +<% end_if %> +``` + ## Multiple indexes Multiple indexes can be created and searched independently, but if you wish to override an existing @@ -37,16 +116,119 @@ SilverStripe\FullTextSearch\Search\FullTextSearch: ## Synonyms -## Spell check +## Spell check ("Did you mean...") + +Solr has various spell checking strategies (see the ["SpellCheckComponent" docs](http://wiki.apache.org/solr/SpellCheckComponent)), all of which are configured through `solrconfig.xml`. +In the default config which is copied into your index, spell checking data is collected from all fulltext fields +(everything you added through `SolrIndex->addFulltextField()`). The values of these fields are collected in a special `_text` field. + +```php +use My\Namespace\Index\MyIndex; +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$index = MyIndex::singleton(); +$query = SearchQuery::create() + ->addSearchTerm('My Term'); +$params = [ + 'spellcheck' => 'true', + 'spellcheck.collate' => 'true', +]; +$results = $index->search($query, -1, -1, $params); +$results->spellcheck; +``` + +The built-in `_text` data is better than nothing, but also has some problems: it's heavily processed, for example by +stemming filters which butcher words. So misspelling "Govnernance" will suggest "govern" rather than "Governance". +This can be fixed by aggregating spell checking data in a separate field. + +```php +use SilverStripe\CMS\Model\SiteTree; +use SilverStripe\FullTextSearch\Solr\SolrIndex; + +class MyIndex extends SolrIndex +{ + public function init() + { + $this->addCopyField(SiteTree::class . '_Title', 'spellcheckData'); + $this->addCopyField(SomeModel::class . '_Title', 'spellcheckData'); + $this->addCopyField(SiteTree::class . '_Content', 'spellcheckData'); + $this->addCopyField(SomeModel::class . 
'_Content', 'spellcheckData'); + } + + public function getFieldDefinitions() + { + $xml = parent::getFieldDefinitions(); + + $xml .= "\n\n\t\t<!-- Additional custom fields for spell checking -->"; + $xml .= "\n\t\t<field name='spellcheckData' type='textSpellHtml' indexed='true' stored='false' multiValued='true' />"; + + return $xml; + } +} +``` + +Now you need to tell Solr to use our new field for gathering spelling data. In order to customise the spell checking configuration, +create your own `solrconfig.xml` (see [File-based configuration](03_configuration.md#file-based-configuration)). In there, change the following directive: + +```xml +<lst name="spellchecker"> + <str name="field">spellcheckData</str> +</lst> +``` + +Copy the new configuration via the [`Solr_Configure` task](03_configuration.md#solr-configure), and reindex your data before using the spell checker. + +## Highlighting + +Solr can highlight the searched terms in context of the matched content, to help users determine the relevancy of results (e.g. in which part of a sentence the term is used). In order to use this feature, the full content of the field to be highlighted needs to be stored in the index, +by declaring it through `addStoredField()`: + +```php +use SilverStripe\FullTextSearch\Solr\SolrIndex; + +class MyIndex extends SolrIndex +{ + public function init() + { + $this->addClass(Page::class); + $this->addAllFulltextFields(); + $this->addStoredField('Content'); + } +} +``` + +To search with highlighting enabled, you need to pass in a custom query parameter. +Many more parameters for tweaking results are detailed in the [Solr reference guide](https://archive.apache.org/dist/lucene/solr/ref-guide/apache-solr-ref-guide-4.10.pdf#page=270). + +```php +use My\Namespace\Index\MyIndex; +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$index = MyIndex::singleton(); +$query = SearchQuery::create() + ->addSearchTerm('My Term'); +$params = [ + 'hl' => 'true', +]; +$results = $index->search($query, -1, -1, $params); +``` + +Each result will automatically contain an `Excerpt` property which you can use in your own results template. 
The searched term is highlighted with an `<em>` tag by default. + +> Note: It is recommended to strip out all HTML tags and convert entities on the indexed content, +to avoid matching HTML attributes, and cluttering highlighted content with unparsed HTML. ## Boosting/Weighting - Results aren't all created equal. Matches in some fields are more important - than others; for example, a page `Title` might be considered more relevant to the user than terms in the `Content` field. + Results aren't all created equal. Matches in some fields are more important than others; for example, a page `Title` might be considered more relevant to the user than terms in the `Content` field. - To account for this, a "weighting" (or "boosting") factor can be applied to each searched field. The default value is `1.0`, anything below that will decrease the relevance, anything above increases it. + To account for this, a "weighting" (or "boosting") factor can be applied to each searched field. The default value is `1.0`, anything below that will decrease the relevance, anything above increases it. You can get more information on relevancy at the [Solr wiki](http://wiki.apache.org/solr/SolrRelevancyFAQ). - To adjust the relative values, pass them in as the third argument to your `addSearchTerm()` call: +You can manage the boosting in two ways: + +### Boosting on query + + To adjust the relative values at the time of querying, pass them in as the third argument to your `addSearchTerm()` call: ```php use My\Namespace\Index\MyIndex; @@ -63,13 +245,98 @@ SilverStripe\FullTextSearch\Search\FullTextSearch: Page::class . '_SecretParagraph' => 0.1, ] ); - $results = singleton(MyIndex::class)->search($query); + $results = MyIndex::singleton()->search($query); ``` This will ensure that `Title` is given higher priority for matches than `Content`, which is well above `SecretParagraph`. + +### Boosting on index + +Boost values for specific fields can also be specified on the `SolrIndex` class directly. 
+ +The following methods can be used to set one or more boosted fields: + +* `addBoostedField()` - adds a field with a specific boosted value (defaults to 2) +* `setFieldBoosting()` - if a field has already been added to an index, the boosting + value can be customised, changed, or reset for a single field. +* `addFulltextField()` A boost can be set for a field using the `$extraOptions` parameter +with the key `boost` assigned to the desired value: + +```php +use SilverStripe\CMS\Model\SiteTree; +use SilverStripe\FullTextSearch\Solr\SolrIndex; + +class SolrSearchIndex extends SolrIndex +{ + public function init() + { + $this->addClass(SiteTree::class); + + // The following methods would all add the same boost of 1.5 to "Title" + $this->addBoostedField('Title', null, [], 1.5); + + $this->addFulltextField('Title', null, [ + 'boost' => 1.5, + ]); + + $this->addFulltextField('Title'); + $this->setFieldBoosting(SiteTree::class . '_Title', 1.5); + } +} +``` ## Indexing related objects ## Subsites -## Adding new fields +## Custom field types + +Solr supports custom field type definitions which are written to its XML schema. Many standard ones are already included + in the default schema. As the XML file is generated dynamically, we can add our own types by overloading the template + responsible for it: `types.ss`. + +In the following example, we read our type definitions from a new file `mysite/solr/templates/types.ss` instead: + +```php +use SilverStripe\Control\Director; +use SilverStripe\FullTextSearch\Solr\SolrIndex; + +class MyIndex extends SolrIndex +{ + public function getTypes() + { + return $this->renderWith(Director::baseFolder() . '/mysite/solr/templates/types.ss'); + } +} +``` + +It's usually best to start with the existing definitions, and adjust from there. You can both add your own types and adjust the behaviour of existing definitions. 
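+
+As a rough illustration only (the type name `text_exact` is hypothetical and is not part of the module's default schema), an additional definition placed alongside the existing ones in an overloaded `types.ss` might look like this:
+
+```xml
+<!-- Hypothetical field type: the whole value is indexed as one lower-cased token -->
+<fieldType name="text_exact" class="solr.TextField" positionIncrementGap="100">
+    <analyzer>
+        <tokenizer class="solr.KeywordTokenizerFactory"/>
+        <filter class="solr.LowerCaseFilterFactory"/>
+    </analyzer>
+</fieldType>
+```
+
+A type like this only takes effect once a field is declared with it and a `Solr_Configure` run has regenerated the schema.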
+ +### Perform filtering on index + +For example, you can move synonym filtering from query time to index time. To do this, you'd take + +```xml + +``` + +from inside the `` block and move it to the `` block. This can be advantageous as Solr does a better job of processing synonyms at index time; however, changing the synonyms then requires a full reindex, which - depending on the size of your site - could be overkill. See [this article](https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/) for a good breakdown. + +### Searching for words containing numbers + +By default, the module is configured to split words containing numbers into multiple tokens. For example, the word "A1" would be interpreted as "A" "1", and since "a" is a common stopword, the term "A1" will be excluded from search. + +To allow searches on words containing numeric tokens, you'll need to change the behaviour of the `WordDelimiterFilterFactory` with an overloaded template as described above. Each instance of `` needs to include the following attributes and values: + +- add `splitOnNumerics="0"` on all `WordDelimiterFilterFactory` fields +- change `catenateNumbers="1"` to `catenateNumbers="0"` on all `WordDelimiterFilterFactory` fields + +### Searching for macrons and other Unicode characters + +The `ASCIIFoldingFilterFactory` filter converts alphabetic, numeric, and symbolic Unicode characters which are not in the Basic Latin Unicode block (the first 127 ASCII characters) to their ASCII equivalents, if one exists. + +Find the fields in your overloaded `types.ss` that you want to enable this behaviour in (for example, inside the `` block), and add the following to both the index analyzer and query analyzer records.
+ +```xml + +``` diff --git a/docs/en/Solr.md b/docs/en/Solr.md index dd73f0d..da082aa 100644 --- a/docs/en/Solr.md +++ b/docs/en/Solr.md @@ -1,458 +1,3 @@ -# Solr connector for SilverStripe fulltextsearch module - - -All possible parameters incl optional ones with example values: - - - -## Configuration - -### Create an index - -```php -// File: mysite/code/MyIndex.php: -use SilverStripe\FullTextSearch\Solr\SolrIndex; - -class MyIndex extends SolrIndex -{ - public function init() - { - $this->addClass(Page::class); - $this->addAllFulltextFields(); - } -} -``` - -### Create the index schema - -The PHP-based index definition is an abstraction layer for the actual Solr XML configuration. -In order to create or update it, you need to run the `Solr_Configure` task. - -``` -vendor/bin/sake dev/tasks/Solr_Configure -``` - -Based on the sample configuration above, this command will do the following: - -- Create a `/.solr/MyIndex` folder -- Copy configuration files from `vendor/silverstripe/fulltextsearch/conf/extras/` to `/.solr/MyIndex/conf` -- Generate a `schema.xml`, and place it it in `/.solr/MyIndex/conf` - -If you call the task with an existing index folder, -it will overwrite all files from their default locations, -regenerate the `schema.xml`, and ask Solr to reload the configuration. - -You can use the same command for updating an existing schema, -which will automatically apply without requiring a Solr server restart. - -### Reindex - -After configuring Solr, you have the option to add your existing -content to its indices. Run the following command: - -``` -vendor/bin/sake dev/tasks/Solr_Reindex -``` - -This will delete and rebuild all indices. Depending on your data, -this can take anywhere from minutes to hours. -Keep in mind that the normal mode of updating indices is -based on ORM manipulations of the underlying data. -For example, calling `$myPage->write()` will automatically -update the index entry for this record (and all its variants). 
- -This task has the following options: - -- `verbose`: Debug information - -Internally, depending on what job processing backend you have configured (such as queuedjobs) -individual tasks for re-indexing groups of records may either be performed behind the scenes -as crontasks, or via separate processes initiated by the current request. - -Internally groups of records are grouped into sizes of 200. You can configure this -group sizing by using the `Solr_Reindex.recordsPerRequest` config. - -```yaml -SilverStripe\FullTextSearch\Solr\Tasks\Solr_Reindex: - recordsPerRequest: 150 -``` - -Note: The Solr indexes will be stored as binary files inside your SilverStripe project. -You can also copy the `thirdparty/` solr directory somewhere else, -just set the `path` value in `mysite/_config.php` to point to the new location. - -You can also run the reindex task through a web request. -By default, the web request won't receive any feedback while its running. -Depending on your PHP and web server configuration, -the web request itself might time out, but the reindex continues anyway. -This is possible because the actual index operations are run as separate -PHP sub-processes inside the main web request. - -### File-based configuration (solrconfig.xml etc) - -Many aspects of Solr are configured outside of the `schema.xml` file -which SilverStripe generates based on the index PHP file. -For example, stopwords are placed in their own `stopwords.txt` file, -and spell checks are configured in `solrconfig.xml`. - -By default, these files are copied from the `fulltextsearch/conf/extras/` -directory over to the new index location. In order to use your own files, -copy these files into a location of your choosing (for example `mysite/data/solr/`), -and tell Solr to use this folder with the `extraspath` configuration setting. - -```php -// mysite/_config.php -use SilverStripe\Control\Director; -use SilverStripe\FullTextSearch\Solr\Solr; - -Solr::configure_server([ - // ... 
- 'extraspath' => Director::baseFolder() . '/mysite/data/solr/', -]); -``` - -Please run the `Solr_Configure` task for the changes to take effect. - -Note: You can also define those on an index-by-index basis by -implementing `SolrIndex->getExtrasPath()`. - -### Custom Types - -Solr supports custom field type definitions which are written to its XML schema. -Many standard ones are already included in the default schema. -As the XML file is generated dynamically, we can add our own types -by overloading the template responsible for it: `types.ss`. - -In the following example, we read out type definitions -from a new file `mysite/solr/templates/types.ss` instead: - -```php -use SilverStripe\Control\Director; -use SilverStripe\FullTextSearch\Solr\SolrIndex; - -class MyIndex extends SolrIndex -{ - public function getTypes() - { - return $this->renderWith(Director::baseFolder() . '/mysite/solr/templates/types.ss'); - } -} -``` - -#### Searching for words containing numbers - -By default, the fulltextmodule is configured to split words containing numbers into multiple tokens. For example, the word "A1" would be interpreted as "A" "1"; since "a" is a common stopword, the term "A1" will be excluded from search. - -To allow searches on words containing numeric tokens, you'll need to update your overloaded template to change the behaviour of the WordDelimiterFilterFactory. Each instance of `` needs to include the following attributes and values: - -* add splitOnNumerics="0" on all WordDelimiterFilterFactory fields -* change catenateOnNumbers="1" on all WordDelimiterFilterFactory fields - -Update your index to point to your overloaded template using the method described above. - -#### Searching for macrons and other Unicode characters - -The "ASCIIFoldingFilterFactory" filter converts alphabetic, numeric, and symbolic Unicode characters which are not in the Basic Latin Unicode block (the first 127 ASCII characters) to their ASCII equivalents, if one exists. 
- -Find the fields in your overloaded `types.ss` that you want to enable this behaviour in. EG: - -```xml - -``` - -Add the following to both its index analyzer and query analyzer records. - -```xml - -``` - -Update your index to point to your overloaded template using the method described above. - -### Spell Checking ("Did you mean...") - -Solr has various spell checking strategies (see the ["SpellCheckComponent" docs](http://wiki.apache.org/solr/SpellCheckComponent)), all of which are configured through `solrconfig.xml`. -In the default config which is copied into your index, -spell checking data is collected from all fulltext fields -(everything you added through `SolrIndex->addFulltextField()`). -The values of these fields are collected in a special `_text` field. - -```php -use SilverStripe\FullTextSearch\Search\Queries; - -$index = new MyIndex(); -$query = new SearchQuery(); -$query->addSearchTerm('My Term'); -$params = [ - 'spellcheck' => 'true', - 'spellcheck.collate' => 'true', -]; -$results = $index->search($query, -1, -1, $params); -$results->spellcheck; -``` - -The built-in `_text` data is better than nothing, but also has some problems: -Its heavily processed, for example by stemming filters which butcher words. -So misspelling "Govnernance" will suggest "govern" rather than "Governance". -This can be fixed by aggregating spell checking data in a separate - -```php -use SilverStripe\CMS\Model\SiteTree; -use SilverStripe\FullTextSearch\Solr\SolrIndex; - -class MyIndex extends SolrIndex -{ - public function init() - { - // ... - $this->addCopyField(SiteTree::class . '_Title', 'spellcheckData'); - $this->addCopyField(SomeModel::class . '_Title', 'spellcheckData'); - $this->addCopyField(SiteTree::class . '_Content', 'spellcheckData'); - $this->addCopyField(SomeModel::class . '_Content', 'spellcheckData'); - } - - // ... 
- public function getFieldDefinitions() - { - $xml = parent::getFieldDefinitions(); - - $xml .= "\n\n\t\t"; - $xml .= "\n\t\t"; - - return $xml; - } -} -``` - -Now you need to tell solr to use our new field for gathering spelling data. -In order to customize the spell checking configuration, -create your own `solrconfig.xml` (see "File-based configuration"). -In there, change the following directive: - -```xml - - - - spellcheckData - -``` - -Don't forget to copy the new configuration via a call to the `Solr_Configure` -task, and reindex your data before using the spell checker. - -### Limiting search fields - -Solr has a way of specifying which fields to search on. You specify these -fields as a parameter to `SearchQuery`. - -In the following example, we're telling Solr to *only* search the -`Title` and `Content` fields. Note that the fields must be specified in -the search parameters as "composite fields", which means they should be -specified in the form of `{table}_{field}`. - -These fields are defined in the schema.xml file that gets sent to Solr. - -```php -use SilverStripe\CMS\Model\SiteTree; -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$query = new SearchQuery(); -$query->addClassFilter(Page::class); -$query->addSearchTerm('someterms', [SiteTree::class . '_Title', SiteTree::class . '_Content']); -$result = singleton(SolrSearchIndex::class)->search($query, -1, -1); - -// the request to Solr would be: -// q=(SiteTree_Title:Lorem+OR+SiteTree_Content:Lorem) -``` - -### Configuring boosts - -There are several ways in which you can configure boosting on search fields or terms. - -#### Boosting on search query - -Solr has a way of specifying which fields should be boosted as a parameter to `SearchQuery`. - -This means if you boost a certain field, search query matches on that field will be considered -higher relevance than other fields with matches, and therefore those results will be closer -to the top of the results. 
- -In this example, we enter "Lorem" as the search term, and boost the `Content` field: - -```php -use SilverStripe\CMS\Model\SiteTree; -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$query = new SearchQuery(); -$query->addClassFilter(Page::class); -$query->addSearchTerm('Lorem', null, [SiteTree::class . '_Content' => 2]); -$result = singleton(SolrSearchIndex::class)->search($query, -1, -1); - -// the request to Solr would be: -// q=SiteTree_Content:Lorem^2 -``` - -More information on [relevancy on the Solr wiki](http://wiki.apache.org/solr/SolrRelevancyFAQ). - -### Boosting on index fields - -Boost values for specific can also be specified directly on the `SolrIndex` class directly. - -The following methods can be used to set one or more boosted fields: - -* `SolrIndex::addBoostedField` Adds a field with a specific boosted value (defaults to 2) -* `SolrIndex::setFieldBoosting` If a field has already been added to an index, the boosting - value can be customised, changed, or reset for a single field. -* `SolrIndex::addFulltextField` A boost can be set for a field using the `$extraOptions` parameter -with the key `boost` assigned to the desired value. - -For example: - -```php -use SilverStripe\CMS\Model\SiteTree; -use SilverStripe\FullTextSearch\Solr\SolrIndex; - -class SolrSearchIndex extends SolrIndex -{ - public function init() - { - $this->addClass(SiteTree::class); - $this->addAllFulltextFields(); - $this->addFilterField('ShowInSearch'); - $this->addBoostedField('Title', null, [], 1.5); - $this->setFieldBoosting(SiteTree::class . '_SearchBoost', 2); - } - -} -``` - -### Custom Types - -Solr supports custom field type definitions which are written to its XML schema. -Many standard ones are already included in the default schema. -As the XML file is generated dynamically, we can add our own types -by overloading the template responsible for it: `types.ss`. 
- -In the following example, we read out type definitions -from a new file `mysite/solr/templates/types.ss` instead: - -```php -use SilverStripe\Control\Director; -use SilverStripe\FullTextSearch\Solr\SolrIndex; - -class MyIndex extends SolrIndex -{ - public function getTemplatesPath() - { - return Director::baseFolder() . '/mysite/solr/templates/'; - } -} -``` - -### Highlighting - -Solr can highlight the searched terms in context of the matched content, -to help users determine the relevancy of results (e.g. in which part of a sentence -the term is used). In order to use this feature, the full content of the -field to be highlighted needs to be stored in the index, -by declaring it through `addStoredField()`. - -```php -use SilverStripe\FullTextSearch\Solr\SolrIndex; - -class MyIndex extends SolrIndex -{ - public function init() - { - $this->addClass(Page::class); - $this->addAllFulltextFields(); - $this->addStoredField('Content'); - } -} -``` - -To search with highlighting enabled, you need to pass in a custom query parameter. -There's a lot more parameters to tweak results on the [Solr Wiki](http://wiki.apache.org/solr/HighlightingParameters). - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$index = new MyIndex(); -$query = new SearchQuery(); -$query->addSearchTerm('My Term'); -$results = $index->search($query, -1, -1, ['hl' => 'true']); -``` - -Each result will automatically contain an "Excerpt" property -which you can use in your own results template. -The searched term is highlighted with an `` tag by default. - -Note: It is recommended to strip out all HTML tags and convert entities on the indexed content, -to avoid matching HTML attributes, and cluttering highlighted content with unparsed HTML. - -### Adding additional information into search results - -Inside the SolrIndex::search() function, the third-party library solr-php-client -is used to send data to Solr and parse the response. 
Additional information can -be pulled from this response and added to your results object for use in templates -using the `updateSearchResults()` extension hook. - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$index = new MyIndex(); -$query = new SearchQuery(); -$query->addSearchTerm('My Term'); -$results = $index->search($query, -1, -1, [ - 'facet' => 'true', - 'facet.field' => 'SiteTree_ClassName', -]); -``` - -By adding facet fields into the query parameters, our response object from Solr -now contains some additional information that we can add into the results sent -to the page. - -```php -use SilverStripe\Core\Extension; -use SilverStripe\View\ArrayData; -use SilverStripe\ORM\ArrayList; - -class MyResultsExtension extends Extension -{ - /** - * Adds extra information from the solr-php-client repsonse - * into our search results. - * @param ArrayData $results The ArrayData that will be used to generate search - * results pages. - * @param stdClass $response The solr-php-client response object. - */ - public function updateSearchResults($results, $response) - { - if (!isset($response->facet_counts) || !isset($response->facet_counts->facet_fields)) { - return; - } - $facetCounts = ArrayList::create(array()); - foreach($response->facet_counts->facet_fields as $name => $facets) { - $facetDetails = ArrayData::create([ - 'Name' => $name, - 'Facets' => ArrayList::create([]), - ]); - - foreach($facets as $facetName => $facetCount) { - $facetDetails->Facets->push(ArrayData::create([ - 'Name' => $facetName, - 'Count' => $facetCount, - ])); - } - $facetCounts->push($facetDetails); - } - $results->setField('FacetCounts', $facetCounts); - } -} -``` - -We can now access the facet information inside our templates. - ### Adding Analyzers, Tokenizers and Token Filters When a document is indexed, its individual fields are subject to the analyzing and tokenizing filters that can transform and normalize the data in the fields. 
For example — removing blank spaces, removing html code, stemming, removing a particular character and replacing it with another From 53eb82668120a338c3e397dac86b66d0d43d7d6d Mon Sep 17 00:00:00 2001 From: Andrew Aitken-Fincham Date: Mon, 11 Jun 2018 15:17:25 +0100 Subject: [PATCH 7/8] re-add default searchform docs add blurb about simple theme text extraction synonyms --- docs/en/00_index.md | 3 +- docs/en/01_getting_started.md | 13 +++- docs/en/03_configuration.md | 4 +- docs/en/04_advanced_configuration.md | 102 ++++++++++++++++++++++++++- docs/en/Solr.md | 66 ----------------- src/Solr/SolrIndex.php | 2 +- 6 files changed, 117 insertions(+), 73 deletions(-) diff --git a/docs/en/00_index.md b/docs/en/00_index.md index c81efc8..f5f8da8 100644 --- a/docs/en/00_index.md +++ b/docs/en/00_index.md @@ -18,12 +18,13 @@ - Advanced configuration - [Facets](04_advanced_configuration.md#facets) - [Using multiple indexes](04_advanced_configuration.md#multiple-indexes) - - [Synonyms](04_advanced_configuration.md#synonyms) + - [Analyzers, tokens and token filters](04_advanced_configuration.md#analyzers,-tokenizers-and-token-filters) - [Spellcheck](04_advanced_configuration.md#spell-check-("did-you-mean...")) - [Highlighting](04_advanced_configuration.md#highlighting) - [Boosting](04_advanced_configuration.md#boosting) - [Indexing related objects](04_advanced_configuration.md#indexing-related-objects) - [Subsites](04_advanced_configuration.md#subsites) - [Custom field types](04_advanced_configuration.md#custom-field-types) + = [Text extraction](04_advanced_configuration.md#text-extraction) - Troubleshooting - [Gotchas](05_troubleshooting.md#common-gotchas) diff --git a/docs/en/01_getting_started.md b/docs/en/01_getting_started.md index eb05674..dc6d877 100644 --- a/docs/en/01_getting_started.md +++ b/docs/en/01_getting_started.md @@ -52,6 +52,15 @@ This will: - Create a DefaultIndex - Run a [Solr Configure](03_configuration.md#solr-configure) and a [Solr 
Reindex](03_configuration.md#solr-reindex) -You'll then need to build a search form and results display that suits the functionality of your site. +If you have the [CMS module](https://github.com/silverstripe/silverstripe-cms) installed, you will be able to simply add + `$SearchForm` to your template to add a Solr search form. Default configuration is added via the + [`ContentControllerExtension`](/src/Solr/Control/ContentControllerExtension.php) and alternative + [`SearchForm`](/src/Solr/Forms/SearchForm.php). With the + [Simple theme](https://github.com/silverstripe-themes/silverstripe-simple), this is in the + [`Header`](https://github.com/silverstripe-themes/silverstripe-simple/blob/master/templates/Includes/Header.ss#L10-L15) + by default. -// TODO update me when https://github.com/silverstripe/silverstripe-fulltextsearch/pull/216 is merged +Ensure that you _don't_ have `SilverStripe\ORM\Search\FulltextSearchable::enable()` set in `_config.php`, as the +`SearchForm` action provided by that class will conflict. + +You can override the default template with a new one at `templates/Layout/Page_results_solr.ss`. diff --git a/docs/en/03_configuration.md b/docs/en/03_configuration.md index 718274f..247a949 100644 --- a/docs/en/03_configuration.md +++ b/docs/en/03_configuration.md @@ -13,8 +13,8 @@ Solr::configure_server([ 'path' => '/solr', // The suburl the Solr service is available on 'version' => '4', // Solr server version - currently only 3 and 4 supported 'service' => 'Solr4Service', // The class that provides actual communcation to the Solr server - 'extraspath' => BASE_PATH .'/fulltextsearch/conf/solr/4/extras/', // Absolute path to the folder containing templates used for generating the schema and field definitions - 'templates' => BASE_PATH . '/fulltextsearch/conf/solr/4/templates/', // Absolute path to the configuration default files, e.g. 
solrconfig.xml + 'extraspath' => BASE_PATH . '/vendor/silverstripe/fulltextsearch/conf/solr/4/extras/', // Absolute path to the configuration default files, e.g. solrconfig.xml + 'templates' => BASE_PATH . '/vendor/silverstripe/fulltextsearch/conf/solr/4/templates/', // Absolute path to the folder containing templates used for generating the schema and field definitions 'indexstore' => [ 'mode' => NULL, // [REQUIRED] a classname which implements SolrConfigStore, or 'file' or 'webdav' 'path' => NULL, // [REQUIRED] The (locally accessible) path to write the index configurations to OR The suburl on the Solr host that is set up to accept index configurations via webdav (e.g. BASE_PATH . '/.solr') diff --git a/docs/en/04_advanced_configuration.md b/docs/en/04_advanced_configuration.md index a0cd754..f31c7d5 100644 --- a/docs/en/04_advanced_configuration.md +++ b/docs/en/04_advanced_configuration.md @@ -114,7 +114,77 @@ SilverStripe\FullTextSearch\Search\FullTextSearch: - CoreSearchIndex ``` -## Synonyms +## Analyzers, Tokenizers and Token Filters + +When a document is indexed, its individual fields are subject to the analyzing and tokenizing filters that can transform +and normalize the data in the fields. You can remove blank spaces, strip HTML, replace a particular character and much +more as described in the [Solr Wiki](http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters).
+ +### Synonyms + +To add synonym processing at query-time, you can add the `SynonymFilterFactory` as an `Analyzer`: + +```php +use SilverStripe\FullTextSearch\Solr\SolrIndex; +use Page; + +class MyIndex extends SolrIndex +{ + public function init() + { + $this->addClass(Page::class); + $this->addField('Content'); + $this->addAnalyzer('Content', 'filter', [ + 'class' => 'solr.SynonymFilterFactory', + 'synonyms' => 'synonyms.txt', + 'ignoreCase' => 'true', + 'expand' => 'false' + ]); + } +} +``` + +This generates the following XML schema definition: + +```xml + + + +``` + +In this case, you most likely also want to define your own synonyms list. You can define a mapping in one of two ways: + +* A comma-separated list of words. If the token matches any of the words, then all the words in the list are +substituted, which will include the original token. + +* Two comma-separated lists of words with the symbol "=>" between them. If the token matches any word on +the left, then the list on the right is substituted. The original token will not be included unless it is also in the +list on the right. + +For example: + +```text +couch,sofa,lounger +teh => the +small => teeny,tiny,weeny +``` + +Then you should update your [Solr configuration](03_configuration.md#solr-server-parameters) to include your synonyms +file via the `extraspath` parameter, for example: + +```php +use SilverStripe\FullTextSearch\Solr\Solr; + +Solr::configure_server([ + 'extraspath' => BASE_PATH . '/mysite/Solr/', + 'indexstore' => [ + 'mode' => 'file', + 'path' => BASE_PATH . 
'/.solr', + ] +]); +``` + +This will include `/mysite/Solr/synonyms.txt` as your synonyms list after you run a [Solr configure](03_configuration.md#solr-configure). ## Spell check ("Did you mean...") @@ -340,3 +410,33 @@ Find the fields in your overloaded `types.ss` that you want to enable this behav ```xml ``` + +## Text extraction + +Solr provides built-in text extraction capabilities for PDF and Office documents, and numerous other formats, through +the `ExtractingRequestHandler` API (see [the Solr wiki entry](http://wiki.apache.org/solr/ExtractingRequestHandler)). +If you're using a default Solr installation, it's most likely already bundled and set up. But if you plan on running the +Solr server integrated into this module, you'll need to download the libraries and link them first. Run the following +commands from the webroot: + +``` +wget http://archive.apache.org/dist/lucene/solr/4.10.4/solr-4.10.4.tgz +tar -xvzf solr-4.10.4.tgz +mkdir .solr/PageSolrIndexboot/dist +mkdir .solr/PageSolrIndexboot/contrib +cp solr-4.10.4/dist/solr-cell-4.10.4.jar .solr/PageSolrIndexboot/dist/ +cp -R solr-4.10.4/contrib/extraction .solr/PageSolrIndexboot/contrib/ +rm -rf solr-4.10.4 solr-4.10.4.tgz +``` + +Create a custom `solrconfig.xml` (see [File-based configuration](03_configuration.md#file-based-configuration)). + +Add the following XML configuration: + +```xml + + +``` + +Now run a [Solr configure](03_configuration.md#solr-configure). You can use Solr text extraction either directly through +the HTTP API, or through the [Text extraction module](https://github.com/silverstripe-labs/silverstripe-textextraction). diff --git a/docs/en/Solr.md b/docs/en/Solr.md index da082aa..64627f9 100644 --- a/docs/en/Solr.md +++ b/docs/en/Solr.md @@ -1,69 +1,3 @@ -### Adding Analyzers, Tokenizers and Token Filters - -When a document is indexed, its individual fields are subject to the analyzing and tokenizing filters that can transform and normalize the data in the fields.
For example — removing blank spaces, removing html code, stemming, removing a particular character and replacing it with another -(see [Solr Wiki](http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters)). - -Example: Replace synonyms on indexing (e.g. "i-pad" with "iPad") - -```php -use SilverStripe\FullTextSearch\Solr\SolrIndex; - -class MyIndex extends SolrIndex -{ - public function init() - { - $this->addClass(Page::class); - $this->addField('Content'); - $this->addAnalyzer('Content', 'filter', ['class' => 'solr.SynonymFilterFactory']); - } -} -``` - -Generates the following XML schema definition: - -```xml - - - -``` - -### Text Extraction - -Solr provides built-in text extraction capabilities for PDF and Office documents, -and numerous other formats, through the `ExtractingRequestHandler` API -(see http://wiki.apache.org/solr/ExtractingRequestHandler). -If you're using a default Solr installation, it's most likely already -bundled and set up. But if you plan on running the Solr server integrated -into this module, you'll need to download the libraries and link the first. - -``` -wget http://archive.apache.org/dist/lucene/solr/3.1.0/apache-solr-3.1.0.tgz -mkdir tmp -tar -xvzf apache-solr-3.1.0.tgz -mkdir .solr/PageSolrIndexboot/dist -mkdir .solr/PageSolrIndexboot/contrib -cp apache-solr-3.1.0/dist/apache-solr-cell-3.1.0.jar .solr/PageSolrIndexboot/dist/ -cp -R apache-solr-3.1.0/contrib/extraction .solr/PageSolrIndexboot/contrib/ -rm -rf apache-solr-3.1.0 apache-solr-3.1.0.tgz -``` - -Create a custom `solrconfig.xml` (see "File-based configuration"). -Add the following XML configuration. - -```xml - - -``` - -Now apply the configuration: - -``` -vendor/bin/sake dev/tasks/Solr_Configure -``` - -Now you can use Solr text extraction either directly through the HTTP API, -or indirectly through the ["textextraction" module](https://github.com/silverstripe-labs/silverstripe-textextraction). 
- ## Adding DataObject classes to Solr search If you create a class that extends `DataObject` (and not `Page`) then it won't be automatically added to the search diff --git a/src/Solr/SolrIndex.php b/src/Solr/SolrIndex.php index 5195413..c25cde5 100644 --- a/src/Solr/SolrIndex.php +++ b/src/Solr/SolrIndex.php @@ -137,7 +137,7 @@ abstract class SolrIndex extends SearchIndex * * @param string $field * @param string $type - * @param Array $params Parameters for the analyzer, usually at least a "class" + * @param array $params parameters for the analyzer, usually at least a "class" */ public function addAnalyzer($field, $type, $params) { From 06c604c9f3678c13fafc48bebe5b47111239ab58 Mon Sep 17 00:00:00 2001 From: Andrew Aitken-Fincham Date: Mon, 11 Jun 2018 17:21:35 +0100 Subject: [PATCH 8/8] split querying into its own file adding non-SiteTree dataobjects subsite boosting --- README.md | 8 +- docs/en/00_index.md | 28 +-- docs/en/03_configuration.md | 161 ++++-------------- docs/en/04_querying.md | 130 ++++++++++++++ ...ration.md => 05_advanced_configuration.md} | 23 +++ docs/en/05_troubleshooting.md | 3 - docs/en/06_troubleshooting.md | 14 ++ docs/en/Solr.md | 99 ----------- 8 files changed, 218 insertions(+), 248 deletions(-) create mode 100644 docs/en/04_querying.md rename docs/en/{04_advanced_configuration.md => 05_advanced_configuration.md} (94%) delete mode 100644 docs/en/05_troubleshooting.md create mode 100644 docs/en/06_troubleshooting.md delete mode 100644 docs/en/Solr.md diff --git a/README.md b/README.md index c519ee8..0d5af98 100644 --- a/README.md +++ b/README.md @@ -19,7 +19,9 @@ Adds support for fulltext search engines like Sphinx and Solr to SilverStripe CM ## Documentation -See [the docs](/docs/en/00_index.md), or for the quick version see [the quick start guide](/docs/en/01_getting_started.md#quick-start). +For pure Solr docs, check out [the Solr 4.10.4 guide](https://archive.apache.org/dist/lucene/solr/ref-guide/apache-solr-ref-guide-4.10.pdf). 
+ +See [the docs](/docs/en/00_index.md) for configuration and setup, or for the quick version see [the quick start guide](/docs/en/01_getting_started.md#quick-start). For details of updates, bugfixes, and features, please see the [changelog](CHANGELOG.md). @@ -48,8 +50,4 @@ maybe 'Content->Summary' to allow calling a specific method on the field object - Allow user logic to cause triggering reindex of documents when field is user generated -* Add sphinx connector - * Add generic APIs for spell correction, file text extraction and snippet generation - -* Better docs diff --git a/docs/en/00_index.md b/docs/en/00_index.md index f5f8da8..d71e99e 100644 --- a/docs/en/00_index.md +++ b/docs/en/00_index.md @@ -11,20 +11,24 @@ - [Solr server parameters](03_configuration.md#solr-server-parameters) - [Creating an index](03_configuration.md#creating-an-index) - [Adding data to an index](03_configuration.md#adding-data-to-an-index) - - [Querying an index](03_configuration.md#querying-the-index) - [Running the dev/tasks](03_configuration.md#dev-tasks) - [File-based configuration](03_configuration.md#file-based-configuration) - [Handling results](03_configuration.md#handling-results) +- Querying + - [Building a SearchQuery](04_querying.md#building-a-`searchquery`) + - [Searching value ranges](04_querying.md#searching-value-ranges) + - [Empty or existing values](04_querying.md#empty-or-existing-values) + - [Executing your query](04_querying.md#executing-your-query) - Advanced configuration - - [Facets](04_advanced_configuration.md#facets) - - [Using multiple indexes](04_advanced_configuration.md#multiple-indexes) - - [Analyzers, tokens and token filters](04_advanced_configuration.md#analyzers,-tokenizers-and-token-filters) - - [Spellcheck](04_advanced_configuration.md#spell-check-("did-you-mean...")) - - [Highlighting](04_advanced_configuration.md#highlighting) - - [Boosting](04_advanced_configuration.md#boosting) - - [Indexing related 
objects](04_advanced_configuration.md#indexing-related-objects) - - [Subsites](04_advanced_configuration.md#subsites) - - [Custom field types](04_advanced_configuration.md#custom-field-types) - = [Text extraction](04_advanced_configuration.md#text-extraction) + - [Facets](05_advanced_configuration.md#facets) + - [Using multiple indexes](05_advanced_configuration.md#multiple-indexes) + - [Analyzers, tokenizers and token filters](05_advanced_configuration.md#analyzers,-tokenizers-and-token-filters) + - [Spellcheck](05_advanced_configuration.md#spell-check-("did-you-mean...")) + - [Highlighting](05_advanced_configuration.md#highlighting) + - [Boosting](05_advanced_configuration.md#boosting) + - [Indexing related objects](05_advanced_configuration.md#indexing-related-objects) + - [Subsites](05_advanced_configuration.md#subsites) + - [Custom field types](05_advanced_configuration.md#custom-field-types) + - [Text extraction](05_advanced_configuration.md#text-extraction) - Troubleshooting - - [Gotchas](05_troubleshooting.md#common-gotchas) + - [Gotchas](06_troubleshooting.md#common-gotchas) diff --git a/docs/en/03_configuration.md b/docs/en/03_configuration.md index 247a949..b11d65e 100644 --- a/docs/en/03_configuration.md +++ b/docs/en/03_configuration.md @@ -106,7 +106,7 @@ $page = Page::create(['Content' => 'Help me. My house is on fire. This is less t $page->write(); ``` -Depending on the size of the index and how much content needs to be processed, it could take a while for your search results to be updated, so your newly-updated page may not be available in your search results immediately. +Depending on the size of the index and how much content needs to be processed, it could take a while for your search results to be updated, so your newly-updated page may not be available in your search results immediately. This approach is typically not recommended. 
### Queued jobs @@ -136,138 +136,41 @@ class MyIndex extends SolrIndex } ``` -Alternatively, you can index draft content, but simply exclude it from searches. This can be handy to preview search results on unpublished content, in case a CMS author is logged in. Before constructing your `SearchQuery`, conditionally switch to the "live" stage: +Alternatively, you can index draft content, but simply exclude it from searches. This can be handy to preview search results on unpublished content, in case a CMS author is logged in. Before constructing your `SearchQuery`, conditionally switch to the "live" stage. + +### Adding DataObjects + +If you create a class that extends `DataObject` (and not `Page`) then it won't be automatically added to the search +index. You'll have to make some changes to add it in. The `DataObject` class will require the following minimum code +to render properly in the search results: + +* `Link()` needs to return the URL to follow from the search results to actually view the object. +* `Name` (as a DB field) will be used as the result title. +* `Abstract` (as a DB field) will show under the search result title. +* `getShowInSearch()` is required to get the record to show in search, since all results are filtered by `ShowInSearch`. + +So with that, you can add your class to your index: ```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use SilverStripe\Security\Permission; -use SilverStripe\Versioned\Versioned; +use My\Namespace\Model\SearchableDataObject; +use SilverStripe\FullTextSearch\Solr\SolrIndex; +use Page; -if (!Permission::check('CMS_ACCESS_CMSMain')) { - Versioned::set_stage(Versioned::LIVE); +class MySolrSearchIndex extends SolrIndex { + + public function init() + { + $this->addClass(SearchableDataObject::class); + $this->addClass(Page::class); + $this->addAllFulltextFields(); + } } -$query = SearchQuery::create(); -// ... ``` -## Querying an index - -This is where the magic happens. 
You will construct the search terms and other parameters required to form a `SearchQuery` object, and pass that into a `SearchIndex` to get results. - -### Building a `SearchQuery` - -First, you'll need to construct a new `SearchQuery` object: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$query = SearchQuery::create(); -``` - -You can then alter the `SearchQuery` with a number of methods: - -#### `addSearchTerm()` - -The simplest - pass through a string to search your index for. - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$query = SearchQuery::create() - ->addSearchTerm('fire'); -``` - -You can also limit this to specific fields by passing an array as the second argument, specified in the form of `{table}_{field}`: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use Page; - -$query = SearchQuery::create() - ->addSearchTerm('on fire', [Page::class . '_Title']); -``` - -#### `addFuzzySearchTerm()` - -Pass through a string to search your index for, with "fuzzier" matching - this means that a term like "fishing" would also likely find results containing "fish" or "fisher". Otherwise behaves the same as `addSearchTerm()`. - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; - -$query = SearchQuery::create() - ->addFuzzySearchTerm('fire'); -``` - -#### `addClassFilter()` - -Only query a specific class in the index, optionally including subclasses. - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use My\Namespace\PageType\SpecialPage; - -$query = SearchQuery::create() - ->addClassFilter(SpecialPage::class, false); // only return results from SpecialPages, not subclasses -``` - -#### Searching value ranges - -Most values can be expressed as ranges, most commonly dates or numbers. To search for a range of values rather than an exact match, -use the `SearchQuery_Range` class. 
The range can include bounds on both sides, or stay open-ended by simply leaving the argument blank. -It takes arguments in the form of `SearchQuery_Range::create($start, $end))`: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery_Range; -use My\Namespace\Index\MyIndex; -use Page; - -$query = SearchQuery::create() - ->addSearchTerm('fire') - // Only include documents edited in 2011 or earlier - ->addFilter(Page::class . '_LastEdited', SearchQuery_Range::create(null, '2011-12-31T23:59:59Z')); -$results = MyIndex::singleton()->search($query); -``` - -Note: At the moment, the date format is specific to the search implementation. - -#### Searching for empty or existing values - -Since there's a type conversion between the SilverStripe database, object properties -and the search index persistence, it's often not clear which condition is searched for. -Should it equal an empty string, or only match if the field wasn't indexed at all? -The `SearchQuery` API has the concept of a "missing" and "present" field value for this: - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use My\Namespace\Index\MyIndex; -use Page; - -$query = SearchQuery::create() - ->addSearchTerm('fire'); - // Needs a value, although it can be false - ->addFilter(Page::class . '_ShowInMenus', SearchQuery::$present); -$results = MyIndex::singleton()->search($query); -``` - -### Querying an index - -Once you have your query constructed, you need to run it against your index. - -```php -use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; -use My\Namespace\Index\MyIndex; - -$query = SearchQuery::create()->addSearchTerm('fire'); -$results = MyIndex::singleton()->search($query); -``` - -The return value of a `search()` call is an object which contains a few properties: - - * `Matches`: `ArrayList` of the current "page" of search results. 
- * `Suggestion`: (optional) Any suggested spelling corrections in the original query notation - * `SuggestionNice`: (optional) Any suggested spelling corrections for display (without query notation) - * `SuggestionQueryString` (optional) Link to repeat the search with suggested spelling corrections +Once you've created the above classes and run the [solr dev tasks](#solr-dev-tasks) to tell Solr about the new index +you've just created, this will add `SearchableDataObject` and the text fields it has to the index. Now when you search +on the site using `MySolrSearchIndex->search()`, the `SearchableDataObject` results will show alongside normal `Page` +results. ## Solr dev tasks @@ -306,7 +209,7 @@ The Solr indexes will be stored as binary files inside your SilverStripe project ## File-based configuration -Many aspects of Solr are configured outside of the `schema.xml` file which SilverStripe generates based on the `SolrIndex` subclass that is defined. For example, stopwords are placed in their own `stopwords.txt` file, and advanced [spellchecking](04_advanced_configuration.md#spell-check-("did-you-mean...")) can be configured in `solrconfig.xml`. +Many aspects of Solr are configured outside of the `schema.xml` file which SilverStripe generates based on the `SolrIndex` subclass that is defined. For example, stopwords are placed in their own `stopwords.txt` file, and advanced [spellchecking](05_advanced_configuration.md#spell-check-("did-you-mean...")) can be configured in `solrconfig.xml`. By default, these files are copied from the `fulltextsearch/conf/extras/` directory over to the new index location. In order to use your own files, copy these files into a location of your choosing (for example `mysite/data/solr/`), and tell Solr to use this folder with the `extraspath` [configuration setting](#solr-server-parameters). Run a [`Solr_Configure](#solr-configure) to apply these changes. 
@@ -340,7 +243,7 @@ class PageController extends ContentController In your template (e.g. `Page_results.ss`) you can access the results and loop through them. They're stored in the `$Matches` property of the search return object. -```ss +```silverstripe <% if $SearchResult.Matches %>

Results for "{$Query}"

Displaying Page $SearchResult.Matches.CurrentPage of $SearchResult.Matches.TotalPages

diff --git a/docs/en/04_querying.md b/docs/en/04_querying.md new file mode 100644 index 0000000..4c7b818 --- /dev/null +++ b/docs/en/04_querying.md @@ -0,0 +1,130 @@ +# Querying + +This is where the magic happens. You will construct the search terms and other parameters required to form a `SearchQuery` object, and pass that into a `SearchIndex` to get results. + +## Building a `SearchQuery` + +First, you'll need to construct a new `SearchQuery` object: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$query = SearchQuery::create(); +``` + +You can then alter the `SearchQuery` with a number of methods: + +### `addSearchTerm()` + +The simplest - pass through a string to search your index for. + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$query = SearchQuery::create() + ->addSearchTerm('fire'); +``` + +You can also limit this to specific fields by passing an array as the second argument, specified in the form of `{table}_{field}`: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use Page; + +$query = SearchQuery::create() + ->addSearchTerm('on fire', [Page::class . '_Title']); +``` + +### `addFuzzySearchTerm()` + +Pass through a string to search your index for, with "fuzzier" matching - this means that a term like "fishing" would also likely find results containing "fish" or "fisher". Otherwise behaves the same as `addSearchTerm()`. + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; + +$query = SearchQuery::create() + ->addFuzzySearchTerm('fire'); +``` + +### `addClassFilter()` + +Only query a specific class in the index, optionally including subclasses. 
+ +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use My\Namespace\PageType\SpecialPage; + +$query = SearchQuery::create() + ->addClassFilter(SpecialPage::class, false); // only return results from SpecialPages, not subclasses +``` + +## Searching value ranges + +Most values can be expressed as ranges, most commonly dates or numbers. To search for a range of values rather than an exact match, +use the `SearchQuery_Range` class. The range can include bounds on both sides, or stay open-ended by simply leaving the argument blank. +It takes arguments in the form of `SearchQuery_Range::create($start, $end)`: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery_Range; +use My\Namespace\Index\MyIndex; +use Page; + +$query = SearchQuery::create() + ->addSearchTerm('fire') + // Only include documents edited in 2011 or earlier + ->addFilter(Page::class . '_LastEdited', SearchQuery_Range::create(null, '2011-12-31T23:59:59Z')); +$results = MyIndex::singleton()->search($query); +``` + +### How do I use date ranges where dates might not be defined? + +The Solr index updater only includes dates with values, so the field might not exist in all your index entries. A simple bounded range query (`:[* TO ]`) will fail in this case. In order to query the field, reverse the search conditions and exclude the ranges you don't want: + +```php +// Wrong: Filter will ignore all empty field values +$query->addFilter('fieldname', SearchQuery_Range::create('*', 'somedate')); + +// Right: Exclude the opposite range +$query->addExclude('fieldname', SearchQuery_Range::create('somedate', '*')); +``` + +Note: At the moment, the date format is specific to the search implementation. + +## Empty or existing values + +Since there's a type conversion between the SilverStripe database, object properties +and the search index persistence, it's often not clear which condition is searched for. 
+Should it equal an empty string, or only match if the field wasn't indexed at all? +The `SearchQuery` API has the concept of a "missing" and "present" field value for this: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use My\Namespace\Index\MyIndex; +use Page; + +$query = SearchQuery::create() + ->addSearchTerm('fire') + // Needs a value, although it can be false + ->addFilter(Page::class . '_ShowInMenus', SearchQuery::$present); +$results = MyIndex::singleton()->search($query); +``` + +## Executing your query + +Once you have your query constructed, you need to run it against your index. + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use My\Namespace\Index\MyIndex; + +$query = SearchQuery::create()->addSearchTerm('fire'); +$results = MyIndex::singleton()->search($query); +``` + +The return value of a `search()` call is an object which contains a few properties: + + * `Matches`: `ArrayList` of the current "page" of search results. + * `Suggestion`: (optional) Any suggested spelling corrections in the original query notation + * `SuggestionNice`: (optional) Any suggested spelling corrections for display (without query notation) + * `SuggestionQueryString`: (optional) Link to repeat the search with suggested spelling corrections diff --git a/docs/en/04_advanced_configuration.md b/docs/en/05_advanced_configuration.md similarity index 94% rename from docs/en/04_advanced_configuration.md rename to docs/en/05_advanced_configuration.md index f31c7d5..942af4c 100644 --- a/docs/en/04_advanced_configuration.md +++ b/docs/en/05_advanced_configuration.md @@ -357,8 +357,31 @@ class SolrSearchIndex extends SolrIndex ## Indexing related objects +To add a related object to your index. + ## Subsites +When you are utilising the [subsites module](https://github.com/silverstripe/silverstripe-subsites) you +may want to add [boosting](#boosting/weighting) to results from the current subsite. 
To do so, you'll +need to use [eDisMax](https://lucene.apache.org/solr/guide/6_6/the-extended-dismax-query-parser.html) +and the supporting parameters `bq` and `bf`. You should add the following to your `SolrIndex` +extension: + +```php +use SilverStripe\FullTextSearch\Search\Queries\SearchQuery; +use SilverStripe\Subsites\Model\Subsite; + +public function search(SearchQuery $query, $offset = -1, $limit = -1, $params = []) { + $params = array_merge($params, [ + 'defType' => 'edismax', // turn on eDisMax + 'bq' => '_subsite:'.Subsite::currentSubsiteID(), // boost-query on current subsite ID + 'bf' => '_subsite^2' // double the score of any document with that subsite ID + ]); + + return parent::search($query, $offset, $limit, $params); +} +``` + ## Custom field types Solr supports custom field type definitions which are written to its XML schema. Many standard ones are already included diff --git a/docs/en/05_troubleshooting.md b/docs/en/05_troubleshooting.md deleted file mode 100644 index 6972d3d..0000000 --- a/docs/en/05_troubleshooting.md +++ /dev/null @@ -1,3 +0,0 @@ -# Troubleshooting - -## Common gotchas diff --git a/docs/en/06_troubleshooting.md b/docs/en/06_troubleshooting.md new file mode 100644 index 0000000..18e66e3 --- /dev/null +++ b/docs/en/06_troubleshooting.md @@ -0,0 +1,14 @@ +# Troubleshooting + +## Common gotchas + +* By default number-letter boundaries are treated as a word boundary. For example, `A1` is two words - `a` and `1` - when Solr parses the search term. 
+* Special characters and operators are not correctly escaped +* Multi-word synonym issues +* When Solr indexes are reconfigured and reindexed, their content is trashed and rebuilt + +### CWP-specific + +* `solrconfig.xml` customisations fail silently +* Developers aren’t able to test raw queries or see output via the +[Solr admin interface](02_setup.md#solr-admin) diff --git a/docs/en/Solr.md b/docs/en/Solr.md deleted file mode 100644 index 64627f9..0000000 --- a/docs/en/Solr.md +++ /dev/null @@ -1,99 +0,0 @@ -## Adding DataObject classes to Solr search - -If you create a class that extends `DataObject` (and not `Page`) then it won't be automatically added to the search -index. You'll have to make some changes to add it in. - -So, let's take an example of `StaffMember`: - -```php -use SilverStripe\Control\Controller; -use SilverStripe\ORM\DataObject; - -class StaffMember extends DataObject -{ - private static $db = [ - 'Name' => 'Varchar(255)', - 'Abstract' => 'Text', - 'PhoneNumber' => 'Varchar(50)', - ]; - - public function Link($action = 'show') - { - return Controller::join_links('my-controller', $action, $this->ID); - } - - public function getShowInSearch() - { - return 1; - } -} -``` - -This `DataObject` class has the minimum code necessary to allow it to be viewed in the site search. - -`Link()` will return a URL for where a user goes to view the data in more detail in the search results. -`Name` will be used as the result title, and `Abstract` the summary of the staff member which will show under the -search result title. -`getShowInSearch` is required to get the record to show in search, since all results are filtered by `ShowInSearch`. 
- -So with that, let's create a new class called `MySolrSearchIndex`: - -```php -use StaffMember; -use SilverStripe\CMS\Model\SiteTree; -use SilverStripe\FullTextSearch\Solr\SolrIndex; - -class MySolrSearchIndex extends SolrIndex { - - public function init() - { - $this->addClass(SiteTree::class); - $this->addClass(StaffMember::class); - - $this->addAllFulltextFields(); - $this->addFilterField('ShowInSearch'); - } -} -``` - -This is a copy/paste of the existing configuration but with the addition of `StaffMember`. - -Once you've created the above classes and run `flush=1`, access `dev/tasks/Solr_Configure` and `dev/tasks/Solr_Reindex` -to tell Solr about the new index you've just created. This will add `StaffMember` and the text fields it has to the -index. Now when you search on the site using `MySolrSearchIndex->search()`, -the `StaffMember` results will show alongside normal `Page` results. - - -## Debugging - -### Using the web admin interface - -You can visit `http://localhost:8983/solr`, which will show you a list -to the admin interfaces of all available indices. -There you can search the contents of the index via the native SOLR web interface. - -It is possible to manually replicate the data automatically sent -to Solr when saving/publishing in SilverStripe, -which is useful when debugging front-end queries, -see `thirdparty/fulltextsearch/server/silverstripe-solr-test.xml`. - -``` -java -Durl=http://localhost:8983/solr/MyIndex/update/ -Dtype=text/xml -jar post.jar silverstripe-solr-test.xml -``` - -## FAQ - -### How do I use date ranges where dates might not be defined? - -The Solr index updater only includes dates with values, -so the field might not exist in all your index entries. -A simple bounded range query (`:[* TO ]`) will fail in this case. 
-In order to query the field, reverse the search conditions and exclude the ranges you don't want: - -```php -// Wrong: Filter will ignore all empty field values -$myQuery->addFilter('fieldname', new SearchQuery_Range('*', 'somedate')); - -// Better: Exclude the opposite range -$myQuery->addExclude('fieldname', new SearchQuery_Range('somedate', '*')); -```