From 847a4e06940a2388b0a8f81d3ef579c8f5ca75c5 Mon Sep 17 00:00:00 2001
From: Ingo Schommer
Date: Wed, 22 Aug 2012 23:22:07 +0200
Subject: [PATCH] Updated README

---
 README.md | 33 ++++++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index b1394eb..71e153a 100644
--- a/README.md
+++ b/README.md
@@ -2,14 +2,37 @@
 
 ## Overview
 
+Provides an extraction API for file content, which can hook into different extractor
+engines based on availability and the parsed file format.
+The output is always a string: the file content.
 
-Previously part of the [sphinx module](https://github.com/silverstripe/silverstripe-sphinx).
-
-## Usage
-
+Via the `FileTextExtractable` extension, this logic can be used to
+cache the extracted content on a `DataObject` subclass (usually `File`).
 
+Note: Previously part of the [sphinx module](https://github.com/silverstripe/silverstripe-sphinx).
 
 ## Requirements
 
  * SilverStripe 3.0
- * (optional) [XPDF](http://www.foolabs.com/xpdf/) (`pdftotext` utility)
\ No newline at end of file
+ * (optional) [XPDF](http://www.foolabs.com/xpdf/) (`pdftotext` utility)
+
+## Configuration
+
+No configuration is required, unless you want to make
+the content available through your `DataObject` subclass.
+In this case, add the following to `mysite/_config.php`:
+
+	DataObject::add_extension('File', 'FileTextExtractable');
+
+## Usage
+
+Manual extraction:
+
+	$myFile = '/my/path/myfile.pdf';
+	$extractor = FileTextExtractor::for_file($myFile);
+	$content = $extractor->getContent($myFile);
+
+DataObject extraction:
+
+	$myFileObj = File::get()->First();
+	$content = $myFileObj->extractFileAsText();
\ No newline at end of file