From 847a4e06940a2388b0a8f81d3ef579c8f5ca75c5 Mon Sep 17 00:00:00 2001
From: Ingo Schommer <ingo@silverstripe.com>
Date: Wed, 22 Aug 2012 23:22:07 +0200
Subject: [PATCH] Updated README

---
 README.md | 33 ++++++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index b1394eb..71e153a 100644
--- a/README.md
+++ b/README.md
@@ -2,14 +2,37 @@
 
 ## Overview
 
+Provides an extraction API for file content, which can hook into different extractor
+engines depending on which are available and on the format of the file being parsed.
+The output is always a string: the file content.
 
-Previously part of the [sphinx module](https://github.com/silverstripe/silverstripe-sphinx).
-
-## Usage
-
+Via the `FileTextExtractable` extension, this logic can be used to
+cache the extracted content on a `DataObject` subclass (usually `File`).
 
+Note: Previously part of the [sphinx module](https://github.com/silverstripe/silverstripe-sphinx).
 
 ## Requirements
 
  * SilverStripe 3.0
- * (optional) [XPDF](http://www.foolabs.com/xpdf/) (`pdftotext` utility)
\ No newline at end of file
+ * (optional) [XPDF](http://www.foolabs.com/xpdf/) (`pdftotext` utility)
+
+## Configuration
+
+No configuration is required, unless you want to make
+the content available through your `DataObject` subclass.
+In this case, add the following to `mysite/_config.php`:
+
+	DataObject::add_extension('File', 'FileTextExtractable');
+
+## Usage
+
+Manual extraction:
+
+	$myFile = '/my/path/myfile.pdf';
+	$extractor = FileTextExtractor::for_file($myFile);
+	$content = $extractor->getContent($myFile);
+
+DataObject extraction:
+
+	$myFileObj = File::get()->First();
+	$content = $myFileObj->extractFileAsText();
\ No newline at end of file