Commit Graph

172 Commits

Author SHA1 Message Date
cam-findlay
a34c443be5 FIX additional exception handling for Tika errors returned via Guzzle.
Tika server errors via Guzzle can cause the Solr search query to return a 500 error and break search results pages for users. The issue related to uncaught exceptions from Guzzle causing a silent failure if a text file is unreadable or missing (return null never occurs, which breaks the search).
2013-06-07 10:42:38 +12:00
Ingo Schommer
a380bb7c8f Don't write file, since it'd rename the file and make it inaccessible for subsequent tests 2013-05-07 22:21:56 +02:00
Ingo Schommer
30223e4f7c 3.1 compat 2013-05-07 21:54:51 +02:00
Ingo Schommer
49316d99ff Travis support 2013-05-07 21:49:32 +02:00
Ingo Schommer
24a055a741 More docs on how to use extraction with Solr 2013-05-07 20:14:01 +02:00
Ingo Schommer
b32bc08dc4 More resilience in SolrCellTextExtractor
Shouldn't outright fail the request if a file can't be found
2013-05-07 19:27:06 +02:00
Ingo Schommer
b86483abc4 3.1 compat 2013-05-07 18:47:56 +02:00
Ingo Schommer
b5c663570a Merge pull request #1 from jnv/patch-1
Fix description in composer.json
2013-04-11 00:58:47 -07:00
Jan Vlnas
55b8bc28c1 Fix description in composer.json 2013-03-13 23:59:40 +01:00
Ingo Schommer
f2c8df2348 BUG Exclude meta info from SolrCell content retrieval
Was matching </str> greedily, which included too much content
2013-03-11 00:56:44 +01:00
Ingo Schommer
9af389f51b NEW SolrCellTextExtractor 2013-02-01 15:35:16 +01:00
Ingo Schommer
14816075b8 FIX Case insensitive extension matching 2013-02-01 15:34:54 +01:00
Ingo Schommer
a6cc647d01 Added composer.json 2013-01-07 14:07:39 +01:00
Ingo Schommer
788a49bf9f BUG Improved HTMLTextExtractor, remove non-content tags 2012-09-06 13:41:21 +02:00
Ingo Schommer
733644d6bb Better shell execution feedback from PDF extractor 2012-08-27 11:31:53 +02:00
Ingo Schommer
478ab65db7 Added License 2012-08-22 23:23:34 +02:00
Ingo Schommer
847a4e0694 Updated README 2012-08-22 23:22:46 +02:00
Ingo Schommer
f3fcf60c0f FileTextExtractor->isAvailable() 2012-08-22 18:25:55 +02:00
Ingo Schommer
977c4e49c9 API Using paths instead of File objects in extractors
Makes coupling to File objects optional, by choosing
to use the FileTextExtractable extension.
2012-08-22 18:25:12 +02:00
Ingo Schommer
7de717b0bd 3.0 compat 2012-08-22 18:24:38 +02:00
Ingo Schommer
98f847c946 Added rudimentary test coverage 2012-08-22 18:23:06 +02:00
Ingo Schommer
ec0921c6d1 Initial commit 2012-08-22 17:52:08 +02:00