| Author | Commit | Message | Date |
|---|---|---|---|
| Ingo Schommer | f2c8df2348 | BUG Exclude meta info from SolrCell content retrieval. Was matching `</str>` greedily, which included too much content | 2013-03-11 00:56:44 +01:00 |
| Ingo Schommer | 9af389f51b | NEW SolrCellTextExtractor | 2013-02-01 15:35:16 +01:00 |
| Ingo Schommer | 14816075b8 | FIX Case insensitive extension matching | 2013-02-01 15:34:54 +01:00 |
| Ingo Schommer | a6cc647d01 | Added composer.json | 2013-01-07 14:07:39 +01:00 |
| Ingo Schommer | 788a49bf9f | BUG Improved HTMLTextExtractor, remove non-content tags | 2012-09-06 13:41:21 +02:00 |
| Ingo Schommer | 733644d6bb | Better shell execution feedback from PDF extractor | 2012-08-27 11:31:53 +02:00 |
| Ingo Schommer | 478ab65db7 | Added License | 2012-08-22 23:23:34 +02:00 |
| Ingo Schommer | 847a4e0694 | Updated README | 2012-08-22 23:22:46 +02:00 |
| Ingo Schommer | f3fcf60c0f | FileTextExtractor->isAvailable() | 2012-08-22 18:25:55 +02:00 |
| Ingo Schommer | 977c4e49c9 | API Using paths instead of File objects in extractors. Makes coupling to File objects optional, by choosing to use the FileTextExtractable extension. | 2012-08-22 18:25:12 +02:00 |
| Ingo Schommer | 7de717b0bd | 3.0 compat | 2012-08-22 18:24:38 +02:00 |
| Ingo Schommer | 98f847c946 | Added rudimentary test coverage | 2012-08-22 18:23:06 +02:00 |
| Ingo Schommer | ec0921c6d1 | Initial commit | 2012-08-22 17:52:08 +02:00 |