Recommendations for open-source text indexing and search
I have discovered Lucene (a Java library) and am starting to read up on it.
I'm interested in taking works of literature (for example, Philo and Josephus), indexing them, and then doing the following types of analysis (similar to what Bible software programs do):
1) Find word X within 2 or 3 words of word Y.
2) Find "work* of * hand*", which would match "works of hands", "work of hand", etc.
3) Find literary patterns (also called "motifs"), such as an author's use of the phrase "in day". (I think this might be the trickiest; I might have to find all combinations of 2-7 word phrases, count them, and rank them, showing the top 25, for example.) This might show, for example, that Josephus uses one set of phrases and Philo another.
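To make the three requests above concrete, here is a minimal sketch in plain Python (standard library only, no index). It is only an illustration of the intended behaviour, not how Lucene or any other library implements it. The function names `tokenize`, `within`, `wildcard_phrase` and `top_phrases` are hypothetical, "within N words" is assumed to mean a token-position difference of at most N, and a lone `*` is assumed to match zero or one word.

```python
import re
from collections import Counter

WORD = r"[a-z']"  # assumption: a word is a run of ASCII letters and apostrophes


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(WORD + "+", text.lower())


def within(tokens, x, y, max_gap=3):
    """Return (i, j) position pairs where x and y are at most max_gap tokens apart."""
    xs = [i for i, t in enumerate(tokens) if t == x]
    ys = [j for j, t in enumerate(tokens) if t == y]
    return [(i, j) for i in xs for j in ys if 0 < abs(i - j) <= max_gap]


def wildcard_phrase(tokens, pattern):
    """Find phrases matching a pattern such as 'work* of * hand*'.

    A '*' attached to a word matches the rest of that word; a lone '*'
    matches zero or one whole word (an assumption of this sketch).
    The pattern is expected to end with a word, not a lone '*'.
    """
    regex = rf"(?<!{WORD})"  # do not start in the middle of a word
    for piece in pattern.lower().split():
        if piece == "*":
            regex += rf"(?:{WORD}+ )?"
        else:
            regex += re.escape(piece).replace(r"\*", WORD + "*") + " "
    regex = regex.rstrip() + rf"(?!{WORD})"  # do not stop in the middle of a word
    return re.findall(regex, " ".join(tokens))


def top_phrases(tokens, min_n=2, max_n=7, top=25):
    """Count every phrase of min_n..max_n consecutive words and rank by frequency."""
    counts = Counter()
    for n in range(min_n, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts.most_common(top)
```

Running `top_phrases` separately on each author's text and comparing the two ranked lists would be one way to approach item 3. For whole books, a real index would likely be needed for speed; this brute-force version only shows what the results should look like.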
Are there any open-source libraries you would recommend? My language preferences are 1) Python, 2) C#, 3) Java. Ideally there would be no dependencies on a proprietary database.
Thanks,
Neal
Lucene is the best one out there, in my opinion, in terms of popularity, community, activity, and tooling. I suggest you also look at Solr, which is built on top of Lucene. Another open-source indexing framework I found is Egothor, but I am not sure about its adoption rate.
And here is a survey that might help in choosing the right one.
Here you can find more open-source and commercial libraries. I have seen a few of them supporting bindings for more than one programming language. If you have decided to go with Lucene, you might need Luke for debugging purposes.