may have spent the last 8 hours building a fulltext search engine in nostrdb.
I made the index as space-efficient as possible: the keys are stored in a compressed format and map words and word indices to note ids. So when you type “the quick brown fox” it will be able to return results with those exact words in sequence (or not if it can’t find a sequence).
Testing it now. Will release soon
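To make the idea concrete, here is a minimal Python sketch of an index keyed by word and word position. It is not nostrdb's code: the `PositionalIndex` class, its method names, and the whitespace tokenizer are all hypothetical, there is no key compression, and the fallback to "all words in any order" is only one guess at what happens when no exact sequence is found.

```python
from collections import defaultdict


def tokenize(text):
    # Assumption: lowercase and split on whitespace only.
    return text.lower().split()


class PositionalIndex:
    """Hypothetical in-memory stand-in for a (word, position) -> note ids index."""

    def __init__(self):
        # word -> position -> set of note ids
        self.postings = defaultdict(lambda: defaultdict(set))

    def add(self, note_id, text):
        for pos, word in enumerate(tokenize(text)):
            self.postings[word][pos].add(note_id)

    def search(self, query):
        words = tokenize(query)
        if not words:
            return []

        # Pass 1: notes where the words appear at consecutive positions.
        exact = set()
        for start, ids in self.postings.get(words[0], {}).items():
            hits = set(ids)
            for offset, word in enumerate(words[1:], 1):
                hits &= self.postings.get(word, {}).get(start + offset, set())
                if not hits:
                    break
            exact |= hits
        if exact:
            return sorted(exact)

        # Pass 2 (assumed fallback): notes containing every word, in any order.
        per_word = [set().union(*self.postings.get(w, {}).values()) for w in words]
        return sorted(set.intersection(*per_word))
```

Usage would look like `idx = PositionalIndex(); idx.add("n1", "The quick brown fox jumps"); idx.search("the quick brown fox")`, which returns the ids of notes containing the phrase.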
Replies (28)
Idea: limit the index to, say, the 64000 most used words (including plurals and other variations), names (first and last), materials and brands.
So keep typos and rare words out of the index; sanitizing the index makes its size much more manageable.
</suggestion>
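A rough Python sketch of that suggestion, assuming the vocabulary is derived from word frequency in the notes themselves. The function names are made up for illustration, and the curated lists of names, materials and brands mentioned above are not covered.

```python
from collections import Counter


def build_vocabulary(notes, max_words=64000):
    # Count every lowercase, whitespace-separated word across all notes.
    counts = Counter(word for text in notes for word in text.lower().split())
    # Keep only the most frequent words; typos and rare words fall out.
    return {word for word, _ in counts.most_common(max_words)}


def indexable_words(text, vocabulary):
    # Only words in the vocabulary would be written to the index.
    return [w for w in text.lower().split() if w in vocabulary]
```

The trade-off is that anything outside the vocabulary becomes unsearchable, so the cutoff would need tuning against real note data.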
Case sensitive??
oops thanks for reminding me
Full text search in 8 hours. Anyone impressed yet? In the last 3 weeks I changed a banner on the home page of the popular website.
Tshirt Idea: "Will release soon"
WEEEDSTR
If you’re looking for a quick and dirty way to add fuzzy search and stemming, try tokenizing the lowercase string into character triplets, including spaces:
[" th", "e q", "uic", "k b", "row", "n f", "ox"] and sorting by highest count of matching tokens.
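A minimal Python sketch of the triplet idea above. Note that it uses overlapping (sliding-window) triplets, the usual trigram variant, rather than the back-to-back split listed in the reply, because overlapping windows tolerate inserted or dropped characters better. The function names and the leading-space padding are assumptions.

```python
from collections import Counter


def trigrams(text):
    # Lowercase and pad with one leading space so word starts get their own token.
    padded = " " + text.lower()
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def fuzzy_rank(query, documents):
    query_grams = trigrams(query)
    scored = []
    for doc in documents:
        # Counter intersection keeps the minimum count of each shared triplet.
        overlap = sum((query_grams & trigrams(doc)).values())
        if overlap:
            scored.append((overlap, doc))
    # Highest number of matching triplets first; ties keep input order.
    scored.sort(key=lambda pair: -pair[0])
    return [doc for _, doc in scored]
```

With this, a misspelled query such as `fuzzy_rank("quik brown fox", notes)` would still surface notes containing "quick brown fox", since most triplets survive the typo.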
I was going to look into stemming/lemmatization after. Keeping it simple on the first pass
Will, release soon
Perfect. I literally implemented Elastic into my client yesterday and concluded we have to do better
Is that for damus only?
It's a feature of nostrdb, which has nothing to do with damus, but damus does use nostrdb
look at the difference between searching in tidal (has to be exact) and youtube (does not need to be exact)
Sick!! Is nostrdb out there for other clients to use already?
Hi Will! If I search for hashtag Argentina on #Damus the app closes
Feature, not a bug.
Propaganda whores and the cure.
What if?
Good catch. Will debug
Ok, it was just weird. I think there is a note with a bug.
Thank you! Just to let you know
Freaking mastermind
Some sats on their way to you!