performance - Fuzzy Problems in Solr Filter Query -


I would be grateful if someone could help me with this problem. I have the following query:

select?q=city:frankfurt am main~&fq=street:gerhart-hauptmann-str.~

This is not working for me. I want to use fuzzy search to catch user input mistakes.

Here is what I want:

  • frankfurt am main should be searched in the field city with fuzzy search.
  • gerhart-hauptmann-str. should be converted to 3 terms, each with fuzzy search.

This is the debug output I actually get:

    "debug": {
      "rawquerystring": "city:frankfurt am main~",
      "querystring": "city:frankfurt am main~",
      "parsedquery": "city:frankfurt text:am text:main~2",
      "parsedquery_toString": "city:frankfurt text:am text:main~2",
      "explain": {...},
      "QParser": "LuceneQParser",
      "filter_queries": [
        "street:gerhart-hauptmann-str.~"
      ],
      "parsed_filter_queries": [
        "street:gerhart-hauptmann-str.~2"
      ],

I (think I) want this output:

    "debug": {
      "rawquerystring": "city:frankfurt am main~",
      "querystring": "city:frankfurt am main~",
      "parsedquery": "city:frankfurt~2 city:am~2 text:main~2",
      "parsedquery_toString": "city:frankfurt~2 city:am~2 text:main~2",
      "explain": {...},
      "QParser": "LuceneQParser",
      "filter_queries": [
        "street:gerhart-hauptmann-str.~"
      ],
      "parsed_filter_queries": [
        # analyser converts str. to strasse
        "street:gerhart~2 street:hauptmann~2 strasse~2"
      ],

The definition of the fields in schema.xml:

    <field name="city" type="admin_name" indexed="true" stored="true"/>
    <field name="street" type="street_name" indexed="true" stored="true" multiValued="false"/>

    <fieldType name="admin_name" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="lang/synonyms_de_admin.txt"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
      </analyzer>
    </fieldType>

    <fieldType name="street_name" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <!-- startendsynonymfilter replaces synonyms
             at the start or end of the term. The types
             start_synonym or end_synonym are set. -->
        <filter class="my.startendsynonymfilterfactory" synonyms="lang/synonyms_de_street.txt"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
      </analyzer>
    </fieldType>

Is this somehow possible?

If you need additional information to answer, please leave a hint in a comment.

  1. Tokenizing on hyphens

Have a look at the WordDelimiterFilterFactory: https://wiki.apache.org/solr/analyzerstokenizerstokenfilters#solr.worddelimiterfilterfactory

  2. Applying fuzzy search to every single term

Disclaimer: I have not yet used fuzzy search in my Solr setups.

You might have to be careful with tokenizing city names and applying fuzzy search to every single token. For example, with "frankfurt am main" you would, in your case, apply fuzzy search to "am" as well. Please try it with parentheses, (frankfurt am main)~, and see whether that gets the intended result.
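One way to get a fuzzy operator onto each term is to build the query string on the client before sending it to Solr. The following is only a minimal sketch under a few assumptions: the helper name `fuzzy_field_query`, the `min_len` threshold (used to keep short words such as "am" exact), and the choice to split on hyphens are all illustrative and not part of Solr. It relies on the standard query parser's `field:(term1 term2)` grouping and the `term~N` fuzzy syntax. Note also that, as far as I know, fuzzy terms do not pass through the full analyzer chain, so a synonym such as str. to strasse would probably not be applied to them.

```python
import re

# Characters that have a special meaning in the Lucene/Solr query syntax.
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')

def fuzzy_field_query(field, user_input, max_edits=2, min_len=4):
    """Build e.g. city:(frankfurt~2 am main~2) from raw user input.

    Hypothetical helper: splits on whitespace and hyphens, keeps
    short terms exact, and appends ~max_edits to every other term.
    """
    parts = []
    for raw in re.split(r"[\s\-]+", user_input):
        term = raw.strip(".").lower()  # "Str." -> "str"
        if not term:
            continue
        term = _SPECIAL.sub(r"\\\1", term)  # escape query syntax characters
        if len(term) >= min_len:
            parts.append(f"{term}~{max_edits}")
        else:
            parts.append(term)  # e.g. "am" stays an exact term
    return f"{field}:({' '.join(parts)})"

print(fuzzy_field_query("city", "Frankfurt am Main"))
# city:(frankfurt~2 am main~2)
print(fuzzy_field_query("street", "gerhart-hauptmann-str."))
# street:(gerhart~2 hauptmann~2 str)
```

The resulting strings could be used as the `q` and `fq` parameters. Whether `min_len=4` is a sensible cut-off depends on the data and is worth testing.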

However, in the case of names (cities or streets), I'm not sure whether you should be tokenizing them at all. Maybe storing them as one case-insensitive token and applying fuzzy search as "frankfurt am main"~ (with quotes in the query) is what you need.

Nevertheless, you should try it and see whether it works in the way you have described. Look at the query results. And (maybe in parallel) set up an index that stores city and street names as single tokens (KeywordTokenizer with lower casing and ASCII folding, e.g.) and apply fuzzy search to them as single terms. My guess is that the results will be sharper. The best approach is to try both out and compare.
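For the single-token variant, the whole name has to reach the query parser as one term, which means whitespace and other syntax characters must be escaped with a backslash. The sketch below assumes a hypothetical field `city_exact` that is indexed with a keyword tokenizer as described above; the helper name `whole_value_fuzzy` is also made up for illustration. Keep in mind that Solr caps the fuzzy edit distance at 2, which may be tight for long names.

```python
import re

# Query syntax characters plus whitespace, all escaped with a backslash
# so the parser treats the complete value as a single term.
_SPECIAL_OR_SPACE = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/\s])')

def whole_value_fuzzy(field, value, max_edits=2):
    """Build a fuzzy query on an untokenized field (hypothetical helper)."""
    escaped = _SPECIAL_OR_SPACE.sub(r"\\\1", value.strip().lower())
    return f"{field}:{escaped}~{max_edits}"

print(whole_value_fuzzy("city_exact", "Frankfurt am Main"))
# city_exact:frankfurt\ am\ main~2
```

Running the same user inputs through this form and through the per-term form makes it easy to compare the two result sets side by side.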

In addition, I suggest you try out the (extended or not) DisMax handler for the input, without caring to differentiate between cities and streets on the input side: https://cwiki.apache.org/confluence/display/solr/the+extended+dismax+query+parser

With the DisMax handler processing the input, you can allow the user to enter search terms freely (like having a single search field in which cities and streets can be entered in any order and format).
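A single-search-box request of that kind could be assembled roughly as follows. This is a sketch, not a tested setup: `defType`, `q` and `qf` are standard eDisMax request parameters, while the helper names, the `/select` base path and the `min_len` rule for skipping short words are assumptions. User input is not escaped here, and it is assumed that eDisMax passes the `~` operator through to each field listed in `qf`.

```python
from urllib.parse import urlencode

def edismax_params(user_input, fields=("city", "street"), min_len=4):
    """Request parameters for one free-text box searched across several fields."""
    terms = user_input.split()
    # Append the fuzzy operator to longer terms only; short ones stay exact.
    q = " ".join(f"{t}~" if len(t) >= min_len else t for t in terms)
    return {"defType": "edismax", "q": q, "qf": " ".join(fields)}

def select_url(params, base="/select"):
    """URL-encode the parameters onto a (hypothetical) select handler path."""
    return f"{base}?{urlencode(params)}"

params = edismax_params("frankfurt am main")
print(params["q"])
# frankfurt~ am main~
print(select_url(params))
```

With this layout the user does not need to say which part of the input is the city and which is the street, since every term is matched against both fields.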

