ZF-623: Query terms for Keyword type fields should not be tokenized
Description
When you trying to find documents via their id or path or any other KEYWORD TYPE field that includes other characters than [A-Za-z0-9] you cannot get any results. Example:
document XY with path=/some/file.txt
query: path:"/some/file.txt" -> no results
document XY with path=abc
query: path:"abc" -> ok query: path:abc -> ok too
Comments
Posted by Lukas Zapletal (lzap) on 2006-12-06T10:03:10.000+0000
From the mailing list:
It seems the query is parsed and tokenized always. It should not parse and tokenize those fields that are marked as KEYWORDs. Is it possible to implement this? If not there could be method like findRaw -- finds documents but doesn`t analyze and tokenize the query.
Posted by Bill Karwin (bkarwin) on 2006-12-09T16:47:55.000+0000
Assigning to Alexander.
Posted by Alexander Veremyev (alexander) on 2006-12-13T18:36:26.000+0000
Query parser always uses default analyzer to tokenize or normalize terms and phrases.
Query parser is index independent, so it can't "know", which field should be tokenized.
Moreover, index doesn't store information, which field was tokenized and which wasn't.
Thus keywords containing non-alphanumeric characters can only be added to a query through API:
It's also possible to extend query language to give possibility to signal which field is a keyword field.
Ex. "bla/bla/bla" is tokenized, but 'bla/bla/bla' isn't. It looks reasonable from PHP point of view :), but I am not sure, that it's a common practice for search engines...
Posted by Adam Lundrigan (adamlundrigan) on 2012-05-05T03:21:29.000+0000
This issue is very, very old and unlikely to be implemented in ZFv1.