|
The poster of the following message is an official representative of CATS.
There are a couple of problems with the search that you referenced, "IS-Oil".
In order to provide keyword search, we need to break all data into words. This involves treating every character that is not alphanumeric (a-z, 0-9) as a word break and indexing what's in between. We've created exceptions for a few characters, such as + and @, because they appear in common searches like email addresses or C++ (a programming language). For this reason we differ from some boolean search engines in that we use AND, NOT and OR and do not use + or -.
We do not make exceptions for characters like the dot (.) or hyphen (-) because they have numerous meanings. For example, if we treated the dot as part of a word, the last word in any sentence would be indexed with its trailing period, and you'd have to use an asterisk to search for it. Take this sentence:
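To make the word-breaking idea concrete, here is a minimal sketch in Python. It is not the actual CATS engine code; the `tokenize` function and the exact character set are assumptions chosen to mirror the description above (letters, digits, + and @ are kept, everything else is a word break).

```python
import re

# Assumption: letters, digits, "+" and "@" count as word characters;
# every other character acts as a word break.
TOKEN_RE = re.compile(r"[a-z0-9+@]+")

def tokenize(text):
    """Lowercase the text and return the runs of word characters."""
    return TOKEN_RE.findall(text.lower())

print(tokenize("Senior C++ developer (5 years)"))
# ['senior', 'c++', 'developer', '5', 'years']
```

Because + survives as part of a word, "C++" stays searchable, which is also why + cannot double as a boolean operator.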
"I have experience with welding."
A search for "welding" would match no records; you'd have to search for "welding*" or "welding." instead. This is obviously not desired.
Similarly, the hyphen is used quite frequently to start bullet lists, to indicate ranges, for exclusion, etc., as these examples show:
"Experience:"
"-Welding"
"-Soldering"
"-Pipe Fitting"
"Worked March-May 2009"
"Forklift operator-9 years"
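If we treated the hyphen as part of a word, a number of searches against the above examples would break: "Welding" would not match "-Welding", "May" would not match "March-May", and "operator" would not match "operator-9".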
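The effect on the examples above can be sketched by indexing them twice, once with the hyphen as a word break and once with the hyphen kept as a word character. This is an illustration only; the two patterns and the `index_terms` helper are hypothetical, not the real indexer.

```python
import re

# Hypothetical patterns: the first breaks on hyphens, the second keeps them.
BREAK_ON_HYPHEN = re.compile(r"[a-z0-9+@]+")
KEEP_HYPHEN = re.compile(r"[a-z0-9+@-]+")

def index_terms(pattern, lines):
    """Collect the set of indexed terms for a list of text lines."""
    terms = set()
    for line in lines:
        terms.update(pattern.findall(line.lower()))
    return terms

resume = [
    "Experience:",
    "-Welding",
    "-Soldering",
    "-Pipe Fitting",
    "Worked March-May 2009",
    "Forklift operator-9 years",
]

with_break = index_terms(BREAK_ON_HYPHEN, resume)
with_hyphen = index_terms(KEEP_HYPHEN, resume)

# Each query is found when hyphens break words, and missed when they are kept.
for query in ("welding", "may", "operator"):
    print(query, query in with_break, query in with_hyphen)
```

With the hyphen kept, the index holds terms like "-welding", "march-may" and "operator-9", so the plain keywords no longer match.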
So your first problem is that the hyphen is stripped when CATS processes your search, so searching for "IS-Oil" is equivalent to searching for "Is Oil".
Secondly, our search includes a set of stopwords, which could also be described as plain words, like "is", "a", "the", etc. These are words that appear very frequently and hold little meaning. So your search for "Is Oil" is further reduced to just "Oil".
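Putting the two steps together, a rough sketch of how a query such as "IS-Oil" ends up as just "oil" might look like this. The `normalize_query` function and the three-word stopword set are assumptions for illustration; the real stopword list is longer, and boolean operators (AND, NOT, OR) are not handled here.

```python
import re

# Assumption: same word characters as described above; hyphen is a word break.
TOKEN_RE = re.compile(r"[a-z0-9+@]+")

# Illustrative subset only, taken from the examples in this post.
STOPWORDS = {"is", "a", "the"}

def normalize_query(query):
    """Split the query into words, then drop stopwords."""
    tokens = TOKEN_RE.findall(query.lower())
    return [t for t in tokens if t not in STOPWORDS]

print(normalize_query("IS-Oil"))  # ['oil']
```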
We try to employ the latest in search technologies (our engine is actually used by some pretty big sites, Craigslist for example). We're constantly tweaking it and polishing it for our particular domain. I'd say it's right where it needs to be for 95% of our users' searches; this particular search issue you're having falls into the other 5%. We'll do our best to close that margin even further as we go forward, but in the meantime certain searches like this that sit on the periphery just aren't going to be possible. |