Formalizers and Search Engines

January 8th, 2009

Search engines are possibly the most commonly used, and perhaps the oldest, type of formalizer out there, and many of us take them for granted.

A search form has one major formalizer field: the search query. The user can be quite expressive when filling in this field, while the formalizer takes care of interpreting the kinds of input it expects. Google has been a true pioneer in this kind of formalizer usage.

Surely, we all know that there is a specific format for search: we use quotation marks (“”) to group words together, logical operators such as OR and AND, and, if we are advanced users, sometimes even the plus (+) and minus (-) signs. However, there are a few things popular search engines take care of for us:

  • Stop words. Also known as noise words in some circles, they are words that are filtered out of search queries, usually since they’re too common and appear far too many times in searchable documents. These words, like “is”, “are”, “the”, “you”, “me”, and many others, are omitted from search queries automatically by the search engine, in order to provide a more relevant search result list.
  • Spelling. Google really made a difference here, both by finding spelling mistakes in your search query and suggesting a correction in the search results, and by giving other applications the means to use its spelling mechanism through its SOAP search API.
  • Contextual search. When you type a search phrase that calls for a specific type of data, Google tries to provide that data as quickly as it can, with the tools that it has. You can get stock quotes, airline schedules and word definitions simply by asking for them. Because the search engine thinks the way people do, this saves a lot of time and hassle: a simple phrase like “kilometer in mile” typed into the Google search box performs the conversion for us and displays the result at the top of the page.
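To make two of these ideas concrete, here is a minimal Python sketch of stop-word filtering and of a contextual pattern that recognizes a query like “kilometer in mile”. It is only an illustration of the concepts, not how Google implements them. The names STOP_WORDS, UNIT_FACTORS, strip_stop_words and try_unit_conversion are made up for this example, and the stop-word list contains only the five words mentioned above.

```python
import re
from typing import List, Optional

# Illustrative stop-word list (an assumption): only the examples named in
# the post. Real engines use much larger, carefully tuned lists.
STOP_WORDS = {"is", "are", "the", "you", "me"}

# Hypothetical conversion table for the contextual-search example.
UNIT_FACTORS = {
    ("kilometer", "mile"): 0.621371,
    ("mile", "kilometer"): 1.609344,
}


def strip_stop_words(query: str) -> List[str]:
    """Lowercase the query, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    return [word for word in words if word not in STOP_WORDS]


def try_unit_conversion(query: str) -> Optional[str]:
    """Answer queries shaped like '<unit> in <unit>', or return None."""
    match = re.fullmatch(r"(\w+) in (\w+)", query.strip().lower())
    if match is None:
        return None
    source, target = match.group(1), match.group(2)
    factor = UNIT_FACTORS.get((source, target))
    if factor is None:
        return None  # Not a conversion we know; fall back to normal search.
    return f"1 {source} = {factor} {target}s"


print(strip_stop_words("The stock quotes you are looking for"))
print(try_unit_conversion("kilometer in mile"))
```

A search box built this way would try the contextual patterns first and fall back to a regular keyword search, with stop words removed, whenever no pattern matches.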

What can we learn from Google?

Many applications, especially web applications, provide search capabilities to their users. With search being such a common use-case, not to mention a perfect way to learn about your users’ habits, product development managers need to make their search engines as usable and formalizer-friendly as they possibly can.

Here’s a short checklist:

  1. Auto-suggest. When users start typing search queries, try to assist them with suggestions based on common search queries.
  2. Spelling. As mentioned before, there are services for checking the spelling of search queries out there, like the one Google offers us. Therefore, there’s no need to develop it in-house, and you can literally use the best tool out there. This way, users won’t get frustrated when a misspelled query gets sent back with no results.
  3. Natural language. This is probably the most specific, and therefore the most complicated, aspect of a formalizer-friendly search engine. It requires you to define a way for the user to communicate with your application’s search engine in natural language. Try to find common search phrases and identify the patterns that occur most often, and you’ll see how you can meet your users halfway by understanding their queries better.
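The first two checklist items can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: QUERY_LOG is a made-up table of past queries and how often each was searched, and correct_word uses the standard library's difflib as a local stand-in for an external spelling service such as the one mentioned above.

```python
import difflib
from collections import Counter
from typing import List

# Hypothetical log of past queries and how many times each was searched.
QUERY_LOG = Counter({
    "stock quotes": 120,
    "stock market today": 95,
    "store hours": 40,
    "airline schedules": 30,
})

# Vocabulary for the spelling stand-in: every word seen in the query log.
VOCABULARY = sorted({word for query in QUERY_LOG for word in query.split()})


def suggest(prefix: str, limit: int = 3) -> List[str]:
    """Return the most common logged queries that start with the prefix."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    matches = [(q, n) for q, n in QUERY_LOG.items() if q.startswith(prefix)]
    # Most frequent first; ties are broken alphabetically.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [query for query, _ in matches[:limit]]


def correct_word(word: str) -> str:
    """Return the closest known word, or the word itself if none is close."""
    candidates = difflib.get_close_matches(
        word.lower(), VOCABULARY, n=1, cutoff=0.8
    )
    return candidates[0] if candidates else word


print(suggest("sto"))
print(correct_word("airlne"))
```

In a real application the suggestions would come from your own search logs, and the spelling check would more likely be delegated to a dedicated service than to a similarity match against a small word list.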

Combining these three elements creates a powerful search user experience, and immediately positions you above your competitors in terms of usability and barrier to entry.

For example: say you have a dating website, and your search box allows users to look for possible people to meet. You could provide a boring form that would require an “advanced search” option to be clicked (possibly the worst use-case a search engine could have, but more on that in a later post).

On the other hand, you can define a few search patterns that upgrade your search field into a formalizer. Such fields are not search boxes in the classical sense, and are usually preceded by a question, such as “what kind of date are you looking for?” to which the user can answer “middle-aged men in the greater LA area.”

And now, the hard work begins: your language processor should be able to understand this sentence and extract meaningful information out of it. It should understand that “middle-aged” means a certain age group, and that “the greater LA area” is a geographic region. Again, this is not the easiest of tasks, but once implemented, it goes a very long way in terms of usability.
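As a rough idea of what the simplest version of such a processor might look like, here is a pattern-based Python sketch that turns the example answer into structured search criteria. Everything in it is an assumption made for illustration: the DateQuery fields, the lookup tables, and in particular the age range assigned to “middle-aged”. A production system would need far richer vocabularies and probably a proper geographic lookup.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical lookup tables; the values are illustrative assumptions.
AGE_GROUPS = {"young": (18, 29), "middle-aged": (40, 60)}
GENDERS = {"men": "male", "man": "male", "women": "female", "woman": "female"}
REGIONS = {"greater la area": "Los Angeles metropolitan area"}


@dataclass
class DateQuery:
    """Structured criteria extracted from a free-text answer."""
    age_range: Optional[Tuple[int, int]] = None
    gender: Optional[str] = None
    region: Optional[str] = None


def parse_date_query(text: str) -> DateQuery:
    """Extract age group, gender and region from a free-text answer."""
    normalized = text.lower().strip(" .?!")
    result = DateQuery()

    # Age group: look for a known label as a whole word.
    for label, age_range in AGE_GROUPS.items():
        if re.search(rf"\b{re.escape(label)}\b", normalized):
            result.age_range = age_range
            break

    # Gender: whole-word match, so "men" is not found inside "women".
    for label, gender in GENDERS.items():
        if re.search(rf"\b{label}\b", normalized):
            result.gender = gender
            break

    # Region: whatever follows "in" (and an optional "the"), if we know it.
    match = re.search(r"\bin (?:the )?(.+)$", normalized)
    if match:
        result.region = REGIONS.get(match.group(1))

    return result


print(parse_date_query("middle-aged men in the greater LA area."))
```

Fields that cannot be recognized simply stay empty, so the application can either ask a follow-up question or fall back to a broader search instead of returning nothing.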

This, combined with a spelling service and an auto-suggest utility, makes sure the user types the words correctly, or at least minimizes the number of mistakes the user makes. That allows your formalizer mechanism to be right more often than wrong, and ultimately makes your users happier. And happy users will come back tomorrow, no doubt.
