Article 10608: In Search of Text

Article Preview

Issue 10.6 ('Retina')

FEATURE

In Search of Text

Survey of four text search algorithms

Issue: 10.6 (September/October 2012)
Author: JC Cruz
Article Length (in bytes): 46,009
Starting Page Number: 48
Article Number: 10608
Resource File(s):

project10608.zip Updated: 2012-09-04 14:23:08

Related Web Link(s):

http://java.dzone.com/articles/algorithm-week-boyer-moore
http://java.dzone.com/articles/algorithm-week-brute-force
http://java.dzone.com/articles/algorithm-week-morris-pratt
http://java.dzone.com/articles/algorithm-week-rabin-karp
http://en.wikipedia.org/wiki/Boyer-Moore_string_search_algorithm
http://en.wikipedia.org/wiki/Knuth-Morris-Pratt_algorithm
http://en.wikipedia.org/wiki/Rabin-Karp_string_search_algorithm
http://en.wikipedia.org/wiki/String_searching_algorithm

Excerpt of article text...

# Reasons to Search
A good search engine is a must for all text-oriented products. Even the simplest text viewer has to provide some rudimentary search service. Lacking the ability to search reflects poorly on the product and on its developers. It shows the developers' lack of insight, or their poor grasp of the basics.
The search engine may also serve a larger, more complex feature. For instance, it may be part of a document statistics feature. That feature would look for and count the valid words, meaning words with standard dictionary entries. Then it would count the invalid words (proper names, acronyms, and so on) and use the two counts as a measure of jargon.
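As a rough illustration of the jargon measure described above, here is a minimal Python sketch. The function name `jargon_ratio`, the tiny `DICTIONARY` set, and the choice to report invalid words as a fraction of all words are assumptions made for the example; they are not taken from the article's project.

```python
import re

# Hypothetical stand-in for a real dictionary; a product would load a full word list.
DICTIONARY = {"the", "team", "sent", "a", "report", "to", "and"}

def jargon_ratio(text, dictionary=DICTIONARY):
    """Return (valid_count, invalid_count, invalid / total) for the words in text."""
    # Treat any run of letters or apostrophes as a word.
    words = re.findall(r"[A-Za-z']+", text)
    # A word is "valid" when its lowercase form has a dictionary entry.
    valid = sum(1 for word in words if word.lower() in dictionary)
    invalid = len(words) - valid
    ratio = invalid / len(words) if words else 0.0
    return valid, invalid, ratio

valid, invalid, ratio = jargon_ratio("The team sent a report to NASA and Acme")
print(valid, invalid, round(ratio, 2))
```

Here the acronym and the proper name are the only words without dictionary entries, so they alone contribute to the invalid count.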
An XML preprocessor may also use a search engine to examine each line of text. The preprocessor would search for XML tags and replace them with actual style meta-data. It may search for and count the binary tags, then alert the user if it finds an uneven count. It may also search for and flag any tags that are malformed, misplaced, or missing.
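The tag-counting check can be sketched in a few lines of Python. This sketch assumes that "binary tags" means paired opening and closing tags, and it uses a simple regular expression rather than a real XML parser. The names `TAG` and `uneven_tags` are hypothetical.

```python
import re
from collections import Counter

# Matches <name ...>, </name>, and <name ... />. Comments and processing
# instructions are not matched because a name must start with a letter.
TAG = re.compile(r"<(/?)([A-Za-z][\w:.-]*)[^<>]*?(/?)>")

def uneven_tags(text):
    """Return {tag_name: (open_count, close_count)} for tags whose counts differ."""
    opens, closes = Counter(), Counter()
    for closing, name, self_closing in TAG.findall(text):
        if self_closing:
            continue  # <br/> style tags need no partner
        if closing:
            closes[name] += 1
        else:
            opens[name] += 1
    return {
        name: (opens[name], closes[name])
        for name in sorted(set(opens) | set(closes))
        if opens[name] != closes[name]
    }

print(uneven_tags("<p>Hello <b>world</p><br/>"))
```

Comparing counts only catches missing tags. Detecting misplaced tags would also need a stack that tracks nesting order.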
Text indexing is another feature that relies on a search engine. Here, the indexer searches for key nouns and verbs, and records their positions on the page. It uses the gathered data to construct a glossary or a table of contents.
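A minimal sketch of the position-recording step might look like the following Python. It assumes the key terms are already known and matches them case-insensitively using the language's built-in substring search. The name `index_terms` is hypothetical, and the sketch assumes text whose length does not change when lowercased.

```python
def index_terms(page_text, terms):
    """Map each term to the character offsets where it occurs in page_text."""
    haystack = page_text.lower()
    index = {}
    for term in terms:
        needle = term.lower()
        positions = []
        if needle:  # skip empty terms, which would match everywhere
            start = haystack.find(needle)
            while start != -1:
                positions.append(start)
                # Resume one character later so overlapping hits are found too.
                start = haystack.find(needle, start + 1)
        index[term] = positions
    return index

page = "Search the text. A text search finds text."
print(index_terms(page, ["text", "search"]))
```

The recorded offsets are the raw material for the glossary or table of contents. A real indexer would also need a way to decide which nouns and verbs count as key terms.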
Finally, an import/export feature may use a search service to prepare the text being processed. For example, assume the text contains Markdown tags. To prepare the text for display, an import session may search for those tags and replace them with the correct style data. It may rearrange the text visually based on the given tag. To save the text as a web page, an export may replace each Markdown tag with the right HTML tag. It may also add other tags required for the page.
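The export case can be illustrated with a small Python sketch that searches a line for a few Markdown tags and replaces them with HTML tags. It covers only headings, bold, and emphasis, does not escape HTML special characters, and is not a Markdown parser. The function names `export_inline` and `export_line` are hypothetical.

```python
import re

def export_inline(line):
    """Replace inline Markdown emphasis tags with HTML tags."""
    # Handle bold first so ** is not consumed by the single-asterisk rule.
    line = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", line)
    line = re.sub(r"\*(.+?)\*", r"<em>\1</em>", line)
    return line

def export_line(line):
    """Convert one line of Markdown text to one line of HTML."""
    heading = re.match(r"(#{1,6})\s+(.*)", line)
    if heading:
        level = len(heading.group(1))
        return f"<h{level}>{export_inline(heading.group(2))}</h{level}>"
    # Plain lines get the paragraph tag that the page requires.
    return f"<p>{export_inline(line)}</p>"

print(export_line("# Reasons to *Search*"))
print(export_line("A **good** search engine"))
```

The paragraph wrapper stands in for the "other tags required for the page". A full export would also add the enclosing document structure.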

...End of Excerpt.