
Problem Description
This is the first installment of a series of programs about words. I've included
two programs here. The first, DicMaint, introduces the dictionary structure
and provides code to maintain dictionaries. The second program, CrosswordHelper,
is a word completion program that displays a list of dictionary words matching a mask of known letters.
Background & Techniques
The first requirement for many word manipulation
problems is a dictionary. Not the kind
with definitions, just the kind with a list of valid words. The TDic object
compresses a wordlist to about half of its uncompressed size. The wordlist is maintained as a
TStringList object. The initial letters of each word in the list are replaced by a byte with the count of letters which match the preceding
word. Unused bits of this byte also flag foreign words and abbreviations.
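The prefix-count scheme described above is often called front coding. The following is a rough Python sketch of the idea only, not the author's Delphi TDic code: the function names and the in-memory (count, suffix) tuples are assumptions made for illustration, and the flag bits for foreign words and abbreviations are left out.

```python
def shared_prefix_len(a, b):
    """Count the leading characters that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def front_encode(sorted_words):
    """Replace each word's shared prefix with a count of matching letters."""
    encoded = []
    prev = ""
    for word in sorted_words:
        k = shared_prefix_len(prev, word)
        encoded.append((k, word[k:]))  # (letters shared with previous word, rest)
        prev = word
    return encoded


def front_decode(encoded):
    """Rebuild the full words from (count, suffix) pairs."""
    words = []
    prev = ""
    for k, suffix in encoded:
        word = prev[:k] + suffix
        words.append(word)
        prev = word
    return words


words = ["ant", "ante", "anthem", "knee", "kneel"]
packed = front_encode(words)
print(packed)
# [(0, 'ant'), (3, 'e'), (3, 'hem'), (0, 'knee'), (4, 'l')]
```

The saving comes from sorted neighbours sharing long prefixes, so the scheme only pays off on an alphabetized list.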
To speed processing, a letter index is maintained pointing to the first word in the list for each letter.
The SetRange method defines the beginning and ending initial letters and the
shortest and longest words to be retrieved. GetNextWord retrieves words within this range and returns false when
no more words are
available. Other methods load and save dictionaries (in compressed or uncompressed form), add and remove words, look up words, etc.
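To make the letter index and the SetRange/GetNextWord pattern concrete, here is a minimal Python sketch. It is a hypothetical stand-in, not the TDic implementation: the class name, the method signatures, and the choice to return None (where the original returns false) are all assumptions.

```python
class WordList:
    """Hypothetical stand-in for the retrieval side of a TDic-like object."""

    def __init__(self, words):
        self.words = sorted({w for w in words if w})
        # Letter index: position of the first word for each initial letter.
        self.letter_index = {}
        for i, w in enumerate(self.words):
            self.letter_index.setdefault(w[0], i)
        self._range = ("a", "z", 1, 255)
        self._pos = 0

    def set_range(self, first_letter, last_letter, min_len, max_len):
        """Define initial-letter and word-length limits, then rewind."""
        self._range = (first_letter, last_letter, min_len, max_len)
        # Use the index to jump straight to the first candidate word.
        starts = [i for c, i in self.letter_index.items() if c >= first_letter]
        self._pos = min(starts) if starts else len(self.words)

    def get_next_word(self):
        """Return the next word in range, or None when none are left."""
        _, last, min_len, max_len = self._range
        while self._pos < len(self.words):
            w = self.words[self._pos]
            self._pos += 1
            if w[0] > last:
                break  # past the last requested initial letter
            if min_len <= len(w) <= max_len:
                return w
        self._pos = len(self.words)
        return None


dic = WordList(["ante", "knee", "once", "kneel", "ox", "zebra"])
dic.set_range("k", "o", 4, 4)
found = []
while True:
    word = dic.get_next_word()
    if word is None:
        break
    found.append(word)
print(found)  # ['knee', 'once']
```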
A request to load a dictionary with an extension of .txt will scan a text
file and extract all unique words as a dictionary. A request to save
a dictionary with an extension of .txt will build an expanded word list with one
word per line.
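As an illustration of the .txt handling, the sketch below pulls the unique words out of a block of text and writes them back out one per line. It is written in Python rather than Delphi, and the rule for what counts as a word (runs of the letters a to z, folded to lower case) is an assumption, since the text above does not spell out the rule the program uses.

```python
import re


def extract_unique_words(text):
    """Return the sorted set of alphabetic words found in text."""
    return sorted(set(re.findall(r"[a-z]+", text.lower())))


def expand_to_lines(words):
    """Build an expanded word list with one word per line."""
    return "\n".join(words) + "\n"


sample = "The cat saw the other cat. The end."
unique = extract_unique_words(sample)
print(unique)  # ['cat', 'end', 'other', 'saw', 'the']
```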
Just to get us started, I've also included CrosswordHelper, a simple program using the TDic class to find all words matching a given mask.
Unknown letters are entered as _ characters. For example, using Full.dic, the
mask "_n_e" returns "ante", "knee", and "once".
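The fixed-length mask test can be sketched in a few lines. This is a Python illustration under assumed names (matches_mask and the tiny word list are hypothetical), not the code shipped with CrosswordHelper.

```python
def matches_mask(word, mask):
    """True if word has the mask's length and agrees on every known letter."""
    if len(word) != len(mask):
        return False
    return all(m == "_" or m == c for c, m in zip(word, mask))


# A tiny hypothetical word list standing in for a real dictionary.
words = ["ante", "knee", "kneel", "once", "onto"]
hits = [w for w in words if matches_mask(w, "_n_e")]
print(hits)  # ['ante', 'knee', 'once']
```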
CrosswordHelper addendum, Jan 20, 2001: I
added mask characters "?" as a synonym for "_", and
"*" to represent any number of unknown characters. Works great
to find rhyming words for you poets out there! Implementation was
simplified when I ran across the MatchesMask function included in
Delphi's Mask unit.
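For readers without Delphi at hand, the same wildcard behavior can be approximated in Python. In this sketch the standard library function fnmatchcase stands in for MatchesMask; the wrapper name and the sample words are assumptions. Note that fnmatchcase also treats square brackets as character sets.

```python
from fnmatch import fnmatchcase


def matches_wildcard(word, mask):
    """Match word against a mask where _ and ? mean one letter, * any run."""
    return fnmatchcase(word, mask.replace("_", "?"))


words = ["ante", "knee", "once", "fight", "light", "night", "lit"]
rhymes = [w for w in words if matches_wildcard(w, "*ight")]
four_letter = [w for w in words if matches_wildcard(w, "?n_e")]
print(rhymes)       # ['fight', 'light', 'night']
print(four_letter)  # ['ante', 'knee', 'once']
```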
I've put three dictionaries in a separate download file. Full.dic
contains about 60,000 words, General.dic about 16,000, and Small.dic about 1,500
words. All should be considered works in progress. Any error reports or suggestions for improvements will be appreciated.
Small.dic is duplicated with each of the source and object downloads,
so that any download should be runnable, even though you'll want to use one of
the larger dictionaries for most purposes. In general, I'd say that
for checking words, you'll want to use the largest dictionary and for
programs that generate words, you would be better served by one of the smaller
dictionaries.
Running/Exploring the Program
Suggestions for Further Explorations
My granddaughter's electronic Hangman game
claims to have an 8,000 word dictionary. It also has "categories". I'll have to
borrow it from her to check this out, but categories sound like a good
idea for that application. Perhaps a descendant of TDic, or a
special header word at the start of the dictionary, could specify that each
word has an added category byte. Category names would also be included in the
dictionary and an index of category counts built at load time (to allow random selection of
words within a category).
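One possible shape for the category idea, sketched in Python. Everything here is hypothetical: the (word, category) pairs, the function names, and the decision to index the words themselves rather than only their counts, which keeps the random pick to a single call.

```python
import random
from collections import defaultdict


def build_category_index(entries):
    """Group words by category as (word, category) pairs are loaded."""
    index = defaultdict(list)
    for word, category in entries:
        index[category].append(word)
    return index


def random_word(index, category, rng=random):
    """Pick one word at random from the given category."""
    return rng.choice(index[category])


entries = [("zebra", "animals"), ("otter", "animals"), ("tulip", "plants")]
index = build_category_index(entries)
print(len(index["animals"]))          # 2
print(random_word(index, "animals"))  # one of the two animal words
```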
Normal
Readln text file code is used to read text files that are not
compressed dictionaries. I have encountered a problem in one case
where the entire text file is a single record. The maximum record
(line) size for Readln is 255 characters. The solution is to
convert to BlockRead logic, but I decided I didn't want to
read that file anyway. I'll just put this on the back burner
along with all the other stuff I'll probably never get around to.
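The block-reading approach mentioned above might look roughly like the following. This is a Python sketch of the general technique, not a translation of any Delphi code: the function name and block size are assumptions. The main wrinkle is that a word can straddle two blocks, so a trailing partial word is carried over to the next read.

```python
import io
import re


def words_from_blocks(stream, block_size=4096):
    """Yield words from a text stream read in fixed-size blocks."""
    carry = ""
    while True:
        block = stream.read(block_size)
        if not block:
            break
        text = carry + block
        # Hold back a trailing run of letters; it may continue in the next block.
        tail = re.search(r"[A-Za-z]+$", text)
        if tail:
            carry = tail.group(0)
            text = text[:tail.start()]
        else:
            carry = ""
        yield from re.findall(r"[A-Za-z]+", text)
    if carry:
        yield carry  # the final word had no delimiter after it


# A single long "line" read four characters at a time.
stream = io.StringIO("alpha beta gamma delta")
result = list(words_from_blocks(stream, block_size=4))
print(result)  # ['alpha', 'beta', 'gamma', 'delta']
```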