Free or cheap English grammar data / libraries?

I am working on a text adventure engine, and little grammar issues come up all the time where some sort of English dictionary with metadata would be really handy. For instance, say a user creates some objects: "sword, apple, heirloom".

During the game I want to output things like:
  • "There is a sword here" (a if it starts with a consonant)
  • "There is an apple here" (an if it starts with a vowel)
  • "There is an heirloom here" (well, sometimes!)

To get the a/an correct, I either need to encode the bizarre rules of English, or, if I had a lookup table of all nouns and their appropriate articles, I could do it that way.

Anyone know of data files like this? Or maybe some C/C++ grammar libraries?
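
For reference, the "encode the rules" option could start as something like the sketch below. It works from spelling alone, with two small exception lists for silent-h words and for vowel letters that sound like consonants. The function name guessArticle and both exception lists are made up for illustration and are nowhere near complete, so treat it as a rough guess rather than a real rule set.

```cpp
#include <cctype>
#include <string>
#include <string_view>

// Lowercase copy so comparisons are case-insensitive (ASCII only).
static std::string toLowerAscii(std::string_view s) {
    std::string out(s);
    for (char& c : out) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return out;
}

static bool startsWith(std::string_view s, std::string_view prefix) {
    return s.substr(0, prefix.size()) == prefix;
}

// Hypothetical helper: guess "a" or "an" from the spelling of a noun.
// Both prefix lists are illustrative assumptions, not a complete rule set.
std::string guessArticle(std::string_view noun) {
    const std::string w = toLowerAscii(noun);
    if (w.empty()) return "a";

    // Consonant letter but vowel sound (silent h): "an heirloom", "an hour".
    static const std::string_view silentH[] = {"heir", "hour", "honest", "honor", "honour"};
    for (std::string_view p : silentH) {
        if (startsWith(w, p)) return "an";
    }

    // Vowel letter but consonant sound: "a unicorn", "a euro", "a one-eyed cat".
    static const std::string_view consonantSound[] = {"uni", "use", "eu", "one"};
    for (std::string_view p : consonantSound) {
        if (startsWith(w, p)) return "a";
    }

    // Default: decide from the first letter.
    const char c = w.front();
    const bool vowelLetter = c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u';
    return vowelLetter ? "an" : "a";
}
```

With this, guessArticle("sword") gives "a", while guessArticle("apple") and guessArticle("heirloom") give "an". Prefix matching will misfire on some words, which is exactly why a lookup table would be nicer.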

Edited by Jack Mott on
Actually, the rules for things like selecting proper articles or suffixes in English, or in any spoken language for that matter, are highly regular, as long as you're looking at the phonetic representation rather than the written one. So if you store a phonetic representation with whatever words you want to give special consideration (like creatable objects in your case), then deciding whether to use 'a' or 'an' really is as simple as inspecting the first sound of the word to determine whether it's a consonant or a vowel.
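
A minimal sketch of that idea, assuming each noun carries an IPA pronunciation stored as UTF-32 so that one array element is one symbol. The Noun struct, the function names, and the vowel list are my own placeholders, the vowel list only covers symbols common in English transcriptions, and the source file is assumed to be saved as UTF-8.

```cpp
#include <string>
#include <string_view>

// Hypothetical record: a noun plus a pronunciation supplied alongside it.
struct Noun {
    std::string spelling;
    std::u32string ipa;  // e.g. U"ˈæpəl" for "apple"
};

// True for IPA vowel symbols commonly seen in English transcriptions.
// The list is an assumption and is deliberately short.
bool isIpaVowel(char32_t c) {
    static const std::u32string_view vowels = U"aeiouæɑɒɐəɛɜɪɔʊʌ";
    return vowels.find(c) != std::u32string_view::npos;
}

// Skip stress marks and delimiters, then test the first real symbol.
bool startsWithVowelSound(std::u32string_view ipa) {
    for (char32_t c : ipa) {
        if (c == U'ˈ' || c == U'ˌ' || c == U'/' || c == U'[') continue;
        return isIpaVowel(c);
    }
    return false;  // empty pronunciation: fall back to "a"
}

// Prefix the spelling with the article chosen from the pronunciation.
std::string withArticle(const Noun& n) {
    return (startsWithVowelSound(n.ipa) ? "an " : "a ") + n.spelling;
}
```

Given Noun{"heirloom", U"ˈɛəluːm"}, withArticle returns "an heirloom", and Noun{"unicorn", U"ˈjuːnɪkɔːn"} comes out as "a unicorn" because /j/ is a consonant. The spelling never enters into the decision.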

How you decide to encode the phonemes of words is entirely up to you, but I imagine you will want it to line up to some degree with the pronunciation key used by whatever dictionary you choose as a data source. Some dictionaries use a more English-readable phonetic notation, but my advice would be to go with a source that has IPA (International Phonetic Alphabet) pronunciations available. Finding a good dictionary database is probably the hardest part of the whole endeavour.

Other than that, I'm afraid I'm not familiar with any libraries that implement the sort of functionality you're looking for.
The trick is that this is an engine, where users can create their own objects, so there is no finite set determined ahead of time.

I suppose I could compute the phonetics, and maybe that is what I need to do after all, since players could use made-up nouns!
BRB going to learn linguistics.
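
One way an engine might cope with open-ended, made-up nouns is a fallback chain: use a hint from the object's author if there is one, then a table of known words, then a plain first-letter guess. The sketch below assumes that design; ObjectDef, SoundTable, and both function names are hypothetical, and the first-letter guess is knowingly crude.

```cpp
#include <cctype>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical object definition: the author may state whether the name
// starts with a vowel sound when the engine's guess would be wrong.
struct ObjectDef {
    std::string name;
    std::optional<bool> vowelSound;
};

// Known words mapped to "starts with a vowel sound" (assumed to be
// filled from whatever dictionary data is available).
using SoundTable = std::unordered_map<std::string, bool>;

// Last resort for unknown or invented words: look at the first letter.
bool guessVowelSound(const std::string& name) {
    if (name.empty()) return false;
    const char c = static_cast<char>(
        std::tolower(static_cast<unsigned char>(name.front())));
    return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u';
}

// Author hint first, then the table, then the letter-based guess.
std::string indefiniteArticle(const ObjectDef& obj, const SoundTable& known) {
    bool vowel;
    if (obj.vowelSound) {
        vowel = *obj.vowelSound;
    } else if (auto it = known.find(obj.name); it != known.end()) {
        vowel = it->second;
    } else {
        vowel = guessVowelSound(obj.name);
    }
    return vowel ? "an" : "a";
}
```

With a table containing {"heirloom", true}, an object named "heirloom" gets "an", an invented "zorbleck" falls through to the guess and gets "a", and an author can force "an" on any name by setting vowelSound to true.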