Microsoft's format for encoding the Polish alphabet is Windows-1250. All encodings should be 100% US-ASCII compatible. Look up the word you want to add to your pronunciation dictionary in the Polish Wiktionary to find out what its IPA transcription looks like. Translate each IPA phoneme to the corresponding Arpabet phoneme. Your Polish dictionary should look like cmudict (of course, you have just a few words, but for your project that is sufficient). Poles are able to read Polish written without Polish letters, using only English letters.
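
To make the "translate the IPA phoneme to the corresponding Arpabet phoneme" step concrete, here is a minimal sketch in Python. The mapping table is a hypothetical, hand-picked approximation covering only the symbols in the two sample words, the IPA strings are simplified transcriptions typed in by hand, and skipping the stress mark is an assumption rather than an established rule. The separator between word and phonemes is a parameter because it should match whatever your copy of cmudict uses.

```python
# Hypothetical, deliberately tiny IPA -> Arpabet-style table.
# It is an approximation for illustration, not an official mapping.
IPA_TO_ARPABET = {
    "a": "AA",
    "ɛ": "EH",
    "d": "D",
    "j": "Y",
    "n": "N",
    "v": "V",
}

# Assumption: stress and syllable marks are dropped, not translated.
IGNORED = {"ˈ", "ˌ", "."}


def ipa_to_arpabet(ipa):
    """Translate an IPA string symbol by symbol into a list of phonemes."""
    phones = []
    for symbol in ipa:
        if symbol in IGNORED:
            continue
        if symbol not in IPA_TO_ARPABET:
            # Fail loudly so that unmapped sounds are not silently lost.
            raise KeyError(f"no mapping for IPA symbol {symbol!r}")
        phones.append(IPA_TO_ARPABET[symbol])
    return phones


def dictionary_line(word, ipa, separator="  "):
    """Build one cmudict-style line: word, separator, space-separated phonemes."""
    return word + separator + " ".join(ipa_to_arpabet(ipa))


print(dictionary_line("jeden", "ˈjɛdɛn"))
print(dictionary_line("dwa", "dva"))
```

This walks the IPA string one character at a time, so sounds written with several characters (affricates, a superscript j for palatalization) would need extra table entries and a longest-match lookup. The `KeyError` is there to point out exactly which symbols still need a decision.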

Regardless of the input format (hiragana/romaji), it outputs in any format you want.

Of course, the Sphinx dictionary can contain Polish special characters (be careful with the encoding; use UTF-8). At least, I could import your dictionary into simon as a Sphinx dictionary.

Should I use the same or different sentences for all of those people?

Nsh, please, have a look here:

Download cmudict and look closely at the file to find out whether they use spaces or a tab (or several tabs) between the word and the corresponding pronunciation.

Let me also repeat some questions from my previous post which have not been answered yet: Is it a good idea to create ten files with ten sets of ten words in random order (e.g.

--- (Edited on 11/20/2009 am [GMT-0600] by johnyjj2) ---

Hi johnyjj2!

It contains all letters and sounds indicated by two letters, and it also takes into account that some sounds can be written in two ways (this is a feature of Polish that only makes the orthography more difficult but doesn't change anything; in fact the spelling indicates which words share a root with other words of similar spelling, but I think this knowledge is not very useful for an ordinary speaker).

I also noticed that in some words, like piec, there is a superscript j in the IPA. You gave me an example of how to write this word piec, but you omitted this little j. (SAMPA is not so useful, I guess, because it contains non-letter characters.) From my point of view, the lack of this j completely changes how the word sounds, and it cannot be omitted. What should I do with those apostrophes like in the IPA of jeden (one) and those superscript j's (like in five, pięć)?

--- (Edited on 11/16/2009 pm [GMT-0600] by johnyjj2) ---

You see that US-ASCII isn't sufficient to capture the details of the Polish language.
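
Two of the points above, which separator cmudict uses and why UTF-8 is needed, can be checked with a few lines of Python. This is only a sketch: the sample dictionary lines are typed by hand for illustration (they are not copied from a real cmudict file), and the helper names `detect_separator` and `describe` are made up for this example.

```python
import re


def detect_separator(line):
    """Return the whitespace run between the word and its pronunciation."""
    match = re.match(r"\S+(\s+)\S", line)
    if match is None:
        raise ValueError("line has no word/pronunciation separator")
    return match.group(1)


def describe(separator):
    """Describe a separator in words, e.g. '2 space(s)' or '1 tab(s)'."""
    if set(separator) == {"\t"}:
        return f"{len(separator)} tab(s)"
    if set(separator) == {" "}:
        return f"{len(separator)} space(s)"
    return "mixed whitespace"


# Hand-written sample lines; replace them with lines read from your own file.
samples = ["jeden  Y EH D EH N", "dwa\tD V AA"]
for line in samples:
    print(line.split()[0], "->", describe(detect_separator(line)))

# Polish letters survive a round trip through UTF-8 and Windows-1250 (cp1250),
# but plain US-ASCII cannot represent them at all.
word = "pięć"
print(len(word.encode("utf-8")), "bytes in UTF-8")
print(len(word.encode("cp1250")), "bytes in Windows-1250")
try:
    word.encode("ascii")
except UnicodeEncodeError as error:
    print("US-ASCII fails:", error.reason)
```

To inspect a real dictionary, open it with `open(path, encoding="utf-8")` and pass the first few lines to `detect_separator`; your own Polish entries should then use the same separator and be saved as UTF-8.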

