Extraction of Translation Equivalents from Parallel Corpora

In the past, much effort was devoted to the compilation of multilingual parallel corpora for the purpose of linguistic information retrieval. This paper aims to introduce and evaluate three simple strategies for the extraction of translation equivalents from structured parallel texts. The goal is to support the production of bilingual dictionaries for domain-specific applications. The approaches described in the paper assume sentence alignment, strict translations, and historical relations between considered language pairs. They take advantage of corpus characteristics like short aligned units and structural and orthographic similarities in order to obtain results with a high level of precision. Furthermore, it will be shown that automatic filtering can be used to improve the precision of the extracted material. Simple techniques are used to detect translation candidates that are most likely wrong.
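To make the idea of exploiting orthographic similarity in short aligned units more concrete, the following is a minimal sketch and not a reproduction of the three strategies evaluated in the paper. It assumes whitespace tokenisation, uses the ratio from Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure, and applies a simple length-ratio filter to discard candidates that are most likely wrong. The function names `similarity`, `extract_candidates` and `filter_candidates`, as well as both threshold values, are hypothetical choices made for illustration only.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (assumed stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def extract_candidates(aligned_units, threshold=0.6):
    """For each aligned (source, target) unit, pair every source word with its
    most similar target word and keep the pair if it reaches the threshold.
    The threshold of 0.6 is an illustrative assumption, not a value from the paper."""
    candidates = []
    for src_unit, tgt_unit in aligned_units:
        tgt_words = tgt_unit.split()
        if not tgt_words:
            continue
        for src_word in src_unit.split():
            best = max(tgt_words, key=lambda w: similarity(src_word, w))
            score = similarity(src_word, best)
            if score >= threshold:
                candidates.append((src_word, best, score))
    return candidates


def filter_candidates(candidates, max_length_ratio=2.0):
    """Drop candidates whose word lengths differ too much (hypothetical filter)."""
    kept = []
    for src, tgt, score in candidates:
        longer = max(len(src), len(tgt))
        shorter = min(len(src), len(tgt))
        if shorter > 0 and longer / shorter <= max_length_ratio:
            kept.append((src, tgt, score))
    return kept


if __name__ == "__main__":
    # Toy aligned units; real input would come from a sentence-aligned corpus.
    units = [("motor", "Motor"), ("electric motor", "elektrischer Motor")]
    for src, tgt, score in filter_candidates(extract_candidates(units)):
        print(f"{src}\t{tgt}\t{score:.2f}")
```

A similarity-based approach of this kind only makes sense under the assumption stated above, namely that the languages are historically related and therefore share cognates; for unrelated language pairs a different signal would be needed.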