
Arabic WordNet with plural words

Stack Overflow user
Asked 2018-02-12 05:40:15
Answers: 2, views: 240, followers: 0, votes: 2

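I am using Arabic WordNet with C# to get the synonyms of a singular word such as "عرض", and I get the following synonyms: علامة, أمارة, شدة, ضر, شؤم, بلية, etc.

My question: is there a way to get the synonyms of a plural word, such as "علامات", from Arabic WordNet?

I need this because I have not found a way to extract the singular from a plural word in Arabic, for example "علامات" => "علامة".

I appreciate any help you can provide.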

2 Answers

Stack Overflow user

Accepted answer

Posted 2018-02-16 21:43:05

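I solved this by editing the awn.xml file and adding all the plural words I needed. For example, the word "عرض" has the plural "أعراض", whose synonyms are علامات, أمارات, شدائد, بلايا, أضرار.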

<wordnet version="20">
<item itemid="&gt;aArad_n1AR" offset="102231120" lexfile="" name="أعراض" type="synset" headword="" POS="n" source="" gloss="" authorshipid="1" />
<authorship author="ali" date="20150215" score="" comment="" covering="1" authorshipid="1" />
<item itemid="&gt;aMrad_n1AR" offset="102231121" lexfile="" name="أمراض" type="synset" headword="" POS="n" source="" gloss="" authorshipid="2" />
<authorship author="ali" date="20150215" score="" comment="" covering="1" authorshipid="1" />
<item itemid="&gt;Isteqsa'at" offset="102231121" lexfile="" name="استقصاءات" type="synset" headword="" POS="n" source="" gloss="" authorshipid="3" />
<authorship author="ali" date="20150215" score="" comment="" covering="1" authorshipid="1" />

Then add the synonyms as follows:

<authorship author="ali" date="20180215" score="" comment="From suggested word" covering="0" authorshipid="12136" />
<word wordid="&lt;aArad_n1AR" value="أعراض" synsetid="&gt;aArad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أعراض" wordid="&lt;aArad_n1AR" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;$araat" value="إشارات" synsetid="&gt;aArad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أعراض" wordid="&lt;$araat" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;Alamat" value="علامات" synsetid="&gt;aArad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أعراض" wordid="&lt;Alamat" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;$adaed" value="شدائد" synsetid="&gt;aArad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أعراض" wordid="&lt;$adaed" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;adrar" value="أضرار" synsetid="&gt;aArad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أعراض" wordid="&lt;adrar" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;balaya" value="بلايا" synsetid="&gt;aArad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أعراض" wordid="&lt;balaya" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;tawar'a" value="طوارئ" synsetid="&gt;aArad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أعراض" wordid="&lt;tawar'a" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;fawajea" value="فواجع" synsetid="&gt;aArad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أعراض" wordid="&lt;fawajea" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;fawadeh" value="فوادح" synsetid="&gt;aArad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أعراض" wordid="&lt;fawadeh" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;kawareth" value="كوارث" synsetid="&gt;aArad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أعراض" wordid="&lt;kawareth" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;mehan" value="محن" synsetid="&gt;aArad_n1AR"  type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أعراض" wordid="&lt;mehan" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;makrohat" value="مكروهات" synsetid="&gt;aArad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أعراض" wordid="&lt;makrohat" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;masaeb" value="مصائب" synsetid="&gt;aArad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أعراض" wordid="&lt;masaeb" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;masawea" value="مساوئ" synsetid="&gt;aArad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أعراض" wordid="&lt;masawea" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;Elal" value="علل" synsetid="&gt;aMrad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أمراض" wordid="&lt;Elal" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;Ellat" value="علات" synsetid="&gt;aMrad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أمراض" wordid="&lt;Ellat" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;Eatilalat" value="اعتلالات" synsetid="&gt;aMrad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أمراض" wordid="&lt;Eatilalat" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;Da'aat" value="داءات" synsetid="&gt;aMrad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أمراض" wordid="&lt;Da'aat" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;waakat" value="وعكات" synsetid="&gt;aMrad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أمراض" wordid="&lt;waakat" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;askaam" value="أسقام" synsetid="&gt;aMrad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أمراض" wordid="&lt;askaam" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;$akawa" value="شكاوى" synsetid="&gt;aMrad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أمراض" wordid="&lt;$akawa" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;aMrad_n1AR" value="أمراض" synsetid="&gt;aMrad_n1AR" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="أمراض" wordid="&lt;aMrad_n1AR" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;Fohosat" value="فحوصات" synsetid="&gt;Isteqsa'at" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="استقصاءات" wordid="&lt;Fohosat" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;Taharieat" value="تحريات" synsetid="&gt;Isteqsa'at" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="استقصاءات" wordid="&lt;Taharieat" type="brokenPlural" authorshipid="12137" />
<word wordid="&lt;Isteqsa'at" value="استقصاءات" synsetid="&gt;Isteqsa'at" type="brokenPlural" frequency="" corpus="" authorshipid="12137" />
<form value="استقصاءات" wordid="&lt;Isteqsa'at" type="brokenPlural" authorshipid="12137" />

Now, when we run the following code snippet:

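        // Collect the synonyms of a plural word: value -> word ids -> synset -> member words.
        List<string> wordIds = _awn.Get_List_Word_Id_From_Value("علامات");
        List<string> synonyms = new List<string>();
        if (wordIds != null)
        {
            foreach (string wordId in wordIds)
            {
                // Look up the synset of this word id, then every word id in that synset.
                string synsetId = _awn.Get_Synset_ID_From_Word_Id(wordId);
                List<string> memberIds = _awn.Get_List_Word_Id_From_Synset_ID(synsetId);
                if (memberIds == null)
                    continue;
                foreach (string memberId in memberIds)
                {
                    // Add each member's value once.
                    string value = _awn.Get_Word_Value_From_Word_Id(memberId);
                    if (!synonyms.Contains(value))
                        synonyms.Add(value);
                }
            }
        }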

the synonyms list contains the following words: "علل", "علات", "اعتلالات", "داءات", "وعكات", "أسقام", "شكاوى". They are the plurals of the synonyms of the word "عرض".
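For readers who do not have the `_awn` wrapper at hand, the same lookup can be sketched directly against the XML with LINQ to XML. This is only an illustration, not the Arabic WordNet library's API: `FindSynonyms` and `sampleXml` are hypothetical names, the sample is a trimmed copy of a few entries in the shape shown above (wrapped in a closed root element so it parses), and unlike the snippet above it leaves the queried word out of the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Trimmed, hypothetical sample in the same shape as the entries added to awn.xml.
const string sampleXml = @"<wordnet version=""20"">
  <item itemid=""&gt;aArad_n1AR"" name=""أعراض"" type=""synset"" POS=""n"" />
  <item itemid=""&gt;aMrad_n1AR"" name=""أمراض"" type=""synset"" POS=""n"" />
  <word wordid=""&lt;aArad_n1AR"" value=""أعراض"" synsetid=""&gt;aArad_n1AR"" type=""brokenPlural"" />
  <word wordid=""&lt;Alamat"" value=""علامات"" synsetid=""&gt;aArad_n1AR"" type=""brokenPlural"" />
  <word wordid=""&lt;$adaed"" value=""شدائد"" synsetid=""&gt;aArad_n1AR"" type=""brokenPlural"" />
  <word wordid=""&lt;aMrad_n1AR"" value=""أمراض"" synsetid=""&gt;aMrad_n1AR"" type=""brokenPlural"" />
  <word wordid=""&lt;Elal"" value=""علل"" synsetid=""&gt;aMrad_n1AR"" type=""brokenPlural"" />
</wordnet>";

var doc = XDocument.Parse(sampleXml);
foreach (string synonym in FindSynonyms(doc, "علامات"))
    Console.WriteLine(synonym);

// Hypothetical helper: returns the other members of every synset that contains `value`.
static List<string> FindSynonyms(XDocument document, string value)
{
    var words = document.Descendants("word").ToList();

    // Step 1: synset ids of all <word> entries whose value matches the query.
    var synsetIds = words
        .Where(w => (w.Attribute("value")?.Value ?? "") == value)
        .Select(w => w.Attribute("synsetid")?.Value ?? "")
        .Distinct()
        .ToList();

    // Step 2: values of all words in those synsets, minus the query word itself.
    return words
        .Where(w => synsetIds.Contains(w.Attribute("synsetid")?.Value ?? ""))
        .Select(w => w.Attribute("value")?.Value ?? "")
        .Where(v => v != value)
        .Distinct()
        .ToList();
}
```

The lookup matches on the `value` attribute of `<word>` only; whether `<form>` entries should also be searched depends on how the real file is used, which the answer does not say.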

Votes: 1

Stack Overflow user

Posted 2018-06-05 06:40:23

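If you want to get the singular from a plural, you can use any available morphological analyzer, such as "ALKhalil", which is an open-source Java project. Note that this only gets the singular of a plural, not the other way around.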
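Where a full analyzer is not an option, a much cruder fallback is to guess singular candidates and keep the first one that the lexicon knows. The sketch below is not ALKhalil's API and is no substitute for it: `SingularCandidates` and `FirstKnownSingular` are hypothetical helpers, the lexicon is a hand-made stand-in for the singular words already in Arabic WordNet, and only the sound feminine plural suffix "ات" is handled. Broken plurals such as "أعراض" are not covered and still need a dictionary or an analyzer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in lexicon (assumption): singular words known to the WordNet file.
var lexicon = new HashSet<string> { "علامة", "استقصاء" };

foreach (string plural in new[] { "علامات", "استقصاءات", "أعراض" })
{
    string singular = FirstKnownSingular(plural, lexicon);
    Console.WriteLine(singular == "" ? plural + " => (not found)" : plural + " => " + singular);
}

// Hypothetical helper: proposes singular forms for a sound feminine plural ending in "ات".
static IEnumerable<string> SingularCandidates(string plural)
{
    const string suffix = "ات";
    if (plural.Length > suffix.Length + 1 && plural.EndsWith(suffix, StringComparison.Ordinal))
    {
        string stem = plural.Substring(0, plural.Length - suffix.Length);
        yield return stem + "ة"; // stem plus ta marbuta
        yield return stem;        // bare stem
    }
}

// Hypothetical helper: first candidate found in the lexicon, or "" when none matches.
static string FirstKnownSingular(string plural, ISet<string> known)
{
    return SingularCandidates(plural).FirstOrDefault(c => known.Contains(c)) ?? "";
}
```

Checking candidates against the lexicon keeps the heuristic from inventing words, at the cost of missing any singular that is not already in the file.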

Votes: 0

Original content from Stack Overflow.
Source:

https://stackoverflow.com/questions/48736755
