|
The WebCrow is a research project carried out at the Information Engineering of the University of Siena with the purpose of automatically solving crosswords. ==The Project== The scientific relevance of the project can be understood considering that cracking crosswords requires human-level knowledge. Unlike chess and related games and there is no closed world configuration space. Interestingly, a first nucleus of technology, such as search engines, information retrieval, and machine learning techniques enable computers to enfold with semantics real-life concepts. The project is based on a software system whose major assumption is to attack crosswords making use of the Web as its primary source of knowledge. WebCrow is very fast and often thrashes human challengers in competitions,〔G.Angelini, M. Ernandes, E. Di Iorio, "(WebCrow: Previous competitions )"〕 especially on multi language crossword schemes. A distinct feature of the WebCrow software system is to combine properly natural language processing (NLP) techniques, the Google web search engine, and constraint satisfaction algorithms from artificial intelligence to acquire knowledge and to fill the schema. The most important component of WebCrow is the Web Search Module (WSM), which implements a domain specific web based question answering algorithm. The way WebCrow approaches crosswords solving is quite with respect to humans:〔John S. Quarterman,"(Google as AI )"〕 Whereas we tend to first answer clues we are sure of and then proceed filling the schema by exploiting the already answered clues as hints, WebCrow uses two clearly distinct stages. In the first one, it processes all the clues and tries to answer them all: For each clue it finds many possible candidates and sorts them according to complex ranking models mainly based on a probability criteria. In the second stage, WebCrow uses constraint satisfaction algorithms to fill the grid with the overall most likely combination of clue answers. In order to interact with Google, first of all, WebCrow needs to compose queries on the basis of the given clues. This is done by query expansion, whose purpose is to convert the clue into a query expressed by a simplified and more appropriate language for Google. The retrieved documents are parsed so as to extract a list of word candidates that are congruent with the crossword length constraints. Crosswords can hardly be faced by using encyclopedic knowledge only, since many clues are wordplays or are otherwise purposefully very ambiguous. This enigmatic component of crosswords is faced by a massive use of database of solved crosswords, and by automatic reasoning on a properly organized knowledge base of wired rules. Last but not the least, the final constraint satisfaction step is very effective to fill the correct candidate, even though, unlike humans, the system can not rely on very high confidence on the correctness of the answer. 抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「WebCrow」の詳細全文を読む スポンサード リンク
|