Material Detail
SOFIE: Self-Organizing Flexible Information Extraction
This video was recorded at World Wide Web (WWW) Conference, Madrid 2009. This paper presents SOFIE, a system that can extend an existing ontology by new facts. SOFIE provides a integrative framework, in which information extraction, word disambiguation and semantic reasoning all become part of one unifying model. SOFIE processes text or Web sources and finds meaningful patterns. It maps the words in the pattern to entities in the ontology. It hypothesizes on the meaning of the pattern, and checks the semantic plausibility of the hypothesis with the existing ontology. Then the new fact is added to the ontology, avoiding inconsistency with the existing facts. The logical model that connects existing facts, new hypotheses, extraction patterns, and consistency constraints is represented as a set of propositional clauses. We use an approximation algorithm for the Weighted MAX SAT problem to compute the most plausible subset of hypotheses. Thereby, the SOFIE framework integrates the paradigms of pattern matching, entity disambiguation, and ontological reasoning into one unified model, and enables the automated growth of large ontologies. Experiments, using the YAGO ontology as existing knowledge and various text and Web corpora as input sources, show that our method yields very good precision around 90 percent or higher.
Quality
- User Rating
- Comments
- Learning Exercises
- Bookmark Collections
- Course ePortfolios
- Accessibility Info