
A. F. Salam and Jason R. Stevens

"Semantic Web Technologies and E-Business: Toward the Integrated Virtual Organization and Business Process Automation"

With this approach, the grammars are
generated manually, and written patterns are discovered by a human expert who analyzes
a corpus of text documents from the domain. This becomes quite labor-intensive
as the size, number, and stylistic variety of these training texts grow (Appelt &
Israel, 1999).
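
As a rough illustration of the knowledge engineering approach, the following Python sketch shows the kind of hand-written pattern an expert might produce after reading a corpus. The domain (executive appointments), the pattern, and the names PATTERNS and extract are hypothetical and chosen only for this example; they are not taken from any particular IE system.

```python
import re

# Hypothetical hand-written rule for a toy "executive appointment" domain.
# A human expert would write and refine patterns like this one by
# reading example documents; nothing here is learned from data.
PATTERNS = {
    "appointment": re.compile(
        r"(?P<person>[A-Z][a-z]+ [A-Z][a-z]+) was (?:named|appointed) "
        r"(?P<role>[a-z ]+?) of "
        r"(?P<org>[A-Z][A-Za-z]+(?: [A-Z][A-Za-z]+)*)"
    ),
}


def extract(text):
    """Return one dict per pattern match, tagged with the pattern name."""
    results = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            results.append({"type": name, **match.groupdict()})
    return results


if __name__ == "__main__":
    print(extract("Jane Smith was named chief executive of Acme Corp."))
```

Every new phrasing ("took over as", "will succeed") needs another hand-written pattern, which is one way to see why the effort grows with the stylistic variety of the texts.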
Unlike the knowledge engineering approach, the automatic training approach does
not require computer experts who know how IE systems work or how to write rules.
A subject matter expert annotates the training corpus. Corpus statistics or rules are
then derived automatically from the training data and used to process novel data.
Since this technique requires large volumes of training data, finding enough training
data can be difficult (Appelt & Israel, 1999; Manning & Schutze, 2002). Research
using this approach includes Neus, Castell, and Martín (2003).
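
The automatic training workflow described above (annotate a corpus, derive statistics, apply them to novel data) can be sketched in a few lines of Python. This is a deliberately minimal baseline under stated assumptions: the tiny annotated corpus, the label set (PER, ORG, O), and the function names train and tag are all hypothetical, and real systems use far richer statistical models.

```python
from collections import Counter, defaultdict

# Hypothetical annotated corpus: a subject matter expert has labeled
# each token as a person (PER), an organization (ORG), or other (O).
TRAINING = [
    [("Acme", "ORG"), ("hired", "O"), ("Jane", "PER"), ("Smith", "PER")],
    [("Jane", "PER"), ("joined", "O"), ("Globex", "ORG")],
]


def train(corpus):
    """Derive corpus statistics: the most frequent label for each token."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        for token, label in sentence:
            counts[token][label] += 1
    return {token: c.most_common(1)[0][0] for token, c in counts.items()}


def tag(model, tokens, default="O"):
    """Label novel text; tokens never seen in training get the default."""
    return [(token, model.get(token, default)) for token in tokens]


if __name__ == "__main__":
    model = train(TRAINING)
    print(tag(model, ["Globex", "hired", "Jane", "today"]))
```

Because any token absent from the training data falls back to the default label, coverage depends directly on how much annotated text is available, which is the data requirement noted above.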
Advanced research in information extraction appears in journals and conferences run
by several AI and NLP organizations, such as the Message Understanding Conference (MUC), the Association for Computational
Linguistics (ACL) (www.aclweb.org/), the International Joint Conference
on Artificial Intelligence (IJCAI) (http://ijcai.

