US Patent:
20020083068, Jun 27, 2002
Inventors:
Dallan Quass - Elk Ridge UT, US
Randy Waki - Orem UT, US
Fernando Pereira - Philadelphia PA, US
International Classification:
G06F007/00
Abstract:
A system and method are provided for accessing targeted information concealed behind electronic forms by identifying the forms, determining which of the identified forms to fill out, and determining how to populate the fields of those forms. Electronic content that might contain electronic forms is subjected to a series of transformations culminating in an object model that exposes the existence of any electronic forms in the content, the logical structure of the fields in those forms (including features such as descriptive labels that may assist in the interpretation of the fields), and a mechanism for recording how to populate the fields. A collection of classifiers and their support components, whose composition is largely determined by the specific information being sought and whose implementation may employ techniques from the field of machine learning, is applied to features exposed by the transformations in general and the object model in particular, to make decisions about which forms to fill out, how to populate form fields, and how to cause forms to be submitted. The decisions are then applied to the object model to electronically populate the forms in a number of combinations likely to retrieve the information being sought.
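The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `FormExtractor`, `Form`, `Field`, `VALUE_HINTS`, `populate`, `submissions`, and `plan`, as well as the sample page, are all hypothetical, and a plain keyword lookup stands in for the classifiers, which the abstract says may instead use machine learning. The sketch assumes HTML input in which each `<label for=...>` precedes its field.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from itertools import product


@dataclass
class Field:
    name: str
    label: str = ""                                  # descriptive label, if any
    candidates: list = field(default_factory=list)   # recorded values to try


@dataclass
class Form:
    action: str
    fields: list = field(default_factory=list)


class FormExtractor(HTMLParser):
    """Transforms HTML into a small object model of forms and their fields."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._labels = {}        # field id -> label text seen so far
        self._label_for = None   # id targeted by the <label> being read

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form":
            self.forms.append(Form(action=a.get("action", "")))
        elif tag == "label":
            self._label_for = a.get("for")
        elif tag == "input" and self.forms and a.get("name"):
            label = self._labels.get(a.get("id"), "")
            self.forms[-1].fields.append(Field(a["name"], label))

    def handle_data(self, data):
        if self._label_for and data.strip():
            self._labels[self._label_for] = data.strip()

    def handle_endtag(self, tag):
        if tag == "label":
            self._label_for = None


# Hypothetical stand-in for a trained classifier: keyword -> values to try.
VALUE_HINTS = {"title": ["engineer", "nurse"], "city": ["Denver", "Boston"]}


def populate(form):
    """Record candidate values on each field whose label or name matches."""
    for f in form.fields:
        text = f"{f.label} {f.name}".lower()
        for keyword, values in VALUE_HINTS.items():
            if keyword in text:
                f.candidates = list(values)
                break
    return form


def submissions(form):
    """Enumerate every combination of candidate values for the form."""
    filled = [f for f in form.fields if f.candidates]
    names = [f.name for f in filled]
    return [dict(zip(names, combo))
            for combo in product(*(f.candidates for f in filled))]


def plan(html):
    """Map each form worth filling out to its list of submissions."""
    parser = FormExtractor()
    parser.feed(html)
    result = {}
    for form in map(populate, parser.forms):
        if any(f.candidates for f in form.fields):   # decide which forms to fill
            result[form.action] = submissions(form)
    return result


PAGE = """
<form action="/search">
  <label for="t">Job title</label><input id="t" name="q">
  <label for="c">City</label><input id="c" name="loc">
  <input type="submit" value="Go">
</form>
<form action="/login">
  <label for="u">User</label><input id="u" name="user">
</form>
"""
```

Calling `plan(PAGE)` on the sample page selects only the `/search` form, because none of the login form's labels match a hint, and returns four field assignments for it (two titles times two cities). Actually submitting the forms is left out of the sketch.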