Autonomous document classification for business

With the continuing exponential growth of the Internet and the more recent growth of business Intranets, the commercial world is becoming increasingly aware of the problem of electronic information overload. This has encouraged interest in developing agents, or softbots, that act as electronic personal assistants and that develop and adapt representations of users' information needs, commonly known as profiles. As a result of collaborative research with Friends of the Earth, a leading environmental campaigning organization, we have developed a general-purpose information classification agent architecture and are applying it to the problem of document classification and routing. Collaboration with Friends of the Earth allows us to test our ideas in a non-academic context involving high volumes of documents. We use genetic programming (GP) (Koza & Rice 1992) to evolve classifying agents. This is a novel approach to document classification, in which each agent evolves a parse-tree representation of a user's particular information need. The other unusual features of our research are the longevity of our agents and the fact that they undergo a continual training process: feedback from the user enables the agent to adapt to the user's long-term information requirements.
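To make the idea of a parse-tree profile concrete, the following is a minimal sketch in Python, not the architecture described above. It assumes a hypothetical function set of Boolean nodes (Term, Not, And, Or) over the words of a document, and a hypothetical fitness measure equal to the fraction of user-labelled documents that a tree classifies correctly. The sample documents and labels are invented for illustration; the actual function set, fitness measure, and evolutionary operators may differ.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical node types for a Boolean parse tree (an assumption for this sketch).
@dataclass(frozen=True)
class Term:
    word: str


@dataclass(frozen=True)
class Not:
    child: "Node"


@dataclass(frozen=True)
class And:
    left: "Node"
    right: "Node"


@dataclass(frozen=True)
class Or:
    left: "Node"
    right: "Node"


Node = Union[Term, Not, And, Or]


def tokenize(text: str) -> set:
    """Lower-case the text and split it on whitespace into a set of words."""
    return set(text.lower().split())


def matches(tree: Node, words: set) -> bool:
    """Evaluate the parse tree against the set of words in one document."""
    if isinstance(tree, Term):
        return tree.word in words
    if isinstance(tree, Not):
        return not matches(tree.child, words)
    if isinstance(tree, And):
        return matches(tree.left, words) and matches(tree.right, words)
    if isinstance(tree, Or):
        return matches(tree.left, words) or matches(tree.right, words)
    raise TypeError(f"unknown node type: {type(tree).__name__}")


def fitness(tree: Node, feedback: list) -> float:
    """Fraction of user-labelled documents that the tree classifies correctly."""
    correct = sum(
        matches(tree, tokenize(text)) == relevant for text, relevant in feedback
    )
    return correct / len(feedback)


# Invented user feedback: (document text, did the user mark it relevant?).
feedback = [
    ("Rainforest logging threatens wildlife", True),
    ("New logging framework for Java released", False),
    ("Quarterly sales figures", False),
    ("Rainforest protection campaign launched", True),
]

# Two hand-written candidate profiles; a GP run would generate and vary these.
candidate_a = Term("logging")
candidate_b = And(Term("rainforest"), Not(Term("java")))

# Selection step: keep the candidate that agrees best with the feedback.
best = max([candidate_a, candidate_b], key=lambda t: fitness(t, feedback))
```

In a full GP system the candidates would be produced by crossover and mutation of subtrees rather than written by hand, and the feedback list would grow as the user continues to rate documents, so that selection keeps tracking the user's long-term requirements.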