Next generation CiteSeer

CiteSeer, a computer and information science search engine and digital library, has been a radical departure in scientific document access and analysis. With nearly 700,000 documents, it sometimes receives two million page views a day, making it one of the most popular document access engines in science. CiteSeer is also portable, having been extended to e-business (eBizSearch) and, more recently, to academic business documents (SMEALSearch). CiteSeer is based on two features: actively acquiring new documents, and automatically tagging and linking the metadata inherent in an academic document's syntactic structure. Why is CiteSeer so popular? We discuss this question, along with methods for providing new tagged metadata such as institutions and acknowledgements, new data resources and services, and the issues that arise in automation. We then discuss the next generation of CiteSeer.