Intelligence obtained by applying data mining to a database of French theses on the subject of Brazil

The subject of Brazil was analyzed in the French database DocTheses for the years 1969-1999. Data mining was used to obtain intelligence and infer knowledge. The objective was to identify indicators concerning: the occurrence of theses by subject area; thesis supervisors identified with certain subject areas; the geographical distribution of the cities hosting the institutions where the theses were defended; and the frequency of each subject area over the period in which the theses were defended. The data mining technique is divided into stages that run from identification of the problem-object, through selection and preparation of the data, to analysis of the data. Infotrans was used to clean the DocTheses database, and Dataview was used to prepare the data. It should be pointed out that the knowledge extracted is directly proportional to the value and validity of the information contained in the database. The results of the analysis were illustrated using the assumptions of Zipf's Law in bibliometrics, classifying the information as trivial, interesting or 'noise' according to its frequency distribution. It is concluded that data mining combined with specialist software is a powerful ally of competitive intelligence at all levels of the decision-making process, including the macro level, since it can support the consolidation, investment and development of actions and policies.
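The frequency-based classification described above can be illustrated with a minimal sketch in Python. It is not the procedure implemented in Infotrans or Dataview: the function name, the two thresholds (`trivial_share`, `noise_max_count`) and the sample subject terms are assumptions made purely for illustration, and the study's actual cut-off points are not stated here.

```python
from collections import Counter


def classify_by_frequency(terms, trivial_share=0.1, noise_max_count=1):
    """Split terms into 'trivial', 'interesting' and 'noise' zones.

    Assumed rule (hypothetical thresholds, not taken from the study):
    - terms occurring at most `noise_max_count` times are 'noise';
    - the top `trivial_share` of ranked terms (at least one) are 'trivial';
    - everything in between is 'interesting'.
    """
    counts = Counter(terms)
    ranked = counts.most_common()  # (term, count), highest count first
    n_trivial = max(1, int(len(ranked) * trivial_share))

    zones = {"trivial": [], "interesting": [], "noise": []}
    for rank, (term, count) in enumerate(ranked):
        if count <= noise_max_count:
            zones["noise"].append(term)
        elif rank < n_trivial:
            zones["trivial"].append(term)
        else:
            zones["interesting"].append(term)
    return zones


# Invented subject-area labels, one per thesis record (illustrative only).
sample_terms = (
    ["sociology"] * 5 + ["economics"] * 3 + ["geography"] * 2 + ["linguistics"]
)
zones = classify_by_frequency(sample_terms)
```

With the invented sample, the single most frequent subject area falls in the trivial zone, the subject area that occurs only once is treated as noise, and the remaining two are flagged as interesting. In practice the thresholds would need to be tuned to the observed rank-frequency curve.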