The prognosis of patients with Stage I and II non-small cell lung cancer (NSCLC) can be estimated but cannot be definitively ascertained using current clinicopathologic criteria and tumor marker studies. The potential value of probabilistic neural networks (NNs) with genetic algorithms and of multivariate logistic regression for predicting the survival of NSCLC patients has not been previously evaluated. Multiple prognostic factors (age, sex, cell type, stage, tumor grade, smoking history, and immunoreactivity to c-erbB-3, bcl-2, Glut1, Glut3, retinoblastoma gene, and p53) were correlated with 5-year survival in 63 patients with Stage I or II NSCLC treated solely by surgical excision at Baylor Medical College, Houston, Texas. Several probabilistic NN models with genetic algorithms were developed using the prognostic features as input neurons and survival at 5 years (free of disease/dead of disease) as the output neurons. The probabilistic NNs yielded excellent classification rates for the dependent variable, survival: the best model was trained with 52 cases and classified all 11 "unknown" test cases correctly. Several statistically significant logistic regression models were fitted using 50 cases to build the models and 13 cases as "hold-out" test cases. These multivariate statistical models provide various cutoff values that predict/classify the probability of survival at 5 years. In conclusion, probabilistic NNs and logistic regression models can be useful in estimating the prognosis of patients with Stage I and II NSCLC using multiple clinicopathologic and molecular variables. These multivariate predictive models need to be validated with much larger groups of patients to assess their potential clinical value.
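
For readers unfamiliar with the technique, the following is a minimal sketch of how a basic probabilistic NN classifies a case: each training case acts as a Gaussian kernel centered on its feature vector, the kernel activations are averaged per class, and the class with the larger average is chosen. This is an illustration under stated assumptions, not the models reported here: the function name `pnn_predict`, the single smoothing parameter `sigma`, and the synthetic data are all hypothetical, and no genetic algorithm is included.

```python
import numpy as np


def pnn_predict(X_train, y_train, X_test, sigma=1.0):
    """Classify test rows with a basic probabilistic NN (Parzen-window classifier).

    Hypothetical helper for illustration; `sigma` is an assumed single
    smoothing parameter shared by all input features.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    X_test = np.asarray(X_test, dtype=float)
    classes = np.unique(y_train)

    # Pattern layer: squared Euclidean distance from every test case to
    # every training case, shape (n_test, n_train).
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))

    # Summation layer: mean kernel activation per class.
    scores = np.stack(
        [kernel[:, y_train == c].mean(axis=1) for c in classes], axis=1
    )

    # Decision layer: pick the class with the largest score.
    return classes[scores.argmax(axis=1)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in data only (63 cases x 12 features); NOT the study data.
    X = rng.normal(size=(63, 12))
    y = rng.integers(0, 2, size=63)
    X[y == 1] += 1.5  # shift one class so the toy problem is learnable

    # Mirror the 52-case training / 11-case test split described above.
    X_train, y_train = X[:52], y[:52]
    X_test, y_test = X[52:], y[52:]

    pred = pnn_predict(X_train, y_train, X_test, sigma=1.0)
    print("hold-out accuracy:", (pred == y_test).mean())
```

In a sketch like this, the smoothing parameter is the main quantity to tune; a search procedure such as a genetic algorithm could plausibly be used for that purpose, although the role it played in the models described above is not specified here.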