Predicting pKa values of substituted phenols from atomic charges


S. Geidl, R. Svobodová Vařeková, O. Skřehota, M. Kudera, C. M. Ionescu, D. Sehnal, T. Bouchal, and J. Koča


National Centre for Biomolecular Research, Faculty of Science, Masaryk University,

Kamenice 5, 625 00 Brno-Bohunice, Czech Republic


The acid dissociation constant pKa is a fundamental property of organic molecules that determines the degree of dissociation at a given pH. Dissociation constants are of interest in chemical, biological, environmental and pharmaceutical research because important physicochemical properties such as lipophilicity, solubility and permeability are all pKa-dependent. For these reasons, there is strong interest in developing reliable methods for pKa prediction.
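The link between pKa and the degree of dissociation can be illustrated with the Henderson-Hasselbalch relation for a monoprotic acid. The following is a minimal sketch, not part of the original work: the function name `fraction_dissociated` is hypothetical, and the pKa of about 10 used for unsubstituted phenol is an approximate textbook value.

```python
def fraction_dissociated(pka: float, ph: float) -> float:
    """Fraction of a monoprotic acid HA present as A- at a given pH.

    Follows from the Henderson-Hasselbalch relation
    pH = pKa + log10([A-]/[HA]), which gives
    [A-] / ([HA] + [A-]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Assumption: pKa of unsubstituted phenol taken as roughly 10.
PHENOL_PKA = 10.0

# At pH equal to pKa, exactly half of the molecules are dissociated.
half = fraction_dissociated(PHENOL_PKA, 10.0)

# At pH 7, three pH units below the pKa, only about 0.1 % is dissociated.
near_neutral = fraction_dissociated(PHENOL_PKA, 7.0)
```

Because the dissociated fraction changes tenfold per pH unit far from the pKa, even a modest error in a predicted pKa shifts the predicted ionization state noticeably.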

Numerous methods based on different approaches have been developed [1], including Linear Free Energy Relationships (LFER), database methods, decision tree methods, quantum mechanical simulations and QSPR models. Nevertheless, pKa remains one of the most challenging physicochemical properties to predict.

Partial atomic charges have proven to be very successful descriptors for pKa prediction with QSPR models [2]. Until recently, their use was limited by the high computational cost of calculating them quantum mechanically. Growing computing power has made charges far more accessible and, thus, very attractive for pKa prediction.

Partial atomic charges can be calculated using a variety of quantum mechanical methods (AM1, PM3, HF, MP2, DFT functionals, etc.), population analyses (Mulliken, ESP, NPA, etc.) and basis sets. The way the charges are calculated strongly influences their correlation with pKa [3, 4]. We have therefore evaluated different computational strategies and models for predicting the pKa values of substituted phenols from partial atomic charges. Partial atomic charges for 143 phenol molecules were calculated using more than 70 combinations of theory level, basis set and population analysis. The correlations between pKa and various atomic charge descriptors were examined, and the best descriptors were selected for designing the QSPR models. Finally, the accuracy of all these models was analyzed, and the influence of theory level, basis set and population analysis on model quality was evaluated.
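As an illustration of how a charge-based QSPR model can be built and scored, the sketch below fits pKa as a linear function of a single charge descriptor by ordinary least squares and reports R². It is only a sketch under stated assumptions: the choice of the phenolic oxygen charge as the descriptor, the function names, and all numerical values are hypothetical placeholders, not data or models from this work.

```python
def fit_line(q, pka):
    """Ordinary least squares fit of pka = slope * q + intercept."""
    n = len(q)
    mean_q = sum(q) / n
    mean_pka = sum(pka) / n
    s_qq = sum((x - mean_q) ** 2 for x in q)
    s_qp = sum((x - mean_q) * (y - mean_pka) for x, y in zip(q, pka))
    slope = s_qp / s_qq
    intercept = mean_pka - slope * mean_q
    return slope, intercept


def r_squared(q, pka, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_pka = sum(pka) / len(pka)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(q, pka))
    ss_tot = sum((y - mean_pka) ** 2 for y in pka)
    return 1.0 - ss_res / ss_tot


# Made-up placeholder data: charges on the phenolic oxygen (in e) and
# pKa values generated from an arbitrary exact line, NOT measured values.
q_oxygen = [-0.70, -0.68, -0.66, -0.64, -0.62]
pka_values = [-40.0 * x - 18.0 for x in q_oxygen]

slope, intercept = fit_line(q_oxygen, pka_values)
r2 = r_squared(q_oxygen, pka_values, slope, intercept)
```

Repeating such a fit for the charges obtained from each combination of theory level, basis set and population analysis, and comparing the resulting R² values, is one simple way to rank charge calculation schemes; models with several charge descriptors would use multiple linear regression instead.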


1.     A. C. Lee, G. M. Crippen: Predicting pKa. J. Chem. Inf. Model., 49 (2009), 2013-2033.

2.     M. J. Citra: Estimating the pKa of phenols, carboxylic acids and alcohols from semi-empirical quantum chemical methods. Chemosphere, 1 (1999), 191-206.

3.     K. C. Gross, P. G. Seybold, C. M. Hadad: Comparison of different atomic charge schemes for predicting pKa variations in substituted anilines and phenols. Int. J. Quantum Chem., 90 (2002), 445-458.

4.     W. C. Kreye, P. G. Seybold: Correlations between quantum chemical indices and the pKas of a diverse set of organic phenols. Int. J. Quantum Chem., 109 (2009), 3679-3684.