TY - JOUR
T1 - Artificial intelligence may offer insight into factors determining individual TSH level
AU - Santhanam, Prasanna
AU - Nath, Tanmay
AU - Mohammad, Faiz Khan
AU - Ahima, Rexford S.
PY - 2020
Y1 - 2020
N2 - The factors that determine serum thyrotropin (TSH) levels have been examined through different methods, using different covariates. However, machine learning methods have so far not been applied to population databases such as NHANES (National Health and Nutrition Examination Survey) to predict TSH. In this study, we performed a comparative analysis of different machine learning methods, including linear regression, random forest, support vector machine, multilayer perceptron and stacking regression, to predict TSH and to classify individuals with normal, low and high TSH levels. We considered free T4, anti-TPO antibodies, T3, body mass index (BMI), age and ethnicity as the predictor variables. A total of 9818 subjects were included in this comparative analysis. We used the coefficient of determination (r2) to compare the results for predicting TSH and show that random forest, gradient boosting and stacking regression perform equally well in predicting TSH, achieving the highest r2 value of 0.13, with a mean absolute error of 0.78. Moreover, we found that anti-TPO is the most important feature in predicting TSH, followed by age, BMI, T3 and free T4, in the regression analysis. In classifying TSH into normal, high or low levels, our comparative analysis also shows that random forest performs best. We found the following areas under the curve (AUC): low TSH, AUC = 0.61; normal TSH, AUC = 0.61; and elevated TSH, AUC = 0.69. Additionally, we found that anti-TPO was the most important feature in classifying TSH. We suggest that artificial intelligence and machine learning methods might offer insight into the complex hypothalamic-pituitary-thyroid axis and may be an invaluable tool for guiding appropriate therapeutic decisions (thyroid hormone dosing) for the individual patient.
AB - The factors that determine serum thyrotropin (TSH) levels have been examined through different methods, using different covariates. However, machine learning methods have so far not been applied to population databases such as NHANES (National Health and Nutrition Examination Survey) to predict TSH. In this study, we performed a comparative analysis of different machine learning methods, including linear regression, random forest, support vector machine, multilayer perceptron and stacking regression, to predict TSH and to classify individuals with normal, low and high TSH levels. We considered free T4, anti-TPO antibodies, T3, body mass index (BMI), age and ethnicity as the predictor variables. A total of 9818 subjects were included in this comparative analysis. We used the coefficient of determination (r2) to compare the results for predicting TSH and show that random forest, gradient boosting and stacking regression perform equally well in predicting TSH, achieving the highest r2 value of 0.13, with a mean absolute error of 0.78. Moreover, we found that anti-TPO is the most important feature in predicting TSH, followed by age, BMI, T3 and free T4, in the regression analysis. In classifying TSH into normal, high or low levels, our comparative analysis also shows that random forest performs best. We found the following areas under the curve (AUC): low TSH, AUC = 0.61; normal TSH, AUC = 0.61; and elevated TSH, AUC = 0.69. Additionally, we found that anti-TPO was the most important feature in classifying TSH. We suggest that artificial intelligence and machine learning methods might offer insight into the complex hypothalamic-pituitary-thyroid axis and may be an invaluable tool for guiding appropriate therapeutic decisions (thyroid hormone dosing) for the individual patient.
UR - http://www.scopus.com/inward/record.url?scp=85085158249&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85085158249&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0233336
DO - 10.1371/journal.pone.0233336
M3 - Article
C2 - 32433694
AN - SCOPUS:85085158249
SN - 1932-6203
VL - 15
SP - e0233336
JO - PLoS One
JF - PLoS One
IS - 5
ER -