Epivigila is a Chilean integrated epidemiological surveillance system with more than 17,000,000 Chilean patient records, making it an essential and unique source of information for the quantitative and qualitative analysis of the COVID-19 pandemic in Chile. Nevertheless, given the volume of data managed by Epivigila, it is difficult for health professionals to classify the records and determine which symptoms and comorbidities are related to infected patients. This paper compares machine learning techniques (such as support-vector machine, decision tree, and random forest) for determining whether a patient has COVID-19 based on the symptoms and comorbidities reported by Epivigila. From the group of patients with COVID-19, we selected a 10% sample of confirmed patients to execute and evaluate the techniques. We used precision, recall, accuracy, F1-score, and AUC to compare the techniques. The results suggest that the support-vector machine performs better than decision tree and random forest in terms of recall, accuracy, F1-score, and AUC. Machine learning techniques help process and classify large volumes of data more efficiently and effectively, speeding up healthcare decision making.
This study is published in the International Journal of Environmental Research and Public Health.
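As an illustration of the five comparison metrics named above, the following minimal sketch computes precision, recall, accuracy, F1-score, and AUC for a binary classifier in plain Python. The function names `binary_metrics` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example; they are not taken from the study or from Epivigila data.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, accuracy and F1 for binary labels (1 = COVID-19 positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs)
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up values, not Epivigila records).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, y_score)
print(metrics, auc)
```

Precision, recall, accuracy, and F1-score depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the metrics are commonly reported together when comparing classifiers.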