In this book, you will implement
two data science projects using Scikit-Learn, Scipy, and other libraries with
Python GUI.
In chapter 1, you will learn how to use Scikit-Learn, SVM, NumPy, Pandas, and other libraries to perform how to predict early stage diabetes using Early Stage Diabetes Risk Prediction Dataset. This dataset contains the sign and symptom data of newly diabetic or would be diabetic patient. This has been collected using direct questionnaires from the patients of Sylhet Diabetes Hospital in Sylhet, Bangladesh and approved by a doctor. The dataset consist of total 15 features and one target variable named class. Age: Age in years ranging from (20years to 65 years); Gender: Male / Female; Polyuria: Yes / No; Polydipsia: Yes/ No; Sudden weight loss: Yes/ No; Weakness: Yes/ No; Polyphagia: Yes/ No; Genital Thrush: Yes/ No; Visual blurring: Yes/ No; Itching: Yes/ No; Irritability: Yes/No; Delayed healing: Yes/ No; Partial Paresis: Yes/ No; Muscle stiffness: yes/ No; Alopecia: Yes/ No; Obesity: Yes/ No; This dataset contains the sign and symptpom data of newly diabetic or would be diabetic patient. This has been collected using direct questionnaires from the patients of Sylhet Diabetes Hospital in Sylhet, Bangladesh and approved by a doctor. You will develop a GUI using PyQt5 to plot distribution of features, feature importance, cross validation score, and prediced values versus true values. The machine learning models used in this project are Adaboost, Random Forest, Gradient Boosting, Logistic Regression, and Support Vector Machine.
In chapter 2, you will learn how to use Scikit-Learn, NumPy, Pandas, and other libraries to perform how to analyze and predict breast cancer using Breast Cancer Prediction Dataset. Worldwide, breast cancer is the most common type of cancer in women and the second highest in terms of mortality rates.Diagnosis of breast cancer is performed when an abnormal lump is found (from self-examination or x-ray) or a tiny speck of calcium is seen (on an x-ray). After a suspicious lump is found, the doctor will conduct a diagnosis to determine whether it is cancerous and, if so, whether it has spread to other parts of the body. This breast cancer dataset was obtained from the University of Wisconsin Hospitals, Madison from Dr. William H. Wolberg. You will develop a GUI using PyQt5 to plot distribution of features, pairwise relationship, test scores, prediced values versus true values, confusion matrix, and decision boundary. The machine learning models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, and Support Vector Machine.
No comments:
Post a Comment