Topic: Data Analytics on X
In this project, your task is to use data analytics and AI techniques to explore a dataset and present
your results. I will use the wine data as an example in the following section. You need to select
your own dataset for analysis. Many useful datasets for real-life projects can be accessed from
https://www.kaggle.com/datasets.
Example: Data Analytics on Wine Dataset.
The dataset provides 12 input variables, including:
1 - fixed acidity
2 - volatile acidity
3 - citric acid
4 - residual sugar
Task:
● Product (i.e. source code)
● Final project report
○ Excluding references (11-point font size, 1.5 line spacing).
○ Turnitin - no plagiarism
The chosen dataset:
https://www.kaggle.com/datasets/aniruddhawankhede/mental-heath-analysis-among-teenagers
Report contents (Finish 5-7 highlights only)
1. Introduction
2. Background
3. Data overview: distribution of variables, outliers (box plot)
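For the data overview and outlier step, a minimal sketch of the conventional box-plot rule (values beyond 1.5 × IQR from the quartiles are flagged) could look like the following. The variable name `sleep_hours` and its values are made up for illustration and are not taken from the chosen dataset; the helper names `summarize` and `iqr_outliers` are hypothetical.

```python
import statistics


def summarize(values):
    """Return basic descriptive statistics for one numeric variable."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Illustrative values only (NOT from the Kaggle dataset).
sleep_hours = [6.5, 7.0, 7.5, 8.0, 6.0, 7.0, 8.5, 7.5, 2.0, 13.0]

print(summarize(sleep_hours))
print("Outliers:", iqr_outliers(sleep_hours))
```

In the real report the same check would be run on each numeric column, and the flagged points should match what the box plot draws beyond its whiskers.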
4. Relationships between different characteristics
(1) Differences in social media use time, exercise time and sleep time by gender (Gender) (chi-square test)
(2) Correlation and regression analyses:
○ What is the relationship between time spent on social media and academic performance?
○ What is the effect of exercise time on stress levels?
○ What is the effect of support systems on academic performance and stress levels?
○ What is the relationship between screen time and psychological stress?
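The two kinds of analysis in section 4 can be sketched with SciPy as below. This is only an illustration under stated assumptions: all numbers are invented, not taken from the chosen dataset. Because a chi-square test works on counts of categories, the sketch assumes the time variables have first been binned (for example into low / medium / high) before being cross-tabulated against gender.

```python
from scipy.stats import chi2_contingency, linregress

# (1) Chi-square test of independence.
# Hypothetical contingency table (NOT real data):
# rows = two gender groups, columns = binned social media use (low / medium / high).
observed = [
    [30, 45, 25],
    [20, 40, 40],
]
chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p_value:.4f}")

# (2) Correlation and simple linear regression.
# Hypothetical paired observations (NOT real data).
exercise_hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
stress_score = [8.0, 7.5, 6.0, 6.5, 4.0, 3.5]

fit = linregress(exercise_hours, stress_score)
# fit.rvalue is the Pearson correlation coefficient;
# fit.slope is the estimated change in stress per extra hour of exercise.
print(f"r = {fit.rvalue:.3f}, slope = {fit.slope:.3f}, "
      f"intercept = {fit.intercept:.3f}, p = {fit.pvalue:.4f}")
```

A small p-value in the first test would suggest that the usage category is not independent of gender; in the second, the sign of the slope gives the direction of the relationship, and the p-value indicates whether it is distinguishable from zero.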
5.1 Random forest (training and testing)
Importance of the above features in predicting the stress indices: which feature has the greatest effect on stress, for self-perceived stress vs. instrumental measurements? (Survey: confusion matrix / wearable: mean squared error / importance of 2 features)
5.2 Evaluation of the model (split the data into training and test sets to check the accuracy of the model)
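Sections 5.1 and 5.2 could be sketched with scikit-learn roughly as follows. This is a sketch under assumptions, not the project's actual code: the data are synthetic, the feature names are hypothetical, the survey-based stress is treated as a class label (evaluated with a confusion matrix) and the wearable-based stress as a continuous value (evaluated with mean squared error).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.metrics import accuracy_score, confusion_matrix, mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 300

# Hypothetical feature names; the real column names may differ.
feature_names = ["social_media_hours", "exercise_hours",
                 "sleep_hours", "screen_time_hours"]
X = rng.uniform(0, 10, size=(n, len(feature_names)))

# Synthetic targets for illustration only (NOT the real dataset).
survey_stress = (X[:, 0] - X[:, 2] + rng.normal(0, 1, n) > 0).astype(int)
wearable_stress = 0.6 * X[:, 3] - 0.4 * X[:, 1] + rng.normal(0, 0.5, n)

# 5.2: hold out 20% of the rows as a test set.
(X_train, X_test,
 y_cls_train, y_cls_test,
 y_reg_train, y_reg_test) = train_test_split(
    X, survey_stress, wearable_stress, test_size=0.2, random_state=0)

# Survey stress: classification, evaluated with a confusion matrix.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_cls_train)
cls_pred = clf.predict(X_test)
cm = confusion_matrix(y_cls_test, cls_pred, labels=[0, 1])
acc = accuracy_score(y_cls_test, cls_pred)

# Wearable stress: regression, evaluated with mean squared error.
reg = RandomForestRegressor(n_estimators=100, random_state=0)
reg.fit(X_train, y_reg_train)
mse = mean_squared_error(y_reg_test, reg.predict(X_test))

print("Confusion matrix:\n", cm)
print(f"Accuracy: {acc:.3f}   MSE: {mse:.3f}")

# 5.1: impurity-based feature importances for each model.
for name, imp_c, imp_r in zip(feature_names,
                              clf.feature_importances_,
                              reg.feature_importances_):
    print(f"{name:20s} survey: {imp_c:.3f}   wearable: {imp_r:.3f}")
```

Comparing the two importance columns is one way to see whether the same features drive self-reported and instrumentally measured stress. Impurity-based importances can be biased toward high-cardinality features, so they are best read as a rough ranking.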
6. Visualization (already in the code)
7. Conclusion