Hello, I'm Xiaoyu Zhang
I am a third-year undergraduate student majoring in Data Science and Big Data Analytics at Southern University of Science and Technology. This website presents my academic background, coursework, and project portfolio.
About Me
I am currently a junior at Southern University of Science and Technology, majoring in Data Science and Big Data Analytics. My academic interests include statistics, data analysis, machine learning, and data-driven problem solving.
Through coursework and project-based practice, I have developed experience in Python, SQL, data cleaning, exploratory data analysis, statistical modeling, and academic report writing. I am especially interested in applying data science methods to real-world problems.
Skills
Programming
- Python (pandas, NumPy, matplotlib, scikit-learn)
- SQL (MySQL, PostgreSQL)
- PySpark for distributed data processing
- Basic HTML/CSS for web presentation
Data Analysis
- Exploratory Data Analysis (EDA)
- Statistical modeling and hypothesis testing
- Survival analysis (Kaplan-Meier, Cox PH, AFT)
- Data visualization and academic reporting
Tools & Frameworks
- Jupyter Notebook for data analysis workflows
- GitHub for version control and project hosting
- GitHub Pages for static site deployment
- LaTeX for academic report writing
Projects
Q2 Survival Analysis Reproduction
A reproduction of the Databricks survival-analysis workflow in PySpark, adapted to run in a local notebook. The project covers Bronze/Silver table construction, Kaplan-Meier analysis, Cox proportional hazards (Cox PH) and accelerated failure time (AFT) models, and customer lifetime value (CLV) estimation.
Read the full report →
Text-to-SQL Failure Analysis
A structured evaluation of GPT-based SQL generation under ambiguity, schema grounding errors, dirty data, and complex aggregation settings.
See examples →
Contact
Email: 15838710256@163.com
GitHub: github.com/enhhhj
Feel free to contact me for academic communication, coursework discussion, or project collaboration.