Hello, I'm Xiaoyu Zhang

I am a third-year undergraduate student majoring in Data Science and Big Data Analytics at Southern University of Science and Technology. This website presents my academic background, coursework, and project portfolio.

About Me

My academic interests include statistics, data analysis, machine learning, and data-driven problem solving.

Through coursework and project-based practice, I have developed experience in Python, SQL, data cleaning, exploratory data analysis, statistical modeling, and academic report writing. I am especially interested in applying data science methods to real-world problems.

Skills

Programming

  • Python (pandas, NumPy, matplotlib, scikit-learn)
  • SQL (MySQL, PostgreSQL)
  • PySpark for distributed data processing
  • Basic HTML/CSS for web presentation

Data Analysis

  • Exploratory Data Analysis (EDA)
  • Statistical modeling and hypothesis testing
  • Survival analysis (Kaplan-Meier, Cox PH, AFT)
  • Data visualization and academic reporting

Tools & Frameworks

  • Jupyter Notebook for data analysis workflows
  • GitHub for version control and project hosting
  • GitHub Pages for static site deployment
  • LaTeX for academic report writing

Projects

Q2 Survival Analysis Reproduction

A reproduction of the Databricks survival-analysis workflow in PySpark, adapted to run in a local notebook. The project covers Bronze/Silver table construction, Kaplan-Meier analysis, Cox proportional hazards (Cox PH) and accelerated failure time (AFT) models, and customer lifetime value (CLV) estimation.

Read the full report →
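
For readers unfamiliar with the Kaplan-Meier step mentioned above, the estimator can be sketched in a few lines of plain Python. This is only an illustration and is not the project's code: the function name `kaplan_meier` and the toy durations are assumptions made for this example, and the actual workflow runs on PySpark tables.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: observed time for each subject
    events:    1 if the event was observed at that time, 0 if censored
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    # Number of observed events at each time.
    deaths = Counter(t for t, e in zip(durations, events) if e)
    # Every subject (event or censored) leaves the risk set at its own time.
    exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit update: S(t) = S(t-) * (1 - d / n_at_risk)
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Toy data, hypothetical: five subjects, two of them censored.
durations = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(durations, events)
print([(t, round(s, 3)) for t, s in curve])
# → [(1, 0.8), (2, 0.6), (3, 0.3)]
```

Censored subjects never lower the curve directly; they only shrink the risk set for later event times, which is why the subject censored at time 2 still counts in the denominator there.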

Text-to-SQL Failure Analysis

A structured evaluation of where GPT-based SQL generation fails, covering four settings: ambiguity, schema grounding errors, dirty data, and complex aggregation.

See examples →

Contact

Email: 15838710256@163.com

GitHub: github.com/enhhhj

Feel free to contact me about academic exchange, coursework discussion, or project collaboration.