Project Experience

PyHealth Open-Source Contribution | Python, PyHealth | GitHub PR
  • Dataset Integration: Integrated the SUPPORT2 dataset (dataset class, YAML config, documentation, test data) with a test suite following PyHealth standards, expanding dataset coverage for open-source clinical ML research
  • Task Development: Implemented a task class with feature extraction across six clinical feature groups; added a demo illustrating the complete workflow from dataset loading to ML-ready samples

Deep Learning Research Reproduction: Survival Analysis | Python, PyTorch, scikit-learn | report | slides | GitHub
  • Framework Replication: Reproduced 15 core functionalities of the Auton-Survival framework, integrating survival regression, patient phenotyping, and evaluation metrics for censored time-to-event data in healthcare
  • Ablation & Validation: Conducted two ablation studies (mixture components, architecture depth) and a cross-dataset validation; clarified how design choices affect performance and identified feature-dependency limitations in cross-domain transfer

Academic Data Analytics Platform | Python, Dash Plotly, MySQL, MongoDB, Neo4j, PythonAnywhere | website | GitHub
  • Full-Stack Development: Developed a web-based analytics dashboard for prospective graduate applicants to explore academic programs, compare universities, and identify prominent researchers through interactive visualizations
  • Cloud Infrastructure: Deployed a multi-database architecture using Aiven (MySQL), MongoDB Atlas, and Neo4j Aura as data stores; ensured continuous availability via PythonAnywhere hosting and GitHub Actions–based keep-alive scheduling