Hi, I'm Xiangke

Thinking will not overcome fear but action will.

MLB Hall of Fame Prediction

Machine Learning, Predictive, Baseball, XGBoost

MLB - Hall of Fame Notes: The details of my implementation, without much code. This document is written for a technical audience. For more code and logic details, please refer to mlb-data_challen...

A Crash in Causal Inference

Causal Inference, DID, RDD, Panel Data

A Crash In Causal Inference I learned causal inference in my master's program and found it interesting and frequently used in the business world. I am pretty interested in it and decided to take anothe...

Window Functions

MySQL, Database

Window Functions: row_number() / rank() / dense_rank(); percent_rank() / cume_dist(); lag() / lead(); first_value() / last_value(); nth_value() / ntile(). There is a frame window. Sometimes yo...

A/B Testing Common Questions

A/B Testing

A/B Testing Common Questions - Original post time: 2020/02/09 - Update time: 2020/03/22 - Latest update time: 2020/03/31 General Questions General Questions You Should Be Aware Of: Wha...

Yelp Review Analysis

Text Mining, Tableau, Pyspark

Yelp Review Analysis This is our exploratory journey through the Yelp dataset. The main focus is topic modeling of the reviews. Our team has Yi Zhu, Yili Yu, Boyang Wei, Lin Xu, Xiangke Chen and Yusha Wan...

Recruit Restaurant Visitor Forecasting

Time Series, Machine Learning, Python

*GitHub Detailed Code*: Link 1. Project Definition & Introduction When someone opens a restaurant, their focus is likely on making high-quality food that will make their customers happy. Howev...

Frequently Asked Statistics Questions

Data Analysis, Tips, Statistics, FAQ

Update time: - 02/03/2020 original post - 02/18/2020 update linear regression knowledge Conceptual Questions Prediction vs. Forecasting Forecasting would be a subset of prediction. Any time you ...

SQL Summary

MySQL, Database

Update history: - 2020/02/09 original post - 2020/02/14 update time functions Short Answers Join Table by WHERE vs. by ON link The ON clause defines the relationship between the tables. The WH...

Data Analysis Resource

Data Analysis

INTRO Here I want to summarize some important resources that are useful for data analytics or data science. Data science/analytics is usually a mix of data engineering, programming, stats .et...