What I do
Here is an overview of my technical skills.
Data Engineering / Science
- Python
- Big data (Spark, Hadoop)
- Data pipelines (Airflow)
- Databases (SQL, NoSQL)
- API
- Web scraping
- Data visualization
- Machine learning
- Natural language processing
Cloud Engineering
- Linux (Bash)
- Amazon Web Services
- Google Cloud Platform
- Heroku
Backend Engineering
- Django
- Ruby on Rails
- Express.js
Frontend Engineering
- Bootstrap
- JavaScript
Featured Projects
Obama VS Trump Twitter Analysis
In this project, I scraped President Obama's and President Trump's tweets and analyzed them using natural language processing (NLP).
Published: 2018-04-01
Y Combinator Startups Analysis
In this project, I scraped and analyzed 1,317 Y Combinator startups and shared my insights.
Published: 2017-10-24
Latest Blog Posts
Apache Spark VS Pandas VS Koalas
Apache Spark is an open-source unified analytics engine for large-scale data processing. Pandas is a fast, powerful, flexible, and easy-to-use open-source data analysis and manipulation tool built on top of the Python programming language. Koalas implements the Pandas API on top of Apache Spark.