Amber (Biying) Tan

I'm a highly motivated Data Scientist who aims to apply machine learning/AI and statistical analysis to solve real-world problems and help customers achieve more.

Vancouver, Canada

Work Experience

Data Scientist II, Electronic Arts (EA), Jul 2017 - Present

  • Apply quantitative analysis and machine learning in Python to detect in-game exploits, fraud and other security threats in Hockey Ultimate Team (NHL HUT)
  • Built hybrid classification models with Random Forest and Isolation Forest to detect fraudulent accounts and anomalous behaviors, including coin farming/selling, credit card fraud and match cheating, helping reduce fraud losses by millions of dollars annually
  • Build dashboards in Tableau and SAS to monitor the health of the game economy, analyze data anomalies and communicate findings to the team

Data Scientist, PHEMI Systems, Aug 2016 - Jun 2017

  • Built the PHEMI Data Science Toolkit with Spark and D3.js to perform descriptive analysis and visualize data insights
  • Developed text mining pipelines in Spark to analyze large unstructured datasets with Latent Dirichlet Allocation (LDA), text classification and sentiment analysis
  • Created training materials for the client on machine learning models, natural language processing and recommender systems

Data Analyst, Move Inc, Jan 2016 - Aug 2016

  • Improved the accessibility of multi-sourced data across the team by designing and building a high-performance data warehouse on AWS to ingest data from various data sources
  • Reduced the data and content development time by more than 50% by developing an ETL pipeline to streamline the data transformation, aggregation and reporting process
  • Built dashboards in Tableau to extract and visualize key trends impacting residential real estate

Data Analyst, SingTel, Jun 2014 - Nov 2015

  • Improved user engagement for Newsloop (Top 5 News App in Singapore) by 89% by developing a real-time personalized push notification system using topic modeling with Latent Dirichlet Allocation (LDA)
  • Optimized the game acquisition strategy by building a classification model with Logistic Regression to predict viral games with 85% accuracy
  • Supported Games and News Divisions’ data needs by developing ETL and reporting pipelines
  • Defined metrics and built dashboards to gain insights into product and user behaviors using Tableau

Education

Master of Science, Data Management & Analytics, Information Systems

Singapore Management University, 2012 - 2014

Specialized in Social Network Mining and Natural Language Processing

Bachelor of Engineering, Computer Science

Sichuan University, 2008 - 2012

Academic Scholarship for three consecutive years, awarded to the top 5% of students for academic achievement

Skills

  • Machine Learning: Classification, Clustering, Regression, Fraud Detection, Natural Language Processing, Recommender Systems, Social Network Mining
  • Statistics: Probability, Distributions, Hypothesis Testing, A/B Testing, Regression, Time Series Analysis
  • Programming languages: Python (scikit-learn, pandas, NumPy), Scala, R, Java
  • Databases: MySQL, Amazon Redshift, SQL Server
  • Big Data: Spark (MLlib), Hive, Hadoop, MapReduce
  • Visualization: Tableau, Python (Matplotlib, Seaborn), SAS, D3.js 


Publications

  • Biying Tan, et al. "Clairvoyant-Push: A Real-Time News Personalized Push Notifier using Topic Modeling And Social Scoring for Enhanced Reader Engagement." IEEE International Conference on Big Data, 2015
  • Biying Tan, et al. "Online Community Transition Detection." Web-Age Information Management (WAIM2014). Springer International Publishing, 2014. 633-644 
  • Du Juan, Biying Tan, et al. "Social Listening for Customer Acquisition." Social Informatics (SocInfo2013). Springer International Publishing, 2013. 75-80 
  • Biying Tan, et al. "Detection of high-risk zones and potentially infected neighbors from infectious disease monitoring data." Database Systems for Advanced Applications (DASFAA2012). Springer, 2012

