Amber (Biying) Tan

I'm a highly motivated and passionate Data Scientist whose goal is to apply machine learning/AI and statistical analysis to solve real-world problems and help customers achieve more.

Vancouver, Canada

Work Experience

Data Scientist II, Electronic Arts (EA), Jul 2017 - Present

  • Apply quantitative analysis and machine learning in Python to detect in-game exploits, fraud and other security threats in Hockey Ultimate Team (NHL HUT)
  • Built hybrid classification models with Random Forest and Isolation Forest to detect fraudulent accounts and anomalous behaviors, including coin farming/selling, credit card fraud and match cheating, helping reduce fraud losses by millions of dollars annually
  • Build dashboards with Tableau and SAS to monitor the health of the game economy, analyze data anomalies and communicate findings to the team

Data Scientist, PHEMI Systems, Aug 2016 - Jun 2017

  • Built the PHEMI Data Science Toolkit with Spark and D3.js to perform descriptive analysis and produce visualizations that showcase data insights
  • Developed text mining pipelines in Spark to analyze large unstructured datasets with Latent Dirichlet Allocation (LDA), text classification and sentiment analysis
  • Created client training materials on Machine Learning models, Natural Language Processing and Recommender Systems

Data Analyst, Move Inc, Jan 2016 - Aug 2016

  • Improved the accessibility of multi-sourced data across the team by designing and building a high-performance data warehouse on AWS to ingest data from various data sources
  • Reduced the data and content development time by more than 50% by developing an ETL pipeline to streamline the data transformation, aggregation and reporting process
  • Built dashboards in Tableau to extract and visualize key trends impacting residential real estate

Data Analyst, SingTel, Jun 2014 - Nov 2015

  • Improved user engagement for Newsloop (a top-5 news app in Singapore) by 89% by developing a real-time personalized push notification system using Latent Dirichlet Allocation (LDA) topic modeling
  • Optimized the game acquisition strategy by building a classification model with Logistic Regression to predict viral games with 85% accuracy
  • Supported Games and News Divisions’ data needs by developing ETL and reporting pipelines
  • Defined metrics and built dashboards to gain insights into product and user behaviors using Tableau

Education

Master of Science, Data Management & Analytics, Information Systems

Singapore Management University, 2012 - 2014

Specialized in Social Network Mining and Natural Language Processing

Bachelor of Engineering, Computer Science

Sichuan University, 2008 - 2012

Academic scholarship for three consecutive years, awarded to the top 5% of students for academic achievement

Skills


  • Machine Learning: Classification, Clustering, Regression, Fraud Detection, Natural Language Processing, Recommender System, Social Network Mining
  • Statistics: Probability, Distributions, Hypothesis Testing, A/B Testing, Regression, Time Series Analysis
  • Programming languages: Python (scikit-learn, pandas, NumPy), Scala, R, Java
  • Database: MySQL, Amazon Redshift, SQL Server
  • Big Data: Spark (MLlib), Hive, Hadoop, MapReduce
  • Visualization: Tableau, Python (Matplotlib, Seaborn), SAS, D3.js 

Publication

  • Biying Tan, et al. "Clairvoyant-Push: A Real-Time News Personalized Push Notifier using Topic Modeling And Social Scoring for Enhanced Reader Engagement." IEEE International Conference on Big Data, 2015 
  • Biying Tan, et al. "Online Community Transition Detection." Web-Age Information Management (WAIM2014). Springer International Publishing, 2014. 633-644 
  • Du Juan, Biying Tan, et al. "Social Listening for Customer Acquisition." Social Informatics (SocInfo2013). Springer International Publishing, 2013. 75-80 
  • Biying Tan, et al. "Detection of high-risk zones and potentially infected neighbors from infectious disease monitoring data." Database Systems for Advanced Applications (DASFAA2012). Springer, 2012
