GISH G

SUMMARY
Aiming to be a data scientist unicorn who excels at applied machine learning in a business/finance-related field.

Applying tree-based algorithms to credit scoring in the finance industry.

Extracting, transforming, and loading (ETL) financial data into relational databases, wrangling it, and engineering domain-specific features to extract insight from raw data.

Pursuing hyperautomation by combining AI with Robotic Process Automation (RPA) to automate routine tasks.


Skills

Python


  • Machine learning
  • Data Pipelines
  • Web Crawlers
  • PyTorch, Airflow, Pandas, Kubernetes (basic)

SQL

  • Extract-Transform-Load (ETL)
  • Efficiency (Window Functions)
  • JSON, YAML

Languages


  • Chinese (Native)
  • English (TOEIC 905/990, TOEFL 96/120)

Work Experience

July 2020 - Present

Machine Learning Engineer  @ E.SUN COMMERCIAL BANK, LTD.

Intelligent Banking Division (智能金融處)

Building machine learning models for risk assessment in banking.
Specializing in credit card and Joint Credit Information Center (JCIC) data.
Building ETL jobs and data pipelines for model retraining and production.

Working in the Robotic Process Automation (RPA) team on web scraping (crawlers) and automation of routine tasks to reduce labor costs.

June 2019 - May 2020

Data Scientist Intern  @ Cathay Financial Holdings (國泰金控)

Digital Data & Technology (DDT, 數數發)
Researched interpretable machine learning and its existing algorithms; experimented with LIME and SHAP on open data. (GitHub)
Real estate evaluation model: gathered and cleaned geographical and credit card data and engineered features (hit rate improved from 55% to 70%).

Projects

Internal Ratings-Based (IRB) Model - Credit Card

Aimed to reduce capital risk through more accurate models built under the IRB approach.

Applied tree-based methods to produce PD/LGD/EAD (probability of default, loss given default, exposure at default) predictions for computing expected credit loss.

Saved more than 50 billion in capital by lowering the capital requirement under the Capital Adequacy Ratio (CAR).

JCIC Superset - Data ETL and Pipeline

Individual departments stored data in many different ways, which made data exploration difficult and caused unnecessary duplication of effort across projects.

Aimed to build a unified database and feature set by extracting and integrating data from various sources through data munging.

Integrated various data sources, resulting in 99% consistency and 80% less effort on data preprocessing.

Monitored data pipelines with Apache Airflow (refactored ETL code into DAGs).

RPA - TGOS Latitude/Longitude Conversion

Fetched all Taiwan addresses from the Dept. of Household Registration with Selenium, using doorplate numbers.

Crawled the government TGOS website to obtain the latitude and longitude of each address.

RPA - Miscellaneous automation tasks

1. Automated routine tasks on the Bank Trust Dept. AS400 system with Pywinauto, saving 8 labor hours/week.

2. Automated routine Excel and PDF tasks for the Asset Management Dept. with Tabula, saving 16 labor hours/week.

Education

2018 - 2020

National Taiwan University

Business Administration / Big Data Analytics (GPA 4.1/4.3)

2012 - 2016

National Chengchi University

Majored in Management Information Systems, minored in Accounting (GPA 3.8/4)

Competitions (GitHub)

•    89/1366: E-Sun Credit Card Fraud Detection (玉山人工智慧公開挑戰賽-信用卡盜刷偵測)

•    7/86: Taishin Financial Product Purchase Prediction (第二屆商業模式與大數據分析競賽 台新銀行)

•    685/2281 (public): Kaggle Deepfake Detection Challenge

English Certifications

  • TOEIC 905/990 (in 2016)
  • TOEFL 96/120 (in 2020)


Additional Documents




NATIONAL TAIWAN UNIVERSITY

TRANSCRIPT OF ACADEMIC RECORD


Name: LI-YU SHAO(邵立瑜)


Student ID Number: R07741050


Department: Business Administration



Course No.    Course Title                                            Credits  Grade

1st Semester 2018/2019

Ethics 7001   Academic Ethics                                         0        PASS
Fin 7023      Financial Management                                    3        A+
MBA 7001      Managerial Accounting                                   3        A+
MBA 7004      Operations Management                                   3        A-
MBA 7005      Marketing Management                                    3        A-
MBA 7008      Organizational Behavior                                 3        A+
MBA 7009      Information Management                                  3        A+

Total Credits Enrolled: 18
Total Credits Earned: 18
Grade Point Average: 4.10



2nd Semester 2018/2019

EE 2008       Discrete Mathematics                                    2        B-
MBA 5011      Multivariate Analysis                                   3        B-
MBA 5016      Innovation Management and Entrepreneurship              3        A
MBA 5048      Optimization Methods                                    3        A
MBA 5073      Big Data and Business Analytics                         3        A+
MBA 7021      E-business & Supply Chain Management                    3        A
MBA 7025      Management Science Models                               3        A+

Total Credits Enrolled: 20
Total Credits Earned: 20
Grade Point Average: 3.77



1st Semester 2019/2020

IE 5034       Linear Algebra and Its Applications                     (3)      W
MBA 5045      Statistical Data Analysis for Business and Management   3        A+
MBA 7027      Strategic Management                                    3        A+

Total Credits Enrolled: 6
Total Credits Earned: 6
Grade Point Average: 4.30


(End of Record)


The Transcript of academic record merely serves as a reference. If an official transcript is needed, please order it via the "Online Transcript Order System".








Download Date: 2020/02/2


Online Transcript Order System: https://reg71.aca.ntu.edu.tw/transcript_eng/index.php/user/login


