GISH G

SUMMARY
Aiming to be a data scientist unicorn who excels at applied machine learning in a business/finance-related field.

DevOps Engineer (Python) for Automatic Claim Processor (ACP) - OCR System (Hospital Diagnosis/Receipt).

Applying tree-based algorithms to model Credit Scoring predictions in the finance industry.

Achieving hyperautomation by combining AI with Robotic Process Automation (RPA) to automate routine tasks.


Work Experience

Nov. 2022 - Now

Artificial Intelligence Engineer @ CTBC Taiwan Life Insurance

AI Team

DevOps Engineer (Python) for Automatic Claim Processor (ACP) - OCR System (Hospital Diagnosis/Receipt)

1. Receipt Recognition Service (2023) OCR:
Improved OCR model accuracy from 80% to 96% in 2023 by integrating the existing model with Microsoft Azure Form Recognizer output.
Integrated Microsoft Azure services into the existing receipt system's data pipeline.
Engineered data post-processing to handle the differing receipt formats of 17 hospitals.
Designed and executed comprehensive unit tests to ensure robustness and reliability.

2. Diagnosis Recognition Service (2022 - 2023) NLP/OCR:
Continuously maintain and monitor the Diagnosis Recognition service, ensuring optimal performance.
Regularly update the synonym table to align with real-world Diagnosis cases.
Run rigorous pytest test suites to verify the stability of all deployments in the production environment.

July 2020 - Sept. 2022

Machine Learning Engineer @ E.SUN COMMERCIAL BANK, LTD.

Intelligent Banking Division (智能金融處)

Built machine learning models for risk assessment in banking.
Specialized in credit card and Joint Credit Information Center (JCIC) data.
Ran ETL and built data pipelines for retraining models and for production use.

Worked in the RPA (Robotic Process Automation) team, building web crawlers and automating routine tasks to reduce labor costs.

June 2019 - May 2020

Data Scientist Intern @ Cathay Financial Holdings (國泰金控)

Digital Data & Technology (DDT, 數數發)
Researched interpretable machine learning and its existing algorithms; experimented with LIME and SHAP on open data. (GitHub)
Real Estate Evaluation model: gathered and cleaned geographical and credit card data and engineered features, improving hit rate from 55% to 70%.

Python


  • Machine learning
  • Data Pipelines
  • Web Crawlers
  • PyTorch, Airflow, Pandas, Docker

SQL

  • Extract-Transform-Load (ETL)
  • Efficient queries (window functions)

Language


  • Chinese (Native)
  • English (TOEIC 905/990, TOEFL 96/120)

Projects (@ E.SUN only) (PowerPoint demo link)

Internal Ratings-Based (IRB) model - Credit Card

Aimed to reduce capital risk through more accurate models built under the IRB approach.

Applied tree-based methods to produce PD/LGD/EAD predictions for computing expected credit loss.

Saved more than 50 billion in capital by reducing the capital requirement under the Capital Adequacy Ratio (CAR).

JCIC Superset - Data ETL and Pipeline

Individual departments stored data in many different ways, which made data exploration difficult and caused unnecessary duplication of effort across projects.

Aimed to build a unified database and feature set by extracting and integrating data from various sources through data munging.

Integrated various data sources, resulting in 99% consistency and 80% less effort on data preprocessing.

Monitored data pipelines with Apache Airflow (refactored ETL code into DAGs).

RPA - TGOS latitude longitude conversion

Fetched all Taiwan addresses from the Dept. of Household Registration with Selenium, using doorplate numbers.

Crawled the government TGOS website to obtain the latitude and longitude of each address.

RPA - Miscellaneous automation tasks

1. Automated routine tasks on the Bank Trust Dept.'s AS400 system with Pywinauto, saving 8 labor hours/week.

2. Automated the Asset Management Dept.'s routine Excel and PDF tasks with Tabula, saving 16 labor hours/week.

Education

2018 - 2020

National Taiwan University (Master) - Graduated in 2020

Business Administration/Big data analytics                                                             GPA  4.1/4.3

2012 - 2016

National Chengchi University (Bachelor) - Graduated in 2016

Majored in Management Information Systems / Minored in Accounting           GPA  3.8/4

Competitions (GitHub)

•    89/1366
E-Sun Credit Card Fraud Detection (玉山人工智慧公開挑戰賽-信用卡盜刷偵測)

•    7/86
Taishin Financial Product Purchase Prediction (第二屆商業模式與大數據分析競賽 台新銀行)

•    685/2281(public)
Kaggle Deepfake Detection Challenge

English Certifications

  • TOEIC 905/990 (2016)
  • TOEFL 96/120 (2020)