09/2022 - Present
❖ Worked with petabyte-scale mail metadata, using Spark to aggregate and analyze data and to build ML models.
❖ Experimented with heuristic rules, reducing the false-positive (FP) email error rate for Microsoft Teams mail by 69%.
❖ Refactored the classification service, converting it from rule-based (Python) to ML-based (Go) and achieving 95% recall.
❖ Experimented with training a BERT model that encodes mail subjects into embedding vectors for multi-task learning,
achieving an average recall of 98%.
❖ Stress-tested the BERT model served via TensorFlow Serving, tuning service and instance parameters.
❖ Optimized the BERT service with TensorRT, increasing QPS from 300 to 1500 and reducing GPU cost by two-thirds.
❖ Implemented the BERT tokenizer algorithm in Go, porting it from the existing Python code.
❖ Built the SafetyNet for customers' outbound contacts, based on domain and address, rescuing 10% of FP cases.