Natural Language Processing Engineer
A practitioner in Natural Language Processing.
Interested in Machine Learning and Natural Language Processing, and eager to pursue new knowledge and technology.
Taipei City, Taiwan
0983034412
April 2023 - Present
Taipei, Taiwan
Design Error Correction algorithms for speech recognition
Correct erroneous words in the speech recognition model output
ARPA model-based Error Correction
Traverse, in a beam-search manner, the ARPA word graph constructed from customer sentences.
Correct erroneous words in ChatBot conversations
Context entity-based Error Correction
Extract entities with the NER model, then correct erroneous words based on the extracted entities
Text-to-Speech technology application
Create an API that exposes the Error Correction algorithm as a service
Create an offline version of the Speech Recognition System
Large Language Model application
Build a web crawler and a data-cleaning pipeline to obtain model training data
November 2021 - Present
Taipei, Taiwan
Multi-news abstractive summarization
Leverage a discourse relationship tree to help the model learn news discourse and structure in order to generate accurate and coherent summaries
News knowledge update task
Given an existing article about a related news event, generate an updated article according to the information from the news event
Leverage a web crawler to collect triples of original articles, news, and updated articles from the website, and build the news-event-triggered knowledge update dataset
Published in CIKM 2022 [1]
August 2020 - June 2021
Taipei, Taiwan
Predict the sentiment of celebrity reviews on websites
Address the scarcity of labeled data in a low-resource setting
Reduce the noise from out-of-domain data in transfer learning
Construct a teacher-student model to improve the model's generalization ability
Published in WI-IAT 2021 [2]
November 2019 - June 2021
Taipei, Taiwan
Observe public opinion on nuclear policy in the media
Lead the team in building a novel nuclear policy review dataset
Construct a hierarchical classification model to predict public stances in the media
Published in 中央研究院調查研究方法與應用學術研討會 2021 [3]
2019 - 2021
2015 - 2019
[1] A Multi-grained Dataset for News Event Triggered Knowledge Update (CIKM 2022)
[2] A Teacher-Student Approach to Cross-Domain Transfer Learning with Multi-level Attention (WI-IAT 2021)
[3] Position Analysis of Internet Public Opinions - A Case Study of Nuclear Energy Policy in Taiwan (中央研究院調查研究方法與應用學術研討會 2021)
Natural Language Processing, Machine Learning, Deep Learning, LLM, LangChain, UNIX-like Operating System, Google Cloud Platform, GitHub, Pandas
Frameworks: PyTorch, TensorFlow, Django
Programming Languages: Python, SQL