Data engineer / Data analyst
Engineer • 富盈數據
Maintained distributed systems and databases
• Constructed and managed the Hadoop ecosystem with Ambari. Built an ETL pipeline that queries multi-source databases, processes more than three terabytes (TB) of data, and covers 90% of analysis needs (Hive, HBase, Python, ELK, MySQL)
• Established a data collection and analysis workflow, saving data scientists 30% of the time spent analyzing collected data and building machine learning models (Elasticsearch, PySpark, Airflow)
Constructed a backend system and API
• Researched webpage user preferences and behavior, and modified the advertising performance evaluation system to enable precision marketing, increasing the accuracy by 300%...
Full-time / Interested in working remotely
University of Texas at Dallas・Information Technology and Management