About me

I am a Senior Machine Learning Engineer at Gamania Digital Entertainment. Previously, I was a Visiting Researcher at Sony Computer Science Laboratories. I specialize in developing AI-driven solutions across various domains including computer vision, natural language processing, and medical imaging.

I hold a Master's degree in Data Science from National Taiwan University and have presented research at conferences such as ICASSP, ISMIR, and DAFx. My work focuses on applying machine learning to real-world problems, from audio processing and medical image analysis to financial anomaly detection.

What I do

  • Machine Learning & Deep Learning

    Developing advanced AI models for computer vision, NLP, and audio processing applications.

  • Medical Imaging & fMRI Analysis

    Applying self-supervised learning techniques to decode brain signals and analyze medical images.

  • Audio Processing & MIR

    Music information retrieval and audio effect processing using deep learning methods.

  • Data Science & Analytics

    Building data-driven solutions including Text-to-SQL agents and anomaly detection systems.

Resume

Education

  1. National Taiwan University

    Feb 2023 – Jun 2024

    M.S. in Data Science
    Thesis: Whole-Brain Feature Selection Methods for Decoding from fMRI Data

  2. National Taiwan University

    Sep 2019 – Jan 2022

    B.S. in Computer Science and Information Engineering (CSIE)

Experience

  1. Senior Machine Learning Engineer @ Gamania Digital Entertainment

    Jul 2025 – Present

    Working on advanced machine learning projects in the gaming and entertainment industry.

  2. Visiting Researcher @ Sony Computer Science Laboratories

    Jun 2025 – Oct 2025

    Conducted AI and machine learning research at Sony CSL, Tokyo (hybrid).

  3. AI Engineer @ Gate.io

    Oct 2024 – May 2025

    • Developed a Text-to-SQL AI agent enabling non-technical teams to access internal data, boosting query efficiency by 20%.
    • Developed a fund flows anomaly detection system with LLMs and tree-based models, enhancing financial security.

  4. Research Assistant @ Academia Sinica (MCTLAB)

    Mar 2022 – Oct 2024

    Research Topics: Self-Supervised Learning, Medical Imaging
    • Proposed a Transformer-based self-supervised learning method for decoding brain signals (fMRI), achieving a 77% reduction in memory footprint.
    • Conducted distributed training experiments on high-resolution 4D medical images (fMRI) using TWCC HPC.
    • Proposed a whole-brain feature selection method for decoding musical pitch from fMRI.

  5. AI Engineer Intern @ Tomofun

    Mar 2023 – Jul 2024

    Research Topics: Computer Vision, Large Language Models, Multimodal Learning
    • Developed an automatic short music video generation system for daily pet clips.
    • Fine-tuned visual language models (e.g., BLIP), achieving a 20.6% improvement in visual question answering.
    • Enhanced LLaVA image inference speed by 250% with only a 3% accuracy reduction.
    • Developed APIs for visual language models using llama.cpp/ollama for image-caption pair datasets.

Skills

  • Python & Deep Learning

    PyTorch, TensorFlow, Scikit-learn

  • Machine Learning

    Self-Supervised Learning, Computer Vision, NLP, Medical Imaging

  • Data Engineering

    Pandas, SQL, Distributed Training (Slurm), HPC

  • Programming Languages & Tools

    Go, JavaScript, HTML, C++, C, Linux

Projects

Research & Projects

  1. VioPTT: Violin Technique-Aware Transcription

    Research Project

    Developed VioPTT, a violin transcription system that recognizes playing techniques (pizzicato, tremolo, harmonics, etc.) from audio via synthetic data augmentation. Supports end-to-end audio-to-MIDI with per-note technique labels.

  2. Guitar Effect Removal

    Collaboration with Positive Grid ML Team

    Proposed a two-stage method to remove distortion effects from guitar recordings using Positive Grid VST plugins. Achieved 20% higher audio quality than the best baseline, rated by 26 professional guitarists. Published in DAFx 2024.

  3. Whole Brain fMRI Feature Selection (AutoFMRI)

    Academia Sinica - MCTLAB

    Proposed a two-stage automatic thresholding method to extract whole-brain fMRI features and predict musical pitch. Demonstrated a 2-fold improvement over ROI-based feature selection. Available as a pip package (pip install autofmri). Published in ICASSP 2023.

  4. MIDI Rhythm Master

    Side Project

    A browser-based rhythm game that reads any MIDI file and turns it into a playable 4-lane game. Features real-time synth playback via Web Audio API, difficulty settings, hold notes, and a global leaderboard powered by Cloudflare Workers + D1.

  5. Match-3 Puzzle Game

    Side Project

    A match-3 puzzle game built with Godot Engine and deployed to the web. Features gem-matching mechanics, cascading combos, and shader effects.

Publications

  1. VioPTT: Violin Technique-Aware Transcription from Synthetic Data Augmentation

    ICASSP 2026

    Wang, T. K.*, Peng, Y. P.*, Su, L., & Cheung, V. K. M.
    Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing 2026 (ICASSP'26). (* equally contributed)

  2. Is Transfer Learning Necessary for Violin Transcription?

    ISMIR 2025 Late Breaking

    Peng, Y. P.*, Wang, T. K.*, Su, L., & Cheung, V. K. M.
    Int. Society for Music Information Retrieval Conf. 2025 (ISMIR'25) - Late Breaking/Demo. (* equally contributed)

  3. Whole-Brain Transferable Representations from Large-Scale fMRI Data Improve Task-Evoked Brain Activity Decoding

    arXiv 2025

    Peng, Y. P., Cheung, V. K. M., & Su, L.
    arXiv preprint arXiv:2507.22378

  4. Distortion Recovery: A Two-Stage Method for Guitar Effect Removal

    DAFx 2024

    Lee, Y. S.*, Peng, Y. P.*, Wu, J. T., Cheng, M., Su, L., & Yang, Y. H.
    Proc. Int. Conf. Digital Audio Effects 2024 (DAFx'24). (* equally contributed)

  5. Decoding Musical Pitch from Human Brain Activity with Automatic Voxel-Wise Whole-Brain FMRI Feature Selection

    ICASSP 2023

    Cheung, V. K.*, Peng, Y. P.*, Lin, J. H., & Su, L.
    Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing 2023 (ICASSP'23). (* equally contributed)