Yueh-Po Peng - Portfolio

About me

I am a Senior Machine Learning Engineer at Gamania Digital Entertainment. Previously, I was a Visiting Researcher at Sony Computer Science Laboratories. I specialize in developing AI-driven solutions across various domains including computer vision, natural language processing, and medical imaging.

I hold a Master's degree in Data Science from National Taiwan University and have presented research at conferences such as ICASSP, ISMIR, and DAFx. My work focuses on applying machine learning to real-world problems, from audio processing and medical image analysis to financial anomaly detection.

What I do

Machine Learning & Deep Learning

Developing advanced AI models for computer vision, NLP, and audio processing applications.

Medical Imaging & fMRI Analysis

Applying self-supervised learning techniques to decode brain signals and analyze medical images.

Audio Processing & MIR

Music information retrieval and audio effect processing using deep learning methods.

Data Science & Analytics

Building data-driven solutions including Text-to-SQL agents and anomaly detection systems.

Resume

Education

National Taiwan University
Feb 2023 — Jun 2024
M.S. in Data Science
Thesis: Whole‑Brain Feature Selection Methods for Decoding from fMRI Data
National Taiwan University
Sep 2019 — Jan 2022
B.S. in Computer Science and Information Engineering (CSIE)

Experience

Senior Machine Learning Engineer @ Gamania Digital Entertainment
Jul 2025 — Present
• Built an AI agent-driven platform enabling users to autonomously plan and generate manga content.
• Developed immersive vertical-scroll webtoon reading features including AI-generated background music and sound effects.
• Implemented voice conversion systems for AI virtual characters in interactive content.
Visiting Researcher @ Sony Computer Science Laboratories
Jun 2025 — Oct 2025
Research Topics: Music Information Retrieval, Automatic Music Transcription
• Proposed VioPTT, a lightweight cascade model for violin technique-aware transcription with a novel synthetic dataset (MOSA-VPT). Published at ICASSP 2026 [1].
• Investigated transfer learning for violin AMT, showing that training from scratch on instrument-specific data can match fine-tuned piano-pretrained models. Published at ISMIR 2025 Late Breaking/Demo [2].
AI Engineer @ Gate.io
Oct 2024 — May 2025
• Developed a Text-to-SQL AI agent enabling non-technical teams to access internal data, boosting query efficiency by 20%.
• Developed a fund flows anomaly detection system with LLMs and tree-based models, enhancing financial security.
Research Assistant @ Academia Sinica (MCTLAB)
Mar 2022 — Oct 2024
Research Topics: Self‑Supervised Learning, Medical Imaging
• Proposed a Transformer-based self-supervised learning method for decoding brain signals (fMRI), achieving a 77% reduction in memory footprint.
• Conducted distributed training experiments on high-resolution 4D medical images (fMRI) using TWCC HPC.
• Proposed a whole-brain feature selection method for decoding musical pitch from fMRI.
AI Engineer Intern @ Tomofun
Mar 2023 — Jul 2024
Research Topics: Computer Vision, Large Language Models, Multimodal Learning
• Developed an automatic short music video generation system for daily pet clips.
• Fine-tuned visual language models (e.g., BLIP), achieving a 20.6% improvement in visual question answering.
• Enhanced LLaVA image inference speed by 250% with only a 3% accuracy reduction.
• Developed APIs for visual language models using llama.cpp/ollama for image-caption pair datasets.

Skills

Python & Deep Learning

PyTorch, TensorFlow, Scikit-learn

Machine Learning

Self-Supervised Learning, Computer Vision, NLP, Medical Imaging

Data Engineering

Pandas, SQL, Distributed Training (Slurm), HPC

Programming Languages

Go, JavaScript, HTML, C++, C, Linux

Projects

Research & Projects

VioPTT: Violin Technique-Aware Transcription
Research Project
Developed VioPTT, a violin transcription system that recognizes playing techniques (pizzicato, tremolo, harmonics, etc.) from audio via synthetic data augmentation. Supports end-to-end audio-to-MIDI with per-note technique labels.

💻 Code 🎻 Demo
Guitar Effect Removal
Collaboration with Positive Grid ML Team
Proposed a two-stage method to remove distortion effects from guitar recordings using Positive Grid VST plugins. Achieved 20% higher audio quality than the best baseline, as rated by 26 professional guitarists. Published at DAFx 2024.

📄 Paper 🎵 Demo
Whole Brain fMRI Feature Selection (AutoFMRI)
Academia Sinica - MCTLAB
Proposed a two-stage automatic thresholding method to extract whole-brain fMRI features and predict musical pitch. Demonstrated a 2-fold improvement over ROI-based feature selection. Available as a pip package (pip install autofmri). Published at ICASSP 2023.

📄 Paper 💻 Code
MIDI Rhythm Master
Side Project
A browser-based rhythm game that reads any MIDI file and turns it into a playable 4-lane game. Features real-time synth playback via Web Audio API, difficulty settings, hold notes, and a global leaderboard powered by Cloudflare Workers + D1.

💻 Code 🎮 Play
Match-3 Puzzle Game
Side Project
A match-3 puzzle game built with Godot Engine and deployed to the web. Features gem-matching mechanics, cascading combos, and shader effects.

💻 Code 🎮 Play

Publications

VioPTT: Violin Technique-Aware Transcription from Synthetic Data Augmentation
ICASSP 2026
Wang, T. K.*, Peng, Y. P.*, Su, L., & Cheung, V. K. M.
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing 2026 (ICASSP'26). (* equally contributed)

📄 Paper
Is Transfer Learning Necessary for Violin Transcription?
ISMIR 2025 Late Breaking
Peng, Y. P.*, Wang, T. K.*, Su, L., & Cheung, V. K. M.
Int. Society for Music Information Retrieval Conf. 2025 (ISMIR'25) - Late Breaking/Demo. (* equally contributed)

📄 Paper
Whole-Brain Transferable Representations from Large-Scale fMRI Data Improve Task-Evoked Brain Activity Decoding
arXiv 2025
Peng, Y. P., Cheung, V. K. M., & Su, L.
arXiv preprint arXiv:2507.22378

📄 Paper
Distortion Recovery: A Two-Stage Method for Guitar Effect Removal
DAFx 2024
Lee, Y. S.*, Peng, Y. P.*, Wu, J. T., Cheng, M., Su, L., & Yang, Y. H.
Proc. Int. Conf. Digital Audio Effects 2024 (DAFx'24). (* equally contributed)

📄 Paper 🎵 Demo
Decoding Musical Pitch from Human Brain Activity with Automatic Voxel-Wise Whole-Brain FMRI Feature Selection
ICASSP 2023
Cheung, V. K.*, Peng, Y. P.*, Lin, J. H., & Su, L.
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing 2023 (ICASSP'23). (* equally contributed)

📄 Paper
