Lyric Translation Dataset

1. Korean-English Lyric Translation Dataset

This dataset, presented in our LREC-COLING 2024 paper "K-pop Lyric Translation: Dataset, Analysis, and Neural Modelling", includes pairs of Korean and English lyrics for 1,000 songs, aligned line-by-line and section-by-section.

If you find this dataset useful for your research, please cite the following paper:

@inproceedings{kim2024kpop,
  title={K-pop Lyric Translation: Dataset, Analysis, and Neural Modelling},
  author={Kim, Haven and Jung, Jongmin and Jeong, Dasaem and Nam, Juhan},
  booktitle={Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING)},
  year={2024},
}

Dataset Download Request

To request the dataset, please send an email with this form attached, signed by you. By signing, you indicate that you agree with the statements in the form.

2. English-Japanese-Korean Lyric Translation Dataset

This dataset, used in our ISMIR 2023 paper "A Computational Evaluation Framework for Singable Lyric Translation", includes aligned English, Japanese, and Korean lyrics for 162 songs.

If you find this dataset useful for your research, please cite the following paper:

@misc{kim2023computationalevaluationframeworksingable,
      title={A Computational Evaluation Framework for Singable Lyric Translation}, 
      author={Haven Kim and Kento Watanabe and Masataka Goto and Juhan Nam},
      year={2023},
      eprint={2308.13715},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2308.13715}, 
}

Dataset Download Request

To request the dataset, please send an email with this form attached, signed by you. By signing, you indicate that you agree with the statements in the form.