Handwritten music recognition is a challenging task whose mastery would be of great use, e.g., for improving the accessibility of archival manuscripts or easing music composition. Many modern machine learning techniques, however, cannot be easily applied to this task because high-quality training data is scarce.
Annotating such data manually is expensive and thus not feasible at the necessary scale. This problem has already been tackled in other fields by training on automatically generated synthetic data.
We bring this approach to handwritten music recognition: we present a method for generating synthetic handwritten music images (limited to monophonic scores) and show that training on such data leads to state-of-the-art results.