
Is there a "standard" dataset for music in symbolic form? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.

Closed 4 years ago.


For music data in audio format there is, for example, The Million Song Dataset (http://labrosa.ee.columbia.edu/millionsong/). Is there a similar one for music in symbolic form (that is, where the notes, not the sound, are stored)? Any format (like MIDI or MusicXML) would be fine.


I'm not aware of a "standard" dataset. However, the places I know of for music scores in symbolic form are:

  • The Mutopia Project, a repository for free/libre music scores in LilyPond format. They standardise on LilyPond because it is a free/libre tool, it produces high-quality scores, and it’s possible to convert from many formats into LilyPond. They currently host over 1700 scores.
  • The Gutenberg Sheet Music Project (also mentioned in another answer), an interesting one to watch. It hosts fewer than 100 scores now. However, it’s an offshoot of the tremendously successful Project Gutenberg for free ebooks (literature in plain text form), so they know how to run this sort of project. They have an excellent, organised approach to content production.
  • MuseScore, a repository for music arrangements. They prefer MuseScore's own .mscz format, but support many others. [Added December 2019]
  • Wikifonia, a repository for lead sheets of songs. [As of December 2019, this site announces that it has closed.] A lead sheet is a simplified music score, perhaps enough to sing at a piano with friends, but not enough to publish a vocal score. They use MusicXML as their standard format. I estimate they have over 4000 scores. Interestingly, they have an arrangement to pay royalties for music they host. This is probably the best home for re-typeset scores of non-free/libre music. [This site was in operation in January 2012, when the answer was first written, but had ceased operation by December 2019, when this edit was made. Since the question is also old and closed, it's worth leaving this legacy entry in the answer.]


You can find a list of sites with sheet music in MusicXML and MusicXML-compatible formats at:

http://www.recordare.com/musicxml/music

Many of those sites include MIDI files and other formats as well.


KernScores is a really good collection. It has a section for people looking for datasets, with more than 10,000 monophonic pieces, as well as other categories.

Edit: It also allows you to download whole sections at a time, zipped, which is a huge benefit when you don't want to have to click every individual download link.


The Sheet Music Project at www.gutenberg.org looks like what you're looking for. It uses MusicXML.


If by dataset you mean music collection, this music search engine is very effective:

http://www.kooplet.com/cgi-bin/kooplet/search.pl


There is the classical music MIDI dataset: http://www.piano-midi.de/

More generally, you can find four standard MIDI (and piano-roll) datasets that have been used to train neural networks here: http://www-etud.iro.umontreal.ca/~boulanni/icml2012

The datasets are:

Piano-midi.de(1) : Source (124 files, 951 KB) or Piano-roll (7.1 MB)
Nottingham(2) : Source (1037 files, 676.1 KB) or Piano-roll (23.2 MB)
MuseData(3) : Source (783 files, 3.0 MB) or Piano-roll (30.1 MB)
JSB Chorales : Source (382 files, 210 KB) or Piano-roll (2.0 MB) 

(1)Please see the Copyright page. (2)The original collection is also available in ABC format. (3)Please read the License Agreement.

State of the Art results on these datasets are reported here: http://www-etud.iro.umontreal.ca/~boulanni/ICML2012.pdf
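
For readers unfamiliar with the piano-roll form mentioned above, here is a minimal sketch of the idea. It assumes a piece is given as a list of time steps, each holding the MIDI note numbers sounding at that step, and that pitches fall in the 88-key piano range (21 to 108). Both the layout and the range are assumptions for illustration, so check the dataset page for the actual file format. The function name `to_binary_roll` is hypothetical.

```python
from typing import List, Sequence

# Assumed pitch range: the 88 keys of a piano, as MIDI note numbers.
LOW_PITCH = 21    # A0
HIGH_PITCH = 108  # C8


def to_binary_roll(
    sequence: Sequence[Sequence[int]],
    low: int = LOW_PITCH,
    high: int = HIGH_PITCH,
) -> List[List[int]]:
    """Turn a list of time steps (each a collection of active MIDI pitches)
    into a binary matrix with one row per time step and one column per pitch."""
    width = high - low + 1
    roll = []
    for step in sequence:
        row = [0] * width
        for pitch in step:
            if not low <= pitch <= high:
                raise ValueError(f"pitch {pitch} is outside {low}..{high}")
            row[pitch - low] = 1  # mark this pitch as sounding
        roll.append(row)
    return roll


# Hypothetical three-step example: a C major triad, a rest, then a single D.
example = [(60, 64, 67), (), (62,)]
roll = to_binary_roll(example)
```

Each row of `roll` can then be fed to a sequence model one time step at a time, which is roughly how piano-roll data is used for neural network training.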
