---- Comments from the Reviewers ----

Review #597C

*Is the work within the scope of the conference and relevant to ICASSP?*: Clearly within scope
*Is the manuscript technically correct?*: Moderate concerns with the potential for some impact on the contribution or conclusions
*Is the technical contribution novel?*: Moderate novelty, with clear extensions of existing methods/concepts
*Is the level of experimental validation sufficient?*: Lacking in some respect
*Is the technical contribution significant?*: Moderate contribution, with the possibility of an impact on the field
*Are the references appropriate, without any significant omissions?*: Complete list of references without any significant omissions
*Are there any references that do not appear to be relevant?*: All references are directly relevant to the contribution of the manuscript
*Is the manuscript properly structured and clearly written?*: Some minor structural, language, or other issues of exposition that would be easily rectified

*Comments to the Author(s)*

The paper presents a residual-quantization–based tokenizer for symbolic music, but the overall contribution feels incremental. The method is largely a direct application of existing RQ-VAE techniques with limited novelty specific to the music domain. Several claims (e.g., being the first discrete representation framework for symbolic music) are overstated given prior work on VQ/VAE-based symbolic encoders.

Evaluation is also weak. Baseline comparisons are not fully fair due to different vocabularies and training setups, making performance differences hard to interpret. Moreover, experiments are restricted to piano-only datasets, limiting generality.
-----------

Review #2C93

*Is the work within the scope of the conference and relevant to ICASSP?*: Clearly within scope
*Is the manuscript technically correct?*: Technically sound without any identifiable conceptual or mathematical errors, questionable experimental design choices, or weaknesses in experimental validation
*Is the technical contribution novel?*: Moderate novelty, with clear extensions of existing methods/concepts
*Is the level of experimental validation sufficient?*: Sufficient validation/theoretical paper
*Is the technical contribution significant?*: Moderate contribution, with the possibility of an impact on the field
*Are the references appropriate, without any significant omissions?*: Complete list of references without any significant omissions
*Are there any references that do not appear to be relevant?*: All references are directly relevant to the contribution of the manuscript
*Is the manuscript properly structured and clearly written?*: Well-structured and clearly written with no issues of exposition

*Comments to the Author(s)*

This paper presents MuseTok, an RQ-VAE–based discrete tokenization framework for symbolic music, evaluated on both generation and semantic understanding tasks.

From a methodological perspective, the paper does not introduce any novelty; the core idea seems to be a direct application of RQ-VAE.

From the evaluation perspective, I really appreciate the comprehensive evaluations and the availability of open-source code and the website demo. However, I also have concerns regarding the quantitative results of MuseTok. For example, in Table 2, MuseTok performs even worse than REMI on the subjective evaluation. It would be helpful if the authors could provide additional computational or model-complexity analyses to justify the additional advantages of MuseTok.
-----------

Review #46D0

*Is the work within the scope of the conference and relevant to ICASSP?*: Clearly within scope
*Is the manuscript technically correct?*: Some minor concerns that should be easily corrected without altering the contribution or conclusions
*Is the technical contribution novel?*: Substantial novelty, with clearly identifiable new methods/concepts
*Is the level of experimental validation sufficient?*: Limited but convincing
*Is the technical contribution significant?*: Substantial contribution, with a clear potential for impact
*Are the references appropriate, without any significant omissions?*: Some significant omissions that may have a moderate impact on the novelty of the submission
*Are there any references that do not appear to be relevant?*: All references are directly relevant to the contribution of the manuscript
*Is the manuscript properly structured and clearly written?*: Some minor structural, language, or other issues of exposition that would be easily rectified

*Comments to the Author(s)*

This paper presents MuseTok, a learned bar-level tokenization method for symbolic music using RQ-VAE. Its main strengths are the novel application of multi-codebook RQ-VAE to symbolic music, providing a coherent two-stage generation pipeline, and demonstrating that the learned codes capture musically meaningful attributes (texture, rhythm). The experimental validation is comprehensive, covering reconstruction, generation, and several understanding tasks with insightful qualitative analysis.

However, several weaknesses should be addressed to strengthen the paper:

1. Baseline Comparisons: The generation and understanding evaluations lack comparison against recent strong baselines (e.g., PianoBART, Nested Music Transformer, MIDI-BERT for melody extraction). Adding these is crucial to accurately position the contribution.
2. Evaluation Metrics: The reconstruction perplexity metric is non-standard and hard to interpret. Please report standard negative log-likelihood (NLL) or token-level perplexity for fair comparison.
3. Methodological Clarifications: Key experimental details are missing, particularly for the chord recognition task (label source, alignment, data splits). The description of the reconstruction loss (L_recon) and the definition of texture groups should be clarified.
4. Technical Limitations Discussion: A deeper discussion on the fixed quantization depth limitation and potential solutions (e.g., adaptive depth), as well as an analysis of the melody extraction performance gap versus other PTMs, would improve the paper.

The work is novel, technically sound, and relevant to ICASSP. With the above revisions (primarily adding stronger baselines, standardizing key metrics, and providing necessary methodological details), this would be a solid contribution. The commitment to open-source the code is a significant benefit to the community.

-----------