Guide: Parse and Analyze¶
Learn how to load **kern files, inspect their structure, and extract musical information for analysis.
Loading Files¶
Load from Disk¶
The most common use case—loading a **kern file from your file system:
import kernpy as kp
document, errors = kp.load('path/to/score.krn')
# Handle any parsing errors
if len(errors) > 0:
    print(f"Parser found {len(errors)} issues:")
    for error in errors[:3]:  # Show first 3
        print(f"  {error}")
Load from a String¶
Useful for working with small **kern snippets or data from APIs:
import kernpy as kp
kern_data = """**kern **text
*M4/4 *
=1 =1
4c Do
4d Re
4e Mi
4f Fa
=2 =2
2g Sol
*- *-
"""
document, errors = kp.loads(kern_data)
Load with Error Handling¶
Control how strict the parser should be:
import kernpy as kp
# Tolerant parsing (default): ignores minor issues
doc, errors = kp.load('score.krn', raise_on_errors=False)
# Strict parsing: raises an exception on any error
try:
    doc, errors = kp.load('score.krn', raise_on_errors=True)
except ValueError as e:
    print(f"Parsing failed: {e}")
Understanding Document Structure¶
Inspect Spine Metadata¶
The public API exposes spine metadata, not individual spine objects:
import kernpy as kp
doc, _ = kp.load('score.krn')
spine_types = kp.spine_types(doc)
spine_ids = doc.get_spine_ids()
print(f"Total spines: {doc.get_spine_count()}")
print(f"Spine ids: {spine_ids}")
print(f"Spine types: {spine_types}")
Measure Information¶
import kernpy as kp
doc, _ = kp.load('score.krn')
first_measure = doc.get_first_measure()
measure_count = doc.measures_count()
print(f"Score has {measure_count} measures, starting at measure {first_measure}")
Get Tokens from the Document¶
import kernpy as kp
doc, _ = kp.load('score.krn')
tokens = doc.get_all_tokens()
token_encodings = doc.get_all_tokens_encodings()
print(f"Total tokens: {len(tokens)}")
print(f"First 20 encodings: {token_encodings[:20]}")
for i, token in enumerate(tokens[:20]):
    print(f"  {i:2d}: {token.encoding}")
Analyzing Token Content¶
Inspect Individual Tokens¶
import kernpy as kp
doc, _ = kp.load('score.krn')
tokens = doc.get_all_tokens()
token = tokens[4]
print(f"Encoding: {token.encoding}")
print(f"Category: {token.category}")
print(f"Type: {type(token).__name__}")
Filter Tokens by Category¶
import kernpy as kp
doc, _ = kp.load('score.krn')
note_rest_tokens = doc.get_all_tokens(filter_by_categories=[kp.TokenCategory.NOTE_REST])
pitch_tokens = doc.get_all_tokens(filter_by_categories=[kp.TokenCategory.PITCH])
clef_tokens = doc.get_all_tokens(filter_by_categories=[kp.TokenCategory.CLEF])
print(f"Note/rest tokens: {len(note_rest_tokens)}")
print(f"Pitch subtokens: {len(pitch_tokens)}")
print(f"Clef tokens: {len(clef_tokens)}")
Count Token Frequencies¶
import kernpy as kp
doc, _ = kp.load('score.krn')
frequencies = doc.frequencies()
note_rest_frequencies = doc.frequencies(token_categories=[kp.TokenCategory.NOTE_REST])
print(f"Total unique tokens: {len(frequencies)}")
print(f"Note/rest frequencies: {note_rest_frequencies}")
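To list the most common tokens, sort the frequency mapping with a standard-library sort. This assumes `frequencies()` returns a plain mapping from encoding to occurrence count (check the actual return type); the dictionary below is a stand-in:

```python
# Stand-in frequency mapping; in practice use doc.frequencies()
frequencies = {"4c": 12, "4d": 9, "2g": 3, "=1": 1}

# Sort by count, descending, and keep the top three
top = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)[:3]
for encoding, count in top:
    print(f"{encoding}: {count}")
```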
Extract Musical Information¶
Find Time Signatures¶
import kernpy as kp
doc, _ = kp.load('score.krn')
time_sigs = doc.get_all_tokens(filter_by_categories=[kp.TokenCategory.TIME_SIGNATURE])
print(f"Time signatures found: {[token.encoding for token in time_sigs]}")
Find Key Signatures¶
import kernpy as kp
doc, _ = kp.load('score.krn')
key_signatures = doc.get_all_tokens(filter_by_categories=[kp.TokenCategory.KEY_SIGNATURE])
print(f"Key signatures: {[token.encoding for token in key_signatures]}")
List All Clefs¶
import kernpy as kp
doc, _ = kp.load('score.krn')
clef_tokens = doc.get_all_tokens(filter_by_categories=[kp.TokenCategory.CLEF])
print(f"Clefs: {[token.encoding for token in clef_tokens]}")
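To see where the clef actually changes rather than every restatement, collapse consecutive repeats with `itertools.groupby`. The encodings below are stand-ins; in practice build the list from `clef_tokens`:

```python
from itertools import groupby

# Stand-in clef encodings in score order
clefs = ["*clefG2", "*clefG2", "*clefF4", "*clefG2"]

# groupby collapses runs of identical values, keeping one per run
changes = [clef for clef, _ in groupby(clefs)]
print(changes)
```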
Analyze Musical Content¶
Count Note and Rest Tokens¶
import kernpy as kp
doc, _ = kp.load('score.krn')
note_rest_tokens = doc.get_all_tokens(filter_by_categories=[kp.TokenCategory.NOTE_REST])
rest_count = sum(
    1 for token in note_rest_tokens
    if any(subtoken.category == kp.TokenCategory.REST for subtoken in token.pitch_duration_subtokens)
)
note_count = len(note_rest_tokens) - rest_count
print(f"Notes: {note_count}")
print(f"Rests: {rest_count}")
print(f"Total: {len(note_rest_tokens)}")
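Once you have the counts, a simple ratio describes how rest-heavy the score is. A small helper, using stand-in numbers:

```python
def rest_ratio(note_count, rest_count):
    """Fraction of note/rest tokens that are rests."""
    total = note_count + rest_count
    return rest_count / total if total else 0.0

# Stand-in counts; in practice use the values computed above
print(f"Rest ratio: {rest_ratio(96, 24):.0%}")
```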
Practical Analysis Patterns¶
Extract Lyrics¶
import kernpy as kp
doc, _ = kp.load('score.krn')
lyrics = doc.get_all_tokens(filter_by_categories=[kp.TokenCategory.LYRICS])
print(f"Lyrics: {' '.join(token.encoding for token in lyrics)}")
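In **text spines, syllables of a word are commonly split across tokens with hyphens. A sketch that rejoins them into words; the hyphen convention and the sample syllables here are assumptions, so adapt the check to the encodings you actually see:

```python
def join_syllables(syllables):
    """Rejoin hyphen-split syllables into words."""
    words = []
    for syl in syllables:
        if words and words[-1].endswith("-"):
            # Continue the previous word, dropping the joining hyphens
            words[-1] = words[-1][:-1] + syl.lstrip("-")
        else:
            words.append(syl.lstrip("-"))
    return words

# Stand-in syllables; in practice use the lyric token encodings
print(" ".join(join_syllables(["Glo-", "-ri-", "-a", "Do"])))
```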
Compare Spine Metadata¶
import kernpy as kp
from collections import Counter
doc, _ = kp.load('score.krn')
spine_types = kp.spine_types(doc)
spine_ids = doc.get_spine_ids()
for spine_type, count in Counter(spine_types).items():
    print(f"{spine_type}: {count}")
print(f"Spine ids: {spine_ids}")
Batch Analysis¶
Process multiple files:
import kernpy as kp
from pathlib import Path
score_dir = Path('scores')
results = []
for krn_file in score_dir.glob('*.krn'):
    doc, errors = kp.load(krn_file)
    if errors:  # Skip files that did not parse cleanly
        continue
    results.append({
        'file': krn_file.name,
        'measures': doc.measures_count(),
        'spines': doc.get_spine_count(),
        'types': ','.join(kp.spine_types(doc)),
        'tokens': len(doc.get_all_tokens()),
        'frequencies': doc.frequencies(),
    })
for result in results:
    print(f"{result['file']}: {result['measures']} measures, {result['spines']} spines")
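The collected results can be exported to CSV for spreadsheet analysis. A standard-library sketch on a stand-in results list (swap the in-memory buffer for a real file handle):

```python
import csv
import io

# Stand-in results; in practice use the list built in the batch loop
results = [
    {"file": "a.krn", "measures": 24, "spines": 2},
    {"file": "b.krn", "measures": 12, "spines": 4},
]

buffer = io.StringIO()  # swap for open('results.csv', 'w', newline='')
writer = csv.DictWriter(buffer, fieldnames=["file", "measures", "spines"])
writer.writeheader()
writer.writerows(results)
print(buffer.getvalue())
```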
Summary¶
You now know how to:
- Load **kern files and strings
- Navigate the document structure (spines, tokens, measures)
- Filter and extract specific information
- Analyze musical content (pitches, durations, signatures)
- Process collections of files
Next Steps¶
- Transform documents? — See Transform Documents
- Build automated workflows? — See Build Pipelines
- Understand token categories better? — See Token Categories