| 1 | +Spectrogram Analysis |
| 2 | +==================== |
| 3 | + |
| 4 | +Spectrogram data access and visualization for audio analysis. |
| 5 | + |
| 6 | +.. currentmodule:: idtap |
| 7 | + |
| 8 | +SpectrogramData |
| 9 | +--------------- |
| 10 | + |
| 11 | +The :class:`SpectrogramData` class loads, crops, plots, and exports Constant-Q Transform (CQT) |
| 12 | +spectrograms for computational musicology and audio analysis. |
| 13 | + |
| 14 | +.. autoclass:: SpectrogramData |
| 15 | + :members: |
| 16 | + :undoc-members: |
| 17 | + :show-inheritance: |
| 18 | + |
| 19 | +Key Features |
| 20 | +~~~~~~~~~~~~ |
| 21 | + |
| 22 | +* **Constant-Q Transform (CQT)** - Log-spaced frequency bins for musical analysis |
| 23 | +* **Intensity Transformation** - Power-law contrast enhancement (``power`` exponent from 1.0 to 5.0) |
| 24 | +* **Colormap Support** - 35+ matplotlib colormaps |
| 25 | +* **Frequency/Time Cropping** - Extract specific frequency ranges or time segments |
| 26 | +* **Matplotlib Integration** - Plot on existing axes for overlays with pitch contours |
| 27 | +* **Image Export** - Save as PNG, JPEG, WebP, etc. |
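
The power-law contrast enhancement listed above can be illustrated with a minimal sketch. It assumes the transform normalizes each uint8 intensity to [0, 1], raises it to ``power``, and rescales to 0-255. The helper name ``enhance_contrast`` is hypothetical, and the library's actual implementation may differ.

```python
def enhance_contrast(pixels: bytes, power: float = 2.0) -> bytes:
    """Apply a power-law curve to uint8 intensities (hypothetical helper)."""
    if not 1.0 <= power <= 5.0:
        raise ValueError("power is expected to lie in the documented 1.0-5.0 range")
    # Normalize to [0, 1], apply the exponent, then rescale to 0-255.
    return bytes(round(255 * (v / 255) ** power) for v in pixels)


row = bytes([0, 64, 128, 255])
enhanced = enhance_contrast(row, power=2.0)
print(list(enhanced))
```

With an exponent above 1.0, low-intensity bins are pushed toward black while the strongest bins stay bright, which tends to make harmonics stand out against background noise.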
| 28 | + |
| 29 | +Quick Examples |
| 30 | +~~~~~~~~~~~~~~ |
| 31 | + |
| 32 | +Load and display a spectrogram:: |
| 33 | + |
| 34 | + from idtap import SwaraClient, SpectrogramData |
| 35 | + |
| 36 | + client = SwaraClient() |
| 37 | + spec = SpectrogramData.from_audio_id("audio_id_here", client) |
| 38 | + |
| 39 | + # Save basic visualization |
| 40 | + spec.save("output.png", power=2.0, cmap='viridis') |
| 41 | + |
| 42 | +Create matplotlib overlay with pitch contour:: |
| 43 | + |
| 44 | + import matplotlib.pyplot as plt |
| 45 | + |
| 46 | + # Load the spectrogram for an already-loaded piece (reuses client from the example above) |
| 47 | + spec = SpectrogramData.from_piece(piece, client) |
| 48 | + |
| 49 | + # Create figure |
| 50 | + fig, ax = plt.subplots(figsize=(12, 6)) |
| 51 | + |
| 52 | + # Plot spectrogram as underlay with transparency |
| 53 | + im = spec.plot_on_axis(ax, power=2.0, cmap='viridis', alpha=0.7, zorder=0) |
| 54 | + |
| 55 | + # Overlay a coarse pitch contour (first contour value of each trajectory) |
| 56 | + times = [traj.start_time for traj in piece.trajectories] |
| 57 | + pitches = [traj.pitch_contour[0] for traj in piece.trajectories] |
| 58 | + ax.plot(times, pitches, 'r-', linewidth=2, zorder=1) |
| 59 | + |
| 60 | + # Configure axes |
| 61 | + ax.set_xlabel('Time (s)') |
| 62 | + ax.set_ylabel('Frequency (Hz)') |
| 63 | + plt.colorbar(im, ax=ax, label='Intensity') |
| 64 | + |
| 65 | + plt.savefig('overlay.png', dpi=150, bbox_inches='tight') |
| 66 | + |
| 67 | +Crop to specific region:: |
| 68 | + |
| 69 | + # Extract 200-800 Hz range, first 10 seconds |
| 70 | + cropped = spec.crop_frequency(200, 800).crop_time(0, 10) |
| 71 | + cropped.save("cropped.png", power=2.5, cmap='magma') |
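
To give a feel for what such a crop selects, the sketch below maps the requested frequency and time bounds to bin and frame indices. It assumes the documented defaults (75 Hz lower bound, 72 bins per octave, about 0.0116 s per frame) and nearest-index rounding. The helpers ``freq_to_bin`` and ``time_to_frame`` are hypothetical and are not part of the ``idtap`` API.

```python
import math

FMIN_HZ = 75.0          # documented default lower frequency bound
BINS_PER_OCTAVE = 72    # documented default resolution
FRAME_SECONDS = 0.0116  # documented typical frame duration


def freq_to_bin(freq_hz: float) -> int:
    """Nearest log-spaced bin index for a frequency (hypothetical helper)."""
    return round(BINS_PER_OCTAVE * math.log2(freq_hz / FMIN_HZ))


def time_to_frame(seconds: float) -> int:
    """Nearest frame index for a time offset (hypothetical helper)."""
    return round(seconds / FRAME_SECONDS)


print(freq_to_bin(200), freq_to_bin(800))   # bin range for 200-800 Hz
print(time_to_frame(0), time_to_frame(10))  # frame range for 0-10 s
```

Because the bins are log-spaced, the two-octave span from 200 Hz to 800 Hz covers 144 bins regardless of where it sits in the frequency range.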
| 72 | + |
| 73 | +Supported Colormaps |
| 74 | +~~~~~~~~~~~~~~~~~~~ |
| 75 | + |
| 76 | +.. autodata:: SUPPORTED_COLORMAPS |
| 77 | + :annotation: |
| 78 | + |
| 79 | +Available colormaps include: viridis, plasma, magma, inferno, hot, cool, gray, and many more. |
| 80 | +See the matplotlib colormap documentation for visual examples. |
| 81 | + |
| 82 | +Technical Details |
| 83 | +~~~~~~~~~~~~~~~~~ |
| 84 | + |
| 85 | +* **Algorithm**: Essentia NSGConstantQ (Non-Stationary Gabor Constant-Q Transform) |
| 86 | +* **Default Frequency Range**: 75-2400 Hz |
| 87 | +* **Default Bins Per Octave**: 72 (high resolution for microtonal analysis) |
| 88 | +* **Data Format**: uint8 grayscale (0-255), gzip-compressed |
| 89 | +* **Time Resolution**: ~0.0116 seconds per frame (typical) |
| 90 | +* **Frequency Scale**: Logarithmic (perceptually uniform for music) |
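
A few derived quantities follow from the defaults above, worked through in the short sketch below. The bin count and per-bin resolution come directly from the listed frequency range and bins per octave. The hop size and sample rate are assumptions, chosen only because they are one common pairing that reproduces the quoted frame duration; the values actually used by the library are not stated here.

```python
import math

fmin_hz, fmax_hz = 75.0, 2400.0  # documented default frequency range
bins_per_octave = 72             # documented default resolution

# 2400 / 75 = 32, so the default range spans exactly five octaves.
octaves = math.log2(fmax_hz / fmin_hz)
n_bins = round(octaves * bins_per_octave)

# An octave is 1200 cents, so each bin covers 1200 / 72 cents.
cents_per_bin = 1200 / bins_per_octave

# ASSUMPTION: a 512-sample hop at 44.1 kHz, which matches ~0.0116 s per frame.
hop_samples, sample_rate = 512, 44100
frame_seconds = hop_samples / sample_rate

print(f"{octaves:.0f} octaves, {n_bins} bins, {cents_per_bin:.2f} cents per bin")
print(f"{frame_seconds:.4f} s per frame")
```

At roughly 16.7 cents per bin, each equal-tempered semitone is split into six bins, which is what makes the default resolution suitable for microtonal analysis.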