Spectrograms Preferences

From Audacity Development Manual
Jump to: navigation, search



You can view any audio track as a Spectrogram instead of a waveform by selecting Spectrogram view from the Audio Track Dropdown Menu. Spectrograms Preferences lets you adjust some of the settings for these different types of Spectrum-based view.

See the Spectrogram View page for detailed descriptions and illustrations of the effects of various Spectrograms Preferences settings.

Tip Spectrograms Preferences does not affect settings in the Frequency Analysis window accessed by Analyze > Plot Spectrum.
Accessed by: Edit > Preferences > Spectrograms    (on a Mac Audacity > Preferences > Spectrograms )
Audio Settings PreferencesPlayback PreferencesRecording PreferencesMIDI Devices PreferencesQuality PreferencesInterface PreferencesTracks PreferencesTracks Behaviors PreferencesSpectrograms PreferencesImport/Export PreferencesExtended Import PreferencesLibraries PreferencesDirectories PreferencesWarnings PreferencesEffects PreferencesShortcuts PreferencesMouse PreferencesModules PreferencesApplication PreferencesClick on this in Audacity to get helpPreferences Spectrograms.png


The options in this window apply to the Spectrogram View in the Audio Track Dropdown Menu. Some options have no effect, or a different effect, when the Pitch (EAC) algorithm is selected - these will be noted in the descriptions.
Tip Be aware that it is possible to make settings for individual Spectrogram view tracks that can over-ride the global settings that you make here in the Spectrograms Preferences dialog. See Spectrogram Settings on the Spectrogram View page.

Scale

  • Scale (in Spectrogram views):
    • Linear The linear vertical scale goes linearly from 0 kHz to 20 kHz frequency by default.
    • Logarithmic: This view is the same as the linear view except that the vertical scale is logarithmic. See Spectrogram View for a contrasting example of linear versus logarithmic spectrogram view.
    • Mel: The name Mel comes from the word melody to indicate that the scale is based on pitch comparisons. See this Wikipedia page. This is the default scale.
    • Bark: This is a psychoacoustical scale based on subjective measurements of loudness. It is related to, but somewhat less popular than, the Mel scale. See this Wikipedia page.
    • ERB: The Equivalent Rectangular Bandwidth scale or ERB is a measure used in psychoacoustics, which gives an approximation to the bandwidths of the filters in human hearing. It is implemented as a function ERBS(f) which returns the number of equivalent rectangular bandwidths below the given frequency "f". See this Wikipedia page.
    • Period: is the previously undocumented scale used by Pitch (EAC) view. It is for making the same displays of Pitch possible as in earlier versions of Audacity.
  • Minimum Frequency: This value corresponds to the bottom of the vertical scale in the spectrogram. Frequencies below this value will not be visible. The default value of "0" here will be treated as "1" when using "Spectrogram Logarithmic" view mode because a logarithmic scale cannot start at zero.
  • Maximum Frequency: This value corresponds to the top of the vertical scale. The value can be set to 100 Hz or any higher value. Irrespective of the entered value, the top of the scale will never exceed half the current sample rate of the track (for example, 22,050 Hz if the track rate is 44,100 Hz) because any given sample rate can only carry frequencies up to half that rate. A good use of this setting is in speech recognition or pitch extraction, where you can hide the visually unimportant highest frequencies and focus on the lower frequencies.


Colors

  • Gain (dB): Increases / decreases the brightness of the display. For small signals where the display is mostly "blue" (dark) you can increase this value to see brighter colors and give more detail. If the display has too much "white", decrease this value. The default is 20dB and corresponds to a -20 dB signal at a particular frequency being displayed as "white". This option has no effect and is grayed out when the Pitch (EAC) algorithm is selected.
  • Range (dB): Affects the range of signal sizes that will be displayed as colors. The default is 80 dB and means that you will not see anything for signals 80 dB below the value set for "Gain". This option has no effect and is grayed out when the Pitch (EAC) algorithm is selected.
  • High boost (dB/dec): A positive value here gives some extra gain to higher frequencies (above 1,000 Hz), as they tend to be smaller and so cannot be seen as well. You get less gain at lower frequencies as well. The default is 0 dB. This option has no effect and is grayed out when the Pitch (EAC) algorithm is selected.
  • Scheme: Choice of two colorways or two grayscale settings.


Algorithm

  • Algorithm:
    • Frequencies (default): Audio frequency determines the pitch of a sound. Measured in Hz, higher frequencies have higher pitch. See this Wikipedia article.
    • Reassignment: The method of reassignment sharpens blurry time-frequency data by relocating the data according to local estimates of instantaneous frequency and group delay. This mapping to reassigned time-frequency coordinates is very precise for signals that are separable in time and frequency with respect to the analysis window.
    • Pitch (EAC): Highlights the contour of the fundamental frequency (musical pitch) of the audio, using the Enhanced Autocorrelation (EAC) algorithm. The EAC Algorithm was developed to produce a mathematical representation of the changes of pitch in a piece of audio. The aim was to allow automated comparison of sound files so that two versions of the same tune could be recognized as being similar, even if played in different keys, or on different instruments.
  • Window Size: The dropdown menu lets you choose the size of the Fast Fourier Transform (FFT) window which affects how much vertical (frequency) detail you see. Larger FFT window sizes give more low frequency resolution and less temporal resolution, and are slower.
  • Window type: Determines precisely how the spectrogram is computed. Hann is the default setting. 'Rectangular' is slightly faster than other methods, but introduces some artifacts. All methods give broadly similar results.
  • Zero padding factor: Larger values give finer interpolation of the colors along the vertical axis, at the expense of more computation time. Does not affect the time vs. frequency resolution tradeoff. This option has no effect and is grayed out when the Pitch (EAC) algorithm is selected.


Enable Spectral Selection

Check this box "on" if you want to enable spectral selections. These are used to make selections that include a frequency range as well as a time range on tracks in one of the Spectrogram views. Spectral selection is used with special spectral edit effects to make changes to the frequency content of the selected audio. Among other purposes, spectral selection and editing can be used for cleaning up unwanted sound, enhancing certain resonances, changing the quality of a voice or removing mouth sounds from voice work.

Setting this to "on" or "off" will only affect spectrogram displayed tracks which have not already been independently set for this purpose in their Track Control Panel's Spectrogram Preferences. The Track Control Panel setting will override this Preferences setting if Use Preferences is unchecked in the Track Control Panel's Spectrogram Preferences.


Examples of Spectrogram views

For visual examples of what the various settings control please see Spectrogram View.