LoadAudio

The LoadAudio node is the essential entry point for importing external audio files into the ComfyUI workflow. By supporting a wide range of formats (.wav, .mp3, .ogg, .flac, .aiff, .aif), it enables seamless integration of real-world sounds, music, or voice samples—preparing audio data for downstream processing, synthesis, or AI generation.

Overview

LoadAudio simplifies the task of loading and preprocessing audio files for creative and analytic workflows in ComfyUI. Users select their source file from a specified directory (with auto-scan for all supported formats). The loaded audio is converted to a standardized tensor format suitable for all subsequent audio nodes—such as visualization, transformation, mixing, or synthesis. The node unlocks AI-driven audio editing, music generation, and multimodal fusion by bridging external assets with ComfyUI's node graph.

Visual Example

LoadAudio ComfyUI node
Figure 1 - LoadAudio ComfyUI node

Official Documentation Link

https://www.runcomfy.com/comfyui-nodes/ComfyUI/LoadAudio

Inputs

Parameter Data Type Required Range Description
audio AUDIO Yes All supported audio/video files in input directory The audio file to load from the input directory

Outputs

Output Name Data Type Description
audio AUDIO Audio data containing waveform and sample rate information

Usage Instructions

Add LoadAudio at the start of your workflow. Select your audio file using the provided file browser (automatically detects supported formats and lists available files). Optionally set a target_sample_rate for resampling and choose mono/stereo as required. Connect the output to downstream nodes—such as waveform visualizer, segmentation, transformation, or synthesis modules. Run the workflow to import, preview, and manipulate external audio efficiently.

Advanced Usage

Use the node with batch pipelines by loading multiple files in series or parallel. Combine with custom resamplers, normalization nodes, or channel converters to standardize diverse datasets for AI training or robust generative projects. Integrate with multimodal workflows by loading corresponding audio and pairing with images, text, or video. For complex generative or analytical tasks, normalize audio tensor shape and sample rate using node chaining and automated parameter settings.

Example JSON for API or Workflow Export

{
  "id": "loadaudio_1",
  "type": "LoadAudio",
  "inputs": {
    "file_path": "input/audio/song_demo.mp3",
    "target_sample_rate": 48000,
    "mono": false
  }
}

Tips

  • Keep input directory organized by project; use consistent naming for easy batch and workflow integration.
  • Prefer lossless formats (.wav, .flac) for highest fidelity where source quality matters.
  • Resample audio for compatibility with all downstream nodes (44100 or 48000Hz standard).
  • Convert to mono if input has unwanted stereo artifacts or for model training simplicity.
  • Check output preview for durations/channels; mismatches can be resolved by post-processing.

How It Works (Technical)

LoadAudio scans the input directory for compatible files, loads the selected audio using libraries like torchaudio or librosa, and converts waveform and metadata into tensor format (float32, standardized shape). Optional operations include resampling to target sample rate and converting stereo to mono. The audio tensor is then usable in any downstream ComfyUI node, preserving fidelity and data integrity for synthesis, mixing, analysis, or feature extraction.

Github Alternatives

  • ComfyUI-audio-nodes – Node pack with audio load, encode, decode, and processing—supports Bark, HuBERT, and Encodec workflows.
  • ComfyUI_AudioTools – Full-featured toolkit for audio workflows: loading with custom paths, batch tools, watermarks, string editing, and pause capabilities.
  • ComfyUI_Yvann-Nodes – Audio reactivity and animation sync nodes with load, transform, and feature extraction for music-visual projects.
  • ComfyUI-AudioBatch – Custom batch loader and audio workflow nodes for multi-file import, channel conversion, and resampling.

Videcool workflows

The Clip Text Encode (Positive Prompt) node is used in the following Videcool workflows:

FAQ

1. Which audio formats are supported?
The node supports .wav, .mp3, .ogg, .flac, .aiff, .aif.

2. Can I load multiple files or batches?
Yes, many community nodes support batch loading and can be chained for multi-file workflows.

3. Why is my audio output shape or sample rate mismatched?
Use the target_sample_rate and mono inputs to ensure downstream compatibility.

Common Mistakes and Troubleshooting

Common issues include unsupported file formats, incorrect or missing file paths, and mismatches in sample rate or channel count for downstream nodes. For batch operations, missing files or inconsistent tensor shapes can halt workflows—organize and verify file existence beforehand. Always ensure you match formats, sample rates, and channels expected by your processing nodes. Check output preview and error logs if audio import fails.

Conclusion

LoadAudio is an indispensable utility for building audio-enabled creative, analytic, and generative pipelines in ComfyUI. Direct, adaptable, and format-flexible, it paves the way for sophisticated multimodal projects that merge real-world sound with advanced AI workflows.

More information