Overview
User mute strategies control whether incoming user input should be suppressed based on the current system state. They determine when user audio and transcriptions should be muted to prevent interruptions during critical bot operations like initial responses or function calls.
By default, user input is never muted. You can configure mute strategies to automatically suppress user input in specific scenarios, such as while the bot is speaking or during function execution. Custom strategies can also be implemented for specific use cases.
Configuration
User mute strategies are configured via LLMUserAggregatorParams when creating an LLMContextAggregatorPair:
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
    LLMUserAggregatorParams,
)
from pipecat.turns.user_mute import (
    MuteUntilFirstBotCompleteUserMuteStrategy,
    FunctionCallUserMuteStrategy,
)
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        user_mute_strategies=[
            MuteUntilFirstBotCompleteUserMuteStrategy(),
            FunctionCallUserMuteStrategy(),
        ],
    ),
)
Available Strategies
AlwaysUserMuteStrategy
Mutes user input whenever the bot is speaking. This prevents any interruptions during bot speech.
from pipecat.turns.user_mute import AlwaysUserMuteStrategy
strategy = AlwaysUserMuteStrategy()
Behavior:
- Mutes when BotStartedSpeakingFrame is received
- Unmutes when BotStoppedSpeakingFrame is received
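The behavior above amounts to a two-state machine keyed on the bot speaking frames. This standalone sketch uses stand-in frame classes so the logic can run without a pipeline; it is an illustration, not the pipecat implementation:

```python
# Stand-in frame classes (the real ones live in pipecat.frames.frames).
class BotStartedSpeakingFrame:
    pass

class BotStoppedSpeakingFrame:
    pass

class AlwaysMuteSketch:
    """Mute whenever the bot is speaking."""

    def __init__(self):
        self._bot_speaking = False

    def process_frame(self, frame) -> bool:
        if isinstance(frame, BotStartedSpeakingFrame):
            self._bot_speaking = True
        elif isinstance(frame, BotStoppedSpeakingFrame):
            self._bot_speaking = False
        # Muted exactly while the bot is speaking.
        return self._bot_speaking
```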
FirstSpeechUserMuteStrategy
Mutes user input only during the bot’s first speech. After the initial response completes, user input is allowed even while the bot is speaking.
from pipecat.turns.user_mute import FirstSpeechUserMuteStrategy
strategy = FirstSpeechUserMuteStrategy()
Behavior:
- Allows user input before bot speaks
- Mutes during the first bot speech only
- Unmutes permanently after first speech completes
Use this strategy when you want to ensure the bot’s greeting or initial
response isn’t interrupted, but allow normal interruptions afterward.
MuteUntilFirstBotCompleteUserMuteStrategy
Mutes user input from the start of the interaction until the bot completes its first speech. This ensures the bot maintains full control at the beginning of a conversation.
from pipecat.turns.user_mute import MuteUntilFirstBotCompleteUserMuteStrategy
strategy = MuteUntilFirstBotCompleteUserMuteStrategy()
Behavior:
- Mutes immediately when the pipeline starts (before bot speaks)
- Remains muted until first BotStoppedSpeakingFrame is received
- Unmutes permanently after first speech completes
Unlike FirstSpeechUserMuteStrategy, this strategy mutes user input even
before the bot starts speaking. Use this when you don’t want to process any
user input until the bot has delivered its initial message.
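The state machine described above is a one-way latch: muted from the start, unmuted permanently once the first bot speech completes. A standalone sketch (the frame class is a stand-in, not the pipecat one):

```python
class BotStoppedSpeakingFrame:
    pass

class MuteUntilFirstCompleteSketch:
    """Muted until the bot's first speech has completed."""

    def __init__(self):
        self._first_speech_done = False

    def process_frame(self, frame) -> bool:
        if isinstance(frame, BotStoppedSpeakingFrame):
            # One-way latch: never mutes again after this.
            self._first_speech_done = True
        return not self._first_speech_done
```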
FunctionCallUserMuteStrategy
Mutes user input while function calls are executing. This prevents user interruptions during potentially long-running tool operations.
from pipecat.turns.user_mute import FunctionCallUserMuteStrategy
strategy = FunctionCallUserMuteStrategy()
Behavior:
- Mutes when FunctionCallsStartedFrame is received
- Tracks multiple concurrent function calls
- Unmutes when all function calls complete (via FunctionCallResultFrame or FunctionCallCancelFrame)
This strategy is particularly useful when function calls trigger external API
requests or database operations that may take several seconds to complete and
you don't want the user to interrupt the output.
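Tracking multiple concurrent calls reduces to keeping a set of in-flight call IDs and staying muted while the set is non-empty. The pipecat strategy's internals may differ; the method names in this sketch are illustrative only:

```python
class FunctionCallTrackerSketch:
    """Muted while any function call is in flight."""

    def __init__(self):
        self._in_flight: set[str] = set()

    def calls_started(self, call_ids: list[str]) -> bool:
        self._in_flight.update(call_ids)
        return self.muted

    def call_completed(self, call_id: str) -> bool:
        # Covers both results and cancellations.
        self._in_flight.discard(call_id)
        return self.muted

    @property
    def muted(self) -> bool:
        return bool(self._in_flight)
```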
Combining Multiple Strategies
Multiple strategies can be combined in a list. The strategies are combined with OR logic—if any strategy indicates the user should be muted, user input is suppressed.
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        user_mute_strategies=[
            MuteUntilFirstBotCompleteUserMuteStrategy(),  # Mute until first response
            FunctionCallUserMuteStrategy(),  # Mute during function calls
        ],
    ),
)
In this example, user input is muted:
- From pipeline start until the bot completes its first speech
- Whenever function calls are executing (even after first speech)
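The OR combination can be pictured as evaluating every strategy for each frame (so each one still sees the frame and keeps its state current) and muting if any returns True. Whether the aggregator implements it exactly this way is an internal detail; this sketch uses plain callables in place of strategy objects:

```python
def evaluate_mute(strategies, frame) -> bool:
    # Evaluate every strategy, even after one returns True, so each
    # can still observe the frame and update its internal state.
    decisions = [strategy(frame) for strategy in strategies]
    return any(decisions)

# Two toy "strategies": one mutes on even numbers, one on negatives.
strategies = [lambda f: f % 2 == 0, lambda f: f < 0]
```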
Usage Examples
Prevent Interruptions During Greeting
Ensure the bot’s greeting plays completely before accepting user input:
from pipecat.turns.user_mute import MuteUntilFirstBotCompleteUserMuteStrategy
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        user_mute_strategies=[
            MuteUntilFirstBotCompleteUserMuteStrategy(),
        ],
    ),
)
Mute During Function Calls Only
Allow normal interruptions but prevent them during tool execution:
from pipecat.turns.user_mute import FunctionCallUserMuteStrategy
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        user_mute_strategies=[
            FunctionCallUserMuteStrategy(),
        ],
    ),
)
Never Allow Interruptions
Always mute user input while the bot is speaking:
from pipecat.turns.user_mute import AlwaysUserMuteStrategy
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        user_mute_strategies=[
            AlwaysUserMuteStrategy(),
        ],
    ),
)
Building Custom Strategies
Subclass BaseUserMuteStrategy when none of the built-in strategies fit. A strategy only needs to answer one question per frame: should the user be muted right now?
The base class BaseUserMuteStrategy (in pipecat.turns.user_mute) exposes the following interface. The signatures below are a simplified view of what your subclass can override:
# Simplified interface; see the source for the full definition.
class BaseUserMuteStrategy:
    async def setup(self, task_manager): ...
    async def cleanup(self): ...
    async def reset(self): ...

    async def process_frame(self, frame: Frame) -> bool:
        """Return True if the user should be muted after this frame."""
        return False
Override process_frame to update internal state and return the current mute decision. Override reset if your strategy tracks turn-based state that should clear between conversations.
Which frames reach a strategy
Each strategy’s process_frame is called for every frame that passes through the user aggregator, except StartFrame, EndFrame, and CancelFrame. This includes:
- User-direction frames from the input transport and STT: TranscriptionFrame, InterimTranscriptionFrame, UserStartedSpeakingFrame, UserStoppedSpeakingFrame, VADUserStartedSpeakingFrame, VADUserStoppedSpeakingFrame, InputAudioRawFrame, InterruptionFrame
- SystemFrame broadcasts from elsewhere in the pipeline: BotStartedSpeakingFrame, BotStoppedSpeakingFrame, FunctionCallsStartedFrame, FunctionCallResultFrame, FunctionCallCancelFrame
Frames that don’t naturally reach the user aggregator (for example
LLMTextFrame or TTSTextFrame, which flow downstream from the LLM or TTS)
won’t be seen by a strategy directly. To react to those signals, place a
companion FrameProcessor where the frames do flow and have it toggle state
on your strategy. See Toggling a strategy at
runtime below.
Which frames get suppressed when muted
Returning True from your strategy sets the aggregator’s mute state. While muted, only these frame types are actually dropped:
- InterruptionFrame
- VADUserStartedSpeakingFrame, VADUserStoppedSpeakingFrame
- UserStartedSpeakingFrame, UserStoppedSpeakingFrame
- InputAudioRawFrame
- InterimTranscriptionFrame, TranscriptionFrame
All other frames continue to flow so the rest of the pipeline keeps functioning.
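The drop rule amounts to an allowlist check: while muted, a frame is dropped only if its type is in the suppressed set. A standalone sketch keyed on class names (stand-ins for the pipecat frame types):

```python
SUPPRESSED_WHEN_MUTED = {
    "InterruptionFrame",
    "VADUserStartedSpeakingFrame",
    "VADUserStoppedSpeakingFrame",
    "UserStartedSpeakingFrame",
    "UserStoppedSpeakingFrame",
    "InputAudioRawFrame",
    "InterimTranscriptionFrame",
    "TranscriptionFrame",
}

def should_drop(frame, muted: bool) -> bool:
    # Only user-input frame types are dropped, and only while muted;
    # everything else always flows through.
    return muted and type(frame).__name__ in SUPPRESSED_WHEN_MUTED
```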
Example: a simple custom strategy
Mute the user whenever the bot is speaking, but only after a specific number of bot turns:
from pipecat.frames.frames import BotStartedSpeakingFrame, BotStoppedSpeakingFrame, Frame
from pipecat.turns.user_mute import BaseUserMuteStrategy

class AfterNTurnsUserMuteStrategy(BaseUserMuteStrategy):
    def __init__(self, mute_after_turn: int = 3):
        super().__init__()
        self._mute_after_turn = mute_after_turn
        self._bot_turns = 0
        self._bot_speaking = False

    async def reset(self):
        self._bot_turns = 0
        self._bot_speaking = False

    async def process_frame(self, frame: Frame) -> bool:
        await super().process_frame(frame)
        if isinstance(frame, BotStartedSpeakingFrame):
            self._bot_speaking = True
        elif isinstance(frame, BotStoppedSpeakingFrame):
            self._bot_speaking = False
            self._bot_turns += 1
        return self._bot_speaking and self._bot_turns >= self._mute_after_turn
Toggling a strategy at runtime
Strategies are plain Python objects. Anything that holds a reference to one can flip its state between frames, which means a companion processor placed elsewhere in the pipeline can drive the mute decision based on signals the strategy can’t observe directly (LLM text, tool results, external events).
This example strategy adds its own enable/disable methods (not part of the base contract) and returns their state from process_frame:
from pipecat.frames.frames import Frame
from pipecat.turns.user_mute import BaseUserMuteStrategy

class ToggleableUserMuteStrategy(BaseUserMuteStrategy):
    def __init__(self):
        super().__init__()
        self._muted = False

    def enable(self):
        self._muted = True

    def disable(self):
        self._muted = False

    async def reset(self):
        self._muted = False

    async def process_frame(self, frame: Frame) -> bool:
        await super().process_frame(frame)
        return self._muted
A companion processor watches for the trigger and toggles the strategy:
from pipecat.frames.frames import (
    BotStartedSpeakingFrame,
    BotStoppedSpeakingFrame,
    Frame,
    LLMTextFrame,
)
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor

class DisclaimerGuardProcessor(FrameProcessor):
    def __init__(self, strategy: ToggleableUserMuteStrategy, trigger_phrase: str, **kwargs):
        super().__init__(**kwargs)
        self._strategy = strategy
        self._trigger = trigger_phrase
        # Keep a small sliding window so cross-frame matches work without
        # the buffer growing unbounded if the trigger never appears.
        self._max_buffer = max(len(trigger_phrase) * 4, 512)
        self._buffer = ""
        self._active = False

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)

        if isinstance(frame, BotStartedSpeakingFrame):
            # Start each bot turn with a fresh buffer.
            self._buffer = ""
        elif isinstance(frame, LLMTextFrame) and direction == FrameDirection.DOWNSTREAM:
            self._buffer = (self._buffer + frame.text)[-self._max_buffer :]
            if not self._active and self._trigger in self._buffer:
                self._active = True
                self._strategy.enable()
        elif isinstance(frame, BotStoppedSpeakingFrame) and self._active:
            self._active = False
            self._buffer = ""
            self._strategy.disable()

        await self.push_frame(frame, direction)
Wire them together by passing the same strategy instance to both the aggregator and the processor:
mute_strategy = ToggleableUserMuteStrategy()

user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(user_mute_strategies=[mute_strategy]),
)

disclaimer_guard = DisclaimerGuardProcessor(
    strategy=mute_strategy,
    trigger_phrase="Please read the following disclosure",
)

pipeline = Pipeline([
    transport.input(),
    stt,
    user_aggregator,
    llm,
    disclaimer_guard,  # positioned where LLMTextFrame flows downstream
    tts,
    transport.output(),
    assistant_aggregator,
])
Example: mute for the first N words of the bot speaking
Count words as the LLM streams text and keep the user muted until the threshold is reached. The strategy owns the counter and resets it each turn; a companion processor feeds it text:
from pipecat.frames.frames import BotStartedSpeakingFrame, BotStoppedSpeakingFrame, Frame
from pipecat.turns.user_mute import BaseUserMuteStrategy

class FirstNWordsUserMuteStrategy(BaseUserMuteStrategy):
    def __init__(self, word_count: int = 10):
        super().__init__()
        self._threshold = word_count
        self._words_seen = 0
        self._bot_speaking = False

    def add_words(self, text: str):
        self._words_seen += len(text.split())

    async def reset(self):
        self._words_seen = 0
        self._bot_speaking = False

    async def process_frame(self, frame: Frame) -> bool:
        await super().process_frame(frame)
        if isinstance(frame, BotStartedSpeakingFrame):
            self._bot_speaking = True
            self._words_seen = 0
        elif isinstance(frame, BotStoppedSpeakingFrame):
            self._bot_speaking = False
        return self._bot_speaking and self._words_seen < self._threshold
from pipecat.frames.frames import Frame, LLMTextFrame
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor

class LLMTextWordCounter(FrameProcessor):
    def __init__(self, strategy: FirstNWordsUserMuteStrategy, **kwargs):
        super().__init__(**kwargs)
        self._strategy = strategy

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)
        if isinstance(frame, LLMTextFrame) and direction == FrameDirection.DOWNSTREAM:
            self._strategy.add_words(frame.text)
        await self.push_frame(frame, direction)
Because mute decisions are only re-evaluated when frames pass through the
aggregator, the unmute point here aligns with the next user or bot frame after
the threshold is crossed, not the exact word boundary. For tighter control,
drop the word count and gate on a sentinel phrase the LLM emits at the end of
the protected section, as shown in the disclaimer example above.
Event Handlers
You can register event handlers to be notified when user muting starts or stops. This is useful for observability or providing feedback to users.
Available Events
on_user_mute_started
Called when user input becomes muted due to any active mute strategy.
@user_aggregator.event_handler("on_user_mute_started")
async def on_user_mute_started(aggregator):
    logger.info("User mute started")
on_user_mute_stopped
Called when user input is unmuted (no active mute strategies).
@user_aggregator.event_handler("on_user_mute_stopped")
async def on_user_mute_stopped(aggregator):
    logger.info("User mute stopped")
These events fire whenever the mute state changes, regardless of which
strategy triggered the change. Use them to provide consistent feedback across
all mute scenarios.
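Because the events are edge-triggered (they fire on state changes, not on every frame), the dispatch logic reduces to comparing each new mute decision against the previous one. A standalone sketch with plain callbacks, not the pipecat event API:

```python
class MuteStateNotifier:
    """Invoke callbacks only when the mute state actually changes."""

    def __init__(self, on_started, on_stopped):
        self._on_started = on_started
        self._on_stopped = on_stopped
        self._muted = False

    def update(self, muted: bool):
        if muted and not self._muted:
            self._on_started()
        elif not muted and self._muted:
            self._on_stopped()
        self._muted = muted
```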