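AVPlayer + AudioUnit: Playing a Video's Audio Track (AVAssetTrack)

Background

In a VoIP app, a video needs to be played on the device during a call, and the video's audio must not leak into the VoIP audio stream, where it would cause echo.

References

Solution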

iOS provides three I/O (input/output) units. The vast majority of audio-unit applications use the Remote I/O unit, which connects to input and output audio hardware and provides low-latency access to individual incoming and outgoing audio sample values. For VoIP apps, the Voice-Processing I/O unit extends the Remote I/O unit by adding acoustic echo cancelation and other features. To send audio back to your application rather than to output audio hardware, use the Generic Output unit.

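Outputting audio through an AudioUnit whose subtype is kAudioUnitSubType_VoiceProcessingIO (rather than kAudioUnitSubType_RemoteIO) makes Apple's built-in echo cancellation available.

Basic ideas

  • If AVPlayer uses AudioUnit internally, simply hook it and change the subType.
  • Grab the real-time audio data while AVPlayer decodes and forward it directly to another AudioUnit for playback. If this works, seeking and similar operations stay aligned by default.
  • Fallback: extract the PCM data from AVPlayer, cache it in memory or in a file, and play it separately. Media time has to be aligned manually.
  • Fallback: play the video with AVPlayer and decode a second copy that plays audio only. Media time has to be aligned manually.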
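For the two fallback options, "aligning media time manually" essentially means converting the player's current time into a position in the separately played PCM stream. The following is a minimal sketch in plain C, assuming constant-rate PCM. The function names are hypothetical, and the 44.1 kHz rate and 4-byte frame size in the usage note are illustrative assumptions, not values from the original project.

```c
#include <stdint.h>

/* Hypothetical helper: convert a media time in seconds to a PCM frame index.
   Assumes a constant sample rate; non-positive inputs clamp to frame 0. */
static uint64_t frame_index_for_time(double seconds, double sample_rate)
{
    if (seconds <= 0.0 || sample_rate <= 0.0) {
        return 0;
    }
    return (uint64_t)(seconds * sample_rate);
}

/* Hypothetical helper: byte offset of a frame inside a cached PCM buffer. */
static uint64_t byte_offset_for_frame(uint64_t frame, uint32_t bytes_per_frame)
{
    return frame * (uint64_t)bytes_per_frame;
}
```

For instance, after a seek to 2.0 s in a 44.1 kHz stream with 4 bytes per frame, playback of the cached audio would resume at frame_index_for_time(2.0, 44100.0) and the corresponding byte offset from byte_offset_for_frame.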

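Attempt 1

Looking at the architecture diagram in Apple's documentation, the first instinct is that AVPlayer's audio playback must also be built on AudioUnit, which would make things easy.

The idea is to hook a few core AudioUnit functions and replace the subType when the unit is initialized. These are all C functions, so fishhook is needed. The hook is installed like this: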

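+ (void)load {
    // rebind_symbols (from fishhook.h) returns 0 on success and a negative value on failure
    int result = rebind_symbols((struct rebinding[1]){{"AudioOutputUnitStart", fg_AudioOutputUnitStart, (void *)&origin_AudioOutputUnitStart}}, 1);
    DebugLog(@"rebind_symbols result: %@", @(result));
}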

The following functions were each tried:

  • AUGraphAddNode
  • AudioOutputUnitStart
  • AudioComponentInstanceNew
  • AudioUnitSetProperty

None of these hooks was ever hit during AVPlayer playback, so this approach failed.

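Attempt 2

Apple's AudioTapProcessor sample shows that an AudioMix can be used to obtain the audio data in real time, so forwarding a copy of it should be enough.

First, observe the player item's status via KVO and pick up the audio track.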

- (void)observeValueForKeyPath:(NSString *)keyPath ofObject:(id)object change:(NSDictionary *)change context:(void *)context {
    if (object == self.player.currentItem) {
        if ([keyPath isEqualToString:@"status"]) {
            if (self.player.currentItem.status == AVPlayerItemStatusReadyToPlay) {
                // Pick out the audio track(s)
                for (AVPlayerItemTrack* track in self.player.currentItem.tracks) {
                    if([track.assetTrack.mediaType isEqualToString:AVMediaTypeAudio]){
                        [self beginRecordingAudioFromTrack:track.assetTrack];
                    }
                }
            }
        }
    }
}

Build an AudioMix from the audio track and assign it to the player item.

    AVMutableAudioMix *audioMix = [AVMutableAudioMix audioMix];
    if (audioMix)
    {
        AVMutableAudioMixInputParameters *audioMixInputParameters = [AVMutableAudioMixInputParameters audioMixInputParametersWithTrack:self.audioAssetTrack];
        if (audioMixInputParameters)
        {
            MTAudioProcessingTapCallbacks callbacks;
            
            callbacks.version = kMTAudioProcessingTapCallbacksVersion_0;
            callbacks.clientInfo = (__bridge void *)self;
            callbacks.init = tap_InitCallback;
            callbacks.finalize = tap_FinalizeCallback;
            callbacks.prepare = tap_PrepareCallback;
            callbacks.unprepare = tap_UnprepareCallback;
            callbacks.process = tap_ProcessCallback;
            
            MTAudioProcessingTapRef audioProcessingTap;
            if (noErr == MTAudioProcessingTapCreate(kCFAllocatorDefault, &callbacks, kMTAudioProcessingTapCreationFlag_PostEffects, &audioProcessingTap))
            {
                audioMixInputParameters.audioTapProcessor = audioProcessingTap;
                
                CFRelease(audioProcessingTap);
                
                audioMix.inputParameters = @[audioMixInputParameters];
                // Assign the mix to the current item
                self.player.currentItem.audioMix = audioMix;
            }
        }
    }

In the prepare callback, read the audio format information and create our own output unit.

static void tap_PrepareCallback(MTAudioProcessingTapRef tap, CMItemCount maxFrames, const AudioStreamBasicDescription *processingFormat)
{
    AVAudioTapProcessorContext *context = (AVAudioTapProcessorContext *)MTAudioProcessingTapGetStorage(tap);
    OSStatus status = noErr;
    AudioComponentDescription outputUnitDescription;
    outputUnitDescription.componentType = kAudioUnitType_Output;
    outputUnitDescription.componentSubType = kAudioUnitSubType_VoiceProcessingIO;
    outputUnitDescription.componentManufacturer = kAudioUnitManufacturer_Apple;
    outputUnitDescription.componentFlags = 0;
    outputUnitDescription.componentFlagsMask = 0;
    AudioComponent audioComponent = AudioComponentFindNext(NULL, &outputUnitDescription);
    if(audioComponent){
        if (noErr == AudioComponentInstanceNew(audioComponent, &context->outputUnit)){
            UInt32 maxFPS = (UInt32)maxFrames;
        
            // Configure the stream format and enable output on bus 0
            status = AudioUnitSetProperty(context->outputUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input,  0, processingFormat, sizeof(AudioStreamBasicDescription));
            UInt32 flag = 1;
            status = AudioUnitSetProperty(context->outputUnit, kAudioOutputUnitProperty_EnableIO, kAudioUnitScope_Output,  0, &flag, sizeof(flag));
            AudioUnitSetProperty(context->outputUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 0, processingFormat, sizeof(AudioStreamBasicDescription));
            // Set the maximum number of frames per render slice
            status = AudioUnitSetProperty(context->outputUnit, kAudioUnitProperty_MaximumFramesPerSlice, kAudioUnitScope_Global, 0,&maxFPS, sizeof(maxFPS));
            // Install the render callback that feeds the unit
            AURenderCallbackStruct callbackStruct;
            callbackStruct.inputProcRefCon = (void *)tap;
            callbackStruct.inputProc = RenderCallback;
            status = AudioUnitSetProperty(context->outputUnit, kAudioUnitProperty_SetRenderCallback, kAudioUnitScope_Input, 0, &callbackStruct, sizeof(AURenderCallbackStruct));
            
            AudioUnitInitialize(context->outputUnit);
            // Start the unit
            AudioOutputUnitStart(context->outputUnit);
        }
    }
}

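In the process callback we fetch the audio data and store a copy. Trying to forward the data straight to the outputUnit at this point reveals a mismatch: the process callback delivers 4096 frames at a time, while the outputUnit's render callback asks for 1024.

This indicates that audio does not start playing as soon as the process callback returns. The data is buffered in memory and pulled out when it is time to play. Following Apple's approach, we also copy the data into memory, then mute the original audio by zeroing out all of its data.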

static void tap_ProcessCallback(MTAudioProcessingTapRef tap, CMItemCount numberFrames, MTAudioProcessingTapFlags flags, AudioBufferList *bufferListInOut, CMItemCount *numberFramesOut, MTAudioProcessingTapFlags *flagsOut)
{
    // Do not play the original audio: copy the source data aside and advance our own frame counter
    AVAudioTapProcessorContext *context = (AVAudioTapProcessorContext *)MTAudioProcessingTapGetStorage(tap);
    MYAudioTapProcessor *self = ((__bridge MYAudioTapProcessor *)context->self);
    OSStatus error = MTAudioProcessingTapGetSourceAudio(tap, numberFrames, bufferListInOut, flagsOut, NULL, numberFramesOut);
    if(error == noErr && bufferListInOut->mBuffers[0].mDataByteSize > 0){
        @synchronized (self) {
            // Count the frames actually delivered, which may be fewer than requested
            self.currentTotalFrame += *numberFramesOut;
            // Only the first buffer is stored; appendBytes copies the memory, like memcpy
            [self.totalBufferData appendBytes:bufferListInOut->mBuffers[0].mData length:bufferListInOut->mBuffers[0].mDataByteSize];
        }
    }
    // Zero the original audio data so that the player itself stays silent
    for (uint32_t i = 0; i < bufferListInOut->mNumberBuffers; ++i) {
        memset(bufferListInOut->mBuffers[i].mData, 0, bufferListInOut->mBuffers[i].mDataByteSize);
    }
    *numberFramesOut = 0;
}

Finally, in our render callback, compute how many bytes the requested 1024 frames need and take them out of the accumulated buffer.

static OSStatus RenderCallback(void *userData, AudioUnitRenderActionFlags *ioActionFlags, const AudioTimeStamp *inTimeStamp, UInt32 inBusNumber, UInt32 inNumberFrames, AudioBufferList *ioData)
{
    AVAudioTapProcessorContext *context = (AVAudioTapProcessorContext *)MTAudioProcessingTapGetStorage(userData);
    MYAudioTapProcessor *self = ((__bridge MYAudioTapProcessor *)context->self);
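    @synchronized (self) {
        if(self.currentPlayedFrame < self.currentTotalFrame){
            // Bytes per frame; uniformly 4 for this stream
            uint64_t perFrameLength = self.totalBufferData.length / self.currentTotalFrame;
            // Never read past the buffered data: clamp to the frames that remain
            uint64_t framesLeft = self.currentTotalFrame - self.currentPlayedFrame;
            uint64_t frames = MIN((uint64_t)inNumberFrames, framesLeft);
            NSData* playData = [self.totalBufferData subdataWithRange:NSMakeRange(perFrameLength*self.currentPlayedFrame, perFrameLength*frames)];
            memcpy(ioData->mBuffers[0].mData, playData.bytes, playData.length);
            // Pad with silence if fewer frames were available than requested
            memset((char *)ioData->mBuffers[0].mData + playData.length, 0, perFrameLength*(inNumberFrames - frames));
            ioData->mBuffers[0].mDataByteSize = (UInt32)(perFrameLength*inNumberFrames);
            ioData->mBuffers[0].mNumberChannels = 1;
            self.currentPlayedFrame += frames;
            return noErr;
        }else{
            // Nothing buffered yet
            return -1;
        }
    }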
}

Here userData is the object we passed in during setup, namely the tap that was stored in inputProcRefCon.
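The mismatch described above (4096 frames per process callback versus 1024 frames per render callback) is a producer/consumer problem, and an ever-growing data buffer is only one way to bridge it. As an alternative, a fixed-size FIFO could be used. The following is a minimal single-threaded sketch in plain C. PCMFifo, its functions, and the capacity are hypothetical names and values, not part of the original code, and a version shared between two threads would additionally need locking or atomic indices.

```c
#include <stdint.h>
#include <string.h>

#define FIFO_CAPACITY 32768u  /* bytes; illustrative size */

typedef struct {
    uint8_t  data[FIFO_CAPACITY];
    uint32_t read_pos;  /* index of the next byte to read */
    uint32_t count;     /* number of bytes currently stored */
} PCMFifo;

static void fifo_init(PCMFifo *f)
{
    f->read_pos = 0;
    f->count = 0;
}

/* Producer side (process callback): stores as much of src as fits
   and returns the number of bytes actually stored. */
static uint32_t fifo_write(PCMFifo *f, const void *src, uint32_t len)
{
    uint32_t space = FIFO_CAPACITY - f->count;
    if (len > space) len = space;
    uint32_t write_pos = (f->read_pos + f->count) % FIFO_CAPACITY;
    uint32_t first = FIFO_CAPACITY - write_pos;   /* bytes before wrap-around */
    if (first > len) first = len;
    memcpy(f->data + write_pos, src, first);
    memcpy(f->data, (const uint8_t *)src + first, len - first);
    f->count += len;
    return len;
}

/* Consumer side (render callback): always fills len bytes of dst,
   padding with zeros (silence) on underrun. Returns the number of
   bytes of real audio that were copied. */
static uint32_t fifo_read(PCMFifo *f, void *dst, uint32_t len)
{
    uint32_t avail = len < f->count ? len : f->count;
    uint32_t first = FIFO_CAPACITY - f->read_pos; /* bytes before wrap-around */
    if (first > avail) first = avail;
    memcpy(dst, f->data + f->read_pos, first);
    memcpy((uint8_t *)dst + first, f->data, avail - first);
    memset((uint8_t *)dst + avail, 0, len - avail);
    f->read_pos = (f->read_pos + avail) % FIFO_CAPACITY;
    f->count -= avail;
    return avail;
}
```

With 4 bytes per frame, one process callback would write 4096 × 4 = 16384 bytes, and four render callbacks would each read 1024 × 4 = 4096 bytes. A fifth read before the next write would return silence instead of an error.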

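Notes

  • AudioUnit operations must be performed on the audio thread, for example inside the tap callbacks. Otherwise they can deadlock.
  • An AudioUnit that has been started must be stopped and released: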
AudioOutputUnitStop(self.outputUnit);
AudioUnitUninitialize(self.outputUnit);
AudioComponentInstanceDispose(self.outputUnit);