
iOS Video Encoding with VideoToolbox

source link: https://www.tuicool.com/articles/umUr2mA

Requirement

When encoding video data on iOS, a project usually needs only one encoder, but special requirements occasionally call for two encoders working at the same time. This example implements an encoder class: by simply specifying an enum value for the encoder type you can quickly create the encoder you need, and two encoders can work side by side.

Implementation principle:

iOS uses the VideoToolbox framework for hardware-accelerated video encoding; it supports both H.264 and H.265 encoders.

Software encoding: encoding with the CPU.

Hardware encoding: encoding without the CPU, using the GPU or dedicated DSP, FPGA, or ASIC hardware instead.

Prerequisites:

GitHub (with code): Video Encoder

Juejin: Video Encoder

Jianshu: Video Encoder

Blog: Video Encoder

Test Results

To compare H.264 and H.265 encoding efficiency, this example writes the encoded stream to a .mov file. With the same recording duration and essentially the same scene, H.265 needs only about half the storage of H.264 for the same picture quality. Note that the recorded file is a raw stream and can only be played with ffmpeg-based tools (e.g. ffplay).


Implementation Steps

1. Initialize encoder parameters

The encoder class in this example is not a singleton, because we may create an H.264 encoder, an H.265 encoder, or two encoder objects of different types working at the same time. The width, height, and frame rate specified here must match the camera's. The bitrate is the average bitrate during playback. Whether real-time encoding is supported also matters: if real-time encoding is enabled, the bitrate cannot be controlled. Finally, we simply specify the encoder type to decide whether an H.264 or an H.265 encoder is created.

  • Check whether the encoder is supported

Not all devices support the H.265 encoder; this is determined by the hardware, and there is no direct API to query it. Here we use the AVAssetExportPresetHEVCHighestQuality preset to indirectly determine whether H.265 encoding is supported, as in the sketch below.

Note: the H.265 encoding APIs require iOS 11 or later. All currently popular iPhones support the H.264 encoder.
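A minimal sketch of this indirect capability check wrapped in a helper (the method name is hypothetical; the check itself is the same one used in the initializer below):

+ (BOOL)isHEVCEncoderSupported {
    // Assumption: if the system offers the HEVC export preset,
    // the device can encode H.265.
    if (@available(iOS 11.0, *)) {
        return [[AVAssetExportSession allExportPresets] containsObject:AVAssetExportPresetHEVCHighestQuality];
    }
    return NO;
}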

// You could select the H.264 / H.265 encoder.
self.videoEncoder = [[XDXVideoEncoder alloc] initWithWidth:1280
                                                    height:720
                                                       fps:30
                                                   bitrate:2048
                                   isSupportRealTimeEncode:NO
                                               encoderType:XDXH265Encoder]; // or XDXH264Encoder

- (instancetype)initWithWidth:(int)width height:(int)height fps:(int)fps bitrate:(int)bitrate isSupportRealTimeEncode:(BOOL)isSupportRealTimeEncode encoderType:(XDXVideoEncoderType)encoderType {
    if (self = [super init]) {
        mSession                    = NULL;
        mVideoFile                  = NULL;
        _width                      = width;
        _height                     = height;
        _fps                        = fps;
        _bitrate                    = bitrate << 10;  // convert to bps
        _errorCount                 = 0;
        _isSupportEncoder           = NO;
        _encoderType                = encoderType;
        _lock                       = [[NSLock alloc] init];
        _isSupportRealTimeEncode    = isSupportRealTimeEncode;
        _needResetKeyParamSetBuffer = YES;
        if (encoderType == XDXH265Encoder) {
            if (@available(iOS 11.0, *)) {
                if ([[AVAssetExportSession allExportPresets] containsObject:AVAssetExportPresetHEVCHighestQuality]) {
                    _isSupportEncoder = YES;
                }
            }
        } else if (encoderType == XDXH264Encoder) {
            _isSupportEncoder = YES;
        }

        log4cplus_info("Video Encoder:","Init encoder width:%d, height:%d, fps:%d, bitrate:%d, is support realtime encode:%d, encoder type:H%lu", width, height, fps, bitrate, isSupportRealTimeEncode, (unsigned long)encoderType);
    }

    return self;
}

2. Initialize the encoder

Initializing an encoder takes three steps: first create a VTCompressionSessionRef object that manages the encoder, then assign all the encoder properties to that session, and finally pre-allocate resources before encoding (i.e. allocate memory for the data to be encoded) so the encode buffers are ready.

- (void)configureEncoderWithWidth:(int)width height:(int)height {
    log4cplus_info("Video Encoder:", "configure encoder width and height for init, width = %d, height = %d", width, height);

    if (width == 0 || height == 0) {
        log4cplus_error("Video Encoder:", "encoder params can't be zero. width:%d, height:%d", width, height);
        return;
    }

    self.width   = width;
    self.height  = height;

    mSession = [self configureEncoderWithEncoderType:self.encoderType
                                            callback:EncodeCallBack
                                               width:self.width
                                              height:self.height
                                                 fps:self.fps
                                             bitrate:self.bitrate
                             isSupportRealtimeEncode:self.isSupportRealTimeEncode
                                      iFrameDuration:30
                                                lock:self.lock];
}

- (VTCompressionSessionRef)configureEncoderWithEncoderType:(XDXVideoEncoderType)encoderType callback:(VTCompressionOutputCallback)callback width:(int)width height:(int)height fps:(int)fps bitrate:(int)bitrate isSupportRealtimeEncode:(BOOL)isSupportRealtimeEncode iFrameDuration:(int)iFrameDuration lock:(NSLock *)lock {
    log4cplus_info("Video Encoder:","configure encoder width:%d, height:%d, fps:%d, bitrate:%d, is support realtime encode:%d, I frame duration:%d", width, height, fps, bitrate, isSupportRealtimeEncode, iFrameDuration);

    [lock lock];
    // Create compression session
    VTCompressionSessionRef session = [self createCompressionSessionWithEncoderType:encoderType
                                                                               width:width
                                                                              height:height
                                                                            callback:callback];

    // Set compression properties
    [self setCompressionSessionPropertyWithSession:session
                                               fps:fps
                                           bitrate:bitrate
                           isSupportRealtimeEncode:isSupportRealtimeEncode
                                    iFrameDuration:iFrameDuration
                                       EncoderType:encoderType];

    // Prepare to encode
    OSStatus status = VTCompressionSessionPrepareToEncodeFrames(session);
    [lock unlock];
    if (status != noErr) {
        log4cplus_error("Video Encoder:", "create encoder failed, status: %d", (int)status);
        return NULL;
    } else {
        log4cplus_info("Video Encoder:","create encoder success");
        return session;
    }
}

2.1. Create the VTCompressionSessionRef object

VTCompressionSessionCreate: creates the video encoder session, i.e. the object that manages the encoder context.

  • allocator: the session's memory allocator; pass NULL to use the default allocator.
  • width, height: the encoder's width and height in pixels; keep them consistent with the captured video resolution.
  • codecType: the encoder type. H.264 and H.265 are the two mainstream choices today, with H.264 the most widely used. H.265 is H.264's successor with better compression, but it was only opened up in iOS 11 and still has some bugs.
  • encoderSpecification: forces a particular encoder to be used. Normally pass NULL and let VideoToolbox choose.
  • sourceImageBufferAttributes: attributes required of the source video data, mainly used to create a pixel buffer pool.
  • compressedDataAllocator: allocator for the compressed data; pass NULL to use the default.
  • outputCallback: the callback that receives the compressed data. It can be invoked synchronously or asynchronously: synchronously on the same thread as the VTCompressionSessionEncodeFrame call, or asynchronously on a newly created thread. This parameter may be NULL if and only if you encode with VTCompressionSessionEncodeFrameWithOutputHandler.
  • outputCallbackRefCon: user-defined data, mainly used for interaction between the callback and the owning class.
  • compressionSessionOut: the address at which to create the session. Note: it must not be NULL.
VT_EXPORT OSStatus 
VTCompressionSessionCreate(
CM_NULLABLE CFAllocatorRef                            allocator,
int32_t                                                width,
int32_t                                                height,
CMVideoCodecType                                    codecType,
CM_NULLABLE CFDictionaryRef                            encoderSpecification,
CM_NULLABLE CFDictionaryRef                            sourceImageBufferAttributes,
CM_NULLABLE CFAllocatorRef                            compressedDataAllocator,
CM_NULLABLE VTCompressionOutputCallback                outputCallback,
void * CM_NULLABLE                                    outputCallbackRefCon,
CM_RETURNS_RETAINED_PARAMETER CM_NULLABLE VTCompressionSessionRef * CM_NONNULL compressionSessionOut) API_AVAILABLE(macosx(10.8), ios(8.0), tvos(10.2));

Below is the concrete usage. Note that if the camera's capture resolution changes, the current encoder session must be destroyed and recreated (see the teardown sketch after this method).

- (VTCompressionSessionRef)createCompressionSessionWithEncoderType:(XDXVideoEncoderType)encoderType width:(int)width height:(int)height callback:(VTCompressionOutputCallback)callback {
    CMVideoCodecType codecType;
    if (encoderType == XDXH264Encoder) {
        codecType = kCMVideoCodecType_H264;
    } else if (encoderType == XDXH265Encoder) {
        codecType = kCMVideoCodecType_HEVC;
    } else {
        return NULL;
    }

    VTCompressionSessionRef session;
    OSStatus status = VTCompressionSessionCreate(NULL,
                                                 width,
                                                 height,
                                                 codecType,
                                                 NULL,
                                                 NULL,
                                                 NULL,
                                                 callback,
                                                 (__bridge void *)self,
                                                 &session);

    if (status != noErr) {
        log4cplus_error("Video Encoder:", "%s: Create session failed:%d", __func__, (int)status);
        return NULL;
    } else {
        return session;
    }
}
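When the resolution changes, a minimal teardown sketch (these are standard VideoToolbox calls; the surrounding flow is an assumption) could run before reconfiguring with the new size:

// Minimal teardown sketch: finish pending frames, invalidate, release.
if (mSession != NULL) {
    VTCompressionSessionCompleteFrames(mSession, kCMTimeInvalid); // flush any frames still in flight
    VTCompressionSessionInvalidate(mSession);                     // tear the session down
    CFRelease(mSession);
    mSession = NULL;
}
// Then call configureEncoderWithWidth:height: again with the new size.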

2.2. Set session properties

  • Query whether the session supports a given property

After the session is created, VTSessionCopySupportedPropertyDictionary copies all properties the current session supports into a dictionary; from then on, before setting a property, just check the dictionary to see whether it is supported.

- (BOOL)isSupportPropertyWithSession:(VTCompressionSessionRef)session key:(CFStringRef)key {
    OSStatus status;
    // Note: the dictionary is cached once per process; if two sessions with
    // different codec types coexist, their supported keys may differ.
    static CFDictionaryRef supportedPropertyDictionary;
    if (!supportedPropertyDictionary) {
        status = VTSessionCopySupportedPropertyDictionary(session, &supportedPropertyDictionary);
        if (status != noErr) {
            return NO;
        }
    }

    BOOL isSupport = CFDictionaryContainsKey(supportedPropertyDictionary, key) ? YES : NO;
    return isSupport;
}
  • Set a session property

Use VTSessionSetProperty with a key and value to set a property.

- (OSStatus)setSessionPropertyWithSession:(VTCompressionSessionRef)session key:(CFStringRef)key value:(CFTypeRef)value {
    if (value == NULL) {
        return noErr;
    }

    OSStatus status = VTSessionSetProperty(session, key, value);
    if (status != noErr) {
        log4cplus_error("Video Encoder:", "Set session property %s failed, status = %d", CFStringGetCStringPtr(key, kCFStringEncodingUTF8), status);
    }
    return status;
}
// Set compression properties
[self setCompressionSessionPropertyWithSession:session
                                           fps:fps
                                       bitrate:bitrate
                       isSupportRealtimeEncode:isSupportRealtimeEncode
                                iFrameDuration:iFrameDuration
                                   EncoderType:encoderType];

- (void)setCompressionSessionPropertyWithSession:(VTCompressionSessionRef)session fps:(int)fps bitrate:(int)bitrate isSupportRealtimeEncode:(BOOL)isSupportRealtimeEncode iFrameDuration:(int)iFrameDuration EncoderType:(XDXVideoEncoderType)encoderType {

    int maxCount = 3;
    if (!isSupportRealtimeEncode) {
        if ([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_MaxFrameDelayCount]) {
            CFNumberRef ref = CFNumberCreate(NULL, kCFNumberSInt32Type, &maxCount);
            [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_MaxFrameDelayCount value:ref];
            CFRelease(ref);
        }
    }

    if (fps) {
        if ([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_ExpectedFrameRate]) {
            int         value = fps;
            CFNumberRef ref   = CFNumberCreate(NULL, kCFNumberSInt32Type, &value);
            [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_ExpectedFrameRate value:ref];
            CFRelease(ref);
        }
    } else {
        log4cplus_error("Video Encoder:", "Current fps is 0");
        return;
    }

    if (bitrate) {
        if ([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_AverageBitRate]) {
            int         value = bitrate << 10;
            CFNumberRef ref   = CFNumberCreate(NULL, kCFNumberSInt32Type, &value);
            [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_AverageBitRate value:ref];
            CFRelease(ref);
        }
    } else {
        log4cplus_error("Video Encoder:", "Current bitrate is 0");
        return;
    }

    if ([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_RealTime]) {
        log4cplus_info("Video Encoder:", "use realTimeEncoder");
        [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_RealTime value:isSupportRealtimeEncode ? kCFBooleanTrue : kCFBooleanFalse];
    }

    // Disable B frames (no frame reordering).
    if ([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_AllowFrameReordering]) {
        [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_AllowFrameReordering value:kCFBooleanFalse];
    }

    if (encoderType == XDXH264Encoder) {
        if (isSupportRealtimeEncode) {
            if ([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_ProfileLevel]) {
                [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_ProfileLevel value:kVTProfileLevel_H264_Main_AutoLevel];
            }
        } else {
            if ([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_ProfileLevel]) {
                [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_ProfileLevel value:kVTProfileLevel_H264_Baseline_AutoLevel];
            }

            if ([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_H264EntropyMode]) {
                [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_H264EntropyMode value:kVTH264EntropyMode_CAVLC];
            }
        }
    } else if (encoderType == XDXH265Encoder) {
        if ([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_ProfileLevel]) {
            [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_ProfileLevel value:kVTProfileLevel_HEVC_Main_AutoLevel];
        }
    }

    if ([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_MaxKeyFrameIntervalDuration]) {
        int         value = iFrameDuration;
        CFNumberRef ref   = CFNumberCreate(NULL, kCFNumberSInt32Type, &value);
        [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_MaxKeyFrameIntervalDuration value:ref];
        CFRelease(ref);
    }

    log4cplus_info("Video Encoder:", "The compression session max frame delay count = %d, expected frame rate = %d, average bitrate = %d, is support realtime encode = %d, I frame duration = %d", maxCount, fps, bitrate, isSupportRealtimeEncode, iFrameDuration);
}

2.3. Allocate resources before encoding

You can optionally call VTCompressionSessionPrepareToEncodeFrames to give the encoder a chance to perform any necessary resource allocation before it starts encoding frames. If you do not call it, any necessary resources are allocated on the first VTCompressionSessionEncodeFrame call; calling it more than once has no additional effect.

// Prepare to encode
OSStatus status = VTCompressionSessionPrepareToEncodeFrames(session);
[lock unlock];
if (status != noErr) {
    log4cplus_error("Video Encoder:", "create encoder failed, status: %d", (int)status);
    return NULL;
} else {
    log4cplus_info("Video Encoder:","create encoder success");
    return session;
}

At this point the encoder initialization is done; next we feed video frames to it for encoding. This example uses AVCaptureSession to capture video frames and pass them to the encoder, as in the sketch below.
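A minimal sketch of that wiring (the wrapper name startEncodeVideoData: is hypothetical; it would forward to the encode method shown in step 3):

// AVCaptureVideoDataOutput delegate: forward each captured frame to the encoder.
- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    // Hypothetical wrapper around startEncodeWithBuffer:session:... below.
    [self.videoEncoder startEncodeVideoData:sampleBuffer];
}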

3. Encoding

Note: the encoding thread runs asynchronously with respect to encoder creation and destruction, so a lock is required.

  • Timestamp synchronization

First we take the first video frame as the reference point and record the current system time as the base time for encoding the first frame. This is mainly used later for audio/video synchronization and is not covered in depth here; note also that real timestamp generation schemes are usually not as simple as in this example, and you can define your own rules.

  • Timestamp correction

Check whether the timestamp of the frame currently being encoded is greater than that of the previous frame. Video plays back strictly in timestamp order, so timestamps must keep increasing. But the frames fed to the encoder may not all come from one source: for example, capture may start from the camera and later switch to raw video decoded from a network stream. At that point the timestamps are out of sync, and force-feeding them to the encoder makes the picture stutter.

  • Encode the video frame
  • session: the previously configured session
  • imageBuffer: the raw video data
  • presentationTimeStamp: the frame's pts
  • duration: the duration of this frame, attached to the sample buffer; pass kCMTimeInvalid if there is no duration information
  • frameProperties: extra properties for this frame; here, forcing an I-frame is used as the example
  • sourceFrameRefcon: a reference to the source frame that is passed through to the callback
  • infoFlagsOut: points to a VTEncodeInfoFlags to receive information about the encode operation. The kVTEncodeInfo_Asynchronous bit may be set if the encode runs (or ran) asynchronously; the kVTEncodeInfo_FrameDropped bit may be set if the frame was dropped (synchronously). Pass NULL if you do not want this information.
VT_EXPORT OSStatus
VTCompressionSessionEncodeFrame(
CM_NONNULL VTCompressionSessionRef    session,
CM_NONNULL CVImageBufferRef            imageBuffer,
CMTime                                presentationTimeStamp,
CMTime                                duration, // may be kCMTimeInvalid
CM_NULLABLE CFDictionaryRef            frameProperties,
void * CM_NULLABLE                    sourceFrameRefcon,
VTEncodeInfoFlags * CM_NULLABLE        infoFlagsOut ) API_AVAILABLE(macosx(10.8), ios(8.0), tvos(10.2));
- (void)startEncodeWithBuffer:(CMSampleBufferRef)sampleBuffer session:(VTCompressionSessionRef)session isNeedFreeBuffer:(BOOL)isNeedFreeBuffer isDrop:(BOOL)isDrop needForceInsertKeyFrame:(BOOL)needForceInsertKeyFrame lock:(NSLock *)lock {
    [lock lock];

    if (session == NULL) {
        log4cplus_error("Video Encoder:", "%s: session is empty", __func__);
        [lock unlock]; // release the lock on the early return (assumption: the failure handler does not unlock)
        [self handleEncodeFailedWithIsNeedFreeBuffer:isNeedFreeBuffer sampleBuffer:sampleBuffer];
        return;
    }

    // The first frame must be an I-frame; use it to create the reference timestamp.
    static BOOL isFirstFrame = YES;
    if (isFirstFrame && g_capture_base_time == 0) {
        CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
        g_capture_base_time = CMTimeGetSeconds(pts); // system absolute time (s)
        isFirstFrame = NO;
        log4cplus_info("Video Encoder:","start capture time = %u", g_capture_base_time);
    }

    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CMTime presentationTimeStamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);

    // Switching between different sources shows mosaic because the timestamps are not in sync.
    static int64_t lastPts = 0;
    int64_t currentPts = (int64_t)(CMTimeGetSeconds(presentationTimeStamp) * 1000);
    if (currentPts - lastPts < 0) {
        log4cplus_error("Video Encoder:","Switched source timestamp < last timestamp, currentPts = %lld, lastPts = %lld, duration = %lld", currentPts, lastPts, currentPts - lastPts);
        [lock unlock]; // same early-return unlock as above
        [self handleEncodeFailedWithIsNeedFreeBuffer:isNeedFreeBuffer sampleBuffer:sampleBuffer];
        return;
    }
    lastPts = currentPts;

    OSStatus status = noErr;
    NSDictionary *properties = @{(__bridge NSString *)kVTEncodeFrameOptionKey_ForceKeyFrame : @(needForceInsertKeyFrame)};
    status = VTCompressionSessionEncodeFrame(session,
                                             imageBuffer,
                                             presentationTimeStamp,
                                             kCMTimeInvalid,
                                             (__bridge CFDictionaryRef)properties,
                                             NULL,
                                             NULL);

    if (status != noErr) {
        log4cplus_error("Video Encoder:", "encode frame failed");
        [self handleEncodeFailedWithIsNeedFreeBuffer:isNeedFreeBuffer sampleBuffer:sampleBuffer];
    }

    [lock unlock];
    if (isNeedFreeBuffer) {
        if (sampleBuffer != NULL) {
            CFRelease(sampleBuffer);
            log4cplus_debug("Video Encoder:", "release the sample buffer");
        }
    }
}

4. The H.264 bitstream - H.264/H.265 hardware codec basics and bitstream analysis

If the bitstream-related code below is hard to follow, be sure to first read the article linked in the heading above; it covers the fundamentals of codecs and how to parse the data structures of the VideoToolbox framework on iOS.

5. The callback function

  • Error checking

If status carries an error, encoding failed and you can handle it accordingly.

  • Timestamp correction

We need to attach timestamps to the encoded data. You can define your own generation rules; here we use the simplest offset scheme: the system time before the first frame was encoded is the base, and each encoded frame's timestamp is its capture timestamp minus that base.

  • Finding I-frames

After encoding, raw video data becomes I-frames, B-frames, and P-frames. On iOS, B-frames are generally not enabled because they require reordering. For each encoded frame, we first use the kCMSampleAttachmentKey_DependsOnOthers attachment to determine whether it is an I-frame. If so, we read the key NALU header information from it: the vps, sps, and pps (the vps exists only with the H.265 encoder). Without these, the encoded video can neither be played on the other end nor recorded to a file.

  • Reading the encoder's key parameter sets

The actual vps, sps, and pps data can be read from the I-frame: for the H.264 encoder call CMVideoFormatDescriptionGetH264ParameterSetAtIndex, and for the H.265 encoder call CMVideoFormatDescriptionGetHEVCParameterSetAtIndex, where the index values 0, 1, 2 in the second parameter select the respective parameter sets.

Once found, these parameter sets must be concatenated, because each is an independent NALU; the start code 0x00, 0x00, 0x00, 0x01 serves as the separator that distinguishes sps from pps.

So, following that rule, we join the vps, sps, and pps with 00 00 00 01 separators into one complete, contiguous buffer. This example writes to a file, so the NALU header information must be written first, i.e. the I-frame goes in first: an I-frame is a complete picture, and P-frames depend on an I-frame to produce a picture, so the file must begin with I-frame data.

  • How a picture relates to NALUs:

After a picture passes through the H.264 encoder, it is encoded into one or more slices, and the carrier of those slices is the NALU.

Note: a slice is not the same concept as a frame. A frame describes a picture: one frame corresponds to one picture. A slice is a concept introduced by H.264: the encoded picture is partitioned for efficient handling, and a picture consists of one or more slices. Slices are carried by NALUs for network transport, but that does not mean a NALU must contain slices; it is a sufficient but not necessary condition, because a NALU may also carry other information describing the video.

  • Splitting the NALUs in the bitstream

First obtain the frame data with CMBlockBufferGetDataPointer. One frame here is a segment of the H.264/H.265 bitstream that may contain multiple NALUs; we need to find each NALU and use 00 00 00 01 as the separator. That is what the while loop does: it walks the stream locating NALUs, because the raw output contains no start codes, so we must copy the start codes in ourselves.

CFSwapInt32BigToHost: converts the NALU length field from the big-endian byte order used in the encoded data to the host byte order.

static void EncodeCallBack(void *outputCallbackRefCon, void *souceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer) {
    XDXVideoEncoder *encoder = (__bridge XDXVideoEncoder *)outputCallbackRefCon;

    if (status != noErr) {
        NSError *error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        NSLog(@"H264: vtCallBack failed with %@", error);
        log4cplus_error("Video Encoder:", "encode frame failed! %s", error.debugDescription.UTF8String);
        return;
    }

    if (!encoder.isSupportEncoder) {
        return;
    }

    CMBlockBufferRef block = CMSampleBufferGetDataBuffer(sampleBuffer);
    CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
    CMTime dts = CMSampleBufferGetDecodeTimeStamp(sampleBuffer);

    // Use our own timestamps (used to sync audio and video).
    int64_t ptsAfter = (int64_t)((CMTimeGetSeconds(pts) - g_capture_base_time) * 1000);
    int64_t dtsAfter = (int64_t)((CMTimeGetSeconds(dts) - g_capture_base_time) * 1000);
    dtsAfter = ptsAfter; // B frames are disabled, so dts can follow pts.

    // Sometimes the relative dts is zero; provide a workaround to restore dts.
    static int64_t last_dts = 0;
    if (dtsAfter == 0) {
        dtsAfter = last_dts + 33;
    } else if (dtsAfter == last_dts) {
        dtsAfter = dtsAfter + 1;
    }

    BOOL isKeyFrame = NO;
    CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, false);
    if (attachments != NULL) {
        CFDictionaryRef attachment = (CFDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
        CFBooleanRef dependsOnOthers = (CFBooleanRef)CFDictionaryGetValue(attachment, kCMSampleAttachmentKey_DependsOnOthers);
        isKeyFrame = (dependsOnOthers == kCFBooleanFalse);
    }

    if (isKeyFrame) {
        static uint8_t *keyParameterSetBuffer    = NULL;
        static size_t  keyParameterSetBufferSize = 0;

        // Note: the NALU header will not change if the video resolution does not change.
        if (keyParameterSetBufferSize == 0 || YES == encoder.needResetKeyParamSetBuffer) {
            const uint8_t *vps, *sps, *pps;
            size_t        vpsSize, spsSize, ppsSize;
            int           NALUnitHeaderLengthOut;
            size_t        parmCount;

            if (keyParameterSetBuffer != NULL) {
                free(keyParameterSetBuffer);
            }

            CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
            if (encoder.encoderType == XDXH264Encoder) {
                CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sps, &spsSize, &parmCount, &NALUnitHeaderLengthOut);
                CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pps, &ppsSize, &parmCount, &NALUnitHeaderLengthOut);

                keyParameterSetBufferSize = spsSize + 4 + ppsSize + 4;
                keyParameterSetBuffer = (uint8_t *)malloc(keyParameterSetBufferSize);
                memcpy(keyParameterSetBuffer, "\x00\x00\x00\x01", 4);
                memcpy(&keyParameterSetBuffer[4], sps, spsSize);
                memcpy(&keyParameterSetBuffer[4 + spsSize], "\x00\x00\x00\x01", 4);
                memcpy(&keyParameterSetBuffer[4 + spsSize + 4], pps, ppsSize);

                log4cplus_info("Video Encoder:", "H264 find IDR frame, spsSize : %zu, ppsSize : %zu", spsSize, ppsSize);
            } else if (encoder.encoderType == XDXH265Encoder) {
                CMVideoFormatDescriptionGetHEVCParameterSetAtIndex(format, 0, &vps, &vpsSize, &parmCount, &NALUnitHeaderLengthOut);
                CMVideoFormatDescriptionGetHEVCParameterSetAtIndex(format, 1, &sps, &spsSize, &parmCount, &NALUnitHeaderLengthOut);
                CMVideoFormatDescriptionGetHEVCParameterSetAtIndex(format, 2, &pps, &ppsSize, &parmCount, &NALUnitHeaderLengthOut);

                keyParameterSetBufferSize = vpsSize + 4 + spsSize + 4 + ppsSize + 4;
                keyParameterSetBuffer = (uint8_t *)malloc(keyParameterSetBufferSize);
                memcpy(keyParameterSetBuffer, "\x00\x00\x00\x01", 4);
                memcpy(&keyParameterSetBuffer[4], vps, vpsSize);
                memcpy(&keyParameterSetBuffer[4 + vpsSize], "\x00\x00\x00\x01", 4);
                memcpy(&keyParameterSetBuffer[4 + vpsSize + 4], sps, spsSize);
                memcpy(&keyParameterSetBuffer[4 + vpsSize + 4 + spsSize], "\x00\x00\x00\x01", 4);
                memcpy(&keyParameterSetBuffer[4 + vpsSize + 4 + spsSize + 4], pps, ppsSize);
                log4cplus_info("Video Encoder:", "H265 find IDR frame, vpsSize : %zu, spsSize : %zu, ppsSize : %zu", vpsSize, spsSize, ppsSize);
            }

            encoder.needResetKeyParamSetBuffer = NO;
        }

        if (encoder.isNeedRecord) {
            if (encoder->mVideoFile == NULL) {
                [encoder initSaveVideoFile];
                log4cplus_info("Video Encoder:", "Start video record.");
            }

            fwrite(keyParameterSetBuffer, 1, keyParameterSetBufferSize, encoder->mVideoFile);
        }

        log4cplus_info("Video Encoder:", "Load a I frame.");
    }

    size_t  blockBufferLength;
    uint8_t *bufferDataPointer = NULL;
    CMBlockBufferGetDataPointer(block, 0, NULL, &blockBufferLength, (char **)&bufferDataPointer);

    // Replace each NALU's big-endian length prefix with the Annex-B start code.
    size_t bufferOffset = 0;
    while (bufferOffset < blockBufferLength - kStartCodeLength) {
        uint32_t NALUnitLength = 0;
        memcpy(&NALUnitLength, bufferDataPointer + bufferOffset, kStartCodeLength);
        NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);
        memcpy(bufferDataPointer + bufferOffset, kStartCode, kStartCodeLength);
        bufferOffset += kStartCodeLength + NALUnitLength;
    }

    if (encoder.isNeedRecord && encoder->mVideoFile != NULL) {
        fwrite(bufferDataPointer, 1, blockBufferLength, encoder->mVideoFile);
    } else {
        if (encoder->mVideoFile != NULL) {
            fclose(encoder->mVideoFile);
            encoder->mVideoFile = NULL;
            log4cplus_info("Video Encoder:", "Stop video record.");
        }
    }

    // log4cplus_debug("Video Encoder:","H265 encoded video:%lld, size:%lu, interval:%lld", dtsAfter, blockBufferLength, dtsAfter - last_dts);

    last_dts = dtsAfter;
}
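For reference, initSaveVideoFile is not shown in this section. Since mVideoFile is used with fwrite/fclose, it must be a FILE *, so a minimal sketch (the file path is hypothetical) could be:

- (void)initSaveVideoFile {
    // Hypothetical path; the real project chooses its own location.
    NSString *path = [NSTemporaryDirectory() stringByAppendingPathComponent:@"test.mov"];
    mVideoFile = fopen(path.UTF8String, "wb");
}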
