Marker-Based AR

I have recently been reading up on AR, and broadly speaking, AR techniques can be grouped into two categories:

  • Marker-based AR
  • Location-based AR

Let me first explain how the two differ. AR (augmented reality) overlays virtual content onto what the device's camera sees. Adding an image on top of a designated target, for example, is marker-based AR, and that target is the marker: the device scans for the target, works out where it appears on screen through the camera feed, and then draws the virtual image at that position. There are plenty of examples, such as Nike shoe showcases and the Chinese app 视+… Location-based AR instead uses the bearing between the target and the device's current position as its reference point; examples of this are even more common, such as QQ AR red packets and Alipay red packets. First, a look at the result:

What follows is a walkthrough of building a simple marker-based AR demo on an iOS device. We all know iOS can recognize faces when taking photos; in fact it can go further and locate a person's left eye, right eye, and mouth… The marker in this demo is the user's eyes and mouth.

Knowing what needs to be done, the next question is which technology can extract that facial information. Here I use AVCaptureDevice from AVFoundation to capture the front camera's video. The following code initializes the camera:

#import "CameraView.h"
#import <AVFoundation/AVFoundation.h>

@interface CameraView ()<AVCaptureVideoDataOutputSampleBufferDelegate>
@end
@implementation CameraView

-(AVCaptureDevice *)frontFacingCameraIfAvailable
{
    NSArray *videoDevices = [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
    AVCaptureDevice *captureDevice = nil;
    for (AVCaptureDevice *device in videoDevices)
    {
        if (device.position == AVCaptureDevicePositionFront)
        {
            captureDevice = device;
            break;
        }
    }

    //  couldn't find one on the front, so just get the default video device.
    if ( ! captureDevice)
    {
        captureDevice = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    }

    return captureDevice;
}

- (void)setupCaptureSession {
    NSError *error = nil;

    _session = [[AVCaptureSession alloc] init];
    _session.sessionPreset = AVCaptureSessionPresetMedium;

    AVCaptureDevice *device = [self frontFacingCameraIfAvailable];
    AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:device error:&error];

    if (!input) {
        // Bail out instead of adding a nil input to the session.
        NSLog(@"Could not create device input: %@", error);
        return;
    }

    [_session addInput:input];

    AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];
    [_session addOutput:output];

    dispatch_queue_t queue = dispatch_queue_create("pumpkins", NULL);
    [output setSampleBufferDelegate:self queue:queue];

    output.videoSettings =
    [NSDictionary dictionaryWithObject:
     [NSNumber numberWithInt:kCVPixelFormatType_32BGRA]
                                forKey:(id)kCVPixelBufferPixelFormatTypeKey];


    [_session startRunning];
}
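
The snippets in this post reference several members that are never declared in the excerpts, such as _session, _imageView, _settingImage, _isProcessingRequest, self.detector, and self.pumpkinImageView. Below is a minimal class-extension sketch of how CameraView might declare them; the exact declarations are my assumption, inferred only from how the names are used later.

// A sketch of the CameraView class extension; declarations are assumed from usage.
@interface CameraView () <AVCaptureVideoDataOutputSampleBufferDelegate>

@property (nonatomic, strong) AVCaptureSession *session;      // capture pipeline (_session)
@property (nonatomic, strong) UIImageView *imageView;         // displays each camera frame (_imageView)
@property (nonatomic, strong) UIImageView *pumpkinImageView;  // the virtual image overlaid on the face
@property (nonatomic, strong) CIDetector *detector;           // face detector (see the sketch further below)
@property (nonatomic, assign) BOOL settingImage;              // a frame is currently being displayed
@property (nonatomic, assign) BOOL isProcessingRequest;       // a face-detection pass is in flight

@end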

iOS recognizes facial information by analyzing every frame of the video, so we implement the sample-buffer delegate to handle each frame the camera captures and display it in a UIImageView. The code below covers this capture-and-display step:

#import "CameraView.h"
#import <AVFoundation/AVFoundation.h>

@interface CameraView ()<AVCaptureVideoDataOutputSampleBufferDelegate>
@end
@implementation CameraView
- (UIImage *) imageFromSampleBuffer:(CMSampleBufferRef) sampleBuffer
{
    // Get a CMSampleBuffer's Core Video image buffer for the media data
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    // Lock the base address of the pixel buffer
    CVPixelBufferLockBaseAddress(imageBuffer, 0);

    // Get the base address of the pixel buffer
    void *baseAddress = CVPixelBufferGetBaseAddress(imageBuffer);

    // Get the number of bytes per row for the pixel buffer
    size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
    // Get the pixel buffer width and height
    size_t width = CVPixelBufferGetWidth(imageBuffer);
    size_t height = CVPixelBufferGetHeight(imageBuffer);

    // Create a device-dependent RGB color space
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();

    // Create a bitmap graphics context with the sample buffer data
    CGContextRef context = CGBitmapContextCreate(baseAddress, width, height, 8,
                                                 bytesPerRow, colorSpace, kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
    // Create a Quartz image from the pixel data in the bitmap graphics context
    CGImageRef quartzImage = CGBitmapContextCreateImage(context);
    // Unlock the pixel buffer
    CVPixelBufferUnlockBaseAddress(imageBuffer,0);

    // Free up the context and color space
    CGContextRelease(context);
    CGColorSpaceRelease(colorSpace);

    // Create an image object from the Quartz image
    UIImage *image = [UIImage imageWithCGImage:quartzImage];

    // Release the Quartz image
    CGImageRelease(quartzImage);

    return (image);
}

-(void)setImageToView:(UIImage*)image {
    UIImage * capturedImage = [self rotateImage:image orientation:UIImageOrientationLeftMirrored ];
    _imageView.image = capturedImage;
    _settingImage = NO;
}

- (UIImage *) rotateImage:(UIImage*)image orientation:(UIImageOrientation) orient {
    CGImageRef imgRef = image.CGImage;
    CGAffineTransform transform = CGAffineTransformIdentity;
    //UIImageOrientation orient = image.imageOrientation;
    CGFloat scaleRatio = 1;
    CGFloat width = image.size.width;
    CGFloat height = image.size.height;
    CGSize imageSize = image.size;
    CGRect bounds = CGRectMake(0, 0, width, height);
    CGFloat boundHeight;

    if (orient == UIImageOrientationLeftMirrored) {
        boundHeight = bounds.size.height;
        bounds.size.height = bounds.size.width;
        bounds.size.width = boundHeight;
        transform = CGAffineTransformMakeTranslation(imageSize.height, imageSize.width);
        transform = CGAffineTransformScale(transform, -1.0, 1.0);
        transform = CGAffineTransformRotate(transform, 3.0 * M_PI / 2.0);
    }

    UIGraphicsBeginImageContext(bounds.size);
    CGContextRef context = UIGraphicsGetCurrentContext();
    if (orient == UIImageOrientationRight || orient == UIImageOrientationLeft) {
        CGContextScaleCTM(context, -scaleRatio, scaleRatio);
        CGContextTranslateCTM(context, -height, 0);
    } else {
        CGContextScaleCTM(context, scaleRatio, -scaleRatio);
        CGContextTranslateCTM(context, 0, -height);
    }
    CGContextConcatCTM(context, transform);
    CGContextDrawImage(UIGraphicsGetCurrentContext(), CGRectMake(0, 0, width, height), imgRef);
    UIImage *imageCopy = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();

    return imageCopy;
}

#pragma mark - delegate
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    UIImage *image = [self imageFromSampleBuffer:sampleBuffer];

    if (_settingImage == NO) {
        _settingImage = YES;
        // UIKit must only be touched on the main thread, so hop off the capture queue here.
        dispatch_async(dispatch_get_main_queue(), ^{
            [self setImageToView:image];
        });
    }
}

There are quite a few details hidden in here. For instance, the captured frame is mirrored, so it needs a rotation before it displays correctly on the device. Also, the delegate receives raw sample-buffer data from the camera, which cannot be used directly; it has to be converted into a UIImage first.

Now the front camera's feed displays correctly on the device. The next step is detecting the facial features in each frame. The code is as follows:

#import "CameraView.h"
#import <AVFoundation/AVFoundation.h>

@interface CameraView ()<AVCaptureVideoDataOutputSampleBufferDelegate>
@end
@implementation CameraView

- (void)facialRecognitionRequest:(UIImage *)image {
    //NSLog(@"Image is: %f by %f", image.size.width, image.size.height);

    if (!_isProcessingRequest) {
        _isProcessingRequest = YES;
        //NSLog(@"Detecting Faces");
        NSArray *arr = [self.detector featuresInImage:[CIImage imageWithCGImage:[image CGImage]]];

        if ([arr count] > 0) {
            //NSLog(@"Faces found.");
            // Only the first detected face is used.
            for (int i = 0; i < 1; i++) {
                CIFaceFeature *feature = [arr objectAtIndex:i];
                CGSize size = [[UIScreen mainScreen] bounds].size;

                // Average the eye and mouth positions, normalized to the image size.
                // Core Image uses a bottom-left origin, so the y value is flipped for UIKit.
                double xPosition = (feature.leftEyePosition.x + feature.rightEyePosition.x + feature.mouthPosition.x) / (3 * image.size.width);
                double yPosition = 1 - (feature.leftEyePosition.y + feature.rightEyePosition.y + feature.mouthPosition.y) / (3 * image.size.height);

                self.pumpkinImageView.center = CGPointMake(size.width * xPosition, size.height * yPosition);
            }
        } else {
            NSLog(@"No faces detected");
        }
    }
    _isProcessingRequest = NO;
}

-(void)setImageToView:(UIImage*)image {
    UIImage * capturedImage = [self rotateImage:image orientation:UIImageOrientationLeftMirrored ];
    _imageView.image = capturedImage;
    _settingImage = NO;
    [self facialRecognitionRequest:capturedImage];
}

From the code it is easy to read off where the eyes and mouth sit relative to the whole captured frame. The reason for averaging the three points is that more reference points make the computed position more stable and more accurate. All that remains is to place the prepared image at this marker position.
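
One piece the detection code relies on but never shows is self.detector. A minimal sketch of how that CIDetector could be created follows; choosing CIDetectorAccuracyLow is my assumption, made to keep the per-frame cost down, and is not stated in the original.

// Possible lazy initializer for self.detector — a sketch, not necessarily the original setup.
// CIDetectorAccuracyLow is assumed here to reduce per-frame CPU cost.
- (CIDetector *)detector {
    if (!_detector) {
        NSDictionary *options = @{ CIDetectorAccuracy : CIDetectorAccuracyLow };
        _detector = [CIDetector detectorOfType:CIDetectorTypeFace
                                       context:nil
                                       options:options];
    }
    return _detector;
}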

Thoughts

This is only a simple marker-based AR implementation, but it shows the core idea behind marker-based AR. Better results can be achieved by optimizing the algorithm or choosing a better-performing SDK.

The result above can still be optimized: averaging the detected positions over recent frames produces smoother values, which are far more useful when positioning the overlay image; a sketch of this idea follows.
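
Here is a minimal sketch of that smoothing idea, using a moving average over the last few detected centers. The _recentCenters array (an NSMutableArray of NSValue-wrapped CGPoints) and the window size of 5 are my own choices for illustration, not part of the original post.

// Sketch: smooth the marker position with a moving average over recent frames.
// _recentCenters is an assumed NSMutableArray<NSValue *> property; window size 5 is arbitrary.
- (CGPoint)smoothedCenterWithNewCenter:(CGPoint)center {
    if (!_recentCenters) {
        _recentCenters = [NSMutableArray array];
    }
    [_recentCenters addObject:[NSValue valueWithCGPoint:center]];
    if (_recentCenters.count > 5) {
        [_recentCenters removeObjectAtIndex:0];
    }

    CGFloat sumX = 0, sumY = 0;
    for (NSValue *value in _recentCenters) {
        CGPoint point = [value CGPointValue];
        sumX += point.x;
        sumY += point.y;
    }
    return CGPointMake(sumX / _recentCenters.count, sumY / _recentCenters.count);
}

// Inside facialRecognitionRequest:, the raw point would then be replaced with:
// self.pumpkinImageView.center = [self smoothedCenterWithNewCenter:rawCenter];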

Issues

  • During testing the phone heats up quickly; the CPU load is very high, which is probably related to the CIFaceFeature detection algorithm.
  • The CIFaceFeature detection rate is not very high, so the overlaid image does not move along a smooth path.