Image and Video Processing with Deep Learning
Expert Researcher, Head of Visual AI, Tencent YouTu (优图) X-Lab
"See clearer, understand better"

Outline
1. Low-light image enhancement
2. Video super-resolution
3. Image and video deblurring

1. Low-Light Image Enhancement

Motivation
- Taking photos is easy, but amateur photographers typically produce underexposed photos, so photo enhancement is required.
- Existing photo editing tools: "Auto Enhance" on iPhone, "Auto Tone" in Lightroom. [Figure: Input / iPhone / Lightroom / Ours]

Previous Work
- Retinex-based methods: LIME (TIP 2017), WVM (CVPR 2016), JieP (ICCV 2017).
- Learning-based methods: HDRNet (SIGGRAPH 2017), White-Box (ACM TOG 2018), Distort-and-Recover (CVPR 2018), DPE (CVPR 2018).

Limitations of Previous Methods
[Figure: Input / WVM (CVPR 2016) / JieP (ICCV 2017) / HDRNet (SIGGRAPH 2017) / DPE (CVPR 2018) / White-Box (TOG 2018) / Distort-and-Recover (CVPR 2018) / Ours]

Why This Model?
- Illumination maps for natural images typically have relatively simple forms with known priors.
- The model enables customizing the enhancement results by formulating constraints on the illumination.
- Advantage: effective and efficient learning.

Network Architecture
[Figure]

Ablation Study
[Figure: Input / naive regression / expert-retouched]

Our Dataset
- Motivation: the existing benchmark dataset was collected for enhancing general photos rather than underexposed photos, and contains only a small number of underexposed images covering limited lighting conditions.

Quantitative Comparison: Our Dataset
Method                  PSNR   SSIM
HDRNet                  26.33  0.743
DPE                     23.58  0.737
White-Box               21.69  0.718
Distort-and-Recover     24.54  0.712
Ours (w/o, w/o, w/o)    27.02  0.762
Ours (with, w/o, w/o)   28.97  0.783
Ours (with, with, w/o)  30.03  0.822
Ours (full)             30.97  0.856

Quantitative Comparison: MIT-Adobe FiveK
Method                  PSNR   SSIM
HDRNet                  28.61  0.866
DPE                     24.66  0.850
White-Box               23.69  0.701
Distort-and-Recover     28.41  0.841
Ours (w/o, w/o, w/o)    28.81  0.867
Ours (with, w/o, w/o)   29.41  0.871
Ours (with, with, w/o)  30.71  0.884
Ours (full)             30.80  0.893

Visual Comparison
[Figures on our dataset and on MIT-Adobe FiveK: Input / JieP / HDRNet / DPE / White-Box / Distort-and-Recover / Our result / Expert-retouched]

User Study
[Figure: Input / WVM / JieP / HDRNet / DPE / White-Box / Distort-and-Recover / Our result]

Limitation
[Figure: Input / Our result]

More Results
[Further comparison figures in the same formats]
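The illumination-map model above follows the Retinex idea: an underexposed image I is the pixel-wise product of the desired image S and an illumination map L, so once L is predicted the enhanced image is recovered as S = I / L. A minimal numpy sketch of that recovery step (the `enhance` helper and the constant illumination map are illustrative assumptions, not the network described in the slides):

```python
import numpy as np

def enhance(image, illumination, eps=1e-3):
    """Recover S = I / L from the Retinex-style model I = S * L (values in [0, 1])."""
    if illumination.ndim == 2:
        illumination = illumination[..., None]        # broadcast one map over RGB
    enhanced = image / np.maximum(illumination, eps)  # clamp L to avoid divide-by-zero
    return np.clip(enhanced, 0.0, 1.0)

# Toy example: a uniformly underexposed image with illumination L = 0.25
dark = np.full((4, 4, 3), 0.1)
bright = enhance(dark, np.full((4, 4), 0.25))         # each pixel: 0.1 / 0.25 = 0.4
```

In practice the interest of the approach is exactly that L is low-frequency and prior-constrained, which is what makes it easier to learn than regressing the enhanced image directly.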
[More results against built-in tools: Input / iPhone / Lightroom / Our result]

2. Video Super-Resolution

Motivation
- Old and fundamental: studied for several decades, from Huang et al. (1984) to recent work.
- Many applications: HD video generation from low-resolution sources; video enhancement with details; text/object recognition in surveillance videos.

Previous Work
- Image SR, traditional: Freeman et al. (2002), Glasner et al. (2009), Yang et al. (2010), etc.
- Image SR, CNN-based: SRCNN (Dong et al., 2014), VDSR (Kim et al., 2016), FSRCNN (Dong et al., 2016), etc.
- Video SR, traditional: 3DSKR (Takeda et al., 2009), BayesSR (Liu et al., 2011), MFSR (Ma et al., 2015), etc.
- Video SR, CNN-based: DESR (Liao et al., 2015), VSRNet (Kappeler et al., 2016), Caballero et al. (2016), etc.

Remaining Challenges
- Effectiveness: how to make good use of multiple frames despite misalignment, occlusion, and large motion? [Figure: bicubic ×4; data from Vid4, Ce Liu et al.]
- Are the generated details real? [Figure: single-image SR result vs. ground truth]
- Model issues: one model per setting (VDSR, Kim et al. 2016; ESPCN, Shi et al. 2016; VSRNet, Kappeler et al. 2016); intensive parameter tuning; slow runtimes.

Our Method
- Advantages: better use of sub-pixel motion; promising results both visually and quantitatively; fully scalable — arbitrary input size, arbitrary scale factor, arbitrary number of temporal frames.
- Pipeline: motion estimation (ME) → sub-pixel motion compensation (SPMC) layer → detail fusion net, an encoder-decoder with a ConvLSTM and skip connections.
- Arbitrary input size: the network is fully convolutional.
- Arbitrary scale factor (×2, ×3, ×4): the SPMC layer is parameter-free.
- Arbitrary temporal length: e.g., 3 or 5 input frames.

Analysis
- Details from multiple frames: with 3 identical input frames the output shows no new detail, while 3 consecutive frames yield real recovered detail.
- Ablation, SPMC layer vs. baseline: backward warping + resize (baseline) against the SPMC layer.

Comparisons
[Figures: bicubic ×4 / BayesSR (Liu et al., 2011; Ma et al., 2015) / DESR (Liao et al., 2015) / VSRNet (Kappeler et al., 2016) / Ours]

Running Time (scale factor ×4)
Method                          Frames  Time per frame
BayesSR (Liu et al., 2011)      31      2 h
MFSR (Ma et al., 2015)          31      10 min
DESR (Liao et al., 2015)        31      8 min
VSRNet (Kappeler et al., 2016)  5       40 s
Ours                            5       0.19 s
Ours                            3       0.14 s

More Results
[Figures]

Summary
- End-to-end and fully scalable.
- New SPMC layer.
- High-quality results at fast speed.

3. Image and Video Deblurring

The Deblurring Problem
[Figure: data from previous work]

Previous Work
Different blur assumptions:
- Uniform: Fergus et al. (2006), Shan et al. (2009), Cho et al. (2009), Xu et al. (2010), etc. [Figure: data from Xu et al., 2010]
- Non-uniform: Whyte et al. (2010), Hirsch et al. (2011), Zheng et al. (2013), etc. [Figure: data from Whyte et al., 2010]
- Dynamic: Kim et al. (2013), Kim et al. (2014), Nah et al. (2017), etc. [Figure: data from Kim et al., 2013]
Learning-based methods:
- Early methods substitute a few traditional modules with learned parameters: Sun et al. (2015), Schuler et al. (2016), Xiao et al. (2016), etc.
- More recent: Nah et al. (2017), Kim et al. (2017), Su et al. (2017), Wieschollek et al. (2017); networks include encoder-decoder, multi-scale, etc.

Remaining Challenges
- Complicated real-world blur. [Figure: data from the GOPRO dataset]
- Ill-posed problem and unstable solvers: artifacts such as ringing and noise, caused by inaccurate kernels, inaccurate models, unstable solvers, and information loss. [Figure: data from Mosleh et al., 2014]
- Efficient network structure: U-Net or encoder-decoder networks with skip connections (Su et al., 2017); multi-scale or cascaded refinement networks (Nah et al., 2017), where a coarse stage is resized up to feed a fine stage.

Merits of the coarse-to-fine strategy: each scale solves the same deblurring problem at a different resolution.
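The coarse-to-fine strategy just described can be sketched as a loop that runs one shared solver from the coarsest scale to the finest, upsampling each estimate to initialize the next scale. A minimal numpy sketch under that assumption (`resize`, `coarse_to_fine`, and the placeholder solver are hypothetical helpers; the real method replaces the solver with a trained network and carries recurrent state across scales):

```python
import numpy as np

def resize(img, shape):
    """Nearest-neighbor resize; stands in for the bilinear resizing of a real pipeline."""
    rows = (np.arange(shape[0]) * img.shape[0] / shape[0]).astype(int)
    cols = (np.arange(shape[1]) * img.shape[1] / shape[1]).astype(int)
    return img[rows][:, cols]

def coarse_to_fine(blurry, solver, n_scales=3):
    """Apply one shared solver from coarse to fine scales.

    Each scale's estimate is upsampled to initialize the next, finer scale,
    mirroring the parameter sharing that the coarse-to-fine strategy permits.
    """
    h, w = blurry.shape
    shapes = [(h >> s, w >> s) for s in reversed(range(n_scales))]  # coarsest first
    estimate = resize(blurry, shapes[0])          # initialize at the coarsest scale
    for shape in shapes:
        scaled_input = resize(blurry, shape)
        estimate = solver(scaled_input, resize(estimate, shape))  # same solver each time
    return estimate

# Placeholder solver: blend the scaled input with the upsampled previous estimate
blend_solver = lambda x, init: 0.5 * (x + init)
out = coarse_to_fine(np.ones((8, 8)), blend_solver)
```

Because the solver is the same object at every scale, sharing its parameters across scales (as in the scale-recurrent design below the network becomes) costs nothing extra in this loop structure.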
Moreover, the solver and its parameters at each scale are usually the same.

Our Method: Scale-Recurrent Network
- One shared solver is applied across scales (1/4, 1/2, 1); each solver is built from EBlocks (conv + ResBlocks) and DBlocks (ResBlocks + deconv), with a recurrent connection carried across scales.
[Figure: data from the GOPRO dataset]

Analysis
Using different numbers of scales: [Figure: Input / 1 scale / 2 scales / 3 scales]

Baseline models:
Model  SS     SC     w/o R  RNN    SR-Flat
Param  2.73M  8.19M  2.73M  3.03M  2.66M
PSNR   28.40  29.05  29.26  29.35  27.53

- SS: single scale.
- SC: scale cascaded (a separate solver per scale).
- w/o R: without the recurrent connection.
- RNN: vanilla recurrent connection.
- SR-Flat: scale-recurrent with flat convolutions.

Model  SR-RB  SR-ED  SR-EDRB1  SR-EDRB2  SR-EDRB3
Param  2.66M  3.76M  2.21M     2.99M     3.76M
PSNR   28.11  29.06  28.60     29.32     29.98

- SR-RB: scale-recurrent with ResBlocks.
- SR-ED: scale-recurrent with a U-Net (encoder-decoder).
- SR-EDRB1/2/3: scale-recurrent U-Net with 1-3 ResBlocks.

Comparisons
[Figures, three scenes: Input / Whyte et al. (2012) / Sun et al. (2015) / Nah et al. (2017) / Ours]

More Results
[Figures: Input / Ours]

Summary
- A new end-to-end CNN-based framework for deblurring.
- No assumptions on the blur model.
- Visually and quantitatively high-quality results with fewer parameters.

Thanks!
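The PSNR values quoted in the ablation tables use the standard definition 10·log10(MAX²/MSE), in dB. A minimal sketch for float images in [0, 1] (the slides do not state the exact evaluation protocol, e.g. color-channel handling or border cropping, so this is only the generic metric):

```python
import numpy as np

def psnr(reference, estimate, max_val=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Example: a uniform error of 0.1 gives MSE = 0.01, i.e. PSNR = 20 dB
ref = np.zeros((8, 8))
est = np.full((8, 8), 0.1)
```

On this scale, the roughly 1-2 dB gaps between the baseline models in the tables correspond to clearly visible quality differences.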