Source file: HC2022.NODAR.LeafJiang.v02.pdf (20 pages)
NODAR 3D Vision System
Enabling Mass Production of Autonomous Vehicles
Hot Chips 34
August 21-23, 2022

Mass Production of Autonomous Vehicles
Path to the production of 100M units/year
[Chart: Quantity vs. Time]
- Feasibility (Qty 1): DARPA Grand Challenge, 2004-2007
- Validation (Qty 100): Investments in AV Tech ($100B), 2007-2022
- Production (Qty 100M): Production, 2022-

Need high-resolution 3D sensing to solve AV problem
Cost, Reliability, Range, Density
- Spinning multi-channel lidar
- 3D from cameras using software
- Wide-Baseline Stereo Vision Camera
Exquisitely dense and accurate point clouds to 1000+ meters

Long Range Data Collection
- Captures bridge crossing and reconstructs accurately as ego truck passes under bridge that casts a strong shadow on the road
- Captures repetitive patterns of the road railing barriers on right hand side
- Captures vehicles from near (12.2 m) to far (1.5 km) range
[Image range labels: 432 m, 20 m, 78 m, 124 m, 347 m, 582 m]

Stereo Vision Principle
Wider baseline gives longer range
[Diagram labels: Object, Lens, CMOS sensor, Small shift, Large shift, Short baseline, Wide baseline]
- Short baseline stereo vision has trouble discerning the shift of the image at long ranges
- Wide baseline gives sensor access to longer ranges, but need to solve calibration problem for stereo cameras mounted on vehicles, where maintaining 0.01° optical alignment is virtually impossible with shock and vibration

Stereo Vision Principle
NODAR solves decades-old online calibration problem
- You can't ship an engineer with a product. [Image of camera and person with checkerboard]
- So researchers have been working on online calibration using natural scenes for the last 30+ years. [Rectification geometry graphic] [Natural Scenes graphic]
- But published algorithms did not work on natural scenes, or compute fast enough to correct the camera parameters within the timescale of the road and engine vibrations, or produce the alignment accuracy needed to see 1000+ meters, until NODAR's Hammerhead Vision System
[Graphic labels: High Bandwidth, Sub-pixel accurate alignment]

Stereo Vision Capabilities
Previous Generation: Short Baseline and Static Calibration
- Poor long-range 3D reconstruction
- Poor minimum range
- Poor vibration/shock tolerance
Next Generation: Wide Baseline and Online Calibration
- Lidar-like+ 3D point cloud reconstruction
- Excellent minimum range
- Excellent vibration/shock tolerance

Stereo Vision Capabilities - Bridge example
Image from Ford Open Dataset
[Panels: Left frame; Depth map from Ford rectification; Depth map with NODAR auto calibration software]
- Incorrect depth reported
- Noisy depth map due to calibration
- Dangerous situation: says that bridge is farther away than it really is (and that there is no space to drive under it)

Robust Stereo Vision for Vehicles
Ford AV open dataset
- Case 1: Construction site
- Case 2: Tunnel
- Case 3: Girder bridge
- Case 4: Airport
- Case 5: Overcast sky

Processing block diagram
Left image, Right image -> Online Calibration -> Stereo Correspondence -> Depth map
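
The "wider baseline gives longer range" principle stated in the Stereo Vision Principle slides follows from the standard pinhole stereo relation d = f * B / Z, where d is the image shift (disparity), f the focal length in pixels, B the baseline, and Z the range. The sketch below is only an illustration of that relation. The focal length, baselines, and matching error are hypothetical values chosen for the example, not figures from the talk.

```python
def disparity_px(focal_px: float, baseline_m: float, range_m: float) -> float:
    """Image shift (disparity, in pixels) of a point at range_m: d = f * B / Z."""
    return focal_px * baseline_m / range_m


def range_error_m(focal_px: float, baseline_m: float, range_m: float,
                  disparity_err_px: float) -> float:
    """First-order range error caused by a disparity error: dZ = Z**2 / (f * B) * dd."""
    return range_m ** 2 / (focal_px * baseline_m) * disparity_err_px


if __name__ == "__main__":
    FOCAL_PX = 4000.0         # hypothetical focal length in pixels
    DISPARITY_ERR_PX = 0.25   # hypothetical sub-pixel matching error

    # Compare a short and a wide baseline (both hypothetical) at near and far range.
    for baseline_m in (0.2, 1.0):
        for range_m in (50.0, 1000.0):
            shift = disparity_px(FOCAL_PX, baseline_m, range_m)
            err = range_error_m(FOCAL_PX, baseline_m, range_m, DISPARITY_ERR_PX)
            print(f"B={baseline_m:.1f} m  Z={range_m:6.0f} m  "
                  f"shift={shift:7.2f} px  range error={err:7.2f} m")
```

Under these assumed values, a target at 1000 m shifts by 0.8 px with a 0.2 m baseline and by 4 px with a 1 m baseline, and the range error for a fixed matching error shrinks in proportion to the baseline. The same relation shows why alignment errors matter more at long range: any calibration-induced disparity error is multiplied by Z squared.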
Autocalibration Technology
NODAR's patented calibration tech enables automotive applications with significant shock and vibration
Industry Standard: Keypoint Matching Approach
- Fails when descriptors are similar (windows in urban environments and active stereo illumination)
NODAR: NODAR Cost Function Approach
- Robust under large range of scenes, computed efficiently, and no assumption of flat road surface
U.S. Patent No. 17/365,623, US11321875B2 and US11321876B2

Calibration is an optimization problem
Rectification requires 6 extrinsic and 18+ intrinsic camera parameters. NODAR efficiently, quickly, and accurately searches camera parameters to support off-road environments with high levels of shock and vibration, which is the key innovation for supporting long-baseline stereo vision in vehicles.

Left Camera
1. Focal length x
2. Focal length y
3. Principal point x
4. Principal point y
5. Lens distortion, radial, k1
6. Lens distortion, radial, k2
7. Lens distortion, radial, k3
8. Lens distortion, tangential, p1
9. Lens distortion, tangential, p2

Right Camera
1. Focal length x
2. Focal length y
3. Principal point x
4. Principal point y
5. Lens distortion, radial, k1
6. Lens distortion, radial, k2
7. Lens distortion, radial, k3
8. Lens distortion, tangential, p1
9. Lens distortion, tangential, p2

Extrinsic parameters
1. Roll (°)
2. Pitch (°)
3. Yaw (°)
4. Camera location x (m)
5. Camera location y (m)
6. Camera location z (m)

24-dimensional optimization problem
- 100 elements per dimension
- $100^{24} = 10^{48}$ search space
- Assuming 1 ns per point: $3 \times 10^{31}$ years $\gg$ age of the universe ($10^{10}$ years)
- A challenging problem!

Definition of Cost Function (Highly Parallelizable)
[Figure labels: Left image, Right image, Patch, Corresponding row, Correspondence along epipolar line, Correspondence Value, Best match, Second best match, Uniqueness threshold, Min value threshold, Pass/Fail, Y, H, W, h, w]
1. Input: 22 camera parameters
2. Rectification of left and right images
3. Correspondence search along epipolar lines for all image patches
4. Pass/Fail correspondence values for all patches
5. Sum number of passing patches
6. Output: Cost Function
$H \cdot W \cdot 2hw \cdot D$ ops per cost function evaluation: H = 1860, W = 2880, h = 5, w = 5, D = 256, which gives about 68 Gops per cost function evaluation

Stereo Correspondence
Match corresponding pixels in left and right images
Signal Processing Algorithms
- 1D search (along epipolar lines)
- Faster
- Does not hallucinate
- Generalizable
- Example: Semi-Global Block Matching, 5 MP image, 127 Gops/frame
Deep Learning Algorithms
- 2D search (convolutions)
- Slower
- Could hallucinate
- Not generalizable
- Example: PSMNet, 5 MP image, 9604 Gops/frame
Optimal solution depends on application and compute resources

Power vs. Applications for Long-Range Stereo Cameras
Decreasing Power Consumption Unlocks More Markets
[Chart axis: Maximum Power Consumption (Watts)]

Application            Max power   Resolution   Frame rate   Range
Commercial Vehicles    300 W*      5-8 MP       5-30 FPS     400+ meters
Robo-taxis/Shuttles    100 W*      5-8 MP       5-30 FPS     200+ meters
Consumer Vehicles      50 W*       5-8 MP       5-30 FPS     200+ meters
Last-Mile Delivery     20 W*       2 MP         5-30 FPS     50 meters
Drones/UAVs            5 W*        VGA          10-20 FPS    400 meters

[Chart annotations: Hammerhead Vision System Today (using Nvidia HW); Hammerhead Vision System Next Year]
* Compute power available on these platforms is roughly proportional to the vehicle mass (because kinetic energy is $\frac{1}{2}mv^2$)

Limitations in existing silicon platforms and the future
- The online calibration algorithms currently run on general purpose GPUs, which consume too much power for smaller platforms (such as drones)
- To make this a "solved" problem across all autonomous platforms would require an ASIC for:
  - Rectification with ability to quickly modify the look-up tables
  - Correspondence-computation accelerator

Summary
- High-resolution 3D sensing is necessary for autonomous vehicles
- Wide-baseline stereo vision provides a commercially viable path to mass production
- Next generation stereo vision has two innovations:
  - Online calibration of independent camera modules on platforms with shock and vibration
  - More accurate stereo correspondence algorithms
- Likely to see adoption of independently-mounted stereo vision cameras in other markets such as robotics, which have similar economics and platform costs as passenger vehicles
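
The two headline numbers in the calibration slides (the size of the brute-force search and the operation count per cost function evaluation) can be reproduced with a few lines of arithmetic. The sketch below only restates the figures given in the slides. The function names and the Julian-year constant are choices made for this example.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, an assumption of this sketch


def brute_force_years(dims: int = 24, steps_per_dim: int = 100,
                      seconds_per_point: float = 1e-9) -> float:
    """Years needed to visit every grid point of an exhaustive parameter search."""
    points = steps_per_dim ** dims  # exact integer: 100**24 == 10**48
    return points * seconds_per_point / SECONDS_PER_YEAR


def ops_per_cost_evaluation(H: int = 1860, W: int = 2880,
                            h: int = 5, w: int = 5, D: int = 256) -> int:
    """Operation count H*W*2hw*D quoted on the cost function slide.

    H, W: image height and width; h, w: patch height and width;
    D: number of disparities searched along the epipolar line.
    """
    return H * W * 2 * h * w * D


if __name__ == "__main__":
    print(f"search space: {100 ** 24:.0e} points")
    print(f"exhaustive search: {brute_force_years():.1e} years")
    print(f"ops per cost evaluation: {ops_per_cost_evaluation() / 1e9:.1f} Gops")
```

The exhaustive search comes out at roughly 3 x 10^31 years, and the operation count at about 68.6 Gops, both consistent with the slide figures. This is why an efficient search over the camera parameters, rather than a grid search, is needed.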