Create a Vision app using the ML Kit library and CameraX

hongbeom · Published in ProAndroidDev · 3 min read · Aug 8, 2020

Easily implement vision features with Google’s ML Kit library and CameraX

🤖 ML Kit

ML Kit is a mobile SDK that brings Google’s on-device machine learning technology to Android and iOS apps. It is largely divided into the Vision and Natural Language APIs, which are provided free of charge and let you solve problems and build new things. Among them, I tried the Vision APIs.

The Vision APIs I used are as follows (a rough sketch of the Gradle dependencies follows the list).

  • Object Detection — Detects objects, labels them, and returns their bounding-box coordinates.
  • Barcode Scan — Detects and decodes barcodes.
  • Face Detection — Recognizes faces and detects the coordinates of each facial feature point.
  • Text Recognition — Optical character recognition (OCR).
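
Here is a minimal sketch of the Gradle (Kotlin DSL) dependencies for the on-device Vision APIs. The artifact coordinates and versions below are only a guide, so check the ML Kit release notes for the current ones.

```kotlin
// build.gradle.kts (app module) — illustrative coordinates/versions, verify against the ML Kit docs
dependencies {
    implementation("com.google.mlkit:object-detection:17.0.0")   // Object Detection & Tracking
    implementation("com.google.mlkit:barcode-scanning:17.0.0")   // Barcode Scanning
    implementation("com.google.mlkit:face-detection:16.1.5")     // Face Detection
    implementation("com.google.mlkit:text-recognition:16.0.0")   // Text Recognition (OCR)
}
```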

📸 CameraX

CameraX is a Jetpack support library designed to be easier to use than the older camera APIs, and it is backward compatible down to Android 5.0 (API 21). Looking at the samples on the Google ML Kit site, I found that they were implemented with CameraSource and CameraPreview rather than with CameraX, so I brought over the GraphicOverlay view and used it with a little customization. CameraX is easy to use. Let’s take a look at the CameraX code.

The CameraManager class that manages the camera contains the camera-related business logic. To receive camera frames in real time, add the necessary analyzer to the imageAnalyzer use case and pass it as an argument to the cameraProvider’s bindToLifecycle() method.

Before binding, it is recommended to call the cameraProvider’s unbindAll() method first to release any existing bindings.
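
A minimal sketch of that binding step, assuming a CameraX setup with a PreviewView and an ML Kit analyzer; the function and parameter names below are illustrative rather than the exact CameraManager code.

```kotlin
import android.content.Context
import androidx.camera.core.CameraSelector
import androidx.camera.core.ImageAnalysis
import androidx.camera.core.Preview
import androidx.camera.lifecycle.ProcessCameraProvider
import androidx.camera.view.PreviewView
import androidx.core.content.ContextCompat
import androidx.lifecycle.LifecycleOwner

// Illustrative binding helper: wires the preview and the ML Kit analyzer to the lifecycle.
fun bindCameraUseCases(
    context: Context,
    lifecycleOwner: LifecycleOwner,
    previewView: PreviewView,
    analyzer: ImageAnalysis.Analyzer
) {
    val cameraProviderFuture = ProcessCameraProvider.getInstance(context)
    cameraProviderFuture.addListener({
        val cameraProvider = cameraProviderFuture.get()

        // Preview use case rendered into the PreviewView.
        val preview = Preview.Builder().build().also {
            it.setSurfaceProvider(previewView.surfaceProvider)
        }

        // ImageAnalysis use case: delivers frames to the ML Kit analyzer in real time.
        val imageAnalysis = ImageAnalysis.Builder()
            .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
            .build()
            .also { it.setAnalyzer(ContextCompat.getMainExecutor(context), analyzer) }

        val cameraSelector = CameraSelector.DEFAULT_BACK_CAMERA

        // Release existing bindings before re-binding, as recommended above.
        cameraProvider.unbindAll()
        cameraProvider.bindToLifecycle(lifecycleOwner, cameraSelector, preview, imageAnalysis)
    }, ContextCompat.getMainExecutor(context))
}
```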

🖼 ImageAnalyzer

Now, create an abstract class called BaseImageAnalyzer that implements the Analyzer interface and overrides the analyze() method. In the original sample, a process() method was created directly and run from the CameraSource class, but by implementing Analyzer we only need to implement analyze() to have it run on the camera frames. This part is based on the Firebase ML Kit example.
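
A hedged sketch of such a base analyzer, modeled on the shape of the Firebase ML Kit sample; GraphicOverlay is the customized view mentioned earlier, and the detectInImage()/onSuccess()/onFailure() hooks are assumed names rather than a fixed API.

```kotlin
import android.annotation.SuppressLint
import android.graphics.Rect
import androidx.camera.core.ImageAnalysis
import androidx.camera.core.ImageProxy
import com.google.android.gms.tasks.Task
import com.google.mlkit.vision.common.InputImage

// Base analyzer: converts each CameraX frame to an ML Kit InputImage and runs the detector.
abstract class BaseImageAnalyzer<T> : ImageAnalysis.Analyzer {

    // The customized overlay view from the sample that graphics are drawn onto.
    abstract val graphicOverlay: GraphicOverlay

    @SuppressLint("UnsafeOptInUsageError") // imageProxy.image is still an opt-in API
    override fun analyze(imageProxy: ImageProxy) {
        val mediaImage = imageProxy.image ?: run {
            imageProxy.close()
            return
        }
        val inputImage =
            InputImage.fromMediaImage(mediaImage, imageProxy.imageInfo.rotationDegrees)

        detectInImage(inputImage)
            .addOnSuccessListener { results -> onSuccess(results, graphicOverlay, mediaImage.cropRect) }
            .addOnFailureListener { onFailure(it) }
            // Close the frame so CameraX can deliver the next one.
            .addOnCompleteListener { imageProxy.close() }
    }

    protected abstract fun detectInImage(image: InputImage): Task<T>
    protected abstract fun onSuccess(results: T, graphicOverlay: GraphicOverlay, rect: Rect)
    protected abstract fun onFailure(e: Exception)
}
```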

Now just implement the processor and graphic you need. Writing out all four would make this too long, so let’s only look at FaceContourDetectionProcessor and FaceContourGraphic.
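
Here is a sketch of what that processor can look like (the next paragraph walks through it); FaceContourGraphic and GraphicOverlay are the sample’s customized classes, and the detector options are the standard ML Kit face-detection ones.

```kotlin
import android.graphics.Rect
import android.util.Log
import com.google.android.gms.tasks.Task
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.face.Face
import com.google.mlkit.vision.face.FaceDetection
import com.google.mlkit.vision.face.FaceDetectorOptions

// Processor: configures the ML Kit face detector and draws a graphic per detected face.
class FaceContourDetectionProcessor(
    override val graphicOverlay: GraphicOverlay
) : BaseImageAnalyzer<List<Face>>() {

    // Fast performance mode with all contour points enabled.
    private val options = FaceDetectorOptions.Builder()
        .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_FAST)
        .setContourMode(FaceDetectorOptions.CONTOUR_MODE_ALL)
        .build()

    private val detector = FaceDetection.getClient(options)

    override fun detectInImage(image: InputImage): Task<List<Face>> = detector.process(image)

    override fun onSuccess(results: List<Face>, graphicOverlay: GraphicOverlay, rect: Rect) {
        // Clear previous drawings, add one graphic per face, then trigger a redraw.
        graphicOverlay.clear()
        results.forEach { face ->
            graphicOverlay.add(FaceContourGraphic(graphicOverlay, face, rect))
        }
        graphicOverlay.postInvalidate()
    }

    override fun onFailure(e: Exception) {
        Log.w(TAG, "Face detection failed: $e")
    }

    companion object {
        private const val TAG = "FaceContourProcessor"
    }
}
```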

Set the detector and the options you want to use, then call the process() method to run the Task. Looking at the onSuccess callback, you first clear the graphicOverlay and then pass each Face object as an argument to a newly created FaceContourGraphic object. Then add the created graphic to the graphicOverlay and call postInvalidate() to draw it. Now let’s take a look at the FaceContourGraphic class.
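
A sketch of that graphic class, assuming the customized GraphicOverlay exposes a Graphic base class with coordinate-mapping helpers; calculateRect(), translateX(), and translateY() are assumed helper names for mapping image coordinates onto the overlay.

```kotlin
import android.graphics.Canvas
import android.graphics.Color
import android.graphics.Paint
import android.graphics.Rect
import com.google.mlkit.vision.face.Face

// Graphic: draws the face bounding box and a dot for every contour point.
class FaceContourGraphic(
    overlay: GraphicOverlay,
    private val face: Face,
    private val imageRect: Rect // camera frame bounds, used to map coordinates onto the overlay
) : GraphicOverlay.Graphic(overlay) {

    private val boxPaint = Paint().apply {
        color = Color.WHITE
        style = Paint.Style.STROKE
        strokeWidth = 4f
    }
    private val pointPaint = Paint().apply {
        color = Color.RED
        style = Paint.Style.FILL
    }

    override fun draw(canvas: Canvas) {
        // 1) Face area as a rectangle (calculateRect is an assumed mapping helper on Graphic).
        val rect = calculateRect(imageRect, face.boundingBox)
        canvas.drawRect(rect, boxPaint)

        // 2) A dot at each contour point, translated to overlay coordinates.
        face.allContours.forEach { contour ->
            contour.points.forEach { point ->
                canvas.drawCircle(translateX(point.x), translateY(point.y), 4f, pointPaint)
            }
        }
        // The sample's drawFace() additionally connects these points with lines.
    }
}
```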

First, draw the face area as a rectangle, and use the x and y coordinates of the contour points to draw dots. Then use the drawFace() method to connect the feature points and draw lines between them.

If you run the FaceContourDetectionProcessor as-is after passing it as an argument to ImageAnalyzer’s setAnalyzer() method, the coordinate values won’t match. You can see how this is resolved here.

Now you can see the following results🎉

Conclusion

If you are using the front camera rather than the back one, simply mirror the left and right coordinates around the center of the overlay view’s width, as sketched below.
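
A minimal sketch of that flip, assuming you know the overlay view’s width and whether the front camera is active:

```kotlin
// Mirror an x coordinate around the overlay's horizontal center for the front camera.
fun mirrorIfFrontCamera(x: Float, overlayWidth: Float, isFrontCamera: Boolean): Float =
    if (isFrontCamera) overlayWidth - x else x
```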

It was interesting that I could try Google’s machine learning technology so easily, and I felt like I could use it in many ways.

Thank you for reading it! 🙌

The full code and the rest of the execution results can be found here!
