ProAndroidDev

The latest posts from Android Professionals and Google Developer Experts.

glove-android: Using GloVe Word Embeddings for NLP In Android

Shubham Panchal
Published in ProAndroidDev
6 min read · Apr 30, 2023

A glimpse of the demo app using glove-android. The first and third images (left to right) show the ‘compare words’ feature, which computes the cosine similarity between two words. The second image shows embedding generation in action.

Contents

What are GloVe word embeddings?

An illustration of word embeddings in the embedding space. The words ‘king’ and ‘queen’ are contextually related, so their vectors point in (nearly) the same direction, which indicates high semantic similarity. ‘Ice’ is an unrelated word, so its vector does not lie close to the other two.
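To make "pointing in nearly the same direction" concrete, here is a minimal sketch of cosine similarity, the measure the demo app's ‘compare words’ feature is described as computing. The three vectors are hypothetical 3-dimensional toy values chosen for illustration, not real GloVe embeddings, and the function is not taken from glove-android.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, NOT real GloVe values
king = [0.9, 0.8, 0.1]
queen = [0.85, 0.9, 0.15]
ice = [-0.2, 0.1, 0.95]

sim_king_queen = cosine_similarity(king, queen)  # close to 1: similar direction
sim_king_ice = cosine_similarity(king, ice)      # close to 0: nearly orthogonal
print(sim_king_queen, sim_king_ice)
```

A value near 1 means the two vectors point the same way, a value near 0 means they are unrelated, and a value near -1 means they point in opposite directions.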

Adding glove-android to an existing project

glove-android.aar is placed in app/libs, which houses the app’s private libraries.

dependencies {
    ...
    implementation files('libs/glove-android.aar')
    ...
}

Using glove-android with Kotlin

class MainActivity : ComponentActivity() {

    private var gloveEmbeddings : GloVe.GloVeEmbeddings? = null

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)

        setContent {
            // Activity UI here
        }

        // GloVe.loadEmbeddings is a suspendable function
        // We need a coroutine scope to handle its execution
        // off the main thread
        CoroutineScope( Dispatchers.IO ).launch {
            GloVe.loadEmbeddings { it ->
                gloveEmbeddings = it
            }
        }

    }

}

// gloveEmbeddings stays null until loadEmbeddings has completed,
// so the !! below throws if it is reached before that
val embedding1 = gloveEmbeddings!!.getEmbedding( "king" )
val embedding2 = gloveEmbeddings!!.getEmbedding( "queen" )
if( embedding1.isNotEmpty() && embedding2.isNotEmpty() ) {
    result = GloVe.compare( embedding1 , embedding2 ).toString()
}

Limitation — Increase in app’s package size
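As a rough back-of-the-envelope sketch, the raw size of the embedding matrix alone can be estimated from the vocabulary size, the vector dimension and the bytes per value. The 400,000-word vocabulary is an assumption based on the glove.6B release; the 50 dimensions and the float16 storage come from the conversion script later in this article. The estimate ignores the word list, HDF5 container overhead, any compression, and whatever else ships inside the AAR, so the real increase in package size will differ.

```python
VOCAB_SIZE = 400_000  # assumption: vocabulary size of the glove.6B release
DIMENSIONS = 50       # the 50d vectors used in this article

def raw_size_mb(bytes_per_value):
    """Size of a dense VOCAB_SIZE x DIMENSIONS matrix in megabytes (10^6 bytes)."""
    return VOCAB_SIZE * DIMENSIONS * bytes_per_value / 1_000_000

size_float32_mb = raw_size_mb(4)  # 4 bytes per float32 value
size_float16_mb = raw_size_mb(2)  # 2 bytes per float16 value

print(f"float32: {size_float32_mb:.1f} MB")
print(f"float16: {size_float16_mb:.1f} MB")
```

Storing the vectors as float16 halves the raw matrix size compared with float32, at the cost of some numeric precision.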

How does glove-android work internally?

import h5py
import numpy as np
import pickle

words = {}        # word -> row index in the embedding matrix
embeddings = []   # one 50-dimensional vector per word
count = 0
# Raw string so the backslashes in the Windows path are not read as escapes
with open( r"glove.6B\glove.6B\glove.6B.50d.txt" , "r" , encoding="utf-8" ) as glove_file:
    for line in glove_file:
        parts = line.strip().split()
        word = parts[0]
        embedding = [ float(parts[i]) for i in range( 1 , 51 ) ]
        words[ word ] = count
        embeddings.append( embedding )
        count += 1
print( "Words processed" , count )

# Store the vectors as float16 to halve the file size relative to float32
embeddings = np.array( embeddings ).astype( 'float16' )
with h5py.File( "glove_vectors_50d.h5" , "w" ) as hf:
    hf.create_dataset( "glove_vectors" , data=embeddings )

with open( "glove_words_50d.pkl" , "wb" ) as file:
    pickle.dump( words , file )
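The script above produces two artifacts: a dictionary mapping each word to a row index, and a float16 matrix whose rows are the vectors. The following is a minimal sketch of how a lookup over that layout could work, using only the standard library in place of h5py and NumPy. The sample lines, the 4-dimensional vectors and the `get_embedding` helper are hypothetical; how glove-android actually reads the files on the device is not shown here. Returning an empty list for an unknown word is an assumption that mirrors the `isNotEmpty()` check in the Kotlin snippet.

```python
import struct

# Lines in GloVe text format: a word followed by its vector components.
# Hypothetical data, 4-dimensional for brevity.
sample_lines = [
    "king 0.5 -0.25 1.0 0.125",
    "queen 0.75 -0.5 0.875 0.25",
]

DIM = 4
ROW_BYTES = DIM * 2  # float16 takes 2 bytes per component

word_to_row = {}      # word -> row index, like the pickled dictionary
packed = bytearray()  # rows stored back to back, like the float16 dataset
for row, line in enumerate(sample_lines):
    parts = line.strip().split()
    word_to_row[parts[0]] = row
    values = [float(p) for p in parts[1:DIM + 1]]
    packed += struct.pack(f"<{DIM}e", *values)  # 'e' is IEEE 754 half precision

def get_embedding(word):
    """Return the vector for `word`, or an empty list if the word is unknown."""
    row = word_to_row.get(word)
    if row is None:
        return []
    start = row * ROW_BYTES
    return list(struct.unpack(f"<{DIM}e", packed[start:start + ROW_BYTES]))

print(get_embedding("queen"))
print(get_embedding("not-a-word"))
```

Keeping the words and the vectors separate means a lookup is one dictionary access followed by a fixed-size slice of the matrix, so only the requested row needs to be decoded.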

Hope you’ll try glove-android

Written by Shubham Panchal

Android developer, ML and Math enthusiast. Exploring low-level programming and backend development

Responses (2)

why does an error message appear? library "libcrypto_chaquopy.so" not found

Any plans to do this in C/C++ instead of python dicts? Retrieval might be sped up and maybe lesser memory footprint for the same.
